Mapper

Mapper transforms arrays from one format to another using an object composition DSL. An application often receives data from a foreign source structured differently than it wants. We can use Mapper to transform foreign data into a more suitable format for our application using a Mapping as shown in the following example.

$mappedData = (new Mapper)->map($data, new MyMapping);

This supposes we already created a mapping, MyMapping, to convert $data into $mappedData.

Contents

  1. Mappings
  2. Strategies
  3. Practical example
  4. Strategy reference
    1. Fetchers
      1. Copy
      2. CopyContext
      3. CopyKey
    2. Augmenters
      1. Callback
      2. Collection
      3. Context
      4. Either
      5. Filter
      6. Flatten
      7. IfElse
      8. IfExists
      9. Join
      10. Merge
      11. Regex
      12. Replace
      13. TakeFirst
      14. ToList
      15. TryCatch
      16. Type
      17. Unique
    3. Others
      1. Debug
  5. Requirements
  6. Limitations
  7. Testing

Mappings

Mappings are data transformation descriptions that describe how to convert data from one format to another. Mappings are an object wrapper for an array, which describes the output format, with expressions that can fetch and augment input data. To write a mapping we must know the input data format so we can write an array that represents the desired output format and decorate it with expressions to transform the input data.

Example

In the following simple but contrived example we use a mapping to effectively rename the input array's key from foo to bar.

$fooData = ['foo' => 123];

class FooToBarMapping extends Mapping
{
    protected function createMapping()
    {
        return ['bar' => new Copy('foo')];
    }
}

$barData = (new Mapper)->map($fooData, new FooToBarMapping);

['bar' => 123]

In this example we declare a mapping, FooToBarMapping, and pass it to the Mapper::map method to transform $fooData into $barData. As mentioned, this is just a contrived example to demonstrate how Mapper works; one may like to see a more practical example.

This mapping introduces the Copy strategy that copies a value from the input data to the output. Strategies are just one type of expression we can specify as mapping values.

Expressions

An expression is a pseudo-type representing the list of valid mapping value types. The keys of a mapping are never modified by Mapper but its values may change depending on the expression type. Following is the list of valid expression types; any other type causes InvalidExpressionException to be thrown.

  1. Strategy
  2. Mapping
  3. Mapping fragment
  4. Scalar
  5. null

Strategies are invoked and substituted as described in the following section. Mappings may contain any number of additional embedded mappings or mapping fragments—a mapping fragment is just a mapping described by an array instead of a Mapping object. Scalar values (integer, float, string and boolean) and null have no special meaning and are presented verbatim in the output.

Writing a mapping

To write a mapping create a new class that extends Mapping and implement its abstract method, createMapping(), that returns a strategy or an array describing the output format with any combination of expressions.

For prototyping purposes we can avoid writing a new mapping class and instead create an AnonymousMapping, passing the mapping definition to its constructor, which can be quicker than writing a new class. However, the recommended way to write mappings is to write new classes so mappings have meaningful names to identify them.

It is recommended to name mapping classes XToYMapping where X is the name of the input format and Y is the name of the output format.

Strategy-based mappings

Strategy-based mappings are created by specifying a strategy at the top level. Usually mappings are array-based, and although such mappings may contain other expressions, including strategies, at the top level they are an array.

Some problems can only be solved with strategy-based mappings. For example, suppose we want to create a mapping that combines two other mappings at the top level. With array-based mappings the best we can do is something like the following.

protected function createMapping()
{
    return [
        'foo' => new FooMapping,
        'bar' => new BarMapping,
    ];
}

This composes FooMapping and BarMapping in our mapping but each mapping will be mapped under new foo and bar keys respectively. What we really want is to combine the keys of each mapping together at the top level of our mapping but there is no way to express a solution to this problem with array-based mappings. If we use the Merge strategy as the basis of our mapping we can solve this problem.

protected function createMapping()
{
    return new Merge(new FooMapping, new BarMapping);
}

Strategies

Strategies are invokable classes that are invoked by Mapper and substituted for their return values. Strategies can be broadly broken down into two categories: fetchers and augmenters. Fetch strategies retrieve data while augmenters change data provided by other strategies.

Strategies are basic building blocks from which complex data manipulation chains can be constructed to meet the bespoke requirements of an application. The composition of strategies forms a powerful object composition DSL that allows us to express how to retrieve and augment data to mould it into the desired format.

For a complete list of strategies please see the strategy reference.

Writing strategies

Strategies must implement the Strategy interface but it is common to extend Delegate or Decorator because we usually write augmenters which expect another strategy injected into them to provide data. Delegate and Decorator provide the delegate() method, which allows a strategy to evaluate an expression using Mapper, and is usually needed to evaluate the injected strategy. Delegate can delegate any expression to Mapper whereas Decorator only accepts Strategy objects.

It is recommended to name custom strategies with a Strategy suffix to help distinguish them from stock strategies.

Practical example

Suppose we receive two different postal address formats from two different third-party providers. The first provider, FooBook, provides a single UK address. The second provider, BarBucket, provides a collection of US addresses. We are tasked with converting both types to the same uniform address format for our application using mappings.

The address format for our application must be a flat array with the following fields.

  • line1
  • line2 (if applicable)
  • city
  • postcode
  • country

FooBook address mapping

A sample of the data we receive from FooBook is shown below.

$fooBookAddress = [
    'address' => [
        'name' => 'Mr A Smith',
        'address_line1' => '3 High Street',
        'address_line2' => 'Hedge End',
        'city' => 'SOUTHAMPTON',
        'post_code' => 'SO31 4NG',
    ],
    'country' => 'UK',
];

Before continuing, consider attempting to create the mapping on your own, consulting the reference if unsure which strategies to use. The following code shows how we can create a mapping to convert this address format to our application's format.

class FooBookAddressToAddresesMapping extends Mapping
{
    protected function createMapping()
    {
        return [
            'line1' => new Copy('address->address_line1'),
            'line2' => new Copy('address->address_line2'),
            'city' => new Copy('address->city'),
            'postcode' => new Copy('address->post_code'),
            'country' => new Copy('country'),
        ];
    }
}

Since the input data already has the values we want we only need to effectively rename the fields using Copy strategies. We do not need the name field so it is left unmapped.

The result of mapping the input data is shown below.

$address = (new Mapper)->map($fooBookAddress, new FooBookAddressToAddresesMapping);

// Output.
[  
    'line1' => '3 High Street',
    'line2' => 'Hedge End',
    'city' => 'SOUTHAMPTON',
    'postcode' => 'SO31 4NG',
    'country' => 'UK',
]

BarBucket address mapping

A sample of the data we receive from BarBucket is shown below.

$barBucketAddress = [
    'Addresses' => [
        [
            'Jeremy Martinson, Jr.',
            '455 Larkspur Dr.',
            'Baviera, CA 92908',
        ],
    ],
];

This format is a lot less similar to our application's format. In particular, BarBucket's format supports multiple addresses but we're only interested in mapping one so we'll assume the first will suffice and discard any others. Their format also omits the country but we know BarBucket only supplies US addresses so we can assume the country is always "US". Once again, consider attempting to create the mapping on your own before observing the solution below.

class BarBucketAddressToAddresesMapping extends Mapping
{
    protected function createMapping()
    {
        return [
            'line1' => new Copy('Addresses->0->1'),
            'city' => new Callback(fn (array $data) => $this->extractCity($data['Addresses'][0][2])),
            'postcode' => new Regex(new Copy('Addresses->0->2'), '[.*\b(\d{5})]', 1),
            'country' => 'US',
        ];
    }

    private function extractCity($line)
    {
        return explode(',', $line, 2)[0];
    }
}

Line1 can be copied straight from the input data and country can be hard-coded with a constant value because we assume it does not change.

City and postcode must be extracted from the last line of the address. For city, we use the Callback strategy that points to a private method of our mapping. A callback is necessary because there are currently no included strategies to perform string splitting. For postcode, we can use the Regex strategy.
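The two extraction steps can be checked in isolation with plain PHP, without Mapper. The following sketch applies the same explode() call and the same regular expression to the sample address line; the variable names are only illustrative.

```php
<?php

// Last address line from the BarBucket sample data.
$line = 'Baviera, CA 92908';

// City: everything before the first comma, as in extractCity().
$city = explode(',', $line, 2)[0];

// Postcode: first capturing group of the pattern passed to Regex.
// The square brackets act as the pattern delimiters.
$postcode = preg_match('[.*\b(\d{5})]', $line, $matches) === 1
    ? $matches[1]
    : null;

// $city is 'Baviera' and $postcode is '92908'.
var_dump($city, $postcode);
```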

The anonymous function wrapper picks the relevant part of the input data to pass to our method. The weakness of this solution is that dereferencing non-existent values causes PHP to generate undefined index notices, whereas injecting Copy strategies would gracefully resolve to null if any part of the path does not exist. The most elegant solution would therefore be to create custom strategies that promote code reuse and avoid errors, but that is beyond the scope of this demonstration. For more information see writing strategies.

The result of mapping the input data is shown below.

$address = (new Mapper)->map($barBucketAddress, new BarBucketAddressToAddresesMapping);

// Output.
[
    'line1' => '455 Larkspur Dr.',
    'city' => 'Baviera',
    'postcode' => '92908',
    'country' => 'US',
]

Note that line2 is not included in our output because it was declared optional in the requirements. If it were required we could simply add 'line2' => null, to our mapping, to hard-code its value to null, since it is never present in the input data from this provider.

Strategy reference

The following strategies ship with Mapper and provide a suite of commonly used features, as listed below.

Strategy index

Fetchers

  • Copy – Copies a portion of input data, or specified data, according to a lookup path.
  • CopyContext – Copies a portion of context data.
  • CopyKey – Copies the current key.

Augmenters

  • Callback – Augments data using the specified callback.
  • Collection – Maps a collection of data by applying a transformation to each datum.
  • Context – Replaces the context for the specified expression.
  • Either – Uses the primary strategy if it returns non-null, otherwise delegates to a fallback expression.
  • Filter – Filters null values or values rejected by the specified callback.
  • Flatten – Moves all nested values to the top level.
  • IfElse – Delegates to one expression or another depending on whether the specified condition strictly evaluates to true.
  • IfExists – Delegates to one expression or another depending on whether the specified condition maps to null.
  • Join – Joins sub-string expressions together with a glue string.
  • Merge – Merges two data sets together giving precedence to the latter if keys collide.
  • Regex – Captures a portion of a string using regular expression matching.
  • Replace – Replaces one or more substrings.
  • TakeFirst – Takes the first value from a collection one or more times.
  • ToList – Converts data to a single-element list unless it is already a list.
  • TryCatch – Tries the primary strategy and falls back to an expression if an exception is thrown.
  • Type – Casts data to the specified type.
  • Unique – Creates a collection of unique values by removing duplicates.

Others

  • Debug – Debugs a mapping by breaking the debugger wherever this strategy is inserted.

Copy

Copies a portion of input data, or specified data, according to a lookup path. Supports traversing nested arrays. By default the current record is used as the data source but if the data parameter is specified it is used instead.

Copy is probably the most common strategy whether used by itself or injected into other strategies. Since both its path and data parameters can be mapped expressions it is highly versatile and can be combined with other strategies, or even itself, to produce powerful transformations.

Signature

Copy(Strategy|Mapping|array|mixed $path, Strategy|Mapping|array|mixed $data)
  1. $path – Array of path components, string of ->-delimited path components or a strategy or mapping resolving to such an expression.
  2. $data – Optional. Array data or an expression that resolves to an array to be copied instead of input data.

Example

$data = [
    'foo' => [
        'bar' => 123,
    ],
];

(new Mapper)->map($data, new Copy('foo'));

['bar' => 123]

(new Mapper)->map($data, new Copy('foo->bar'));
// or
(new Mapper)->map($data, new Copy(['foo', 'bar']));

123
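To make the path semantics concrete, the following standalone sketch shows one way a path could be walked through nested arrays, resolving to null as soon as a component is missing. The function lookupPath() is hypothetical and is not Mapper's implementation; it only mirrors the behaviour described in this document.

```php
<?php

/**
 * Hypothetical helper (not part of Mapper): follows $path through $data.
 *
 * @param array|string|int $path Array of components or '->'-delimited string.
 */
function lookupPath($path, array $data)
{
    // Accept both documented path forms.
    $components = is_array($path) ? $path : explode('->', (string)$path);

    foreach ($components as $component) {
        // Resolve to null as soon as any part of the path does not exist.
        if (!is_array($data) || !array_key_exists($component, $data)) {
            return null;
        }

        $data = $data[$component];
    }

    return $data;
}

$data = ['foo' => ['bar' => 123]];

$byString = lookupPath('foo->bar', $data);      // 123
$byArray = lookupPath(['foo', 'bar'], $data);   // 123
$missing = lookupPath('foo->baz->qux', $data);  // null
```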

Data override example

When data is specified in the second parameter it is used instead of the data sent from Mapper.

(new Mapper)->map(
    ['foo' => 'bar'],
    new Copy('foo', ['foo' => 'baz'])
);

'baz'

Recursive path resolver example

Since the path can be derived from other strategies we can nest Copy instances to look up values referenced by other keys.

(new Mapper)->map(
    [
        'foo' => 'bar',
        'bar' => 'baz',
        'baz' => 'qux',
    ],
    new Copy(new Copy(new Copy('foo')))
);

'qux'

CopyContext

Copies a portion of context data; works exactly the same way as Copy in all other respects.

Signature

CopyContext(Strategy|Mapping|array|mixed $path)
  1. $path – Array of path components, string of ->-delimited path components or a strategy or mapping resolving to such an expression.

Example

$data = ['foo' => 123];
$context = ['foo' => 456];

(new Mapper)->map($data, new CopyContext('foo'), $context);

456

CopyKey

Copies the current key from the key context. By default the key context is null. Key context may be set by CollectionMapper or the Collection strategy.

Signature

CopyKey()

Example

(new Mapper)->map(
    [
        'foo' => [
            'bar' => 'baz',
        ],
    ],
    new Collection(
        new Copy('foo'),
        new CopyKey
    )
);

['bar' => 'bar']

Callback

Augments data using the return value of the specified callback.

When passing closures, it is recommended to use this strategy only for prototyping and to later convert such usages into strategies. It is acceptable, however, to use this strategy with method pointers, because strategies and methods both have names whereas closures are anonymous. Strategies are usually preferred since they are reusable.

Signature

Callback(callable $callback)
  1. $callback – Callback function that receives mapping data as its first argument and context as its second.

Example

(new Mapper)->map(
    range(1, 5),
    new Callback(
        function ($data) {
            $total = 0;

            foreach ($data as $number) {
                $total += $number;
            }

            return $total;
        }
    )
);

15

Collection

Maps a collection of data by applying a transformation to each datum using a callback. The data collection must be an expression that maps to an array otherwise null is returned.

For each item in the collection, this strategy sets the context to the current datum and the key context to the current key, which can be retrieved using CopyKey.

Signature

Collection(Strategy|Mapping|array|mixed $collection, Strategy|Mapping|array|mixed $transformation)
  1. $collection – Expression that maps to an array.
  2. $transformation – Transformation expression. The current datum is passed as context.

Example

(new Mapper)->map(
    ['foo' => range(1, 5)],
    new Collection(
        new Copy('foo'),
        new Callback(
            function ($data, $context) {
                return $context * 2;
            }
        )
    )
);

[2, 4, 6, 8, 10]

Context

Replaces the context for the specified expression.

Signature

Context(Strategy|Mapping|array|mixed $expression, Strategy|Mapping|array|mixed $context)
  1. $expression – Expression.
  2. $context – New context.

Example

(new Mapper)->map(
    ['foo' => 123],
    new Context(
        new CopyContext('foo'),
        ['foo' => 456]
    ),
    ['foo' => 789]
);

456

Either

Uses the primary strategy if it returns non-null, otherwise delegates to a fallback expression.

Signature

Either(Strategy $strategy, Strategy|Mapping|array|mixed $expression)
  1. $strategy – Primary strategy.
  2. $expression – Fallback expression.

Example

(new Mapper)->map(
    ['bar' => 'bar'],
    new Either(new Copy('foo'), new Copy('bar'))
);

'bar'

Filter

Filters null values or values rejected by the specified callback.

Signature

Filter(Strategy|Mapping|array|mixed $expression, callable $callback = null)
  1. $expression – Expression.
  2. $callback – Callback function that receives the current value as its first argument, the current key as its second argument and context as its third argument.

Example

(new Mapper)->map(
    ['foo' => range(1, 10)],
    new Filter(
        new Copy('foo'),
        function ($value) {
            return $value % 2;
        }
    )
);

[1, 3, 5, 7, 9]

Flatten

Moves all nested values to the top level.

Signature

Flatten(Strategy|Mapping|array|mixed $expression)
  1. $expression – Expression.

Methods

  • ignoreKeys($ignore = true) – When true, only considers values when merging, otherwise duplicate keys replace each other with the last visited key taking precedence. Defaults to false to preserve keys.

Example

$data = [
    'foo' => [
        range(1, 3),
        'bar' => [range(3, 5)],
    ],
];

(new Mapper)->map($data, new Flatten(new Copy('foo')));

[3, 4, 5]

(new Mapper)->map($data, (new Flatten(new Copy('foo')))->ignoreKeys());

[1, 2, 3, 3, 4, 5]

IfElse

Delegates to one expression or another depending on whether the specified condition strictly evaluates to true.

If the condition does not return a boolean, InvalidConditionException is thrown.

Signature

IfElse(callable $condition, Strategy|Mapping|array|mixed $if, Strategy|Mapping|array|mixed $else = null)
  1. $condition – Condition.
  2. $if – Expression used when condition evaluates to true.
  3. $else – Expression used when condition evaluates to false.

Example

(new Mapper)->map(
    ['foo' => 'foo'],
    new IfElse(
        function ($data) {
            return $data['foo'] !== 'bar';
        },
        true,
        false
    )
);

true

IfExists

Delegates to one expression or another depending on whether the specified condition maps to null.

Signature

IfExists(Strategy $condition, Strategy|Mapping|array|mixed $if, Strategy|Mapping|array|mixed $else = null)
  1. $condition – Condition.
  2. $if – Expression used when condition maps to non-null.
  3. $else – Expression used when condition maps to null.

Example

$data = ['foo' => 'foo'];

(new Mapper)->map($data, new IfExists(new Copy('foo'), true, false));

true

(new Mapper)->map($data, new IfExists(new Copy('bar'), true, false));

false

Join

Joins expressions together with a glue string.

Signature

Join(string $glue, array ...$expressions)
  1. $glue – Glue.
  2. $expressions – Expressions to join or a single expression that resolves to an array to join.

Example

(new Mapper)->map(
    ['foo' => 'foo'],
    new Join('-', new Copy('foo'), 'bar')
);

'foo-bar'

(new Mapper)->map(
    ['foo' => ['bar', 'baz']],
    new Join('-', new Copy('foo'))
);

'bar-baz'

Merge

Merges two data sets together giving precedence to the latter if string keys collide; integer keys never collide. For more information see array_merge.

Signature

Merge(Strategy|Mapping|array|mixed $first, Strategy|Mapping|array|mixed $second)
  1. $first – First data set.
  2. $second – Second data set.

Example

(new Mapper)->map(
    [
        'foo' => range(1, 3),
        'bar' => range(3, 5),
    ],
    new Merge(new Copy('foo'), new Copy('bar'))
);

[1, 2, 3, 3, 4, 5]
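Since Merge is described in terms of array_merge, the key-collision rule can be observed with plain PHP alone. This sketch calls only the built-in array_merge function; the sample arrays are made up for illustration.

```php
<?php

// String keys collide: the value from the latter array wins.
$withStringKeys = array_merge(
    ['id' => 1, 'name' => 'foo'],
    ['name' => 'bar', 'extra' => true]
);
// ['id' => 1, 'name' => 'bar', 'extra' => true]

// Integer keys never collide: values are appended and renumbered.
$withIntegerKeys = array_merge([1, 2, 3], [3, 4, 5]);
// [1, 2, 3, 3, 4, 5]

var_dump($withStringKeys, $withIntegerKeys);
```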

Regex

Captures a portion of a string using regular expression matching.

Signature

Regex(Strategy|Mapping|array|mixed $expression, string $regex, int $capturingGroup = 0)
  1. $expression – Expression to search in.
  2. $regex – Regular expression, including delimiters.
  3. $capturingGroup – Optional. Capturing group index to return. Defaults to whole matched expression.

Example

(new Mapper)->map(
    ['foo bar baz'],
    new Regex(
        new Copy(0),
        '[\h(.+)\h]',
        1
    )
);

'bar'

Replace

Replaces all occurrences of one or more substrings.

Any number of searches and replacements can be specified. Searches and replacements are parsed in pairs. If no replacements are specified, all matches are removed instead of replaced. If fewer replacements than searches are specified, the last replacement will be used for the remaining searches. If more replacements than searches are specified, the extra replacements are ignored.

Searches can be specified as either string literals or wrapped in an Expression and treated as a regular expression. Expression and string searches can be mixed as desired. Regular expression replacements can reference sub-matches, e.g. $1 specifies the first capturing group.

Signature

Replace(Strategy|Mapping|array|mixed $expression, string|Expression|array $searches, string|string[]|null $replacements)
  1. $expression – Expression to search in.
  2. $searches – Search string(s).
  3. $replacements – Optional. Replacement string(s).

Example

(new Mapper)->map(
    ['Hello World'],
    new Replace(
        new Copy(0),
        ['Hello', new Expression('[\h*world$]i')],
        ['こんにちは', '世界']
    )
);

'こんにちは世界'
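The pairing rules for searches and replacements can be sketched in plain PHP for the literal-string case. The helper replacePairs() is hypothetical and is not how Mapper implements Replace; it applies the pairs one after another, so it may differ from the library when a replacement itself contains a later search string.

```php
<?php

/**
 * Hypothetical helper (not part of Mapper): pairs literal searches with
 * replacements following the rules described for Replace.
 */
function replacePairs($subject, array $searches, array $replacements = [])
{
    $replacements = array_values($replacements);

    // With no replacements at all, matches are removed.
    $current = '';

    foreach (array_values($searches) as $index => $search) {
        // Keep reusing the last replacement once they run out.
        if (isset($replacements[$index])) {
            $current = $replacements[$index];
        }

        $subject = str_replace($search, $current, $subject);
    }

    // Replacements beyond the number of searches are never read.
    return $subject;
}

$removed = replacePairs('a-b-c', ['-']);                       // 'abc'
$reused = replacePairs('a-b-c', ['a', 'b', 'c'], ['x', 'y']);  // 'x-y-y'
$ignored = replacePairs('a-b', ['a'], ['x', 'unused']);        // 'x-b'
```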

TakeFirst

Takes the first value from a collection one or more times according to the specified depth. If the depth exceeds the number of nesting levels of the collection the last item encountered will be returned.

Signature

TakeFirst(Strategy|Mapping|array|mixed $collection, int $depth = 1)
  1. $collection – Expression that maps to an array.
  2. $depth – Number of times to descend into nested collections.

Example

(new Mapper)->map(
    [
        'foo' => [
            'bar' => [
                'baz' => 123,
                'quz' => 456,
            ],
        ],
    ],
    new TakeFirst(new Copy('foo'), 2)
);

123
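The depth rule can be illustrated with a small standalone function. The helper takeFirst() below is hypothetical, not Mapper's code; in particular, returning an empty array unchanged is an assumption of this sketch rather than documented behaviour.

```php
<?php

/**
 * Hypothetical helper (not part of Mapper): takes the first element
 * up to $depth times, stopping early at the last item encountered.
 */
function takeFirst($collection, $depth = 1)
{
    // Stop when the depth is used up or there is nothing to descend into.
    while ($depth-- > 0 && is_array($collection) && $collection !== []) {
        $collection = reset($collection);
    }

    return $collection;
}

$foo = ['bar' => ['baz' => 123, 'quz' => 456]];

$once = takeFirst($foo);         // ['baz' => 123, 'quz' => 456]
$twice = takeFirst($foo, 2);     // 123
$tooDeep = takeFirst($foo, 5);   // 123, the last item encountered
```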

ToList

Converts data to a single-element list unless it is already a list. A list is defined as an array with contiguous integer keys.

This was created because some formats represent single-value lists as the bare value instead of a list containing just that value. This strategy ensures the expression is always a list by wrapping it in an array if it is not already a list.

Signature

ToList(Strategy|Mapping|array|mixed $expression)
  1. $expression – Expression.

Example

(new Mapper)->map(['foo' => 'bar'], new ToList(new Copy('foo')));

['bar']
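The list definition above can be expressed as a short check in plain PHP. Both helpers below are hypothetical and only approximate the described behaviour; how Mapper itself treats edge cases such as empty arrays is not covered here.

```php
<?php

/**
 * Hypothetical helper (not part of Mapper): true when $value is an array
 * whose keys are exactly 0, 1, 2, ... in that order.
 */
function isList($value)
{
    return is_array($value) && $value === array_values($value);
}

/** Hypothetical helper: wraps anything that is not already a list. */
function toList($value)
{
    return isList($value) ? $value : [$value];
}

$bare = toList('bar');               // ['bar']
$list = toList(['bar', 'baz']);      // ['bar', 'baz'], unchanged
$assoc = toList(['foo' => 'bar']);   // [['foo' => 'bar']]
```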

TryCatch

Tries the primary strategy and falls back to an expression if an exception is thrown. The thrown exception is passed to the specified exception handler. The handler should throw an exception if it does not expect the exception type it receives.

Different fallback expressions can be used for different exception types by nesting multiple instances of this strategy.

Signature

TryCatch(Strategy $strategy, callable $handler, Strategy|Mapping|array|mixed $expression)
  1. $strategy – Primary strategy.
  2. $handler – Exception handler that receives the thrown exception as its first argument and data as its second.
  3. $expression – Fallback expression.

Examples

(new Mapper)->map(
    ['foo' => 'bar'],
    new TryCatch(
        new Callback(
            function () {
                throw new \DomainException;
            }
        ),
        function (\Exception $exception, array $data) {
            if (!$exception instanceof \DomainException) {
                throw $exception;
            }
        },
        new Copy('foo')
    )
);

'bar'

Type

Casts data to the specified type.

Signature

Type(DataType $type, Strategy $strategy)
  1. $type – Type to cast to.
  2. $strategy – Strategy.

Example

(new Mapper)->map(['foo' => 123], new Type(DataType::String, new Copy('foo')));

'123'

Unique

Creates a collection of unique values by removing duplicates.

Signature

Unique(Strategy|Mapping|array|mixed $collection)
  1. $collection – Expression that maps to an array.

Example

(new Mapper)->map(
    ['foo' => array_merge(range(1, 3), range(3, 5))],
    new Unique(new Copy('foo'))
);

[1, 2, 3, 4, 5]

Debug

Debugs a mapping by breaking the debugger wherever this strategy is inserted. The specified expression will be mapped immediately before triggering the breakpoint. The debugger should see the current data, context and mapped expression.

Currently only the Xdebug debugger is supported.

Signature

Debug(Strategy|Mapping|array|mixed $expression)
  1. $expression – Expression to delegate to Mapper.

Requirements

Limitations

  • Strategies do not know the name of the key they are assigned to because Mapper does not forward the key name.
  • Strategies do not know where they sit in a Mapping and therefore cannot traverse a mapping relative to their position.
  • The Collection strategy overwrites context making any previous context inaccessible to descendants.

Testing

Mapper is fully unit tested. Run the tests with the composer test command. All examples in this document can be found in DocumentationTest.