Array Path configuration

Introduction

The “arrayPath” property, which can apply to both the general configuration and the columns configuration, has several options that can make it tricky to use in more complicated scenarios, hence this dedicated chapter.

This property acts like a path pointing to a specific part of a multidimensional array. The segments of the path are separated by a marker, itself defined by the “arrayPathSeparator” property. If “arrayPathSeparator” is not set, the separator defaults to /.

Examples

As a simple example, consider the following structure to import:

[
   'name' => 'Zaphod Beeblebrox',
   'book' => [
      'title' => 'Hitchhiker\'s Guide to the Galaxy'
   ]
]

To import the title of the book (and not the book itself), use the following configuration:

[
   'arrayPath' => 'book/title'
]

Note

At column level, using 'arrayPath' => 'book' is equivalent to using 'field' => 'book', but the “field” property should be preferred in such a case, as it requires less processing.

If, for some reason, you needed a different separator, you could use something like:

[
   'arrayPath' => 'book#title',
   'arrayPathSeparator' => '#'
]

It is perfectly okay to use numerical indices in the path. With this structure:

[
   'series' => 'Hitchhiker\'s Guide to the Galaxy',
   'books' => [
      'The Hitchhiker\'s Guide to the Galaxy',
      'The Restaurant at the End of the Universe',
      'So Long, and Thanks for All the Fish'
      // etc.
   ]
]

and this configuration:

[
   'arrayPath' => 'books/0'
]

The result will be “The Hitchhiker’s Guide to the Galaxy”. It is always the first element inside “books” that will be selected.
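
The plain path lookup described so far can be pictured as a simple loop over the path segments. The following is only a minimal sketch of that idea, not the extension's actual implementation: the function name resolveArrayPath is hypothetical, and returning null for a missing key is an assumption made for this illustration.

```php
<?php
/**
 * Hypothetical helper (not part of the extension): walks $data along $path,
 * one segment at a time, using $separator to split the path.
 */
function resolveArrayPath(array $data, string $path, string $separator = '/')
{
    $current = $data;
    foreach (explode($separator, $path) as $segment) {
        // Assumption of this sketch: a missing key yields null
        if (!is_array($current) || !array_key_exists($segment, $current)) {
            return null;
        }
        $current = $current[$segment];
    }
    return $current;
}

$data = [
    'book' => ['title' => 'Some title'],
    'books' => ['First book', 'Second book'],
];

$title = resolveArrayPath($data, 'book/title');       // 'Some title'
$sameTitle = resolveArrayPath($data, 'book#title', '#'); // 'Some title'
$first = resolveArrayPath($data, 'books/0');          // 'First book'
```

Numeric segments such as 0 work here because PHP treats the string '0' and the integer 0 as the same array key.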

Conditions

Conditions can be applied to each segment of the path using the Symfony Expression Language syntax, wrapped in curly braces. If the value being tested is an array, its items can be accessed directly in the expression by their keys. If the value is a simple type, it can be accessed in the expression under the name “value”.

See the Symfony documentation for reference on the Symfony Expression Language syntax.
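
To make the distinction between array values and simple values more concrete, here is a rough sketch using the Symfony ExpressionLanguage component directly. It assumes the component is installed via Composer (symfony/expression-language) and PHP 8 or later. The function matchesCondition is hypothetical and only illustrates how the variables might be exposed to the expression; it is not the extension's own code.

```php
<?php
// Assumption: dependencies were installed with Composer in this directory
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\ExpressionLanguage\ExpressionLanguage;

/**
 * Hypothetical helper: evaluates $condition against $candidate.
 * Array items become variables named after their keys;
 * a simple value is exposed as the single variable "value".
 */
function matchesCondition(mixed $candidate, string $condition): bool
{
    $language = new ExpressionLanguage();
    $variables = is_array($candidate) ? $candidate : ['value' => $candidate];
    return (bool)$language->evaluate($condition, $variables);
}

// Simple type: tested through "value"
$nameMatches = matchesCondition('Zaphod Beeblebrox', "value === 'Zaphod Beeblebrox'");

// Array: items are tested directly by key
$bookIsNew = matchesCondition(['state' => 'new', 'title' => 'Some title'], "state === 'new'");
$bookIsUsed = matchesCondition(['state' => 'new', 'title' => 'Some title'], "state === 'used'");
```

In this sketch, $nameMatches and $bookIsNew are true, while $bookIsUsed is false.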

Examples

With the following data to import:

[
   'name' => 'Zaphod Beeblebrox',
   'book' => [
      'state' => 'new',
      'title' => 'Hitchhiker\'s Guide to the Galaxy'
   ]
]

let’s imagine two scenarios. First, we want to get the name of the character, but only if it’s “Zaphod Beeblebrox”. The configuration would be:

[
   'arrayPath' => 'name{value === \'Zaphod Beeblebrox\'}'
]

When the name is indeed “Zaphod Beeblebrox”, the result will be “Zaphod Beeblebrox” too. When the name is anything else, the result will be null.

A second scenario is to take the title of the book, only if the book is new. That would be achieved with a configuration like:

[
   'arrayPath' => 'book{state === \'new\'}/title'
]

With the above data, the result will be “Hitchhiker’s Guide to the Galaxy”, but for a book whose state is “used”, the result would be null.

Such usage of conditions may seem a bit far-fetched at first, but can be quite interesting when combined (at a later stage in the import process) with the isEmpty property. However, conditions are much more interesting for looping on substructures and filtering them, as described next.

Looping and filtering

The special segment * can be included in the path. It indicates that all values selected up to that point should be looped on and the condition following the * applied to each of them (using * without a condition is meaningless). This will effectively filter the currently selected elements. Further segments in the path are applied only to that resulting set.

Note

Using * as a segment will always result in an array, which can be explored with further segments or flattened, if it contains a single result.

The special segment * can be followed by the special segment . (a single dot), which changes the way the selected elements are handled. This is best explained with examples.

Examples

Let’s consider the following structure to import:

[
    'test' => [
        'data' => [
            0 => [
                'status' => 'valid',
                'list' => [
                    0 => 'me',
                    1 => 'you'
                ]
            ],
            1 => [
                'status' => 'invalid',
                'list' => [
                    4 => 'we'
                ]
            ],
            2 => [
                'status' => 'valid',
                'list' => [
                    3 => 'them'
                ]
            ]
        ]
    ]
]

And let’s say that we want to have all the items that are inside the “list” key, but only when the “status” is “valid”. We would use the following configuration:

[
   'arrayPath' => 'test/data/*{status === \'valid\'}/list'
]

which would result in:

[
    0 => 'me',
    1 => 'you',
    2 => 'them'
]

This may not seem very intuitive at first. The reason is that this feature was designed to mimic what you might get from an XML structure with an XPath query. Consider the following structure:

<books>
   <book>
      <title>Foo</title>
      <authors>
         <author>A</author>
         <author>B</author>
      </authors>
   </book>
   <book>
      <title>Bar</title>
      <authors>
         <author>C</author>
      </authors>
   </book>
</books>

With an XPath like //author, you would get values “A”, “B” and “C” in a single list, no matter what context surrounds them.

If you need to preserve the structure of the elements matched, you can add the special segment . after the * segment. This preserves the matched structure, to which you can apply further path segments. The above example would be modified as follows:

[
   'arrayPath' => 'test/data/*{status === \'valid\'}/./list'
]

which changes the result to:

[
    0 => [
        0 => 'me',
        1 => 'you'
    ],
    1 => [
        3 => 'them'
    ]
]
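
The difference between the two results can be reproduced with plain PHP array functions. This is only an illustrative sketch of the two behaviours, written with standard functions (array_filter, array_column, array_merge); it does not claim to show how the extension computes its results. The variable names are chosen for this example.

```php
<?php
// The content of "test/data" from the structure above
$items = [
    0 => ['status' => 'valid', 'list' => [0 => 'me', 1 => 'you']],
    1 => ['status' => 'invalid', 'list' => [4 => 'we']],
    2 => ['status' => 'valid', 'list' => [3 => 'them']],
];

// Role of *{status === 'valid'}: keep only the entries matching the condition
$valid = array_filter(
    $items,
    fn(array $item): bool => $item['status'] === 'valid'
);

// Without the "." segment: all "list" values end up in one flat, renumbered list
$merged = array_merge(...array_column($valid, 'list'));
// [0 => 'me', 1 => 'you', 2 => 'them']

// With the "." segment: one "list" per matched entry, inner keys untouched
$preserved = array_column($valid, 'list');
// [0 => [0 => 'me', 1 => 'you'], 1 => [3 => 'them']]
```

Note how array_merge renumbers the numeric keys, which is why “them” ends up at index 2 in the flat result but keeps its key 3 in the preserved one.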

If we change the structure to import to this:

[
    'test' => [
        'data' => [
            0 => [
                'status' => 'invalid',
                'list' => [
                    0 => 'me',
                    1 => 'you'
                ]
            ],
            1 => [
                'status' => 'invalid',
                'list' => [
                    4 => 'we'
                ]
            ],
            2 => [
                'status' => 'valid',
                'list' => [
                    3 => 'them'
                ]
            ]
        ]
    ]
]

making the first entry also “invalid” and using the same first condition:

[
   'arrayPath' => 'test/data/*{status === \'valid\'}/list'
]

we will have a single result:

[
    0 => 'them'
]

When we know that we have such a scenario, it might be convenient to get the actual value as a result (i.e. “them”) rather than a single-entry array. This is where the arrayPathFlatten property can be used. Modifying the configuration to:

[
   'arrayPath' => 'test/data/*{status === \'valid\'}/list',
   'arrayPathFlatten' => true
]

changes the result to simply:

'them'
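
Conceptually, flattening amounts to unwrapping an array that holds exactly one entry. The sketch below illustrates this with a hypothetical function, flattenSingleResult, which is not part of the extension. Leaving arrays with several entries untouched is an assumption of this sketch, based on the note above that flattening applies when the array contains a single result.

```php
<?php
/**
 * Hypothetical helper: returns the only entry of a single-entry array,
 * and returns anything else unchanged (assumption of this sketch).
 */
function flattenSingleResult($result)
{
    if (is_array($result) && count($result) === 1) {
        // reset() returns the first (here: only) element, whatever its key
        return reset($result);
    }
    return $result;
}

$single = flattenSingleResult([0 => 'them']);            // 'them'
$several = flattenSingleResult(['me', 'you', 'them']);   // unchanged array
```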