External Import

Extension key: external_import
Package name: cobweb/external_import
Version: main
Language: en
Author: François Suter (Idéative), typo3@ideative.ch
License: This document is published under the Open Publication License.
Rendered: Sat, 10 Jan 2026 20:37:43 +0000

Tool for importing data from external sources into the TYPO3 database, using an extended TCA syntax. Provides a BE module, a Scheduler task, a command-line interface, reactions and an API.

Introduction

A general presentation of the features provided by this extension.

Installation

Installing and updagring the extension, with highlights of new features. General extension configuration.

User manual

How the extension works and what are the various tools available.

Import configuration

All the options available when setting up an import configuration.

Developer's guide

Everything about events, user functions and all other entry points for programatically enhancing the import process. Description of the main APIs.

Known problems

Description of the known (and tricky) issues that are not (yet) solved.

Introduction

This extension is designed to fetch data from external sources and store them into tables of the TYPO3 CMS database. The mapping between this external data and the TYPO3 CMS tables is done by extending the syntax of the TCA. A backend module provides a way to synchronize any table manually or to define a scheduling for all synchronizations. Synchronizations can also be run using the command-line interface. Automatic scheduling can be defined using a Scheduler task. Finally, this extension provides reactions (starting with TYPO3 12) to import or delete data, responding to calls from remote sources.

The main idea of getting external data into the TYPO3 CMS database is to be able to use TYPO3 CMS standard functions on that data (such as enable fields, for example, if available).

Connection to external applications is handled by a class of services called "connectors", the base of which is available as a separate extension (svconnector).

Data from several external sources can be stored into the same table allowing data aggregation.

The extension also provides an API for receiving data from some other source. This data is stored into the TYPO3 CMS database using the same mapping process as when data is fetched directly by the extension.

This extension is quite flexible, thanks to the possibility of calling user functions to transform incoming data, listening to events to react to some part of the process or adding custom steps at any point in the process. It is also possible to create custom connectors for reading from a specific external source. Still this extension was not designed for extensive data manipulation. It is assumed that the data received from the external source is in a "palatable" format. If the external data requires a lot of processing, it is probably better to put it through an ETL or ESB tool first, and then import it into TYPO3 CMS.

Please also check extension externalimport_tut which provides a tutorial to this extension.

More examples can be found in extension "externalimport_test", which is used for testing purposes. The setup is not documented, but can be interesting to look at. This extension is distributed only via Github: https://github.com/cobwebch/externalimport_test

Note

Setting up External Import can be quite tricky, mostly because this extension offers many options, that are meant to cover as many import scenarios as possible. These options can often be combined for even more possibilities. This can be quite confusing in the beginning.

Please take time to read the whole User manual chapter and the already mentioned tutorial. In particular, you should read the following sections:

General considerations
Process overview
Mapping data

Alternatives

There exists several extensions for importing data into TYPO3, including the system extension "impexp". Extension "impexp" is specifically designed to export data from a TYPO3 installation and import it again into TYPO3, using a specific file format ("T3D"). When the need is to move around data that is already in a TYPO3 installation, "impexp" is the logical choice. External Import differs by being designed to import data into TYPO3 from a large variety of sources outside TYPO3.

There are other extensions available, like xlsimport and importr, which were released years after External Import and - as such - I never really looked into them since I had all the tools I needed. So it is hard to compare their features.

"xlsimport" is designed for one-time import of data in Excel and CSV format. It cannot be automated. An interface is provided for the mapping configuration, but it cannot be saved. The process is definitely quicker and lighter to set up than External Import, but is limited if you need to import the same data on a regular basis.

"importr" seems to come quite close to External Import in terms of features, although maybe with less flexibility in the data handling and less import sources (import resources can probably be added). It is probably easier to set up than External Import, since it allows for simply pointing to an Extbase model, plus a simple mapping of fields to import.

Questions and support

If you have any questions about this extension, use the dedicated channel in the TYPO3 Slack workspace (#ext-external_import) or the issue tracker on GitHub (https://github.com/cobwebch/external_import/issues).

Please also check the Troubleshooting section in case your issue is already described there.

Keeping the developer happy

Every encouragement keeps the developer ticking, so don't hesitate to send thanks or share your enthusiasm about the extension.

If you appreciate this work and want to show some support, please check https://www.monpetitcoin.com/en/support-me/.

Participating

This tool can be used in a variety of situations and all use cases are certainly not covered by the current version. I will probably not have the time to implement any use case that I don't personally need. However you are welcome to join the development team if you want to bring in new features. If you are interested use GitHub to submit pull requests.

Sponsoring

You are very welcome to support the further development of this extension. You will get mentioned here.

A good part of the development of version 3.0 was sponsored by the State of Vaud.
The xmlValue property was sponsored by Bendoo e-work solutions.
The development of version 5.0 benefited from much sponsoring:
- Idéative
- Bendoo e-work solutions
- mehrwert intermediale kommunikation GmbH
- Benni Mack
- Tomas Norre
Without these companies and people, it would never have been such a great update!
The development of version 6.0 was largely funded by the Lausanne University Hospital (CHUV)
The feature for sorting child records, introduced in version 6.2, was sponsored by Ines Willenbrock.

Credits

The icon for the log table records is derived from an icon made by iconixar from www.flaticon.com.

Installation

Installing this extension does nothing in and of itself. You still need to extend the TCA definition of some tables with the appropriate syntax and create specific connectors for the application you want to connect to.

TYPO3 CMS 12 or 13 is required, as well as the "scheduler" and "reactions" system extensions.

Upgrading and what's new

Upgrade to 8.2.0

XPath functions can be used in the columns configuration to directly return a string value. Previously XPath expressions could only be used to select a node or list of nodes in the XML structure.

A new event ModifyReactionResponseEvent is available to modify the response of a reaction before it is sent back. Both the response body and the HTTP return code may be changed.

Loading of the TCA has been encapsulated into a repository class, making it easier to follow the evolutions of the TYPO3 Core and allowing developers who might need it to dynamically manipulate the full TCA before the External Import configurations are extracted from it.

Upgrade to 8.1.0

\Cobweb\ExternalImport\Importer::getContext() and \Cobweb\ExternalImport\Importer::setContext() have been deprecated in favor of \Cobweb\ExternalImport\Importer::getCallType() and \Cobweb\ExternalImport\Importer::setCallType(). These methods rely on the \Cobweb\ExternalImport\Enum\CallType enumeration which is used more consistenty throughout External Import.

A new event ChangeConfigurationBeforeRunEvent makes it possible to modify the External Import configuration at run-time. This happens before any of the import steps is executed.

Upgrade to 8.0.0

Configurations can now be part of several groups. As such, the "group" property is deprecated and is replaced with the groups property (with an array value rather than string).

Note

A Rector rule is provided for migration. Use it in your rector.php file:

return RectorConfig::configure()
    ...
    ->withRules([
        ...
        \Cobweb\ExternalImport\Rector\ChangeGroupPropertyRector::class,
    ])
    ...
;

System extension "reactions" is now a requirement. The "Import external data" reaction can now target a group of configurations.

The logging mechanism has been changed to store the backend user's name rather than its id. This makes it much easier for the Log module and keeps working even if a user is removed. An update wizard is available for updating existing log records.

Warning

Don't drop the "cruser_id" field before running the update wizard, or it won't be able to do its job.

In version 7.2.0, a change was introduced to preserve null values from the imported data. It affected only fields with 'eval' => 'null' in their TCA. Since version 8.0.0, null are preserved also for relation-type fields ("group", "select", "inline" and "file") which have no minitems property or 'minitems' => 0. This makes it effectively possible to remove existing relations. This is an important change of behavior, which - although more correct - may have unexpected effects on your date.

A new disabled flag makes it possible to completely hide a configuration.

Upgrade to 7.3.0

This version introduces a new reaction dedicated to deleting already import data.

Upgrade to 7.2.0

The HandleDataStep process now keeps null values found in the imported data. This is an important change, but is has a concrete effect only if the target field is nullable (i.e. it has an eval property including null or has property nullable set to true in its TCA configuration). In such cases, existing values will be set to null where they would have been left untouched before. It may also affect user functions in transformations where a null value was not expected to be found until now.

Upgrade to 7.1.0

External Import now supports PHP 8.2.

When running the preview mode from the backend module, some steps now provide a download button, to retrieve the data being handled in its current state.

When setting a fixed value, the new column configuration property should be preferred over the historical transformation property.

It is now possible to define explicitly the order in which columns are processed.

Upgrade to 7.0.0

Support for old-style Connector services was droppped (i.e. connectors registered as TYPO3 Core Services). If you use custom connector services, make sure to update them (see the update instructions provided by extension "svconnector").

When editing Scheduler tasks in the External Import backend module, it is no longer possible to define a start date (this tiny feature was a lot of hassle to maintain across TYPO3 versions).

All hooks were removed. If you were still using hooks, please refer to the archived page about hooks to find replacement instructions.

A new ReportStep has been introduced, which triggers a webhook reporting about the just finished import run. In order for this step to run (and do the reporting) even when the process is aborted, a new possibility has been added for steps to run despite the interruption. This actually fixes a bug with the ConnectorCallbackStep which was never called when the process was aborted. If you use such a post-processing, you can now report about failed imports if needed.

New stuff

The arrayPath is now available as both a general configuration option and a column configuration option. It was also enriched with more capabilities.

A new exception \Cobweb\ExternalImport\Exception\InvalidRecordException was introduced which can be used inside user function to remove an entire record from the data to import if needed.

A new transformation property isEmpty is available for checking if a given data can be considered empty or not. For maximum flexibility, it relies on the Symfony Expression language.

It is also possible to set multiple mail recipients for the import report instead of a single one (see the extension configuration).

Upgrade to older version

In case you are upgrading from a very old version and proceeding step by step, you find all the old upgrade instructions in the Appendix.

Other requirements

As is mentioned in the introduction, this extension makes heavy use of an extended syntax for the TCA. If you are not familiar with the TCA, you are strongly advised to read up on it in the TCA Reference manual.

Extension configuration

The extension has the following configuration options:

Storage PID

Defines a general page where all the imported records are stored. This can be overridden specifically for each table (see Administration below).

Log storage PID

Defines a page where log entries will be stored. The default is 0 (root page).

Force time limit

Sets a maximum execution time (in seconds) for the manual import processes (i.e. imports launched from the BE module). This time limit affects both PHP (where the default value is defined by max_execution_time) and the AJAX calls triggered by the BE module (where the default limit is 30 seconds). This is necessary if you want to run large imports. Setting this value to -1 preserves the default time limit.

Email for reporting

If an email address is entered here, a detailed report will be sent to this address after every automated synchronization. Multiple email adresses may be defined, separated by commas.

Mails are not sent after manual synchronizations started from the BE module. The mail address used for sending the report is ($GLOBALS['TYPO3_CONF_VARS']['MAIL']['defaultMailFromAddress']). If it is not defined, the report will not be sent and an error will will be logged.

Subject of email report

A label that will be prepended to the subject of the reporting mail. It may be convenient – for example – to use the server's name, in case you have several servers running the same imports.

Debug

Check to enable the extension to log some data during import runs. This may have an effect depending on the call context (e.g. in verbose mode on the command line, debug output will be sent to standard output). Debug output is routed using the Core Logger API. Hence if you wish to see more details, you may want to add specific configuration for the \Cobweb\ExternalImport\Importer class which centralizes logging. Example:

$GLOBALS['TYPO3_CONF_VARS']['LOG']['Cobweb']['ExternalImport']['Importer']['writerConfiguration'] = [
    // configuration for ERROR level log entries
    \TYPO3\CMS\Core\Log\LogLevel::DEBUG => [
        // add a FileWriter
        \TYPO3\CMS\Core\Log\Writer\FileWriter::class => [
            // configuration for the writer
            'logFile' => 'typo3temp/logs/typo3_import.log'
        ]
    ]
];

Disable logging

Disables logging by the TYPO3 Core Engine. By default an entry will be written in the System > Log for each record touched by the import process. This may create quite a lot of log entries on large imports. Checking this box disables logging for all tables. It can be overridden at table-level by the disableLog.

Warning

There is one big drawback to this method however. If core logging is disabled, errors are not tracked at all. This means that the import will run happily all the time and never report errors. You will unfortunately have to choose between errors not being reported and your log being flooded.

User manual

General considerations

The purpose of this extension is to take data from somewhere else (called the "external source") than the local TYPO3 CMS database and store it into that local database. Data from the external source is matched to local tables and fields, using information stored in the TCA with the extended syntax provided by this extension.

The extension can either fetch the data from some external source or receive data from any kind of script using the provided API. Fetching data from an external source goes through a standardized process.

Connecting to an external source is achieved using connector services (see extension svconnector), that return the fetched data to the external import in either XML format or as a PHP array. Currently, the following connectors exist:

svconnector_csv for CSV and silimar flat files
svconnector_feed for XML source files
svconnector_json for JSON source files
svconnector_sql for connecting to another database

It is quite easy to develop a custom connector, should that be needed.

The external data is mapped to one or more TYPO3 CMS tables using the extended TCA syntax. From then on the table can be synchronized with the external source. Every time a synchronization is started (either manually or according to a schedule), the connector service is called upon to fetch the data. Such tables are referred to as "synchronizable tables". This type of action is called "pulling data".

On the other hand this extension also provides an API that can be called up to pass data directly to the external import process. No connector services are used in this case. The extension is called on a need-to basis by any script that uses it. As such it is not possible to synchronize those tables from the BE module, nor to schedule their synchronization. Such tables are referred to as "non-synchronizable tables". This type of action is called "pushing data".

Note that it is perfectly possible to also push data towards synchronizable tables. The reverse is not true (non-synchronizable tables cannot pull data).

It is perfectly possible to define several import configurations for the same table, thus pulling or pushing data from various sources into a single destination.

Synchronizations can be run in preview mode.

Using the backend modules

The extension provides two backend modules. The "Data Import" is the main one, displaying all configurations and allowing to start imports manually. The second one, "Log", displays a list of all log entries generated during External Import runs.

Synchronizable tables

The first function of the "Data Import" BE module – called "Tables with synchronization" – displays a list of all synchronizable tables. The various features are summarized in the picture below.

Note

Icons may vary depending on user rights. Users without the proper rights or without write access to a given table will not see the synchronize and preview buttons, nor the actions related to the Scheduler.

Viewing configuration details

Clicking on the information icon leads to a screen showing all the information about that particular configuration. The view consists of three tabs: the first one displays the general configuration, the second one displays the configuration for each column (including the additional fields) and the third one displays the list of steps that the process will go through, including any custom steps.

Inspecting TCA properties — Viewing the details of the TCA properties for External Import

If the configuration contains errors, they will be displayed in this detailed view.

Raised errors about wrong configuration — Viewing errors in the External Import configuration

Note

The configuration validator also runs right before an import. The import will abort if any critical error is found in the configuration. Notices do not block the import process.

Triggering a synchronization

Clicking on the synchronize data button will immediately start the synchronization of the corresponding table. This may take quite some time if the data to import is large. If you move away from the BE module during that time, the process will abort. At the end of the process, flash messages will appear with the results:

Results of synchronization — Flash messages show the results of the synchronization

Running in preview mode

Clicking on the preview button leads to the preview feature. For running a preview your first need to select a specific step from the process. The synchronization will run up to that step and stop. Preview data gets displayed if available. This depends on the step.

Again depending on the step, a download button may appear or not. If it does, you can use it to retrieve a CSV file of the records being imported, in their state at the end of the previewed step. This makes it easier to explore the data when there is a lot of it.

Most importantly nothing permanent happens in preview mode. For example, data is not stored into the database.

Preview of a synchronization — The synchronization is run up to the Transform Data step and preview data is dumped to the screen

Setting up the automatic schedule

The automatic scheduling facility relies on the Scheduler to run. On top of the normal Scheduler setup, there are some points you must pay particular attention to in the case of external import.

As can be seen in the above screenshot, the information whether the automatic synchronization is enabled or not is displayed for each table. It is possible to add or change that schedule, by clicking on the respective icons. This leads to an input form where you can choose a frequency, a task group and a start date (date of first execution; leave empty for immediate activation). The frequency can be entered as a number of seconds or using the same syntax as for cron jobs.

Note

If a Scheduler task exists but has been disabled, it cannot be enabled again from the External Import backend module. It is considered that an administrator had a good reason to disable that task and it is up to that person to enable it again.

Automation input form — Input form for setting automated synchronization parameters

Clicking on the trash can icon cancels the automatic synchronization (a confirmation window will appear first).

At the top of the screen, before the list, it is possible to define a schedule for all tables. This means that all imports will be executed one after the other, in the order of priority.

Automating all tables — Setting automated synchronization for all tables

The same input form appears as for individual automation settings.

Note

Of course, it is perfectly possible to define automation tasks from within the Scheduler's BE module. External Import offers this as a convenience and also for non-admin users.

Non-synchronizable tables

The second function of the "Data Import" BE module – called "Tables without synchronization" – displays a list of non-synchronizable tables. This view is purely informative as no action can be taken for these tables. Only the detailed configuration information can be accessed.

Logs

As its name implies, the "Log" module displays a list of all log entries generated during External Import runs. The list is sortable and searchable. Each entry has a context, which gives an idea on how the run took place, either triggered manually (via the backend module), run via the Scheduler or the command line, or called using the API. Any other status will appear as "Other".

There is also a duration associated with each log entry. This is actually the duration of the whole import run and will be the same for all log entries related to the same run.

There is not much more to it for now. It may gain new features in the future.

Note

Runs made in preview mode are not logged.

The Scheduler task

The External Import process can be automated using the provided Scheduler task. The automation can be defined from the External Import backend module or directly from the Scheduler backend module.

The taks provides two specific options:

View of the External Import Scheduler task options — The options of the External Import Scheduler task

Item to synchronize: Choose which import configuration to automate. If you choose "all", all configurations will be synchronized in order of priority. The selector also provides a choice of all available groups and of each individual configuration.
Storage page: This is the uid of a TYPO3 page. The imported data will be stored in that page, no matter what has been configured in the TCA or in the extension settings.

The command-line interface

The External Import process can be called from the command line. It can be used to run a single synchronization, all of them or a group of them. When several synchronizations are run, they happen in order of increasing priority. The following operations are possible:

List all configurations available for synchronization: path/to/php path/to/bin/typo3 externalimport:sync --list
Synchronize everything: path/to/php path/to/bin/typo3 externalimport:sync --all.
Synchronize a group of configurations: path/to/php path/to/bin/typo3 externalimport:sync --group=(group name).
Synchronize a single configuration: path/to/php path/to/bin/typo3 externalimport:sync --table=foo --index=bar.

Forcing the storage page

The storage flag can be used to pass the id of a page in the TYPO3 system where the imported data will be stored. This overrides both the TCA and the extension settings.

Running in preview mode

Preview mode can be activated by using the preview flag and a Step class name as argument. The import process will stop after the given step and return some preview data (or not; that depends on the step). No permanent changes are made (e.g. nothing is saved to the database).

A typical command will look like:

path/to/php path/to/bin/typo3 externalimport:sync --table=foo --index=bar --preview='Cobweb\\ExternalImport\\Step\\TransformDataStep'

This will stop the process after the TransformDataStep and dump the transformed data in the standard output. Mind the correct syntax for defining the Step class (quote with no opening backslash).

Note

If running a full or group synchronization, the preview mode will apply to each configuration.

Debugging on the command-line

Debugging on the command-line is achieved by using the verbose flag, which is available for all commands. If global debugging is turned on (see the Extension configuration), debugged variables will be dumped along with the usual output from the External Import command. If global debugging is disabled, it can be enabled for a single run, by using the "debug" flag:

path/to/php path/to/bin/typo3 externalimport:sync --table=foo --index=bar --debug -v

Reactions (External Import endpoints)

When using TYPO3 12, External Import provides reactions, i.e. endpoints which can be called by any third-party software to push data to import or to delete imported data.

Both reactions are defined in the same way. The expected payload is different, and this is explained further down.

The "Import external data" reaction will import the data defined in the payload as per the usual External Import process, inserting, updating and deteting records by matching the incoming data set with the existing data set. However the import reaction could also be used to import a single record (for example, if it used as a webhook in a third-party application). In such a case, it is still easy to insert or update, but the deleting of records cannot be automated anymore. This is where the "Delete external data" reaction comes in. With it, one or more records can be targeted for deletion, using their external primary key to identify them.

Defining the reaction

A reaction must be defined using the "Reactions" module in the TYPO3 backend. There can be more than one External Import reaction depending on your needs. Having several reactions allows you to distribute secret keys to different people.

Defining a reaction in the dedicated backend module

Choosing a configuration is optional. If one is chosen, the reaction will only execute if the incoming configuration matches the selected configuration. This provides better safety, but is more restrictive.

It is absolutely necessary to choose a BE user to impersonate, otherwise the data will not be stored. The easiest option is to choose the _cli_ user but this may seem too encompassing. You can use another BE user or define a specific one, but make sure that it has the proper rights for writing to the table(s) targeted by the import.

External Import configuration

The External Import configuration does not need anything special to be used by a reaction. However if it is only ever used by reactions, then it does not need connector information and can thus be a Non-synchronizable table.

Note

When using the "Delete external data" reaction, matched data will be deleted even if the "delete" operation is disabled in the configuration (using disabledOperations). It is assumed that you are meaning to delete anyway.

Request payload

To call the endpoint and trigger the External Import reaction, you need to call the URI given by the reaction and pass it the secret key in the headers. The payload in the request body is comprised of the following information:

table

The name of the table targeted by the reaction (not necessary when a configuration is explicitly defined).

index

The index of the targeted External Import configuration (not necessary when a configuration is explicitly defined).

group

Instead of defining a table and an index, it is also possible to define a group. In such a case, all configurations from the corresponding group will be executed in order of increasing priority. This is used only for the "Import external data" reaction. It is incompatible with a table and index definition. Defining both will trigger an error. It is not necessary when a group has been explicitly defined.

data

The data to handle.

For the "Import external data" reaction, this can be either a JSON array (for array-type data) or a (XML) string for XML-type data).

For the "Delete external data" reaction, it must be a JSON array, with the item(s) to delete. The key for identifying the external data must be in a field called "external_id". Example:

{
    "table": "tx_externalimporttest_tag",
    "index": "api",
    "data": [
        {
            "external_id": "miraculous"
        },
        {
            "external_id": "rotten"
        }
    ]
}

If the incoming data cannot match this structure (but is still a JSON array), use the GetExternalKeyEvent event to extract the external key from the incoming data. If the incoming data does not match the above structure at all, you have to develop your own reaction.

pid (optional)

If defined, this uid from the "pages" table will override the pid property from the general configuration.

This is not used by the "Delete external data" reaction.

Here is how it could look like (example made with Postman):

Request headers — The header with the URI, the accepted content type and the secret key

Request body — The body of the payload with the table name, configuration index and data to import

The delete reaction

Since the "Delete external data" reaction is dedicated to deleting records, it is quite different from the other bits of code in External Import. As far as reaction payload is concerned, this has been discussed above.

About the configuration, it is important to understand that most of the configuration is not used by the delete process. In fact the only properties that are used from the general configuration are:

referenceUid to know in which field the external primary key is stored.
enforcePid, which could be useful is a scenario where you would import the same records to different places in your TYPO3 installation, and thus have external primary keys which are unique only per pid.
whereClause

Note

If the delete reaction tries to delete an already deleted record, the operation will be silently successful. On the contrary, trying to delete a record which doesn't exist at all, triggers an error.

Reaction response

The response contains a success entry with value true or false.

If the success is false, the response will contain a error entry (string) for the delete reaction or an errors entry (array of strings) for the import reaction. These contain information about what went wrong.

If the success is true, the response will contain a message entry (string) for the delete reaction or an messages entry (array of strings) for the import reaction. These contain information about the number of operations performed.

Webhook (outgoing message)

When using TYPO3 12, External Import provides a webhook, i.e. a message that can be sent to some third-party endpoint.

Defining the webhook

A webhook must be defined using the "Webhooks" module in the TYPO3 backend, choosing the "... when an External Import run is completed" trigger. You can define several webhooks with the same trigger. Defining the webhook is essentially about setting the target URL and generating the "secret" using the field provided by TYPO3.

Defining a webhook in the dedicated backend module

The message is sent right after an import has completed, in the ReportStep of the import process. The payload sent by External Import message contains the following information:

the name of the table
the index of the import configuration
the description of the import configuration
all the messages reported by the process, in three categories (success, warnings and errors).

Mapping data

In the Administration chapter, you will find explanations about how to map the data from the external source to existing or newly created tables in the TYPO3 CMS database. There are two mandatory conditions for this operation to succeed:

the external data must have the equivalent of a primary key
this primary key must be stored into some column of the TYPO3 CMS database, but not the uid column which is internal to TYPO3 CMS.

The primary key in the external data is the key that is used to decide whether a given entry in the external data corresponds to a record already stored in the TYPO3 CMS database or if a new record should be created for that entry. Records in the TYPO3 CMS database that do not match primary keys in the external data can be deleted if desired.

Import scenarios

External Import offers many options, some of which can be combined. This can sometimes be confusing. This chapter attempts to explain some import scenarios in order to show what is possible with External Import. It is possible to create other scenarios than those shown below.

Above all else, the preview mode is your friend. Test and tune your configuration and check what data structure results using the preview at any step in the process.

The simplest scenario

The simplest scenario is when one row/line of external data corresponds to one record in the TYPO3 database, possibly after some transformations. This is what this image tries to convey:

One line of external data is read (red), it goes through some transformations (grey) and finally gets saved to the TYPO3 database (green).

Multiple values

One particular scenario is when one or more fields in the external data contains multiple values, often comma-separated. What you probably want is to access each individual value, apply transformations to it and then reconcatenate it. This is what the multipleValuesSeparator property does. It takes each value, tries to match it to an entry in the given database table and concatenates again (with a comma), all the values that were mapped. This is can be represented as:

The external data (red) contains values that correspond to keys in the TYPO3 database. The values are matched one by one (little magenta squares in the grey area) and concatenated again for saving to the TYPO3 database (green).

Denormalized data

One common scenario - particularly with flat (CSV) data - is to received denormalized data. This means that the data itself represents a many-to-many relation between two sets of entities and that the total number of row/lines does not represent the actual amount of entities but the amount of relationships between them. External Import takes care not to import duplicate entries and automatically filters on the defined external key (see property referenceUid).

However if you don't do any specific configuration, it is always the first row that will be imported and the others will simply be discarded. This may not be what you want. A schema for this situation could be:

Import scenario with denormalized data and no specific configuration

The black key and the white key represent the external keys. Among the four rows, there are only two different keys. And indeed, at the end of the process (green), only two records are created in the TYPO3 database.

The column with the pattern represents the denormalized data. During the process (grey), inside each row, this column may be mapped to some other database table (magenta squares), but then only the first row is actually stored.

Denormalized data with multiple rows

The previous scenario may correspond to a real use case, but most likely not, because it involves losing relationship information. To preserve it, one way is to use the multipleRows property. It is defined at column-level and instructs External Import to not discard the excess data, but to keep and merge it after all other transformations (it is assembled as a comma-separated list of values).

The result can be represented as:

Import scenario with denormalized data and multiple rows activated

Only two records are created but the many-to-many relations are preserved.

Note

This is absolutely independent of whether you are using a MM-table on the TYPO3 side or not. If you are not, the comma-separated list will be stored as is. If you are, the TYPO3 Core Engine will take care of filling the MM-table for you.

Substructure fields with multiple rows

Another scenario is that the external data is not a flat structure, but contains nested data. This is what the substructureFields property is for. It allows to fetch a value inside a deeper structure. But if there are mutliple values, it will actually trigger an on-the-fly denormalization of the external data, as the schema below attempts to portray:

Import scenario with substructure fields and multiple rows activated

The structured nested inside the external data (little yellow squares inside the red bar) is extracted leading to two rows durign the process. The process may also add columns. If the fields of the substructure are mapped to names of already defined columns (from the column configuration or the additional fields), the values will be put into those fields (and replace any existing value). If they are mapped to differents names, however, this will create new columns. A mix and match is possible.

In the schema above, the yellow column is new and the striped grey column represents an existing column which was "overridden" with values from the substructure.

Note that extra columns do not have a full definition like the other columns and thus don't go through the Transformation step (but are available in the rows for manipulation inside user functions or custom steps). They are also not stored to the database. If you map a substructure field to an existing column, it will both go through the Transformation step and be saved to the database.

As for the extra rows they are collapsed back using comma-separated list of values in the columns for which the multipleRows property was set.

Substructure fields with child records

Starting from the same scenario as above, it is also possible to define child records with the children property instead of using multipleRows. In this case, the denormalized rows are not collapsed but each row is used to create a separate child record:

Import scenario with substructure fields and child recrods

Substructure fields may be used to fill children columns.

Note

It is perfectly possible to create child records from "normal" denormalized data. Using substructure fields is just an example.

Clearing the cache

When data is imported into your TYPO3 CMS installation, you may want to clear the cache for a number of pages in order for the new data to be displayed as soon as it is available. One way to achieve this is to rely purely on TYPO3 CMS and use the TSconfig property:

TCEMAIN.clearCacheCmd = xx,yy

on the page(s) where the data is stored to automatically trigger the clearing of the cache for the given pages (xx and yy) when any record they contain is modified or deleted, or some new record inserted.

This works fine but has one big drawback: it is triggered for each record. If you manipulate a lot of records, the cache clearing may be called hundreds or thousands of times. This can be very bad for your site, especially if you have a very large cache.

It is also possible to trigger the clearing of the cache after the whole import process has completed for a given configuration. Instead of using TSconfig, the configuration would be something like:

$GLOBALS['TCA']['tx_news_domain_model_news']['external']['general']['0']['clearCache'] = 'xx,yy';

This will clear the cache for pages "xx" and "yy", but only after all records have been inserted, updated and deleted. The process still relies on DataHandler for clearing the cache of each page, so you may rely on the usual clear cache hooks if needed.

Besides page numbers, you can also use more general cache identifiers like "pages" (to clear the cache for all pages), cache tags, or any other value that can be used with TCEMAIN.clearCacheCmd.

Debugging

There are many potential sources of error during synchronization, from wrong mapping configurations to missing user rights to PHP errors in user functions. When a synchronization is launched from the BE module a status is displayed when the operation is finished.

The extension tries to report at best on the success or failure of the operation. Turning on the "debug" mode (see the Configuration chapter) will provide additional information.

As described in the Configuration chapter, it is also possible to receive a detailed report by email. It will contain a general summary of what happened during synchronization, but also all error messages logged by the TYPO3 Core Engine, if any.

Troubleshooting

This chapter tries to address a number of common issues.

The automatic synchronization is not being executed

You may observe that the scheduled synchronization is not taking place at all. Even if the debug mode is activated and you look at the logs, you will see no call to external_import. This may happen when you set a too high frequency for synchronizations (like 1 minute for example). If the previous synchronization has not finished, the Scheduler will prevent the new one from taking place. The symptom is a message like "[scheduler]: Event is already running and multiple executions are not allowed, skipping! CRID: xyz, UID: nn" in the system log (SYSTEM > Log). In this case you should stop the current execution in the Scheduler backend module.

The manual synchronization never ends

It may be that no results are reported during a manual synchronization and that the looping arrows continue spinning endlessly. This happens when something failed completely during the synchronization and the BE module received no response. See the advice in Debugging.

All the existing data was deleted

The most likely cause is that the external data could not be fetched, resulting in zero items to import. If the delete operation is not disabled, External import will take that as a sign that all existing data should be deleted, since the external source didn't provide anything.

There are various ways to protect yourself against that. Obviously you can disable the delete operation, so that no record ever gets deleted. If this is not desirable, you can use the "minimumRecords" option (see General TCA configuration) below. For example, if you always expect at least 100 items to be imported, set this option to 100. If fewer items than this are present in the external data, the import process will be aborted and nothing will get deleted.

Data on unrelated pages was deleted

If all imported data should only be syncronized in a certain page, set the enforcePid to 1 to prevent the import from altering or deleting data on pages with a different page ID.

Only a single entry was imported

This generally happens when the referenceUid property is wrongly defined. External Import is unable to differentiate the records from the external source and each record overwrites the preceding one. In the end, only the last one is actually imported.

Can I leave out records with "empty" fields?

A likely scenario is wanting to leave out records where one field is empty. There's no configuration property for that as it is a difficult topic. First of all what constitutes an "empty field" will vary depending on the incoming data and what handling is applied to it. What is more one may want to filter the data at different points in the process (e.g. after the data is read or after the data is transformed).

This is why there is no configuration property for "requiring" a field. Such a need is better addressed by creating a custom step, that can applied specific criteria and at a precise point in the import process.

Process overview

The schema below provides an overview of the external import process:

The process is comprised of steps, each of which corresponds to a PHP class (found in Classes/Step). The steps are not the same when synchronizing (pulling) data or when using the API or a reaction (pushing). In the above schema, the steps with a gradient background belong to both processes. The ones with a single color background are called only by the corresponding process.

Each step may affect the raw data (the data provided by the external source) and the so-called "records" (the data as it is transformed by External Import along the various steps). A step can also set an "abort" flag, which will interrupt the import process after the step has completed. The following steps will not be executed unless specifically designed to do so (this is indicated in the list below).

The following is an overview of what each step does:

class CheckPermissionsStep

Fully qualified name: \Cobweb\ExternalImport\Step\CheckPermissionsStep

Check permissions

This step checks whether the current user has the rights to modify the table into which data is being imported. If not, the process will abort.

class ValidateConfigurationStep

Fully qualified name: \Cobweb\ExternalImport\Step\ValidateConfigurationStep

Validate configuration

This step checks that the main configuration as well as each column configuration are valid. If any of them is not, the process will abort. The process will also abort if there is no general configuration or not a single column configuration.

class ValidateConnectorStep

Fully qualified name: \Cobweb\ExternalImport\Step\ValidateConnectorStep

Validate connector

This steps checks if a Connector has been defined for the synchronize process. In a sense, it is also a validation of the configuration, but restricted to a property used only when pulling data.

Up to that point, the \Cobweb\ExternalImport\Domain\Model\Data object contains no data at all.

class ReadDataStep

Fully qualified name: \Cobweb\ExternalImport\Step\ReadDataStep

Read data

This step reads the data from the external source using the defined Connector. It stores the result as the "raw data" of the \Cobweb\ExternalImport\Domain\Model\Data object.

class HandleDataStep

Fully qualified name: \Cobweb\ExternalImport\Step\HandleDataStep

Handle data

This step takes the raw data, which may be a XML structure or a PHP array, and makes it into an associative PHP array. The keys are the names of the columns being mapped and any additional fields declared with the additionalFields property. The values are those of the external data. The results are stored in the "records" of the \Cobweb\ExternalImport\Domain\Model\Data object.

class ValidateDataStep

Fully qualified name: \Cobweb\ExternalImport\Step\ValidateDataStep

Validate data

This steps checks that the external data passes whatever conditions have been defined. If this is not the case, the process is aborted.

class TransformDataStep

Fully qualified name: \Cobweb\ExternalImport\Step\TransformDataStep

Transform data

This step applies all the possible transformations to the external data, in particular mapping it to other database tables. The "records" in the \Cobweb\ExternalImport\Domain\Model\Data object are updated with the transformed values.

class StoreDataStep

Fully qualified name: \Cobweb\ExternalImport\Step\StoreDataStep

Store data

This is where data is finally stored to the database. Some operations related to MM relations also happen during this step. The "records" in the \Cobweb\ExternalImport\Domain\Model\Data object now contain the "uid" field.

class ClearCacheStep

Fully qualified name: \Cobweb\ExternalImport\Step\ClearCacheStep

Clear cache

This step runs whatever cache clearing has been configured.

class ConnectorCallbackStep

Fully qualified name: \Cobweb\ExternalImport\Step\ConnectorCallbackStep

Connector callback

In this step the connector is called again in case one wishes to perform some clean up operations on the source from which the data was imported (for example, mark the source data as having been imported). The postProcessOperations() method of the connector API is called.

This step is called even if the process was aborted, so that error handling can happen with regards to the connector.

class ReportStep

Fully qualified name: \Cobweb\ExternalImport\Step\ReportStep

Report

This last step on the process performs reporting, essentially writing all log entries. It also triggers the \Cobweb\ExternalImport\Event\ReportEvent, which itself triggers the end of run webhook message.

This step is called even if the process was aborted, so that error can be reported.

It is possible to add custom Step classes at any point in the process. On top of this several steps trigger events which allow for further interactions with the default process.

Tutorial

Extension externalimport_tut provides an extensive tutorial about external import. It makes use of many configuration options. All examples are discussed in the extension's manual.

Test extension

Extension externalimport_test also contains many example configurations which are use for integration (functional) testing. The extension itself does not contain a detailed documentation like the tutorial, but it is still a useful resource. The many scenarios and features covered in that extension are briefly mentioned below to help you find your way around it. It is structured according to the file names containing the TCA, either in Configuration/TCA.

tx_externalimporttest_bundle.php

Scenario: import of 1:n relationships (bundles to products) with denormalized data, preserving sorting order, using multipleRows and multipleSorting.

Additional usage of: additional fields, user function transformations, array path (at column level).

tx_externalimporttest_designer.php

Scenario: import data (designers) nested inside other data (products) in a XML structure using XPath (nodepath property).

tx_externalimporttest_invoice.php

Scenario: import denormalized data from a XML file with namespaced tags, using properties namespaces and fieldNS.

tx_externalimporttest_order.php

Scenario: import 1:n relationships (orders to products) from nested data into an IRRE structure. Usage of arrayPath (at general level), substructureFields and children properties.

tx_externalimporttest_product.php

Products are used for testing several scenarios. They are described below according to the configuration key:

base: usage of an EventListener (listening to \Cobweb\ExternalImport\Event\ProcessConnectorParametersEvent), of a custom step, of XPath at column level (property xpath); creation of 1:n relations to tags from comma-separated values (property multipleValuesSeparator) and creation of file references using both substructureFields and children properties.
more: simpler import scenario than "base", but from a siliar XML structure and thus the same mapping. Tests the usage of the useColumnIndex property.
stable: same as "more", testing the disabling of both "update" and "delete" operations, using property disabledOperations.
products_for_stores: creation of m:n relations between stores and products, from the product side. Again usage of the children property for creating IRRE entries.
general_configuration_errors: as the name implies, this configuration contains many errors and is used for testing the general configuration validator.
updated_products: importing products that change name (for testing the updateSlugs property) and also that change "pid" (for testing the moving of records).

tx_externalimporttest_store.php

Scenario: import stores and their m:n relations to products, from the store side, again usage of the children property for creating IRRE entries.

tx_externalimporttest_tag.php

Like products, tags are used to test several scenarios:

0: usage of a custom step to filter out some entries.
only-delete: this one is really specific to integration testing, as it is used to test the deletion of existing tags (loaded from a fixture during testing) when importing.
api: tests the usage of External Import as an API. See class \Cobweb\ExternalimportTest\Command\ImportCommand.

Overrides/pages.php

Scenario: importing some data (in this case products) as pages to test ordering and nesting (some pages are children of others). The configuration itself is very simple.

Overrides/sys_category.php

Two scenarios are tested here:

product_categories: simple import into an existing table, extending for storing the external id.
column_configuration_errors: this configuration contains many errors and is used for testing the column configuration validator.

Overrides/tx_externalimporttest_product.php

This is just used to demonstrate how to make a table categorizable and import categories relationships. It is related to the "base" configuration for products above.

Import configuration

To start inserting data from an external source into your TYPO3 CMS tables, you must first extend their TCA with a specific syntax. This syntax is comprised of 3 parts:

general information ("General TCA configuration")
specific information for each column where data will be stored ("Columns configuration")
so-called "additional fields" which are read from the external source, but not saved

The first two parts are required, the third is optional.

This chapter describes all possible configuration options. For each property, a step or a more general scope is mentioned to help understand which part of the process it impacts. The names of the steps correspond to the process steps.

There are some code examples throughout this chapter. They are taken either from the External Import Tutorial or from the test extension: https://github.com/fsuter/externalimport_test. You are encouraged to refer to them for more examples and more details about each example (in the Tutorial).

User rights

Before digging into the TCA specifics let's have a look at the topic of user rights. Since External Import relies on \TYPO3\CMS\Core\DataHandling\DataHandler for storing data, the user rights on the synchronized tables will always be enforced. However additional checks are performed in both the BE module and the automated tasks to avoid displaying sensitive data or throwing needless error messages.

When accessing the BE module, user rights are taken into account in that:

a user must have at least listing rights on a table to see it in the BE module.
a user must have modify rights on a table to be allowed to synchronize it manually or define an automated synchronization for it.

Furthermore explicit permissions must be set in the BE user group for allowing a user to run synchronizations from the BE module and to define Scheduler tasks. This is found at the bottom of the "Access Lists" tab.

Specific user permissions — Setting specific permissions for the BE module

DB mount points are not checked at this point, so the user may be able to start a synchronization and still get error messages if not allowed to write to the page where the imported data should be stored.

An automated synchronization will be run by the Scheduler. This means that the active user will be _cli_, who is an admin user. Thus no special setup is needed. The same is true for command-line calls.

General TCA configuration

Here is an example of a typical general section syntax, containing two import configurations.

Each configuration must be identified with a key (in the example below, 0 and 'api'). The same keys need to be used again in the column configuration.

$GLOBALS['TCA']['tx_externalimporttest_tag'] = array_merge_recursive( $GLOBALS['TCA']['tx_externalimporttest_tag'], [
    'external' => [
         'general' => [
              0 => [
                   'connector' => 'csv',
                   'parameters' => [
                        'filename' => 'EXT:externalimport_test/Resources/Private/ImportData/Test/Tags.txt',
                        'delimiter' => ';',
                        'text_qualifier' => '"',
                        'encoding' => 'utf8',
                        'skip_rows' => 1
                   ],
                   'data' => 'array',
                   'referenceUid' => 'code',
                   'priority' => 5000,
                   'description' => 'List of tags'
              ],
              'api' => [
                   'data' => 'array',
                   'referenceUid' => 'code',
                   'description' => 'Tags defined via the import API'
              ]
         ]
    ],
]);

All available properties are described below.

Properties

Property	Data type	Scope/Step
additionalFields	string	Read data
arrayPath	string	Handle data (array)
arrayPathFlatten	bool	Handle data (array)
arrayPathSeparator	string	Handle data (array)
clearCache	string	Clear cache
columnsOrder	string	Transform data
connector	string	Read data
customSteps	array	Any step
data	string	Read data
dataHandler	string	Handle data
description	string	Display
disabled	boolean	General
disabledOperations	string	Store data
disableLog	boolean	Store data
enforcePid	boolean	Store data
group	string	Sync process
groups	array	Sync process
minimumRecords	integer	Validate data
namespaces	array	Handle data (XML)
nodetype	string	Handle data (XML)
nodepath	string	Handle data (XML)
parameters	array	Read data
pid	integer	Store data
priority	integer	Display/automated import
referenceUid	string	Store data
updateSlugs	boolean	Store data
useColumnIndex	string or integer	Configuration
whereClause	string	Store data

connector

Type

string

Description

Connector service subtype.

Must be defined only for pulling data. Leave blank for pushing data. You will need to install the relevant connector extension. Here is a list of available extensions and their corresponding types:

Type	Extension
csv	svconnector_csv
json	svconnector_json
sql	svconnector_sql
feed	svconnector_feed

Scope

Read data

parameters

Type

array

Description

Array of parameters that must be passed to the connector service.

Not used when pushing data.

Scope

Read data

data

Type: string
Description: The format in which the data is returned by the connector service. Can be either xml or array.
Scope: Read data

dataHandler

Type: string
Description: A class name for replacing the standard data handlers. See the Developer's Guide for more details.
Scope: Handle data

disabled

Type: bool
Description: A disabled configuration is completely ignored by External Import. It does not appear in any listing, not will it ever by synchronized. This can be useful, for example, when you share a package between TYPO3 installations, but do not need to run the imports everywhere.
Scope: General

groups

Type: array
Description: Any External Import configuration may belong to one or more groups. A group is just an arbitrary string. It is possible to execute the synchronization of all configurations in a given group in one go, in order of priority (lowest goes first). Group synchronization is available on the command line and in the Scheduler task.
Scope: Sync process

group

Type: string
Description: Note

This property is deprecated. Use groups instead. It is still supported, but will be removed in version 9.0.

This can be any arbitrary string of characters. All External Import configurations having the same value for the "group" property will form a group of configurations. It is then possible to execute the synchronization of all configurations in the group in one go, in order of priority (lowest goes first). Group synchronization is available on the command line and in the Scheduler task.
Scope: Sync process

nodetype

Type: string
Description: Name of the reference nodes inside the XML structure, i.e. the children of these nodes correspond to the data that goes into the database fields (see also the description of the field attribute).
Scope: Handle data (XML)

nodepath

Type: string
Description: XPath expression for selecting the reference nodes inside the XML structure. This is an alternative to the nodetype property and will take precedence if both are defined.
Scope: Handle data (XML)

arrayPath

Type

string

Description

Pointer to a sub-array inside the incoming external data, as a list of keys separated by some marker. The sub-array pointed to will be used as the source of data in the subsenquent steps, rather than the whole structure that was read during the ReadDataStep.

For more details on usage and available options, see the dedicated page.

Scope

Handle data (array)

arrayPathFlatten

Type: bool
Description: When the special * segment is used in an arrayPath, the resulting structure is always an array. If the arrayPath target is actually a single value, this may not be desirable. When arrayPathFlatten is set to true, the result is preserved as a simple type.

Note

If the arrayPath property uses the special * segment several times, arrayPathFlatten will apply only to the last occurrence. The reason is that the method which traverses the array structure is called recursively on each * segment. When the result of the final call is flattened, a simple type is returned back up the call chain, which means that arrayPathFlatten has no further effect.
Scope: Handle data (array)

arrayPathSeparator

Type: string
Description: Separator to use in the arrayPath property. Defaults to / if this property is not defined.
Scope: Handle data (array)

referenceUid

Type

string

Description

Name of the column where the equivalent of a primary key for the external data is stored.

Records for which this data does not exist are skipped (since version 6.1). This is tested with PHP's isset() function. If you think your data may contain empty values and you wish to skip them too, use the isEmpty transformation property with the invalidate option set to true.

Important

This is the name of a field in the TYPO3 CMS database, not in the external data! It is the field where the reference (or primary) key of the external data is stored.

Scope

Store data

priority

Type

integer

Description

A level of priority for the execution of the synchronization. Some tables may need to be synchronized before others if foreign relations are to be established. This gives a clue to the user and a strict order for scheduled synchronizations (either when synchronizing all configurations or when synchronizing a group).

The lowest priority value goes first.

If priority is not defined, a default value of 1000 is applied (defined by class constant \Cobweb\ExternalImport\Importer::DEFAULT_PRIORITY).

Not used when pushing data.

Scope

Display/Automated import process

pid

Type: string
Description: ID of the page where the imported records should be stored. Can be ignored and the general storage pid is used instead (see Configuration).
Scope: Store data

enforcePid

Type

boolean

Description

If this is set to true, all operations regarding existing records will be limited to records stored in the defined pid (i.e. either the above property or the general extension configuration). This has two consequences:

when checking for existing records, those records will be selected only from the defined pid.
when checking for records to delete, only records from the defined pid will be affected

This is a convenient way of protecting records from operations started from within the external import process, so that it won't affect e.g. records created manually.

Scope

Store data

useColumnIndex

Type

string or integer

Description

In a basic configuration the same index must be used for the general TCA configuration and for each column configuration. With this property it is possible to use a different index for the column configurations. The general configuration part has to exist with its own index (say "index A"), but the columns may refer to another index (say "index B") and thus their configuration does not need to be defined. Obviously the index referred to ("index B") must exist for columns.

The type may be a string or an integer, because a configuration key may also be either a string or an integer.

Since version 6.1, it is possible to define specific configurations for selected columns using the index from the general configuration ("index A"). It will not be overridden by the configuration corresponding to the index referred to with useColumnIndex property ("index B").

Example:

'stable' => [
    'connector' => 'feed',
    'parameters' => [
        'uri' => 'EXT:externalimport_test/Resources/Private/ImportData/Test/StableProducts.xml',
        'encoding' => 'utf8'
    ],
    'group' => 'Products',
    'data' => 'xml',
    'nodetype' => 'products',
    'referenceUid' => 'sku',
    'priority' => 5120,
    'useColumnIndex' => 'base',
    ...
],

This general configuration makes reference to the "base" configuration. This means that all columns will use the "base" configuration, unless they have a configuration using specifically the "stable" index. So the "sku" column will use the configuration from the "base" index:

'sku' => [
    'exclude' => false,
    'label' => 'SKU',
    'config' => [
        'type' => 'input',
        'size' => 10
    ],
    'external' => [
        'base' => [
            'xpath' => './self::*[@type="current"]/item',
            'attribute' => 'sku'
        ],
        'products_for_stores' => [
            'field' => 'product'
        ],
        'updated_products' => [
            'field' => 'product_sku'
        ]
    ]
],

However, the "name" column has a specific configuration corresponding to the "stable" index, so it will be used, and not the configuration from the "base" index:

'name' => [
    'exclude' => false,
    'label' => 'Name',
    'config' => [
        'type' => 'input',
        'size' => 30,
        'eval' => 'required,trim',
    ],
    'external' => [
        'base' => [
            'xpath' => './self::*[@type="current"]/item',
        ],
        'stable' => [
            'xpath' => './self::*[@type="current"]/item',
            'transformations' => [
                10 => [
                    'userFunction' => [
                        'class' => \Cobweb\ExternalimportTest\UserFunction\Transformation::class,
                        'method' => 'caseTransformation',
                        'parameters' => [
                            'transformation' => 'upper'
                        ]
                    ]
                ]
            ]
        ],
        'updated_products' => [
            'field' => 'name'
        ]
    ]
],

Scope

Configuration

columnsOrder

Type

string

Description

By default, columns (regular columns or additional fields) are handled in alphabetical order whenever a loop is performed on all columns (typically in the \Cobweb\ExternalImport\Step\TransformDataStep class). This can be an issue when you need a specific column to be handled before another one.

With this property, you can define a comma-separated list of columns, that will be handled in that specific order. It is not necessary to define an order for all columns. If only some columns are explicitly ordered, the rest will be handled after the ordered ones, in alphabetical order. The order is visually reflected in the backend module, when viewing the configuration details.

Scope

Transform data (essentially)

customSteps

Type

array

Description

As explained in the process overview, the import process goes through several steps, depending on its type. This property makes it possible to register additional steps. Each step can be placed before or after any existing step (including previously registered custom steps).

The configuration is a simple array, each entry being itself an array with three properties:

class (required): name of the PHP class containing the custom step.
position (required): states when the new step should happen. The syntax for position is made of the keyword before or after, followed by a colon (:) and the name of an existing step class.
parameters (optional): array which is passed as is to the custom step class when it is called during the import process. Inside the step, it can be accessed using $this->parameters.

Example:

'customSteps' => [
        [
                'class' => \Cobweb\ExternalimportTest\Step\EnhanceDataStep::class,
                'position' => 'after:' . \Cobweb\ExternalImport\Step\ValidateDataStep::class
        ]
],

If any element of the custom step declaration is invalid, the step will be ignored. More information is given in the Developer's Guide.

Scope

Any step

whereClause

Type: string
Description: SQL condition that will restrict the records considered during the import process. Only records matching the condition will be updated or deleted. This condition comes on top of the "enforcePid" condition, if defined.

Warning

This may cause many records to be inserted over time. Indeed if some external data is imported the first time, but then doesn't match the whereClause condition, it will never be found for update. It will thus be inserted again and again. Whenever you make use of the whereClause property you should therefore watch for an unexpectedly high number of inserts.
Scope: Store data

additionalFields

Type: string
Description: This property is not part of the general configuration anymore. Please refer to the dedicated chapter.
Scope: Read data

updateSlugs

Type: boolean
Description: Slugs are populated automatically for new records thanks to External Import relying on the \TYPO3\CMS\Core\DataHandling\DataHandler class. The same is not true for updated records. If you want record slugs to be updated when modified external data is imported, set this flag to true.
Scope: Store data

namespaces

Type

array

Description

Associative array of namespaces that can be used in XPath queries. The keys correspond to prefixes and the values to URIs. The prefixes can then be used in XPath queries.

Example

Given the following declaration:

'namespaces' => array(
   'atom' => 'http://www.w3.org/2005/Atom'
)

a Xpath query like:

atom:link

could be used. The prefixes used for XPath queries don't need to match the prefixes used in the actual XML source. The defaut namespace has to be registered too in order for XPath queries to succeed.

Scope

Handle data (XML)

description

Type: string
Description: A purely descriptive piece of text, which should help you remember what this particular synchronization is all about. Particularly useful when a table is synchronized with multiple sources.
Scope: Display

disabledOperations

Type

string

Description

Comma-separated list of operations that should not be performed. Possible operations are insert, update and delete. This way you can block any of these operations.

insert: The operation performed when new records are found in the external source.
update: Performed when a record already exists and only its data needs to be updated.
delete: Performed when a record is in the database, but is not found in the external source anymore.

See also the column-specific property disabledOperations.

Scope

Store data

minimumRecords

Type: integer
Description: Minimum number of items expected in the external data. If fewer items are present, the import is aborted. This can be used – for example – to protect the existing data against deletion when the fetching of the external data failed (in which case there are no items to import).
Scope: Validate data

disableLog

Type: integer
Description: Set to true to disable logging by the TYP3 Core Engine. This setting will override the general "Disable logging" setting (see Configuration for more details).
Scope: Store data

clearCache

Type: string
Description: Comma-separated list of caches identifiers for caches which should be cleared at the end of the import process. See Clearing the cache.
Scope: Clear cache

Columns configuration

You also need an "external" syntax for each column to define which external data goes into that column and any handling that might apply. This is also an indexed array. Obviously indices used for each column must relate to the indices used in the general configuration. In its simplest form this is just a reference to the external data's name:

'code' => [
    'exclude' => 0,
    'label' => 'LLL:EXT:externalimport_tut/locallang_db.xml:tx_externalimporttut_departments.code',
    'config' => [
        'type' => 'input',
        'size' => 10,
        'max' => 4,
        'eval' => 'required,trim',
    ],
    'external' => [
        0 => [
            'field' => 'code'
        ]
    ]
],

The properties for the columns configuration are described below.

Warning

Columns "crdate", "tstamp" and "cruser_id" cannot be mapped as they are overwritten by the DataHandler. If you need to manipulate these columns you should use the Datamap Postprocess event or the Cmdmap Postprocess event which are triggered after DataHandler operations.

Hint

You can set static values by using the transformation > value.

Properties

Property	Data type	Step/Scope
arrayPath	string	Handle data (array)
arrayPathSeparator	string	Handle data (array)
arrayPathFlatten	bool	Handle data (array)
attribute	string	Handle data (XML)
attributeNS	string	Handle data (XML)
children	Children records configuration	Store data
disabledOperations	string	Store data
field	string	Handle data
fieldNS	string	Handle data (XML)
multipleRows	boolean	Store data
multipleSorting	string	Store data
substructureFields	array	Handle data
transformations	Transformations configuration	Transform data
value	Simple type (string, integer, float, boolean)	Handle data
xmlValue	boolean	Handle data (XML)
xpath	string	Handle data (XML)

value

Type

Simple type (string, integer, float, boolean)

Description

Sets a fixed value, independent of the data being imported. For example, this might be used to set a flag for all imported records. Or you might want to use different types for different import sources.

This can be used for both array-type and XML-type data.

Scope

Handle data

field

Type

string

Description

Name or index of the field (or node, in the case of XML data) that contains the data in the external source.

For array-type data, this information is mandatory. For XML-type data, it can be left out. In such a case, the value of the current node itself will be used, or an attribute of said node, if the attribute property is also defined.

Scope

Handle data

arrayPath

Type

string

Description

Replaces the field property for pointing to a field in a "deeper" position inside a multidimensional array. The value is a string comprised of the keys for pointing into the array, separated by some character.

For more details on usage and available options, see the dedicated page.

Works only for array-type data.

If both "field" and "arrayPath" are defined, the latter takes precedence.

Scope

Handle data (array)

arrayPathFlatten

Type: bool
Description: When the special * segment is used in an arrayPath, the resulting structure is always an array. If the arrayPath target is actually a single value, this may not be desirable. When arrayPathFlatten is set to true, the result is preserved as a simple type.

Note

If the arrayPath property uses the special * segment several times, arrayPathFlatten will apply only to the last occurrence. The reason is that the method which traverses the array structure is called recursively on each * segment. When the result of the final call is flattened, a simple type is returned back up the call chain, which means that arrayPathFlatten has no further effect.
Scope: Handle data (array)

arrayPathSeparator

Type: string
Description: Separator to use in the arrayPath property. Defaults to / if this property is not defined.
Scope: Handle data (array)

attribute

Type

string

Description

If the data is of type XML, use this property to retrieve the value from an attribute of the node rather than the value of the node itself.

This applies to the node selected with the field property or to the current node if field is not defined.

Scope

Handle data (XML)

xpath

Type

string

Description

This property can be used to execute a XPath query relative to the node selected with the field property or (since version 2.3.0) directly on the current node if field is not defined.

The value will be taken from the first node returned by the query. If the attribute property is also defined, it will be applied to the node returned by the XPath query. If the XPath query is just a function (without selector), the resulting string will be returned.

Please see the namespaces property for declaring namespaces to use in a XPath query.

Scope

Handle data (XML)

fieldNS

Type

string

Description

Namespace for the given field. Use the full URI for the namespace, not a prefix.

Example

Given the following data to import:

<?xml version="1.0" encoding="UTF-8"?>
<Invoice xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2">
    <InvoiceLine>
        <cbc:ID>A1</cbc:ID>
        <cbc:LineExtensionAmount currencyID="USD">100.00</cbc:LineExtensionAmount>
        <cac:OrderReference>
            <cbc:ID>000001</cbc:ID>
        </cac:OrderReference>
    </InvoiceLine>
    ...
</Invoice>

getting the value in the <cbc:LineExtensionAmount> tag would require the following configuration:

'external' => [
    0 => [
        'fieldNS' => 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2',
        'field' => 'LineExtensionAmount'
    ]
]

Scope

Handle data (XML)

attributeNS

Type: string
Description: Namespace for the given attribute. Use the full URI for the namespace, not a prefix. See fieldNS for example usage.
Scope: Handle data (XML)

substructureFields

Type

array

Description

Makes it possible to read several values that are located inside nested data structures. Consider the following data source:

[
  {
    "order": "000001",
    "date": "2014-08-07",
    "customer": "Conan the Barbarian",
    "products": [
      {
        "product": "000001",
        "qty": 3
      },
      {
        "product": "000005",
        "qty": 1
      },
      {
        "product": "000101",
        "qty": 10
      },
      {
        "product": "000102",
        "qty": 2
      }
    ]
  },
  {
    "order": "000002",
    "date": "2014-08-08",
    "customer": "Sonja the Red",
    "products": [
      {
        "product": "000001",
        "qty": 1
      },
      {
        "product": "000005",
        "qty": 2
      },
      {
        "product": "000202",
        "qty": 1
      }
    ]
  }
]

The "products" field is actually a nested structure, from which we want to fetch the values from both product and qty. This can be achieved with the following configuration:

'products' => [
 'exclude' => 0,
 'label' => 'Products',
 'config' => [
    ...
 ],
 'external' => [
    0 => [
       'field' => 'products',
       'substructureFields' => [
          'products' => [
             'field' => 'product'
          ],
          'quantity' => [
             'field' => 'qty'
          ]
       ],
       ...
    ]
 ]
]

The keys to the configuration array correspond to the names of the columns where the values will be stored. The configuration for each element can use all the existing properties for retrieving data:

field
fieldNS
arrayPath
arrayPathSeparator
attribute
attributeNS
xpath
xmlValue

The substructure fields are searched for inside the structure selected with the "main" data pointer. In the example above, the whole "products" structure is first fetched, then the product and qty are searched for inside that structure.

The above example will read the values in the product nested field and put it into "products" column. Same for qty and "quantity". The fact that there are several entries will multiply imported records, actually denormalising the data on the fly. The result would be something like:

order	date	customer	products	quantity
000001	2014-08-07	Conan the Barbarian	000001	3
000001	2014-08-07	Conan the Barbarian	000005	1
000001	2014-08-07	Conan the Barbarian	000101	10
000001	2014-08-07	Conan the Barbarian	000102	2
000002	2014-08-08	Sonja the Red	000001	1
000002	2014-08-08	Sonja the Red	000005	2
000002	2014-08-08	Sonja the Red	000202	1

Obviously if you have a single element in the nested structure, no denormalisation happens. Due to this denormalisation you probably want to use this property in conjunction with the multipleRows or children properties.

Note

In such scenarios you will generally want to have one of the nested fields "take the main role", i.e. have its value fill a column bearing the name of TYPO3 column which contains the substructure configuration. In the above example, the product field is matched to the "products" column name. In such a case, this nested field will go through any transformations defined for the column.

If you need to apply transformations to other substructure fields, map them to additional fields. In order for this to work, you need to write some value into each additional field, otherwise it will result in a configuration error. So you need to set some dummy value, that is overridden by the values pointed to by the substructureFields configuration, but take care that if such a value is missing, the dummy value will remain and may produce unwanted results, depending on the rest of your configuration.

Scope

Handle data

multipleRows

Type

boolean

Description

Set to true if you have denormalized data. This will tell the import process that there may be more than one row per record to import and that all values for the given column must be gathered and collapsed into a comma-separated list of values. See the Mapping data chapter for explanations about the impact of this flag.

If these values need to be sorted, use the multipleSorting property.

Scope

Store data

multipleSorting

Type: string
Description: If the multipleRows need to be sorted, use this property to name the field which should be used for sorting. This can be any of the mapped fields, additional fields or substructure fields.

Note

The sorting is done using the PHP function strnatcasecmp(), so make sure that your data plays well with it.
Scope: Store data

children

Type: array (see Children records configuration)
Description: This property makes it possible to create nested structures and import them in one go. This may typically be "sys_file_reference" records for a field containing images. This should be used anytime you are using a MM table into which you need to write specific properties (like "sys_file_reference"). For simple MM tables (like "sys_category_record_mm"), you don't need to create this children sub-structure for the MM table. It is enough to gather a comma-separated list of "sys_category" primary keys.
Scope: Store data

transformations

Type

array (see Transformations configuration)

Description

Array of transformation properties. The transformations will be executed as ordered by their array keys.

Example:

$GLOBALS['TCA']['fe_users']['columns']['starttime']['external'] = [
 0 => [
    'field' => 'start_date',
    'transformations' => [
       20 => [
          'trim' => true
       ],
       10 => [
          'userFunction' => [
             'class' => \Cobweb\ExternalImport\Transformation\DateTimeTransformation::class,
             'method' => 'parseDate'
          ]
       ]
    ]
 ]
];

The "userFunction" will be executed first (10) and the "trim" next (20).

Scope

Transform data

xmlValue

Type: boolean
Description: When taking the value of a node inside a XML structure, the default behaviour is to retrieve this value as a string. If the node contained a XML sub-structure, its tags will be stripped. When setting this value to true, the XML structure of the child nodes is preserved.
Scope: Handle data (XML)

disabledOperations

Type

array

Description

Comma-separated list of database operations from which the column should be excluded. Possible values are "insert" and "update".

See also the general property disabledOperations.

Scope

Store data

Additional fields configuration

Additional fields are fields that are read from the external source but not saved to the database. They do not match TCA columns. They are most likely used in user functions and custom steps to prepare some other data, but are not persisted in the TYPO3 database.

Since External Import 5.0, additional fields are defined in their own "configuration space":

$GLOBALS['TCA']['tx_externalimporttest_tag'] = [
   'external' => [
      'additionalFields' => [
         0 => [
            'quantity' => [
               'field' => 'qty'
            ]
         ]
      ]
   ],
];

As usual the index (here 0) must match between the general configuration, the columns configuration and the additional fields configuration.

In the above example the "qty" field from the external data will be read and stored in the "quantity" column, which will be available for any processing, but not saved to the database.

All properties from the columns configuration can be used with additional fields too (although some may not make sense).

Tip

Technically speaking the additional fields and columns configuration are merged. Additional fields are marked with a special flag that tells External Import not to save them. Any column can be marked (or unmarked) as such programmatically by invoking \Cobweb\ExternalImport\Domain\Model\Configuration::setExcludedFromSavingFlagForColumn().

Transformations configuration

A number of properties relate to transforming the data during the import process. All of these properties are used during the "Transform data" step. They are sub-properties of the transformations property.

Properties

Property	Data type	Step/Scope
isEmpty	array	Transform data
mapping	Mapping configuration	Transform data
rteEnabled	boolean	Transform data
trim	boolean	Transform data
userFunction	array	Transform data
value	simple type (string, integer, float, boolean)	Transform data

mapping

Type: Mapping configuration
Description: This property can be used to map values from the external data to values coming from some internal table. A typical example might be to match 2-letter country ISO codes to the uid of the "static_countries" table.
Scope: Transform data

value

Type

Simple type (string, integer, float, boolean)

Description

With this property, it is possible to set a fixed value for a given field. For example, this might be used to set a flag for all imported records. Or you might want to use different types for different import sources.

Note

Since External Import 7.1, the column property value should be used instead, as it makes more sense. Since there could be scenarios where this transformation property also makes sense, it is not deprecated, but its usage should be avoided.

Example:

EXT:my_extension/Configuration/Overrides/tx_sometable.php

$GLOBALS['TCA']['tx_sometable'] = array_replace_recursive($GLOBALS['TCA']['tx_sometable'],
[
  // ...
    'columns' => [
        'type' => [
            'external' => [
                0 => [
                    'transformations' => [
                        10 => [
                            // Default type
                            'value' => 0
                        ]
                    ],
                ],
                'another_import' => [
                    'transformations' => [
                        10 => [
                            // Another type
                            'value' => 1
                        ]
                    ],
                ]
            ]
        ],
     // ...
    ],
]);

Scope

Transform data

trim

Type: boolean
Description: If set to true, every value for this column will be trimmed during the transformation step.

Note

With newer versions of PHP, trying to trim a non-string causes an error. To account for that, since External Import 6.0.1, non-string data is left unchanged by this transformation. This may cause changes in your import, as previously the data used to be cast on the fly and trimmed.

If you are affected by this change, you should create a custom transformation with a userFunction to cast your data explicitly before calling trim.
Scope: Transform data

rteEnabled

Type: boolean
Description: If set to true when importing HTML data into a RTE-enable field, the imported data will go through the usual RTE transformation process on the way to the database.

Note

Since the data goes through the RTE transformation process, you should mind the settings of the RTE for the given field if the results are unexpected. This is particularly true for tags which are not inside other tags and need to be explicitly allowed using the allowTagsOutside option for example (see the RTE configuration reference).
Scope: Transform data

userFunction

Type

array

Description

This property can be used to define a function that will be called on each record to transform the data from the given field. See example below.

Example

Here is a sample setup referencing a user function:

$GLOBALS['TCA']['fe_users']['columns']['starttime']['external'] = [
 0 => [
    'field' => 'start_date',
    'transformations' => [
       10 => [
          'userFunction' => [
             'class' => \Cobweb\ExternalImport\Transformation\DateTimeTransformation::class,
             'method' => 'parseDate'
          ]
       ]
    ]
 ]
];

The definition of a user function takes three parameters:

class: (string) Required. Name of the class to be instantiated.
method: (string) Required. Name of the method that should be called.
parameters (formerly "params"): (array) Optional. Can contain any number of data, which will be passed to the method. This used to be called "params". Backwards-compatibility is ensured for now, but please update your configuration as soon as possible.

In the example above we are using a sample class provided by External Import that can be used to parse a date and either return it as a timestamp or format it using either of the PHP functions date() or strftime() .

Note

Since External Import 5.1.0, if the user function throws an exception while handling a value, that value will be unset and thus removed from the imported dataset. The rationale is that such a value is considered invalid and should not be further processed nor saved to the database.

The user function can also specifically throw the \Cobweb\ExternalImport\Exception\InvalidRecordException. The effect is to remove the entire record from the imported dataset.

For more details about creating a user function, please refer to the Developer's Guide.

Scope

Transform data

isEmpty

Type

array

Description

This property is used to assess if a value in the given column can be considered empty or not and, if yes, act on it. The action can be either to set a default value or to remove the entire record from the imported dataset.

Deciding whether a given value is "empty" is a bit tricky, since null, false, 0 or an empty string - to name a few - could all be considered empty depending on the circumstances. By default, this property will rely on the PHP function empty(). However it is also possible to evaluate an expression based on the values in the record using the Symfony Expression Language.

expression

(string) A condition using the Symfony Expression Language syntax. If it evaluates to true, the action (see below) will be triggered. The values in the record can be used, by simply referencing them with the column name.

If no expression is defined, the PHP function empty() is used.

See the Symfony documentation for reference.

invalidate

(bool) Set this property to true to discard the entire record from the imported dataset if the expression (or empty()) evaluated to true. invalidate takes precedence over default.

default

(mixed) If the expression (or empty()) evaluates to true, this value will be set in the record instead of the empty value.

Example

'store_code' => [
    'exclude' => 0,
    'label' => 'Code',
    'config' => [
        'type' => 'input',
        'size' => 10
    ],
    'external' => [
        0 => [
            'field' => 'code',
            'transformations' => [
                10 => [
                    'trim' => true
                ],
                20 => [
                    'isEmpty' => [
                        'expression' => 'store_code === ""',
                        'invalidate' => true
                    ]
                ],
            ]
        ]
    ]
],

In this example, the store_code field is compared with an empty string. Any record with an empty string in that column will be removed from the dataset.

Note

Since you can write any expression as long as it evaluates to a boolean value, this property actually makes it possible to test another condition than just emptiness, although it may be confusing to use it in this way.

Warning

There's a weird behavior in the Symfony Expression Language: if the value being evaluated is missing from the record, the parser throws an error as if the syntax were invalid. The workaround implemented in External Import is that an evaluation throwing an exception is equivalent to the evaluation returning true. This makes it possible to handle missing values, but has the drawback that a real syntax error will not be detected and all values will be considered empty.

Such events are logged (at notice-level).

This does not happen anymore with symfony/expression-language 7.2 or above. Also, with symfony/expression-language 6.x or above, it is possible to use the coalesce operator, which will prevent the above-mentioned error.

Mapping configuration

The external values can be matched to values from an existing TYPO3 CMS table, using the "mapping" property, which has its own set of properties. They are described below.

Properties

Property	Data type
default	mixed
matchMethod	string
matchSymmetric	boolean
multipleValuesSeparator	string
referenceField	string
table	string
valueField	string
valueMap	array
whereClause	string

table

Type: string
Description: Name of the table to read the mapping data from.
Scope: Transform data

referenceField

Type: string
Description: Name of the field against which external values must be matched.

Note

SQL functions may be used here. Example: 'CONCAT(first_name, \' \', last_name)'.
Scope: Transform data

valueField

Type: string
Description: Name of the field to take the mapped value from. If not defined, this will default to "uid".

Note

SQL functions may be used here. Example: 'CONCAT(first_name, \' \', last_name)'.
Scope: Transform data

whereClause

Type

string

Description

SQL condition (without the "WHERE" keyword) to apply to the referenced table. This is typically meant to be a mirror of the foreign_table_where property of select-type fields.

However only one marker is supported in this case: ###PID_IN_USE### which will be replaced by the current storage pid. So if you have something like:

'foreign_table_where' => 'AND pid = ###PAGE_TSCONFIG_ID###'

in the TCA for your column, you should replace the marker by a hard- coded value instead for external import, e.g.

'whereClause' => 'pid = 42'

Important

The clause must start with neither the "WHERE", nor the "AND" keyword.

Scope

Transform data

default

Type

mixed

Description

Default value that will be used when a value cannot be mapped. Otherwise the field is unset for the record.

Note

This is quite important when mapping MM relations. If an existing item has currently relations in the TYPO3 database, but not any longer in the data to be imported, the existing MM relations will not be removed if the field is unset. In such a case, make sure to use an empty string for the default value, as this will tell the DataHandler that it has to remove the existing MM relations.

Example

$GLOBALS['TCA']['tx_externalimporttest_product']['columns']['categories']['external']['base'] = [
     'xpath' => './self::*[@type="current"]/category',
     'transformations' => [
          10 => [
               'mapping' => [
                    'table' => 'sys_category',
                    'referenceField' => 'external_key',
                    'default' => ''
               ]
          ]
     ]
];

Scope

Transform data

valueMap

Type: array
Description: Fixed hash table for mapping. Instead of using a database table to match external values to internal values, this property makes it possible to use a simple list of key-value pairs. The keys correspond to the external values.
Scope: Transform data

multipleValuesSeparator

Type

string

Description

Set this property if the field to map contains several values, separated by some symbol (for example, a comma). The values will be split using the symbol defined in this property and each resulting value will go through the mapping process.

This makes it possible to handle 1:n or m:n relations, where the incoming values are all stored in the same field.

Note

This property does nothing when used in combination with the MM property, because we expect normalized data with one and denormalized data with the other. The chapter about mapping data hopefully helps understand this.

Scope

Transform data

matchMethod

Type

array

Description

Value can be "strpos" or "stripos".

Normally mapping values are matched based on a strict equality. This property can be used to match in a "softer" way. It will match if the external value is found inside the values pointed to by the referenceField property. "strpos" will perform a case-sensitive matching, while "stripos" is case-unsensitive.

Caution should be exercised when this property is used. Since the matching is less strict it may lead to false positives. You should review the data after such an import.

Note

It is important to understand how the matchMethod property influences the matching process. Consider trying to map freely input country names to the static_countries table inside TYPO3 CMS. This may not be so easy depending on how names were input in the external data. For example, "Australia" will not strictly match the official name, which is "Commonwealth of Australia". However setting matchMethod to "strpos" will generate a match, since "Australia" can be found inside "Commonwealth of Australia"

Scope

Transform data

matchSymmetric

Type: boolean
Description: This property complements matchMethod. If set to true, the import process will not only try to match the external value inside the mapping values, but also the reverse, i.e. the mapping values inside the external value.
Scope: Transform data

Examples

Simple mapping

Here's an example TCA configuration.

$GLOBALS['TCA']['fe_users']['columns']['tx_externalimporttut_department']['external'] = [
    0 => [
        'field' => 'department',
        'mapping' => [
            'table' => 'tx_externalimporttut_departments',
            'referenceField' => 'code'
        ]
    ]
];

The value found in the "department" field of the external data will be matched to the "code" field of the "tx_externalimporttut_departments" table, and thus create a relation between the "fe_users" and the "tx_externalimporttut_departments" table.

Mapping multiple values

This second example demonstrates usage of the multipleValuesSeparator property.

The incoming data looks like:

<catalogue>
    <products type="current">
        <item sku="000001">Long sword</item>
        <tags>attack,metal</tags>
    </products>
    <products type="obsolete">
        <item index="000002">Solar cream</item>
    </products>
    <products type="current">
        <item sku="000005">Chain mail</item>
        <tags>defense,metal</tags>
    </products>
    <item sku="000014" type="current">Out of structure</item>
</catalogue>

and the external import configuration like:

$GLOBALS['TCA']['tx_externalimporttest_product']['columns']['tags']['external'] = [
  'base' => [
      'xpath' => './self::*[@type="current"]/tags',
      'transformations' => [
           10 => [
                'mapping' => [
                     'table' => 'tx_externalimporttest_tag',
                     'referenceField' => 'code',
                     'multipleValuesSeparator' => ','
                ]
           ]
      ]
  ]
];

The values in the <tags> nodes will be split on the comma and each will be matched to a tag from "tx_externalimporttest_tag" table, using the "code" field for matching.

This example is taken from the "externalimport_test" extension.

Child records configuration

The "children" property is used to create nested structures, generally MM tables where additional information needs to be stored.

Note

This corresponds to inline-type ("IRRE") fields with MM tables having a primary key field ("uid").

See the Mapping data chapter for an overview of import scenarios which may help understand this feature.

Example:

$GLOBALS['TCA']['tx_externalimporttest_product']['columns']['pictures']['external'] = [
   'base' => [
        'field' => 'Pictures', // remote db field
        'transformations' => [
            10 => [
                'userFunction' => [
                    'class' => \Cobweb\ExternalImport\Transformation\ImageTransformation::class,
                    'method' => 'saveImageFromUri',
                    'parameters' => [
                        'storage' => '1:importedpictures', // local folder for files
                    ]
                ]
            ]
        ],
        'children' => [
            'table' => 'sys_file_reference',
            'columns' => [
                'uid_local' => [
                    'field' => 'pictures'
                ],
                'uid_foreign' => [
                    'field' => '__parent.id__'
                ],
                'title' => [
                    'field' => 'picture_title'
                ],
                'tablenames' => [
                    'value' => 'tx_externalimporttest_product'
                ],
                'fieldname' => [
                    'value' => 'pictures'
                ],
                'table_local' => [
                    'value' => 'sys_file'
                ]
            ],
            'sorting' => [
                'source' => 'picture_order',
                'target' => 'sorting_foreign'
            ],
            'controlColumnsForUpdate' => 'uid_local, uid_foreign, tablenames, fieldname, table_local',
            'controlColumnsForDelete' => 'uid_foreign, tablenames, fieldname, table_local'
        ]
       ...
   ]
]

Properties

Property	Data type	Step/Scope
columns	array	Store data
controlColumnsForUpdate	string	Store data
controlColumnsForDelete	string	Store data
disabledOperations	string	Store data
sorting	array	Store data
table	string	Store data

table

Type: string
Description: Name of the nested table. This information is mandatory.
Scope: Store data

columns

Type

array

Description

List of columns (database fields) needed for the nested table. This is an associative array, using the column name as the key. Then each column must have one of two properties:

value

This is a simple value that will be used for each entry into the nested table. Use it for invariants like the "tablenames" field of a MM table.

field

This is the name of a field that is available in the imported data. The value is copied from the current record. Note that such fields can be any of the mapped columns, any of the additionalFields or any of the substructureFields.

The special value __parent.id__ refers to the primary key of the current record and will typically be used for "uid_local" or "uid_foreign" fields in MM tables, depending on how the relation is built.

Scope

Store data

controlColumnsForUpdate

Type

string

Description

Comma-separated list of columns that need to be used for checking if a child record already exists. All these columns must exist in the list of columns defined above. Defining this property ensures that existing relations are updated instead of being created anew.

This list should contain all columns that are significant for identifying a child record without ambiguity. In the example above, we have:

'controlColumnsForUpdate' => 'uid_local, uid_foreign, tablenames, fieldname, table_local',

These are all the columns that need to be queried in the "sys_file_reference" table to be sure that we are targeting the right record in the database. Any missing information might mean retrieving another record (for a different table or field, or whatever).

Note

If this property is not defined, all children records will be considered to be new. If controlColumnsForDelete is defined and the "delete" operation is not disabled, all existing child relations will be deleted upon each import.

Scope

Store data

controlColumnsForDelete

Type: string
Description: This is similar to controlColumnsForUpdate but for finding out which existing relations are no longer relevant and need to be deleted. It is not the same list of fields as you need to leave out the field which references the relation on the "other side". In the case of "sys_file_reference", you would leave out "uid_local", which is the reference to the "sys_file" table.

Note

If this property is not defined, existing children records will not be checked and thus never be deleted.
Scope: Store data

sorting

Type

array

Description

External Import stores child records in the order in which they appear, which is generally the order in which they are in the external data source. It may be needed to sort the child records differently, according to some other data available in the in the external source.

This property allows this. It is defined by two elements:

source

The name of the column containing the sorting value in the external data source. This column should ideally contain numerical values. If that is not the case, the values are cast to integer when they are used, so you need to make sure that the values contained in this column can be cast safely.

If the sorting value is missing for some records, a value of 0 will be used instead, putting those child records at the top of the list.

target

The name of the sorting field in the child record table.

Both elements are mandatory. Configuration validation will fail otherwise.

'sorting' => [
    'source' => 'picture_order',
    'target' => 'sorting_foreign'
],

Scope

Store data

disabledOperations

Type: string
Description: Comma-separated list of operations which should not take place. This can be "insert" (no new child records), "update" (no update to existing child records) and/or "delete" (no removal of existing child records).

Note

This applies only when a parent record is being updated. When a parent record is being created, it does not make sense to forbid creation of its child records.
Scope: Store data

Array Path configuration

Introduction

The "arrayPath" property, which can apply to both the general configuration and the columns configuration has several options which can make it tricky to use once you try more complicated scenarios. Thus this dedicated chapter.

This property is like a path pointing some specific part of a multidimensional array. The different parts of the path are separated by some marker, itself defined by the arrayPathSeparator property. if "arrayPathSeparator" is not set, the separator defaults to /.

Examples

As a simple example, consider the following structure to import:

[
   'name' => 'Zaphod Beeblebrox',
   'book' => [
      'title' => 'Hitchiker\'s Guide to the Galaxy'
   ]
]

To import the title of the book (and not the book itself), use the following configuration:

[
   'arrayPath' => 'book/title'
]

Note

At column-level, using 'arrayPath' => 'book' is equivalent to using 'field' => 'book', but the "field" property should be preferred in such a case, as it requires less processing.

If, for some reason, you needed a different separator, you could use something like:

[
   'arrayPath' => 'book#title',
   'arrayPathSeparator' => '#'
]

It is perfectly okay to use numerical indices in the path. With this structure:

[
   'series' => 'Hitchiker\'s Guide to the Galaxy',
   'books' => [
      'The Hitchiker\'s Guide to the Galaxy',
      'The Restaurant at the End of the Universe',
      'So long, and thanks for all the Fish'
      // etc.
   ]
]

and this configuration:

[
   'arrayPath' => 'books/0'
]

The result will be "The Hitchiker's Guide to the Galaxy". It is always the first element inside "books" that will be selected.

Conditions

Conditions can be applied to each segment of the path using the Symfony Expression Language syntax, wrapped in curly braces. If the value being tested is an array, its items can be accessed directly in the expression. If the value is a simple type, it can be accessed in the expression with the key value.

See the Symfony documentation for reference on the Symfony Expression Language syntax.

Examples

With the following data to import:

[
   'name' => 'Zaphod Beeblebrox',
   'book' => [
      'state' => 'new',
      'title' => 'Hitchiker\'s Guide to the Galaxy'
   ]
]

let's imagine two scenarios. First, we want to get the name of the character, but only if it's "Zaphod Beeblebrox". The configuration would be:

[
   'arrayPath' => 'name{value === \'Zaphod Beeblebrox\'}'
]

When the name is indeed "Zaphod Beeblebrox", the result will be "Zaphod Beeblebrox" too. When the name is anything else, the result will be null.

A second scenario is to take the title of the book, only if the book is new. That would be achieved with a configuration like:

[
   'arrayPath' => 'book{state === \'new\'}/title'
]

With the above data, the result will be "Hitchiker's Guide to the Galaxy", but for a book whose state is "used", the result would be null.

Such usage of conditions may seem a bit far-fetched at first, but can be quite interesting when combined (at a later stage in the import process) with the isEmpty property. However conditions are much more interesting for looping on substructures and filtering them, as described next.

Looping and filtering

The special segment * can be included in the path. It indicates that all values selected up to that point should be looped on and the condition following the * applied to each of them (the * without a condition is useful when wanting to loop on an array with numerical indices). This will effectively filter the currently selected elements. Further segments in the path are applied only to that resulting set.

Note

Using * as a segment will always result in an array, which can be explored with further segments or flattened, if it contains a single result.

Usage of special segment * can be followed by usage of special segment ., which changes the way the selected elements are handled. This is better explained by using examples.

Examples

Let's consider the following structure to import:

[
    'test' => [
        'data' => [
            0 => [
                'status' => 'valid',
                'list' => [
                    0 => 'me',
                    1 => 'you'
                ]
            ],
            1 => [
                'status' => 'invalid',
                'list' => [
                    4 => 'we'
                ]
            ],
            2 => [
                'status' => 'valid',
                'list' => [
                    3 => 'them'
                ]
            ]
        ]
    ]
]

And let's say that we want to have all the items that are inside the "list" key, but only when the "status" is "valid". We would use the following configuration:

[
   'arrayPath' => 'test/data/*{status === \'valid\'}/list'
]

which would result in:

[
    0 => 'me',
    1 => 'you',
    2 => 'them'
]

This may not seem very intuitive at first. This is because this feature was designed to mimic what you might get from a XML structure with a XPath query. Consider the following structure:

<books>
   <book>
      <title>Foo</title>
      <authors>
         <author>A</author>
         <author>B</author>
      </authors>
   </book>
   <book>
      <title>Bar</title>
      <authors>
         <author>C</author>
      </authors>
   </book>
</books>

With an XPath like //author, you would get values "A", "B" and "C" in a single list, no matter what context surrounds them.

If you need to preserve the structure of the elements matched, you can add the special segment . after the * segment. This preserves the matched structure, to which you can apply further path segments. The above example would be modified as such:

[
   'arrayPath' => 'test/data/*{status === \'valid\'}/./list'
]

which changes the result to:

[
    0 => [
        0 => 'me',
        1 => 'you'
    ],
    1 => [
        3 => 'them'
    ]
]

If we change the structure to import to this:

[
    'test' => [
        'data' => [
            0 => [
                'status' => 'invalid',
                'list' => [
                    0 => 'me',
                    1 => 'you'
                ]
            ],
            1 => [
                'status' => 'invalid',
                'list' => [
                    4 => 'we'
                ]
            ],
            2 => [
                'status' => 'valid',
                'list' => [
                    3 => 'them'
                ]
            ]
        ]
    ]
]

making the first entry also "invalid" and using the same first condition:

[
   'arrayPath' => 'test/data/*{status === \'valid\'}/list'
]

we will have a single result:

[
    0 => 'them'
]

When we know that we have such a scenario, it might be convenient to get the actual value as a result (i.e. "them") rather than a single-entry array. This is where property arrayPathFlatten can be used. Modifying the configuration to:

[
   'arrayPath' => 'test/data/*{status === \'valid\'}/list',
   'arrayPathFlatten' => true
]

changes the result to simply:

'them'

Log cleanup

The log table can be cleaned up automatically using the Table garbage collection Scheduler task.

A new entry for that task can be created with the following options:

Table to clean up: tx_externalimport_domain_model_log
Delete entries older than given number of days: 30 (default)

A pre-configuration exists in the ext_localconf.php file with a configuration of 180 days.

If you run a lot of imports, make sure that this table is cleaned up regularly.

Developer's Guide

Available APIs

This chapter describes the various APIs and data models existing in this extension and which might be of use to developers.

Import API

As mentioned earlier, External Import can be used from within another piece of code, just passing it data and benefiting from its mapping, transformation and storing features.

It is very simple to use this feature. You just need to assemble data in a format that External Import can understand (XML structure or PHP array) and call the appropriate method. All you need is an instance of class \Cobweb\ExternalImport\Importer and a single call.

$importer = \TYPO3\CMS\Core\Utility\GeneralUtility::makeInstance(\Cobweb\ExternalImport\Importer::class);
$messages = $importer->import($table, $index, $rawData);

The call parameters are as follows:

Name	Type	Description
$table	string	Name of the table to store the data into.
$index	integer	Index of the relevant external configuration.
$rawData	mixed	The data to store, either as XML (string) or PHP array.

The result is a multidimensional array of messages. The first dimension is a status and corresponds to the \TYPO3\CMS\Core\Messaging\AbstractMessage::ERROR, \TYPO3\CMS\Core\Messaging\AbstractMessage::WARNING and \TYPO3\CMS\Core\Messaging\AbstractMessage::OK constants. The second dimension is a list of messages. Your code should handle these messages as needed.

Data Model

The data that goes through the import process is encapsulated in the \Cobweb\ExternalImport\Domain\Model\Data class. This class contains two member variables:

rawData

The data as it is read from the external source or as it is passed to the import API. Given the current capacities of External Import, this may be either a string representing a XML structure or a PHP array.

extraData

An array available for anyone to write into and read from. This is some kind of storage space where any type of data can be stored and passed from step to step.

On top of the usual getter and setter, use addExtraData($key, $data) to add some data to this array using the defined array key.

records

The data as structured by External Import, step after step.

downloadable

Indicates whether the records variable contains data that is appropriate for downloading as CSV. The download feature is available in the preview mode of the backend module.

There are getters and setters for each of these.

Configuration Model

Whenever an import is run, the corresponding TCA configuration is loaded into an instance of the \Cobweb\ExternalImport\Domain\Model\Configuration class. The main member variables are:

table: The name of the table for which data is being imported.
index: The index of the configuration being used.
generalConfiguration: The general part of the External Import TCA configuration.
columnConfiguration: The columns configuration part of the External Import TCA configuration.
additionalFields: Array containing the list of additional fields. This should be considered a runtime cache for an often requested property.
countAdditionalFields: Number of additional fields. This is also a runtime cache.
steps: List of steps the process will go through. When the External Import configuration is loaded, the list of steps is established, based on the type of import (synchronized or via the API) and any custom steps. This ensures that custom steps are handled in a single place.
connector: The Configuration object also contains a reference to the Connector service used to read the external data, if any.

There are getters and setters for each of these.

Furthermore the setExcludedFromSavingFlagForColumn() method makes it possible to programmatically exclude (or re-include) a field from being saved to the database. By default, all additional fields are excluded. Using this method should not be necessary is most normal usage scenarios.

The Importer class

Beyond the import() method mentioned above the \Cobweb\ExternalImport\Importer class also makes a number of internal elements available via getters:

getExtensionConfiguration: Get an array with the unserialized extension configuration.
getExternalConfiguration: Get the current instance of the Configuration model.
setContext/getContext: Define or retrieve the execution context. This is mostly informative and is used to set a context for the log entries. Expected values are "manual", "cli", "scheduler" and "api". Any other value can be set, but will not be interpreted by the External Import extension. In the Log module, such values will be displayed as "Other".

Warning

setContext/getContext is deprecated. Use setCallType/getCallType instead.
setCallType/getCallType: Define or retrieve the execution context. This is based on the \Cobweb\ExternalImport\Enum\CallType enumeration. It is normally set by External Import itself, but can be set from the outside, especially when using External Import as an API (in which case, the call type should set to \Cobweb\ExternalImport\Enum\CallType::Api).
setDebug/getDebug: Define or retrieve the debug flag. This makes it possible to programatically turn debugging on or off.
setVerbose/getVerbose: Define or retrieve the verbosity flag. This is currently used only by the command-line utility for debugging output.

and a few more which are not as significant and can be explored by anyone interested straight in the source code.

For reporting, the \Cobweb\ExternalImport\Importer class also provides the addMessage() method which takes as arguments a message and a severity (using the constants of the \TYPO3\CMS\Core\Messaging\AbstractMessage class).

The call context

External Import may be called in various contexts (command line, Scheduler task, manual call in the backend or API call). While the code tries to be as generic as possible, it is possible to hit some limits in some circumstances. The "call context" classes have been designed for such situations.

A call context class must inherit from \Cobweb\ExternalImport\Context\AbstractCallContext and implement the necessary methods. There is currently a single method called outputDebug() which is supposed to display some debug output. Currently a specific call context exists only for the command line and makes it possible to display debugging information in the Symfony console.

The reporting utility

The \Cobweb\ExternalImport\Utility\ReportingUtility class is in charge of giving feedback in various contexts, lik sending an email once a synchronization is finished.

It provides a generic API for storing values from Step classes that could make sense in terms of reporting. Currently this is used only by the \Cobweb\ExternalImport\Step\StoreDataStep class which reports on the number of operations performed (inserts, updates, deletes and moves).

Note

These values are not used for any reporting for now. The number of updates is used in functional tests. Improved reporting could ensue in the future.

User functions

The external import extension can call user functions for any field where external data is imported. Some sample functions are provided in Classes/Transformation/DateTimeTransformation.php and Classes/Transformation/ImageTransformation.php.

Basically, the function receives three parameters:

Name	Type	Description
$record	array	The complete record being handled. This makes it possible to refer to other fields of the same record during the transformation, if needed.
$index	string	The key of the field to transform. Modifying other fields in the record is not possible since the record is passed by value and not by reference. Only the field corresponding to this key should be transformed and returned.
$parameters	array	Additional parameters passed to the function. This will be very specific to each function and can even be completely omitted. External import will pass an empty array to the user function if the "parameters" property is not defined.

The function is expected to return only the value of the transformed field.

Warning

The record received as input into the user function has already gone through the renaming the fields. That means the names of the fields are not those of the external data, but those of the TYPO3 CMS fields.

If unsure, use the Preview mode to look at the results of the Handle Data step.

The class containing the user function may implement the \Cobweb\ExternalImport\ImporterAwareInterface (using the \Cobweb\ExternalImport\ImporterAwareTrait or not). In such a case, it will have access to the Importer instance simply by using $this->getImporter(). In particular, this makes it possible for user functions to check if the current run is operating in preview mode or in debug mode.

The function may throw the special exception \Cobweb\ExternalImport\Exception\CriticalFailureException. This will cause the "Transform Data" step to abort. More details in the chapter about critical exceptions.

The function may also throw the special exception \Cobweb\ExternalImport\Exception\InvalidRecordException. The related record will be removed from the imported dataset.

The function may throw any other kind of exception if the transformation it is supposed to apply to the value it receives fails. This will trigger the removal of this value from the imported dataset, thus avoiding that it be further processed and eventually saved to the database.

Events

Import-related events

Several events are triggered during the external import process in order to provide entry points for custom actions, improving the flexibility of the whole tool. Some events are not triggered when running in preview mode.

All events may throw the special exception \Cobweb\ExternalImport\Exception\CriticalFailureException. This will cause their "parent" step to abort. More details in the chapter about critical exceptions. Any other exception will just be logged (depending on your logging configuration).

For usage, see the core documentation about PSR-14 events.

Change configuration before run

class ChangeConfigurationBeforeRunEvent

Fully qualified name: \Cobweb\ExternalImport\Event\ChangeConfigurationBeforeRunEvent

This event makes it possible to change the External Import at run-time, just before any of the import steps are executed. Since the \Cobweb\ExternalImport\Domain\Model\Configuration model class does not have accessor methods for every single property of the External Import configuration, it is not possible to change a single property.

Basically, the \Cobweb\ExternalImport\Domain\Model\Configuration object keeps a copy of the general configuration, the additional fields configuration and the columns configuration. The best way to change any of them is to retrieve that raw copy, modify it and set it again using the accessor methods provided by the event.

getImporter ( ): Current instance of \Cobweb\ExternalImport\Importer.

getConfiguration ( ): Current instance of \Cobweb\ExternalImport\Domain\Model\Configuration.

setGeneralConfiguration ( ): Used to set the modified version of the general configuration, previously retrieved using $this->getConfiguration()->getRawGeneralConfiguration().

setAdditionalFieldsConfiguration ( ): Used to set the modified version of the additional fields configuration, previously retrieved using $this->getConfiguration()->getRawAdditionalFieldsConfiguration().

setColumnsConfiguration ( ): Used to set the modified version of the columns configuration, previously retrieved using $this->getConfiguration()->getRawColumnsConfiguration().

Process connector parameters

class ProcessConnectorParametersEvent

Fully qualified name: \Cobweb\ExternalImport\Event\ProcessConnectorParametersEvent

This allows for dynamic manipulation of the parameters array before it is passed to the connector.

Note

This event is also triggered when displaying the configuration in the BE module. This way the user can see how the processed parameters look like.

getParameters ( ): Returns the connector parameters.

setParameters ( array $parameters): Sets the (modified) connector parameters.

getExternalConfiguration ( ): Instance of \Cobweb\ExternalImport\Domain\Model\Configuration with the current import configuration.

Substructure Preprocess

class SubstructurePreprocessEvent

Fully qualified name: \Cobweb\ExternalImport\Event\SubstructurePreprocessEvent

This event is triggered whenever a data structure is going to be handled by the substructureFields property. It is fired just before the directives defined in the substructureFields property are applied and makes it possible to change the substructure.

getSubstructureConfiguration ( ): Returns the corresponding substructureFields configuration.

getColumn ( ): Returns the name of the column being handled.

getDataType ( ): Returns the type of data being handled ("array" or "xml").

getStructure ( ): Returns the structure being handled.

setStructure ( mixed $structure): Sets the (modified) structure. This must be an array for array-type data or a \DomNodeList for XML-type data. Check the incoming type using the getDataType() method.

getImporter ( ): Current instance of \Cobweb\ExternalImport\Importer.

Update Record Preprocess

class UpdateRecordPreprocessEvent

Fully qualified name: \Cobweb\ExternalImport\Event\UpdateRecordPreprocessEvent

This event is triggered just before a record is registered for update in the database. It is triggered for each record individually.

The event may throw the special exception \Cobweb\ExternalImport\Exception\InvalidRecordException, in which case the record will be removed from the dataset to be saved.

Note

This event listener receives records only from the main table, not from any child table.

getUid ( ): Returns the primary key of the record (since we are talking about an update operation, the record exists in the database and thus has a valid primary key).

getRecord ( ): Returns the record being handled.

setRecord ( array $record): Sets the (modified) record.

getImporter ( ): Current instance of \Cobweb\ExternalImport\Importer.

Insert Record Preprocess

class InsertRecordPreprocessEvent

Fully qualified name: \Cobweb\ExternalImport\Event\InsertRecordPreprocessEvent

Similar to the "Update Record Preprocess" event above, but for the insert operation. It may also throw \Cobweb\ExternalImport\Exception\InvalidRecordException.

Note

This event listener receives records only from the main table, not from any child table.

Delete Record Preprocess

class DeleteRecordsPreprocessEvent

Fully qualified name: \Cobweb\ExternalImport\Event\DeleteRecordsPreprocessEvent

This event is triggered just before any record is deleted. It can manipulate the list of primary keys of records that will eventually be deleted.

Note that even if this event throws the \Cobweb\ExternalImport\Exception\CriticalFailureException, the data to update or insert will already have been saved.

getRecords ( ): Returns the list of records to be deleted (primary keys).

Note

This list of contains only records from the main table, not from any child table.

setRecords ( array $records): Sets the (modified) list of records.

getImporter ( ): Current instance of \Cobweb\ExternalImport\Importer.

Datamap Postprocess

class DatamapPostprocessEvent

Fully qualified name: \Cobweb\ExternalImport\Event\DatamapPostprocessEvent

This event is triggered after all records have been updated or inserted using the TYPO3 Core Engine. It can be used for any follow-up operation. The event has the following API:

Note that even if this event throws the \Cobweb\ExternalImport\Exception\CriticalFailureException, the data to update or insert will already have been saved.

Note

This event is not triggered in preview mode.

getData ( ): Returns the list of records keyed to their primary keys (including the new primary keys for the inserted records). Each record contains an additional field called tx_externalimport:status with a value of either "insert" or "update" depending on which operation was performed on the record.

Warning

This structure is one-dimensional, which is buggy when multiple tables are handled by the import (when using the "children" property), because records with the same primary key will override each other. Use getStructuredData() instead. Don't use getData() anymore, it will be dropped in the future.

getStructuredData ( )

Returns the list of tables and their records keyed to their primary keys (including the new primary keys for the inserted records). Each record contains an additional field called tx_externalimport:status with a value of either "insert" or "update" depending on which operation was performed on the record. Example:

[
    'pages' => [
        23 => [
            'title' => 'Page title'
        ],
    ],
    'sys_file_reference' => [
        47 => [
            'title' => 'Page image'
        ],
    ],
]

getImporter ( ): Current instance of \Cobweb\ExternalImport\Importer.

Cmdmap Postprocess

class CmdmapPostprocessEvent

Fully qualified name: \Cobweb\ExternalImport\Event\CmdmapPostprocessEvent

This event is triggered after all records have been deleted using the TYPO3 Core Engine. The event has the following API:

Note that even if this event throws the \Cobweb\ExternalImport\Exception\CriticalFailureException, the records will already have been deleted.

Note

This event is not triggered in preview mode.

getData ( ): Returns the list of primary keys of the deleted records.

getImporter ( ): Current instance of \Cobweb\ExternalImport\Importer.

Report

class ReportEvent

Fully qualified name: \Cobweb\ExternalImport\Event\ReportEvent

This event is triggered in the ReportEvent step. It allows for custom reporting. It also triggers the reporting webhook.

getImporter ( ): Current instance of \Cobweb\ExternalImport\Importer.

Reaction-related events

The following events are triggered during the execution of reactions.

Get external key

class GetExternalKeyEvent

Fully qualified name: \Cobweb\ExternalImport\Event\GetExternalKeyEvent

This event is triggered by the "Delete external data" reaction. It makes it possible to retrieve the key to the external data, if it is not stored in the "external_id" field as expected.

getConfiguration ( ): Instance of \Cobweb\ExternalImport\Domain\Model\Configuration with the targeted configuration.

getData ( ): An array with the data for the item to delete.

getExternalKey ( ): Value of the external key before the event is fired. It will be null if the key was not found as expected.

setExternalKey ( ): Use this method to set the value of the external key, once you have performed your custom processing of the data.

Modify response

class ModifyReactionResponseEvent

Fully qualified name: \Cobweb\ExternalImport\Event\ModifyReactionResponseEvent

This event is triggered by both reactions. It makes it possible to modify the response body and/or the response code, before the response is sent back.

getReaction ( ): Instance of \Cobweb\ExternalImport\Reaction\AbstractReaction, which can be an import or a delete reaction.

getConfigurations ( ): Array of \Cobweb\ExternalImport\Domain\Model\ConfigurationKey objects that were called up by the reaction. In the case of the delete reaction, there's always only one configuration object.

getResponseBody ( ): Value of the response body before the event is fired. It is an array containing a success status and one or more sub-array with messages (success, error or warning messages).

setResponseBody ( ): Use this method to set the value of the response body once you have modified it.

getResponseCode ( ): Value of the response code before the event is fired. It will be either 200 or 400, depending on whether the reaction completed successfully or not.

setResponseCode ( ): Use this method to set the value of the response code once you have modified it. It should be a valid HTTP response code.

Interrupting the process: critical exceptions

One exception class plays a particular role: \Cobweb\ExternalImport\Exception\CriticalFailureException. It can be thrown from within a user function or an event and will cause the import process to abort.

The reason for this exception is to react to some critical issue that may happen during the call to a user function or inside an event listener and which affects the whole import process. For example, if you are transforming a date and a single record has an invalid date, you probably don't want to interrupt the whole process for this. You want to record the issue is some way, but not pull the hand brake. On the other hand, say that you are saving some files and the target file storage is not available: you will probably want to stop the process before every record is saved with its related files.

Such exception thrown from within any user function will cause the "Transform Data" step to abort. When thrown from within an event listener it may abort the "Transform Data", the "Handle Data", the "Validate Data" or the "Store Data" steps. For the latter, however, note that data may have already been saved depending on which event listener it is thrown from. Refer to the chapter about events for more details.

Make sure to include a helpful error message when throwing this exception.

Custom process steps

Besides all the events, it is also possible to register custom process steps. How to register a custom step is covered in the Administration chapter. This section describes what a custom step can or should do and what resources are available from within a custom step class.

Parent class

A custom step class must inherit from abstract class \Cobweb\ExternalImport\Step\AbstractStep. If it does not, the step will be ignored during import. The parent class makes a lot of features available some of which are described below.

If you want to use Dependency Injection in your custom step class, just remember to declare it as being public in your service configuration file.

Available resources

A custom step class has access to the following member variables:

data: Instance of the object model encapsulating the data being processed (\Cobweb\ExternalImport\Domain\Model\Data).
importer: Back-reference to the current instance of the \Cobweb\ExternalImport\Importer class.
parameters: Array of parameters declared in the configuration of the custom step.

See the API chapter for more information about these classes.

Furthermore, the custom step class can access a member variable called abortFlag. Setting this variable to true will cause the import process to be aborted after the custom step. Any such interruption is logged by the \Cobweb\ExternalImport\Importer class, albeit without any detail. If you feel the need to report about the reason for interruption, do so from within the custom step class:

$this->getImporter()->addMessage(
     'Your message here...',
     FlashMessage::WARNING // or whatever error level
);

It is also possible to mark a custom step so that it is executed even if the process was aborted by a previous step. This is done by setting the executeDespiteAbort member variable to true in the constructor.

public function __construct() {
    $this->setExecuteDespiteAbort(true);
}

In general, use the getters and setters to access the member variables.

Custom step basics

A custom step class must implement the run() method. This method receives no arguments and returns nothing. All interactions with the process happens via the member variables described above and their API.

The main reason to introduce a custom step is to manipulate the data being processed. To read the data, use:

// Read the raw data or...
$rawData = $this->getData()->getRawData();
// Read the processed data
$records = $this->getData()->getRecords();

Note

Depending on when you custom step happens, there may not yet be any raw nor processed data available.

If you manipulate the data, you need to store it explicitely:

// Store the raw data or...
$this->getData()->setRawData();
// Store the processed data
$this->getData()->setRecords();

Another typical usage would be to interrupt the process entirely by setting the abortFlag variable to true, as mentioned above.

The rich API that is available makes it possible to do many things beyond these. For example, one could imagine changing the External Import configuration on the fly.

In general the many existing Step classes provide many examples of API usage and should help when creating a custom process step.

Preview mode

It is very important that your custom step respects the preview mode. This has two implications:

If relevant, you should return some preview data. For example, the TransformDataStep class returns the import data once transformations have been applied to it, the StoreDataStep class returns the TCE structure, and so on. There's an API for returning preview data:
```
$this->getImporter()->setPreviewData(...);
```
Copied!
The preview data can be of any type.
Most importantly, you must respect the preview mode and not make any persistent changes, like saving stuff to the database. Use the API to know whether preview mode is on or not:
```
$this->getImporter()->isPreview();
```
Copied!
Indicate that the records of the Data object are downloadable if it makes sense (see the Data model API). This is done by overriding the hasDownloadableData() method of the \Cobweb\ExternalImport\Step\AbstractStep class to return true.

Example

Finally here is a short example of a custom step class. Note how the API is used to retrieve the list of records (processed data), which is looped over and then saved again to the Data object.

In this example, the "name" field of every record is used to filter acceptable entries.

Warning

Note the call to array_values() to compact the array again once records have been removed. This is very important to avoid having empty entries in your import.

<?php

declare(strict_types=1);

namespace Cobweb\ExternalimportTest\Step;

use Cobweb\ExternalImport\Step\AbstractStep;

/**
 * Class demonstrating how to use custom steps for external import.
 *
 * @package Cobweb\ExternalimportTest\Step
 */
class TagsPreprocessorStep extends AbstractStep
{

    /**
     * Filters out some records from the raw data for the tags table.
     *
     * Any name containing an asterisk is considered censored and thus removed.
     */
    public function run(): void
    {
        $records = $this->getData()->getRecords();
        foreach ($records as $index => $record) {
            if (strpos($record['name'], '*') !== false) {
                unset($records[$index]);
            }
        }
        $records = array_values($records);
        $this->getData()->setRecords($records);
        $this->getData()->isDownloadable(true);
        // Set the filtered records as preview data
        $this->importer->setPreviewData($records);
    }

    /**
     * Define the data as being downloadable
     *
     * @return bool
     */
    public function hasDownloadableData(): bool
    {
        return true;
    }
}

Custom data handlers

It is possible to use a custom data handler instead of the standard \Cobweb\ExternalImport\Importer::handleArray() and \Cobweb\ExternalImport\Importer::handleXML(). The value declared as a custom data handler is a class name:

$GLOBALS['TCA']['some_table']['external']['general'][0]['data'] = Foo\MyExtension\DataHandler\CustomDataHandler::class;

The class itself must implement the \Cobweb\ExternalImport\DataHandlerInterface interface, which contains only the handleData() method. This method will receive two arguments:

an array containing the raw data returned by the connector service
a reference to the calling \Cobweb\ExternalImport\Importer object

The method is expected to return a simple PHP array, with indexed entries, like the standard methods (\Cobweb\ExternalImport\Importer::handleArray() and \Cobweb\ExternalImport\Importer::handleXML()).

Note

This was not tested by myself (the extension author). It was introduced to answer the particular need to parse large arrays using methods similar to XPath. This would have relied on a library which was not considered stable enough. Having custom data handlers makes it possible.

Dynamic TCA loading

Retrieval of the TCA global array is encapsulated in a class called \Cobweb\ExternalImport\Domain\Repository\TcaDirectAccessRepository which implements the \Cobweb\ExternalImport\Domain\Repository\TcaRepositoryInterface interface. This system pursues three aims:

encapsulating the retrieval of the TCA to simplify following up the evolutions in the TYPO3 Core (like the introduction of the TCA Schema in TYPO3 13).
abstracting into a base class (\Cobweb\ExternalImport\Domain\Repository\AbstractTcaRepository) all the logic for retrieving all External Import-related configuration from the TCA.
allowing developers to perfom dynamic manipulations on the TCA by providing their own TCA repository class through dependency injection. This is detailed below.

Custom TCA repository

Although an event exists for manipulating a single import configuration, it is not unusual to have repetitive import configurations, sometimes implying a dynamic modification of the TCA. For such special cases, it may be useful to provide your own custom implementation of a TCA repository.

The recommended way is to extend the abstract class \Cobweb\ExternalImport\Domain\Repository\AbstractTcaRepository which implements all the methods related to extracting the External Import configurations from the TCA. The only method to implement is getTca(), where you can perform any processing you need. Then simply declare your repository as a service replacing \Cobweb\ExternalImport\Domain\Repository\TcaDirectAccessRepository, by placing in your extension's Services.yaml file the following:

services:
  _defaults:
    autowire: true
    autoconfigure: true
    public: false

  Vendor\ExtName\Import\DynamicTcaRepository:
    decorates: Cobweb\ExternalImport\Domain\Repository\TcaRepositoryInterface
    public: true

Known problems

There are not currently any particular issues, except those already reported. In general please report bugs and improvements at: https://github.com/cobwebch/external_import/issues.

Appendix

The appendix mostly contains information about old versions of External Import and how to perform upgrades.

Upgrading instructions for older versions

Upgrade to 6.3.0

External Import now supports Connector services registered with new system introduced with extension "svconnector" version 5.0.0, while staying compatible with the older versions.

Another small new feature is the possibility to define a storage pid for the imported data on the command line or when creating a Scheduler task, which overrides storage information that might be found in the TCA or in the extension configuration.

Upgrade to 6.2.0

The Substructure Preprocess event is now fired for both array-type and XML-type data (previously, only for array-type data). To know which type of data is being handled, a new getDataType() method is available. The type of structure that must be returned after modfication (by calling setStructure() must be either an array or a \DomNodeList, as opposed to just an array in older versions. Existing event listeners may need to be adapted.

Upgrade to 6.1.0

Records which have no external key set (the value referenced by the referenceUid property) are now skipped in the import. Indeed it makes no sense to import records without such keys, as they can never be updated and - if several are created in a single import run - they will override each other. Still it is a change of behaviour and should be noted.

Upgrade to 6.0.0

All properties that were deprecated in version 5.0.0 were removed and the backwards-compatibility layer was dropped. Please refer to the 5.0.0 upgrade instructions and check if you have applied all changes.

All hooks were marked as deprecated. They will be removed in version 7.0.0. You should migrate your code to use either custom process steps or the newly introduced PSR-14 events. See the hooks chapter for information about how to migrate each hook.

External Import is now configured for using the standard (Symfony) dependency injection mechanism. This means it is not necessary to instantiate the \Cobweb\ExternalImport\Importer class using Extbase's \TYPO3\CMS\Extbase\Object\ObjectManager anymore when using the Importer as an API.

The PHP code was cleaned up as much as possible and strict typing was declared in every class file. This may break your custom code if you were calling public methods without properly casting arguments.

Upgrade to 5.1.0

There is a single change in version 5.1.0 that may affect existing imports: when a user function fails to handle the value it was supposed to transform (by throwing an exception), that value is now removed from the imported dataset. Before that it was left unchanged.

Upgrade to 5.0.0

There are many changes in version 5.0.0, but backwards-compatibility has been provided for all them (except the minor breaking change mentioned below). Please make sure to update your configuration as soon as possible, backwards-compatibility will be dropped in version 5.1.0. Messages for deprecated configuration appear in the backend module when viewing the details of a configuration.

Changes

The general configuration must now be placed in $GLOBALS['TCA'][table-name]['external']['general'] instead of $GLOBALS['TCA'][table-name]['ctrl']['external'].

The "additionalFields" property from the general configuration (and not from the "MM" property) has been moved to its own configuration space. Rather than $GLOBALS['TCA'][table-name]['ctrl']['external'][some-index]['additionalFields] it is now $GLOBALS['TCA'][table-name]['external']['additionalFields'][some-index]. Furthermore, it is no longer a simple comma-separated list of fields, but an array structure with all the same options as standard column configurations. For more details, see the relevant chapter.

The "MM" property is deprecated. It should not be used anymore. Instead the new multipleRows or children properties should be used according to your import scenario.

The "userFunc" property of the transformations configuration has been renamed to userFunction and its sub-property "params" has been renamed "parameters".

If both "insert" and "update" operations are disabled in the general configuration (using the disabledOperations property), External Import will now delete records that were not marked for update (even if the actual update does not take place). Previously, no records would have been deleted, because the entire matching of existing records was skipped.

Accessing the external configuration inside a custom step with $this->configuration or $this->getConfiguration() is deprecated. $this->getImporter()->getExternalConfiguration() instead.

The "scheduler" system extension is required instead of just being suggested.

New stuff

It is possible to import nested structures using the children property. For example, you can now import data into some table and its images all in one go by creating a nested structure for the "sys_file_reference" table.

The multipleRows and multipleSorting properties allow for a much clearer handling of denormalized external sources.

Check out the revamped Mapping data chapter which should hopefully help you get a better picture of what is possible with External Import and how different properties (especially the new ones) can be combined.

Custom steps can now receive an array of arbitrary parameters.

Breaking changes

The \Cobweb\ExternalImport\Step\StoreDataStep class puts the list of stored records into the "records" member variable of the \Cobweb\ExternalImport\Domain\Model\Data object. This used to be a simple list of records for the imported table. Since child tables are now supported, the structure has changed so that there's now a list of records for each table that was imported. The table name is the key in the first dimension of the array. If you were relying on this data in a custom step, you will need to update your code as no backward-compatibility was provided for this change.

Upgrade to 4.1.0

Version 4.1.0 introduces one breaking change. There now exists custom permissions for backend users regarding usage of the backend module. On top of table-related permissions, users must have been given explicit rights (via the user groups they belong to) to perform synchronizations or define Scheduler tasks. See the User rights chapter for more information.

Upgrade to 4.0.0

Importer API changes

The External Import configuration is now fully centralized in a \Cobweb\ExternalImport\Domain\Model\Configuration object. Every time you need some aspect of the configuration, you should get it via the instance of this class rather than through any other mean. The most current use case was getting the name of the current table and index from the \Cobweb\ExternalImport\Importer class, using Importer::getTableName() and Importer::getIndex(). Such methods were deprecated and should not be used anymore. Use instead:

$table = $importer->getExternalConfiguration()->getTable();
$index = $importer->getExternalConfiguration()->getIndex();

The Importer::synchronizeData() method was renamed to Importer::synchronize() and the Importer::importData() method was renamed to Importer::import(). The old methods were kept, but are deprecated.

The Importer::synchronizeAllTables() method should not be used anymore as it does not allow for a satisfying reporting. Instead a loop should be done on all configurations and Importer::synchronize() called inside the loop. See for example \Cobweb\ExternalImport\Command\ImportCommand::execute().

Other deprecated methods are Importer::getColumnIndex() and Importer::getExternalConfig().

The Importer::getExistingUids() method was moved to a new class called \Cobweb\ExternalImport\Domain\Repository\UidRepository (which is a Singleton).

Transformation properties

All column properties that are related to the "Transform data" scope have been grouped into a new property called transformations. This is an ordered array, which makes it possible to use transformation properties several times on the same field (e.g. calling several user functions) and to do that in a precise order. As an example, usage of such properties should be changed from:

$GLOBALS['TCA']['fe_users']['columns']['starttime']['external'] = [
      0 => [
            'field' => 'start_date',
            'trim' => true
            'userFunction' => [
                  'class' => \Cobweb\ExternalImport\Task\DateTimeTransformation::class,
                  'method' => 'parseDate'
            ]
      ]
];

to:

$GLOBALS['TCA']['fe_users']['columns']['starttime']['external'] = [
      0 => [
            'field' => 'start_date',
            'transformations => [
                  10 => [
                        'trim' => true
                  ],
                  20 => [
                        'userFunc' => [
                              'class' => \Cobweb\ExternalImport\Task\DateTimeTransformation::class,
                              'method' => 'parseDate'
                        ]
                  ]
            ]
      ]
];

If you want to preserve "old-style" order, the transformation properties were called in the following order up to version 3.0.x: "trim", "mapping", "value", "rteEnabled" and "userFunc". Also note that "value" was ignored if "mapping" was also defined. Now both will be taken into account if both exist (although that sounds rather like a configuration mistake).

A compatibility layer ensures that old-style transformation properties are preserved, but this is a temporary convenience, which will be removed in the next version. So please upgrade your configurations.

Note

The upgrade wizard from version 3.0.0 has been removed. If you are upgrading from TYPO3 6.2 to TYPO3 8.7, you must go through TYPO3 7.6 first and use the upgrade wizard from External Import 3.0.x before moving on to TYPO3 8.7.

Renamed properties

To continue the move to unified naming conventions for properties started in version 3.0, the mapping and MM properties which had underscores in their names were moved to lowerCamelCase name.

The old properties are interpreted for backwards-compatibility, but this will be dropped in the next major version. The backend module will show you the deprecated properties.

Breaking changes

While all hooks were preserved as is, in the sense that they still receive a back-reference to the \Cobweb\ExternalImport\Importer object, the processParameters hook was modified due to its particular usage (it is called in the backend module, so that processed parameters can be viewed when checking the configuration). It now receives a reference to the \Cobweb\ExternalImport\Domain\Model\Configuration object and not to the \Cobweb\ExternalImport\Importer object anymore. Please update your hooks accordingly.

Upgrade to 3.0.0

The "excludedOperations" column configuration, which was deprecated since version 2.0.0, was entirely removed. The same goes for the "mappings.uid_foreign" configuration.

More importantly the Scheduler task was renamed from tx_externalimport_autosync_scheduler_Task to \Cobweb\ExternalImport\Task\AutomatedSyncTask. As such, existing Scheduler tasks need to be updated. An upgrade wizard is provided in the Install Tool. It will automatically migrate existing old tasks.

The update wizard shows that there are tasks to update

If there are no tasks to migrate, the External Import wizard will simply not show up. Otherwise just click on the "Execute" button and follow the instructions.

Several general TCA configuration properties were renamed, to respect a global lowerCamelCase naming convention. This is the list of properties and how they were renamed:

additional_fields => additionalFields
reference_uid => referenceUid
where_clause => whereClause

Upgrade to 2.0.0

The column configuration "excludedOperations" has been renamed to "disabledOperations", for consistency with the table configuration option. The "excludedOperations" is preserved for now and will log an entry into the deprecation log. You are advised to change the naming of this configuration if you use it, support will be dropped at some point in the future.

Migrating hooks

Warning

All hooks were removed in version 7.0.0. This chapter was preserved for those still using hooks, as it includes replacement instructions. Some hooks should be replaced by custom process steps. For others PSR-14 compliant events have been introduced. See below for each hook.

processParameters

(deprecated)

Warning

Use the Process connector parameters event instead.

This allows for dynamic manipulation of the parameters array before it is passed to the connector.

Example

Let's assume that you are using the CSV connector and that you would like the filename to automatically adjust to the current year. Your parameters could be something like:

'parameters' => [
    'filename' => 'fileadmin/imports/data-%Y.csv'
]

Inside the hook, you could run strftime() on the filename parameter in order to replace "%Y" with the current year.

The hook receives the parameters array as the first argument and a reference to the current configuration object (an instance of class \Cobweb\ExternalImport\Domain\Model\Configuration) as second argument. It is expected to return the full parameters array, even if not modified.

Note

This hook is also used when displaying the configuration in the BE module. This way the user can see how the processed parameters look like.

preprocessRawRecordset

(deprecated)

This hook makes it possible to manipulate the data just after it was fetched from the remote source, but already transformed into a PHP array, no matter what the original format. The hook receives the full recordset and a back-reference to the calling object (an instance of class \Cobweb\ExternalImport\Importer) as parameters. It is expected to return a full recordset too.

This hook may throw the \Cobweb\ExternalImport\Exception\CriticalFailureException.

Note

Since External Import version 4.0.0, use a custom step instead, using after:\Cobweb\ExternalImport\Step\HandleDataStep as a position.

validateRawRecordset

(deprecated)

This hook is called during the data validation step. It is used to perform checks on the nearly raw data (it has only been through "preprocessRawRecordset") and decide whether to continue the import or not. The hook receives the full recordset and a back-reference to the calling object (an instance of class \Cobweb\ExternalImport\Importer) as parameters. It is expected to return a boolean, true if the import may continue, false if it must be aborted. Note the following: if the minimum number of records condition was not matched, the hooks will not be called at all. Import is aborted before that. If several methods are registered with the hook, the first method that returns false aborts the import. Further methods are not called.

This hook may throw the \Cobweb\ExternalImport\Exception\CriticalFailureException.

Note

Since External Import version 4.0.0, use a custom step instead, using after:\Cobweb\ExternalImport\Step\ValidateDataStep as a position (or before: if you want to shortcircuit the default validation process).

preprocessRecordset

(deprecated)

Similar to "preprocessRawRecordset", but after the transformation step, so just before it is stored to the database. The hook receives the full recordset and a back-reference to the calling object (an instance of class \Cobweb\ExternalImport\Importer) as parameters. It is expected to return a full recordset too.

This hook may throw the \Cobweb\ExternalImport\Exception\CriticalFailureException.

Note

Since External Import version 4.0.0, use a custom step instead, using after:\Cobweb\ExternalImport\Step\TransformDataStep as a position.

updatePreProcess

(deprecated)

Warning

Use the Update Record Preprocess event instead.

This hook can be used to modify a record just before it is updated in the database. The hook is called for each record that has to be updated. The hook receives the complete record and a back-reference to the calling object (an instance of class \Cobweb\ExternalImport\Importer) as parameters. It is expected to return the complete record.

This hook may throw the \Cobweb\ExternalImport\Exception\CriticalFailureException.

Note

This hook receives records only from the main table, not from any child table.

insertPreProcess

(deprecated)

Warning

Use the Insert Record Preprocess event instead.

Similar to the "updatePreProcess" hook, but for the insert operation.

This hook may throw the \Cobweb\ExternalImport\Exception\CriticalFailureException.

Note

This hook receives records only from the main table, not from any child table.

deletePreProcess

(deprecated)

Warning

Use the Delete Record Preprocess event instead.

The event does not have a direct access to the main table name. It can be retrieved using: $event->getImporter()->getExternalConfiguration()->getTable.

This hook can be used to modify the list of records that will be deleted. As a first parameter it receives the name of the main table, as a second parameter a list of primary keys, corresponding to the records set for deletion. The third parameter is a reference to the calling object (again, an instance of class \Cobweb\ExternalImport\Importer). The method invoked is expected to return a list of primary keys too.

This hook may throw the \Cobweb\ExternalImport\Exception\CriticalFailureException. However note that the data will already have been saved.

Note

This hook receives only the list of records to be deleted from the main table, not from any child table.

datamapPostProcess

(deprecated)

Warning

Use the Datamap Postprocess event instead.

The event does not have a direct access to the main table name. It can be retrieved using: $event->getImporter()->getExternalConfiguration()->getTable.

This hook is called after all records have been updated or inserted using the TYPO3 Core Engine. It can be used for any follow- up operation. It receives as parameters the name of the affected table, the list of records keyed to their uid (including the new uid's for the new records) and a back-reference to the calling object (an instance of class \Cobweb\ExternalImport\Importer). Each record contains an additional field called tx_externalimport:status which contains either "insert" or "update" depending on what operation was performed on the record.

This hook may throw the \Cobweb\ExternalImport\Exception\CriticalFailureException. However note that the data will already have been saved.

Note

This hook is not called in preview mode.

cmdmapPostProcess

(deprecated)

Warning

Use the Cmdmap Postprocess event instead.

The event does not have a direct access to the main table name. It can be retrieved using: $event->getImporter()->getExternalConfiguration()->getTable.

This hook is called after all records have been deleted using the TYPO3 Core Engine. It receives as parameters the name of the affected table, the list of uid's of the deleted records and a back- reference to the calling object (an instance of class \Cobweb\ExternalImport\Importer).

This hook may throw the \Cobweb\ExternalImport\Exception\CriticalFailureException. However note that the data will already have been saved.

Note

This hook is not called in preview mode.