DEPRECATION WARNING

This documentation is not using the current rendering mechanism and is probably outdated. The extension maintainer should switch to the new system. Details on how to use the rendering mechanism can be found here.

EXT: Generic User Import Tool

Author:Kasper Skårhøj
Created:2002-11-01T00:32:00
Changed:2006-03-09T12:42:01
Email:r.sudhoelter (at) web.de

EXT: Generic User Import Tool

Extension Key: rs_userimp

Copyright 2005, r.sudhoelter (at) web.de, <r.sudhoelter (at) web.de>

This document is published under the Open Content License

available from http://www.opencontent.org/opl.shtml

The content of this document is related to TYPO3

- a GNU/GPL CMS/Framework available from www.typo3.com

Table of Contents

EXT: Generic User Import Tool 1

Introduction 1

Intended Audience 1

Screenshots 1

Users manual 5

Installation 5

Administration and Configuration 5

Final warning note on timers 6

Limitations 7

Usage 7

FAQ 13

Known problems 13

To-Do list 13

Thanks 14

Changelog 14

Introduction

Intended Audience

The Generic User Import Tool might come in handy for all those TYPO3 administrators who are confronted with the tedious task of creating many user accounts.

Surprisingly, there is no user import interface in TYPO3 (<= 3.8.0) and the recommended way of doing so was to fill the database via a direct SQL INSERT statement. This has of course several drawbacks and even imposes some threats:

  • the database may not be directly accessible via command line or phpMyAdmin
  • you have no visual feedback on imported or rejected data
  • no certainty whether the so imported datasets meets all internal TYPO3 requirements (min/max field lengths, disallowed chars, ...)
  • the user's DB know-how may not be sufficient to issue a direct DB statement at all
  • DB access rights versus TYPO3 user rights

The only way to solve this was to write a backend module which would allow you to upload, prepare and finally import user data via CGL compliant methods. All this was wrapped in a straight forward, linear 4 step workflow which almost let's no room for user errors.

At its current state, the Generic User Import Tool is limited to do FE and tt_address user imports, but internally, it is already prepared for BE user imports as well.

The tool (0.2.0) was used to import ~ 15.000 FE users in under 180 seconds.

Screenshots

The tool offers 2 main functions: import of user accounts and rollback (deletion) of previously imported users. The security note summarizes what may happen if you do not carefully read this manual.

img-1

Import Step 1: file upload

Here you see the module in general. As you can see: the 4 step workflow is nicely laid out in four tabs (plus an extra info tab) which are only activated once the previous workflow step has been successfully carried out.

img-2

After we uploaded a file, the tab gets checked with a green check marker, the next tab is enabled and we see the current file info. Please note that the file upload is handled with TYPO3 standard file functions, so we do not need to care about correct file names and so on, this all is nicely handled by the core.

img-3

Import Step 2: import settings

Step 2 requires the user to fill out (or reload previously saved) basic import settings such as field delimiters, preview settings and so on.

img-4

After updating or saving the necessary settings, we may proceed with the next step, the field mapping.

Import Step 3: field mapping

This is the main task during the import process: we need to map CSV data fields to DB user data fields. The module supports us during this task with a sophisticated mapping form.

img-5

Each CSV data field is presented in a single row, showing its description and example values (both only if present in the CSV) and gives us a dropdown selector from which we may select the corresponding DB value. We don't even have to care about dual mappings or missing mandatory fields - the module takes care of that as well.

The screenshot shows the mapping process for a Outlook Express address file which has 29 data fields. Of course not all of them are useful for us.

After correcting and updating the mapping, we see that we meet all internal import criteria and may continue with step 4, the final import.

Import Step 4: CSV import

Now we are almost done. On the final screen we are reminded to save our settings for later re-use.

img-6

By clicking the Import button, all other tab functions are deactivated, the import file gets parsed and imported in its whole and we may review the import result on the Error and Messages tab.

Import Step 5: Errors and Messages

Even if we provided all settings and mapped all necessary fields, import errors might occur. Of course we get errors if the file is corrupted. Further on, based on our import settings, there might be additional errors if for example we deactivated the auto rename feature or we have empty username or password fields. All messages and errors are displayed on this tab.

img-7

Finally, to support administrators in system monitoring tasks, the module logs everything (file uploads and DB actions) to the standard system log:

img-8

Rollback

The Generic User Import Tool offers you an automated and secure way to roll back previous import sessions.

img-9

Users manual

Installation

As with any other extension, download the extension rs_userimp from TER (as local or global extension) and load it in the Extension Manager. The Generic User Import Tool relies on two additional database tables (tx_rsuserimp_presets, needed for storing import presets and tx_rsuserimp_session, used to store import session data) which will be created while loading the extension.

img-10 You should reload the BE after installation (or logout and then login again). Doing so, you will see the new module icon right under Tools

Administration and Configuration

After loading, the extension is ready to work but you may configure some safety settings for the rollback feature and general file handling directives.

img-11

If you want to keep your import files, you may set U se recycler for this purpose. Of course you have to configure the TYPO3 recycler folder first - if there is no recycler available, import files will be deleted right away.

If enabled, the Create drop file feature creates a CSV file with all dropped out (skipped) users: users which were skipped for whatever reason are written to a new CSV file which you could download, edit and re-import again.

img-12

The Rollback Safety Timespan defines the timespan (in minutes) in which you are allowed to rollback any of your import sessions. This timespan is meant as a safety measure because it seems not advisable to allow rollbacks forever.

The Rollback Data Preview Rows is the amount of preview datasets shown in the rollback info box.

With the Rollback Delete DB Records setting you may specify the record deletion characteristics of rollbacks: you may either choose the native TYPO3 way (where deleted database records get marked “deleted” but are actually kept in the DB) or the extension's own way of removing rolled back records from the DB. So, if you used the tool to import several hundreds or even thousands of user accounts and encounter errors afterwards, it might be wise to physically delete those user datasets during rollback. Also, if you have chosen NOT to delete the datasets, those usernames are no longer available, these usernames exists even if they are not active!

The Garbage Collection Trigger Timer determines when general “housekeeping” processes are triggered. Aged out import sessions, old import and drop files are deleted after the time specified here. Set this value to 0 if you want to keep these items forever.

Finally, withthe Unique Identifier - fe_users and Unique Identifier - tt_address settings you may define fields of the DB tables fe_users and tt_address which will serve as a identifier when doing user updates. Imported users are checked for this identifier and if they are already present in the corresponding DB table, the datasets are updated instead of simply inserted. Make sure that your identifier is a really unique value, otherwise you will screw up your database!

Final warning note on timers

Perhaps you have wondered what all these timer settings are really good for, so here is a final warning note:

Once a particular import session has elapsed the Rollback Safety Timespan , the session changes its internal status from “active” to “inactive”. Inactive sessions are no longer recoverable via the tool's rollback feature, even if you set the Rollback Safety Timespan to a higher value afterwards!Next, the Garbage Collection Trigger Timer determines when inactive sessions finally get marked “deleted”. Once an inactive session has elapsed this timer, it won't show up in the list of available sessions, even if you increase this timer afterwards!

BOTH STATES CANNOT BE UNSET VIA BUILTIN FUNCTIONS!

If after all you are still not sure what to do: set both timers to 0 to totally disable the built-in security settings – BUT THIS IS NOT RECOMMENDED AT ALL!!!

Limitations

Currently, the extension is available to administrators only! This is because of a basic security consideration: administrators should have full control of all system users. Logically, administrators are the only ones allowed to import users.

Usage

You start the CSV Generic User Import Tool by clicking its module icon in the backend.

The module implements a single, streamlined workflow (the task of importing users from file) where all the necessary steps to complete this workflow are organized in a so-called DynTab menu (Dynamic Tabs). These dynamic menus react on conditions set by the programmer in that they show their content only if certain criteria are met - an ideal method for our purpose.

The CSV file format

First of all you will of course need a suited import file. For example, the CSV Generic User Import Tool has successfully been tested against address export files from

  • Microsoft Outlook
  • Microsoft Outlook Express
  • Microsoft Excel
  • Lotus Notes
  • Palm Desktop's Contacts

But in general, every file with the following characteristics should work:

  • data field *must* be separated by a either comma, semicolon, colon or TAB
  • data fields *may* be encapsulated with single or double quotes
  • data rows *must* be terminated with common line ending characters such as newline, carriage return or line feed
  • data rows *should* not exceed 10000 characters per line (will be changed in a future release)

Note: For some strange reason, Outlook Express handles field encapsulation inconsistently amongst the different data fields. You should use double quotes as the encapsulation character.

Tab CSV file

Once you have a suited import file, you may upload it from your local computer to the TYPO3 server. Files get uploaded to the first available temp folder for that specific user, so normally they end up in fileadmin/_temp_. You don't really have to care about filenames: the files get renamed to a valid filename during upload.

You also may choose the overwrite function to replace previously uploaded files.

img-13

As soon as the file is in place, some basic file information is shown: the filename on the server and its size.

Tab Import Settings

Here, you have to define all basic settings for the import process. First of all, you have to select the user type you are going to import. Currently, FE users and tt_address datasets are supported.

img-14

Then, you have to choose a storage folder for your new users. The import tool queries the database for all possible storage places and displays them in this drop-down selector. FE users may be imported to any folder that has the fe_users plugin installed and tt_address datasets to any sysfolder.

img-15

Next, for fe_users you have to define one or more usergroups which all the imported users are assigned to. Again, the database is queried for all possible values and displays them in this dropdown selector (select multiple values by pressing CTRL). This option is disabled if you are importing tt_address datasets.

img-16

If you want to update existing users while importing new ones, you have to activate the following option and have to choose a unique identifier. You may specify values for the dropdown box during the extension setup. Take great care that this value really determines unique users! The list of available values for the dropdown box can be defined during the configuration of the extension.

img-17

Finally, you have to define how the data fields in your import file are organized (see section CSV file format for some general file format information).

img-18

First row of import file has field names : Normally, export files carry the field names in the first row, so you should tick this option. The field names will be shown during the mapping process and simplify this task enormously. If your file has no such descriptive row then this field names will be generated automatically (Field1, Field2 and so on).

Enable auto-rename : Probably you already experienced this: if you create a username which already exists in TYPO3, a zero is appended to the original username you entered. This option enables this feature. The renaming process is applied recursively to each username until it is unique (you could end up with something like newuser000). If you choose not to enable this option, those users will not be imported at all! The auto-rename feature is not available if you have choosen to update existing users.

Enable auto-values : auto-values are auto generated values which may be defined with a special syntax. Choose this option if you need to manipulate datasets during import. Read more about this feature in section Tab Field Mapping .

Attention: The performance of the import process might be greatly decreased by the auto-rename and auto-values setting, especially when importing hundreds or thousands of users at a time.

Field delimiter :Choose the field delimiter.

Field encapsulation character : Choose the field encapsulation character.

Preview lines : Choose how many sample lines of your CSV you want to be displayed for the mapping process. You should keep this value to something below 5.

img-19

Additional Mandatory Fields : The fields you enter here are appended to the system-defined mandatory fields (username and password for fe_users, empty for tt_address). Select multiple values by pressing CTRL. If for example you choose name and email here, all imported users are checked for username, password, name and email entries. If an import dataset misses any of this fields, the user is skipped.

Update settings : Please always press the update button once you have changed any of the settings in this configuration menu!

Presets : Probably you already imported some users in a previous session and have chosen to save your settings and field mappings as a preset. Here, you may select or delete such a preset.

img-20

After the following mapping process you could also return to this tab and save your settings as a new preset. Choose a name for the preset and save it.

Tab Field Mapping

Please note : This is a very important step and you should always carefully review what you did here! Errors in the mapping may result in many wrongly setup user accounts or address entries.

The mapping form itself is nearly self-explanatory. Each row displays one data field of your import file. If you have fieldnames in your CSV they are used as the description for that field. You have a dropdown selector which shows all available fields for TYPO3 user data information. The last column shows examples as they were read from the CSV.

The screenshot below shows a mapping session without the “Enable auto- value” option being set. We already have mapped 3 fields but still need to provide mappings for email and gender fields.

img-21

This screenshot below shows a mapping session with “Enable auto-value” option set. As you can see, things get a bit more complicated. The AV tickbox enables auto-values for that data field. You still have to choose a mapping value from the dropdown selector but the actual value of that field has to be specified with an expression of a special syntax. The actual values get then computed during import.

img-22

  • In the example above, the username value is set to {l(0)}.You might have probably guessed it: “l”stands for “lowercase” and the “0” is the argument passed to the lowercase function and represents the value of datafield #0. So the username is the lowercased christian name.
  • The password value is set to {m(1)}. “m” stands for the MD5 hash function, “1” stands for the value of datafield #1. So the password is computed as the MD5 hash of the family name and voilá, the Generic Import Tool is compatible with EXT: danp_md5fepassword and EXT: kb_md5fepw .
  • The value for module_sys_dmail_html is set to “{s(1)}”. “s” stands for string value, “1” is the string value itself. So all imported/updated users are configured to receive direct mail newsletters as HTML.

As you can see: this is a rather generic approach and gives you full control during the import process! For example, consider this mapping:

img-23

You are right: multiple auto-values are allowed for each mapping field and in this case, the username is now computed as <christian name>_<family name> (all lowercase values, for example “peter_pan”). Now, have a look at this:

img-24

The comments field is set to “{f(3)}{s( - CSV)}”: take the unaltered data value from datafield #3 and append the string “- CSV” to it (would result in something like “Peter Pan – CSV”).

And finally, this mapping

img-25

creates a mapping for direct mail categories. The thing is that these direct mail categories must be entered as a bit mask, so with “{b(3,4,5,6,7,8,9)}” we create this mask from CSV field values 3 to 9 (from lowest to highest bit). Please note: the bitmask option cannot be combined with any other auto-value.

Summary of allowed auto-value expressions:

  • {f(n)}: unaltered value of data field n
  • {l(n)}: lowercased value of data field n
  • {m(n)}: MD5 hash of data field n
  • {s(text)}: plain text
  • {b(n[,m...])}: creates bit mask of data values n to m (from lowest to highest bit)

It should be noted that mandatory fields are still scanned for illegal values (spaces, uppercase characters, ...) but that no automatic substitution takes place since the auto-rename feature is not available in this mode. Therefor, make sure that your mapping values meet the corresponding criteria!

Tab CSV Import

After correcting and updating the settings we are almost finished with the user import.

This tab gives you the chance to review your settings or to save them as a preset. You end the import workflow by clicking the import button.

Please note : After clicking the import button, all tab functions are deactivated and you may only choose to view the errors and messages tab!

img-6

Tab Errors and Messages

All messages and errors are displayed in this last tab. Here, we were about to import some users with the auto-rename feature enabled. In the midst of the file, the import tool detected some empty mandatory fields. Those users are skipped as well. Since the drop file feature is enabled as well, we are presented a link to download the generated CSV file which we can download, edit and import again.

img-7

Monitoring user import actions

Administrators may monitor the import tool's actions by inspecting the administration log. All successfully imported or updated users are fully logged.

img-26

Handling of temporary import files

After the actual import, the CSV file gets either deleted or is moved to the recycler. This behavior is determined by the “Use recycler” setting from the extension configuration (for details read section “Administration and Configuration”).

img-27

Rollback

When it comes to mass import of data, there may always occur errors which are detected just too late. In our case, this might happen mainly because of wrongly mapped data fields or auto-value expressions. It would be a lot of work to manually correct or delete this afterwards by hand.

Fortunately, the CSV Generic User Import Tool offers you an automated way to revert your changes: every import session is written to the database and with that information, you may completely roll back a session.

img-1

Rollback in this sense means a deletion of imported users, the tool does not yet offer a way restore previous values of updated datasets!

To roll back a session, choose the corresponding menu item in the main menu. What you get is a list of all import sessions along with some basic information. The most important part here is the so-called Rollback Status .

According to the configuration of the extension (see section “Administration and Configuration” for details), the rollback feature of an import session may have already expired (those session are marked red) or the session was already rolled back.

img-28

Sessions marked green may be rolled back. To choose a session, move the mouse over the table row and click.

img-29

img-30 Now, all available information for that session is displayed and you may roll back the session by clicking the Roll back button. And according to the general settings, these users are either marked deleted or are completely removed from the database.

FAQ

((generated))

Is the extension compatible with sr_feuser_register?

Yes, it is. In fact, the import settings for “Additional mandatory fields” were introduced because of the new available FE userdata fields once you have installed EXT:sr_feuser_register.In general, every extension that extends the fe_users or tt_address table with new database fields should work. These new values should all show up under “Additional mandatory fields”. But be aware : the Generic User Import Tool was designed to import TYPO3 users, so even though you might use the tool to import other users as well, it is not guaranteed that this succeeds.

Oh my golly, too late I realized that I imported thousands of wrongly configured users and the import session has already expired...

Background:

  • For security reasons, the rollback feature automatically expires after some time (defined by Rollback Safety Timespan ). This behavior may be disabled by setting this timespan to 0.
  • After that point, the import session data is still displayed until the Garbage Collection Trigger Timer has passed.
  • After that, even the import session data is no longer displayed (the latter may be prevented by setting the Garbage Collection Trigger Timer to 0).

So there are several options for you to solve your problem:

Have you already disabled the Rollback Safety Timespan by setting it to 0 (see section Administration and Configuration )? Do so and try to rollback your session.

If the Rollback Safety Timespan has already been disabled, you may choose to edit file rs_userimpext_conf_template.txt and increase the upper limit for rollbackSafetyTimespan. This is done by changing the value for “type=int [0-600]” to something like “type=int [0-6000]” (minutes).

If the session is already older than the Garbage Collection Trigger Timer then you have to edit (re-activate) the session manually using DB tools like phpMyAdmin. Locate the table tx_rsuserimp_sessions and alter the dataset for the session in question: UPDATE tx_rsuserimp_sessions SET deleted=0 WHERE uid=my_faulty_session. Apply step 2 and you should be able to rollback the session.

Delete fe_users by hand. NOT RECOMMENDED AT ALL!

Known problems

  • The extension in its current state is supposed to work with western character sets (ISO-8859-1) only. Multibyte codes have not been tested but will most probably fail. The reason for this: imported data field may have to be converted to lowercase during the import process. This works fine with ISO-8859-1 characters, but seems to fail for multibyte chars.
  • On Microsoft Internet Explorer, the info boxes for rollback session do not close automatically once you choose another session (JavaScript issue).
  • The user must have access to a temporary upload folder, otherwise the file upload will fail. I know, this is trivial, but some users reported this as a bug =:o)

To-Do list

  • Add an user export feature
  • Changing the fixed line length per row (currently set to 10000 characters): getting rid of this hard coded limitation would mean that the CSV first has to be parsed for the longest line and then has to be re-parsed with this value + 1 as the line length parameter

Thanks

I would like to thank Franz Ripfel from A.BE.ZET (www.abezet.de) for sponsoring parts of the development and providing the hook function.

Changelog

  • v1.0.0 final version with all desired functions, added auto-value feature, added tt_address import feature, added hook, bugfixes
  • v12.x and above internal revisions, never got uploaded to TER
  • v0.11.x added PHP5 compliancy, added feature to map multiple user groups, added JavaScript pop-up error messages for missing configuration fields, prepared the extension to do updates instead of inserts on existing users, updated manual to reflect these changes
  • v0.10.0 bugfix and additional security note
  • v0.9.x fixed reported bugs: honor deleted FE usergroups and deleted storage folders
  • v0.9.0 incorporated bugfixes for Unix systems (thanks to all the testers for the feedback)
  • v0.8.0 added user-defined mandatory fields
  • v0.7.0 added garbage collection routine
  • v0.6.0 introduced drop file feature (experimental), added devLog
  • v0.5.0 introduced rollback feature
  • v0.4.0 evaluated CSV sorting feature during import (dropped later for performance reasons)
  • v0.3.0 first initial release available to testers

img-31 EXT: Generic User Import Tool - 14