Configuration

All settings for the extension are made in the TYPO3 Extension Configuration module.

Extension configuration for EXT:tika

Extractor

Select the service you would like to use:

  • Tika App (not recommended)
  • Tika Server (recommended)
  • Solr Server

Depending on your choice, configure the necessary settings for that service on the corresponding settings tab.
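Before selecting the Tika Server variant, it can help to confirm that a server is actually reachable. The following Python sketch is not part of EXT:tika. It assumes a Tika Server listening on localhost:9998 (the Tika Server default port) and queries its /version endpoint. The helper names tika_version_url and fetch_tika_version are hypothetical and exist only for this illustration.

```python
from urllib.request import urlopen


def tika_version_url(host: str = "localhost", port: int = 9998, scheme: str = "http") -> str:
    """Build the URL of the Tika Server version endpoint.

    Host, port and scheme are assumptions; use the values from your own setup.
    """
    return f"{scheme}://{host}:{port}/version"


def fetch_tika_version(host: str = "localhost", port: int = 9998, timeout: float = 5.0) -> str:
    """Return the version string reported by a running Tika Server.

    Raises urllib.error.URLError (a subclass of OSError) if the server
    cannot be reached within the timeout.
    """
    with urlopen(tika_version_url(host, port), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

If fetch_tika_version() returns a version string, the host and port used here are reasonable candidates for the values on the Tika Server settings tab.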

About Tika variants

Each variant has its advantages and its drawbacks.

Solr Cell variant

The Apache Solr Content Extraction Library (Solr Cell) variant does not support all the features of the App and Server variants. In return, it does not require running and maintaining an additional service or stack if EXT:solr is already configured. Any connection/core used by EXT:solr can be reused for it. Possible implications are described on the Apache Solr documentation page.

Enable Logging

Enables logging of extraction actions.

Show Tika Backend Module

Enables a Tika module within the Solr backend module. (Experimental; only works with Tika Server and will be removed.)

Exclude mime types

Expects a list of MIME types to be excluded from metadata extraction.
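How such an exclusion list might be evaluated can be sketched in a few lines of Python. This is an illustration only, not the extension's implementation. It assumes the list is comma-separated and compared case-insensitively. The function names parse_mime_type_list and is_excluded are hypothetical.

```python
def parse_mime_type_list(raw: str) -> set[str]:
    """Split a comma-separated string into a set of normalized MIME types.

    Assumption: entries are separated by commas; surrounding whitespace
    and empty entries are ignored, and matching is case-insensitive.
    """
    return {item.strip().lower() for item in raw.split(",") if item.strip()}


def is_excluded(mime_type: str, excluded: set[str]) -> bool:
    """Return True if the given MIME type is on the exclusion list."""
    return mime_type.strip().lower() in excluded


# Example values chosen for illustration only.
excluded_types = parse_mime_type_list("image/png, application/zip")
skip_png = is_excluded("image/png", excluded_types)
skip_pdf = is_excluded("application/pdf", excluded_types)
```

With these example values, a PNG file would be skipped during metadata extraction, while a PDF file would still be processed.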

File size limit...

Expects a file size limit in MB; only files up to this size are processed. (Default: 500)
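The effect of the limit can be sketched in Python as follows. The sketch assumes that 1 MB means 1024 × 1024 bytes and that a file exactly at the limit is still processed; neither detail is confirmed by this documentation. The name within_size_limit is hypothetical.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def within_size_limit(size_in_bytes: int, limit_in_mb: int = 500) -> bool:
    """Return True if a file of the given size may be processed.

    Assumption: the limit is inclusive. The default of 500 mirrors the
    documented default of the setting.
    """
    return size_in_bytes <= limit_in_mb * BYTES_PER_MB
```

Under these assumptions, a 200 MB file passes the default limit, while a 600 MB file is skipped.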

Enable meta data extraction

Enables the MetaDataExtractor, including the LanguageDetector, if available. (Default: true) Disabling it can be useful when files are moved frequently, during mass file processing, or if existing metadata must not be overridden.