User Manual
Indexing Documents
New documents may be indexed via the TYPO3 command line interface (CLI).
Index single document
The command kitodo:index is used for indexing a single document:
./vendor/bin/typo3 kitodo:index -d http://example.com/path/mets.xml -p 123 -s dlfCore1
| Option | Required | Description | Example |
|---|---|---|---|
| -d | yes | This may be the UID of an existing document in tx_dlf_documents or the URL of a METS XML file. If the URL is already known as a location in tx_dlf_documents, the file is processed anyway and the records in the database and the Solr index are updated. Hint: Do not encode the URL! If the path contains spaces, enclose it in quotation marks. | |
| -p | yes | The page UID of the Kitodo.Presentation data folder. This folder keeps all records of documents, metadata, structures, solrcores etc. | 123 |
| -s | yes | This may be the UID of the solrcore record in tx_dlf_solrcores. Alternatively, you may give the index name of the Solr core. The Solr core must exist in table tx_dlf_solrcores on page "pid". Otherwise an error is shown and processing does not start. | 123 or 'dlfCore1' |
| | no | This may be the UID of the library record in tx_dlf_libraries which should be set as the owner of the document. If omitted, the ownership is read from the metadata field "owner" where possible. | 123 |
| | no | Nothing will be written to the database or the index. The Solr setting is checked and the document's location URL is shown. | |
| | no | Do not output any message. Useful when using a wrapper script. The script may check the return value of the CLI job, which is always 0 on success and 1 on failure. | |
| | no | Show the processed document's UID and location together with the indexing parameters. | |
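The return values described above (0 on success, 1 on failure) lend themselves to a small wrapper. The following is a minimal sketch, not part of Kitodo.Presentation: the names `TYPO3_BIN` and `index_document` are hypothetical, and the output is simply discarded by redirection instead of relying on any particular CLI option.

```shell
# Hypothetical wrapper around kitodo:index. TYPO3_BIN and index_document
# are illustrative names and not part of Kitodo.Presentation.
TYPO3_BIN="${TYPO3_BIN:-./vendor/bin/typo3}"

# index_document <uid-or-url> <page-uid> <solr-core>
# Discards all output and returns the exit status of the CLI job.
index_document() {
    # Quoting "$1" passes a URL with spaces as one unencoded argument.
    "$TYPO3_BIN" kitodo:index -d "$1" -p "$2" -s "$3" >/dev/null 2>&1
}
```

A caller can then branch on the status, for example `index_document "http://example.com/my path/mets.xml" 123 dlfCore1 || echo "indexing failed" >&2`.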
Reindex collections
The command kitodo:reindex reindexes one or more collections, or all documents on the given page:
# reindex collection with uid 1 on page 123 with solr core 'dlfCore1'
./vendor/bin/typo3 kitodo:reindex -c 1 -p 123 -s dlfCore1
# reindex collections with uid 1 and 4 on page 123 with solr core 'dlfCore1'
./vendor/bin/typo3 kitodo:reindex -c 1,4 -p 123 -s dlfCore1
# reindex all documents on page 123 with solr core 'dlfCore1'
./vendor/bin/typo3 kitodo:reindex -a -p 123 -s dlfCore1
| Option | Required | Description | Example |
|---|---|---|---|
| -a | no | With this option, all documents from the given page will be reindexed. | |
| -c | no | This may be a single collection UID or a comma-separated list of UIDs to reindex. | 1 or 1,2,3 |
| -p | yes | The page UID of the Kitodo.Presentation data folder. This folder keeps all records of documents, metadata, structures, solrcores etc. | 123 |
| -s | yes | This may be the UID of the solrcore record in tx_dlf_solrcores. Alternatively, you may give the index name of the Solr core. The Solr core must exist in table tx_dlf_solrcores on page "pid". Otherwise an error is shown and processing does not start. | 123 or 'dlfCore1' |
| | no | This may be the UID of the library record in tx_dlf_libraries which should be set as the owner of the documents. If omitted, the ownership is read from the metadata field "owner" where possible. | 123 |
| | no | Nothing will be written to the database or the index. All documents that would be processed on a real run are listed. | |
| | no | Do not output any message. Useful when using a wrapper script. The script may check the return value of the CLI job, which is always 0 on success and 1 on failure. | |
| | no | Show each processed document's UID and location with a timestamp and the number of processed documents out of the total. | |
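When the collection UIDs come from another script, the comma-separated list for `-c` has to be assembled first. The sketch below shows one way to do that in POSIX shell. The helper names `join_uids` and `reindex_collections` and the variable `TYPO3_BIN` are assumptions made for illustration and do not exist in Kitodo.Presentation.

```shell
# Hypothetical helpers. None of these names are part of Kitodo.Presentation.
TYPO3_BIN="${TYPO3_BIN:-./vendor/bin/typo3}"

# join_uids <uid>...
# Prints the arguments joined by commas, e.g. "1 4" becomes "1,4".
join_uids() {
    # "$*" joins with the first character of IFS. The subshell keeps
    # the changed IFS from leaking into the caller.
    ( IFS=,; printf '%s\n' "$*" )
}

# reindex_collections <page-uid> <solr-core> <uid>...
reindex_collections() {
    pid=$1
    core=$2
    shift 2
    "$TYPO3_BIN" kitodo:reindex -c "$(join_uids "$@")" -p "$pid" -s "$core"
}
```

For example, `reindex_collections 123 dlfCore1 1 4` runs the same command as the second example above.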
Harvest OAI-PMH interface
The command kitodo:harvest harvests an OAI-PMH interface and indexes all fetched records:
# example
./vendor/bin/typo3 kitodo:harvest --lib=<UID> --pid=<PID> --solr=<CORE> --from=<timestamp> --until=<timestamp> --set=<set>
In order to use the command, you first have to configure a library in the backend, setting at least a label and oai_base. The latter should be a valid OAI-PMH base URL (e.g. https://digital.slub-dresden.de/oai/).
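Before entering a value for oai_base, it may be worth checking that the URL actually answers OAI-PMH requests. The protocol defines an `Identify` verb for this purpose. The sketch below only builds the request URL. The function name `oai_identify_url` is hypothetical, and it assumes the base URL carries no query string of its own.

```shell
# oai_identify_url <base-url>
# Hypothetical helper: prints the OAI-PMH Identify request for a base URL.
# Assumes the base URL does not already contain a query string.
oai_identify_url() {
    printf '%s?verb=Identify\n' "$1"
}

# Possible manual check (requires curl and network access):
# curl -fsS "$(oai_identify_url 'https://digital.slub-dresden.de/oai/')"
```

A valid endpoint should respond with an XML document describing the repository.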
| Option | Required | Description | Example |
|---|---|---|---|
| --lib | yes | The UID of the library record whose OAI interface should be harvested. This library is also automatically set as the documents' owner. | 123 |
| --pid | yes | The page UID of the library record and therefore the page the documents are added to. | 123 |
| --solr | yes | This may be the UID of the solrcore record in tx_dlf_solrcores. Alternatively, you may give the index name of the Solr core. The Solr core must exist in table tx_dlf_solrcores on page "pid". Otherwise an error is shown and processing does not start. | 123 or 'dlfCore1' |
| --from | no | A timestamp in the format YYYY-MM-DD. The parameters from and until limit harvesting to the given period, e.g. for incremental updates. | 2021-01-01 |
| --until | no | A timestamp in the format YYYY-MM-DD. The parameters from and until limit harvesting to the given period, e.g. for incremental updates. | 2021-06-30 |
| --set | no | The name of an OAI set. The parameter limits harvesting to the given set. | 'vd18' |
| | no | Nothing will be written to the database or the index. All documents that would be processed on a real run are listed. | |
| | no | Do not output any message. Useful when using a wrapper script. The script may check the return value of the CLI job, which is always 0 on success and 1 on failure. | |
| | no | Show each processed document's UID and location with a timestamp and the number of processed documents out of the total. | |
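For incremental updates, a scheduled job might pass a fixed period via `--from` and `--until`. The following sketch checks that both values have the YYYY-MM-DD shape before calling the command. The names `is_iso_date`, `harvest_period` and `TYPO3_BIN` are hypothetical, and the check covers the shape only, not whether the date exists in the calendar.

```shell
# Hypothetical helpers. None of these names are part of Kitodo.Presentation.
TYPO3_BIN="${TYPO3_BIN:-./vendor/bin/typo3}"

# is_iso_date <string>
# Succeeds if the string has the shape YYYY-MM-DD (digits only).
# It does not verify that the date is valid in the calendar.
is_iso_date() {
    case "$1" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# harvest_period <lib-uid> <page-uid> <solr-core> <from> <until>
harvest_period() {
    if ! is_iso_date "$4" || ! is_iso_date "$5"; then
        echo "from and until must have the format YYYY-MM-DD" >&2
        return 1
    fi
    "$TYPO3_BIN" kitodo:harvest --lib="$1" --pid="$2" --solr="$3" \
        --from="$4" --until="$5"
}
```

For example, `harvest_period 123 123 dlfCore1 2021-01-01 2021-06-30` harvests the first half of 2021 using the example values from the table.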