Introduction 

About this document 

The LinkValidator is provided by a system extension named linkvalidator which enables you to conveniently check your website for broken links.

This manual explains how to install and configure the extension for your needs.

What does it do? 

The LinkValidator checks the links in your website for validity, reports broken links or missing files in your TYPO3 installation and provides a way to conveniently fix these problems.

It includes the following features:

  • The LinkValidator can check all kinds of links. This includes internal links to pages and content elements, file links to files in the local file system and external links to resources elsewhere on the web.
  • The LinkValidator checks a number of fields by default, for example header and bodytext fields of content elements. It can be configured to check any field you like.
  • The LinkValidator offers a just-in-time check of your website. Additionally, the TYPO3 Scheduler is fully supported to run checks automatically. In that case you can choose whether to receive an email report when broken links are found.
  • The LinkValidator is extendable. It provides hooks to check special types of links or override how the checking of external, file and page links works.

Screenshots 

This is the Check Links backend module. It provides two actions: Report and Check Links. The Report action is always shown first and lists the broken links that were found when your website was last checked.

Viewing broken links in the Report action

The Check Links tab is used to check links on demand and can be hidden with TSconfig, if desired.

Checking links live in the TYPO3 Backend

The workflow in the module is the following:

  • First you set the depth of pages you want to consider when checking for broken links in the Check Links tab. Then click the Check Links button.
  • Once the checks are done, the module automatically switches to the Report tab where the results are displayed.
  • The type and ID of the content containing the broken link become visible when you move the mouse over the icon for the content type. The pencil icons at the beginning of each row enable you to quickly fix the displayed elements.

The LinkValidator features full support of the TYPO3 Scheduler. This is the LinkValidator task:

Defining the LinkValidator task in the Scheduler

  • With this task you can run LinkValidator regularly via cron without having to manually update the stored information on broken links.
  • You can, for example, override the TSconfig configuration in the task. If you leave it unchanged, the LinkValidator settings that apply to the respective pages are used. Values you set in the task override those settings.
  • The LinkValidator task can send you a status report via email. You can create your own email template as needed.
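
As a rough sketch of such an override, the TSconfig entered in the task might look like the following. This assumes that the task field accepts regular page TSconfig including the mod.linkvalidator prefix; the values are purely illustrative.

```typoscript
# Entered in the TSconfig override field of the Scheduler task
# (assumption: the full mod.linkvalidator prefix is required here)
mod.linkvalidator {
  # Only check internal and file links in this scheduled run
  linktypes = db,file
  # Skip disabled pages and content elements
  checkhidden = 0
}
```

Options that are not listed in the task keep the values that apply to the respective pages.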

Credits 

This extension is partially based on the extension cag_linkchecker, which was originally developed for Connecta AG, Wiesbaden. cag_linkchecker is maintained by Jochen Rieger and Dimitri König.

Feedback 

If you find a bug in this manual or in the extension in general, please file an issue in the TYPO3 bug tracker.

Working with LinkValidator 

This page explains how to work with LinkValidator as an editor. It is intended for a non-technical audience.

LinkValidator Report 

  1. Access the module Check Links.
  2. Select the page you want to work on in the page tree.

Access LinkValidator Report via "Check Links" module

You will now see the Report with the list of broken links. To get results, select the checkboxes ("Internal Links", etc.) and choose the appropriate depth under Show this level. The depth determines which page levels are included: for example, with depth This page, LinkValidator only shows broken links for the page that is currently selected in the page tree. The deeper you go, the more broken links may be shown. After you change the settings, you must click Refresh display.

When you jump to a different page in the page tree, the Report will be refreshed.

Installation 

This extension is part of the TYPO3 Core, but not installed by default.

Installation with Composer 

Check whether you are already using the extension with:

composer show | grep linkvalidator

This should either give you no result or something similar to:

typo3/cms-linkvalidator       v12.4.11

If it is not installed yet, use the composer require command to install the extension:

composer require typo3/cms-linkvalidator

The version that Composer installs depends on the version of the TYPO3 Core you are using.

Classic installation without Composer 

In an installation without Composer, the extension is already shipped but might not be activated yet. Activate it as follows:

  1. In the backend, navigate to the System > Extensions module.
  2. Click the Activate icon for the LinkValidator extension.

Extension manager showing LinkValidator extension

Next step 

LinkValidator uses the HTTP request library shipped with TYPO3. Please have a look in the Global Configuration, particularly at the HTTP settings.

There, you may define a default timeout. Generally, it is recommended to always specify timeouts when working with the LinkValidator.

Configuration of the Linkvalidator extension 

You can find the standard configuration in EXT:linkvalidator/Configuration/page.tsconfig.

This may serve as an example of how to configure the extension for your needs.

Minimal configuration 

It is recommended to at least fill out httpAgentUrl and httpAgentEmail. The latter is only required if $GLOBALS['TYPO3_CONF_VARS']['MAIL']['defaultMailFromAddress'] is not set.

config/sites/my-site/page.tsconfig
mod.linkvalidator {
  linktypesConfig {
    external {
      httpAgentUrl = https://example.org
      httpAgentEmail = noreply@example.org
    }
  }
}

TSconfig Reference 

You can set the following options in the TSconfig of a site, for example in the file config/sites/my-site/page.tsconfig, or in the global page TSconfig file packages/my_sitepackage/Configuration/page.tsconfig of your site package.

You must prefix them with mod.linkvalidator, for example mod.linkvalidator.searchFields.pages = canonical_link.
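
The prefix can be written either as a dotted path or as a nested block. As a brief sketch, the two variants below set the same option; the value is only illustrative, and note that assigning with = replaces the default field list for that table.

```typoscript
# Dotted notation
mod.linkvalidator.searchFields.pages = canonical_link

# Equivalent nested notation
mod.linkvalidator {
  searchFields {
    pages = canonical_link
  }
}
```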

searchFields.[table]

Type
string
Path
mod.linkvalidator.searchFields.[table]
Default
See Checked Fields

Comma separated list of table fields in which to check for broken links. LinkValidator only checks fields that have been defined in searchFields.

LinkValidator ships with sensible defaults that work well for the TYPO3 core, but additional third party extensions are not considered.

config/sites/my-site/page.tsconfig
mod.linkvalidator.searchFields {
    # Usually you want to append fields:
    tt_content := addToList(mysitepackage_carousel_morelink)

    # or you can set specific columns (overwriting defaults):
    # tt_content = bodytext

    # Do not check canonical URL in pages
    pages := removeFromList(canonical_link)
}

linktypes

Type
string
Path
mod.linkvalidator.linktypes
Default
db,file

Comma separated list of link types to check.

Possible values:

db
Check links to database records.
file
Check links to files located in your local TYPO3 installation.
external
Check links to external URLs.

This list may be extended by other extensions providing a custom linktype implementation.

Changed in version 13.0

The default was changed to exclude "external" link type.
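
Because external links are no longer part of the default, they have to be enabled explicitly if you want them checked. A minimal sketch, assuming the site TSconfig file used elsewhere in this manual:

```typoscript
# config/sites/my-site/page.tsconfig
# Check internal, file and external links
mod.linkvalidator.linktypes = db,file,external
```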

linktypesConfig.external.httpAgentName

Type
string
Path
mod.linkvalidator.linktypesConfig.external.httpAgentName
Default
TYPO3 LinkValidator

Add descriptive name to be used as 'User-Agent' header when crawling external URLs.

linktypesConfig.external.httpAgentUrl

Type
string
Path
mod.linkvalidator.linktypesConfig.external.httpAgentUrl
Default
(empty string)

Add URL to be used in 'User-Agent' header when crawling external URLs.

linktypesConfig.external.httpAgentEmail

Type
string
Path
mod.linkvalidator.linktypesConfig.external.httpAgentEmail
Default
$GLOBALS['TYPO3_CONF_VARS']['MAIL']['defaultMailFromAddress']

Add descriptive email used in 'User-Agent' header when crawling external URLs.
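
The three httpAgent options are usually set together so that operators of external sites can identify the crawler. The following is a sketch with placeholder values; the agent name, URL and email address are assumptions to be replaced with your own details.

```typoscript
mod.linkvalidator.linktypesConfig.external {
  # Illustrative values only
  httpAgentName = Example Site LinkValidator
  httpAgentUrl = https://example.org/info.html
  httpAgentEmail = info@example.org
}
```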

checkhidden

Type
boolean
Path
mod.linkvalidator.checkhidden
Default
0

If set, disabled pages and content elements are checked for broken links, too.
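
For instance, to include disabled pages and content elements in the check, the option could be switched on like this (a minimal sketch):

```typoscript
# Also check disabled pages and content elements
mod.linkvalidator.checkhidden = 1
```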

showCheckLinkTab

Type
boolean
Path
mod.linkvalidator.showCheckLinkTab
Default
1

If set, the backend module shows a "Check Links" tab, which you can use to perform the checks on demand.

The Check links tab is visible
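
To hide the tab, for example when checks should only run via the Scheduler, the option can be disabled. A minimal sketch:

```typoscript
# Hide the "Check Links" tab in the backend module
mod.linkvalidator.showCheckLinkTab = 0
```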

actionAfterEditRecord

Type
string
Path
mod.linkvalidator.actionAfterEditRecord
Default
recheck
Possible values
recheck | setNeedsRecheck

After a record is edited, the list of broken links may no longer be correct, because broken links were changed or removed or new broken links added. Due to this, the list of broken links should be updated.

Possible values are:

recheck
The field is rechecked. (Warning: an RTE field may contain a number of links, rechecking may lead to delays.)
setNeedsRecheck
The entries in the list are marked as needing a recheck.
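
If rechecking after every edit causes noticeable delays, the alternative value can be used instead. A minimal sketch:

```typoscript
# Mark affected entries as needing a recheck instead of rechecking immediately
mod.linkvalidator.actionAfterEditRecord = setNeedsRecheck
```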

mail.fromname

Type
string
Path
mod.linkvalidator.mail.fromname
Default
$GLOBALS['TYPO3_CONF_VARS']['MAIL']['defaultMailFromName']

Set the from name of the report mail sent by the cron script.

mail.fromemail

Type
string
Path
mod.linkvalidator.mail.fromemail
Default
$GLOBALS['TYPO3_CONF_VARS']['MAIL']['defaultMailFromAddress']

Set the from email of the report mail sent by the cron script.

mail.replytoname

Type
string
Path
mod.linkvalidator.mail.replytoname

Set the replyto name of the report mail sent by the cron script.

mail.replytoemail

Type
string
Path
mod.linkvalidator.mail.replytoemail

Set the replyto email of the report mail sent by the cron script.

mail.subject

Type
string
Path
mod.linkvalidator.mail.subject
Default
TYPO3 LinkValidator report

Set the subject of the report mail sent by the cron script.

linktypesConfig

Type
array
Path
mod.linkvalidator.linktypesConfig

All settings within this key are advanced settings. In most cases, the defaults should be sufficient.

external.headers

Type
array
Path
mod.linkvalidator.linktypesConfig.external.headers
Default
(empty array)

Additional set of HTTP headers to be passed when crawling URLs.

external.method

Type
string
Path
mod.linkvalidator.linktypesConfig.external.method
Default
HEAD

This specifies which method is used for crawling URLs. By default, HEAD is used (which falls back to GET if HEAD fails).

You can use GET as an alternative, but keep in mind that HEAD is a lightweight request and should be preferred while GET will fetch the remote web page (within the limits specified by range, see next option).

"The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response." (RFC 2616).

external.range

Type
string
Path
mod.linkvalidator.linktypesConfig.external.range
Default
0-4048

Additional HTTP request header 'Range' to be passed when crawling URLs. Use a string to specify the range (in bytes).

external.allowRedirects

Type
boolean
Path
mod.linkvalidator.linktypesConfig.external.allowRedirects
Default
0

New in version 14.0

If enabled, HTTP redirects with external links are reported as problems.

external.timeout

Type
integer
Path
mod.linkvalidator.linktypesConfig.external.timeout
Default
20

HTTP request option. This is the total timeout of the request in seconds.

If set, this overrides the timeout in $GLOBALS['TYPO3_CONF_VARS']['HTTP']['timeout'] which defaults to 0.
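
The advanced options for external links can be combined in one block. The following is only a sketch: the values are illustrative, and the key/value notation inside headers is an assumption about how the array is written in TSconfig.

```typoscript
mod.linkvalidator.linktypesConfig.external {
  # Assumed notation: one header per line, header name as key
  headers {
    Accept-Language = en
  }
  # Fetch the page with GET instead of the default HEAD
  method = GET
  # Limit the fetched content to the first bytes of the response
  range = 0-1024
  # Total request timeout in seconds
  timeout = 10
}
```

A shorter timeout keeps scheduled runs from stalling on slow hosts, at the cost of possibly reporting slow but working links as broken.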

Linkvalidator configuration example 

Example configuration that checks the links provided by a custom content element in a site package. The TCA definition of the fields must fulfill the TCA requirements for link validation.

config/sites/my-site/page.tsconfig
mod.linkvalidator {
  searchFields {
    tt_content := addToList(mysitepackage_carousel_morelink)
    mysitepackage_carousel_carousel_items = text,typolink
  }
  linktypes = db,file,external
  checkhidden = 0
  mail {
    fromname = TYPO3 LinkValidator
    fromemail = no_reply@example.org
    replytoname =
    replytoemail =
    subject = TYPO3 LinkValidator report
  }
  linktypesConfig {
    external {
      httpAgentUrl = https://example.org/info.html
      httpAgentEmail = info@example.org
    }
  }
}

Additionally, email reports and the HTTP agent for external URLs are configured.

Fields checked by the Linkvalidator 

Changed in version 13.3

The following fields were added to the list of fields that are checked by default:

  • pages = canonical_link
  • sys_redirect = target
  • sys_file_reference = link

Changed in version 14.0

The field pages.url has been removed in favor of the new pages.link field, which is now checked by default.

The following tables and fields are supported by default:

  • pages = link, canonical_link
  • sys_redirect = target
  • sys_file_reference = link
  • tt_content = bodytext, header_link

Two special fields are currently defined, but are not checked yet due to their TCA configuration:

  • pages = media has TCA type="file"
  • tt_content = records has TCA type="group"

The following fields could theoretically be included in custom configurations, as their type / softref is available, but they are specifically not added in the default configuration:

  • sys_webhook = url (webhook should not be invoked)
  • tt_content = subheader (has softref email[subst] which is not a supported link type)
  • pages = tsconfig_includes (system configuration)
  • sys_template = constants, include_static_file, config (system configuration)
  • tx_scheduler_task_group = groupName (scheduler system configuration)

Required TCA configuration so a field can be checked by the Linkvalidator 

Currently, LinkValidator will only detect links for fields if the TCA configuration meets one of these criteria:

For this reason, it is currently not possible to check pages.media. This will be fixed in the future.

Examples for working fields:

  • pages.canonical_link ( 'type' => 'link')
  • pages.link ( 'type' => 'link')
  • sys_file_reference.link ( 'type' => 'link')

Example of a field that does not work:

  • pages.media ( 'type' => 'file')

Linkvalidator information for developers 

Custom linktypes 

The LinkValidator uses so-called linktypes to check for different types of links. Learn how to implement a custom linktype within an extension.

API 

Overview of important classes and interfaces in the public API.

Public API of the TYPO3 linkvalidator 

The following classes and interfaces are frequently used by developers of third party extensions. For the complete API have a look into the code.

  • \TYPO3\CMS\Linkvalidator\Linktype\AbstractLinktype
  • \TYPO3\CMS\Linkvalidator\Linktype\LinktypeInterface
  • \TYPO3\CMS\Linkvalidator\Linktype\LabelledLinktypeInterface

The following events can be listened to:

Hints for large sites 

If you have a website with many hundreds of pages, checking all links takes some time and might lead to a timeout. It also consumes resources, so it can make sense to run the check at night. If you want to check many pages, do not use the "Check Links" tab in the backend module of LinkValidator; use the TYPO3 Scheduler instead. The task provided by LinkValidator caches the broken links just like the "Check Links" button does. Afterwards you can use the backend module as usual to fix the affected elements.

If you still want to check trees with many pages just in time, set the depth to a reasonable level like 2 or 3. Do not use "infinite".

Known problems 
