Crawlers

The extension uses crawlers to visit all URLs configured for cache warmup. Visiting a URL warms up the page cache for that URL. This page describes the crawlers that are available by default and explains how to implement a custom crawler.

interface Crawler
Fully qualified name
\EliasHaeussler\CacheWarmup\Crawler\Crawler

Interface for crawlers used to crawl and warm up URLs.

crawl ( $urls)

Crawl a given list of URLs.

param array $urls

List of URLs to be crawled.

returntype

\EliasHaeussler\CacheWarmup\Result\CacheWarmupResult

Default crawlers

The extension ships with two default crawlers:

  • \EliasHaeussler\Typo3Warming\Crawler\ConcurrentUserAgentCrawler: Used for cache warmup triggered within the TYPO3 backend
  • \EliasHaeussler\Typo3Warming\Crawler\OutputtingUserAgentCrawler: Used for cache warmup executed from the command-line

Both crawlers send a custom User-Agent header with all warmup requests. This makes it possible, for example, to exclude warmup requests from the statistics of analysis tools. The header value is generated as an HMAC hash of the string TYPO3/tx_warming_crawler.
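To illustrate how such a value can be derived, here is a minimal sketch in plain PHP. It only shows the general HMAC technique. The function name, the secret and the sha256 algorithm are assumptions made for this example, and the extension may use a different secret, algorithm or final header format.

```php
<?php

declare(strict_types=1);

/**
 * Build an HMAC of the fixed crawler string named in the text.
 *
 * Hypothetical helper: the secret and the algorithm are assumptions,
 * not the values used by the extension itself.
 */
function buildCrawlerUserAgentHash(string $secret, string $algorithm = 'sha256'): string
{
    return hash_hmac($algorithm, 'TYPO3/tx_warming_crawler', $secret);
}

$hash = buildCrawlerUserAgentHash('replace-with-your-secret');

// A hex-encoded SHA-256 digest is always 64 characters long.
var_dump(strlen($hash)); // int(64)

// The same secret always yields the same value, so it can be matched exactly.
var_dump($hash === buildCrawlerUserAgentHash('replace-with-your-secret')); // bool(true)
```

Because the value is stable for a given secret, an analysis tool can filter warmup requests by comparing the User-Agent header against one fixed string.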

The generated header value can be copied from the cache warmup modal in the TYPO3 backend. Alternatively, the warming:showuseragent command can be used to read the current User-Agent header.

Implement a custom crawler

Available interfaces

The actual cache warmup is done via the library eliashaeussler/cache-warmup . It provides the \EliasHaeussler\CacheWarmup\Crawler\Crawler interface, which must be implemented when developing your own crawler.
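As a rough illustration of the interface, the following sketch wraps another crawler and skips URLs below a given path before delegating the rest. The namespace, the class name FilteringCrawler, the constructor arguments and the path prefix are all hypothetical. How the inner crawler is provided, for example through dependency injection, is left open here.

```php
<?php

declare(strict_types=1);

namespace Vendor\Extension\Crawler; // hypothetical namespace

use EliasHaeussler\CacheWarmup\Crawler\Crawler;
use EliasHaeussler\CacheWarmup\Result\CacheWarmupResult;

/**
 * Hypothetical crawler that filters URLs and delegates the actual
 * crawling to another crawler.
 */
final class FilteringCrawler implements Crawler
{
    public function __construct(
        private readonly Crawler $inner,
        private readonly string $excludedPathPrefix = '/internal/',
    ) {}

    public function crawl(array $urls): CacheWarmupResult
    {
        // Cast to string so that both plain strings and URI objects work.
        $remaining = array_filter(
            $urls,
            fn ($url): bool => !$this->isExcluded((string) $url),
        );

        // Re-index the list and let the wrapped crawler build the result.
        return $this->inner->crawl(array_values($remaining));
    }

    private function isExcluded(string $url): bool
    {
        $path = parse_url($url, PHP_URL_PATH);

        return is_string($path) && str_starts_with($path, $this->excludedPathPrefix);
    }
}
```

Delegating to an existing crawler keeps the sketch independent of how a CacheWarmupResult is assembled, which is not covered on this page.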

Verbose crawlers

There is also a \EliasHaeussler\CacheWarmup\Crawler\VerboseCrawler interface that redirects user-oriented output to an instance of \Symfony\Component\Console\Output\OutputInterface.

interface VerboseCrawler
Fully qualified name
\EliasHaeussler\CacheWarmup\Crawler\VerboseCrawler

Interface that redirects user-oriented output to a given output.

setOutput ( $output)

Set output where to redirect user-oriented output.

param \Symfony\Component\Console\Output\OutputInterface $output

Output where to redirect user-oriented output.
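A verbose crawler could look roughly like the following sketch. It assumes that VerboseCrawler extends the Crawler interface, so crawl() has to be implemented as well. The namespace, the class name ReportingCrawler, the messages and the constructor argument are hypothetical.

```php
<?php

declare(strict_types=1);

namespace Vendor\Extension\Crawler; // hypothetical namespace

use EliasHaeussler\CacheWarmup\Crawler\Crawler;
use EliasHaeussler\CacheWarmup\Crawler\VerboseCrawler;
use EliasHaeussler\CacheWarmup\Result\CacheWarmupResult;
use Symfony\Component\Console\Output\NullOutput;
use Symfony\Component\Console\Output\OutputInterface;

/**
 * Hypothetical crawler that reports what it is doing to the given output.
 */
final class ReportingCrawler implements VerboseCrawler
{
    private OutputInterface $output;

    public function __construct(private readonly Crawler $inner)
    {
        // Stay silent until an output has been injected.
        $this->output = new NullOutput();
    }

    public function setOutput(OutputInterface $output): void
    {
        $this->output = $output;
    }

    public function crawl(array $urls): CacheWarmupResult
    {
        $this->output->writeln(sprintf('Crawling %d URL(s)...', count($urls)));

        $result = $this->inner->crawl($urls);

        $this->output->writeln('Done.');

        return $result;
    }
}
```

Starting with a NullOutput means crawl() can be called safely even if setOutput() was never invoked.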

Configurable crawlers

Custom crawlers can also implement the \EliasHaeussler\CacheWarmup\Crawler\ConfigurableCrawler interface, allowing users to configure warmup requests themselves.

interface ConfigurableCrawler
Fully qualified name
\EliasHaeussler\CacheWarmup\Crawler\ConfigurableCrawler

Interface allowing users to configure warmup requests themselves.

setOptions ( $options)

Set custom crawler options.

param array $options

Associative array of custom crawler options.
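The following sketch shows one way to handle such options: merge them into defaults and reject unknown keys. It assumes that ConfigurableCrawler extends the Crawler interface. The namespace, the class name LimitingCrawler and the limit option are hypothetical and not part of the library.

```php
<?php

declare(strict_types=1);

namespace Vendor\Extension\Crawler; // hypothetical namespace

use EliasHaeussler\CacheWarmup\Crawler\ConfigurableCrawler;
use EliasHaeussler\CacheWarmup\Crawler\Crawler;
use EliasHaeussler\CacheWarmup\Result\CacheWarmupResult;

/**
 * Hypothetical crawler with a single "limit" option that caps the
 * number of URLs passed on to the wrapped crawler.
 */
final class LimitingCrawler implements ConfigurableCrawler
{
    /** @var array{limit: int} Default options; 0 means "no limit". */
    private array $options = ['limit' => 0];

    public function __construct(private readonly Crawler $inner) {}

    public function setOptions(array $options): void
    {
        // Fail early on typos instead of silently ignoring them.
        $unknown = array_diff_key($options, $this->options);
        if ($unknown !== []) {
            throw new \InvalidArgumentException(
                'Unknown crawler option(s): ' . implode(', ', array_keys($unknown)),
            );
        }

        $this->options = array_replace($this->options, $options);
    }

    public function crawl(array $urls): CacheWarmupResult
    {
        $limit = (int) $this->options['limit'];
        if ($limit > 0) {
            $urls = array_slice($urls, 0, $limit);
        }

        return $this->inner->crawl($urls);
    }
}
```

Rejecting unknown keys is a design choice of this sketch. A crawler may just as well ignore options it does not know.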

Logging crawlers

Crawling results can be logged using a dedicated PSR-3 logger. For this, crawlers must implement the \EliasHaeussler\CacheWarmup\Crawler\LoggingCrawler interface and must be given an appropriate PSR-3 logger. In a TYPO3 context, this is usually done using TYPO3's log manager. Read more about logging in the official documentation.

interface LoggingCrawler
Fully qualified name
\EliasHaeussler\CacheWarmup\Crawler\LoggingCrawler

Interface that allows crawling results to be logged using a dedicated PSR-3 logger.

setLogger ( $logger)

Inject PSR-3 compatible logger.

param \Psr\Log\LoggerInterface $logger

PSR-3 compatible logger.

setLogLevel ( $logLevel)

Set minimum log level.

param string $logLevel

The minimum log level.
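A logging crawler might be sketched as follows. The example assumes that LoggingCrawler extends the Crawler interface and that log levels are passed as PSR-3 level names such as "info" or "error". The namespace, the class name AuditingCrawler, the log messages and the severity map are hypothetical. Since the result API is not covered here, the sketch only logs the start of a run and any exception thrown by the wrapped crawler.

```php
<?php

declare(strict_types=1);

namespace Vendor\Extension\Crawler; // hypothetical namespace

use EliasHaeussler\CacheWarmup\Crawler\Crawler;
use EliasHaeussler\CacheWarmup\Crawler\LoggingCrawler;
use EliasHaeussler\CacheWarmup\Result\CacheWarmupResult;
use Psr\Log\LoggerInterface;
use Psr\Log\LogLevel;
use Psr\Log\NullLogger;

final class AuditingCrawler implements LoggingCrawler
{
    // PSR-3 levels ordered from least to most severe.
    private const SEVERITY = [
        LogLevel::DEBUG => 0, LogLevel::INFO => 1, LogLevel::NOTICE => 2,
        LogLevel::WARNING => 3, LogLevel::ERROR => 4, LogLevel::CRITICAL => 5,
        LogLevel::ALERT => 6, LogLevel::EMERGENCY => 7,
    ];

    private LoggerInterface $logger;
    private string $logLevel = LogLevel::ERROR;

    public function __construct(private readonly Crawler $inner)
    {
        $this->logger = new NullLogger();
    }

    public function setLogger(LoggerInterface $logger): void
    {
        $this->logger = $logger;
    }

    public function setLogLevel(string $logLevel): void
    {
        if (!isset(self::SEVERITY[$logLevel])) {
            throw new \InvalidArgumentException(sprintf('Unknown log level "%s".', $logLevel));
        }

        $this->logLevel = $logLevel;
    }

    public function crawl(array $urls): CacheWarmupResult
    {
        $this->log(LogLevel::INFO, 'Starting cache warmup for {count} URL(s).', ['count' => count($urls)]);

        try {
            return $this->inner->crawl($urls);
        } catch (\Throwable $exception) {
            $this->log(LogLevel::ERROR, 'Cache warmup aborted: {message}', ['message' => $exception->getMessage()]);

            throw $exception;
        }
    }

    // Forward a message only if it meets the configured minimum level.
    private function log(string $level, string $message, array $context = []): void
    {
        if (self::SEVERITY[$level] >= self::SEVERITY[$this->logLevel]) {
            $this->logger->log($level, $message, $context);
        }
    }
}
```

With the minimum level set to "error", the informational start message is dropped and only the abort message would reach the logger.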

Stoppable crawlers

Crawlers implementing the \EliasHaeussler\CacheWarmup\Crawler\StoppableCrawler interface may cancel a cache warmup prematurely if any crawling failure occurs. This can be especially useful for validation purposes to check whether any page within an XML sitemap is inaccessible or failing.

interface StoppableCrawler
Fully qualified name
\EliasHaeussler\CacheWarmup\Crawler\StoppableCrawler

Interface that may cancel a cache warmup prematurely if any crawling failure occurs.

stopOnFailure ( $stopOnFailure)

Configure crawler to cancel cache warmup on failure.

param bool $stopOnFailure

Cancel cache warmup on failure.

Streamable crawlers

When running cache warmup from the TYPO3 backend, the current crawling progress is streamed to the cache warmup progress modal. However, this is only supported for crawlers implementing the \EliasHaeussler\Typo3Warming\Crawler\StreamableCrawler interface.

Those crawlers will then get an \EliasHaeussler\SSE\Stream\EventStream injected. It can be used to send events to the current event stream. The following events are currently available:

  • \EliasHaeussler\Typo3Warming\Http\Message\Event\WarmupFinishedEvent
  • \EliasHaeussler\Typo3Warming\Http\Message\Event\WarmupProgressEvent

When implementing a streamable crawler, there is usually no need to trigger these events yourself. Instead, it is better to use the provided \EliasHaeussler\Typo3Warming\Http\Message\Handler\StreamResponseHandler, which takes care of sending the appropriate events.

interface StreamableCrawler
Fully qualified name
\EliasHaeussler\Typo3Warming\Crawler\StreamableCrawler

Interface that allows streaming of cache warmup events using an EventStream.

setStream ( $stream)

Set event stream used to send cache warmup events.

param \EliasHaeussler\SSE\Stream\EventStream $stream

Event stream used to send cache warmup events.

Steps to implement a new crawler

  1. Create a new crawler

    The new crawler must implement at least one of the following interfaces:

      • \EliasHaeussler\CacheWarmup\Crawler\Crawler
      • \EliasHaeussler\CacheWarmup\Crawler\VerboseCrawler

  2. Configure the new crawler

    Add the new crawler to the extension configuration. Note that you should configure either the crawler or the verboseCrawler option, depending on which interface you have implemented.

  3. Flush system caches

    Finally, flush all system caches to ensure the correct crawler class is used for further cache warmup requests.
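The configuration from step 2 could, for instance, be set in PHP as sketched below. This assumes that the extension configuration is stored under the extension key warming and that the file is an additional configuration file of your TYPO3 installation. The class name is hypothetical. The same options can normally be edited in the extension configuration module of the TYPO3 backend as well.

```php
<?php

// Additional configuration file of the TYPO3 installation, for example
// config/system/additional.php (the exact path depends on your setup).

// Crawler used for cache warmup triggered within the TYPO3 backend.
// "warming" as extension key and the class name are assumptions.
$GLOBALS['TYPO3_CONF_VARS']['EXTENSIONS']['warming']['crawler']
    = \Vendor\Extension\Crawler\MyCrawler::class;

// Alternatively, if the class implements the VerboseCrawler interface:
// $GLOBALS['TYPO3_CONF_VARS']['EXTENSIONS']['warming']['verboseCrawler']
//     = \Vendor\Extension\Crawler\MyVerboseCrawler::class;
```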