Introduction

What does it do?

The Indexed Search Engine provides two major elements to TYPO3:

  1. Indexing: An indexing engine which indexes TYPO3 pages on the fly as they are rendered by TYPO3's frontend. Indexing a page means that all words from the page (or from specifically defined areas on the page) are registered, counted, weighted and finally inserted into a database table of words. A second table is then filled with relation records between the word table and the page. This is the basic idea.
  2. Searching: A plugin you can insert on your website which allows visitors to search for information on the site. When a search is run, the plugin first checks the word table to see whether the word exists; if it does, all pages that have a relation to that word are considered for the search result display. The search results are ordered by factors such as where on the page the word was found or how frequently the word occurs on the page.
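
The two-table idea described above can be illustrated with a small in-memory sketch. This is not the extension's actual code or database schema: the names `TinyIndex`, `words`, `relations`, `index_page` and `search` are hypothetical, and ranking purely by word count is a simplifying assumption.

```python
import re
from collections import Counter, defaultdict


class TinyIndex:
    """Toy word index: one word table plus one word-to-page relation table."""

    def __init__(self):
        self.words = {}                     # word -> word id (the "word table")
        self.relations = defaultdict(dict)  # word id -> {page id: count}

    def index_page(self, page_id, text):
        # Register and count every word on the page (lower-cased).
        counts = Counter(re.findall(r"\w+", text.lower()))
        for word, count in counts.items():
            word_id = self.words.setdefault(word, len(self.words) + 1)
            self.relations[word_id][page_id] = count

    def search(self, word):
        # Look the word up in the word table first; no entry means no results.
        word_id = self.words.get(word.lower())
        if word_id is None:
            return []
        pages = self.relations[word_id]
        # Order pages by how often the word occurs on them, most frequent first.
        return sorted(pages, key=lambda page_id: pages[page_id], reverse=True)


index = TinyIndex()
index.index_page(1, "TYPO3 search indexes pages")
index.index_page(2, "Search, search and search again")
print(index.search("search"))  # page 2 ranks first: three hits versus one
```

A real index would also store a weight and the position of each word so that results can be ordered by more than frequency.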

This is an example of how the search interface on a website looks:

Figure: Search results in the frontend

Features of the indexer

The indexing engine has several features:

  • HTML data priority: 1) <title> data, 2) <meta-keywords>, 3) <meta-description>, 4) <body>
  • Indexing external files: Text formats such as HTML and TXT, as well as DOC and PDF files via external programs (catdoc / pdftotext)
  • Word counts and word frequency are used to rate results
  • Exact or partial search
  • Searching freely for sentences (non-indexed)
  • Not case-sensitive in any way
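
The priority order of the HTML sources can be pictured as per-field weights applied to word counts. The sketch below is only an illustration: the weight values, the dictionary keys used as field names and the `score_words` function are assumptions, not values or names taken from the extension.

```python
import re
from collections import Counter

# Hypothetical weights: higher-priority sources contribute more per occurrence.
FIELD_WEIGHTS = {"title": 8, "keywords": 4, "description": 2, "body": 1}


def score_words(fields):
    """Return a word -> weighted score mapping for one page.

    `fields` maps a field name (see FIELD_WEIGHTS) to its text content.
    """
    scores = Counter()
    for name, text in fields.items():
        weight = FIELD_WEIGHTS.get(name, 1)
        # Lower-casing here is what makes later lookups case-insensitive.
        for word in re.findall(r"\w+", text.lower()):
            scores[word] += weight
    return scores


page = {
    "title": "Indexed Search",
    "keywords": "search, index",
    "description": "Search the website",
    "body": "Use the search form to search all pages.",
}
scores = score_words(page)
print(scores["search"])  # 8 (title) + 4 (keywords) + 2 (description) + 2 (body) = 16
```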

Features of the search frontend (the plugin)

The search interface has several options for advanced searching. Any of those can be disabled and/or preset with default values:

  • Searching whole word, part of word, sentence
  • Logical AND and OR search, including syntactical recognition of AND, OR and NOT as logical keywords. Furthermore, sentences enclosed in quotes are recognized.
  • Searching can be targeted at specific media, for instance only indexed PDF files, HTML files, Word files or TYPO3 pages, or at everything
  • The engine is language-sensitive based on the multiple-language feature of the TYPO3 CMS frontend.
  • Searching can be performed in specific sections of the website.
  • Results can be sorted descending or ascending and ordered by word frequency, weight, location relative to page top, page modification date, page title, etc.
  • The display of search results can be intelligently divided into sections based on the internal page hierarchy. Thus results are primarily grouped by relation, then by hit-relevance.
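
How a query string might be split into logical keywords, words and quoted sentences can be sketched as follows. The function name `parse_query`, the tuple layout, the case-insensitive keyword matching and the choice of AND as the default operator are all assumptions made for illustration; the plugin's real parser may behave differently.

```python
import re

OPERATORS = {"AND", "OR", "NOT"}


def parse_query(query, default_operator="AND"):
    """Split a query into (operator, term) pairs.

    Quoted sentences are kept as a single term; the keywords AND, OR and NOT
    set the operator for the term that follows them.
    """
    # Match either a "quoted sentence" or a run of non-space characters.
    tokens = re.findall(r'"([^"]+)"|(\S+)', query)
    result = []
    operator = default_operator
    for sentence, word in tokens:
        if sentence:
            result.append((operator, sentence.lower()))
        elif word.upper() in OPERATORS:
            # A keyword only changes the operator of the next term.
            operator = word.upper()
            continue
        else:
            result.append((operator, word.lower()))
        operator = default_operator  # reset after each term
    return result


print(parse_query('typo3 OR "indexed search" NOT pdf'))
# [('AND', 'typo3'), ('OR', 'indexed search'), ('NOT', 'pdf')]
```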

This shows the full range of default options for "advanced search":

Figure: All possible advanced search options