:navigation-title: Introduction

.. _introduction:

============
Introduction
============

What is llms.txt?
=================

``llms.txt`` is an emerging standard for providing AI and Large Language Model
(LLM) crawlers with structured information about a website. It serves as a
machine-readable index that helps AI systems understand your website's content
and structure.

The `llmstxt.org specification <https://llmstxt.org/>`__ defines how websites
can provide this information in a standardized format.

Concept
=======

This extension provides a two-tier approach for LLM content access:

1. **llms.txt Index File** - A single file containing:

   - Website metadata (title, description, domain)
   - Page structure with SEO descriptions and keywords
   - Instructions for accessing full page content

2. **Markdown Content Access** - Access any page content via:

   - ``.md`` suffix
   - Returns clean Markdown with YAML frontmatter
   - Example: ``/about.md`` returns the "About" page as Markdown

Multi-Language Support
======================

Instead of generating separate ``llms.txt`` files per language, this extension
uses a simpler approach:

- **Single llms.txt** - Contains the site structure in the default language
- **Language-specific content** - Access any page in any language using the
  ``.md`` suffix with language URL prefix:

  - Default: ``https://example.com/about.md``
  - English: ``https://example.com/en/about.md``
  - German: ``https://example.com/de/ueber-uns.md``

This approach follows how multi-language sites actually work in TYPO3.
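
The URL scheme above can be sketched from a client's point of view. The following Python snippet is only an illustration, not part of the extension: the helper name ``markdown_url`` is hypothetical, and it assumes that page URLs may carry a trailing slash that has to be dropped before the ``.md`` suffix is appended. How the site root is addressed is not covered here, so the sketch simply rejects it.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Return the Markdown variant of a page URL by appending ``.md`` to its path.

    Hypothetical client-side helper; the language prefix (e.g. ``/de``) is
    already part of the page URL, so it is preserved unchanged.
    """
    parts = urlsplit(page_url)
    path = parts.path.rstrip("/")  # assumption: trailing slashes are dropped
    if not path:
        # Assumption: the site root has no slug, so this sketch rejects it.
        raise ValueError("the site root has no slug to append '.md' to")
    if not path.endswith(".md"):
        path += ".md"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


for url in (
    "https://example.com/about",
    "https://example.com/en/about",
    "https://example.com/de/ueber-uns/",
):
    print(markdown_url(url))
```

Because the language is encoded in the URL prefix, the same helper works for every language without needing a language parameter.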

Features
========

Core Features
-------------

- **Automatic llms.txt generation** with smart caching
- **Markdown output** for all pages via ``.md`` URL suffix
- **Multi-language support** via URL prefixes
- **API key protection** for restricted access
- **YAML frontmatter** in Markdown output with page metadata

Page-Level Control
------------------

- **LLM tab** in page properties for fine-grained control
- **Custom descriptions** and summaries per page
- **Keywords** for better LLM understanding
- **Priority setting** (0-100) for page ordering
- **Exclude option** to hide specific pages from llms.txt

Technical Features
------------------

- **24-hour caching** for optimal performance
- **HTML-to-Markdown conversion** using League/html-to-markdown
- **Clean output** - removes scripts, styles, navigation elements
- **UTF-8 BOM** for proper encoding detection
- **Backend notification** if robots.txt lacks llms.txt reference
- **Header link injection** in HTML pages

Requirements
============

- TYPO3 13.0 - 14.x
- PHP 8.2 or higher

Supported Content Elements
==========================

The extension converts the following TYPO3 content elements to Markdown:

- **Header** (``header``)
- **Text** (``text``)
- **Text with Image** (``textpic``, ``textmedia``)
- **Image** (``image``)
- **Bullet List** (``bullets``)
- **Table** (``table``)
- **HTML** (``html``)
- **Menu** elements (``menu_*``)
- All other elements via HTML-to-Markdown fallback
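
On the consuming side, the Markdown output listed under the features above starts with a UTF-8 BOM followed by a YAML frontmatter block. A minimal Python sketch of how a client might separate the two is shown below. The function name ``split_frontmatter`` and the ``title`` key in the sample are hypothetical, and the sketch assumes the frontmatter is delimited by ``---`` lines, as is conventional for YAML frontmatter; the actual fields emitted by the extension are not specified here.

```python
def split_frontmatter(document: str) -> tuple[str, str]:
    """Split a Markdown document into (frontmatter, body).

    Returns an empty frontmatter string when the document does not start
    with a ``---`` delimited block.
    """
    # A BOM decoded as plain UTF-8 shows up as U+FEFF; drop it if present.
    text = document.lstrip("\ufeff")
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:]).lstrip("\n")
            return frontmatter, body
    # No closing delimiter found: treat the whole text as body.
    return "", text


# Hypothetical sample; the real frontmatter keys may differ.
sample = "\ufeff---\ntitle: About\n---\n\n# About\n\nWelcome.\n"
frontmatter, body = split_frontmatter(sample)
print(frontmatter)
print(body)
```

The raw frontmatter string can then be handed to any YAML parser; the sketch deliberately stops before parsing so it has no third-party dependencies.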