.. include:: /Includes.rst.txt

.. _adr-016:

==============================================
ADR-016: Thinking/Reasoning Block Extraction
==============================================

:Status: Accepted
:Date: 2025-12
:Authors: Netresearch DTT GmbH

.. _adr-016-context:

Context
=======

Modern reasoning models emit structured thinking blocks alongside their final
output. Anthropic Claude uses native ``thinking`` content blocks in its API
response. DeepSeek, Qwen, and other models wrap reasoning in
``<think>...</think>`` XML tags within the text content.

These blocks should be accessible for debugging and transparency but must not
pollute the main response.

.. _adr-016-decision:

Decision
========

Extract thinking blocks from LLM responses using a two-tier strategy:

1. **Native extraction** -- Provider-specific structured thinking blocks
   (Anthropic ``type: "thinking"`` content blocks).
2. **Regex fallback** -- ``<think>...</think>`` tag extraction for models that
   embed reasoning inline (DeepSeek, Qwen, local models via
   Ollama/OpenRouter).

:php:`CompletionResponse` carries an optional ``thinking`` property:

.. code-block:: php
    :caption: CompletionResponse with thinking support

    final readonly class CompletionResponse
    {
        public function __construct(
            public string $content,
            public string $model,
            public UsageStatistics $usage,
            public string $finishReason = 'stop',
            public string $provider = '',
            public ?array $toolCalls = null,
            public ?array $metadata = null,
            public ?string $thinking = null, // Extracted thinking content
        ) {}

        public function hasThinking(): bool
        {
            return $this->thinking !== null && trim($this->thinking) !== '';
        }
    }

The base :php:`AbstractProvider` implements the shared regex extraction:

.. code-block:: php
    :caption: AbstractProvider::extractThinkingBlocks()

    protected function extractThinkingBlocks(string $content): array
    {
        $thinking = null;
        if (preg_match_all('#<think>([\s\S]*?)</think>#i', $content, $matches)) {
            // Join the inner text of all blocks, one block per line.
            $thinking = trim(implode("\n", $matches[1]));
            // Replace each block with a space so surrounding words are not glued together.
            $cleaned = preg_replace('#<think>[\s\S]*?</think>#i', ' ', $content);
            // Collapse the runs of spaces and tabs left behind by the removal.
            $content = trim(preg_replace('/[ \t]+/', ' ', $cleaned));
        }
        // Returns [cleaned content, thinking text or null].
        return [$content, $thinking !== '' ? $thinking : null];
    }

Provider-specific integration:

- **ClaudeProvider** -- Iterates over the response ``content`` array. Collects
  ``type: "thinking"`` blocks natively, then runs ``extractThinkingBlocks()``
  on the text content and merges both results.
- **OpenAiProvider** -- Runs ``extractThinkingBlocks()`` on the message content
  (covers DeepSeek and Qwen via OpenAI-compatible APIs).
- **GeminiProvider** -- Runs ``extractThinkingBlocks()`` on the first
  candidate's text part.
- **OpenRouterProvider** -- Inherits the OpenAI behavior (covers all
  OpenRouter-hosted models).

.. _adr-016-consequences:

Consequences
============

**Positive:**

- ●● Thinking content is preserved without polluting the main output.
- ● Two-tier extraction covers both native and inline thinking formats.
- ● ``hasThinking()`` is a convenience method for conditional UI display.
- ◐ Regex handles multiple ``<think>`` blocks per response, concatenating them.
- ◐ Each removed block is replaced with a single space, so the words around it
  are not glued together.

**Negative:**

- ◑ Regex extraction adds marginal processing overhead per response.
- ◑ Non-thinking uses of ``<think>`` tags would be incorrectly extracted.

**Net Score:** +5.0 (Strong positive)

.. _adr-016-files-changed:

Files changed
=============

**Modified:**

- :file:`Classes/Domain/Model/CompletionResponse.php` -- Added ``thinking``
  property and ``hasThinking()``.
- :file:`Classes/Provider/AbstractProvider.php` -- Added
  ``extractThinkingBlocks()`` and ``createCompletionResponse()`` with thinking
  parameter.
- :file:`Classes/Provider/ClaudeProvider.php` -- Native thinking block
  extraction plus regex fallback.
- :file:`Classes/Provider/OpenAiProvider.php` -- Regex-based thinking
  extraction.
- :file:`Classes/Provider/GeminiProvider.php` -- Regex-based thinking
  extraction.
- :file:`Classes/Provider/OpenRouterProvider.php` -- Inherits OpenAI behavior.
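
Illustrative sketch
===================

To make the two-tier strategy from the Decision section concrete, the
following standalone sketch combines native ``type: "thinking"`` blocks with
the ``<think>`` regex fallback for an Anthropic-style ``content`` array. It is
not the actual ``ClaudeProvider`` code: the helper name
``splitClaudeContent()``, the free-function form of
``extractThinkingBlocks()``, and the merge order (native blocks and inline
blocks joined with newlines in the order encountered) are assumptions made
for illustration only.

.. code-block:: php
    :caption: Hypothetical sketch: merging native and inline thinking

    <?php

    declare(strict_types=1);

    /**
     * Regex tier: strip <think>...</think> blocks from plain text.
     *
     * @return array{0: string, 1: ?string} [cleaned content, thinking or null]
     */
    function extractThinkingBlocks(string $content): array
    {
        $thinking = null;
        if (preg_match_all('#<think>([\s\S]*?)</think>#i', $content, $matches)) {
            $thinking = trim(implode("\n", $matches[1]));
            // Replace each block with a space so neighboring words stay separated.
            $cleaned = preg_replace('#<think>[\s\S]*?</think>#i', ' ', $content) ?? $content;
            $content = trim(preg_replace('/[ \t]+/', ' ', $cleaned) ?? $cleaned);
        }

        return [$content, $thinking !== '' ? $thinking : null];
    }

    /**
     * Native tier plus regex fallback (hypothetical helper, not the real provider).
     *
     * @param list<array<string, mixed>> $blocks Decoded "content" array of a response
     * @return array{0: string, 1: ?string} [visible text, merged thinking or null]
     */
    function splitClaudeContent(array $blocks): array
    {
        $text = [];
        $thinking = [];
        foreach ($blocks as $block) {
            $type = $block['type'] ?? '';
            if ($type === 'thinking' && is_string($block['thinking'] ?? null)) {
                // Tier 1: structured thinking block delivered by the API.
                $thinking[] = trim($block['thinking']);
            } elseif ($type === 'text' && is_string($block['text'] ?? null)) {
                // Tier 2: inline <think> tags embedded in ordinary text.
                [$clean, $inline] = extractThinkingBlocks($block['text']);
                $text[] = $clean;
                if ($inline !== null) {
                    $thinking[] = $inline;
                }
            }
        }
        // Drop empty fragments so an empty result becomes null, not ''.
        $thinking = array_values(array_filter(
            $thinking,
            static fn (string $t): bool => $t !== ''
        ));

        return [
            trim(implode('', $text)),
            $thinking === [] ? null : implode("\n", $thinking),
        ];
    }

    [$content, $thinking] = splitClaudeContent([
        ['type' => 'thinking', 'thinking' => 'Compare both options.'],
        ['type' => 'text', 'text' => 'Use<think>B is simpler.</think>option B.'],
    ]);

    echo $content, PHP_EOL;   // Use option B.
    echo $thinking, PHP_EOL;  // two lines: native block first, then inline block

The sample input places the ``<think>`` block directly between two words to
show why the removal inserts a space: without it, the visible text would read
``Useoption B.``.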