ADR-016: Thinking/Reasoning Block Extraction
- Status: Accepted
- Date: 2025-12
- Authors: Netresearch DTT GmbH
Context
Modern reasoning models emit structured thinking blocks alongside their final
output. Anthropic Claude uses native thinking content blocks in its API
response. DeepSeek, Qwen, and other models wrap reasoning in
<think>...</think> XML tags within the text content. These blocks should be
accessible for debugging and transparency but must not pollute the main response.
Decision
Extract thinking blocks from LLM responses using a two-tier strategy:
- Native extraction -- Provider-specific structured thinking blocks (Anthropic type: "thinking" content blocks).
- Regex fallback -- <think>...</think> tag extraction for models that embed reasoning inline (DeepSeek, Qwen, local models via Ollama/OpenRouter).
CompletionResponse carries an optional thinking property:
    final readonly class CompletionResponse
    {
        public function __construct(
            public string $content,
            public string $model,
            public UsageStatistics $usage,
            public string $finishReason = 'stop',
            public string $provider = '',
            public ?array $toolCalls = null,
            public ?array $metadata = null,
            public ?string $thinking = null, // Extracted thinking content
        ) {}

        public function hasThinking(): bool
        {
            return $this->thinking !== null && trim($this->thinking) !== '';
        }
    }
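A minimal usage sketch of the hasThinking() check for conditional display. It assumes a trimmed-down stand-in for the response class: DemoResponse and renderResponse() are hypothetical names used only for illustration, and the real CompletionResponse takes more constructor arguments.

```php
<?php
// Hypothetical, reduced stand-in for CompletionResponse: only the fields
// needed to demonstrate hasThinking() are kept.
final readonly class DemoResponse
{
    public function __construct(
        public string $content,
        public ?string $thinking = null,
    ) {}

    public function hasThinking(): bool
    {
        return $this->thinking !== null && trim($this->thinking) !== '';
    }
}

// Hypothetical renderer: the reasoning is shown in a collapsible element
// only when the response actually carries some.
function renderResponse(DemoResponse $response): string
{
    $html = '<p>' . htmlspecialchars($response->content) . '</p>';
    if ($response->hasThinking()) {
        $html .= '<details><summary>Reasoning</summary>'
            . htmlspecialchars((string) $response->thinking)
            . '</details>';
    }
    return $html;
}

echo renderResponse(new DemoResponse('Hi', 'step 1')), "\n";
echo renderResponse(new DemoResponse('Hi')), "\n"; // no <details> element
```

Because hasThinking() also rejects whitespace-only strings, callers do not need a separate emptiness check before rendering.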
The base AbstractProvider implements the shared regex extraction:
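    /**
     * Splits inline <think>...</think> reasoning from the visible content.
     *
     * @return array{0: string, 1: ?string} [cleaned content, thinking or null]
     */
    protected function extractThinkingBlocks(string $content): array
    {
        $thinking = null;
        if (preg_match_all('#<think>([\s\S]*?)</think>#i', $content, $matches)) {
            // Join all captured blocks, one per line.
            $thinking = trim(implode("\n", $matches[1]));
            // Replace each block with a space so neighbouring words are not glued.
            $cleaned = preg_replace('#<think>[\s\S]*?</think>#i', ' ', $content);
            $content = trim(preg_replace('/[ \t]+/', ' ', $cleaned));
        }
        return [$content, $thinking !== '' ? $thinking : null];
    }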
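To make the behavior concrete, the sketch below copies the method body into a free function (an assumption made only so the snippet runs on its own) and applies it to a made-up response containing two inline blocks.

```php
<?php
// Standalone copy of the method shown above, for illustration only.
function extractThinkingBlocks(string $content): array
{
    $thinking = null;
    if (preg_match_all('#<think>([\s\S]*?)</think>#i', $content, $matches)) {
        $thinking = trim(implode("\n", $matches[1]));
        $cleaned = preg_replace('#<think>[\s\S]*?</think>#i', ' ', $content);
        $content = trim(preg_replace('/[ \t]+/', ' ', $cleaned));
    }
    return [$content, $thinking !== '' ? $thinking : null];
}

// Made-up sample input with two reasoning blocks.
$raw = '<think>User wants a sum.</think>The answer<think>2 + 2 = 4</think>is 4.';
[$content, $thinking] = extractThinkingBlocks($raw);

var_dump($content);  // string "The answer is 4."
var_dump($thinking); // string "User wants a sum.\n2 + 2 = 4" (two lines)
```

The second block sits between "answer" and "is" with no surrounding whitespace. Replacing it with a single space is what keeps the two words apart in the cleaned content.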
Provider-specific integration:
- ClaudeProvider -- Iterates the response content array. Collects type: "thinking" blocks natively, then runs extractThinkingBlocks() on text content. Merges both.
- OpenAiProvider -- Runs extractThinkingBlocks() on message content (covers DeepSeek, Qwen via OpenAI-compatible API).
- GeminiProvider -- Runs extractThinkingBlocks() on first candidate text part.
- OpenRouterProvider -- Inherits OpenAI behavior (covers all OpenRouter-hosted models).
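The two-tier flow described for ClaudeProvider could look roughly like the sketch below. The helper name splitClaudeContent(), the merge order (native blocks first, inline blocks after) and the newline separator are assumptions for illustration and may differ from the actual provider code. The block shape, with a "thinking" key on type: "thinking" blocks and a "text" key on text blocks, follows the Anthropic Messages API.

```php
<?php
// Same regex helper as in AbstractProvider, copied so the sketch runs alone.
function extractThinkingBlocks(string $content): array
{
    $thinking = null;
    if (preg_match_all('#<think>([\s\S]*?)</think>#i', $content, $matches)) {
        $thinking = trim(implode("\n", $matches[1]));
        $cleaned = preg_replace('#<think>[\s\S]*?</think>#i', ' ', $content);
        $content = trim(preg_replace('/[ \t]+/', ' ', $cleaned));
    }
    return [$content, $thinking !== '' ? $thinking : null];
}

// Hypothetical helper: returns [text, thinking or null] for a decoded
// Claude "content" array.
function splitClaudeContent(array $blocks): array
{
    $text = '';
    $thinkingParts = [];
    foreach ($blocks as $block) {
        $type = $block['type'] ?? '';
        if ($type === 'thinking' && is_string($block['thinking'] ?? null)) {
            // Tier 1: native structured thinking block.
            $thinkingParts[] = $block['thinking'];
        } elseif ($type === 'text' && is_string($block['text'] ?? null)) {
            $text .= $block['text'];
        }
    }
    // Tier 2: regex fallback for <think> tags embedded in the text.
    [$text, $inline] = extractThinkingBlocks($text);
    if ($inline !== null) {
        $thinkingParts[] = $inline;
    }
    $thinking = trim(implode("\n", $thinkingParts));
    return [$text, $thinking !== '' ? $thinking : null];
}

// Made-up response content mixing both formats.
$blocks = [
    ['type' => 'thinking', 'thinking' => 'Check the units first.'],
    ['type' => 'text', 'text' => '<think>km, not miles</think>Distance: 5 km'],
];
[$text, $thinking] = splitClaudeContent($blocks);

var_dump($text);     // string "Distance: 5 km"
var_dump($thinking); // string "Check the units first.\nkm, not miles"
```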
Consequences
Positive:
- ●● Thinking content is preserved without polluting main output.
- ● Two-tier extraction covers both native and inline thinking formats.
- ● hasThinking() convenience method for conditional UI display.
- ◐ Regex handles multiple <think> blocks per response, concatenating them.
- ◐ Removed blocks are replaced with a space, so the surrounding words are not glued together.
Negative:
- ◑ Regex extraction adds marginal processing overhead per response.
- ◑ Non-thinking uses of <think> tags would be incorrectly extracted.
Net Score: +5.0 (Strong positive)
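The false-positive risk listed among the negative consequences can be reproduced with a short sketch. The function is a standalone copy of the shared regex helper, and the sample sentence is made up: it is an answer that merely talks about the tag syntax.

```php
<?php
// Standalone copy of the shared regex helper, for illustration only.
function extractThinkingBlocks(string $content): array
{
    $thinking = null;
    if (preg_match_all('#<think>([\s\S]*?)</think>#i', $content, $matches)) {
        $thinking = trim(implode("\n", $matches[1]));
        $cleaned = preg_replace('#<think>[\s\S]*?</think>#i', ' ', $content);
        $content = trim(preg_replace('/[ \t]+/', ' ', $cleaned));
    }
    return [$content, $thinking !== '' ? $thinking : null];
}

// The tags here are part of the answer itself, not model reasoning.
$answer = 'DeepSeek wraps reasoning in <think>like this</think> before answering.';
[$content, $thinking] = extractThinkingBlocks($answer);

var_dump($content);  // string "DeepSeek wraps reasoning in before answering."
var_dump($thinking); // string "like this"
```

The visible answer loses the quoted example because the regex cannot tell literal tag text from actual reasoning.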
Files changed
Modified:
- Classes/Domain/Model/CompletionResponse.php -- Added thinking property and hasThinking().
- Classes/Provider/AbstractProvider.php -- Added extractThinkingBlocks() and createCompletionResponse() with thinking parameter.
- Classes/Provider/ClaudeProvider.php -- Native thinking block extraction plus regex fallback.
- Classes/Provider/OpenAiProvider.php -- Regex-based thinking extraction.
- Classes/Provider/GeminiProvider.php -- Regex-based thinking extraction.
- Classes/Provider/OpenRouterProvider.php -- Inherits OpenAI behavior.