ADR-033: Specialized Models in the Model Registry
- Status
-
Accepted
- Date
-
2026-06-11
- Authors
-
Netresearch DTT GmbH
Context
The backend Models module manages tx_nrllm_model records for the
chat/embedding pipeline, but the specialized services (image
generation, text-to-speech, transcription — ADR-030: Specialized Services Authenticate Through nr-vault,
ADR-032: Specialized Usage Tracking and Pricing Catalog) selected their models from hardcoded constants
(dall-e-3, tts-1, whisper-1) and never consulted the
registry. Image and speech models were therefore invisible in the
backend: administrators could not curate them, mark a preferred
default, or see usage linked to a record. Consuming extensions had no
way to ask "which image model should I use on this instance?".
Decision
- Specialized capabilities.
ModelgainsCapability IMAGE,TEXT_TO_SPEECHandTRANSCRIPTIONcases, exposed in thetx_nrllm_modelTCA capabilities select, the BE group capability permissions and the model-picker capability badges. Image, TTS and transcription models are regular registry records. - Capability-based default resolution.
Dall,EImage Service TextandTo Speech Service WhisperexposeTranscription Service resolve: ACTIVE registry records carrying the service's capability are considered provider-agnostically; anDefault Model (string $fallback): string is_defaultrecord wins, then the lowestsorting; the record'smodel_idis returned. Fail-soft — any error, missing repository, or no matching record returns the fallback unchanged; the method never throws (the same posture asSpecialized, ADR-032: Specialized Usage Tracking and Pricing Catalog).Cost Calculator - Usage linkage. Specialized usage rows now carry the matching
registry record's uid as
model_uid(resolved fail-soft from the usedmodel_id), so the Analytics model breakdowns link image and speech spend to the curated records;0remains the value for models without a registry record. -
Configuration-based resolution for specialized services.
tx_nrllm_configurationrecords are the stable indirection layer for image/TTS/transcription exactly as for chat: a consumer references a configuration by identifier, the administrator swaps the assigned model (or adjusts the system prompt) on the record, and every consumer picks it up without re-configuring anything. The three services expose the consumer-facing APIresolve— resolution order: the ACTIVE configuration's ACTIVE model record'sModel For Configuration (string $configuration Identifier, string $fallback): string model_id(records with an emptymodel_idare skipped) → the capability-based registry default (decision 2) → the given fallback. Fail-soft, never throws.get— the configuration's system prompt; the empty string when the configuration is unknown, inactive, or unreadable. The prompt is returned to the consumer, never injected implicitly, so the consumer always records the exact prompt it sent (transparency requirement).Configuration System Prompt (string $configuration Identifier): string
For image generation the model MUST be resolved before the options object is constructed:
ImagevalidatesGeneration Options sizeagainst the concrete model value at construction time. - Usage attribution per configuration. The specialized options
DTOs (
Image,Generation Options Speech,Synthesis Options Transcription) carry an optionalOptions configurationidentifier — pure metadata that never reaches the upstream API and never alters validation. When set, the services resolve the configuration uid fail-soft and pass it asconfigurationUidtotrackUsage(), so the Analytics module aggregates specialized spend per configuration just like chat spend. - Snippet-enforcement hook (Phase 2). The planned prompt-snippet
feature (pinning/enforcing prompt snippets) attaches at the
Configuration level.
getConfigurationSystemPrompt()is the single seam where enforced snippets will be folded into the returned prompt — consumers keep calling the same method and stay unchanged when Phase 2 lands.
Consequences
- ● Image, TTS and transcription models are first-class registry citizens: curated, activatable, default-flagged and visible in the backend Models module like chat models.
- ● Consuming extensions resolve the instance-preferred specialized
model via
resolveDefaultModel()instead of hardcoding one, with a guaranteed-safe fallback. - ● Configurations are the stable consumer contract for specialized calls too: model swaps and system-prompt changes are central, one-record edits — no consumer redeployment.
- ● Analytics model breakdowns link specialized spend to registry
records via
model_uidand to configurations viaconfiguration_uid. - ◐ Hardcoded service defaults remain as fallbacks — instances without curated records keep working unchanged.
- ◑ Up to two additional fail-soft repository lookups per tracked specialized call (indexed single-row queries; negligible next to the API call).