ADR-032: Specialized Usage Tracking and Pricing Catalog
- Status: Accepted
- Date: 2026-06-10
- Authors: Netresearch DTT GmbH
Context
The chat/embedding path records complete usage rows: the middleware
pipeline (ADR-026: Provider Middleware Pipeline) tracks tokens and derives a cost from the
admin-curated tx_nrllm_model pricing via Model::.
The specialised services bypass that pipeline by design — but they
recorded almost nothing. The image services passed metric keys
(size, quality, count) that
Usage does not map, so only
request_count = 1 landed in tx_nrllm_service_usage: no cost, no
tokens, no images_generated, no model_id. TTS recorded
characters but no cost; Whisper recorded nothing but the request.
Consequently the Analytics module, the MonthlyCost widget and
BudgetService systematically excluded all image and speech spend —
defeating the requirement that nr_llm can monitor total AI spend.
Two structural problems compounded this:
- the specialised services have no access to model pricing: their models (gpt-image-2, tts-1, whisper-1) usually have no tx_nrllm_model row, and
- gpt-image-* responses carry a usage token object (DALL·E responses do not), which was discarded.
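To illustrate the second problem, the difference between the two response shapes can be sketched as follows. This is a minimal Python illustration, not the extension's PHP code: extract_token_usage is a hypothetical helper, and the response layout (a top-level usage object with integer counters) is an assumption based on the field names this ADR lists.

```python
from typing import Any, Dict

# Token counters named in this ADR; input_tokens_details is nested and
# is left out of this simplified sketch.
TOKEN_FIELDS = ("input_tokens", "output_tokens", "total_tokens")


def extract_token_usage(response: Dict[str, Any]) -> Dict[str, int]:
    """Return the token counters of an image response, or {} if absent.

    Hypothetical helper: an empty dict means "no token metrics", which is
    the DALL-E case where the response carries no usage object.
    """
    usage = response.get("usage")
    if not isinstance(usage, dict):
        return {}
    return {
        field: usage[field]
        for field in TOKEN_FIELDS
        if isinstance(usage.get(field), int)
    }


# Assumed gpt-image-* style response (values are made up).
with_usage = {
    "data": [{}],
    "usage": {
        "input_tokens": 50,
        "output_tokens": 100,
        "total_tokens": 150,
        "input_tokens_details": {"text_tokens": 10, "image_tokens": 40},
    },
}
# Assumed DALL-E style response: no usage object at all.
without_usage = {"data": [{}]}
```

Returning an empty mapping rather than zeros keeps "no token data" distinguishable from "zero tokens used".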
Decision
- Real units in the callers. The services pass the metric keys the tracker actually maps: images (→ images_generated), characters, audioSeconds (→ audio_seconds_used, from the verbose_json Whisper duration), token keys when the response reports them, and the model identifier as modelId (→ model_id). Provider strings drop the ad-hoc provider:model suffixes (dall-e:dall-e-3 → provider dall-e + model_id).
- Token usage parsing. DallEImageService parses the usage object of gpt-image-* responses (input_tokens, output_tokens, total_tokens, input_tokens_details) so token aggregates include image calls; DALL·E responses without usage gracefully omit token metrics.
- Static price catalog with a DB override. Specialized\Pricing\OpenAiPriceCatalog encodes the published OpenAI list prices (each constant documents source URL and verification date): gpt-image-* token prices and per-image fallback estimates, DALL·E per-image prices by quality/size, tts-1 / tts-1-hd per 1M characters, whisper-1 per minute. SpecializedCostCalculator (injected into AbstractSpecializedService) resolves in order: admin-curated tx_nrllm_model row matching the model identifier (reusing Model::estimateCost(), so negotiated prices win) → catalog token prices → catalog per-image price → 0.0. Unknown models never get a guessed cost: a zero cost signals "no price data" instead of fabricating numbers.
- No double counting. LlmTranslator no longer repeats the token count on its translation row (the pipeline already records tokens and cost on the underlying chat row); it keeps the translation-level request/characters view. WhisperTranscriptionService::translateToEnglish() loses its second trackUsage() call; the dispatch path records the request exactly once.
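The resolution order of the cost calculator can be sketched as below. This is a simplified Python illustration under stated assumptions, not the extension's PHP implementation: TokenPrice, resolve_cost, the dictionary-based lookups and the model names are hypothetical, all prices are placeholders rather than real list prices, and the conditions for falling through from one step to the next (for example, requiring reported tokens before using catalog token prices) are guesses.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TokenPrice:
    """Price per 1M tokens (hypothetical shape, placeholder values)."""
    input_per_million: float
    output_per_million: float

    def estimate(self, input_tokens: int, output_tokens: int) -> float:
        return (input_tokens * self.input_per_million
                + output_tokens * self.output_per_million) / 1_000_000


def resolve_cost(
    model_id: str,
    db_prices: Dict[str, TokenPrice],
    catalog_token_prices: Dict[str, TokenPrice],
    catalog_image_prices: Dict[str, float],
    input_tokens: int = 0,
    output_tokens: int = 0,
    images: int = 0,
) -> float:
    # 1. An admin-curated row for this model wins over any catalog entry.
    db_price = db_prices.get(model_id)
    if db_price is not None:
        return db_price.estimate(input_tokens, output_tokens)

    # 2. Catalog token prices (assumption: only if tokens were reported).
    token_price = catalog_token_prices.get(model_id)
    if token_price is not None and (input_tokens > 0 or output_tokens > 0):
        return token_price.estimate(input_tokens, output_tokens)

    # 3. Catalog per-image price as a fallback estimate.
    image_price = catalog_image_prices.get(model_id)
    if image_price is not None and images > 0:
        return image_price * images

    # 4. No price data: report zero instead of guessing.
    return 0.0


# Placeholder data; model names and prices are invented for the example.
DB_PRICES = {"model-a": TokenPrice(2.0, 4.0)}
CATALOG_TOKEN_PRICES = {
    "model-a": TokenPrice(10.0, 40.0),
    "model-b": TokenPrice(10.0, 40.0),
}
CATALOG_IMAGE_PRICES = {"model-b": 0.5, "model-c": 0.25}
```

With this data, model-a is priced from the override even though the catalog also lists it, model-b uses token prices when tokens are reported and the per-image price otherwise, and an unlisted model resolves to 0.0. The sketch covers only the steps named in the resolution order; per-character and per-minute prices are left out.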
Consequences
- ● Image, TTS and Whisper spend appears in the Analytics module, the MonthlyCost widget and BudgetService aggregates, so total spend monitoring covers all service types.
- ● Costs follow published list prices and can be overridden per model by creating a tx_nrllm_model row with token pricing.
- ◑ The catalog requires manual maintenance when OpenAI changes list prices; constants carry source URLs and verification dates to make the review mechanical.
- ◑ Analytics grouped by service_provider now shows dall-e / fal / tts / whisper instead of suffixed variants (dall-e:dall-e-3); historic rows keep their old strings, and the model dimension moved to model_id.
- ◑ FAL calls record images but cost 0.0, because FAL publishes no static list prices for its hosted models.
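Because historic rows keep their suffixed provider strings, a report that spans the change may want to normalise both forms before grouping. A minimal Python sketch, assuming the legacy format is exactly provider:model; split_provider is a hypothetical helper and not part of nr_llm.

```python
from typing import Optional, Tuple


def split_provider(
    service_provider: str, model_id: Optional[str] = None
) -> Tuple[str, Optional[str]]:
    """Normalise old and new rows to a (provider, model_id) pair.

    Old rows carry the model as a suffix ("dall-e:dall-e-3"); new rows
    carry a bare provider plus a separate model_id value. An explicit
    model_id takes precedence over a suffix.
    """
    provider, _, suffix = service_provider.partition(":")
    return provider, (model_id or suffix or None)
```

For example, split_provider("dall-e:dall-e-3") and split_provider("dall-e", "dall-e-3") both yield ("dall-e", "dall-e-3"), so old and new rows land in the same group.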