Changelog

What's new in Synthr

New features, improvements, and fixes — shipped every two weeks.

May 8, 2025 · v2.1.0

Semantic caching GA + new model support

Semantic caching is now generally available for all plans. We've also added support for Claude 3.7 Sonnet, Gemini 2.0 Flash, and o3-mini.

New: Semantic caching is now available on all plans, including Hobby

New: Claude 3.7 Sonnet added to the model roster

New: Gemini 2.0 Flash, the fastest Gemini model yet and a good fit for classification

New: o3-mini support for reasoning-heavy workloads

Improved: Cache hit rate improved by 12% with a better embedding model

Improved: Dashboard now shows a per-model cost breakdown

Fixed: Streaming responses no longer drop the final token on some edge nodes

Fixed: Webhook retries now back off correctly on rate-limit errors
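
For readers new to semantic caching, here is a minimal sketch of the general idea: compare the embedding of an incoming prompt against stored embeddings and reuse a stored response when the similarity is high enough. This illustrates the technique only and is not Synthr's implementation. The names `CacheEntry`, `cosineSimilarity`, and `lookupSemanticCache`, and the 0.95 threshold, are all assumptions made for the example.

```typescript
// Hypothetical sketch of a semantic cache lookup (not Synthr's actual code).
type CacheEntry = { embedding: number[]; response: string };

// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("embeddings must be non-empty and the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the cached response of the most similar entry, or undefined when
// no entry reaches the (assumed) similarity threshold.
function lookupSemanticCache(
  query: number[],
  entries: CacheEntry[],
  threshold: number = 0.95
): string | undefined {
  let best: CacheEntry | undefined;
  let bestScore = -Infinity;
  for (const entry of entries) {
    const score = cosineSimilarity(query, entry.embedding);
    if (score > bestScore) {
      bestScore = score;
      best = entry;
    }
  }
  return best !== undefined && bestScore >= threshold ? best.response : undefined;
}
```

A linear scan like this only suits small caches; a production system would more likely use an approximate nearest-neighbor index.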

April 15, 2025 · v2.0.0

Synthr v2 — rebuilt from the ground up

Our biggest release ever. The entire routing layer is new, the SDK has been rewritten with full TypeScript inference, and the observability dashboard is live.

New: Routing engine with automatic provider failover

New: TypeScript SDK with full type inference and no any types

New: Real-time observability dashboard with token usage and latency histograms

New: Global edge network expanded to 32 regions

New: Webhooks for request completion, errors, and cache events

New: Custom rate limit configuration per API key

Improved: P95 latency reduced from 95 ms to 44 ms

Improved: Model roster expanded from 12 to 50+ models

Deprecated: v1 SDK (synthr-legacy) will reach end-of-life on July 1, 2025
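
As a rough illustration of what automatic provider failover means in practice, the sketch below tries each provider in priority order and returns the first successful result. It is a generic pattern, not the routing engine's actual logic. `callWithFailover` and the `Provider` shape are hypothetical names introduced for this example.

```typescript
// Hypothetical sketch of sequential provider failover (not Synthr's routing engine).
type Provider<T> = { name: string; call: () => Promise<T> };

// Tries providers in order; returns the first success along with the
// provider that served it. Throws only when every provider has failed.
async function callWithFailover<T>(
  providers: Provider<T>[]
): Promise<{ provider: string; result: T }> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      const result = await p.call();
      return { provider: p.name, result };
    } catch (err) {
      // Record the failure and fall through to the next provider.
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${p.name}: ${message}`);
    }
  }
  throw new Error(`all providers failed (${errors.join("; ")})`);
}
```

A real router would probably also weigh health checks, timeouts, and cost, none of which this sketch models.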

March 3, 2025 · v1.9.0

Streaming improvements + Llama 3.1 support

Streaming is now stable across all providers. We've also shipped first-class Llama 3.1 support with routing through our dedicated inference nodes.

New: Llama 3.1 8B, 70B, and 405B now available

New: Streaming now works consistently across all 30+ supported models

Improved: Error messages are now structured JSON with actionable codes

Improved: SDK retry logic now uses exponential backoff by default

Fixed: Function calling now works correctly with Mistral models

Fixed: Token counts in responses are now accurate for all providers
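
Exponential backoff doubles the wait between retries up to a cap. The sketch below shows that calculation with optional full jitter. The base delay, cap, and function names are assumptions chosen for illustration. The SDK's actual defaults are not stated in this changelog.

```typescript
// Hypothetical sketch of exponential backoff (values are illustrative,
// not the SDK's real defaults).

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 200,
  maxMs: number = 10000
): number {
  if (!Number.isInteger(attempt) || attempt < 0) {
    throw new RangeError("attempt must be a non-negative integer");
  }
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// "Full jitter": pick a random delay in [0, delayMs) so that many clients
// retrying at once do not synchronize. `random` is injectable for testing.
function withFullJitter(
  delayMs: number,
  random: () => number = Math.random
): number {
  return Math.floor(random() * delayMs);
}
```

With these assumed values, attempts 0 through 3 wait at most 200, 400, 800, and 1600 ms, and later attempts are capped at 10 seconds.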

January 20, 2025 · v1.8.0

Analytics API + team management

Query your usage data programmatically via the Analytics API. Teams can now manage members and per-key permissions directly from the dashboard.

New: Analytics API for querying token usage, latency, and costs programmatically

New: Team management for inviting members, assigning roles, and managing API keys

New: Per-key model restrictions that limit keys to specific models

Improved: Dashboard loads 3x faster with the new data pipeline

Fixed: Billing portal no longer shows incorrect usage on the first day of a cycle
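
Conceptually, a per-key model restriction is an allow-list check performed before a request is routed. The sketch below shows one way such a check could look. The `ApiKeyPolicy` shape, the `isModelAllowed` function, and the model identifiers in the usage comment are hypothetical and do not reflect Synthr's real configuration schema.

```typescript
// Hypothetical sketch of a per-key model allow-list check
// (the policy shape is an assumption, not Synthr's schema).
interface ApiKeyPolicy {
  // When omitted, the key is treated as unrestricted.
  allowedModels?: string[];
}

// Returns true if the key may call the given model.
function isModelAllowed(policy: ApiKeyPolicy, model: string): boolean {
  if (policy.allowedModels === undefined) return true;
  return policy.allowedModels.indexOf(model) !== -1;
}

// Example usage with made-up model identifiers:
// isModelAllowed({ allowedModels: ["llama-3.1-8b"] }, "llama-3.1-70b") is false
```

Treating a missing list as unrestricted keeps existing keys working. A stricter design would deny by default instead.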