Changelog

What's new in Synthr

New features, improvements, and fixes — shipped every two weeks.

May 8, 2025 · v2.1.0

Semantic caching GA + new model support

Semantic caching is now generally available for all plans. We've also added support for Claude 3.7 Sonnet, Gemini 2.0 Flash, and o3-mini.

New: Semantic caching is now available on all plans, including Hobby
New: Claude 3.7 Sonnet added to the model roster
New: Gemini 2.0 Flash, the fastest Gemini model yet and a good fit for classification
New: o3-mini support for reasoning-heavy workloads
Improved: Cache hit rate improved by 12% with a better embedding model
Improved: Dashboard now shows a per-model cost breakdown
Fixed: Streaming responses no longer drop the final token on some edge nodes
Fixed: Webhook retries now back off correctly on rate-limit errors
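
For readers new to the idea, semantic caching generally means reusing a stored response when a new prompt is close enough, in embedding space, to one seen before. The sketch below illustrates that general technique only; it is not Synthr's implementation. The SemanticCache class, the 0.9 similarity threshold, and the toy_embed function (a letter-frequency stand-in for a real embedding model) are all assumptions made for this example.

```python
import math
from typing import Callable, List, Optional, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical in-memory cache keyed by prompt embeddings."""

    def __init__(self, embed: Callable[[str], Vector], threshold: float = 0.9) -> None:
        self._embed = embed          # assumed: any function mapping text to a vector
        self._threshold = threshold  # assumed cut-off, not a documented default
        self._entries: List[Tuple[Vector, str]] = []

    def get(self, prompt: str) -> Optional[str]:
        """Return the closest cached response at or above the threshold, else None."""
        query = self._embed(prompt)
        best_score = -1.0
        best_response: Optional[str] = None
        for vector, response in self._entries:
            score = cosine_similarity(query, vector)
            if score >= self._threshold and score > best_score:
                best_score, best_response = score, response
        return best_response

    def put(self, prompt: str, response: str) -> None:
        """Store a response under the embedding of its prompt."""
        self._entries.append((self._embed(prompt), response))


def toy_embed(text: str) -> Vector:
    """Toy stand-in for an embedding model: counts of the letters a-z."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts
```

With a cache built as SemanticCache(toy_embed), a put for one prompt followed by a get for a differently cased or punctuated version of it returns the stored response, while an unrelated prompt returns None. A production cache would replace the linear scan with a vector index and would tune the threshold against real traffic.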

April 15, 2025 · v2.0.0

Synthr v2 — rebuilt from the ground up

Our biggest release ever. The entire routing layer is new, the SDK has been rewritten with full TypeScript inference, and the observability dashboard is live.

New: New routing engine with automatic provider failover
New: TypeScript SDK with full type inference and no "any" types
New: Real-time observability dashboard with token usage and latency histograms
New: Global edge network expanded to 32 regions
New: Webhooks for request completion, errors, and cache events
New: Custom rate limit configuration per API key
Improved: P95 latency reduced from 95 ms to 44 ms
Improved: Model roster expanded from 12 to 50+ models
Deprecated: v1 SDK (synthr-legacy) will reach end of life on July 1, 2025
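
As a rough picture of what provider failover means in general, the sketch below tries a list of providers in order and returns the first successful response. It is an illustration under stated assumptions, not the Synthr routing engine: call_with_failover, AllProvidersFailed, and the (name, callable) provider shape are hypothetical names invented for this example.

```python
from typing import Callable, List, Sequence, Tuple


class AllProvidersFailed(Exception):
    """Raised when every provider in the list has failed."""


def call_with_failover(
    providers: Sequence[Tuple[str, Callable[[str], str]]],
    prompt: str,
) -> Tuple[str, str]:
    """Try each (name, call) pair in order.

    Returns (provider_name, response) from the first provider that succeeds.
    Raises AllProvidersFailed, listing every error, if none succeed.
    """
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:
            # Simplification: fail over on any exception. A real router would
            # likely fail over only on retryable errors (timeouts, 5xx, 429).
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

The order of the providers list is the priority order here. A real router would presumably also weigh latency, cost, and provider health, none of which this sketch models.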

March 3, 2025 · v1.9.0

Streaming improvements + Llama 3.1 support

Streaming is now stable across all providers. We've also shipped first-class Llama 3.1 support with routing through our dedicated inference nodes.

New: Llama 3.1 8B, 70B, and 405B now available
New: Streaming now works consistently across all 30+ supported models
Improved: Error messages are now structured JSON with actionable codes
Improved: SDK retry logic now uses exponential backoff by default
Fixed: Function calling now works correctly with Mistral models
Fixed: Token counts in responses are now accurate for all providers
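
For context on the retry change, exponential backoff means doubling the wait ceiling after each failed attempt, usually with random jitter so that many clients do not retry in lockstep. The sketch below shows the general pattern only. The function names, the 0.5 s base, the 30 s cap, the five-attempt limit, and the use of full jitter are assumptions for illustration, not the SDK's actual defaults.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Delay ceiling in seconds for a zero-based attempt: base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base: float = 0.5,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> T:
    """Run operation, retrying on any exception with full-jitter exponential backoff."""
    rng = rng or random.Random()
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: wait a random time between 0 and the current ceiling.
            sleep(rng.uniform(0.0, backoff_delay(attempt, base, cap)))
    raise ValueError("max_attempts must be at least 1")
```

The sleep and rng parameters exist so the behavior can be exercised without real waiting, for example by passing a list's append method as sleep to record the chosen delays.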

January 20, 2025 · v1.8.0

Analytics API + team management

Query your usage data programmatically via the Analytics API. Teams can now manage members and per-key permissions directly from the dashboard.

New: Analytics API for querying token usage, latency, and costs programmatically
New: Team management for inviting members, assigning roles, and managing API keys
New: Per-key model restrictions that limit keys to specific models
Improved: Dashboard loads 3x faster with a new data pipeline
Fixed: Billing portal no longer shows incorrect usage on the first day of a cycle