Build faster.
Save more.
Welcome to WatchLLM documentation. Here you'll find everything you need to implement semantic caching at the edge.
Join Discord
Get help from the community
Security Policy
How we handle your data
Getting Started
Everything you need to get WatchLLM up and running in your project.
Guides & Concepts
Deep dive into how WatchLLM handles semantic caching and analytics.
Python SDK
Complete reference for the WatchLLM Python SDK with auto-instrumentation.
Node.js SDK
Complete reference for the WatchLLM Node.js/TypeScript SDK.
Self-Hosting Guide
Enterprise deployment guide for self-hosted WatchLLM infrastructure.
Architecture
Understanding the edge proxy system design.
Analytics Guide
Mastering cost savings and performance metrics.
Code Examples
Boilerplate for JS, Python, and cURL.
API & Reference
Technical specifications and error resolution guides.
SDKs
Complete SDK documentation for Node.js and Python integrations.
Semantic A/B Testing
Compare performance and cost between different LLM providers in real time. Automatically route requests to the most efficient variant.
View Example Implementation
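To make the routing idea concrete, here is a minimal sketch of how an A/B router could pick the cheaper of two LLM provider variants using an epsilon-greedy policy. This is an illustration only, not the WatchLLM API: the `ABRouter` class, its method names, and the variant labels are all hypothetical.

```python
import random


class ABRouter:
    """Hypothetical epsilon-greedy router over LLM provider variants.

    Tracks average cost per call for each variant and routes most
    traffic to the cheapest one, while occasionally exploring others.
    """

    def __init__(self, variants, epsilon=0.1):
        self.epsilon = epsilon
        self.stats = {v: {"calls": 0, "total_cost": 0.0} for v in variants}

    def pick(self):
        # Try each variant at least once before comparing averages.
        untried = [v for v, s in self.stats.items() if s["calls"] == 0]
        if untried:
            return untried[0]
        # Explore a random variant with probability epsilon...
        if random.random() < self.epsilon:
            return random.choice(list(self.stats))
        # ...otherwise exploit the variant with the lowest average cost.
        return min(
            self.stats,
            key=lambda v: self.stats[v]["total_cost"] / self.stats[v]["calls"],
        )

    def record(self, variant, cost):
        # Feed observed per-request cost back into the running stats.
        s = self.stats[variant]
        s["calls"] += 1
        s["total_cost"] += cost


# Hypothetical usage with two made-up variant labels:
router = ABRouter(["variant-a", "variant-b"], epsilon=0.0)
router.record("variant-a", 0.010)
router.record("variant-b", 0.001)
print(router.pick())
```

With `epsilon=0.0` the router always exploits, so after recording one call per variant it routes to the cheaper `variant-b`. A production router would also weigh latency and error rates, not cost alone.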