API.md - WatchLLM API Specification
Purpose: Complete API documentation for developers integrating with WatchLLM.
Table of Contents
- Quick Start
- Authentication
- Proxy API Endpoints
- Dashboard API Endpoints
- Webhooks
- Error Handling
- Rate Limits
- SDKs & Examples
1. Quick Start
Get Your API Key
- Sign up at https://WatchLLM.com/signup
- Create a project
- Copy your API key (starts with lgw_)
Make Your First Request
cURL:
curl https://proxy.WatchLLM.com/v1/chat/completions \
-H "Authorization: Bearer lgw_your_api_key_here" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4",
"messages": [{"role": "user", "content": "Hello!"}]
}'
Node.js (OpenAI SDK):
import OpenAI from 'openai';
const openai = new OpenAI({
apiKey: 'lgw_your_api_key_here', // Your WatchLLM key
baseURL: 'https://proxy.WatchLLM.com/v1'
});
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'Hello!' }]
});
console.log(response.choices[0].message.content);
Python (OpenAI SDK):
from openai import OpenAI
client = OpenAI(
api_key='lgw_your_api_key_here',
base_url='https://proxy.WatchLLM.com/v1'
)
response = client.chat.completions.create(
model='gpt-4',
messages=[{'role': 'user', 'content': 'Hello!'}]
)
print(response.choices[0].message.content)
2. Authentication
API Key Format
All proxy requests require an API key in the Authorization header:
Authorization: Bearer lgw_your_api_key_here
API Key Types
| Type | Format | Use Case |
|---|---|---|
| Project Key | lgw_proj_... | Standard API access |
| Test Key | lgw_test_... | Sandbox testing (free, limited) |
Security Best Practices
- Never expose API keys in client-side code
- Store keys in environment variables (see the sketch after this list)
- Rotate keys regularly
- Use separate keys for dev/staging/prod
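For example, a minimal sketch of loading the key from an environment variable and failing fast when it is missing (the WatchLLM_API_KEY variable name matches the SDK examples in section 8):
import OpenAI from 'openai';

// Read the key from the environment instead of hard-coding it
const apiKey = process.env.WatchLLM_API_KEY;
if (!apiKey) throw new Error('WatchLLM_API_KEY is not set');

const client = new OpenAI({ apiKey, baseURL: 'https://proxy.WatchLLM.com/v1' });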
3. Proxy API Endpoints
Base URL: https://proxy.WatchLLM.com
3.1 Chat Completions
POST /v1/chat/completions
OpenAI-compatible chat endpoint. Supports all OpenAI parameters.
Request:
{
"model": "gpt-4",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
],
"temperature": 0.7,
"max_tokens": 100,
"stream": false
}
Response:
{
"id": "chatcmpl-xyz123",
"object": "chat.completion",
"created": 1234567890,
"model": "gpt-4",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 15,
"completion_tokens": 10,
"total_tokens": 25
},
"x-WatchLLM-cached": false,
"x-WatchLLM-cost-usd": "0.000625",
"x-WatchLLM-latency-ms": 1847
}
Custom Response Headers:
| Header | Description | Example |
|---|---|---|
| x-WatchLLM-cached | Was response cached? | true or false |
| x-WatchLLM-cost-usd | Estimated cost | "0.000625" |
| x-WatchLLM-latency-ms | Response time | 1847 |
| x-WatchLLM-tokens-saved | Tokens saved (if cached) | 25 |
| x-WatchLLM-provider | Which provider was used | "openai" |
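With raw fetch, these headers can be read straight off the HTTP response; a minimal sketch:
const response = await fetch('https://proxy.WatchLLM.com/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.WatchLLM_API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Hello!' }]
  })
});

// Cache and cost metadata arrive as response headers
console.log('cached:', response.headers.get('x-WatchLLM-cached'));
console.log('cost (USD):', response.headers.get('x-WatchLLM-cost-usd'));
console.log('latency (ms):', response.headers.get('x-WatchLLM-latency-ms'));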
Supported Models:
| Provider | Models |
|---|---|
| OpenAI | gpt-4, gpt-4-turbo, gpt-3.5-turbo |
| Anthropic | claude-3-opus, claude-3-sonnet, claude-3-haiku |
| Groq | llama-3-70b, mixtral-8x7b |
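Because every provider sits behind the same endpoint, switching providers should only require changing the model name; a sketch with an Anthropic model:
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.WatchLLM_API_KEY,
  baseURL: 'https://proxy.WatchLLM.com/v1'
});

// Same request shape as for gpt-4; the proxy picks the provider from the model name
const response = await client.chat.completions.create({
  model: 'claude-3-sonnet',
  messages: [{ role: 'user', content: 'Hello!' }]
});
console.log(response.choices[0].message.content);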
Streaming:
Set "stream": true to receive Server-Sent Events (SSE):
const response = await fetch('https://proxy.WatchLLM.com/v1/chat/completions', {
method: 'POST',
headers: {
'Authorization': 'Bearer lgw_xxx',
'Content-Type': 'application/json'
},
body: JSON.stringify({
model: 'gpt-4',
messages: [{ role: 'user', content: 'Write a story' }],
stream: true
})
});
const reader = response.body.getReader();
while (true) {
const { done, value } = await reader.read();
if (done) break;
// Log raw SSE chunks ("data: {...}" lines); see 8.1 for SDK-parsed streaming
console.log(new TextDecoder().decode(value));
}
3.2 Completions (Legacy)
POST /v1/completions
OpenAI legacy completions endpoint.
Request:
{
"model": "gpt-3.5-turbo-instruct",
"prompt": "Once upon a time",
"max_tokens": 50,
"temperature": 0.7
}
Response:
{
"id": "cmpl-xyz123",
"object": "text_completion",
"created": 1234567890,
"model": "gpt-3.5-turbo-instruct",
"choices": [
{
"text": " there was a kingdom...",
"index": 0,
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 4,
"completion_tokens": 50,
"total_tokens": 54
}
}
3.3 Embeddings
POST /v1/embeddings
Generate text embeddings.
Request:
{
"model": "text-embedding-ada-002",
"input": "The quick brown fox jumps over the lazy dog"
}
Response:
{
"object": "list",
"data": [
{
"object": "embedding",
"embedding": [0.123, 0.456, ..., 0.789],
"index": 0
}
],
"model": "text-embedding-ada-002",
"usage": {
"prompt_tokens": 8,
"total_tokens": 8
}
}
Batch Embeddings:
{
"model": "text-embedding-ada-002",
"input": [
"First sentence",
"Second sentence",
"Third sentence"
]
}
3.4 Health Check
GET /health
Check if proxy is operational.
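A quick liveness probe, e.g. with fetch:
const res = await fetch('https://proxy.WatchLLM.com/health');
const { status, version } = await res.json();
console.log(status, version); // "ok 1.0.0"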
Response:
{
"status": "ok",
"version": "1.0.0",
"timestamp": "2024-12-12T10:30:00Z"
}
4. Dashboard API Endpoints
Base URL: https://WatchLLM.com/api
Authentication: Session cookie (automatically handled by Supabase)
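A minimal sketch of calling a dashboard endpoint from the browser (assuming the user is already signed in, so the session cookie is present):
// Same-origin requests send the session cookie automatically;
// credentials: 'include' is needed when calling from another origin
const res = await fetch('https://WatchLLM.com/api/dashboard/stats?period=7d', {
  credentials: 'include'
});
const stats = await res.json();
console.log(stats.requests_today, stats.cache_hit_rate);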
4.1 Dashboard Stats
GET /api/dashboard/stats
Get overview statistics.
Query Parameters:
- period - Time period (7d, 30d, all). Default: 30d
Response:
{
"requests_total": 125000,
"requests_today": 4200,
"cache_hit_rate": 0.42,
"cost_this_month": 87.50,
"cost_saved": 62.30,
"top_models": [
{"model": "gpt-4", "requests": 80000, "cost": 65.00},
{"model": "gpt-3.5-turbo", "requests": 45000, "cost": 22.50}
],
"chart_data": [
{"date": "2024-12-01", "requests": 3500, "cost": 2.80},
{"date": "2024-12-02", "requests": 4100, "cost": 3.20}
]
}
4.2 Projects
GET /api/projects
List all projects for authenticated user.
Response:
{
"projects": [
{
"id": "proj_abc123",
"name": "My ChatGPT Wrapper",
"provider": "openai",
"created_at": "2024-01-15T10:00:00Z",
"stats": {
"requests_this_month": 50000,
"cost_this_month": 45.00,
"cache_hit_rate": 0.38
}
}
]
}
POST /api/projects
Create a new project.
Request:
{
"name": "Production App",
"provider": "openai"
}
Response:
{
"id": "proj_xyz789",
"name": "Production App",
"provider": "openai",
"api_key": "lgw_proj_newkey123abc...",
"created_at": "2024-12-12T10:30:00Z"
}
GET /api/projects/:id
Get project details.
Response:
{
"id": "proj_abc123",
"name": "My ChatGPT Wrapper",
"provider": "openai",
"created_at": "2024-01-15T10:00:00Z",
"api_keys": [
{
"id": "key_123",
"name": "Production Key",
"key": "lgw_proj_abc...",
"last_used_at": "2024-12-12T09:45:00Z",
"created_at": "2024-01-15T10:00:00Z"
}
],
"stats": {
"requests_total": 125000,
"requests_this_month": 50000,
"cost_total": 450.00,
"cost_this_month": 45.00,
"cache_hit_rate": 0.38
}
}
DELETE /api/projects/:id
Delete a project and all its API keys.
Response:
{
"success": true,
"message": "Project deleted"
}
4.3 API Keys
POST /api/projects/:project_id/keys
Create a new API key for a project.
Request:
{
"name": "Staging Key"
}
Response:
{
"id": "key_456",
"name": "Staging Key",
"key": "lgw_proj_newkey456def...",
"created_at": "2024-12-12T10:30:00Z"
}
⚠️ The key field is only returned once. Save it securely!
DELETE /api/projects/:project_id/keys/:key_id
Revoke an API key.
Response:
{
"success": true,
"message": "API key revoked"
}
4.4 Usage Logs
GET /api/projects/:project_id/usage
Get usage logs for a project.
Query Parameters:
- from - Start date (ISO 8601). Default: 30 days ago
- to - End date (ISO 8601). Default: now
- limit - Max results. Default: 100, Max: 1000
- model - Filter by model. Optional
Response:
{
"logs": [
{
"id": "log_abc123",
"model": "gpt-4",
"tokens_used": 150,
"cost_usd": 0.0045,
"cached": false,
"latency_ms": 2100,
"created_at": "2024-12-12T10:30:00Z"
},
{
"id": "log_def456",
"model": "gpt-4",
"tokens_used": 0,
"cost_usd": 0,
"cached": true,
"tokens_saved": 150,
"latency_ms": 45,
"created_at": "2024-12-12T10:31:00Z"
}
],
"total": 2,
"summary": {
"total_requests": 2,
"total_tokens": 150,
"total_cost": 0.0045,
"cache_hits": 1,
"cache_hit_rate": 0.5
}
}
GET /api/projects/:project_id/usage/export
Export usage logs as CSV.
Query Parameters:
- Same as the /usage endpoint
Response: CSV file download
date,model,tokens_used,cost_usd,cached,latency_ms
2024-12-12T10:30:00Z,gpt-4,150,0.0045,false,2100
2024-12-12T10:31:00Z,gpt-4,0,0,true,45
4.5 Provider Configuration
POST /api/projects/:project_id/providers
Add or update provider API key.
Request:
{
"provider": "openai",
"api_key": "sk-your-actual-openai-key"
}
Response:
{
"success": true,
"message": "Provider key updated",
"provider": "openai"
}
⚠️ Keys are encrypted and never returned in API responses.
DELETE /api/projects/:project_id/providers/:provider
Remove provider API key.
Response:
{
"success": true,
"message": "Provider key removed"
}
4.6 Billing
POST /api/billing/create-checkout
Create Stripe checkout session to upgrade plan.
Request:
{
"plan": "pro",
"billing_period": "monthly"
}
Response:
{
"checkout_url": "https://checkout.stripe.com/c/pay/cs_xxx..."
}
Redirect user to checkout_url to complete payment.
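For example, from the dashboard front end (a sketch; error handling omitted):
const res = await fetch('/api/billing/create-checkout', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ plan: 'pro', billing_period: 'monthly' })
});
const { checkout_url } = await res.json();
// Hand the user off to Stripe Checkout to complete payment
window.location.href = checkout_url;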
POST /api/billing/create-portal
Create Stripe customer portal session (manage subscription).
Response:
{
"portal_url": "https://billing.stripe.com/p/session/xxx..."
}
GET /api/billing/subscription
Get current subscription details.
Response:
{
"plan": "pro",
"status": "active",
"current_period_end": "2025-01-12T10:30:00Z",
"cancel_at_period_end": false,
"usage": {
"requests_this_month": 500000,
"plan_limit": 1000000,
"percentage_used": 0.5
}
}
5. Webhooks
5.1 Stripe Webhooks
Endpoint: https://WatchLLM.com/api/webhooks/stripe
Events Handled:
checkout.session.completed
User completed checkout, subscription created.
Payload:
{
"type": "checkout.session.completed",
"data": {
"object": {
"id": "cs_xxx",
"customer": "cus_xxx",
"subscription": "sub_xxx",
"metadata": {
"user_id": "user_abc123"
}
}
}
}
Action: Create subscription record, upgrade user plan.
invoice.payment_succeeded
Recurring payment succeeded.
Action: Log payment, extend subscription period.
invoice.payment_failed
Payment failed.
Action: Send email alert, downgrade after 3 failures.
customer.subscription.deleted
User canceled subscription.
Action: Downgrade to free plan at period end.
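Incoming events should be signature-verified before acting on them. A minimal sketch using the official stripe Node library (the environment variable names are illustrative):
import Stripe from 'stripe';

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY);

// Express-style handler; req.body must be the raw, unparsed request body
function handleStripeWebhook(req, res) {
  let event;
  try {
    // Throws if the signature does not match the endpoint secret
    event = stripe.webhooks.constructEvent(
      req.body,
      req.headers['stripe-signature'],
      process.env.STRIPE_WEBHOOK_SECRET
    );
  } catch (err) {
    return res.status(400).send(`Webhook error: ${err.message}`);
  }
  if (event.type === 'checkout.session.completed') {
    const { user_id } = event.data.object.metadata;
    // Create subscription record, upgrade plan for user_id
  }
  res.json({ received: true });
}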
5.2 Usage Alert Webhooks (Coming Soon)
Configure webhooks to receive alerts when usage thresholds are hit.
6. Error Handling
Error Response Format
{
"error": {
"code": "rate_limit_exceeded",
"message": "You have exceeded your plan's rate limit",
"details": {
"plan": "starter",
"limit": 60,
"reset_at": "2024-12-12T10:31:00Z"
}
}
}
HTTP Status Codes
| Code | Meaning | Example |
|---|---|---|
| 200 | Success | Request completed |
| 400 | Bad Request | Invalid JSON |
| 401 | Unauthorized | Invalid API key |
| 402 | Payment Required | Subscription expired |
| 429 | Rate Limited | Too many requests |
| 500 | Server Error | Internal error |
| 502 | Bad Gateway | Provider API down |
| 503 | Service Unavailable | Maintenance mode |
Error Codes
| Code | Description | Solution |
|---|---|---|
| invalid_api_key | API key not found or revoked | Check key, create new one |
| rate_limit_exceeded | Too many requests | Wait or upgrade plan |
| quota_exceeded | Monthly quota reached | Upgrade plan or wait for reset |
| invalid_model | Model not supported | Check supported models list |
| provider_error | Provider API error | Retry, check provider status |
| cache_error | Cache operation failed | Retry, contact support if persists |
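A sketch of dispatching on the error envelope shown above:
const response = await fetch('https://proxy.WatchLLM.com/v1/chat/completions', {
  /* ...request options as in section 3.1... */
});
if (!response.ok) {
  const { error } = await response.json();
  switch (error.code) {
    case 'rate_limit_exceeded':
      // Back off until error.details.reset_at, or upgrade the plan
      break;
    case 'invalid_api_key':
      // Check the key in the dashboard, or create a new one
      break;
    default:
      throw new Error(`${error.code}: ${error.message}`);
  }
}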
Retry Logic
For 429, 502, and 503 errors, implement exponential backoff:
async function makeRequestWithRetry(url, options, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
const response = await fetch(url, options);
if (response.ok) return response;
if ([502, 503, 429].includes(response.status)) {
const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
await new Promise(resolve => setTimeout(resolve, delay));
continue;
}
throw new Error(`Request failed: ${response.status}`);
}
throw new Error('Max retries exceeded');
}
7. Rate Limits
By Plan
| Plan | Requests/Minute | Requests/Month |
|---|---|---|
| Free | 10 | 10,000 |
| Starter | 60 | 100,000 |
| Pro | 600 | 1,000,000 |
| Agency | 6,000 | 10,000,000 |
Rate Limit Headers
Every response includes these headers:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1702377660
- X-RateLimit-Limit - Max requests per minute for your plan
- X-RateLimit-Remaining - Requests remaining in current window
- X-RateLimit-Reset - Unix timestamp when limit resets
Handling Rate Limits
const response = await fetch('https://proxy.WatchLLM.com/v1/chat/completions', {
/* ... */
});
if (response.status === 429) {
const resetTime = response.headers.get('X-RateLimit-Reset');
const waitSeconds = parseInt(resetTime) - Math.floor(Date.now() / 1000);
console.log(`Rate limited. Retry in ${waitSeconds} seconds`);
await new Promise(resolve => setTimeout(resolve, waitSeconds * 1000));
// Retry request
}
8. SDKs & Examples
8.1 JavaScript/TypeScript
Install:
npm install openai
Usage:
import OpenAI from 'openai';
const client = new OpenAI({
apiKey: process.env.WatchLLM_API_KEY,
baseURL: 'https://proxy.WatchLLM.com/v1'
});
// Chat completion
const response = await client.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'Hello!' }]
});
// Streaming
const stream = await client.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'Tell me a story' }],
stream: true
});
for await (const chunk of stream) {
process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
8.2 Python
Install:
pip install openai
Usage:
import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ['WatchLLM_API_KEY'],
base_url='https://proxy.WatchLLM.com/v1'
)
# Chat completion
response = client.chat.completions.create(
model='gpt-4',
messages=[{'role': 'user', 'content': 'Hello!'}]
)
print(response.choices[0].message.content)
# Streaming
stream = client.chat.completions.create(
model='gpt-4',
messages=[{'role': 'user', 'content': 'Tell me a story'}],
stream=True
)
for chunk in stream:
print(chunk.choices[0].delta.content or '', end='')
8.3 cURL Examples
Chat completion:
curl https://proxy.WatchLLM.com/v1/chat/completions \
-H "Authorization: Bearer $WatchLLM_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4",
"messages": [{"role": "user", "content": "Hello!"}]
}'
Embeddings:
curl https://proxy.WatchLLM.com/v1/embeddings \
-H "Authorization: Bearer $WatchLLM_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "text-embedding-ada-002",
"input": "Hello world"
}'
8.4 Next.js Example
pages/api/chat.ts:
import type { NextApiRequest, NextApiResponse } from 'next';
import OpenAI from 'openai';
const client = new OpenAI({
apiKey: process.env.WatchLLM_API_KEY,
baseURL: 'https://proxy.WatchLLM.com/v1'
});
export default async function handler(req: NextApiRequest, res: NextApiResponse) {
const { message } = req.body;
const response = await client.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: message }]
});
res.json({
reply: response.choices[0].message.content,
// The custom fields use dashed keys, so read them with bracket notation
cached: (response as any)['x-WatchLLM-cached'],
cost: (response as any)['x-WatchLLM-cost-usd']
});
}
8.5 Go Example
Install:
go get github.com/sashabaranov/go-openai
Usage:
package main
import (
"context"
"fmt"
"os"
"github.com/sashabaranov/go-openai"
)
func main() {
config := openai.DefaultConfig(os.Getenv("WatchLLM_API_KEY"))
config.BaseURL = "https://proxy.WatchLLM.com/v1"
client := openai.NewClientWithConfig(config)
resp, err := client.CreateChatCompletion(
context.Background(),
openai.ChatCompletionRequest{
Model: openai.GPT4,
Messages: []openai.ChatCompletionMessage{
{
Role: openai.ChatMessageRoleUser,
Content: "Hello!",
},
},
},
)
if err != nil {
fmt.Printf("ChatCompletion error: %v\n", err)
return
}
fmt.Println(resp.Choices[0].Message.Content)
}
8.6 Ruby Example
Install:
gem install ruby-openai
Usage:
require 'openai'
client = OpenAI::Client.new(
access_token: ENV['WatchLLM_API_KEY'],
uri_base: "https://proxy.WatchLLM.com/v1"
)
response = client.chat(
parameters: {
model: "gpt-4",
messages: [{ role: "user", content: "Hello!" }],
temperature: 0.7,
}
)
puts response.dig("choices", 0, "message", "content")
Support
- Documentation: https://docs.WatchLLM.com
- API Status: https://status.WatchLLM.com
- Email: support@WatchLLM.com
- Discord: https://discord.gg/WatchLLM
Changelog
v1.0.0 (2024-12-12)
- Initial release
- Chat completions endpoint
- Embeddings endpoint
- Semantic caching
- Rate limiting
- Usage analytics