# AI Gateway

Observe and control your AI applications

> Links below point directly to Markdown versions of each page. Any page can also be retrieved as Markdown by sending an `Accept: text/markdown` header to the page's URL without the `index.md` suffix (for example, `curl -H "Accept: text/markdown" https://docs.ahq.lat/ai-gateway/`).
>
> For other Cloudflare products, see the [Cloudflare documentation directory](https://docs.ahq.lat/llms.txt).

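The note above about retrieving any page as Markdown can be sketched in Python as follows. This is a minimal illustration using only the standard library; the helper name `markdown_request` is hypothetical and not part of any Cloudflare tooling. It strips the `index.md` suffix from a link in this directory and attaches the `Accept: text/markdown` header.

```python
from urllib.request import Request


def markdown_request(index_md_url: str) -> Request:
    """Build a request for the Markdown version of a docs page.

    Hypothetical helper: takes a link from this directory and returns a
    request for the page URL with an `Accept: text/markdown` header.
    """
    suffix = "index.md"
    # Drop the trailing index.md so the URL points at the page itself.
    if index_md_url.endswith(suffix):
        page_url = index_md_url[: -len(suffix)]
    else:
        page_url = index_md_url
    return Request(page_url, headers={"Accept": "text/markdown"})


req = markdown_request("https://docs.ahq.lat/ai-gateway/index.md")
print(req.full_url)              # https://docs.ahq.lat/ai-gateway/
print(req.get_header("Accept"))  # text/markdown
```

Passing the returned object to `urllib.request.urlopen` would send the request; the sketch stops short of that so it runs without network access.
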
## Overview

- [Cloudflare AI Gateway](https://docs.ahq.lat/ai-gateway/index.md): Observe and control your AI applications with analytics, caching, rate limiting, and model fallback through AI Gateway.

## Getting started

- [Getting started](https://docs.ahq.lat/ai-gateway/get-started/index.md): Set up AI Gateway and send your first request to observe and control AI API traffic.

## Models

- [Models](https://docs.ahq.lat/ai/models/index.md): Explore AI models available through AI Gateway, including models from OpenAI, Anthropic, and Google.

## Using AI Gateway

- [Using AI Gateway](https://docs.ahq.lat/ai-gateway/usage/index.md): Connect your AI applications to AI Gateway using the unified API, provider-native endpoints, or WebSockets.
- [Unified API (OpenAI compat)](https://docs.ahq.lat/ai-gateway/usage/chat-completion/index.md): Send requests to multiple AI providers through a single OpenAI-compatible endpoint on AI Gateway.
- [Anthropic](https://docs.ahq.lat/ai-gateway/usage/providers/anthropic/index.md): Route Anthropic API requests through AI Gateway for observability and control.
- [Azure OpenAI](https://docs.ahq.lat/ai-gateway/usage/providers/azureopenai/index.md): Route Azure OpenAI requests through AI Gateway for observability and control.
- [Baseten](https://docs.ahq.lat/ai-gateway/usage/providers/baseten/index.md): Route Baseten model inference requests through AI Gateway for observability and control.
- [Amazon Bedrock](https://docs.ahq.lat/ai-gateway/usage/providers/bedrock/index.md): Route Amazon Bedrock requests through AI Gateway for observability and control.
- [Cartesia](https://docs.ahq.lat/ai-gateway/usage/providers/cartesia/index.md): Route Cartesia text-to-speech requests through AI Gateway for observability and control.
- [Cerebras](https://docs.ahq.lat/ai-gateway/usage/providers/cerebras/index.md): Route Cerebras inference requests through AI Gateway for observability and control.
- [Cohere](https://docs.ahq.lat/ai-gateway/usage/providers/cohere/index.md): Route Cohere API requests through AI Gateway for observability and control.
- [Deepgram](https://docs.ahq.lat/ai-gateway/usage/providers/deepgram/index.md): Route Deepgram speech-to-text and text-to-speech requests through AI Gateway for observability and control.
- [DeepSeek](https://docs.ahq.lat/ai-gateway/usage/providers/deepseek/index.md): Route DeepSeek API requests through AI Gateway for observability and control.
- [ElevenLabs](https://docs.ahq.lat/ai-gateway/usage/providers/elevenlabs/index.md): Route ElevenLabs text-to-speech requests through AI Gateway for observability and control.
- [Fal AI](https://docs.ahq.lat/ai-gateway/usage/providers/fal/index.md): Route Fal AI generative media requests through AI Gateway for observability and control.
- [Google AI Studio](https://docs.ahq.lat/ai-gateway/usage/providers/google-ai-studio/index.md): Route Google AI Studio and Gemini requests through AI Gateway for observability and control.
- [xAI](https://docs.ahq.lat/ai-gateway/usage/providers/grok/index.md): Route xAI (Grok) API requests through AI Gateway for observability and control.
- [Groq](https://docs.ahq.lat/ai-gateway/usage/providers/groq/index.md): Route Groq API requests through AI Gateway for observability and control.
- [HuggingFace](https://docs.ahq.lat/ai-gateway/usage/providers/huggingface/index.md): Route HuggingFace Inference API requests through AI Gateway for observability and control.
- [Ideogram](https://docs.ahq.lat/ai-gateway/usage/providers/ideogram/index.md): Route Ideogram image generation requests through AI Gateway for observability and control.
- [Mistral AI](https://docs.ahq.lat/ai-gateway/usage/providers/mistral/index.md): Route Mistral AI requests through AI Gateway for observability and control.
- [OpenAI](https://docs.ahq.lat/ai-gateway/usage/providers/openai/index.md): Route OpenAI API requests through AI Gateway for observability and control.
- [OpenRouter](https://docs.ahq.lat/ai-gateway/usage/providers/openrouter/index.md): Route OpenRouter API requests through AI Gateway for observability and control.
- [Parallel](https://docs.ahq.lat/ai-gateway/usage/providers/parallel/index.md): Route Parallel API requests through AI Gateway for observability and control.
- [Perplexity](https://docs.ahq.lat/ai-gateway/usage/providers/perplexity/index.md): Route Perplexity API requests through AI Gateway for observability and control.
- [Replicate](https://docs.ahq.lat/ai-gateway/usage/providers/replicate/index.md): Route Replicate API requests through AI Gateway for observability and control.
- [Google Vertex AI](https://docs.ahq.lat/ai-gateway/usage/providers/vertex/index.md): Route Google Vertex AI requests through AI Gateway for observability and control.
- [Workers AI](https://docs.ahq.lat/ai-gateway/usage/providers/workersai/index.md): Route Workers AI requests through AI Gateway for analytics, caching, and rate limiting.
- [REST API](https://docs.ahq.lat/ai-gateway/usage/rest-api/index.md): Call third-party and Workers AI models through the Cloudflare API with AI Gateway features like logging, caching, and rate limiting.
- [Universal Endpoint (Deprecated)](https://docs.ahq.lat/ai-gateway/usage/universal/index.md): Route requests to any AI provider through a single AI Gateway endpoint with support for fallbacks and retries.
- [WebSockets API](https://docs.ahq.lat/ai-gateway/usage/websockets-api/index.md): Use persistent WebSocket connections through AI Gateway for real-time and non-realtime AI interactions.
- [Non-realtime WebSockets API](https://docs.ahq.lat/ai-gateway/usage/websockets-api/non-realtime-api/index.md): Establish persistent WebSocket connections for AI requests through AI Gateway without real-time streaming.
- [Realtime WebSockets API](https://docs.ahq.lat/ai-gateway/usage/websockets-api/realtime-api/index.md): Connect to AI providers that support real-time WebSocket interactions through AI Gateway.

## Features

- [Features](https://docs.ahq.lat/ai-gateway/features/index.md): Explore AI Gateway features including caching, rate limiting, guardrails, dynamic routing, and data loss prevention.
- [Caching](https://docs.ahq.lat/ai-gateway/features/caching/index.md): Override caching settings on a per-request basis.
- [Data Loss Prevention (DLP)](https://docs.ahq.lat/ai-gateway/features/dlp/index.md): Protect sensitive data in AI Gateway prompts and responses using Cloudflare DLP detection engines.
- [Set up Data Loss Prevention (DLP)](https://docs.ahq.lat/ai-gateway/features/dlp/set-up-dlp/index.md): Enable and configure DLP policies on your AI Gateway to scan prompts and responses for sensitive data.
- [Dynamic routing](https://docs.ahq.lat/ai-gateway/features/dynamic-routing/index.md): Route AI Gateway requests based on conditions, quotas, and fallbacks using a visual interface or JSON configuration.
- [JSON Configuration](https://docs.ahq.lat/ai-gateway/features/dynamic-routing/json-configuration/index.md): Define AI Gateway dynamic routing flows using the REST API and JSON element structure.
- [Using a dynamic route](https://docs.ahq.lat/ai-gateway/features/dynamic-routing/usage/index.md): Send requests through an AI Gateway dynamic route using the OpenAI SDK or REST API.
- [Guardrails](https://docs.ahq.lat/ai-gateway/features/guardrails/index.md): Evaluate AI Gateway prompts and responses for harmful content and enforce safety policies across providers.
- [Set up Guardrails](https://docs.ahq.lat/ai-gateway/features/guardrails/set-up-guardrail/index.md): Enable and configure AI Gateway Guardrails to flag or block harmful content in prompts and responses.
- [Supported model types](https://docs.ahq.lat/ai-gateway/features/guardrails/supported-model-types/index.md): Review which AI model types AI Gateway Guardrails evaluates for text generation, embeddings, and unknown models.
- [Usage considerations](https://docs.ahq.lat/ai-gateway/features/guardrails/usage-considerations/index.md): Understand latency, availability, language support, and Workers AI usage when enabling AI Gateway Guardrails.
- [Rate limiting](https://docs.ahq.lat/ai-gateway/features/rate-limiting/index.md): Control traffic to your AI Gateway with fixed or sliding rate limits to prevent excessive costs and suspicious activity.
- [Unified Billing](https://docs.ahq.lat/ai-gateway/features/unified-billing/index.md): Use Cloudflare billing to pay for and authenticate your inference requests.

## Integrations

- [Integrations](https://docs.ahq.lat/ai-gateway/integrations/index.md): Connect AI Gateway with Workers bindings, Vercel AI SDK, and other platforms.
- [Agents](https://docs.ahq.lat/agents/index.md): Build AI-powered Agents on Cloudflare.
- [Set up Workers AI with AI Gateway](https://docs.ahq.lat/ai-gateway/integrations/aig-workers-ai-binding/index.md): Set up and deploy a Workers AI project using Workers, an AI Gateway binding, and a large language model (LLM) to run your first AI-powered application on the Cloudflare global network.
- [Vercel AI SDK](https://docs.ahq.lat/ai-gateway/integrations/vercel-ai-sdk/index.md): Route Vercel AI SDK requests through AI Gateway using the ai-gateway-provider package.
- [Workers Bindings](https://docs.ahq.lat/ai-gateway/integrations/worker-binding-methods/index.md): Reference for the AI binding with AI Gateway. Call Workers AI and third-party models with env.AI.run(), access log IDs, and use gateway methods for feedback, logging, and URLs.

## Tutorials

- [Tutorials](https://docs.ahq.lat/ai-gateway/tutorials/index.md): Step-by-step AI Gateway tutorials for deploying Workers, connecting providers, and building AI applications.
- [Create your first AI Gateway using Workers AI](https://docs.ahq.lat/ai-gateway/tutorials/create-first-aig-workers/index.md): Create your first AI Gateway with Workers AI, step by step, in the Cloudflare dashboard.
- [Use Pruna P-video through AI Gateway](https://docs.ahq.lat/ai-gateway/tutorials/pruna-p-video/index.md): Learn how to call `prunaai/p-video` on Replicate through AI Gateway.

## Changelog

- [Changelog](https://docs.ahq.lat/ai-gateway/changelog/index.md): Track the latest updates, new features, and fixes for AI Gateway.

## Header Glossary

- [Header Glossary](https://docs.ahq.lat/ai-gateway/glossary/index.md): Reference all supported AI Gateway headers for configuring, customizing, and managing API requests.

## REST API reference

- [REST API reference](https://docs.ahq.lat/api/resources/ai_gateway/methods/list/index.md): Manage AI Gateway resources programmatically using the Cloudflare REST API.

## Evaluations

- [Evaluations](https://docs.ahq.lat/ai-gateway/evaluations/index.md): Assess AI Gateway application performance with datasets, human feedback, and evaluation metrics.
- [Add Human Feedback using Dashboard](https://docs.ahq.lat/ai-gateway/evaluations/add-human-feedback/index.md): Annotate AI Gateway logs with thumbs-up or thumbs-down feedback in the Cloudflare dashboard.
- [Add Human Feedback using API](https://docs.ahq.lat/ai-gateway/evaluations/add-human-feedback-api/index.md): Submit human feedback on AI Gateway request logs using the Cloudflare API.
- [Add human feedback using Worker Bindings](https://docs.ahq.lat/ai-gateway/evaluations/add-human-feedback-bindings/index.md): Provide human feedback on AI Gateway evaluations programmatically using Worker bindings.
- [Set up Evaluations](https://docs.ahq.lat/ai-gateway/evaluations/set-up-evaluations/index.md): Create datasets, select evaluators, and run evaluations for your AI Gateway logs.

## Architectures

- [Architectures](https://docs.ahq.lat/ai-gateway/demos/index.md): Explore reference architectures and design guides that incorporate AI Gateway into your infrastructure.

## Configuration

- [Authenticated Gateway](https://docs.ahq.lat/ai-gateway/configuration/authentication/index.md): Add security by requiring a valid authorization token for each request.
- [BYOK (Store Keys)](https://docs.ahq.lat/ai-gateway/configuration/bring-your-own-keys/index.md): Securely store AI provider API keys in AI Gateway and reference them in your gateway configuration.
- [Custom costs](https://docs.ahq.lat/ai-gateway/configuration/custom-costs/index.md): Override default or public model costs on a per-request basis.
- [Custom Providers](https://docs.ahq.lat/ai-gateway/configuration/custom-providers/index.md): Create and manage custom AI providers for your account.
- [Fallbacks](https://docs.ahq.lat/ai-gateway/configuration/fallbacks/index.md): Specify model or provider fallbacks in AI Gateway to handle request failures and ensure reliability.
- [Manage gateways](https://docs.ahq.lat/ai-gateway/configuration/manage-gateway/index.md): Create, edit, and delete AI Gateway instances using the dashboard or API.
- [Request handling](https://docs.ahq.lat/ai-gateway/configuration/request-handling/index.md): Configure AI Gateway request timeouts and retries for reliable AI provider interactions.

## Observability

- [Analytics](https://docs.ahq.lat/ai-gateway/observability/analytics/index.md): View AI Gateway metrics for requests, tokens, caching, errors, and costs in the dashboard or via GraphQL.
- [Costs](https://docs.ahq.lat/ai-gateway/observability/costs/index.md): Track and estimate token-based costs across AI providers using AI Gateway cost metrics.
- [Custom metadata](https://docs.ahq.lat/ai-gateway/observability/custom-metadata/index.md): Tag AI Gateway requests with custom metadata such as user IDs to improve log filtering and analysis.
- [Logging](https://docs.ahq.lat/ai-gateway/observability/logging/index.md): Store and inspect AI Gateway request logs including prompts, responses, tokens, costs, and DLP actions.
- [Workers Logpush](https://docs.ahq.lat/ai-gateway/observability/logging/logpush/index.md): Export encrypted AI Gateway logs to external storage using Workers Logpush.
- [OpenTelemetry](https://docs.ahq.lat/ai-gateway/observability/otel-integration/index.md): Export AI Gateway trace spans to OpenTelemetry-compatible backends for distributed tracing and performance monitoring.

## Reference

- [Audit logs](https://docs.ahq.lat/ai-gateway/reference/audit-logs/index.md): View audit log entries for AI Gateway configuration changes such as gateway creation, deletion, and updates.
- [Limits](https://docs.ahq.lat/ai-gateway/reference/limits/index.md): Review AI Gateway limits for gateways, log storage, cache size, metadata entries, and Logpush jobs.
- [Pricing](https://docs.ahq.lat/ai-gateway/reference/pricing/index.md): Review AI Gateway pricing, including free core features, persistent log storage limits, and premium add-ons.
- [Troubleshooting](https://docs.ahq.lat/ai-gateway/reference/troubleshooting/index.md): Resolve common AI Gateway issues including authentication errors, missing logs, and provider connectivity problems.