# Workers AI

Run AI models from Workers, Pages, or the REST API.

> Links below point directly to Markdown versions of each page. Any page can also be retrieved as Markdown by sending an `Accept: text/markdown` header to the page's URL without the `index.md` suffix (for example, `curl -H "Accept: text/markdown" https://docs.ahq.lat/workers-ai/`).
>
> For other Cloudflare products, see the [Cloudflare documentation directory](https://docs.ahq.lat/llms.txt).

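The retrieval rule in the note above can be sketched in Python using only the standard library. The helper names `markdown_request` and `fetch_markdown` are hypothetical, and decoding the response as UTF-8 is an assumption rather than something this page states.

```python
from urllib.request import Request, urlopen

# Suffix carried by the links in this index (see the note above).
INDEX_SUFFIX = "index.md"


def markdown_request(link: str) -> Request:
    """Build a request for a page's URL that asks for its Markdown version.

    Hypothetical helper: drops a trailing `index.md` from the link and sets
    the `Accept: text/markdown` header described in the note above.
    """
    page_url = link[: -len(INDEX_SUFFIX)] if link.endswith(INDEX_SUFFIX) else link
    return Request(page_url, headers={"Accept": "text/markdown"})


def fetch_markdown(link: str, timeout: float = 10.0) -> str:
    """Send the request and return the body as text (UTF-8 is assumed)."""
    with urlopen(markdown_request(link), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_markdown("https://docs.ahq.lat/workers-ai/index.md")` would request `https://docs.ahq.lat/workers-ai/` with the header set, which mirrors the `curl` example in the note.
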
## Overview

- [Cloudflare Workers AI](https://docs.ahq.lat/workers-ai/index.md): Run machine learning models, powered by serverless GPUs, on Cloudflare's global network.

## Getting started

- [Getting started](https://docs.ahq.lat/workers-ai/get-started/index.md): Set up your first Workers AI project using the dashboard, CLI, or REST API.
- [Dashboard](https://docs.ahq.lat/workers-ai/get-started/dashboard/index.md): Create and deploy a Workers AI application using the Cloudflare dashboard.
- [REST API](https://docs.ahq.lat/workers-ai/get-started/rest-api/index.md): Use the Cloudflare Workers AI REST API to deploy a large language model (LLM).
- [Workers Bindings](https://docs.ahq.lat/workers-ai/get-started/workers-wrangler/index.md): Deploy your first Cloudflare Workers AI project using the CLI.

## Models

- [Models](https://docs.ahq.lat/workers-ai/models/index.md): Browse the catalog of machine learning models available on Workers AI.

## Agents

- [Agents](https://docs.ahq.lat/agents/index.md): Build AI assistants that perform complex tasks using Workers AI and the Cloudflare Agents SDK.

## REST API reference

- [REST API reference](https://docs.ahq.lat/api/resources/ai/methods/run/index.md): Run Workers AI inference models programmatically using the Cloudflare REST API.

## Changelog

- [Changelog](https://docs.ahq.lat/workers-ai/changelog/index.md): Review recent changes to Cloudflare Workers AI.

## Configuration

- [Vercel AI SDK](https://docs.ahq.lat/workers-ai/configuration/ai-sdk/index.md): Use Workers AI with the Vercel AI SDK for streaming text generation, tool calls, and structured output.
- [Workers Bindings](https://docs.ahq.lat/workers-ai/configuration/bindings/index.md): Create an AI binding to connect your Cloudflare Worker to Workers AI.
- [Hugging Face Chat UI](https://docs.ahq.lat/workers-ai/configuration/hugging-face-chat-ui/index.md): Connect Workers AI models to Hugging Face's open-source Chat UI interface.
- [OpenAI compatible API endpoints](https://docs.ahq.lat/workers-ai/configuration/open-ai-compatibility/index.md): Use the OpenAI SDK to call Workers AI models through compatible API endpoints.

## Features

- [Asynchronous Batch API](https://docs.ahq.lat/workers-ai/features/batch-api/index.md): Queue large inference workloads for asynchronous processing with the Workers AI Batch API.
- [REST API](https://docs.ahq.lat/workers-ai/features/batch-api/rest-api/index.md): Send and retrieve batch inference requests using the Workers AI REST API.
- [Workers Binding](https://docs.ahq.lat/workers-ai/features/batch-api/workers-binding/index.md): Send and retrieve batch inference requests using a Workers AI binding.
- [Fine-tunes](https://docs.ahq.lat/workers-ai/features/fine-tunes/index.md): Run fine-tuned inference on Workers AI using LoRA adapters.
- [Using LoRA adapters](https://docs.ahq.lat/workers-ai/features/fine-tunes/loras/index.md): Upload and use LoRA adapters to get fine-tuned inference on Workers AI.
- [Public LoRA adapters](https://docs.ahq.lat/workers-ai/features/fine-tunes/public-loras/index.md): Cloudflare offers a few public LoRA adapters that are immediately ready for use.
- [Function calling](https://docs.ahq.lat/workers-ai/features/function-calling/index.md): Enable Workers AI models to execute functions and interact with external APIs.
- [Embedded](https://docs.ahq.lat/workers-ai/features/function-calling/embedded/index.md): Execute function code alongside inference calls using Workers AI embedded function calling.
- [API Reference](https://docs.ahq.lat/workers-ai/features/function-calling/embedded/api-reference/index.md): Reference for the runWithTools and autoTrimTools methods in embedded function calling.
- [Use fetch() handler](https://docs.ahq.lat/workers-ai/features/function-calling/embedded/examples/fetch/index.md): Learn how to use the fetch() handler in Cloudflare Workers AI to enable LLMs to perform API calls, like retrieving a 5-day weather forecast using function calling.
- [Use KV API](https://docs.ahq.lat/workers-ai/features/function-calling/embedded/examples/kv/index.md): Learn how to use Cloudflare Workers AI to interact with KV storage, enabling persistent data handling with embedded function calling in a few lines of code.
- [Tools based on OpenAPI Spec](https://docs.ahq.lat/workers-ai/features/function-calling/embedded/examples/openapi/index.md): Generate Workers AI function calling tools from an OpenAPI specification using the ai-utils package.
- [Get Started](https://docs.ahq.lat/workers-ai/features/function-calling/embedded/get-started/index.md): Set up and deploy your first Workers AI project with embedded function calling.
- [Troubleshooting](https://docs.ahq.lat/workers-ai/features/function-calling/embedded/troubleshooting/index.md): Debug and resolve common issues with Workers AI embedded function calling.
- [Traditional](https://docs.ahq.lat/workers-ai/features/function-calling/traditional/index.md): Define tools and schemas for industry-standard function calling with Workers AI models.
- [JSON Mode](https://docs.ahq.lat/workers-ai/features/json-mode/index.md): Force Workers AI text generation models to return valid JSON output using response_format or JSON schemas.
- [Markdown Conversion](https://docs.ahq.lat/workers-ai/features/markdown-conversion/index.md): Convert documents in multiple formats to Markdown using the Workers AI toMarkdown method.
- [Conversion Options](https://docs.ahq.lat/workers-ai/features/markdown-conversion/conversion-options/index.md): Configure per-format options for Workers AI Markdown Conversion, including HTML and image settings.
- [How it works](https://docs.ahq.lat/workers-ai/features/markdown-conversion/how-it-works/index.md): Learn how Workers AI pre-processes and converts HTML, images, and other files to Markdown.
- [Supported Formats](https://docs.ahq.lat/workers-ai/features/markdown-conversion/supported-formats/index.md): View the list of file formats supported by Workers AI Markdown Conversion.
- [Workers Binding](https://docs.ahq.lat/workers-ai/features/markdown-conversion/usage/binding/index.md): Convert documents to Markdown using the Workers AI binding and toMarkdown method.
- [REST API](https://docs.ahq.lat/workers-ai/features/markdown-conversion/usage/rest-api/index.md): Convert documents to Markdown using the Workers AI REST API endpoint.
- [Prompt caching](https://docs.ahq.lat/workers-ai/features/prompt-caching/index.md): Use prefix caching and the x-session-affinity header to reduce latency and inference costs on Workers AI.
- [Prompting](https://docs.ahq.lat/workers-ai/features/prompting/index.md): Structure prompts for Workers AI text generation models using system, user, and assistant message roles.

## Guides

- [Agents](https://docs.ahq.lat/agents/index.md): Build AI-powered agents on Cloudflare.
- [Demos and architectures](https://docs.ahq.lat/workers-ai/guides/demos-architectures/index.md): Explore demo applications and reference architectures built with Workers AI.
- [Tutorials](https://docs.ahq.lat/workers-ai/guides/tutorials/index.md): Step-by-step Workers AI tutorials for building AI-powered applications on Cloudflare.
- [Build a Retrieval Augmented Generation (RAG) AI](https://docs.ahq.lat/workers-ai/guides/tutorials/build-a-retrieval-augmented-generation-ai/index.md): Build your first AI app with Cloudflare AI. This guide uses Workers AI, Vectorize, D1, and Cloudflare Workers.
- [Whisper-large-v3-turbo with Cloudflare Workers AI](https://docs.ahq.lat/workers-ai/guides/tutorials/build-a-workers-ai-whisper-with-chunking/index.md): Learn how to transcribe large audio files using Workers AI.
- [Explore Code Generation Using DeepSeek Coder Models](https://docs.ahq.lat/workers-ai/guides/tutorials/explore-code-generation-using-deepseek-coder-models/index.md): Explore how you can use AI models to generate code and work more efficiently.
- [Explore Workers AI Models Using a Jupyter Notebook](https://docs.ahq.lat/workers-ai/guides/tutorials/explore-workers-ai-models-using-a-jupyter-notebook/index.md): Explore various models (including Whisper, DistilBERT, LLaVA, and Meta Llama 3) in a Jupyter notebook using Python and the requests library.
- [Fine Tune Models With AutoTrain from HuggingFace](https://docs.ahq.lat/workers-ai/guides/tutorials/fine-tune-models-with-autotrain/index.md): Fine-tune a model on custom training data with AutoTrain, then use the result on Workers AI as a LoRA adapter.
- [Choose the Right Text Generation Model](https://docs.ahq.lat/workers-ai/guides/tutorials/how-to-choose-the-right-text-generation-model/index.md): Workers AI offers a wide range of text generation models. This notebook introduces each option in a speed-dating format to help you find the right one.
- [Build an AI Image Generator Playground (Part 1)](https://docs.ahq.lat/workers-ai/guides/tutorials/image-generation-playground/image-generator-flux/index.md): The new flux models on Workers AI are our most powerful text-to-image AI models yet. Using Workers AI, you can get access to the best models in the industry without having to worry about inference, ops, or deployment.
- [Add New AI Models to your Playground (Part 2)](https://docs.ahq.lat/workers-ai/guides/tutorials/image-generation-playground/image-generator-flux-newmodels/index.md): In part 2, Kristian expands upon the existing environment built in part 1, by showing you how to integrate new AI models and introduce new parameters that allow you to customize how images are generated.
- [Store and Catalog AI Generated Images with R2 (Part 3)](https://docs.ahq.lat/workers-ai/guides/tutorials/image-generation-playground/image-generator-store-and-catalog/index.md): In the final part of the AI Image Playground series, Kristian teaches how to utilize Cloudflare's R2 object storage.
- [Llama 3.2 11B Vision Instruct model on Cloudflare Workers AI](https://docs.ahq.lat/workers-ai/guides/tutorials/llama-vision-tutorial/index.md): Learn how to use the Llama 3.2 11B Vision Instruct model on Cloudflare Workers AI.
- [Using BigQuery with Workers AI](https://docs.ahq.lat/workers-ai/guides/tutorials/using-bigquery-with-workers-ai/index.md): Learn how to ingest data stored outside of Cloudflare as an input to Workers AI models.

## Platform

- [AI Gateway](https://docs.ahq.lat/ai-gateway/index.md): Use AI Gateway to manage, monitor, and cache your Workers AI requests.
- [Data usage](https://docs.ahq.lat/workers-ai/platform/data-usage/index.md): How Cloudflare handles your data, inputs, and outputs when using Workers AI.
- [Errors](https://docs.ahq.lat/workers-ai/platform/errors/index.md): Reference table of Workers AI error codes, HTTP statuses, and descriptions.
- [Event subscriptions](https://docs.ahq.lat/workers-ai/platform/event-subscriptions/index.md): Subscribe to Workers AI events using Cloudflare Queues for asynchronous processing.
- [Glossary](https://docs.ahq.lat/workers-ai/platform/glossary/index.md): Definitions of key terms used in the Workers AI documentation.
- [Limits](https://docs.ahq.lat/workers-ai/platform/limits/index.md): Rate limits for Workers AI inference requests, organized by task type and model.
- [Pricing](https://docs.ahq.lat/workers-ai/platform/pricing/index.md): Workers AI pricing is based on Neurons, with a free daily allocation and per-model rates.
- [Choose a data or storage product](https://docs.ahq.lat/workers/platform/storage-options/index.md): Compare Cloudflare storage products to use alongside Workers AI.