curl POST /v1/tasks

curl -X POST "https://api.videototext.dev/v1/tasks" \
  -H "Authorization: Bearer $VTT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "assetId": "asset_9m2k8r",
    "language": "Auto",
    "timestampMode": "CHUNK",
    "transcriptionMode": "balanced"
  }'

Video To Text API

Add fast, accurate transcription to your product with a production-ready API that turns video and audio into timestamped, searchable text.

Start building

View API reference

2 modes / signed uploads / timestamped output

What it is

Speech-to-text infrastructure for modern apps

Video To Text API turns recordings, meetings, interviews, lessons, and media libraries into clean text your product can search, summarize, subtitle, and automate.

Transcribe media at scale

Handle video and audio files without building your own upload, processing, and polling pipeline.

Return product-ready data

Get full text, timestamped chunks, word timings, source metadata, task status, and billing fields.

Choose speed or accuracy

Use balanced for everyday workloads, or switch to precision for quality-sensitive content.

Features

Everything needed to ship transcription workflows

Designed for SaaS teams that need reliable transcription inside real product flows.

Large-file uploads

Upload media through signed URLs so files do not have to pass through your application server.
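
A minimal sketch of the upload step, assuming the signed URL accepts an HTTP `PUT` of the raw file bytes (a common convention for signed URLs, but not confirmed on this page). The function name `upload_media` and the `CURL` override, which lets the command be swapped out for dry runs, are illustrative. How the signed URL is obtained is not shown, because that endpoint is not described here.

```shell
# Hypothetical helper: sends a local file straight to a signed URL.
# Assumption: the URL accepts a PUT with the file as the request body.
upload_media() {
  signed_url="$1"
  file="$2"

  # --upload-file makes curl issue a PUT with the file contents as the body;
  # --fail makes curl exit non-zero on HTTP error responses (4xx/5xx).
  "${CURL:-curl}" --fail --silent --show-error \
    --upload-file "$file" "$signed_url"
}
```

Because the file goes directly from the client to storage, your application server only has to hand out the signed URL and never handles the media bytes.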

Reliable task creation

Use predictable task states and optional retry safeguards to build resilient transcription queues.
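
A polling loop over task states might be sketched as follows. The state names `completed` and `failed`, and the `fetch_status` command passed in by the caller, are assumptions for illustration; the actual state names and the endpoint used to read them are not listed on this page.

```shell
# Hypothetical helper: polls until a task reaches a terminal state.
# Usage: wait_for_task <fetch_status_cmd> <task_id> [max_attempts] [delay_seconds]
# <fetch_status_cmd> must print the current state of the task to stdout.
wait_for_task() {
  fetch_status="$1"
  task_id="$2"
  max_attempts="${3:-30}"
  delay="${4:-2}"
  attempt=1

  while [ "$attempt" -le "$max_attempts" ]; do
    # Treat a failed status lookup as a non-terminal state and keep polling.
    state="$("$fetch_status" "$task_id")" || state="unknown"
    case "$state" in
      completed) echo "$state"; return 0 ;;   # assumed success state
      failed)    echo "$state"; return 1 ;;   # assumed failure state
    esac
    attempt=$((attempt + 1))
    sleep "$delay"
  done

  # Gave up before the task reached a terminal state.
  echo "timeout"
  return 2
}
```

Passing the status lookup in as a command keeps the loop independent of any particular endpoint, and capping the number of attempts stops a stuck task from blocking a queue worker indefinitely.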

Timestamped transcripts

Build captions, search, editors, clips, and review tools from chunk and word-level timing data.
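
Caption formats such as SubRip (SRT) write cue times as `HH:MM:SS,mmm`. As a small sketch of the caption use case, the helpers below format one cue from a chunk's start time, end time, and text. They assume chunk times are available as whole milliseconds, which may not match the units the API returns; the names `srt_time` and `srt_cue` are illustrative.

```shell
# Converts whole milliseconds to an SRT timestamp (HH:MM:SS,mmm).
srt_time() {
  ms="$1"
  printf '%02d:%02d:%02d,%03d\n' \
    $(( ms / 3600000 )) \
    $(( ms / 60000 % 60 )) \
    $(( ms / 1000 % 60 )) \
    $(( ms % 1000 ))
}

# Prints one SRT cue: index, time range, text, then a blank separator line.
# Usage: srt_cue <index> <start_ms> <end_ms> <text>
srt_cue() {
  printf '%s\n%s --> %s\n%s\n\n' \
    "$1" "$(srt_time "$2")" "$(srt_time "$3")" "$4"
}
```

Calling `srt_cue` once per chunk, with an increasing index, and concatenating the output yields a complete `.srt` file.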

Clear mode controls

Pick balanced for speed and cost, or precision when transcript quality matters most.
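
As a rough sketch of how mode selection might look in a shell script, the helper below prints a request body for `POST /v1/tasks`, using the field names from the curl example at the top of this page. The function name `build_task_body` is hypothetical, and the check assumes `balanced` and `precision` are the only accepted values for `transcriptionMode`.

```shell
# Hypothetical helper: prints a JSON body for POST /v1/tasks.
# Assumes the asset ID contains no characters that need JSON escaping.
build_task_body() {
  asset_id="$1"
  mode="${2:-balanced}"   # default to the mode suited to everyday workloads

  case "$mode" in
    balanced|precision) ;;
    *)
      echo "unknown transcriptionMode: $mode" >&2
      return 1
      ;;
  esac

  printf '{"assetId":"%s","language":"Auto","timestampMode":"CHUNK","transcriptionMode":"%s"}\n' \
    "$asset_id" "$mode"
}
```

The output can be passed to curl with `-d "$(build_task_body asset_9m2k8r precision)"`, so switching modes for quality-sensitive content is a one-argument change.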

Use cases

Make spoken content usable across your product

Give users faster ways to find, review, edit, and repurpose recorded speech.

Meeting intelligence

Turn calls and recordings into searchable notes, summaries, action items, and customer insights.

Media operations

Generate transcripts for podcasts, webinars, interviews, learning content, and long-form videos.

Subtitle and editing tools

Use timestamps as the foundation for captions, clip selection, review workflows, and timeline editors.

Ready to build

Start with one upload and one transcription task.

Quick Start

API Reference