Quick Start

Create your first Video To Text API transcription task.

This guide walks through the full API flow: create an upload, send the media file, complete the upload, create a transcription task, and poll for results.

Visit Video To Text to manage your account, create API keys, and return to the product workspace.

Set your API base URL and key:

export VTT_API_BASE_URL="https://api.videototext.dev"
export VTT_API_KEY="vtt_xxxxx"
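
For scripts that call the API from Python rather than curl, the same two variables can be read once and turned into request headers. This is a minimal sketch, not an official client: load_config is a hypothetical helper name, and falling back to the base URL shown above when VTT_API_BASE_URL is unset is an assumption.

```python
import os


def load_config(env=os.environ):
    """Return (base_url, headers) built from the exported variables."""
    # Assumption: fall back to the documented base URL if the variable is unset.
    base_url = env.get("VTT_API_BASE_URL", "https://api.videototext.dev").rstrip("/")
    api_key = env.get("VTT_API_KEY")
    if not api_key:
        raise RuntimeError("VTT_API_KEY is not set")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return base_url, headers
```

Failing early on a missing key avoids sending unauthenticated requests later in the flow.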

1. Create an upload

Create a signed upload URL for the media file.

curl -X POST "$VTT_API_BASE_URL/v1/uploads" \
  -H "Authorization: Bearer $VTT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "filename": "meeting.mp4",
    "mimetype": "video/mp4"
  }'

The response includes the fields needed for the next steps:

{
  "data": {
    "uploadUrl": "https://storage.example.com/signed-upload-url",
    "fileKey": "uploads/example.mp4",
    "fileUrl": "https://static.example.com/uploads/example.mp4",
    "uploadId": "00000000-0000-0000-0000-000000000000",
    "expiresAt": "2026-06-06T10:00:00.000Z"
  },
  "meta": {}
}
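
The request and response above could be handled in Python roughly as follows. This is a sketch using only the standard library; build_create_upload_request and parse_upload_response are hypothetical helper names, and treating expiresAt as the moment the signed URL stops working is an assumption based on the field name.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_create_upload_request(base_url, api_key, filename, mimetype):
    """Prepare (but do not send) the POST /v1/uploads request."""
    body = json.dumps({"filename": filename, "mimetype": mimetype}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/uploads",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def parse_upload_response(payload, now=None):
    """Return the "data" object, refusing an already expired upload URL."""
    data = payload["data"]
    # Assumption: expiresAt marks when the signed upload URL stops working.
    expires_at = datetime.fromisoformat(data["expiresAt"].replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    if expires_at <= now:
        raise ValueError("signed upload URL has already expired")
    return data
```

The prepared request can be sent with urllib.request.urlopen and the JSON body passed to parse_upload_response; the returned uploadUrl, uploadId, fileKey, and fileUrl are the values used in the next two steps.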

2. Upload the file

Upload the binary file directly to uploadUrl, replacing the example URL below with the value returned in step 1. Set the Content-Type header to the same mimetype value passed when creating the upload.

curl -X PUT "https://storage.example.com/signed-upload-url" \
  -H "Content-Type: video/mp4" \
  --data-binary "@meeting.mp4"
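
A Python equivalent of this PUT might look like the sketch below. build_put_request is a hypothetical helper name. Like the curl example, it sends no Authorization header and sets only Content-Type. Reading the whole file into memory is a simplification that may not suit large videos.

```python
import urllib.request


def build_put_request(upload_url, file_bytes, mimetype):
    """Prepare (but do not send) the PUT that uploads the raw media bytes."""
    return urllib.request.Request(
        upload_url,
        data=file_bytes,
        method="PUT",
        # Must match the mimetype sent when the upload was created.
        headers={"Content-Type": mimetype},
    )


# Usage (not executed here):
#   with open("meeting.mp4", "rb") as f:
#       req = build_put_request(upload_url, f.read(), "video/mp4")
#   urllib.request.urlopen(req).close()
```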

3. Complete the upload

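Complete the upload session so Video To Text can validate the object and create an asset. Set UPLOAD_ID to the uploadId returned in step 1, and pass the fileKey and fileUrl from the same response.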

curl -X POST "$VTT_API_BASE_URL/v1/uploads/$UPLOAD_ID/operations/complete" \
  -H "Authorization: Bearer $VTT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "fileKey": "uploads/example.mp4",
    "fileUrl": "https://static.example.com/uploads/example.mp4",
    "filename": "meeting.mp4",
    "mimetype": "video/mp4",
    "fileSize": 10485760
  }'

The response returns the assetId used to create a transcription task.

{
  "data": {
    "assetId": "00000000-0000-0000-0000-000000000000"
  },
  "meta": {}
}
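
The completion call reuses values from step 1, so it can help to build it from that response rather than by hand. The sketch below assumes the step 1 "data" object is available as a dict; build_complete_upload and asset_id_from_response are hypothetical helper names, and treating fileSize as a byte count is an assumption based on the example value.

```python
def build_complete_upload(base_url, upload, filename, mimetype, file_size):
    """Return (url, body) for the complete-upload call.

    `upload` is the "data" object returned when the upload was created.
    """
    url = f"{base_url}/v1/uploads/{upload['uploadId']}/operations/complete"
    body = {
        "fileKey": upload["fileKey"],
        "fileUrl": upload["fileUrl"],
        "filename": filename,
        "mimetype": mimetype,
        # Assumption: fileSize is the size of the uploaded file in bytes.
        "fileSize": file_size,
    }
    return url, body


def asset_id_from_response(payload):
    """Extract the assetId needed to create a transcription task."""
    return payload["data"]["assetId"]
```

Under the byte-count assumption, os.path.getsize("meeting.mp4") would supply file_size for a local file.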

4. Create a transcription task

Create the task from the uploaded asset, passing the assetId returned in step 3. The default transcriptionMode is balanced.

curl -X POST "$VTT_API_BASE_URL/v1/tasks" \
  -H "Authorization: Bearer $VTT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "assetId": "00000000-0000-0000-0000-000000000000",
    "language": "Auto",
    "timestampMode": "CHUNK",
    "transcriptionMode": "balanced"
  }'

Use precision when the workflow favors accuracy over cost.

Mode	Best for
balanced	Fast, cost-efficient transcription for most production workflows.
precision	Higher-accuracy transcription when quality matters more than cost.

The response returns data.task.transcriptId; use that value as TASK_ID when polling.
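
The task request body and the task ID lookup can be sketched in Python as below. build_task_payload and task_id_from_response are hypothetical helper names. The language and timestampMode defaults are the only values shown in this guide; other accepted values are not covered here.

```python
VALID_MODES = ("balanced", "precision")


def build_task_payload(asset_id, mode="balanced", language="Auto", timestamp_mode="CHUNK"):
    """Build the JSON body for POST /v1/tasks, rejecting unknown modes early."""
    if mode not in VALID_MODES:
        raise ValueError(f"transcriptionMode must be one of {VALID_MODES}, got {mode!r}")
    return {
        "assetId": asset_id,
        "language": language,
        "timestampMode": timestamp_mode,
        "transcriptionMode": mode,
    }


def task_id_from_response(payload):
    """Return data.task.transcriptId, the value to poll with as TASK_ID."""
    return payload["data"]["task"]["transcriptId"]
```

Checking the mode locally catches typos before a request is sent.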

5. Poll the task

Poll the task endpoint until status becomes SUCCEEDED, FAILED, or CANCELED.

curl "$VTT_API_BASE_URL/v1/tasks/$TASK_ID" \
  -H "Authorization: Bearer $VTT_API_KEY"

Successful tasks return transcript text, public chunk timing fields, word timestamps, source file details, and billed credits.
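
A polling loop for this step might look like the following sketch. It assumes a caller-supplied fetch_task function (hypothetical) that performs the GET request and returns a dict with a "status" key; where status sits inside the real response body is not shown in this guide, so that extraction is left to the caller. The interval and timeout defaults are arbitrary illustrative values.

```python
import time

# Terminal states named in this guide.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELED"}


def poll_task(fetch_task, interval_s=5.0, timeout_s=600.0,
              sleep=time.sleep, clock=time.monotonic):
    """Call fetch_task() until the task reaches a terminal status.

    Returns the final task dict, or raises TimeoutError once timeout_s
    has elapsed without reaching a terminal status.
    """
    deadline = clock() + timeout_s
    while True:
        task = fetch_task()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if clock() >= deadline:
            raise TimeoutError("task did not reach a terminal status in time")
        sleep(interval_s)
```

The sleep and clock parameters exist so the loop can be exercised in tests without waiting. A caller should still check whether the returned status is SUCCEEDED before reading the transcript.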