
Create your first Video To Text API transcription task.

This guide walks through the full API flow: create an upload, send the media file, complete the upload, create a transcription task, and poll for results.

Visit Video To Text to manage your account, create API keys, and return to the product workspace.

Set your API base URL and key:

export VTT_API_BASE_URL="https://api.videototext.dev"
export VTT_API_KEY="vtt_xxxxx"

Create a signed upload URL for the media file.

curl -X POST "$VTT_API_BASE_URL/v1/uploads" \
  -H "Authorization: Bearer $VTT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "filename": "meeting.mp4",
    "mimetype": "video/mp4"
  }'

The response includes the fields needed for the next steps:

{
  "data": {
    "uploadUrl": "https://storage.example.com/signed-upload-url",
    "fileKey": "uploads/example.mp4",
    "fileUrl": "https://static.example.com/uploads/example.mp4",
    "uploadId": "00000000-0000-0000-0000-000000000000",
    "expiresAt": "2026-06-06T10:00:00.000Z"
  },
  "meta": {}
}
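The later steps reuse values from this response, including the $UPLOAD_ID variable in the completion request. A minimal sketch of capturing them in the shell, assuming jq is installed and the response was saved to a local file. The helper names and the UPLOAD_URL, FILE_KEY, and FILE_URL variable names are illustrative and not part of the API:

```shell
# Hypothetical helper: print one field under "data" from a saved JSON response.
# Assumes jq is installed. $1 = response file, $2 = field name.
response_field() {
  jq -r --arg key "$2" '.data[$key]' "$1"
}

# Hypothetical helper: export the upload values that the later steps need.
# The variable names are illustrative, except UPLOAD_ID, which the
# completion request in this guide references.
load_upload_vars() {
  UPLOAD_URL="$(response_field "$1" uploadUrl)"
  UPLOAD_ID="$(response_field "$1" uploadId)"
  FILE_KEY="$(response_field "$1" fileKey)"
  FILE_URL="$(response_field "$1" fileUrl)"
  export UPLOAD_URL UPLOAD_ID FILE_KEY FILE_URL
}
```

With the response saved as upload.json (for example by adding -o upload.json to the curl command), running load_upload_vars upload.json makes $UPLOAD_ID and the other values available to the commands that follow.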

Upload the binary file directly to the uploadUrl returned in the response. Set the Content-Type header to the same value you passed as mimetype when creating the upload.

curl -X PUT "https://storage.example.com/signed-upload-url" \
  -H "Content-Type: video/mp4" \
  --data-binary "@meeting.mp4"

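Complete the upload session so Video To Text can validate the object and create an asset. Set UPLOAD_ID to the uploadId value returned when the upload was created, and send the fileKey and fileUrl values from that same response.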

curl -X POST "$VTT_API_BASE_URL/v1/uploads/$UPLOAD_ID/operations/complete" \
  -H "Authorization: Bearer $VTT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "fileKey": "uploads/example.mp4",
    "fileUrl": "https://static.example.com/uploads/example.mp4",
    "filename": "meeting.mp4",
    "mimetype": "video/mp4",
    "fileSize": 10485760
  }'
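The fileSize value in the request above is 10485760, which corresponds to 10 MiB if the field is measured in bytes (an assumption; this guide does not state the unit). A minimal sketch for reading the size from the local file instead of typing it by hand, using a hypothetical helper name:

```shell
# Hypothetical helper: print a file's size in bytes.
# Assumes the API's fileSize field is a byte count.
file_size_bytes() {
  # wc -c reads from stdin so no file name is printed; tr strips the
  # leading padding that some wc implementations add.
  wc -c < "$1" | tr -d '[:space:]'
}
```

For example, FILE_SIZE="$(file_size_bytes meeting.mp4)" stores the size so it can be substituted into the request body.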

The response returns the assetId used to create a transcription task.

{
  "data": {
    "assetId": "00000000-0000-0000-0000-000000000000"
  },
  "meta": {}
}

Create the task from the uploaded asset. The default mode is balanced.

curl -X POST "$VTT_API_BASE_URL/v1/tasks" \
  -H "Authorization: Bearer $VTT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "assetId": "00000000-0000-0000-0000-000000000000",
    "language": "Auto",
    "timestampMode": "CHUNK",
    "transcriptionMode": "balanced"
  }'

Use precision when the workflow favors accuracy over cost.

The response returns data.task.transcriptId; use that value as TASK_ID when polling.

| Mode | Best for |
| --- | --- |
| balanced | Fast, cost-efficient transcription for most production workflows. |
| precision | Higher-accuracy transcription when quality matters more than cost. |
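To carry the task ID into the polling step, the task creation call can be wrapped in a small function. This is a sketch under assumptions: jq is installed, the helper names are hypothetical, and the two modes listed above are treated as the only accepted transcriptionMode values, which this guide does not confirm.

```shell
# Hypothetical helper: build the task request body for an asset.
# $1 = assetId, $2 = transcription mode (defaults to balanced).
task_request_body() {
  local mode="${2:-balanced}"
  case "$mode" in
    balanced|precision) ;;
    *) echo "unknown transcriptionMode: $mode" >&2; return 1 ;;
  esac
  printf '{"assetId":"%s","language":"Auto","timestampMode":"CHUNK","transcriptionMode":"%s"}' \
    "$1" "$mode"
}

# Hypothetical helper: create the task and print data.task.transcriptId.
# Assumes jq is installed and the two environment variables are exported.
create_task() {
  local body
  body="$(task_request_body "$1" "$2")" || return 1
  curl -sS -X POST "$VTT_API_BASE_URL/v1/tasks" \
    -H "Authorization: Bearer $VTT_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$body" | jq -r '.data.task.transcriptId'
}
```

With the asset ID stored in a variable such as ASSET_ID (an illustrative name), TASK_ID="$(create_task "$ASSET_ID" precision)" creates a precision task and keeps its ID for polling.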

Poll the task endpoint until status becomes SUCCEEDED, FAILED, or CANCELED.

curl "$VTT_API_BASE_URL/v1/tasks/$TASK_ID" \
  -H "Authorization: Bearer $VTT_API_KEY"

Successful tasks return transcript text, public chunk timing fields, word timestamps, source file details, and billed credits.
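The polling step can be sketched as a loop that stops at the first terminal status. The status names come from this guide; everything else is an assumption: jq is installed, the status lives at .data.task.status (the polling response shape is not shown here), and the interval, poll limit, and helper names are illustrative.

```shell
# Return success (0) when the status is one of the terminal values
# named in this guide.
is_terminal_status() {
  case "$1" in
    SUCCEEDED|FAILED|CANCELED) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: poll a task until it reaches a terminal status.
# Assumptions: jq is installed, and the status is at .data.task.status.
# $1 = task ID, $2 = seconds between polls (default 5),
# $3 = maximum number of polls (default 120).
poll_task() {
  local task_id="$1" interval="${2:-5}" max_polls="${3:-120}"
  local attempt=0 response status
  while [ "$attempt" -lt "$max_polls" ]; do
    response="$(curl -sS "$VTT_API_BASE_URL/v1/tasks/$task_id" \
      -H "Authorization: Bearer $VTT_API_KEY")" || return 1
    status="$(printf '%s' "$response" | jq -r '.data.task.status')"
    if is_terminal_status "$status"; then
      # Print the final response, then succeed only for SUCCEEDED.
      printf '%s\n' "$response"
      [ "$status" = "SUCCEEDED" ]
      return
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  echo "task $task_id did not finish after $max_polls polls" >&2
  return 1
}
```

Calling poll_task "$TASK_ID" 10 checks every ten seconds and prints the final response. The exit code is zero only when the task ends in SUCCEEDED, so the call can gate later commands in a script.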