Submit Transcribe Task - MountSea API

curl --request POST \
  --url https://api.mountsea.ai/hub/v1/transcribe \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "whisper-v3",
  "input": {
    "audio_url": "https://example.com/audio.mp3",
    "language": "en"
  }
}
'

{
  "task_id": "hub-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "status": "pending",
  "capability": "video",
  "model": "veo-3.1-fast",
  "vendor": "Google",
  "mode": "text-to-video",
  "created_at": "2026-05-18T09:00:00.000Z"
}
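
A client will usually want to validate this response before using it. The following is a minimal Python sketch, not an official client: the names `TaskReceipt` and `parse_task_receipt` are hypothetical, and the only thing taken from this page is the set of seven required response fields and the ISO 8601 format of `created_at`.

```python
from dataclasses import dataclass
from datetime import datetime

# The seven fields this page marks as required in the 200 response.
REQUIRED_FIELDS = (
    "task_id", "status", "capability", "model", "vendor", "mode", "created_at",
)


@dataclass(frozen=True)
class TaskReceipt:
    """Hypothetical container for the task-creation response."""
    task_id: str
    status: str
    capability: str
    model: str
    vendor: str
    mode: str
    created_at: datetime


def parse_task_receipt(payload: dict) -> TaskReceipt:
    """Check that all required fields are present and parse the timestamp."""
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"response is missing required fields: {missing}")
    # Rewrite the trailing "Z" as an explicit UTC offset, because
    # datetime.fromisoformat only accepts "Z" from Python 3.11 onward.
    created_at = datetime.fromisoformat(payload["created_at"].replace("Z", "+00:00"))
    text_fields = {name: payload[name] for name in REQUIRED_FIELDS if name != "created_at"}
    return TaskReceipt(created_at=created_at, **text_fields)
```

Calling `parse_task_receipt` on the decoded JSON body yields an object whose `task_id` can be passed on to the task-polling endpoint.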

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

model

string

required

Model ID for transcription / translation.

See GET /hub/v1/models?capability=transcribe for the full list.

Example:

"whisper-v3"

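The model-list request mentioned above can be sketched in Python with the standard library. This is an illustration under assumptions: the host is taken from the curl example, the helper name `build_models_request` is hypothetical, and whether this endpoint needs the Bearer header is assumed rather than documented here. The response shape is not described in this section, so the sketch only builds the request.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.mountsea.ai"  # host taken from the curl example


def build_models_request(token: str, capability: str = "transcribe") -> urllib.request.Request:
    """Build (but do not send) GET /hub/v1/models?capability=<capability>."""
    query = urllib.parse.urlencode({"capability": capability})
    return urllib.request.Request(
        f"{BASE_URL}/hub/v1/models?{query}",
        # ASSUMPTION: the models endpoint uses the same Bearer auth as the rest of the API.
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```
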
input

object

required

Transcription input parameters.

audio_url (required) — URL of the audio or video file
language — BCP-47 language code (e.g. "en", "zh"). Omit for auto-detect.
task — "transcribe" (default) or "translate" (translate to English)
timestamps — "word" or "segment" for timestamped output

Example:

{
  "audio_url": "https://example.com/audio.mp3",
  "language": "en"
}
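
The body described above can be assembled programmatically. The sketch below is a hedged Python equivalent of the curl example, using only the standard library. The function name `build_transcribe_request` and its keyword arguments are hypothetical; the URL, header names, field names, and allowed values for `task` and `timestamps` come from this page. Optional fields are left out of the payload when not set, so the server-side defaults apply.

```python
import json
import urllib.request

TRANSCRIBE_URL = "https://api.mountsea.ai/hub/v1/transcribe"
ALLOWED_TASKS = {"transcribe", "translate"}
ALLOWED_TIMESTAMPS = {"word", "segment"}


def build_transcribe_request(token, audio_url, model="whisper-v3",
                             language=None, task=None, timestamps=None):
    """Build (but do not send) the POST request for a transcribe task."""
    if task is not None and task not in ALLOWED_TASKS:
        raise ValueError(f"task must be one of {sorted(ALLOWED_TASKS)}")
    if timestamps is not None and timestamps not in ALLOWED_TIMESTAMPS:
        raise ValueError(f"timestamps must be one of {sorted(ALLOWED_TIMESTAMPS)}")

    # audio_url is the only required input parameter.
    input_params = {"audio_url": audio_url}
    if language is not None:  # omit for auto-detect
        input_params["language"] = language
    if task is not None:
        input_params["task"] = task
    if timestamps is not None:
        input_params["timestamps"] = timestamps

    body = json.dumps({"model": model, "input": input_params}).encode("utf-8")
    return urllib.request.Request(
        TRANSCRIBE_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the response body with `json.load`.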

Response

200 - application/json

task_id

string

required

Unique task ID — use this to poll GET /hub/v1/tasks/:task_id

Example:

"hub-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

status

string

required

Task status at creation time (usually pending)

Example:

"pending"

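Because the task usually starts as pending, a client typically polls GET /hub/v1/tasks/:task_id until the status changes. The following Python sketch shows one way to do that. It rests on assumptions this page does not confirm: the terminal status names `completed` and `failed` are placeholders, the task endpoint is assumed to return JSON with a `status` field, and the helper names, interval, and attempt limit are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.mountsea.ai"  # host taken from the curl example

# ASSUMPTION: terminal status names are not listed in this section;
# replace these placeholders with the values the task endpoint documents.
TERMINAL_STATUSES = {"completed", "failed"}


def fetch_task(task_id, token):
    """GET /hub/v1/tasks/:task_id and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/hub/v1/tasks/{task_id}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def poll_task(task_id, token, fetch=fetch_task, interval=2.0,
              max_attempts=30, sleep=time.sleep):
    """Poll at a fixed interval until the task reaches a terminal status."""
    for _ in range(max_attempts):
        task = fetch(task_id, token)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} polls")
```

The `fetch` and `sleep` parameters exist so the loop can be exercised without network access, for example with a stub that returns canned statuses.
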
capability

string

required

Capability: image | video | audio | transcribe

Example:

"video"

model

string

required

Model ID

Example:

"veo-3.1-fast"

vendor

string

required

Model vendor

Example:

"Google"

mode

string

required

Generation mode (e.g. text-to-video, image-to-image)

Example:

"text-to-video"

created_at

string

required

ISO 8601 creation timestamp

Example:

"2026-05-18T09:00:00.000Z"