Endpoint
Poll GET /hub/v1/tasks/:taskId until ready = true.
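A minimal polling sketch, assuming the task endpoint returns a JSON object with a boolean `ready` field. The base URL, the helper names, and the interval and timeout values are illustrative assumptions, not part of the documented API; authentication is left out because this section does not describe it.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

# Hypothetical host; replace with the real API base URL.
BASE_URL = "https://api.example.com"


def http_fetch_task(task_id: str) -> Dict[str, Any]:
    """GET /hub/v1/tasks/:taskId and decode the JSON body."""
    with urllib.request.urlopen(f"{BASE_URL}/hub/v1/tasks/{task_id}") as resp:
        return json.loads(resp.read().decode("utf-8"))


def poll_until_ready(
    task_id: str,
    fetch_task: Callable[[str], Dict[str, Any]] = http_fetch_task,
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Fetch the task repeatedly until its `ready` field is true."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_task(task_id)
        if task.get("ready") is True:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} not ready after {timeout_s}s")
        sleep(interval_s)
```

Passing `fetch_task` and `sleep` as arguments keeps the loop testable without network access; in normal use, `poll_until_ready("your-task-id")` is enough.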
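Model Reference
The generation mode (text-to-video / image-to-video / reference image / first and last frame / video editing) is determined entirely by the model you choose. Pick a category, expand a model, and copy the example.
- Text-to-video · 9
- Image-to-video · 9
- Reference-image-to-video · 5
- First and last frame · 1
- Video editing · 1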
veo-3.1 — Veo 3.1 (Google)
Google Veo 3.1 text-to-video with optional native audio.
| Parameter | Type | Req | Default | Values / Range | Description |
|---|---|---|---|---|---|
| seed | integer | | – | – | The seed for the random number generator. |
| prompt | string | ✓ | – | – | The text prompt describing the video you want to generate |
| auto_fix | boolean | | true | – | Whether to automatically attempt to fix prompts that fail content policy or other validation checks by rewriting them. |
| duration | string | | 8s | 4s 6s 8s | The duration of the generated video. |
| resolution | string | | 720p | 720p 1080p 4k | The resolution of the generated video. |
| aspect_ratio | string | | 16:9 | 16:9 9:16 | Aspect ratio of the generated video |
| generate_audio | boolean | | true | – | Whether to generate audio for the video. |
| negative_prompt | string | | – | – | A negative prompt to guide the video generation. |
| safety_tolerance | string | | 4 | 1 2 3 4 5 6 | The safety tolerance level for content moderation. 1 is the most strict (blocks most content), 6 is the least strict. Note: API-only parameter. |
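The veo-3.1 table above can be mirrored in a small pre-flight check. Note that `duration` and `safety_tolerance` are strings in the table, so `8` or `4` as integers would not match. The function name and the flat payload with a `model` key are assumptions for illustration; this section does not document the request body of the task-creation call.

```python
from typing import Any, Dict

# Allowed values copied from the veo-3.1 parameter table.
VEO_31_ENUMS = {
    "duration": ("4s", "6s", "8s"),
    "resolution": ("720p", "1080p", "4k"),
    "aspect_ratio": ("16:9", "9:16"),
    "safety_tolerance": ("1", "2", "3", "4", "5", "6"),
}


def build_veo_31_params(prompt: str, **options: Any) -> Dict[str, Any]:
    """Check enumerated options, then return a parameter dict.

    The returned shape (a flat dict with a "model" key) is hypothetical.
    """
    if not prompt:
        raise ValueError("prompt is required")
    for name, value in options.items():
        allowed = VEO_31_ENUMS.get(name)
        if allowed is not None and value not in allowed:
            raise ValueError(f"{name}={value!r} is not one of {allowed}")
    return {"model": "veo-3.1", "prompt": prompt, **options}
```

Options that are omitted are simply not sent, so the defaults listed in the table (for example `duration` of `8s`) apply on the server side.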
veo-3.1-fast — Veo 3.1 Fast (Google)
Veo 3.1 Fast: lower-latency text-to-video at reduced cost.
| Parameter | Type | Req | Default | Values / Range | Description |
|---|---|---|---|---|---|
| seed | integer | | – | – | The seed for the random number generator. |
| prompt | string | ✓ | – | – | The text prompt describing the video you want to generate |
| auto_fix | boolean | | true | – | Whether to automatically attempt to fix prompts that fail content policy or other validation checks by rewriting them. |
| duration | string | | 8s | 4s 6s 8s | The duration of the generated video. |
| resolution | string | | 720p | 720p 1080p 4k | The resolution of the generated video. |
| aspect_ratio | string | | 16:9 | 16:9 9:16 | Aspect ratio of the generated video |
| generate_audio | boolean | | true | – | Whether to generate audio for the video. |
| negative_prompt | string | | – | – | A negative prompt to guide the video generation. |
| safety_tolerance | string | | 4 | 1 2 3 4 5 6 | The safety tolerance level for content moderation. 1 is the most strict (blocks most content), 6 is the least strict. Note: API-only parameter. |
veo-3.1-lite — Veo 3.1 Lite (Google)
Veo 3.1 Lite: lowest cost text-to-video (720p/1080p only).
| Parameter | Type | Req | Default | Values / Range | Description |
|---|---|---|---|---|---|
| seed | integer | | – | – | The seed for the random number generator. |
| prompt | string | ✓ | – | – | The text prompt describing the video you want to generate |
| auto_fix | boolean | | true | – | Whether to automatically attempt to fix prompts that fail content policy or other validation checks by rewriting them. |
| duration | string | | 8s | 4s 6s 8s | The duration of the generated video. |
| resolution | string | | 720p | 720p 1080p | The resolution of the generated video. |
| aspect_ratio | string | | 16:9 | 16:9 9:16 | Aspect ratio of the generated video |
| generate_audio | boolean | | true | – | Whether to generate audio for the video. |
| negative_prompt | string | | – | – | A negative prompt to guide the video generation. |
| safety_tolerance | string | | 4 | 1 2 3 4 5 6 | The safety tolerance level for content moderation. 1 is the most strict (blocks most content), 6 is the least strict. Note: API-only parameter. |
veo-3 — Veo 3 (Google)
Google Veo 3 text-to-video with native audio.
| Parameter | Type | Req | Default | Values / Range | Description |
|---|---|---|---|---|---|
| seed | integer | | – | – | The seed for the random number generator. |
| prompt | string | ✓ | – | – | The text prompt describing the video you want to generate |
| auto_fix | boolean | | true | – | Whether to automatically attempt to fix prompts that fail content policy or other validation checks by rewriting them. |
| duration | string | | 8s | 4s 6s 8s | The duration of the generated video. |
| resolution | string | | 720p | 720p 1080p | The resolution of the generated video. |
| aspect_ratio | string | | 16:9 | 16:9 9:16 | The aspect ratio of the generated video. |
| generate_audio | boolean | | true | – | Whether to generate audio for the video. |
| negative_prompt | string | | – | – | A negative prompt to guide the video generation. |
| safety_tolerance | string | | 4 | 1 2 3 4 5 6 | The safety tolerance level for content moderation. 1 is the most strict (blocks most content), 6 is the least strict. Note: API-only parameter. |
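The four Veo tables differ mainly in the `resolution` values they accept: only veo-3.1 and veo-3.1-fast list `4k`. A small lookup, sketched here with a hypothetical helper name, makes that difference explicit when choosing a model for a target resolution.

```python
from typing import Dict, List, Tuple

# Resolution values copied from the four Veo parameter tables.
VEO_RESOLUTIONS: Dict[str, Tuple[str, ...]] = {
    "veo-3.1": ("720p", "1080p", "4k"),
    "veo-3.1-fast": ("720p", "1080p", "4k"),
    "veo-3.1-lite": ("720p", "1080p"),
    "veo-3": ("720p", "1080p"),
}


def veo_models_supporting(resolution: str) -> List[str]:
    """Return the Veo model ids whose table lists the given resolution."""
    return [
        model
        for model, resolutions in VEO_RESOLUTIONS.items()
        if resolution in resolutions
    ]
```

For example, `veo_models_supporting("4k")` narrows the choice to the two models that list `4k`, while `"1080p"` leaves all four available.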
kling-v3-standard — Kling v3 Standard (Kuaishou)
Kling v3 Standard text-to-video with optional native audio.
| Parameter | Type | Req | Default | Values / Range | Description |
|---|---|---|---|---|---|
| prompt | string | | – | – | Text prompt for video generation. Either prompt or multi_prompt must be provided, but not both. |
| duration | string | | 5 | 3 4 5 6 7 8 9 10 11 12 13 14 15 | The duration of the generated video in seconds |
| cfg_scale | number | | 0.5 | – | The CFG (Classifier Free Guidance) scale is a measure of how close you want the model to stick to your prompt. |
| shot_type | string | | customize | customize intelligent | The type of multi-shot video generation. ‘intelligent’ lets the model automatically determine shot structure. |
| aspect_ratio | string | | 16:9 | 16:9 9:16 1:1 | The aspect ratio of the generated video frame |
| multi_prompt | array | | – | – | List of prompts for multi-shot video generation. If provided, overrides the single prompt and divides the video into multiple shots with specified prompts and durations. |
| generate_audio | boolean | | true | – | Whether to generate native audio for the video. Supports Chinese and English voice output. Other languages are automatically translated to English. For English speech, use lowercase letters; for acronyms or proper nouns, use uppercase. |
| negative_prompt | string | | blur, distort, and low quality | – | – |
kling-v3-pro — Kling v3 Pro (Kuaishou)
Kling v3 Pro text-to-video with optional native audio.
| Parameter | Type | Req | Default | Values / Range | Description |
|---|---|---|---|---|---|
| prompt | string | | – | – | Text prompt for video generation. Either prompt or multi_prompt must be provided, but not both. |
| duration | string | | 5 | 3 4 5 6 7 8 9 10 11 12 13 14 15 | The duration of the generated video in seconds |
| cfg_scale | number | | 0.5 | – | The CFG (Classifier Free Guidance) scale is a measure of how close you want the model to stick to your prompt. |
| shot_type | string | | customize | customize intelligent | The type of multi-shot video generation. ‘intelligent’ lets the model automatically determine shot structure. |
| aspect_ratio | string | | 16:9 | 16:9 9:16 1:1 | The aspect ratio of the generated video frame |
| multi_prompt | array | | – | – | List of prompts for multi-shot video generation. If provided, overrides the single prompt and divides the video into multiple shots with specified prompts and durations. |
| generate_audio | boolean | | true | – | Whether to generate native audio for the video. Supports Chinese and English voice output. Other languages are automatically translated to English. For English speech, use lowercase letters; for acronyms or proper nouns, use uppercase. |
| negative_prompt | string | | blur, distort, and low quality | – | – |
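Both Kling tables state that either `prompt` or `multi_prompt` must be provided, but not both. The sketch below enforces that rule on the client side for either Kling model. The function name and the flat payload with a `model` key are assumptions, and because the element shape of `multi_prompt` is not specified in this section, the list is passed through untouched.

```python
from typing import Any, Dict, List, Optional

# Duration values from the Kling tables: strings "3" through "15".
KLING_DURATIONS = tuple(str(n) for n in range(3, 16))


def build_kling_params(
    model: str,
    prompt: Optional[str] = None,
    multi_prompt: Optional[List[Any]] = None,
    duration: str = "5",
) -> Dict[str, Any]:
    """Return a parameter dict holding exactly one of prompt / multi_prompt."""
    # True when both are missing or both are given; either case is invalid.
    if (prompt is None) == (multi_prompt is None):
        raise ValueError("provide exactly one of prompt or multi_prompt")
    if duration not in KLING_DURATIONS:
        raise ValueError("duration must be a string from '3' to '15'")
    params: Dict[str, Any] = {"model": model, "duration": duration}
    if prompt is not None:
        params["prompt"] = prompt
    else:
        params["multi_prompt"] = multi_prompt
    return params
```

The `multi_prompt` description also says it overrides the single prompt when provided; rejecting the combination up front follows the stricter of the two statements and avoids relying on that override.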
wan-2.7 — WAN 2.7 (Alibaba)
WAN 2.7 text-to-video - high quality generation. Default resolution is 1080p.
| Parameter | Type | Req | Default | Values / Range | Description |
|---|---|---|---|---|---|
| seed | integer | | – | – | Random seed for reproducibility (0-2147483647). |
| prompt | string | ✓ | – | – | Text prompt describing the desired video. Max 5000 characters. |
| duration | integer | | 5 | 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | Output video duration in seconds (2-15). |
| audio_url | string | | – | – | URL of driving audio. Supports WAV and MP3. Duration: 3-30s. Max 15 MB. If not provided, the model auto-generates matching background music. |
| resolution | string | | 1080p | 720p 1080p | Output video resolution tier. |
| aspect_ratio | string | | 16:9 | 16:9 9:16 1:1 4:3 3:4 | Aspect ratio of the generated video. |
| negative_prompt | string | | – | – | Content to avoid in the video. Max 500 characters. |
| enable_safety_checker | boolean | | true | – | Enable content moderation for input and output. |
| enable_prompt_expansion | boolean | | true | – | Enable intelligent prompt rewriting. |
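The wan-2.7 table carries several numeric limits, and unlike the Veo, Kling, and Seedance models its `duration` is an integer rather than a string. A client-side check along these lines can catch violations early; the function name is hypothetical, and Python's `len` counts code points, which may differ from how the service counts characters.

```python
from typing import Any, Dict, List

MAX_SEED = 2147483647


def validate_wan_27(params: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the limits are met."""
    errors: List[str] = []

    prompt = params.get("prompt")
    if not isinstance(prompt, str) or not prompt:
        errors.append("prompt is required")
    elif len(prompt) > 5000:
        errors.append("prompt exceeds 5000 characters")

    negative = params.get("negative_prompt")
    if isinstance(negative, str) and len(negative) > 500:
        errors.append("negative_prompt exceeds 500 characters")

    # duration is an integer for this model (2-15); the default is 5.
    duration = params.get("duration", 5)
    if type(duration) is not int or not 2 <= duration <= 15:
        errors.append("duration must be an integer from 2 to 15")

    seed = params.get("seed")
    if seed is not None and (type(seed) is not int or not 0 <= seed <= MAX_SEED):
        errors.append("seed must be an integer from 0 to 2147483647")

    if params.get("resolution", "1080p") not in ("720p", "1080p"):
        errors.append("resolution must be 720p or 1080p")

    return errors
```

Returning all problems at once, instead of raising on the first, lets a caller report every violation in a single pass.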
seedance-2.0 — Seedance 2.0 (ByteDance)
ByteDance Seedance 2.0: cinematic text-to-video with native audio, physics, and camera control.
| Parameter | Type | Req | Default | Values / Range | Description |
|---|---|---|---|---|---|
| seed | integer | | – | – | Random seed for reproducibility. Note that results may still vary slightly even with the same seed. |
| prompt | string | ✓ | – | – | The text prompt used to generate the video |
| duration | string | | 4 | 4 5 6 7 8 9 10 11 12 13 14 15 | Duration of the video in seconds (4-15). |
| resolution | string | | 720p | 480p 720p 1080p | Video resolution - 480p for faster generation, 720p for balance, 1080p for highest quality. |
| end_user_id | string | | – | – | The unique user ID of the end user. |
| aspect_ratio | string | | auto | auto 21:9 16:9 4:3 1:1 3:4 9:16 | The aspect ratio of the generated video. Use 16:9 for landscape, 9:16 for portrait/vertical, 1:1 for square, 21:9 for ultrawide cinematic, or auto to let the model decide. |
| generate_audio | boolean | | true | – | Whether to generate synchronized audio for the video, including sound effects, ambient sounds, and lip-synced speech. The cost of video generation is the same regardless of whether audio is generated or not. |
seedance-2.0-fast — Seedance 2.0 Fast (ByteDance)
ByteDance Seedance 2.0 fast tier: lower-latency text-to-video with native audio.
| Parameter | Type | Req | Default | Values / Range | Description |
|---|---|---|---|---|---|
| seed | integer | | – | – | Random seed for reproducibility. Note that results may still vary slightly even with the same seed. |
| prompt | string | ✓ | – | – | The text prompt used to generate the video |
| duration | string | | 4 | 4 5 6 7 8 9 10 11 12 13 14 15 | Duration of the video in seconds (4-15). |
| resolution | string | | 720p | 480p 720p | Video resolution - 480p for faster generation, 720p for balance. |
| end_user_id | string | | – | – | The unique user ID of the end user. |
| aspect_ratio | string | | auto | auto 21:9 16:9 4:3 1:1 3:4 9:16 | The aspect ratio of the generated video. Use 16:9 for landscape, 9:16 for portrait/vertical, 1:1 for square, 21:9 for ultrawide cinematic, or auto to let the model decide. |
| generate_audio | boolean | | true | – | Whether to generate synchronized audio for the video, including sound effects, ambient sounds, and lip-synced speech. The cost of video generation is the same regardless of whether audio is generated or not. |
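The two Seedance tiers share the same parameters except for `resolution`: seedance-2.0-fast stops at 720p, while seedance-2.0 also lists 1080p. One way to pick a tier from a target resolution is sketched below. The helper name and the "prefer the fast tier when it fits" policy are assumptions for illustration, not a documented recommendation.

```python
from typing import Dict, Tuple

# Resolution values copied from the two Seedance parameter tables.
SEEDANCE_RESOLUTIONS: Dict[str, Tuple[str, ...]] = {
    "seedance-2.0-fast": ("480p", "720p"),
    "seedance-2.0": ("480p", "720p", "1080p"),
}


def pick_seedance_model(resolution: str = "720p", prefer_fast: bool = True) -> str:
    """Return a Seedance model id whose table lists the given resolution."""
    order = ["seedance-2.0-fast", "seedance-2.0"]
    if not prefer_fast:
        order.reverse()
    for model in order:
        if resolution in SEEDANCE_RESOLUTIONS[model]:
            return model
    raise ValueError(f"no Seedance model lists resolution {resolution!r}")
```

With the default policy, a 1080p request falls through to seedance-2.0 because the fast tier does not list that resolution.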