Create a personalized Voice Persona through voice verification. The process requires two steps: init (upload voice + get verification phrase) → create (upload verification recording + create persona).

Workflow

 User's voice audio
      │
      ▼
 ① voicePersona/init
      │  Upload voice → Extract vocals → Return verification phrase
      │
      │  Returns: { taskId }
      │  Poll: GET /suno/v2/status?taskId=xxx
      │  Result: vox_audio_id, voice_recording_id, phrase_id,
      │          phrase_text, vocal_start_s, vocal_end_s
      ▼
 User reads phrase_text aloud and records it
      │
      ▼
 ② voicePersona/create
      │  Upload verification recording → Voice verification → Create Persona
      │
      │  Returns: { taskId }
      │  Poll: GET /suno/v2/status?taskId=xxx
      │  Result: persona details
      ▼
 Done → Use persona in /generate

Step 1: Init — Upload Voice & Get Verification Phrase

Upload the user’s voice audio. The system extracts vocals and returns a verification phrase that the user must read aloud.
This is an async task. Poll Get Task Status with the returned taskId.

Request

POST /suno/v2/voicePersona/init
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| voice_audio_url | string (URL) | Yes | Publicly downloadable URL of the voice audio (WAV/MP3) |
| language | string | Yes | Verification phrase language, one of: zh, en, ja, ko, es, fr, de, pt, ru, hi |
| vocal_start_s | number | No | Vocal extraction start time (seconds), default: 0 |
| vocal_end_s | number | No | Vocal extraction end time (seconds), default: auto-detected |
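
The request fields above can be checked on the client before any network call. The following is a minimal sketch, not part of the API: `buildInitBody` and `SUPPORTED_LANGUAGES` are hypothetical names, and the two extra checks (the URL must start with http or https, and `vocal_start_s` must be smaller than `vocal_end_s` when both are given) are assumptions about what the server would accept.

```javascript
// Language codes listed in the init request table.
const SUPPORTED_LANGUAGES = ['zh', 'en', 'ja', 'ko', 'es', 'fr', 'de', 'pt', 'ru', 'hi'];

// Hypothetical helper (not part of the API): builds the init request body
// and rejects obviously invalid input before a request is sent.
function buildInitBody({ voice_audio_url, language, vocal_start_s, vocal_end_s }) {
  // Assumption: "publicly downloadable" means an http(s) URL.
  if (typeof voice_audio_url !== 'string' || !/^https?:\/\//.test(voice_audio_url)) {
    throw new Error('voice_audio_url must be an http(s) URL');
  }
  if (!SUPPORTED_LANGUAGES.includes(language)) {
    throw new Error(`Unsupported language: ${language}`);
  }

  const body = { voice_audio_url, language };
  // Optional fields are sent only when supplied, so the documented
  // defaults (0 and auto-detected) apply otherwise.
  if (vocal_start_s !== undefined) body.vocal_start_s = vocal_start_s;
  if (vocal_end_s !== undefined) body.vocal_end_s = vocal_end_s;

  // Assumption: a start time at or after the end time is never valid.
  if (vocal_start_s !== undefined && vocal_end_s !== undefined &&
      vocal_start_s >= vocal_end_s) {
    throw new Error('vocal_start_s must be less than vocal_end_s');
  }
  return body;
}
```

The returned object can be passed to `JSON.stringify` and used as the body of the POST request.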

Task Result Fields

When the task succeeds, result contains the following fields needed for Step 2:
| Field | Description |
| --- | --- |
| vox_audio_id | Extracted vocal audio ID |
| voice_recording_id | Recording ID |
| phrase_id | Verification phrase ID |
| phrase_text | Verification phrase text (the user must read this aloud and record it) |
| vocal_start_s | Vocal start time (seconds) |
| vocal_end_s | Vocal end time (seconds) |

See Init API Reference →
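
Because Step 2 depends on every one of these fields, it can help to confirm they are all present before prompting the user to record. This is a small sketch under that assumption; `missingInitFields` is a hypothetical helper, not something the API provides.

```javascript
// Fields the init task result is documented to contain.
const INIT_RESULT_FIELDS = [
  'vox_audio_id',
  'voice_recording_id',
  'phrase_id',
  'phrase_text',
  'vocal_start_s',
  'vocal_end_s',
];

// Hypothetical helper: returns the names of any documented fields that are
// absent from the init result. Only undefined and null count as missing,
// because vocal_start_s can legitimately be 0.
function missingInitFields(result) {
  return INIT_RESULT_FIELDS.filter(
    (field) => result == null || result[field] === undefined || result[field] === null
  );
}
```

An empty array means the result is complete. A non-empty array names the fields to report or log before retrying init.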

Step 2: Create — Upload Verification Recording & Create Persona

After the user reads phrase_text aloud and records it, upload the verification recording to complete voice verification and create the persona.
This is an async task. Poll Get Task Status with the returned taskId. The result contains the created persona details.

Request

POST /suno/v2/voicePersona/create
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| vox_audio_id | string | Yes | From init result |
| voice_recording_id | string | Yes | From init result |
| phrase_id | string | Yes | From init result |
| verification_audio_url | string (URL) | Yes | User’s verification recording URL (WAV/MP3) |
| vocal_start_s | number | Yes | From init result |
| vocal_end_s | number | Yes | From init result |
| name | string | Yes | Persona name |
| description | string | No | Persona description |
| is_public | boolean | No | Whether the persona is public (default: false) |
| image_s3_id | string | No | Cover image (base64), auto-generated if not provided |

See Create API Reference →
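
Most of the create fields are copied unchanged from the init result, so the request body can be assembled mechanically. The sketch below illustrates one way to do that; `buildCreateBody` is a hypothetical helper, and the client-side checks for `verification_audio_url` and `name` are an assumption, not documented API behavior.

```javascript
// Hypothetical helper: merges the init task result with the caller-supplied
// fields into a body for POST /suno/v2/voicePersona/create.
function buildCreateBody(initResult, options) {
  const { verification_audio_url, name, description, is_public, image_s3_id } = options;
  if (!verification_audio_url) throw new Error('verification_audio_url is required');
  if (!name) throw new Error('name is required');

  const body = {
    // Copied from the init result, as the request table specifies.
    vox_audio_id: initResult.vox_audio_id,
    voice_recording_id: initResult.voice_recording_id,
    phrase_id: initResult.phrase_id,
    vocal_start_s: initResult.vocal_start_s,
    vocal_end_s: initResult.vocal_end_s,
    // Supplied by the caller.
    verification_audio_url,
    name,
  };
  // Optional fields are omitted unless set, so the documented defaults apply.
  if (description !== undefined) body.description = description;
  if (is_public !== undefined) body.is_public = is_public;
  if (image_s3_id !== undefined) body.image_s3_id = image_s3_id;
  return body;
}
```

Extra fields in the init result, such as `phrase_text`, are left out because the create request table does not list them.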

Complete Example

const API_BASE = 'https://api.mountsea.ai';
const headers = {
  'Content-Type': 'application/json',
  'Authorization': 'Bearer your-api-key'
};

async function pollTask(taskId) {
  while (true) {
    const res = await fetch(`${API_BASE}/suno/v2/status?taskId=${taskId}`, { headers });
    const task = await res.json();
    if (task.status === 'success') return task.data;
    if (task.status === 'failed') throw new Error(task.failReason);
    await new Promise(r => setTimeout(r, 3000));
  }
}

// Step 1: Init — upload voice and get verification phrase
const initRes = await fetch(`${API_BASE}/suno/v2/voicePersona/init`, {
  method: 'POST',
  headers,
  body: JSON.stringify({
    voice_audio_url: 'https://example.com/my-voice.wav',
    language: 'zh'
  })
});
const { taskId: initTaskId } = await initRes.json();
const initResult = await pollTask(initTaskId);

console.log('Please read aloud:', initResult.phrase_text);
// → User records themselves reading the phrase

// Step 2: Create — upload verification recording
const createRes = await fetch(`${API_BASE}/suno/v2/voicePersona/create`, {
  method: 'POST',
  headers,
  body: JSON.stringify({
    vox_audio_id: initResult.vox_audio_id,
    voice_recording_id: initResult.voice_recording_id,
    phrase_id: initResult.phrase_id,
    verification_audio_url: 'https://example.com/verification.wav',
    vocal_start_s: initResult.vocal_start_s,
    vocal_end_s: initResult.vocal_end_s,
    name: 'My Voice'
  })
});
const { taskId: createTaskId } = await createRes.json();
const persona = await pollTask(createTaskId);

console.log('Voice Persona created:', persona);
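
The `pollTask` loop in the example above runs until the task succeeds or fails, so a task that never finishes would keep polling indefinitely. The variant below is a sketch of one way to bound it. It assumes the same status endpoint and response fields (`status`, `data`, `failReason`) used in the example; the function name, the `fetchImpl` option, and the default 120-second timeout are illustrative choices, not part of the API.

```javascript
// Hypothetical variant of pollTask: adds an HTTP status check and an
// overall deadline. Any status other than 'success' or 'failed' is
// treated as "still running".
async function pollTaskWithTimeout(taskId, {
  apiBase = 'https://api.mountsea.ai',
  headers = {},
  intervalMs = 3000,
  timeoutMs = 120000, // assumed budget, not a documented limit
  fetchImpl = fetch,  // injectable so the loop can be tested offline
} = {}) {
  const deadline = Date.now() + timeoutMs;
  const url = `${apiBase}/suno/v2/status?taskId=${encodeURIComponent(taskId)}`;

  while (true) {
    const res = await fetchImpl(url, { headers });
    if (!res.ok) throw new Error(`Status request failed: HTTP ${res.status}`);

    const task = await res.json();
    if (task.status === 'success') return task.data;
    if (task.status === 'failed') throw new Error(task.failReason);

    // Give up if the next poll would start after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Task ${taskId} did not finish within ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

It can stand in for `pollTask` in the example, for instance `await pollTaskWithTimeout(initTaskId, { headers })`.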

Important Notes

The verification recording must clearly contain the full phrase_text content. Incomplete or unclear recordings may cause voice verification to fail.
  • Same account guarantee: The init and create steps automatically use the same Suno account, so you do not need to specify an account manually.
  • Language selection: language determines the verification phrase language. It’s recommended to match the language of the original voice audio.
  • Processing time: Init takes ~20-60s (includes vocal extraction); Create takes ~10-30s (includes voice verification).
  • Using the persona: Once created, use the persona in the Generate endpoint via the persona parameter to create music with consistent vocal characteristics.