
laravel-voicevox maintained by revolution

Description: VOICEVOX for Laravel
Last update: 2026/05/24 12:35 (dev-main)
Downloads: 0


VOICEVOX for Laravel


Work In Progress.

Overview

This package brings VOICEVOX, a Japanese TTS / singing synthesis ecosystem, to Laravel. You can use both client mode (official VOICEVOX Engine over HTTP) and native mode (direct synthesis through PHP FFI + VOICEVOX Core) with a Laravel-style API.

Since VOICEVOX only supports Japanese, you must first translate the text from English to Japanese using an AI/LLM or similar tool before using this package for speech synthesis.

VOICEVOX is widely used in Japan, and many well-known "Zundamon" voice clips are created with it.

Feature | Supported | Description
VOICEVOX Client | | Client for the official VOICEVOX Engine API. Works without FFI.
VOICEVOX Core | | voicevox-core-php wraps VOICEVOX Core dynamic libraries through FFI.
Laravel style | | Uses a Laravel-friendly API for voicevox-core-php features.
Laravel AI SDK Integration | | Supported from Laravel AI SDK Audio in both client and native modes.
VOICEVOX Engine | ⚠️ | Provides a VOICEVOX-compatible API inside Laravel, with fallback to the official engine for unsupported parts.
VOICEVOX Editor | | The Editor itself is out of scope for this package.

Requirements

  • PHP 8.3+
  • Laravel 12.x+
  • FFI: Required for everything except client-only usage.

FFI is disabled on most web servers (including Laravel Cloud), so this package is mainly intended for local CLI usage.

In the CLI, FFI is typically enabled by default (PHP's default ffi.enable=preload setting permits FFI in CLI scripts). If you run a local web server process for the Laravel Engine API, enable FFI in php.ini:

ffi.enable=true
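
Before relying on native mode, it can help to confirm that FFI is usable in the process that will run the synthesis. The following is a minimal sketch using only built-in PHP functions; the helper name ffiStatus() and its return labels are hypothetical and not part of this package.

```php
<?php

// Hypothetical helper: classify FFI availability from the extension state,
// the ffi.enable ini value, and the current SAPI name.
function ffiStatus(bool $extensionLoaded, string $ffiEnable, string $sapi): string
{
    if (! $extensionLoaded) {
        return 'missing'; // the ffi extension is not loaded at all
    }

    $value = strtolower(trim($ffiEnable));

    if (in_array($value, ['1', 'true', 'on', 'yes'], true)) {
        return 'enabled'; // FFI allowed everywhere
    }

    if ($value === 'preload') {
        // "preload" restricts FFI to the CLI SAPI and preloaded files.
        return $sapi === 'cli' ? 'enabled' : 'preload-only';
    }

    return 'disabled';
}

// Report the status for the current process.
echo ffiStatus(extension_loaded('ffi'), (string) ini_get('ffi.enable'), PHP_SAPI), PHP_EOL;
```

Running this once from the CLI and once through the web server shows whether the two environments differ.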

Installation

Install both packages to use VOICEVOX Core features:

composer require revolution/laravel-voicevox revolution/voicevox-core-php

You can install only laravel-voicevox for client-only mode:

composer require revolution/laravel-voicevox

VOICEVOX Core Dynamic Library Setup

To use FFI-based features, install VOICEVOX Core libraries by following the voicevox-core-php README.

Configuration

Publish the package config file (config/voicevox.php):

php artisan vendor:publish --tag="voicevox-config"

For Core features, configure the path in .env:

VOICEVOX_CORE_PATH=/.../.local/voicevox_core/
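
A quick sanity check of the configured path can catch typos early. This sketch reads VOICEVOX_CORE_PATH straight from the process environment with plain PHP; the helper describeCorePath() is hypothetical, and the check assumes nothing about the directory layout beyond the directory being readable.

```php
<?php

// Hypothetical helper: report whether a configured Core path looks usable.
function describeCorePath(?string $path): string
{
    if ($path === null || trim($path) === '') {
        return 'unset';   // no path configured
    }

    if (! is_dir($path)) {
        return 'missing'; // path does not point at a directory
    }

    return is_readable($path) ? 'ok' : 'unreadable';
}

// getenv() returns false when the variable is not defined.
$configured = getenv('VOICEVOX_CORE_PATH');

echo describeCorePath($configured === false ? null : $configured), PHP_EOL;
```

Inside a Laravel application the value would normally be read through the published config file rather than getenv().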

Usage

Client mode

Use client mode through the Voicevox facade.

Client mode connects to the official VOICEVOX Engine. Start it with Docker (GPU image is also available in supported environments):

docker pull voicevox/voicevox_engine:cpu-latest
docker run --rm -p '127.0.0.1:50021:50021' voicevox/voicevox_engine:cpu-latest
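
Once the container is running, you may want to confirm the engine answers before calling the facade. The sketch below queries the engine's /version endpoint on the host and port from the docker command above; the helper names engineUrl() and engineVersion() are hypothetical and are not provided by this package.

```php
<?php

// Hypothetical helper: build a URL for the locally running VOICEVOX Engine.
function engineUrl(string $path, string $host = '127.0.0.1', int $port = 50021): string
{
    return sprintf('http://%s:%d/%s', $host, $port, ltrim($path, '/'));
}

// Hypothetical helper: return the engine version string, or null when the
// engine cannot be reached or the response is not a JSON string.
function engineVersion(string $url, float $timeout = 2.0): ?string
{
    $context = stream_context_create(['http' => ['timeout' => $timeout]]);
    $body = @file_get_contents($url, false, $context);

    if ($body === false) {
        return null;
    }

    $decoded = json_decode($body, true);

    return is_string($decoded) ? $decoded : null;
}

$version = engineVersion(engineUrl('/version'));

echo $version === null
    ? 'VOICEVOX Engine is not reachable' . PHP_EOL
    : "VOICEVOX Engine {$version}" . PHP_EOL;
```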

Text-to-speech. Client mode enables enable_katakana_english, so English words are automatically converted into katakana.

use Revolution\Voicevox\Voicevox;
use Revolution\Voicevox\Client\TalkAudioQuery;

$response = Voicevox::talk('Laravelが好きなのだ', id: 1)
    ->tap(function (TalkAudioQuery $talk): void {
        $talk->audioQuery['speedScale'] = 1.2;
    })->generate(id: 1);

$response->storeAs('client', 'talk.wav');

For singing synthesis, create a Score first. Each note's length is a frame count, and Note::len($ticks, $bpm) derives it from MIDI ticks and tempo, which helps MIDI-oriented workflows.

use Revolution\Voicevox\Song\Note;
use Revolution\Voicevox\Song\Score;
use Revolution\Voicevox\Voicevox;

$score = Score::make([
    Note::make(length: 15), // first note must be a rest
    Note::make(length: Note::len(ticks: 480, bpm: 120), lyric: 'ド', key: 60), // quarter note
    Note::make(length: Note::len(480, 120), lyric: 'レ', key: 62), // quarter note
    Note::make(length: Note::len(960, 120), lyric: 'ミ', key: 64), // half note
    Note::make(length: 2), // optional short tail silence
]);

$response = Voicevox::song($score, teacher: 6000)->generate(id: 3001);

$response->storeAs('client', 'song.wav');
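
To make the relationship between ticks, tempo, and frame length concrete, here is a standalone sketch of the arithmetic. It is not the package's Note::len() implementation: the function ticksToFrames() is hypothetical, and it assumes 480 ticks per quarter note and a frame rate of 93.75 frames per second, so the package may round or scale differently.

```php
<?php

// Hypothetical helper: convert MIDI ticks at a given tempo into a frame count.
// Assumptions: $ppq ticks per quarter note, $frameRate frames per second.
function ticksToFrames(int $ticks, float $bpm, int $ppq = 480, float $frameRate = 93.75): int
{
    $quarterNotes = $ticks / $ppq;          // note length in quarter notes
    $seconds = $quarterNotes * 60.0 / $bpm; // one quarter note lasts 60 / bpm seconds

    return (int) round($seconds * $frameRate);
}

echo ticksToFrames(480, 120), PHP_EOL; // quarter note at 120 bpm: 0.5 s, prints 47
echo ticksToFrames(960, 120), PHP_EOL; // half note at 120 bpm: 1.0 s, prints 94
```

Under these assumptions, the 15-frame leading rest in the score above corresponds to 0.16 seconds of silence.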

Native mode

Use native mode through the talk() / song() helper functions. Apart from dropping the Voicevox:: prefix, usage closely mirrors client mode.

Text-to-speech in native mode. Native mode does not provide enable_katakana_english, so English text should be preprocessed before synthesis (for example, converted to katakana with an AI/LLM). Note that Revolution\Voicevox\Talk\TalkAudioQuery shares its class name with the client's Revolution\Voicevox\Client\TalkAudioQuery but is a different class.

use Revolution\Voicevox\Talk\TalkAudioQuery;
use function Revolution\Voicevox\talk;

$response = talk('ララベルが好きなのだ', id: 1)
    ->tap(function (TalkAudioQuery $talk): void {
        $talk->audioQuery['speedScale'] = 1.2;
    })->generate(id: 1);

$response->storeAs('native', 'talk.wav');
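
When only a handful of English terms appear in the text, a small lookup table can stand in for an AI/LLM preprocessing step. The sketch below is one possible approach and not part of this package; the function toKatakana() and the dictionary entries are illustrative assumptions.

```php
<?php

// Hypothetical helper: replace known English words with katakana readings.
// Matching is case-insensitive; longer keys are replaced first so that
// "Laravel Cloud" is not split by the shorter "Laravel" entry.
function toKatakana(string $text, array $dictionary): string
{
    uksort($dictionary, fn (string $a, string $b): int => strlen($b) <=> strlen($a));

    return str_ireplace(array_keys($dictionary), array_values($dictionary), $text);
}

// Illustrative dictionary; extend it with the terms your text actually uses.
$dictionary = [
    'Laravel' => 'ララベル',
    'Laravel Cloud' => 'ララベルクラウド',
];

echo toKatakana('Laravelが好きなのだ', $dictionary), PHP_EOL; // prints ララベルが好きなのだ
```

The result can be passed to talk() in place of the original string.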

Singing synthesis in native mode. Score and Note are shared with client mode.

use Revolution\Voicevox\Song\Note;
use Revolution\Voicevox\Song\Score;
use function Revolution\Voicevox\song;

$score = Score::make([
    Note::make(length: 15), // first note must be a rest
    Note::make(length: Note::len(ticks: 480, bpm: 120), lyric: 'ド', key: 60),
    Note::make(length: Note::len(480, 120), lyric: 'レ', key: 62),
    Note::make(length: Note::len(960, 120), lyric: 'ミ', key: 64),
    Note::make(length: 2), // optional short tail silence
]);

$response = song($score, teacher: 6000)->generate(id: 3001);

$response->storeAs('native', 'song.wav');

Documentation

Terms of Use

You must follow the terms of use for VOICEVOX and each voice library.

License

MIT