API Reference

`samtts`

SAMTTS

A Python port of the Software Automatic Mouth (SAM) Text-To-Speech program.

Ported by: Quan Lin
License: None

`samtts.Reciter`

Reciter converts text to phonemes.

Parameters:	`debug` (`bool`, default: `False`) – Set or clear debug flag.

`samtts.Reciter.text_to_phonemes(input_text)`

Convert text to phonemes.

Parameters:	`input_text` (`str | bytes | bytearray`) – The input text to convert.

Returns:	`bytearray` – The phonemes bytearray.

`samtts.Processor`

Processor takes phonemes and prepares output parameters.

Parameters:	`debug` (`bool`, default: `False`) – Set or clear debug flag.

`samtts.Processor.process(input_phonemes)`

Process the phonemes and prepare output parameters.

On success, the output parameters are stored in `self.phoneme_index`, `self.phoneme_length`, and `self.stress`.

Parameters:	`input_phonemes` (`str | bytes | bytearray`) – The input phonemes to process.

Returns:	`bool` – Whether the phonemes are processed successfully.

`samtts.Renderer`

Renderer takes the phoneme parameters and renders a sound waveform.

Parameters:

`speed` (`int`, default: `72`) – Set speed value.

`pitch` (`int`, default: `64`) – Set pitch value.

`mouth` (`int`, default: `128`) – Set mouth value.

`throat` (`int`, default: `128`) – Set throat value.

`sing_mode` (`bool`, default: `False`) – Set or clear sing_mode flag.

`buffer_size` (`int`, default: `220500`) – Set a large enough buffer size for rendering.

`debug` (`bool`, default: `False`) – Set or clear debug flag.

`samtts.Renderer.config(speed=None, pitch=None, mouth=None, throat=None, sing_mode=None)`

Configure renderer parameters.

Parameters:

`speed` (`int | None`, default: `None`) – Set speed value.

`pitch` (`int | None`, default: `None`) – Set pitch value.

`mouth` (`int | None`, default: `None`) – Set mouth value.

`throat` (`int | None`, default: `None`) – Set throat value.

`sing_mode` (`bool | None`, default: `None`) – Set or clear sing_mode flag.
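The `None` defaults suggest that an omitted argument leaves the current setting unchanged. That behaviour is an assumption inferred from the signature, not something this reference states. The stand-in class below (`SettingsSketch`, a hypothetical name) only illustrates the pattern and is not the library's code:

```python
class SettingsSketch:
    """Hypothetical stand-in that mimics the shape of Renderer.config."""

    def __init__(self, speed=72, pitch=64, mouth=128, throat=128, sing_mode=False):
        self.speed = speed
        self.pitch = pitch
        self.mouth = mouth
        self.throat = throat
        self.sing_mode = sing_mode

    def config(self, speed=None, pitch=None, mouth=None, throat=None, sing_mode=None):
        # Assumption: None means "keep the current value".
        updates = {
            "speed": speed,
            "pitch": pitch,
            "mouth": mouth,
            "throat": throat,
            "sing_mode": sing_mode,
        }
        for name, value in updates.items():
            if value is not None:
                setattr(self, name, value)


settings = SettingsSketch()
settings.config(speed=90, sing_mode=True)  # pitch, mouth and throat stay as they were
```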

`samtts.Renderer.render(processor)`

Render the sound waveform.

On success, the audio data is stored in `self.buffer`, and the length of the valid data is stored in `self.buffer_end`.

Parameters:	`processor` (`Processor`) – A `Processor` instance that has output parameters prepared.

Returns:	`bool` – Whether the sound waveform is rendered successfully.
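The three classes can be chained by hand: text goes through a `Reciter`, the phonemes through a `Processor`, and the prepared `Processor` through a `Renderer`. The following is a minimal sketch of that flow. It assumes `Renderer.buffer` supports slicing (for example, a `bytearray`), and the helper name `render_text` is hypothetical, not part of `samtts`.

```python
def render_text(text, reciter, processor, renderer):
    """Run text through reciter -> processor -> renderer.

    Returns the rendered audio as bytes, or None if a stage reports failure.
    `render_text` is a hypothetical helper, not part of samtts.
    """
    phonemes = reciter.text_to_phonemes(text)
    if not processor.process(phonemes):
        return None
    if not renderer.render(processor):
        return None
    # Assumption: renderer.buffer is sliceable, and only the first
    # buffer_end bytes hold valid audio data.
    return bytes(renderer.buffer[:renderer.buffer_end])
```

With the package installed, this could be called as `render_text("hello", Reciter(), Processor(), Renderer())` after `from samtts import Reciter, Processor, Renderer`.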

`samtts.SamTTS`

SamTTS combines Reciter, Processor and Renderer together.

Parameters:

`speed` (`int`, default: `72`) – Set speed value.

`pitch` (`int`, default: `64`) – Set pitch value.

`mouth` (`int`, default: `128`) – Set mouth value.

`throat` (`int`, default: `128`) – Set throat value.

`sing_mode` (`bool`, default: `False`) – Set or clear sing_mode flag.

`buffer_size` (`int`, default: `220500`) – Set a large enough buffer size for rendering.

`debug` (`bool`, default: `False`) – Set or clear debug flag.

`samtts.SamTTS.get_audio_data(input_data, phonetic=False, speed=None, pitch=None, mouth=None, throat=None, sing_mode=None, sample_rate=22050)`

Get audio data from input text or phonemes.

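This method can only process very short inputs. For longer text, use `iter_audio_data_from_paragraph`, which renders a paragraph segment by segment.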

Parameters:

`input_data` (`str | bytes | bytearray`) – The input text or phonemes.

`phonetic` (`bool`, default: `False`) – Whether the input is phonemes rather than text.

`speed` (`int | None`, default: `None`) – Set speed value.

`pitch` (`int | None`, default: `None`) – Set pitch value.

`mouth` (`int | None`, default: `None`) – Set mouth value.

`throat` (`int | None`, default: `None`) – Set throat value.

`sing_mode` (`bool | None`, default: `None`) – Set or clear sing_mode flag.

`sample_rate` (`int`, default: `22050`) – The sample rate of the audio data. It must be one of 5513, 11025, or 22050.

Returns:	`bytearray` – The rendered audio data bytearray.
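A short call might look like the sketch below. The wrapper `render_short_text` and the constant `SUPPORTED_SAMPLE_RATES` are hypothetical names introduced for illustration; the import is guarded so the snippet still loads when `samtts` is not installed.

```python
try:
    from samtts import SamTTS
except ImportError:  # samtts is not installed; the wrapper reports it on use
    SamTTS = None

# The three sample rates listed for the sample_rate parameter.
SUPPORTED_SAMPLE_RATES = (5513, 11025, 22050)


def render_short_text(text, sample_rate=22050, **voice):
    """Hypothetical wrapper: check sample_rate, then call get_audio_data.

    Extra keyword arguments (speed, pitch, mouth, throat, sing_mode)
    are passed through unchanged.
    """
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(
            f"sample_rate must be one of {SUPPORTED_SAMPLE_RATES}, got {sample_rate}"
        )
    if SamTTS is None:
        raise RuntimeError("samtts is not installed")
    return SamTTS().get_audio_data(text, sample_rate=sample_rate, **voice)
```

Checking the sample rate up front gives a clear error message without relying on how the library itself reacts to an unsupported value, which this reference does not specify.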

`samtts.SamTTS.iter_audio_data_from_paragraph(paragraph, phonetic=False, speed=None, pitch=None, mouth=None, throat=None, sing_mode=None, sample_rate=22050, iter_segments_from_paragraph=iter_by_punctuations)`

Get audio data from a paragraph segment by segment.

Parameters:

`paragraph` (`str`) – The input string paragraph.

`phonetic` (`bool`, default: `False`) – Whether the input is phonemes rather than text.

`speed` (`int | None`, default: `None`) – Set speed value.

`pitch` (`int | None`, default: `None`) – Set pitch value.

`mouth` (`int | None`, default: `None`) – Set mouth value.

`throat` (`int | None`, default: `None`) – Set throat value.

`sing_mode` (`bool | None`, default: `None`) – Set or clear sing_mode flag.

`sample_rate` (`int`, default: `22050`) – The sample rate of the audio data. It must be one of 5513, 11025, or 22050.

`iter_segments_from_paragraph` (`Callable`, default: `iter_by_punctuations`) – A function that splits the paragraph into segments. Its signature is:
```
iter_segments_from_paragraph(paragraph: str) -> Iterable[str]
```

Yields:	`Iterable[bytearray]` – Audio data.
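Any callable matching the `iter_segments_from_paragraph` signature can replace the default segmenter. As an illustration, the hypothetical `iter_by_sentences` below splits only on sentence-ending punctuation. How `iter_by_punctuations` itself splits is not described here, so this is a sketch of an alternative, not a copy of the default.

```python
from typing import Iterator

SENTENCE_ENDINGS = ".!?"


def iter_by_sentences(paragraph: str) -> Iterator[str]:
    """Hypothetical segmenter: yield one segment per sentence."""
    start = 0
    for index, char in enumerate(paragraph):
        if char in SENTENCE_ENDINGS:
            segment = paragraph[start:index + 1].strip()
            # Skip segments with nothing to pronounce (e.g. a lone ".").
            if any(c.isalnum() for c in segment):
                yield segment
            start = index + 1
    # Whatever follows the last sentence ending is the final segment.
    tail = paragraph[start:].strip()
    if any(c.isalnum() for c in tail):
        yield tail
```

It would be passed as `iter_segments_from_paragraph=iter_by_sentences`.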

`samtts.SamTTS.save(paragraph, output_file_path, phonetic=False, speed=None, pitch=None, mouth=None, throat=None, sing_mode=None, sample_rate=22050, iter_segments_from_paragraph=iter_by_punctuations, save_audio_data=save_audio_data_in_wav_format)`

Save audio data from a paragraph to an output file.

Parameters:

`paragraph` (`str`) – The input paragraph.

`output_file_path` (`str`) – The path of the output file.

`phonetic` (`bool`, default: `False`) – Whether the input is phonemes rather than text.

`speed` (`int | None`, default: `None`) – Set speed value.

`pitch` (`int | None`, default: `None`) – Set pitch value.

`mouth` (`int | None`, default: `None`) – Set mouth value.

`throat` (`int | None`, default: `None`) – Set throat value.

`sing_mode` (`bool | None`, default: `None`) – Set or clear sing_mode flag.

`sample_rate` (`int`, default: `22050`) – The sample rate of the audio data. It must be one of 5513, 11025, or 22050.

`iter_segments_from_paragraph` (`Callable`, default: `iter_by_punctuations`) – A function that splits the paragraph into segments. Its signature is:
```
iter_segments_from_paragraph(paragraph: str) -> Iterable[str]
```

`save_audio_data` (`Callable`, default: `save_audio_data_in_wav_format`) – A function that writes the audio data to the output file. Its signature is:
```
save_audio_data(
    audio_data: bytes | bytearray,
    output_file_path: str,
    num_channels: int,
    bytes_per_sample: int,
    sample_rate: int,
)
```
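A custom `save_audio_data` callback only has to accept the five arguments listed in that signature. The sketch below writes a WAV file with Python's standard `wave` module; the name `save_audio_data_as_wav` is hypothetical, and it is not claimed to match what `save_audio_data_in_wav_format` does internally.

```python
import wave


def save_audio_data_as_wav(
    audio_data,
    output_file_path,
    num_channels,
    bytes_per_sample,
    sample_rate,
):
    """Hypothetical callback: write raw PCM audio data to a WAV file."""
    with wave.open(output_file_path, "wb") as wav_file:
        wav_file.setnchannels(num_channels)
        wav_file.setsampwidth(bytes_per_sample)
        wav_file.setframerate(sample_rate)
        # bytes() makes a copy so that a bytearray input is accepted as well.
        wav_file.writeframes(bytes(audio_data))
```

The parameter names match the documented signature, so the callback works whether it is invoked with positional or keyword arguments.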

`samtts.SamTTS.play(paragraph, phonetic=False, speed=None, pitch=None, mouth=None, throat=None, sing_mode=None, sample_rate=22050, iter_segments_from_paragraph=iter_by_punctuations, play_audio_data=play_audio_data_with_simpleaudio)`

Play audio data from a paragraph.

Parameters:

`paragraph` (`str`) – The input paragraph.

`phonetic` (`bool`, default: `False`) – Whether the input is phonemes rather than text.

`speed` (`int | None`, default: `None`) – Set speed value.

`pitch` (`int | None`, default: `None`) – Set pitch value.

`mouth` (`int | None`, default: `None`) – Set mouth value.

`throat` (`int | None`, default: `None`) – Set throat value.

`sing_mode` (`bool | None`, default: `None`) – Set or clear sing_mode flag.

`sample_rate` (`int`, default: `22050`) – The sample rate of the audio data. It must be one of 5513, 11025, or 22050.

`iter_segments_from_paragraph` (`Callable`, default: `iter_by_punctuations`) – A function that splits the paragraph into segments. Its signature is:
```
iter_segments_from_paragraph(paragraph: str) -> Iterable[str]
```

`play_audio_data` (`Callable`, default: `play_audio_data_with_simpleaudio`) – A function that plays the audio data. Its signature is:
```
play_audio_data(
    audio_data: bytes | bytearray,
    num_channels: int,
    bytes_per_sample: int,
    sample_rate: int,
)
```
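Because `play_audio_data` is just a callable, playback can be swapped for something that needs no audio hardware, which may be handy in tests. The sketch below records what would have been played instead of playing it; `record_playback` and `played_segments` are hypothetical names, and whether the library uses the return value is not specified here.

```python
played_segments = []


def record_playback(audio_data, num_channels, bytes_per_sample, sample_rate):
    """Hypothetical stand-in for a playback backend: log instead of playing."""
    bytes_per_second = num_channels * bytes_per_sample * sample_rate
    duration = len(audio_data) / bytes_per_second  # seconds of audio
    played_segments.append((len(audio_data), duration))
    return duration
```

It would be passed as `play_audio_data=record_playback`.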

`samtts.SamTTS.async_play(paragraph, phonetic=False, speed=None, pitch=None, mouth=None, throat=None, sing_mode=None, sample_rate=22050, iter_segments_from_paragraph=iter_by_punctuations, async_play_audio_data=async_play_audio_data_with_simpleaudio)` `async`

Asynchronously play audio data from a paragraph.

Parameters:

`paragraph` (`str`) – The input paragraph.

`phonetic` (`bool`, default: `False`) – Whether the input is phonemes rather than text.

`speed` (`int | None`, default: `None`) – Set speed value.

`pitch` (`int | None`, default: `None`) – Set pitch value.

`mouth` (`int | None`, default: `None`) – Set mouth value.

`throat` (`int | None`, default: `None`) – Set throat value.

`sing_mode` (`bool | None`, default: `None`) – Set or clear sing_mode flag.

`sample_rate` (`int`, default: `22050`) – The sample rate of the audio data. It must be one of 5513, 11025, or 22050.

`iter_segments_from_paragraph` (`Callable`, default: `iter_by_punctuations`) – A function that splits the paragraph into segments. Its signature is:
```
iter_segments_from_paragraph(paragraph: str) -> Iterable[str]
```

`async_play_audio_data` (`Awaitable`, default: `async_play_audio_data_with_simpleaudio`) – An async function that plays the audio data. Its signature is:
```
async_play_audio_data(
    audio_data: bytes | bytearray,
    num_channels: int,
    bytes_per_sample: int,
    sample_rate: int,
)
```
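An `async_play_audio_data` callback has the same four arguments but is awaited. The sketch below is a hypothetical stand-in (`async_log_playback`) that produces no sound: it sleeps for roughly the time the segment would take to play, so other tasks on the event loop keep running in the meantime.

```python
import asyncio


async def async_log_playback(audio_data, num_channels, bytes_per_sample, sample_rate):
    """Hypothetical async callback: wait out the segment without playing it."""
    duration = len(audio_data) / (num_channels * bytes_per_sample * sample_rate)
    # Yield to the event loop for about as long as playback would take.
    await asyncio.sleep(duration)
    return duration
```

Inside a coroutine it could be used as `await tts.async_play("hello", async_play_audio_data=async_log_playback)`, where `tts` is a `SamTTS` instance.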
