SAMTTS
A Python port of the Software Automatic Mouth (SAM) Text-To-Speech program.
- Ported by: Quan Lin
- License: None
WARNING: This project is not under any open source software license. Use it at your own risk.
It is tested on Windows with Python 3.11.9.
What is SAM?
SAM (Software Automatic Mouth) is a Text-To-Speech (TTS) program for the Commodore 64, published in 1982 by Don't Ask Software (now SoftVoice, Inc.).
This project is an unofficial Python port of SAM. It was translated by hand from the adaptation to C by Stefan Macke and the refactorings by Vidar Hokstad.
Installation
pip install samtts
Usage
Use samtts in a Python script
The minimum example:
from samtts import SamTTS
SamTTS().play("Hello. My name is Sam.")
A conversation between Sam and Little Robot:
from samtts import SamTTS
# The default config is Sam.
sam = SamTTS()
# Config SamTTS for a different character.
robot = SamTTS(speed=92, pitch=60, mouth=190, throat=190)
sam.play("Hello. Little Robot. How are you today?")
robot.play("Hello! I am functioning well, thank you. How can I assist you today?")
sam.play("Could you hand me the hammer please?")
robot.play("Of course! Here you are.")
sam.play("Thank you very much!")
SamTTS does not pronounce every word correctly.
Sometimes you may want to use phonemes directly.
Phonemes are powerful and flexible.
But make sure the phonemes are valid; invalid phonemes raise an exception.
from samtts import SamTTS
# Make SamTTS say "Hello. My name is Sam." in phonemes.
SamTTS().play("/HEHLOH3OW. MAY4 NEY4M IHZ SAE4M.", phonetic=True)
Make SamTTS sing:
from samtts import SamTTS
singer = SamTTS(speed=200, mouth=90, throat=90, sing_mode=True)
for pitch in (52, 41, 34, 41, 52):
    singer.play("AHAHAHAHAHAHAHAH", phonetic=True, pitch=pitch)
Save the audio data generated by SamTTS to a wav file:
from samtts import SamTTS
SamTTS().save("Hello. My name is Sam.", "output.wav")
Use SamTTS with asyncio:
import asyncio
from samtts import SamTTS
asyncio.run(SamTTS().async_play("Hello. My name is Sam."))
The core of samtts consists of Reciter, Processor and Renderer.
SamTTS is a combination of the three.
Reciter converts text to phonemes.
Processor and Renderer turn phonemes into audio data stored in a bytearray.
However, they can only process very short inputs.
To work around this,
SamTTS splits the input paragraph at the punctuation marks !,.:;? by default.
This works in most cases, but not always.
You can supply your own function to split the input paragraph.
Make SamTTS read the paragraph word by word:
from samtts import SamTTS
def iter_by_space(paragraph):
    for item in paragraph.split():
        yield item
SamTTS().play(
    "Hello. My name is Sam.",
    iter_segments_from_paragraph=iter_by_space,
)
If you know your input is very short, you do not have to split it at all:
from samtts import SamTTS
def iter_no_split(paragraph):
    yield paragraph
SamTTS().play(
    "Hello. My name is Sam.",
    iter_segments_from_paragraph=iter_no_split,
)
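The two approaches above can also be combined. The following is a minimal sketch, not part of samtts: it splits at the same punctuation marks and then caps each segment at a fixed number of words. The function name and the `max_words` default are assumptions chosen for illustration; the real limit of the core depends on the length of the generated phonemes, not on the word count, so the value may need tuning.

```python
import re


def iter_by_punctuation_and_length(paragraph, max_words=8):
    """Yield segments split at !,.:;? and capped at max_words words each."""
    for sentence in re.split(r"[!,.:;?]", paragraph):
        words = sentence.split()
        # Empty pieces (e.g. after the final period) produce no segments.
        for start in range(0, len(words), max_words):
            yield " ".join(words[start:start + max_words])
```

It can be passed as iter_segments_from_paragraph in the same way as the functions above.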
By default SamTTS saves audio data in wav format.
But you can design your own save function
to save audio data in other formats:
from samtts import SamTTS
# Make sure this function signature is followed.
def save_audio_data_in_other_formats(
    audio_data: bytes | bytearray,
    output_file_path: str,
    num_channels: int = 1,
    bytes_per_sample: int = 1,
    sample_rate: int = 22050,
):
    ...
SamTTS().save(
    "Hello. My name is Sam.",
    "output.ext",
    save_audio_data=save_audio_data_in_other_formats,
)
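As one possible body for such a function, the sketch below writes the audio as a 16-bit PCM wav file using only the standard library. The function name is hypothetical, and the conversion assumes the incoming data is unsigned 8-bit samples centered at 128 (the usual convention for 8-bit wav data), which this README does not state explicitly.

```python
import array
import sys
import wave


def save_as_16bit_wav(
    audio_data,
    output_file_path,
    num_channels=1,
    bytes_per_sample=1,
    sample_rate=22050,
):
    """Save 8-bit audio data as a 16-bit PCM wav file."""
    if bytes_per_sample != 1:
        raise ValueError("This sketch only handles 1 byte per sample.")
    # Assumption: input samples are unsigned 8-bit values centered at 128.
    samples = array.array("h", ((value - 128) * 256 for value in audio_data))
    if sys.byteorder == "big":
        samples.byteswap()  # wav files store samples as little-endian
    with wave.open(output_file_path, "wb") as wav_file:
        wav_file.setnchannels(num_channels)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(samples.tobytes())
```

Because it follows the signature shown above, it could be passed as the save_audio_data argument of save().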
By default SamTTS plays audio with the simpleaudio backend.
If simpleaudio does not work on your platform,
you can design your own play audio function
to play audio with another audio backend:
from samtts import SamTTS
# Make sure this function signature is followed.
def play_audio_data_with_other_backends(
    audio_data: bytes | bytearray,
    num_channels: int = 1,
    bytes_per_sample: int = 1,
    sample_rate: int = 22050,
):
    ...
SamTTS().play(
    "Hello. My name is Sam.",
    play_audio_data=play_audio_data_with_other_backends,
)
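For example, on Windows the standard library module winsound could serve as the backend. This is only a sketch under that assumption: the function names are hypothetical, and winsound expects a complete wav file in memory, so the raw audio data is first wrapped in a wav container with the wave module.

```python
import io
import wave


def to_wav_bytes(audio_data, num_channels=1, bytes_per_sample=1, sample_rate=22050):
    """Wrap raw audio data in a wav container held in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(num_channels)
        wav_file.setsampwidth(bytes_per_sample)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(audio_data))
    return buffer.getvalue()


def play_audio_data_with_winsound(
    audio_data, num_channels=1, bytes_per_sample=1, sample_rate=22050
):
    """Play audio data with the Windows-only winsound module."""
    import winsound  # imported lazily so this file still loads on other platforms

    wav_bytes = to_wav_bytes(audio_data, num_channels, bytes_per_sample, sample_rate)
    # SND_MEMORY plays from a bytes object and returns when playback is done.
    winsound.PlaySound(wav_bytes, winsound.SND_MEMORY)
```

The second function follows the signature shown above, so it could be passed as the play_audio_data argument of play().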
The core of samtts (Reciter, Processor and Renderer)
does not depend on any 3rd party or even built-in libraries.
For finer control, you can use them directly:
import simpleaudio
from samtts import Reciter, Processor, Renderer
reciter = Reciter()
processor = Processor()
renderer = Renderer()
input_text = "Hello. My name is Sam. How are you?"
print(f"{input_text = }")
phonemes = reciter.text_to_phonemes(input_text)
print(f"{phonemes = }")
processor.process(phonemes)
renderer.render(processor)
print(f"{renderer.buffer_end = }")
print(f"The first 100 bytes in the buffer: {renderer.buffer[: 100]}")
play_obj = simpleaudio.play_buffer(
    renderer.buffer[: renderer.buffer_end],
    num_channels=1,
    bytes_per_sample=1,
    sample_rate=22050,
)
# Block until playback finishes instead of busy-waiting in a loop.
play_obj.wait_done()
There are more examples in the examples directory.
Use samtts with the command line interface
To get help information:
python -m samtts
Usage: python -m samtts [OPTIONS] [INPUT_STRING]
A Python port of Software Automatic Mouth Test-To-Speech program.
- If `--phoneme-info` or `--pitch-info` is used, the argument and all the other
options are ignored.
- If `--phonetic` is used, the input must be valid phonemes.
╭─ Arguments ────────────────────────────────────────────────────────────────────╮
│ input_string [INPUT_STRING] Input text or phonemes. │
╰────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ──────────────────────────────────────────────────────────────────────╮
│ --phoneme-info --no-phoneme-info Show phoneme info. │
│ [default: │
│ no-phoneme-info] │
│ --pitch-info --no-pitch-info Show pitch info. │
│ [default: no-pitch-info] │
│ --phonetic --no-phonetic Set phonetic flag. │
│ [default: no-phonetic] │
│ --speed INTEGER Set speed value. │
│ [default: 72] │
│ --pitch INTEGER Set pitch value. │
│ [default: 64] │
│ --mouth INTEGER Set mouth value. │
│ [default: 128] │
│ --throat INTEGER Set throat value. │
│ [default: 128] │
│ --sing --no-sing Set sing mode. │
│ [default: no-sing] │
│ --sample-rate INTEGER Set sample rate 11025 or │
│ 22050. │
│ [default: 22050] │
│ --wav TEXT Set output wav file name │
│ or path. │
│ --debug --no-debug Set debug flag. │
│ [default: no-debug] │
│ --install-completion Install completion for │
│ the current shell. │
│ --show-completion Show completion for the │
│ current shell, to copy │
│ it or customize the │
│ installation. │
│ --help Show this message and │
│ exit. │
╰────────────────────────────────────────────────────────────────────────────────╯
The minimum example:
python -m samtts "Hello. My name is Sam."
To configure its voice:
python -m samtts --speed 92 --pitch 60 --mouth 190 --throat 190 "Hello. My name is Little Robot."
To save to a wav file:
python -m samtts --wav "output.wav" "Hello. My name is Sam."
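The same commands can be assembled from a Python script, for instance to render several files in a batch. The helper names below are hypothetical and not part of samtts; the sketch uses only the options listed in the help output above and assumes samtts is installed for the interpreter running the script.

```python
import subprocess
import sys


def build_samtts_command(
    text, speed=None, pitch=None, mouth=None, throat=None, wav=None
):
    """Build the argument list for `python -m samtts`."""
    command = [sys.executable, "-m", "samtts"]
    voice = {"speed": speed, "pitch": pitch, "mouth": mouth, "throat": throat}
    for name, value in voice.items():
        if value is not None:
            command += [f"--{name}", str(value)]
    if wav is not None:
        command += ["--wav", wav]
    command.append(text)
    return command


def run_samtts(text, **options):
    """Run the samtts command line interface and raise if it fails."""
    return subprocess.run(build_samtts_command(text, **options), check=True)
```

Calling run_samtts("Hello. My name is Sam.", wav="output.wav") would then correspond to the last command above.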
Useful information
Phonemes
Phoneme Information
     VOWELS                            VOICED CONSONANTS
IY           f(ee)t                    R        red
IH           p(i)n                     L        allow
EH           beg                       W        away
AE           Sam                       W        whale
AA           pot                       Y        you
AH           b(u)dget                  M        Sam
AO           t(al)k                    N        man
OH           cone                      NX       so(ng)
UH           book                      B        bad
UX           l(oo)t                    D        dog
ER           bird                      G        again
AX           gall(o)n                  J        judge
IX           dig(i)t                   Z        zoo
                                       ZH       plea(s)ure
   DIPHTHONGS                          V        seven
EY           m(a)de                    DH       (th)en
AY           h(igh)
OY           boy
AW           h(ow)                     UNVOICED CONSONANTS
OW           slow                      S        Sam
UW           crew                      Sh       fish
                                       F        fish
                                       TH       thin
 SPECIAL PHONEMES                      P        poke
UL           sett(le) (=AXL)           T        talk
UM           astron(omy) (=AXM)        K        cake
UN           functi(on) (=AXN)         CH       speech
Q            kitt-en (glottal stop)    /H       a(h)ead
Pitches
Pitch Information
PITCH NOTE | PITCH NOTE | PITCH NOTE
104 C1 | 52 C2 | 26 C3
92 D1 | 46 D2 | 23 D3
82 E1 | 41 E2 | 21 E3
78 F1 | 39 F2 | 19 F3
68 G1 | 34 G2 | 17 G3
62 A1 | 31 A2 |
55 B1 | 28 B2 |
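When writing melodies for sing mode, it may be more convenient to work with note names than with raw pitch values. The lookup below simply transcribes the table above; the names NOTE_TO_PITCH and melody_to_pitches are hypothetical helpers, not part of samtts.

```python
# Pitch values copied from the pitch information table.
NOTE_TO_PITCH = {
    "C1": 104, "D1": 92, "E1": 82, "F1": 78, "G1": 68, "A1": 62, "B1": 55,
    "C2": 52, "D2": 46, "E2": 41, "F2": 39, "G2": 34, "A2": 31, "B2": 28,
    "C3": 26, "D3": 23, "E3": 21, "F3": 19, "G3": 17,
}


def melody_to_pitches(notes):
    """Translate note names such as "C2" into SAM pitch values."""
    return [NOTE_TO_PITCH[note] for note in notes]
```

The pitch sequence in the singing example under Usage, (52, 41, 34, 41, 52), corresponds to the notes C2, E2, G2, E2, C2.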
Characters
DESCRIPTION SPEED PITCH MOUTH THROAT
Elf 72 64 160 110
Little Robot 92 60 190 190
Stuffy Guy 82 72 105 110
Little Old Lady 82 32 145 145
Extra-Terrestrial 100 64 200 150
SAM 72 64 128 128
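These presets can be kept in a dictionary and unpacked into the SamTTS constructor, which accepts speed, pitch, mouth and throat as keyword arguments (as in the Little Robot example under Usage). The names CHARACTERS and character_config are hypothetical; the values are copied from the table above.

```python
# Voice presets copied from the characters table.
CHARACTERS = {
    "Elf": {"speed": 72, "pitch": 64, "mouth": 160, "throat": 110},
    "Little Robot": {"speed": 92, "pitch": 60, "mouth": 190, "throat": 190},
    "Stuffy Guy": {"speed": 82, "pitch": 72, "mouth": 105, "throat": 110},
    "Little Old Lady": {"speed": 82, "pitch": 32, "mouth": 145, "throat": 145},
    "Extra-Terrestrial": {"speed": 100, "pitch": 64, "mouth": 200, "throat": 150},
    "SAM": {"speed": 72, "pitch": 64, "mouth": 128, "throat": 128},
}


def character_config(name):
    """Return a copy of the keyword arguments for the named character."""
    return dict(CHARACTERS[name])
```

With samtts installed, SamTTS(**character_config("Little Robot")) would create the same voice as the robot in the conversation example.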
Limitations
- SAM was developed more than 40 years ago. It was advanced for the 1980s, but its sound quality is not comparable to that of AI-based TTS programs.
- The core of SAM can only process very short inputs. To work around this, long inputs must be split.
- SAM does not pronounce every word correctly. To work around this, phonemes can be used directly. But make sure the phonemes are valid; invalid phonemes raise an exception.
Further development
This project is meant to be a fairly faithful port of the original SAM. It will not improve upon SAM in any manner, like improving the quality of the sound or breaking the limitations of SAM.
Further development of this project is limited to bug fixes. If anyone is interested in improving it, please fork it and start a new project.
About license
According to Stefan Macke and Vidar Hokstad the status of the original software can be best described as Abandonware.
Neither Stefan Macke nor Vidar Hokstad put their projects under any open source software license. As long as this is the case, I cannot put my code under any open source software license either. However, the software might be usable under the "Fair Use" doctrine in the USA.
References
Software Automatic Mouth on Wikipedia