Now you can add some texts, lyrics, poetry or Kraftwerk-style “sprechgesang” to your Eurorack set-up!
Add the TextToSpeech click from MikroElektronika to your Emy, insert the SD card with the VAX firmware and enjoy the fun of the Dectalk Speech engine.
Text To Speech click is a mikroBUS™ add-on board that carries an Epson S1V30120 speech synthesis IC. The IC is powered by the Fonix DECtalk® v5 speech synthesis engine that can talk in US English, Castilian Spanish or Latin American Spanish, in one of nine pre-defined voices.
What is Dectalk
Dectalk was a speech synthesizer and text-to-speech technology developed by Digital Equipment Corporation in 1984, based largely on the work of Dennis Klatt at MIT.
The Dectalk Express was connected to the serial port and would simply speak whatever was being “printed”.
The synthesizer can process text and produce speech with 9 different voices.
The Dectalk engine includes a parser that gives users fine control over the quality, pitch, and intonation of the synthesized speech.
Dectalk can also be programmed to play phonemes and sing with quite a realistic expression.
[hxae<300,10>piy<300,10> brr<600,12>th<100>dey<600,10> tuw<600,15> yu<1200,14>_<120>]
[hxae<300,10>piy<300,10> brr<600,12>th<100>dey<600,10> tuw<600,17> yu<1200,15>_<120>]
[hxae<300,20>piy<300,20> brr<600,19>th<100>dey<600,15> tuw<600,17> yu<1200,15>]
The command syntax for coding musical sequences is:
[phoneme<duration, pitch number>]
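Typing these strings by hand gets error-prone quickly, so here is a minimal Python sketch that assembles them from a list of notes. The helper names `note` and `phrase` are made up for this illustration and are not part of the VAX firmware or of DECtalk. The sketch assumes, based on the examples above, that the duration is given in milliseconds, that the pitch is optional (as in `th<100>`), that phonemes within a word are written without spaces, and that `_` stands for a silence.

```python
def note(phoneme, duration_ms, pitch=None):
    """Format one phoneme as phoneme<duration> or phoneme<duration,pitch>."""
    if pitch is None:
        return f"{phoneme}<{duration_ms}>"
    return f"{phoneme}<{duration_ms},{pitch}>"


def phrase(words):
    """Build one bracketed phrase from a list of words.

    Each word is a list of (phoneme, duration_ms[, pitch]) tuples.
    Phonemes inside a word are glued together, words are separated by a space.
    """
    return "[" + " ".join("".join(note(*n) for n in word) for word in words) + "]"


# First line of the song shown above (hypothetical helper, same output text).
line1 = phrase([
    [("hxae", 300, 10), ("piy", 300, 10)],
    [("brr", 600, 12), ("th", 100), ("dey", 600, 10)],
    [("tuw", 600, 15)],
    [("yu", 1200, 14), ("_", 120)],
])
print(line1)
```

The resulting text can then be saved and used like any other text to be spoken.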
You can find a ready-to-use VAX-vox board here:
If you want to do this yourself, follow these instructions:
First, you will need to route the audio signal to Emy’s audio input: just add a short wire from the audio jack to the audio pin.
Unfortunately, the S1V30120 chip does not release the SDO signal after use and thus generates a conflict with the micro SD card, which shares the same line. (I’ve explained this issue to MikroElektronika and I hope they will fix it in the next version.)
To avoid this, you will need to add a tri-state buffer (SN74HC125N) that leaves the SDO pin in a high-impedance state when the chip is not in use.
When the chip is not in use, the CS pin is HIGH and therefore the SDO pin stays in Z (high impedance).
When the chip is in use, CS is LOW and the buffer copies the original SDO* signal to the SDO pin.
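To make the logic of the fix easier to follow, here is a small Python model of it. This is only a sketch of the behaviour, not firmware: it assumes that CS is wired to the active-low output-enable input of one 74HC125 gate, and the names `HIGH_Z`, `buffer_out` and `bus_level` are invented for the example.

```python
HIGH_Z = None  # stands for a floating (high-impedance) output


def buffer_out(cs, sdo_chip):
    """One 74HC125-style gate with its active-low enable driven by CS.

    CS LOW  (0): the chip's SDO* level is copied to the output.
    CS HIGH (1): the output floats, whatever the chip is doing.
    """
    return sdo_chip if cs == 0 else HIGH_Z


def bus_level(drivers):
    """Resolve the shared line from the levels of all connected drivers."""
    active = [d for d in drivers if d is not HIGH_Z]
    if not active:
        return HIGH_Z
    if len(set(active)) > 1:
        raise ValueError("bus conflict: two devices drive different levels")
    return active[0]


# Chip deselected (CS HIGH) but still driving 0, while the SD card sends a 1:
sd_card = 1
print(bus_level([buffer_out(cs=1, sdo_chip=0), sd_card]))  # the SD card wins
# Chip selected (CS LOW), SD card idle and floating:
print(bus_level([buffer_out(cs=0, sdo_chip=1), HIGH_Z]))   # the chip is heard
```

Without the buffer, the first case would be `bus_level([0, 1])`, which raises the conflict error: that is the situation described above.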
I use hot glue to attach the buffer, on its back, on top of the S1V30120.
Note that the trace leading to the SDO pin has to be cut and routed through the buffer before going back to that pin.
Here is how to connect the buffer:
If needed, you can adjust the gain of Emy’s amplification circuit with the trim pot to set the audio output level.
There is a latency of 200 ms between the trigger and the start of the speech. This latency is very consistent, so the speech stays in tempo even if it does not land exactly on the beat.
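If you want the speech to land on a given step, you can work out how far ahead the trigger has to be placed. The following Python sketch only does the arithmetic; the function name `latency_in_steps` and the default of four steps per beat (sixteenth notes) are assumptions made for this example.

```python
LATENCY_MS = 200  # trigger-to-speech delay quoted above


def latency_in_steps(bpm, steps_per_beat=4):
    """Express the fixed latency as a number of sequencer steps."""
    step_ms = 60_000 / bpm / steps_per_beat  # duration of one step in ms
    return LATENCY_MS / step_ms


for bpm in (75, 120):
    print(bpm, "BPM:", latency_in_steps(bpm), "steps")
```

At 75 BPM a sixteenth note lasts 200 ms, so a trigger placed one step early puts the speech on the step. At 120 BPM the latency is 1.6 steps, so the speech falls between steps but always at the same offset.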
The firmware uses the falling edge of the gate to stop the speech and prepare the chip for the next utterance, so stutter-like speech fragments sequenced in a loop still fire in sync with the tempo.
The various voice parameters are read just before the speech is triggered; changes made while the chip is speaking have no effect until the next utterance. It is best to experiment a bit with the knobs to find the desired effect.