I wanted to play with text-to-speech, and I’ve been looking for a cross-platform, open-source solution that sounds reasonable. I’m really impressed with the Festival project; its web demo lets you enter your own text.
Update – I’m including this post in my plans for an Artificial Intelligence Handbook.
Festival is cross-platform, but compiling it on a Mac takes a touch of effort (it looks like it is easier on Linux and Windows).
This article shows you how to use it and how to web-enable it with some PHP. For the simplest demo I used ‘bin/text2wave input.txt -o output.wav’ with input.txt containing a sentence.
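If you would rather drive that simplest demo from a script, here is a minimal Python sketch of the same ‘bin/text2wave input.txt -o output.wav’ call. The function names and the festival_dir argument are my own placeholders for illustration; the only Festival detail it relies on is the command line shown above.

```python
import subprocess
from pathlib import Path


def build_text2wave_command(festival_dir, input_path, output_path):
    """Return the argument list for 'bin/text2wave input.txt -o output.wav'."""
    # festival_dir is assumed to be the directory where Festival was compiled.
    text2wave = Path(festival_dir) / "bin" / "text2wave"
    return [str(text2wave), str(input_path), "-o", str(output_path)]


def text_to_wav(festival_dir, text, input_path="input.txt", output_path="output.wav"):
    """Write text to input_path, then ask text2wave to render it to output_path."""
    Path(input_path).write_text(text + "\n", encoding="utf-8")
    command = build_text2wave_command(festival_dir, input_path, output_path)
    # check=True raises CalledProcessError if text2wave exits with an error.
    subprocess.run(command, check=True)
    return Path(output_path)
```

Calling text_to_wav("festival", "Hello from Festival.") should leave an output.wav in the current directory, assuming the compiled tree lives in a directory called festival.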
To get started, get the latest code. I have v1.96beta. You may also want the official festlang-talk list and possibly this more complete archive.
Compiling speech_tools-1.2.96-beta.tar.gz
It ought to have been as simple as ‘make clean; make’, but there are a few changes to make first. First we need this fix, or we get a compile error in macosxaudio in kAudioUnitProperty_SetInputCallback:
If you add #include <AudioUnit/AUNTComponent.h> after the include block on lines 45-48 in audio/macosxaudio.cc the problem should be solved. By the way, remember to change the byte order if you have an intel mac, i.e. on line 131:
waveformat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
// For Intel
| kLinearPCMFormatFlagIsPacked;
// For PowerPC
| kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsBigEndian;
The following was a trickier error to solve:
g++ -c -fno-implicit-templates -O3 -Wall -I../include sigpr_frame.cc
sigpr_frame.cc: In function ‘void lpc2cep(const EST_FVector&, EST_FVector&)’:
sigpr_frame.cc:318: error: ‘__isnan’ was not declared in this scope
make[1]: *** [sigpr_frame.o] Error 1
make: *** [sigpr] Error 2
The fix was known, but the relevant archive was missing. Some googling for ‘__isnan mac’ turned up this cached 2006 page:
--- ../test/speech_tools/include/EST_math.h 2006-08-03 08:49:35.000000000 -0500
+++ include/EST_math.h 2006-08-17 17:53:33.000000000 -0500
@@ -43,7 +43,7 @@
 #if defined(__APPLE__)
 /* Not sure why I need this here, but I do */
-extern "C" int isnan(double);
+extern "C" int isnan(float);
 #endif
 /* this isn't included from c, but just to be safe... */
@@ -101,7 +101,6 @@
 /* Apple OSX */
 #if defined(__APPLE__)
 #define isnanf(X) isnan(X)
-#define isnan(X) __isnan(X)
 #endif
 /* FreeBSD *and other 4.4 based systems require anything, isnanf is defined */
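If the diff won’t apply cleanly (a copy pasted from a web page rarely does), the same two changes can be made by hand or with a few lines of Python. This is only a sketch: patch_est_math is a name I made up, and it assumes your copy of include/EST_math.h contains exactly the two lines the diff touches.

```python
OLD_DECLARATION = 'extern "C" int isnan(double);'
NEW_DECLARATION = 'extern "C" int isnan(float);'
REMOVED_DEFINE = "#define isnan(X) __isnan(X)"


def patch_est_math(source):
    """Apply both changes from the diff to the text of EST_math.h."""
    patched_lines = []
    for line in source.splitlines():
        if line.strip() == REMOVED_DEFINE:
            continue  # second hunk: drop the __isnan macro
        if line.strip() == OLD_DECLARATION:
            # first hunk: declare isnan as taking a float instead of a double
            line = line.replace(OLD_DECLARATION, NEW_DECLARATION)
        patched_lines.append(line)
    return "\n".join(patched_lines) + "\n"
```

To use it, read include/EST_math.h from the speech_tools directory, pass the text through patch_est_math and write the result back (keep a backup of the original first). Running it twice is harmless, since already-patched text passes through unchanged.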
Compiling festival-1.96-beta.tar.gz
Once speech_tools is compiled, getting ‘festival-1.96-beta.tar.gz’ compiled is as easy as ‘make clean; make’.
Python’s MacSpeechX
I also had a play with the macspeechx module which ties Python to the Mac’s voice-synthesiser. See list_voice_name() in macspeechX.py for an example of how it all works.
It drives the speech synthesiser fine, but it doesn’t appear to let you record the speech to a file (unlike Festival above).
Update – Mike Driscoll has a post about pyTTS which hooks into Microsoft’s SAPI on Windows and pyTTSX which is cross-platform, along with some speech recognition links.
Ian is a Chief Interim Data Scientist via his Mor Consulting. Sign-up for Data Science tutorials in London and to hear about his data science thoughts and jobs. He lives in London, is walked by his high energy Springer Spaniel and is a consumer of fine coffees.