I wanted to play with text to speech and have been looking for a cross-platform, open-source solution that sounds reasonable. I’m really impressed with the Festival project; the web demo lets you enter your own text.
Update – I’m including this post in my plans for an Artificial Intelligence Handbook.
Festival is cross-platform, but compiling it on a Mac takes a touch of effort (it looks easier on Linux and Windows).
This article shows you how to use it and how to web-enable it with some PHP. For the simplest demo I used ‘bin/text2wave input.txt -o output.wav’, with input.txt containing a sentence.
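If you would rather drive that simple demo from a script, something like the following Python sketch might do. The only part taken from this post is the command line itself (text2wave, an input file, -o, an output file); the function names, the default paths and the UTF-8 encoding are my own assumptions for illustration, and it assumes you run it from the Festival directory so that bin/text2wave resolves.

```python
import subprocess
from pathlib import Path


def build_text2wave_command(input_path, output_path, text2wave="bin/text2wave"):
    """Return the argument list for: text2wave INPUT -o OUTPUT.

    Hypothetical helper; it only mirrors the command shown in the post.
    """
    return [str(text2wave), str(input_path), "-o", str(output_path)]


def synthesize(text, output_path="output.wav", input_path="input.txt",
               text2wave="bin/text2wave"):
    """Write `text` to a file and ask text2wave to render it as a WAV file."""
    # Assumption: plain ASCII text is the safest input; UTF-8 is used here
    # only because it is identical to ASCII for such text.
    Path(input_path).write_text(text + "\n", encoding="utf-8")
    command = build_text2wave_command(input_path, output_path, text2wave)
    # check=True raises CalledProcessError if text2wave exits with an error.
    subprocess.run(command, check=True)
    return Path(output_path)


if __name__ == "__main__":
    # Only attempt synthesis when the text2wave script is actually present.
    if Path("bin/text2wave").exists():
        wav = synthesize("Hello from Festival.")
        print("Wrote", wav)
    else:
        print("bin/text2wave not found; run this from the Festival directory.")
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems if a file name contains spaces, which matters once the text comes from a web form.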
It ought to have been as simple as ‘make clean; make’ but there are a few changes to make first. First we need this fix, or we get a compile error in macosxaudio at kAudioUnitProperty_SetInputCallback:
If you add #include <AudioUnit/AUNTComponent.h> after the include block on lines 45-48 in audio/macosxaudio.cc the problem should be solved. By the way, remember to change the byte order if you have an Intel Mac, i.e. on line 131:

    // For Intel
    waveformat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
    // For PowerPC
    waveformat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsBigEndian;
The following was a trickier error to solve:
    g++ -c -fno-implicit-templates -O3 -Wall -I../include sigpr_frame.cc
    sigpr_frame.cc: In function ‘void lpc2cep(const EST_FVector&, EST_FVector&)’:
    sigpr_frame.cc:318: error: ‘__isnan’ was not declared in this scope
    make: *** [sigpr_frame.o] Error 1
    make: *** [sigpr] Error 2
The fix was known but the relevant archive was missing; some googling for ‘__isnan mac’ turned up this cached 2006 page:
    --- ../test/speech_tools/include/EST_math.h 2006-08-03 08:49:35.000000000 -0500
    +++ include/EST_math.h 2006-08-17 17:53:33.000000000 -0500
    @@ -43,7 +43,7 @@
     #if defined(__APPLE__)
     /* Not sure why I need this here, but I do */
    -extern "C" int isnan(double);
    +extern "C" int isnan(float);
     #endif
     /* this isn't included from c, but just to be safe... */
    @@ -101,7 +101,6 @@
     /* Apple OSX */
     #if defined(__APPLE__)
     #define isnanf(X) isnan(X)
    -#define isnan(X) __isnan(X)
     #endif
     /* FreeBSD *and other 4.4 based systems require anything, isnanf is defined */
Once speech-tools is compiled, getting ‘festival-1.96-beta.tar.gz’ compiled is as easy as ‘make clean;make’.
It works to power the speech synthesiser but it doesn’t appear to let you record the speech to a file (unlike festival above).
Ian applies Data Science as an AI/Data Scientist for companies in ModelInsight and in his Mor Consulting, sign-up for Data Science tutorials in London. He also founded the image and text annotation API Annotate.io, lives in London and is a consumer of fine coffees.