This is Ian Ozsvald's blog (@IanOzsvald), I'm an entrepreneurial geek, a Data Science/ML/NLP/AI consultant, author of O'Reilly's High Performance Python book, co-organiser of PyDataLondon, a Pythonista, co-founder of ShowMeDo and also a Londoner. Here's a little more about me.

13 December 2009 - 17:05

Text to Speech – Festival (cross platform) and MacSpeechX (Python on Mac)

I wanted to play with text to speech and have been looking for a cross-platform, open-source solution that sounds reasonable.  I’m really impressed with the Festival project; its web demo lets you enter your own text.

Update – I’m including this post in my plans for an Artificial Intelligence Handbook.

Festival is cross-platform but compiling it on a Mac takes a touch of effort (it looks like it is easier on Linux and Windows).

This article shows you how to use it and how to web-enable it with some PHP.  For the simplest demo I used ‘bin/text2wave input.txt -o output.wav’, with input.txt containing a sentence.
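The text2wave call can also be driven from Python. What follows is only a minimal sketch: the command line mirrors the one used for the demo, while the function names, the default ‘bin/text2wave’ path and the temporary input file are assumptions made for illustration and are not part of Festival itself.

```python
import subprocess
import tempfile
from pathlib import Path


def build_text2wave_command(input_path, output_path, text2wave="bin/text2wave"):
    """Return the argument list for: text2wave input.txt -o output.wav"""
    return [str(text2wave), str(input_path), "-o", str(output_path)]


def synthesise(text, output_path, text2wave="bin/text2wave"):
    """Write `text` to a temporary file and ask text2wave to render it as a WAV.

    Assumes Festival is already built and `text2wave` points at its script
    (hypothetical default: run from inside the festival directory).
    """
    with tempfile.TemporaryDirectory() as tmp:
        input_path = Path(tmp) / "input.txt"
        input_path.write_text(text + "\n")
        cmd = build_text2wave_command(input_path, output_path, text2wave)
        # check=True raises CalledProcessError if text2wave exits with an error
        subprocess.run(cmd, check=True)
    return Path(output_path)
```

Calling synthesise("Hello from Festival.", "output.wav") from the festival directory should then behave like the one-line demo, provided the build succeeded.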

To get started, get the latest code.  I have v1.96beta.  You may also want the official festlang-talk list and possibly this more complete archive.

Compiling speech_tools-1.2.96-beta.tar.gz

It ought to have been as simple as ‘make clean; make’ but there are a few changes to make first.  First we need this fix, otherwise we get a compile error in macosxaudio.cc involving kAudioUnitProperty_SetInputCallback:

If you add
#include <AudioUnit/AUNTComponent.h>
after the include block on lines 45-48 in audio/macosxaudio.cc the
problem should be solved.

By the way, remember to change the byte order if you have an intel
mac, i.e. on line 131:
     waveformat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger
		 | kLinearPCMFormatFlagIsPacked;
	// For Intel	| kLinearPCMFormatFlagIsPacked;
     // For PowerPC    | kLinearPCMFormatFlagIsPacked |

The following was a trickier error to solve:

g++ -c -fno-implicit-templates -O3 -Wall -I../include sigpr_frame.cc
sigpr_frame.cc: In function
‘void lpc2cep(const EST_FVector&, EST_FVector&)’:
sigpr_frame.cc:318: error: ‘__isnan’ was not declared in this scope
make[1]: *** [sigpr_frame.o] Error 1
make: *** [sigpr] Error 2

The fix was known but the relevant archive was missing; some googling for ‘__isnan mac’ turns up this cached 2006 page:

--- ../test/speech_tools/include/EST_math.h     2006-08-03 08:49:35.000000000 -0500
+++ include/EST_math.h  2006-08-17 17:53:33.000000000 -0500
@@ -43,7 +43,7 @@
#if defined(__APPLE__)
/* Not sure why I need this here, but I do */
-extern "C" int isnan(double);
+extern "C" int isnan(float);
/* this isn't included from c, but just to be safe... */
@@ -101,7 +101,6 @@
/* Apple OSX */
#if defined(__APPLE__)
#define isnanf(X) isnan(X)
-#define isnan(X) __isnan(X)
/* FreeBSD *and other 4.4 based systems require anything, isnanf is defined */

Compiling festival-1.96-beta.tar.gz

Once speech_tools is compiled, building ‘festival-1.96-beta.tar.gz’ is as easy as ‘make clean; make’.
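The build order matters: the speech tools have to be compiled before Festival. As a rough sketch, the two ‘make clean; make’ runs could be scripted like this. The directory names ‘speech_tools’ and ‘festival’ are assumptions about where the tarballs were unpacked, and the helper names are hypothetical.

```python
import subprocess


def build_commands(source_dirs=("speech_tools", "festival")):
    """Return (directory, command) pairs in the order they should be run."""
    steps = []
    for directory in source_dirs:
        steps.append((directory, ["make", "clean"]))
        steps.append((directory, ["make"]))
    return steps


def build_all(source_dirs=("speech_tools", "festival")):
    """Run 'make clean' then 'make' in each directory, stopping on the first failure."""
    for directory, cmd in build_commands(source_dirs):
        # cwd runs make inside the unpacked source directory (assumed location)
        subprocess.run(cmd, cwd=directory, check=True)
```

Because check=True stops at the first failing step, a compile error in the speech tools is reported before the Festival build is attempted.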

Python’s MacSpeechX

I also had a play with the macspeechX module, which ties Python to the Mac’s voice synthesiser.  See list_voice_name() in macspeechX.py for an example of how it all works.

It drives the speech synthesiser but it doesn’t appear to let you record the speech to a file (unlike Festival above).

Update – Mike Driscoll has a post about pyTTS which hooks into Microsoft’s SAPI on Windows and pyTTSX which is cross-platform, along with some speech recognition links.

Tags: ArtificialIntelligence, Python