This is Ian Ozsvald's blog (@IanOzsvald), I'm an entrepreneurial geek, a Data Science/ML/NLP/AI consultant, author of O'Reilly's High Performance Python book, co-organiser of PyDataLondon, a Pythonista, co-founder of ShowMeDo and also a Londoner. Here's a little more about me.


13 December 2009 - 17:05

Text to Speech – Festival (cross platform) and MacSpeechX (Python on Mac)

I wanted to play with text to speech, and I’ve been looking for a cross-platform, open-source solution that sounds reasonable.  I’m really impressed with the Festival project; its web demo lets you enter your own text.

Update – I’m including this post in my plans for an Artificial Intelligence Handbook.

Festival is cross-platform, but compiling it on a Mac takes a touch of effort (it looks easier on Linux and Windows).

This article shows you how to use it and how to web-enable it with some PHP.  For the simplest demo I used ‘bin/text2wave input.txt -o output.wav’, with input.txt containing a single sentence.
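As a minimal sketch of that demo, the Python below writes a sentence to input.txt and calls text2wave with the same arguments. The helper names and the festival_dir parameter are hypothetical, and it assumes a compiled bin/text2wave exists under that directory.

```python
import subprocess
from pathlib import Path


def text2wave_command(input_path, output_path, festival_dir="."):
    """Build the demo command line: bin/text2wave input.txt -o output.wav."""
    text2wave = Path(festival_dir) / "bin" / "text2wave"
    return [str(text2wave), str(input_path), "-o", str(output_path)]


def synthesise(sentence, input_path="input.txt", output_path="output.wav",
               festival_dir="."):
    """Write the sentence to a text file and ask Festival to render it as a WAV."""
    Path(input_path).write_text(sentence + "\n")
    # check=True raises CalledProcessError if text2wave exits with an error.
    subprocess.run(text2wave_command(input_path, output_path, festival_dir),
                   check=True)
    return Path(output_path)
```

Calling synthesise("Hello from Festival") from the Festival source directory should leave the audio in output.wav, provided the build succeeded.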

To get started, get the latest code.  I have v1.96beta.  You may also want the official festlang-talk list and possibly this more complete archive.

Compiling speech_tools-1.2.96-beta.tar.gz

It ought to have been as simple as ‘make clean; make’ but there are a few changes to make first.  First we need this fix, or we get a compile error in macosxaudio at kAudioUnitProperty_SetInputCallback:

If you add
#include <AudioUnit/AUNTComponent.h>
after the include block on lines 45-48 in audio/macosxaudio.cc the
problem should be solved.

By the way, remember to change the byte order if you have an intel
mac, i.e. on line 131:
     waveformat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger
         | kLinearPCMFormatFlagIsPacked;
     // For Intel      | kLinearPCMFormatFlagIsPacked;
     // For PowerPC    | kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsBigEndian;
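
To see why the byte-order flag matters, here is a small Python illustration (not part of the speech_tools source). It shows how the same signed 16-bit sample is laid out on a little-endian Intel machine and a big-endian PowerPC machine, and what happens when bytes in one layout are read as the other. The sample value is arbitrary.

```python
import struct
import sys

sample = 1000  # an arbitrary signed 16-bit PCM sample value

little = struct.pack("<h", sample)  # byte layout on Intel (little-endian)
big = struct.pack(">h", sample)     # byte layout on PowerPC (big-endian)

# Reading little-endian bytes as if they were big-endian gives a different
# sample value, which is what a wrong format flag does to every sample.
misread = struct.unpack(">h", little)[0]

print(sys.byteorder)  # byte order of the machine running this script
print(little.hex(), big.hex(), misread)
```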

The following was a trickier error to solve:

g++ -c -fno-implicit-templates -O3 -Wall -I../include sigpr_frame.cc
sigpr_frame.cc: In function ‘void lpc2cep(const EST_FVector&, EST_FVector&)’:
sigpr_frame.cc:318: error: ‘__isnan’ was not declared in this scope
make[1]: *** [sigpr_frame.o] Error 1
make: *** [sigpr] Error 2

The fix was known but the relevant archive was missing.  Some googling for ‘__isnan mac’ turned up this cached 2006 page:

--- ../test/speech_tools/include/EST_math.h     2006-08-03 08:49:35.000000000 -0500
+++ include/EST_math.h  2006-08-17 17:53:33.000000000 -0500
@@ -43,7 +43,7 @@
#if defined(__APPLE__)
/* Not sure why I need this here, but I do */
-extern "C" int isnan(double);
+extern "C" int isnan(float);
#endif
/* this isn't included from c, but just to be safe... */
@@ -101,7 +101,6 @@
/* Apple OSX */
#if defined(__APPLE__)
#define isnanf(X) isnan(X)
-#define isnan(X) __isnan(X)
#endif
/* FreeBSD *and other 4.4 based systems require anything, isnanf is defined */
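
If you would rather not apply the diff by hand, the two changes can be sketched in a few lines of Python. This is only an illustration: apply_isnan_fix is a hypothetical helper that works on the text of the header, and it assumes the two affected lines appear in include/EST_math.h exactly as shown in the diff.

```python
def apply_isnan_fix(header_text):
    """Apply the two changes from the diff to the text of EST_math.h."""
    fixed_lines = []
    for line in header_text.splitlines():
        stripped = line.strip()
        # Second hunk: drop the macro that maps isnan onto the missing __isnan.
        if stripped == "#define isnan(X) __isnan(X)":
            continue
        # First hunk: declare isnan as taking a float rather than a double.
        if stripped == 'extern "C" int isnan(double);':
            line = line.replace("isnan(double)", "isnan(float)")
        fixed_lines.append(line)
    return "\n".join(fixed_lines) + "\n"
```

Read the header into a string, pass it through the function and write the result back, keeping a copy of the original file in case the build still fails.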

Compiling festival-1.96-beta.tar.gz

Once speech_tools is compiled, building ‘festival-1.96-beta.tar.gz’ is as easy as ‘make clean; make’.

Python’s MacSpeechX

I also had a play with the macspeechX module, which ties Python to the Mac’s voice synthesiser.  See list_voice_name() in macspeechX.py for an example of how it all works.

It drives the speech synthesiser, but it doesn’t appear to let you record the speech to a file (unlike Festival above).

Update – Mike Driscoll has a post about pyTTS, which hooks into Microsoft’s SAPI on Windows, and pyTTSX, which is cross-platform, along with some speech recognition links.
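
For a flavour of the pyTTSX route, here is a minimal sketch. It assumes the package is imported as pyttsx and that pyttsx.init() returns an engine object with say() and runAndWait() methods. The speak() helper is hypothetical; it takes the engine as an argument, so it works with any object that offers those two methods.

```python
def speak(engine, text):
    """Queue the text on a pyttsx-style engine and block until it is spoken."""
    engine.say(text)
    engine.runAndWait()


# Usage, assuming the pyttsx package is installed:
#   import pyttsx
#   speak(pyttsx.init(), "Hello from Python")
```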

