About

This is Ian Ozsvald's blog. I'm an entrepreneurial geek, an A.I. consultant, author of the A.I.Cookbook, professional screencast producer, author of The Screencasting Handbook, a Pythonista, co-founder of ShowMeDo and FivePoundApps and also a Brightonian. Here's a little more about me.

17 July 2010 - 23:17
EuroPython 2010

I’m hugely looking forward to EuroPython in Birmingham from Monday. I’m driving up Monday very early (I wish I’d booked the hotel room for Sunday night too…). Browsing through the abstracts I’d say all the following look darned interesting!

  • C++ integration
  • concurrent sequential processes
  • Arduino hacking
  • javascript
  • OpenData
  • aerodynamics
  • PyPy and Unladen Swallow
  • game programming
  • OpenGL
  • Pyjamas
  • idiomatic Python
  • MediaCore
  • Twisted and gevent
  • science and maths
  • SHOGUN machine learning

I’ll bring Headroid along and I hope to organise a Birds of a Feather session on Artificial Intelligence and robotics. If you’re interested in these topics, I’d love to say hi!


Ian applies Artificial Intelligence for companies (Mor Consulting), programs Python, produces professional screencasts (ProCasts), writes The Screencasting Handbook and is also a sea-side dweller and consumer of fine coffees.

No Comments | Tags: ArtificialIntelligence, Python

14 July 2010 - 12:52
22,937* faster Python math using pyCUDA

I’ve just uploaded a new mandelbrot.py demo for pyCUDA. It adds a new calculation routine that sits between the numpy (C-based math) and the pure-CUDA implementations, giving 4 variants to choose from in total. The speed differences are huge!

Update – this Reddit thread has more details including real-world timings for two client problems (showing 10-3,677* speed-ups over a C task).

This post builds upon my earlier pyCUDA on Windows and Mac for super-fast Python math using CUDA.

You’ll need CUDA 3.1 and pyCUDA installed with a compatible NVIDIA graphics card. This version of the Mandelbrot code forces single precision math – this means it’ll work on all CUDA cards (even the older ones – full list). It runs on my MacBook (Leopard) and Windows, the Windows machines use a 9800 GT and GTX 480. Here’s what it generates:

The big-beast graphics card for my physics client is a GTX 480 – this is NVIDIA’s top of the line consumer card (costing £420GBP in the UK a few weeks back). It is huge – it covers two slots, uses one PCIe 2.0×16 slot and has a requirement for 300-400W of power (I’m using a 750W PSU to be safe on a Gigabyte GA H55M S2H motherboard):

The mandelbrot.py demo has four options (e.g. ‘python mandelbrot.py gpu’):

  • ‘gpu’ is a pure CUDA solution on the GPU
  • ‘gpuarray’ uses a numpy-like CUDA wrapper in Python on the GPU
  • ‘numpy’ is a pure Numpy (C-based) solution on the CPU
  • ‘python’ is a pure Python solution on the CPU with numpy arrays

The default problem is a 1000*1000 Mandelbrot plot with 1000 max iterations. I’m running this on a 2.9GHz dual core Windows XP SP3 with Python 2.6 (only 1 thread is used for all CPU tests). The timings:

  • ‘gpu’ – 0.07 seconds
  • ‘gpuarray’ – 3.45 seconds – 49* slower than GPU version
  • ‘numpy’ – 43.4 seconds – 620* slower than GPU version
  • ‘python’ – 1605.6 seconds – 22,937* slower than GPU version
  • ‘python’ with psyco.full() – 1428.3 seconds – 20,404* slower than GPU version

By default mandelbrot.py forces single precision for all the math. Interestingly on my box if I let numpy default to numpy.complex128 (two double precision floating point numbers rather than numpy.complex64 with two single precision floats) then the Python result is faster:

  • ‘numpy’ – 34.0 seconds (double precision)
  • ‘python’ – 627 seconds (double precision) – 2.5* faster than the single precision version

The ’22,937*’ figure is a little unfair in light of the 627 second result (which is 8,957* slower) but I wanted to use only single precision math for consistency and compatibility across all CUDA cards (the older cards can only do single precision math).
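The ‘N* slower’ figures above are simply each runtime divided by the 0.07 second ‘gpu’ runtime. Here is a small sketch that reproduces them from the timings quoted in this post (the names `timings` and `slowdown` are mine, for illustration only, and are not part of mandelbrot.py):

```python
# Runtimes in seconds as quoted above for the GTX 480 machine.
# Names here are illustrative and do not come from mandelbrot.py.
GPU_SECONDS = 0.07

timings = {
    "gpuarray": 3.45,
    "numpy": 43.4,
    "python": 1605.6,
    "python + psyco": 1428.3,
    "python (double precision)": 627.0,
}


def slowdown(seconds, baseline=GPU_SECONDS):
    """Return how many times slower a run is than the baseline run."""
    return seconds / baseline


for name, seconds in timings.items():
    print("%-28s %8.1f s  %8.0f* slower than gpu" % (name, seconds, slowdown(seconds)))
```

Swapping in the 9800 GT timings (with 1.5 seconds as the baseline) gives the figures for the older machine in the same way.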

On my older dual core 2.66GHz machine with a 9800 GT I get:

  • ‘gpu’ – 1.5 seconds
  • ‘gpuarray’ – 7.1 seconds – 4.7* slower than GPU version
  • ‘numpy’ – 51 seconds – 34* slower than GPU version
  • ‘python’ – 1994.3 seconds – 1,329* slower than GPU version

If we compare the 0.07 seconds for the GTX 480 against the 1.5 seconds for the 9800 GT (albeit on different machines but the runtime is just measuring the GPU work) then the GTX 480 is 21* faster than the 9800 GT. That’s not a bad speed-up for a couple of years difference in architectures.

If you take a look at the source code you’ll see that the ‘gpu’ option uses a lump of C-like CUDA code. Behind the scenes all pyCUDA code is converted into this C-like code and then down to PTX via NVIDIA’s compiler. This is the way to go if you understand the memory model and you want to write very fast code.

The gpuarray option uses a numpy-like interface to pyCUDA which, behind the scenes, is converted into CUDA code. Because it is compiled from Python code the resulting CUDA code isn’t as efficient – the compiler can’t make the same assumptions about memory usage as I can make when hand-crafting CUDA code (at least – that’s my best understanding at present!).

The numpy version uses C-based math running on the CPU – generally it is regarded as being ‘pretty darned fast’. The python version uses numpy arrays with straight Python arithmetic, this makes it awfully slow. Psyco 2.0.0 makes it a bit faster.
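To make the difference between the ‘python’ and ‘numpy’ styles concrete, here is a minimal sketch of the same escape-time calculation written both ways. This is not the code from mandelbrot.py: the function names are my own, it works on a flat list of points rather than a 1000*1000 grid, and it uses numpy.complex128 rather than the forced single precision described above.

```python
import numpy as np


def mandelbrot_python(c_values, max_iter):
    """Escape-time count per point using plain Python arithmetic, one point at a time."""
    counts = []
    for c in c_values:
        z = 0j
        n = 0
        while n < max_iter and abs(z) <= 2.0:
            z = z * z + c
            n += 1
        counts.append(n)
    return counts


def mandelbrot_numpy(c_values, max_iter):
    """Same calculation, but each iteration updates every still-active point at once."""
    c = np.asarray(c_values, dtype=np.complex128)
    z = np.zeros_like(c)
    counts = np.zeros(c.shape, dtype=np.int32)
    for _ in range(max_iter):
        active = np.abs(z) <= 2.0  # points that have not escaped yet
        z[active] = z[active] * z[active] + c[active]
        counts[active] += 1
    return counts


points = [0j, 1 + 0j, -1 + 0j, 2 + 0j, 1j]
print(mandelbrot_python(points, 50))
print(mandelbrot_numpy(points, 50).tolist())
```

The pure Python version pays interpreter overhead on every multiply and add, whereas the numpy version runs each iteration’s arithmetic over the whole array in C.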

Feedback and extensions are welcomed via the wiki!

If you want to get started then make sure you have a compatible CUDA card, get pyCUDA (installation instructions), compile pyCUDA (takes 30 minutes from scratch if you’re on a well-known system), try the examples and run mandelbrot.py. The mailing list is helpful.

It’d be nice to see some comparisons with PyPy, ShedSkin and other Python implementations. You’ll find links in my older ShedSkin post. It’ll also be interesting to tie this in to some of the A.I. projects in the A.I. Cookbook, I’ll have to ponder some of the problems that might be tackled.

Books:

The following two books will be useful if you’re new to CUDA. The first is very friendly, I’m still finding it very useful.


3 Comments | Tags: ArtificialIntelligence, Life, Python

11 July 2010 - 14:17
Presenting A.I. at FlashBrighton (using Python!)

A couple of weeks back I presented an Artificial Intelligence evening at FlashBrighton with John Montgomery and Emily Toop. The night covered optical character recognition, face detection, robots and some futurology. A video link should follow.

Optical Character Recognition to Read Plaques

Recently I’ve been playing with OCR to read photos with text, a particular example I care about is extracting the text from English Heritage Plaques for the OpenPlaques project:

I gave an overview of the tesseract open source OCR tool (originally created by HP), drawing partly on this tesseract OSCON paper. Some notes:

  • tesseract ranked highly in international competitions for scanned-image text extraction
  • it works better if you remove non-text regions (e.g. you isolate just the blue plaque in the above image) and threshold the image to a grey scale
  • it runs very quickly – it’ll extract text in a fraction of a second so it will run on a mobile phone (iPhone ports exist)

To get people thinking about the task from the computer’s point of view I had everyone read out the text from this blurry photo. Treating the image as a computer would see it shows that you need several passes to learn which country is involved and to guess at some of the terms:

You can guess that the domain is music/theatre (which helps you to specialise the dictionary you’re using), based in the US (so you know that 1.25 is $1.25USD) and even though the time is hard to read it is bound to be 7.30PM (rather than 7.32 or 7.37) because events normally start on the hour or half hour. General knowledge about the domain greatly increases the chance that OCR can extract the correct text.
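The ‘events start on the hour or half hour’ rule is easy to express in code. The following is a minimal sketch of that one piece of domain knowledge, assuming the OCR step has already produced a string such as ‘7.32’; the function names are hypothetical and this is not taken from the plaque-transcriber code.

```python
import re


def parse_time(text):
    """Parse an OCR'd time such as '7.32' or '7:32' into (hour, minute), else None."""
    match = re.match(r"^\s*(\d{1,2})[.:](\d{2})\s*$", text)
    if match is None:
        return None
    hour, minute = int(match.group(1)), int(match.group(2))
    if hour > 23 or minute > 59:
        return None
    return hour, minute


def snap_to_half_hour(hour, minute):
    """Snap a time to the nearest hour or half hour (ties round upwards)."""
    total = hour * 60 + minute
    snapped = ((total + 15) // 30) * 30
    snapped %= 24 * 60  # wrap around midnight
    return snapped // 60, snapped % 60


for reading in ["7.30", "7.32", "7.37"]:
    print(reading, "->", snap_to_half_hour(*parse_time(reading)))
```

All three uncertain readings collapse to the same plausible start time, which is the effect described above.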

I talked about the forthcoming competition to write a Plaque-transcriber system, that project is close to starting and you can see demo Python source code in the AI Cookbook.

Optical Character Recognition Web Service and Translator iPhone Demo

To help make OCR a bit easier to use I’ve set up a simple website: http://ocr.aicookbook.com/. You call a URL with an image that’s on the web (I use flickr for my examples) and it returns a JSON string with the translated text. The website is a few lines of Python code created using the fabulous bottle.py.
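A client for a service like this might look roughly as follows. The exact URL scheme and JSON field names aren’t documented in this post, so the `url` query parameter and the `text` key below are assumptions made purely for illustration, and the sample response is made up rather than real output. The sketch doesn’t touch the network.

```python
import json
from urllib.parse import urlencode

SERVICE_URL = "http://ocr.aicookbook.com/"


def build_request_url(image_url, service_url=SERVICE_URL):
    """Build the URL to call for a given image that is already on the web."""
    # ASSUMPTION: the image location is passed as a "url" query parameter.
    return service_url + "?" + urlencode({"url": image_url})


def extract_text(json_string):
    """Pull the recognised text out of the JSON reply."""
    # ASSUMPTION: the recognised text lives under a "text" key.
    return json.loads(json_string).get("text", "")


sample_response = '{"text": "EXAMPLE PLAQUE TEXT"}'  # made-up sample, not real output
print(build_request_url("http://example.com/plaque.jpg"))
print(extract_text(sample_response))
```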

The JSON also contains a French translation and mp3 links for text to speech, this shows how easy it is to make a visual-assist device for the hard of sight.

Emily built an iPhone demo based on this web service – you take a photograph of some text, it uploads the photo to flickr, retrieves the JSON and then plays the mp3s and shows you the translated text.

OCR on videos

The final OCR demo shows a proof of concept that extracts keywords from ShowMeDo’s screencast videos. The screencasts show programming in action – it is easy to extract frames, perform OCR and build up strong lists of keywords. These keywords can then be added back to the ShowMeDo video page to give Google more indexable content.
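The keyword-building step can be sketched in a few lines. This assumes the frames have already been extracted and run through OCR, so the input is just a list of strings; the stopword list, the function name and the sample frame text are all invented for the example and are not the proof of concept’s actual code.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "and", "for", "in", "of", "to", "a", "is"}


def keywords_from_frames(frame_texts, top_n=5, min_length=3):
    """Count identifier-like words across OCR'd frames and return the most common."""
    counts = Counter()
    for text in frame_texts:
        words = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text.lower())
        counts.update(
            word for word in words
            if len(word) >= min_length and word not in STOPWORDS
        )
    return [word for word, _ in counts.most_common(top_n)]


frames = [
    "import numpy as np",
    "numpy array of floats",
    "print numpy.zeros(10)",
]
print(keywords_from_frames(frames, top_n=3))
```

Counting across many frames helps here: a word that OCR misreads in one frame is outvoted by the frames where it is read correctly.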

There’s a write-up of the early system here.

OCR futurology

Text is all around us and mobile phones are everywhere. It strikes me that sooner or later we’ll be pointing our mobile phone at a poster like this and we’ll get extra information in return:

From the photo we can extract names of places, we also know the phone’s location so a WikiPedia geo-lookup will return relevant pages. Probably we can also extract dates and costs from posters and these can go into our calendar. I used tesseract on this image and extracted enough information to link to several WikiPedia pages with history and a map.

Face Detection for Privacy Invasion

John and I built a system for correlating gowalla check-ins with faces seen in images from the SkiffCam – the webcam that’s hosted in the Skiff co-working space. The goal was to show that we lose quite a lot of privacy without realising it – the SkiffCam has 29,000 images (1GB of data) dating back over several years.

Using openCV’s face detection system I extracted thousands of faces. John retrieved all the gowalla check-ins based at the Skiff and built a web service that lets us correlate the faces with check-ins. We showed faces for many well-known Brightoners including Seb, Niqui, Paulo, Jon & Anna and Nat.
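One simple way to line check-ins up with webcam images is by timestamp. The sketch below assumes each image and each check-in carries a time, and treats any image captured within a few minutes of a check-in as a candidate; this matching rule, the 10 minute window and the sample times are my own assumptions and not a description of how John’s web service actually works.

```python
from datetime import datetime, timedelta


def images_near_checkin(checkin_time, image_times, window_minutes=10):
    """Return the image timestamps that fall within the window around a check-in."""
    window = timedelta(minutes=window_minutes)
    return [t for t in image_times if abs(t - checkin_time) <= window]


# Made-up example data for illustration only.
checkin = datetime(2010, 6, 1, 9, 30)
images = [
    datetime(2010, 6, 1, 9, 0),
    datetime(2010, 6, 1, 9, 25),
    datetime(2010, 6, 1, 9, 38),
    datetime(2010, 6, 1, 11, 0),
]
for match in images_near_checkin(checkin, images):
    print(match.isoformat())
```

Faces detected in the matching images then become candidates for the person who checked in.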

Given a person’s face we could then train a face recogniser to see other occurrences of that person at the Skiff even if they’re not checking in with gowalla. We can also mine their twitter accounts for other identifying data like blogs and build a profile of where they go, who they know and what they talk about. This feels pretty invasive – all with open source tools and public data.

Emotion detection

Building on the face detector I next demonstrated the FaceL face labeling project from Colorado State Uni, built on pyVision. The tool works out of the box on a Mac – it can learn several faces or poses during a live demo. Most face recognisers only label the name of the person – the difference with FaceL is that it can recognise basic emotional states such as ‘happy’, ‘neutral’ and ‘sad’. This makes it really easy to work towards an emotion-detecting user interface.

During my demo I showed FaceL correctly recognising ‘happy’ and ‘sad’ on my face, then ‘left’ and ‘right’ head poses, then ‘up’ and ‘down’ poses. I suspect with the up/down poses that it is really easy to build a nod-detecting interface!

Headroid2 – a Face Tracking Robot

Finally I demo’d Headroid2 – my face tracking robot (using the same openCV module as above) that uses an Arduino, a servo board, pySerial and a few lines of code to give the robot the ability to track faces, smile and frown:

Here’s a video of the earlier version (without the smiling face feedback):

For full details including build instructions see building a face tracking robot.

EuroPython

I’ll bring Headroid3 (this adds face-seeking behaviour) to EuroPython in a few weeks, hopefully I can find a few other A.I. folk and we can run some demos.

Reading material:

If you’re curious about A.I. then the following books will interest you:


No Comments | Tags: ArtificialIntelligence, Life, Python

18 June 2010 - 15:30
Talking on Artificial Intelligence next Tuesday at FlashBrighton

I’ve been invited to speak with John Montgomery next Tuesday at FlashBrighton – 7pm at The Werks for 1.5-2 hours or so of demos. We’ll be covering:

  • Head tracking robot (build your own in a few hours!)
  • Skiff Privacy Invasion – what we can learn from data mining the SkiffCam (the Gov’t can do it – now you can too)
  • Optical Character Recognition web service with an iPhone visual-assistant demo
  • Automatic transcription of OpenPlaques images (because Google can’t read images!)
  • Extracting text from videos to feed Google (because Google can’t read videos!)
  • Face detection proof of concept web service

Which, frankly, is quite a lot to cover in 1.5 hours and a couple of the demos still need some development…but that’s part of the fun, right? The demos are mostly in Python and will be written up on the A.I. Cookbook. The goal is to show non-A.I. programmers that a lot of A.I. is pretty accessible now via good open-source libraries.

Richard has given me a lovely Victorian-researcher inspired write-up, it is worth a proper read:

I have spoken this night with Sir Seb Lee-Delisle, the gentleman who runs the FlashBrighton club, an institution of long standing repute. He expressed great delight with my research into Artificial Intelligence, which he assuryes me he has been following with the greatest assiduity, and kindly invited me to present my findings at his club. I did of course accept, and have spent the remaynder of the day deliberating over how I might present these goode labours. I have settled on involving my £5 app collaborator Mr. John Montgomery, with whom I have been engaged on a number of projects for some little time now. …

[keep reading]

We’ll hope to see you along!


No Comments | Tags: ArtificialIntelligence, Programming, Python, sussexdigital

21 May 2010 - 14:44
Headroid1 – a face tracking robot head

The video below introduces Headroid1. This face-tracking robot will grow into a larger system that can follow people’s faces, detect emotions and react to engage with the visitor.

The above system uses openCV’s face detection (using the Python bindings and facedetect.py) to figure out whether the face is in the centre of the screen. If the camera needs to move, it then talks via pySerial to BotBuilder’s ServoBoard to pan or tilt the camera until the face is back in the centre of the screen.

Update – see Building A Face Tracking Robot In An Afternoon for full details to build your own Headroid1.

Headroid is pretty good at tracking faces as long as there’s no glare, he can see people from 1 foot up to about 8 feet from the camera. He moves at different speeds depending on your distance from the centre of the screen and stops with a stable picture when you’re back at the centre of his attention. The smile/frown detector which will follow will add another layer of behaviour.

Heather (founder of Silicon Beach Training) used Headroid1 (called Robocam in her video) at Likemind coffee this morning, she’s written up the event:

Andy White (@doctorpod) also did a quick 2 minute MP3 interview with me via audioboo.

Later over coffee Danny Hope and I discussed (with Headroid looking on) some ideas for tracking people, watching for attention, monitoring for frustration and concentration and generally playing with ways people might interact with this little chap:

The above was built in collaboration with BuildBrighton, there’s some discussion about it in this thread. The camera is a Philips SPC900NC which works using macam on my Mac (and runs on Linux and Win too). The ServoBoard has a super-simple interface – you send it commands like ’90a’ (turn servo A to 90 degrees) as text and ‘it just works’ – it makes interactive testing a doddle.
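The tracking behaviour (move faster the further the face is from the centre, stop once it is centred) together with the ’90a’ style of command can be sketched as below. This is not Headroid’s actual code: the gain, the dead zone, the 0 to 180 degree range and the sign of the correction (which depends on how the camera is mounted) are all assumptions, and the function names are hypothetical.

```python
def servo_command(angle, servo="a"):
    """Format a ServoBoard-style text command such as '90a'."""
    # ASSUMPTION: the servo accepts whole degrees in the range 0 to 180.
    angle = max(0, min(180, int(round(angle))))
    return "%d%s" % (angle, servo)


def next_pan_angle(current_angle, face_x, frame_width, gain=0.05, dead_zone=20):
    """Step the pan angle towards the face; bigger offsets give bigger steps."""
    offset = face_x - frame_width / 2.0  # pixels from the centre of the frame
    if abs(offset) <= dead_zone:
        return current_angle  # close enough to centre, hold still
    # ASSUMPTION: a face to the right of centre needs a smaller angle.
    return current_angle - gain * offset


angle = 90
for face_x in [300, 260, 200, 165]:  # made-up face positions in a 320 pixel wide frame
    angle = next_pan_angle(angle, face_x, frame_width=320)
    print(servo_command(angle, "a"))
```

Each resulting string would then be written to the serial port with pySerial, and a second servo letter would handle tilt in the same way.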

Update – the blog for the A.I. Cookbook is now active, more A.I. and robot updates will occur there.

Reference material:

The following should help you move forwards:


7 Comments | Tags: ArtificialIntelligence, Python

10 May 2010 - 18:59
“Artificial Intelligence in the Real World” lecture at Sussex University 2010

I’m chuffed to have delivered the second version of my “A.I. in the real world” lecture (I gave it last May too) to 2nd year undergraduates at Sussex University this afternoon.

The slides are below, I cover:

  • A.I. that I’ve seen and have been involved with in the last 10 years
  • Some project ideas for undergraduates
  • How to start a new tech business/project in A.I.

In the talk I also showed or talked about:

Artificial Intelligence in the Real World May 2010 Sussex University Guest Lecture

Here’s the YouTube video showing the Grand Challenge entries:

Update – the blog for the A.I. Cookbook is now active, more A.I. and robot updates will occur there.


No Comments | Tags: ArtificialIntelligence, Python, ShowMeDo, SussexUniversity, projectbrightonblogs, sussexdigital, £5 App Meet

4 April 2010 - 19:39
New book/wiki – a practical artificial intelligence ‘cookbook’

Having almost completed The Screencasting Handbook I’m now thinking about my next project. I’ve been involved in the field of artificial intelligence since my first computer (a Commodore 64 back in the 80s) and I’ve continued to be paid to work in this area since the end of the 90s.

Update – as mentioned below the new project has started – read more at the A.I. Cookbook blog.

My goal now is to write a collaborative book (probably using a wiki) that takes a very practical look at the use of artificial intelligence in web-apps and desktop software. The big goal would be to teach you how to effectively use A.I. techniques in your job and for your own research. Here’s a few of the topics that could be covered:

  • Using open source and commercial tools for face, object and speech recognition
  • Playing with open source and commercial text to speech tools (e.g. the open source festival)
  • Automated control of driving and flight simulators with artificial brains
  • Building chatbot systems using tools like AIML, CHAT-L and natural language parsing kits
  • Using natural language parsing to add some smarts to apps – maybe for reading and identifying interesting people in Twitter and on blogs
  • Building useful demos around techniques like neural networks and evolutionary optimisation
  • Adding brains to real robots with some Arduinos and open source robot kits
  • Teaching myself machine learning and pattern matching (an area I’m weak on) along with useful libraries like Bayesian classification (Python’s reverend is great for this)
  • Parallel computation engines like Amazon’s EC2, libcloud and GPU programming with CUDA and OpenCL
  • Using Python and C++ for prototyping (along with Matlab and some other relevant languages)
  • and a whole bunch of other stuff – your input is very welcome

I’ve noticed that there are an awful lot of open source (and commercial) toolkits but very few practical guides to using them in your own software. What I want to encourage are some fun projects that’ll run for a month or two, here are some ideas:

  • Using optical character recognition engines to augment projects like OpenPlaques.org with free meta data from real-world photos (for a start see my Tesseract OCR post)
  • Collaborating in real-world competitions like the Simulated Car Racing Competition 2010: Demolition Derby (they’re running a simulated project that’s not unlike the DARPA Grand Challenge)
  • Applying face recognition algorithms to flickr photos so we can track who is posting images of us for identity management
  • Creating a Twitter bot that responds to questions and maybe can have a chat (checking the weather should be easy, some memory could be useful – using Twitter as an interface to tools like OCR for plaques might be fun too) – I have one of these in development right now
  • Build a Zork-solving bot (using NLP and tools like ConceptNet) that can play interactive fiction, build maps and try to solve puzzles
  • Using evolutionary optimisation techniques like genetic algorithms on the traveling salesman problem
  • Building Braitenberg-like brains for open source robot kits (like those by Steve at BotBuilder)
  • Create a QR code and Bar Code reader, tied to a camera

LinkedIn has my history – here’s my work site (please forgive it being a little…simple) Mor Consulting Ltd, I’m the AI Consultant for Qtara.com and I used to be the Senior Programmer for the UK R&D arm of MasaGroup.net/BlueKaizen.com.

I don’t have a definite timeline for the book, I’ll be making that up with you and everyone else once I’ve finished The Screencasting Handbook (end of April).

The Artificial Intelligence Cookbook project has started – the blog is currently active (along with the @aicookbook Twitter account). There is a mailing list to join for occasional updates – email AICookbook@Aweber.com to join.

It will be a commercial project and I will be looking to make it very relevant to however you’re using AI. Sign-up and you’ll get some notifications from me as the project develops.


2 Comments | Tags: ArtificialIntelligence, Programming, Python

10 February 2010 - 4:11
Fix for ConceptNet error “Settings cannot be imported, because environment variable DJANGO_SETTINGS_MODULE is undefined”

If you’re using ConceptNet and you see:

ImportError: Settings cannot be imported, because environment variable
DJANGO_SETTINGS_MODULE is undefined.

then the fix is simple (I’ve been hacking away at an idea whilst at IUI2010 – thanks Rob for the fix).

To replicate the error run:

from csc.nl import get_nl
en_nl = get_nl('en')
en_nl.is_stopword('the')

The fix is to run:

import csc.conceptnet.models

which sets up Django, then call is_stopword again and all is fine.


No Comments | Tags: Python

26 January 2010 - 14:01
pyCUDA on Windows and Mac for super-fast Python math using CUDA

I’ve just started to play with pyCUDA which lets you run parallel math operations on a CUDA-compliant NVidia graphics card through Python.

CUDA stands for Compute Unified Device Architecture – it is an architecture that lets us program the Graphics Processing Unit (GPU) on a high powered graphics card to do scientific or graphical math calculations rather than the usual texture processing for games.  In essence it is a mini supercomputer that is specialised just for fast math operations – if you can figure out how to use it.

The goal is to off-load the CPU-intensive calculations for two of my clients (a physics company and a flood modelling company) to achieve 10* to 100* speed-ups using commodity graphics cards.

pyCUDA makes it easy to interactively program a CUDA device rather than hitting C++ code with the slow write/compile/debug loop.  Recent MacBooks (mine was bought in January 2009) have NVidia cards with CUDA-compatible devices built-in (mine is a 9400M).  For my desktop computer I have a 9800 GT (costing £100).

It turns out that this is bleeding-edge stuff – getting pyCUDA compiled on my MacBook and Win XP machine took some time (forum posts for Mac and Windows issues). Thankfully the group is helpful and the wiki has an installation section for Windows, Mac and Linux and some reasonable documentation.

Right now I’ve got as far as running some of the demo code on my MacBook (showing a 5* speed-up over the CPU) and my desktop (showing a 30* speed-up over the CPU).  I’ll report more as I progress.

Update – pyCUDA works inside IPython too, lovely.

Update – I don’t have OpenGL working for gl_interop.py but as noted here you need “CUDA_ENABLE_GL = True” in siteconf.py and you need PyOpenGL installed.  When rebuilding my MSVC threw a hissy fit, it isn’t essential to my work so I’m skipping this demo.

Update – I’ve submitted a patch and two examples to the wiki (SimpleSpeedTest, Mandelbrot). I get 200* speed-ups on the speed test (using a for loop on a sin() calculation) and 5 to 20* speed-up on Mandelbrots (it seems to scale very well vs numpy with increasing dimensions).

Update – There are lots of interesting papers for CUDA surfacing like this one showing a 3* speed-up for voice recognition tasks (using CPU and GPU together) and yet another way to improve fluid dynamic simulations. This Tom’s 3D article gives a great write-up (starting with the history of audio cards) on where 3D is right now and how NVidia is beating ATI for scientific computing.

Books to read:

The following CUDA books will help you understand the basics of CUDA programming – I particularly like the first (Kirk and Hwu).


No Comments | Tags: Python

13 December 2009 - 17:05
Text to Speech – Festival (cross platform) and MacSpeechX (Python on Mac)

I wanted to play with text to speech, I’ve been looking for a cross-platform open-source solution that sounds reasonable.  I’m really impressed with the festival project, the web demo lets you enter your own text.

Update – I’m including this post in my plans for an Artificial Intelligence Handbook.

Festival is cross-platform but compiling it on a Mac takes a touch of effort (it looks like it is easier on Linux and Win).

This article shows you how to use it and how to web-enable it with some php.  For the simplest demo I used ‘bin/text2wave input.txt -o output.wav’ with input.txt containing a sentence.
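If you’d rather drive that command from Python, a minimal sketch might look like this. It assumes festival has been compiled so that ‘bin/text2wave’ exists relative to the current directory; the function names are my own and nothing here is part of festival itself.

```python
import subprocess


def text2wave_command(input_path, output_path, text2wave="bin/text2wave"):
    """Build the argument list for the text2wave call shown above."""
    return [text2wave, input_path, "-o", output_path]


def synthesise(sentence, input_path="input.txt", output_path="output.wav"):
    """Write the sentence to a text file and ask text2wave to render it to a wav."""
    with open(input_path, "w") as handle:
        handle.write(sentence + "\n")
    # Raises CalledProcessError if text2wave exits with an error.
    subprocess.check_call(text2wave_command(input_path, output_path))
    return output_path


print(" ".join(text2wave_command("input.txt", "output.wav")))
```

Calling `synthesise("Hello from festival")` from the festival directory should then leave the speech in output.wav.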

To get started, get the latest code.  I have v1.96beta.  You may also want the official festlang-talk list and possibly this more complete archive.

Compiling speech_tools-1.2.96-beta.tar.gz

It ought to have been as simple as ‘make clean; make’ but there’s a few changes to make first.  First we need this fix or we get a compile error in macosxaudio in kAudioUnitProperty_SetInputCallback:

If you add
#include <AudioUnit/AUNTComponent.h>
after the include block on lines 45-48 in audio/macosxaudio.cc the
problem should be solved.

By the way, remember to change the byte order if you have an intel
mac, i.e. on line 131:
     waveformat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger
         | kLinearPCMFormatFlagIsPacked;
     // For Intel:   | kLinearPCMFormatFlagIsPacked;
     // For PowerPC: | kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsBigEndian;

The following was a trickier error to solve:

g++ -c -fno-implicit-templates -O3 -Wall -I../include sigpr_frame.cc
sigpr_frame.cc: In function
‘void lpc2cep(const EST_FVector&, EST_FVector&)’:
sigpr_frame.cc:318: error: ‘__isnan’ was not declared in this scope
make[1]: *** [sigpr_frame.o] Error 1
make: *** [sigpr] Error 2

The fix was known but the relevant archive was missing, some googling for ‘__isnan mac’ results in this cached 2006 page:

--- ../test/speech_tools/include/EST_math.h     2006-08-03 08:49:35.000000000 -0500
+++ include/EST_math.h  2006-08-17 17:53:33.000000000 -0500
@@ -43,7 +43,7 @@
 #if defined(__APPLE__)
 /* Not sure why I need this here, but I do */
-extern "C" int isnan(double);
+extern "C" int isnan(float);
 #endif
 /* this isn't included from c, but just to be safe... */
@@ -101,7 +101,6 @@
 /* Apple OSX */
 #if defined(__APPLE__)
 #define isnanf(X) isnan(X)
-#define isnan(X) __isnan(X)
 #endif
 /* FreeBSD *and other 4.4 based systems require anything, isnanf is defined */

Compiling festival-1.96-beta.tar.gz

Once speech-tools is compiled, getting ‘festival-1.96-beta.tar.gz’ compiled is as easy as ‘make clean;make’.

Python’s MacSpeechX

I also had a play with the macspeechx module which ties Python to the Mac’s voice-synthesiser.  See list_voice_name() in macspeechX.py for an example of how it all works.

It works to power the speech synthesiser but it doesn’t appear to let you record the speech to a file (unlike festival above).

Update – Mike Driscoll has a post about pyTTS which hooks into Microsoft’s SAPI on Windows and pyTTSX which is cross-platform, along with some speech recognition links.


No Comments | Tags: ArtificialIntelligence, Python