Archives of #Lt

ANN: twitter-text-python release (Python Tweet parsing library)

A few weeks back I took over as maintainer of the twitter-text-python library (source on GitHub). This library lets you take a tweet like: "@ianozsvald, you now support #IvoWertzel's tweet ... parser!" and extract the Twitter entities as defined in the Twitter conformance tests. The entities in the above tweet would be: reply: 'ianozsvald' […]
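To give a rough feel for what "extracting entities" means, here is a minimal sketch using only Python's standard `re` module. It is not the library's implementation or API: the function name `extract_entities` and the three patterns are assumptions made for illustration, and they are far simpler than the rules in the Twitter conformance tests.

```python
import re

# Simplified, illustrative patterns (assumptions, not the conformance rules).
REPLY_RE = re.compile(r"^\s*@(\w{1,15})")      # a leading @name marks a reply
MENTION_RE = re.compile(r"(?<!\w)@(\w{1,15})")  # any @name not glued to a word
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")       # any #tag not glued to a word


def extract_entities(tweet):
    """Return the reply target, mentioned users and hashtags found in a tweet."""
    reply_match = REPLY_RE.match(tweet)
    return {
        "reply": reply_match.group(1) if reply_match else None,
        "users": MENTION_RE.findall(tweet),
        "tags": HASHTAG_RE.findall(tweet),
    }


tweet = "@ianozsvald, you now support #IvoWertzel's tweet ... parser!"
entities = extract_entities(tweet)
print(entities)
```

The lookbehind `(?<!\w)` keeps things like email addresses from being read as mentions. Real tweets have many more edge cases (URLs, lists, non-ASCII hashtags), which is what the conformance tests are for.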

Using ZeroFree to shrink a VirtualBox Linux Image

My development Ubuntu image inside VirtualBox was using too much space to store empty but non-zero disk blocks on its virtual drive. This sucked space from my laptop’s SSD (which is already not big enough!). Shrinking it by zeroing the blocks took a little bit of effort. Inside VirtualBox if I boot my Ubuntu 11.04 […]

Extracting keyword text from screencasts with OCR

Last week I played with the Optical Character Recognition system tesseract applied to video data. The goal – extract keywords from the video frames so Google has useful text to index. I chose to work with ShowMeDo’s screencasts as many show programming in action – there’s great keyword information in these videos that can be […]

Text to Speech – Festival (cross platform) and MacSpeechX (Python on Mac)

I wanted to play with text to speech, and I’ve been looking for a cross-platform open-source solution that sounds reasonable. I’m really impressed with the Festival project; the web demo lets you enter your own text. Update – I’m including this post in my plans for an Artificial Intelligence Handbook. Festival is cross-platform but compiling it […]

Running Skype on Ubuntu + QuickCam Pro 9000

I use Skype on my Windows desktop and MacBook as a matter of course now, and I rather like to use the video feed via the MacBook when co-working with my team on our screencasts. Since the desktop box usually runs Ubuntu 9.04, I wanted to try my new QuickCam Pro 9000. The short story is […]