Entrepreneurial Geekiness

Ian is a London-based independent Chief Data Scientist who coaches teams, teaches and creates data products. More about Ian here.

Building a Social Microprinter

Over the last couple of months I’ve been building up a social microprinter (inspired by Tom Taylor’s implementation and Matt Webb’s original idea). Here’s the current version – Arduino+WiShield+CBM231+off-site server (powered partly by BenOSteen’s Python driver):

There’s a second quick video and talk for the £5 App event I ran earlier in the week.

The goal is to build a social microprinter – a printer that’d live in a social environment (currently The Skiff co-working office in Brighton) which would help bring people a little bit closer. Currently it prints tweets (for ‘theskiff’) and shows events, later it’ll show recent Gowalla check-ins and maybe some local news headlines or the weather (but there’s got to be better stuff to show, right?…ideas on a postcard please).

My original intent was to build a device that could be stuck on the wall in a cafe. It would show tweets on a screen (probably under the cafe’s or Brighton’s hashtag) and let non-Internet folk post their own messages back. Doing this nicely would have needed a screen, machine, wall space etc – using a receipt printer seemed like an easy way to prototype the idea.

Jumping forward, here’s an early version – this is a CBM231 connected to my Ubuntu laptop via a USB->RS232 lead (note – this lead is good, the cheap ones on eBay can be bad – see below). Here I’m using BenOSteen’s Python driver to send tweets via serial to the printer.

This device has done the rounds, here it is on display at BuildBrighton’s talk to the British Computer Society:

Here it is in use at Likemind Brighton showing international #likemind tweets as other groups meet around the world on Friday morning (note – unicode converted to ‘?’ as I haven’t figured out if/how to get international characters out of the printer yet!):
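The ‘?’ substitution mentioned in the note above can be sketched in a couple of lines of Python. This is only an illustration of the idea, not the code from Ben’s driver; the function name to_printer_ascii is made up for this example.

```python
def to_printer_ascii(text):
    """Return text with every non-ASCII character replaced by '?'.

    Hypothetical helper: it assumes the printer is only being sent
    plain 7-bit ASCII, as described in the note above.
    """
    # errors="replace" swaps each unencodable character for b"?"
    return text.encode("ascii", "replace").decode("ascii")


if __name__ == "__main__":
    print(to_printer_ascii("Guten Morgen aus K\u00f6ln #likemind"))
```

Each character outside ASCII becomes exactly one ‘?’, so line lengths on the receipt stay the same as in the original tweet.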

It ran during the weekend of Barcamp Brighton and printed out barcampy stuff, I added some notes about local cafes and a job ad for one of the companies:

The goal all along was to build an independent controller (so removing the laptop from the equation). For this I coupled an Arduino with a WiShield 1.0. The WiShield libraries are easy enough to work with: after an hour’s experimentation I got WPA2 working (it takes 25 seconds to negotiate the connection on each attempt). We use WPA2 at home and in The Skiff.

Coupling the Arduino to the printer was easy enough. I have been trying (and so far failing) to get a Max233 chip acting as a voltage level converter, so for now I’m using a pre-built RS232 Level Shifter. This converts the Arduino’s 0V/5V TTL to +12V/-12V RS232 levels (powered from the Arduino’s 5V out). To output text I’m using Roo Reynolds’ Arduino sketch, which handily includes some control codes to cut the receipt after printing.

Next I wanted live data. At first I simply put a short plain text file on a web site, used the WiShield to fetch it and Roo’s code to print it. Now I’m using a hacked version of Ben’s code to write tweets (including bold and underline control codes) to a text file which is stored online (microprinter.ianozsvald.com). This ready-to-print file is grabbed over the WiShield, printed and then cut. The online file is updated every 2 minutes.

The final tweak was to add a button to the printer. Using the Arduino’s demo button sketch I hooked up a big thumb-sized button. The Arduino’s main loop is looking for a combination of ‘at least 5 seconds have passed since the last print’ and ‘button pressed’, then it’ll kick off the web request for new data. Once this request returns it prints out the text.
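The check in the main loop boils down to one condition. Here is a minimal Python sketch of that logic (the real thing lives in the Arduino sketch, and the names should_fetch and MIN_GAP_SECONDS are invented for this example):

```python
MIN_GAP_SECONDS = 5.0  # minimum time between prints, as described above


def should_fetch(button_pressed, now, last_print_time):
    """Return True when a new web request should be kicked off.

    Both conditions must hold: the button is pressed AND at least
    MIN_GAP_SECONDS have passed since the last print.
    """
    return bool(button_pressed) and (now - last_print_time) >= MIN_GAP_SECONDS


if __name__ == "__main__":
    import time

    last_print = time.time() - 10  # pretend the last print was 10s ago
    print(should_fetch(True, time.time(), last_print))
```

The time gap stops an enthusiastic button-masher from queuing up a string of identical print-outs.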

I look for the pattern “--------------” (14 dashes) to start and end the message. Everything before the first marker is HTTP headers (from the WiShield) that I didn’t want to print.
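As a rough illustration of the marker idea, here is a Python sketch that pulls the message out of a raw response. It assumes the markers are runs of 14 plain hyphens and that the message itself never contains such a run; extract_message is a hypothetical name, and the Arduino sketch does the equivalent while reading the response.

```python
DELIMITER = "-" * 14  # assumed marker: 14 plain hyphens


def extract_message(response):
    """Return the text between the first two markers, or None.

    Anything before the first marker (e.g. HTTP headers) and anything
    after the second marker is discarded.
    """
    parts = response.split(DELIMITER)
    if len(parts) < 3:
        # Fewer than two markers found, so there is no complete message
        return None
    return parts[1].strip("\r\n")


if __name__ == "__main__":
    raw = (
        "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\n"
        + DELIMITER + "\nhello from the printer\n" + DELIMITER + "\n"
    )
    print(extract_message(raw))
```

Returning None for an incomplete response means a half-received file is never printed.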

Here’s the finished hardware:

This is a WiShield 1.0. The button (shown just out of shot top-left) is connected 3.3V->button, button->Pin 6 AND Ground (via a 15k resistor). For the printer I’m using Pin 8 for tx (blue lead on the RS232 level converter) and Ground, the level converter is powered by the 5V out.

Here’s the connector:

The connector is overly-connected in this image. I think all you actually need is Pin 2 from the RS232 Level Converter to Pin 3 on the 25 pin connector along with Pin 5 (GND) to Pin 7 (GND on 25 pin connector). With yellow wires I’ve shorted Pins 4&5 and 8&20 but I think this is overkill (they’re used for bus control but they’re probably ignored in this configuration).  Here’s a full pinout.

During all the hacking our faithful cat Mia has attempted to assist whenever she could. Here she’s taken ownership of the bag used to transport the early versions:

Along the way I also acquired an Epson TM T88 II receipt printer, it is ‘just another serial printer’ but takes different control codes (and it looks like it might have a smaller character set than the CBM 231). As yet I’ve only tried printing plain ASCII, I’d like to investigate further and build a library that supports this printer too.

Note on buying leads from eBay! Be aware that if you buy cheap leads from eBay (e.g. £2 silver/blue leads) then you might end up with a pack of 5 (because if you buy 5 and one breaks, you’ve got 4 more that work, right?) and find you have 5 dead-on-arrival leads. You could then report the problem and the nice people could then ship you a replacement set, but then you might discover that you’ve got another 5 DOA leads. You have been warned.

If you’re buying your first microprinter do try to buy a working serial lead with it (it’ll probably be a 9 pin to 25 pin converter lead) – if you get the wrong lead (null modem vs straight serial – I forget which you need!) then you won’t get anything (the bane of my first few weeks of testing). Buy a printer+lead that’s known to work and you won’t go wrong.

Spend the £8 per lead and buy from Amazon if you don’t want to waste hours wondering why your printer is just printing out reams of ‘?’ rubbish:

If you want to build your own then the best first source of info is the microprinter wiki. Roo Reynolds has Arduino drivers (which I hacked a bit for my implementation) that don’t depend on external data sources.

You’ll find my Python server source and Arduino sketch (which assumes you’ve got a WiShield 1.0) here: social_microprinter. Note that the code is horribly hacky, it was written over many short sessions when I could steal an hour or two from other projects.

It could do with being straightened out and commented and a few nice new features would include Gowalla check-in notifications, event RSS reading and weather printing.

Many thanks to my fellow hackers at BuildBrighton for help debugging my early serial problems and to Barney for the lend of his RS232 Shifter (I’ll soon get this Max233 working, promise!).

Here’s the finished, installed unit on the work bench at BuildBrighton in The Skiff (just by the social kitchen space). Once it is a bit more robust it’ll move to the front of the building:


Ian is a Chief Interim Data Scientist via his Mor Consulting. Sign-up for Data Science tutorials in London and to hear about his data science thoughts and jobs. He lives in London, is walked by his high energy Springer Spaniel and is a consumer of fine coffees.

£5 App #23 on 2nd Nov, 8pm at The Skiff

Next Tuesday at 8pm at The Skiff we’re holding our 23rd £5 App event. This is our second this year, we’ve been a bit slow. To make up for being slow we’ve given it the title “Things we built this summer”, here’s our fine speaker list:

The evening will run for about 2 1/2 hours, we’ll provide free beer and cake as usual. My Mor Consulting is sponsoring the beer, John’s fine cooking skills are providing the cake.

Because we’re buying beer and baking cake we need to know if you’re attending! Please sign-up on Lanyrd or Upcoming (and unAttend if you subsequently can’t attend).

I’d be especially happy if you use Lanyrd (you just need to tweet ‘@lanyrd attending #fivepoundapp’ for that to happen automatically) as I’ll be collecting that data for Emily’s Social Ties talk.

As usual we’ll drift to the pub after the event. If you want to meet a bright selection of get-off-of-bottom-and-do-interesting-things people then you should attend next Tuesday.

Whitenight Festival

If you’re attending the Whitenight festival this Saturday (you really should if you’re in town) then do check out Shardcore’s Enlightenment Machine and Cats and BuildBrighton’s Light Brigade build-flashing-social-lights hack event.



Visualising Lanyrd’s social connectivity graph

Over the weekend at BarCampBrighton5 I demonstrated a quick visualisation that Kyran and I built over breakfast in Berlin last Friday. It looks like:

To see it yourself open Bar Camp Brighton 5 Visualisation using Chrome or WebKit (it’ll work in Firefox but might be rather slow). It is interactive so it is worth opening, about 60 people are shown here.

If you reload the page you’ll see the force directed graph bouncing around as it settles into a low energy configuration. The nodes are people attending the event, edges are friend links to other people at the event. The image sizes for the nodes reflect the number of links a person has at that event.

As you can see above Jot (the main host and co-organiser) is most connected at the event. Two people aren’t following anyone at the event, they’ve been pushed to the bottom left of the window.

You can drag nodes using the grab-handles (blue circles) or move the entire graph by dragging the image.

For a larger example (80 people) see the Flash on the Beach 2010 Visualisation:

Here you can see that seb_ly is the most connected, closely followed by niqui and bitchwhocodes. At the bottom left is a sub graph of two nodes – these two people follow each other but don’t follow anyone in the main graph.

In both cases the data is extracted from the relevant Lanyrd pages (BCB5, FOTB), friends for each attendee are read from Twitter and then a graph is built as a JSON dictionary which links nodes (screen_names) to friends (lists of screen_names). Ready to run Python source code is at github: LanyrdViewerUsingProtoVis.
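To make the data structure concrete, here is a small Python sketch of the graph-building step. It is not the code from the GitHub repository: build_graph and its inputs are hypothetical, and the attendee and friend lists are assumed to have been fetched from Lanyrd and Twitter already.

```python
import json


def build_graph(attendees, friends_of):
    """Map each attendee's screen_name to the friends who also attend.

    attendees  -- list of screen_names at the event
    friends_of -- dict of screen_name -> list of friend screen_names
                  (assumed to be fetched from Twitter beforehand)
    """
    attendee_set = set(attendees)
    graph = {}
    for name in attendees:
        friends = friends_of.get(name, [])
        # Keep only links to other people at the same event
        graph[name] = sorted(
            f for f in friends if f in attendee_set and f != name
        )
    return graph


if __name__ == "__main__":
    attendees = ["alice", "bob", "carol"]  # made-up screen_names
    friends_of = {"alice": ["bob", "dave"], "bob": ["alice", "carol"]}
    print(json.dumps(build_graph(attendees, friends_of), indent=2))
```

Attendees with no friends at the event still get an entry (an empty list), which is what lets unconnected people show up as isolated nodes in the visualisation.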

Both of these links should work on a mobile device but they’ll be awfully slow (they’re useless on my iPhone 3G!) 🙂

Kyran used ProtoVis to build the force directed graph, it includes a bit of a hack to make images work on the nodes.

If you’re interested in seeing more of this stuff then Kyran will have more to demo at our upcoming £5 App show and tell.



Scrapy, libxml + libxslt, Mac, “checking for libxml libraries >= 2.6.8… configure: error:”

In the hope that this’ll save someone else the bother…if you’re installing the web scraping Python library scrapy on your Mac (I’m on Leopard 10.5.8) and you come across an error like:

checking for libxml libraries >= 2.6.8... configure: error:
Version 2.6.7 found. You need at least libxml2 2.6.8 for this
version of libxslt

then here’s the solution.

Presumably you’ll be following the Scrapy install instructions. I used the supplied links for libxml2-2.7.3 and libxslt-1.1.24. libxml built and installed to /usr/local/lib just fine. libxslt wouldn’t ./configure – it kept reporting that it could only see the older libxml from /usr/lib, not the newer one in /usr/local/lib.

The fix is here, and this is my configure line:

 $ ./configure \
    --with-python=/Library/Frameworks/Python.framework/Versions/2.5/ \
    --prefix=/usr/local \
    --with-libxml-prefix=/usr/local \
    --with-libxml-include-prefix=/usr/local/include \
    --with-libxml-libs-prefix=/usr/local/lib

At this point libxslt configured, built and installed just fine. To make Python see it I had to update my .bash_profile so that PYTHONPATH included the default output directory:

export PYTHONPATH=$PYTHONPATH:/usr/local/lib/python2.5/site-packages

Side note – whatever you do, don’t mess with /usr/lib. I tried moving the default libxml and libxslt libraries and I had the same consequence mentioned by Kevin Watters – lots of system tools (including su!) depend on libxslt to be in /usr/lib. I had to boot to Single User Mode to copy the files back before the system would work again.



Demoing pyCUDA at the London Financial Python User Group

On Wednesday night I jumped on a train up to London to visit the London Financial Python User Group to give a short demo of pyCUDA. I’m using CUDA heavily for my physics consultancy and I figured the finance guys would be interested in 10-1000* speed-ups for their calculations.

The raw figures and the Mandelbrot demo that I gave are already covered in my earlier blog post: 22,937* faster Python math using pyCUDA.

To introduce pyCUDA I used P. Narayanan’s GPUs: For Graphics and Beyond PDF presentation (the first 13 pages), his explanation and diagrams are very clear.

To put CUDA in context against regular CPUs I used the recent Peak MHz graph and the main power/speed/transistor count graph in The Free Lunch is Over: A Fundamental Turn to Concurrency in Software. The main point here is that we’ve topped out at 2-3GHz CPUs and now we have to parallelise our code. Doing so on CPUs means we get 4, 8, 16 (and soon 24 then 32) cores to play with…but with CUDA if the problem is mathematics based we have 480 cores to use!

If you’re interested in the general use of CUDA and GPUs then check out the excellent gpgpu.org.

You may wonder about real-world performance with CUDA. Without naming names I can say that I’m now delivering a 115* speed-up on a particularly gnarly problem (I mentioned during the talk that I’d reached 80* – I’ve managed to improve that in the last 2 days). On an earlier problem when I knew far less about CUDA I delivered a 100* speed-up for the same company.

It was grand to meet a lot of new faces at the group, a few people I’ve met before at PyCons (hi Ben! Giles!). Making a contact with Didrik of Enthought was rather grand too. I hope to visit again.

