About


This is Ian Ozsvald's blog. I'm an entrepreneurial geek, an AI consultant, co-founder of the StrongSteam AI and data mining API, co-founder of the SocialTies App, author of the A.I.Cookbook, author of The Screencasting Handbook, a Pythonista, co-founder of ShowMeDo and FivePoundApps and also a Brightonian. Here's a little more about me.


11 May 2012 - 17:40 StrongSteam’s first novel OCR matching API (Python demo)

Here’s a preview of our first novel API in StrongSteam. We’ve been working with Optical Character Recognition (OCR) for a while and set ourselves the task of matching a noisy photograph of some text to a pre-seeded database of entries. If you follow my blog you’ll already have seen our example iPhone app for the Royal Botanic Gardens, Kew, London (developed in collaboration with Kasabi):

Rather than having to re-label 10,000 Latin plant labels with QR codes, Kew can now use our matching technology on their existing labels to enrich a visitor’s experience of the gardens (and it turns out that a lot of visitors have iPhones and use Kew’s official app).
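To give a feel for the matching step, here is a minimal sketch that compares noisy OCR output against a small list of known labels using Python's standard difflib. This is not StrongSteam's implementation: the label list, the normalise and best_match helpers and the 0.6 cutoff are all assumptions made up for illustration.

```python
import difflib

# Hypothetical stand-in for the pre-seeded database of label names.
PLANT_LABELS = [
    "Matteuccia struthiopteris",
    "Quercus robur",
    "Ginkgo biloba",
    "Acer palmatum",
]


def normalise(text):
    """Lower-case and collapse whitespace so layout noise matters less."""
    return " ".join(text.lower().split())


def best_match(ocr_text, entries, cutoff=0.6):
    """Return (entry, score) for the closest entry, or None below the cutoff."""
    query = normalise(ocr_text)
    scored = [
        (difflib.SequenceMatcher(None, query, normalise(entry)).ratio(), entry)
        for entry in entries
    ]
    score, entry = max(scored)
    return (entry, score) if score >= cutoff else None


if __name__ == "__main__":
    # Simulated OCR output with two misread characters and a doubled space.
    print(best_match("Matteucc1a  struthiopter1s", PLANT_LABELS))
```

A real system would need an index rather than a linear scan once the database grows, and the cutoff would have to be tuned against real photographs.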

With our API we can do the same kind of task with photos of plaques from the London Science Museum where we match against 836 entries scraped from the Science Museum website. In the following video we match against text from the information plaque of Old Bess (née ‘Beelzebub’) in the Energy Hall:

This is just a preview; we’ve sent the Python & cURL API to some of our alpha users and will be inviting more in over the coming month. Here are some more OCR videos and here’s a work-in-progress demo of our image matching (using PhoneGap on an Android):

If you’d like to get access to our RESTful cloud-based computer vision APIs please sign-up on our StrongSteam homepage. Soon we’ll be adding raw OCR (with co-ordinates and font size reports) and image matching (particularly for stuff like brand logos and beer labels).

We’re super-keen to hear about your use cases and needs – please send me an email (ian AT strongsteam.com) and tell me what you need. We used to work on these problems in my consultancy (Mor Consulting); now we’re working to make our IP more available to all.


Ian applies Artificial Intelligence as an Artificial Intelligence Researcher for companies (Mor Consulting), co-founded the StrongSteam A.I. datamining toolkit, co-authored SocialTies, programs Python, writes The Screencasting Handbook and is also a sea-side dweller and consumer of fine coffees.

No Comments | Tags: ArtificialIntelligence, Life, Python, StrongSteam

6 May 2012 - 21:40 Python Introductory Course (OpenSource, StartupChile)

We’ve just run 5 of the 6 nights of our Introductory Python course here in Santiago for StartupChile. The course aims to ‘give back’ to the Chilean economy by helping more people learn to program (we had a mix of locals and StartupChile members in our classes). In total we’ve taught 25 people (only 5 women though!). Here’s the group on the first night:

Initially I’d planned to run my 12 hour (2 day) course to one group of 15 people. In total over 80 people indicated that they wanted to attend the course so I split it into two groups of 15 (of the 80 a total of 27 committed to the available dates). We started this last Monday, we finished Group A last night (Saturday night late night coding lessons FTW!) and we finish Group B on Monday.

Both groups had prior coding experience (I said you’d need to know about variables, command line access etc) but little experience with Python. We covered 3 of the pythonchallenges and looked at:

  • IDLE immediate mode and for writing modules
  • variables, logic, loops
  • functions
  • id, mutable and immutable types
  • simple debugging and documentation
  • importing modules, using pip

I’ve open-sourced the course as Python in 6 Hours; you can grab the slides (ppt/pdf) and solutions to the two exercises from Github. This isn’t a deep/thorough introduction to Python. It is aimed at taking people with existing experience in another language into Python whilst covering some of the less-often-covered basics (like id, mutability etc).
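As a small illustration of the id and mutability topic (a sketch written for this page, not taken from the course slides), the following shows how a list is shared between two names while a tuple is replaced rather than changed:

```python
# Lists are mutable: assignment binds a second name to the same object.
a = [1, 2, 3]
b = a                      # no copy is made
b.append(4)                # the change is visible through both names
print(a, id(a) == id(b))   # [1, 2, 3, 4] True

# A copy is a distinct object with its own id.
c = list(a)
c.append(5)
print(a, c, a is c)        # [1, 2, 3, 4] [1, 2, 3, 4, 5] False

# Tuples are immutable: += builds a new tuple and rebinds the name.
t = (1, 2, 3)
original = t
t += (4,)
print(original, t, t is original)   # (1, 2, 3) (1, 2, 3, 4) False
```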

If you’d like to run your own 6 hour Python course then feel very free to grab the notes and have a crack.

 



1 Comment | Tags: Python

31 March 2012 - 5:17 Demos for Botanical Garden Label Matcher from StrongSteam

After a fair bit of graft we’ve finished our first product using StrongSteam – a Latin Botanical Garden label matcher (AKA “OpenPlants”) which runs at Kew Gardens, Wakehurst Place and other botanical gardens in Europe that use the usual black rectangular labels. If you’re not sure what I’m talking about then these 30 second demo videos should make it clear:

Update – I totally should have added that we built this app in partnership with Kasabi using their DBPedia dataset.

 

 

As you can see, you photograph the plant label, we use optical character recognition to read what’s on the sign and then we bring back relevant information from Wikipedia about the Family and Genus including pictures and links to other resources. We’ll launch this as a free app in a few weeks.

Seeing as StrongSteam is a cloud based API it makes sense to show it being used from another platform. Here’s a screencast showing a webcam on a Linux laptop taking a photograph of a printed plant label using our Python API which is uploaded and recognised, with the results being shown in the local web browser:

We’ll launch the alpha OCR API for developers in April. Add your email to the email list on our homepage to get the announcement. Once the iPhone app is available we’ll also announce it here.



No Comments | Tags: Life, Python, StrongSteam

24 March 2012 - 21:54 This Week In Startups, StrongSteam Pitch, Reimbursements, Mentorship

It has been a pretty nutty couple of weeks. PyCon a week back was ace, we signed up some clients and partners for StrongSteam and got offered investment. David Kim was good enough to interview me so I got to demo our OCR for text recognition and image recognition APIs via some mobile demos – check out the second video on David’s Enthought post.

Last night we got featured on Jason Calacanis’ and Tyler Crowley’s This Week In Startups (@twistartups), via StartupChile. This was a bit nuts. I pitched earlier in the week for James (@jameskennedy) and Tyler’s (@looglalanguage) BizCamp pitch contest; we won the ‘best 4’ competition and that gave us a pass to be featured on the show. A competition was run yesterday here for 20 other companies to pitch to get a 5th place on the show. Once the show started I was up second.

Check out the video below at 0:19:00 to 0:32:00 to see me pitch and then at 1:05:00 to 1:07:30 to see the three judges decide that StrongSteam was ‘best bet for investment’. Being judged was fun, but for now our focus is on giving our users what they need from our API.

A few days before I was submitting the second month of reimbursement paperwork for our StartupChile placement. Emily has written a long piece on this already.

Below you can see my pile of paperwork – for each transaction (a few big purchases, some contractors, some travel) I have a full audit trail that starts at the receipt and ends, via banks and credit cards, at a bank account in my name, with proof that I own that bank account. For contractors I include a full contract too. This proof is required; it is the ‘price’ of giving up 0% equity under a government scheme. It took 8 hours including my meeting with my account executive. They haven’t reimbursed this round yet; assuming they don’t reject anything (which is far from guaranteed) this only costs 8 hours (last month cost 2 days). If they reject stuff then maybe I’ll invest a total of 10-16 hours.

Something that’s painfully obvious from yesterday’s pitching and today’s BizCamp is that pretty much all of us here lack t-shirts with our name, logo & strap-line. I could really have done with t-shirts at PyCon. I pitched to 100+ of the 2,300 delegates but only got on stage in front of them all once – if someone had seen our name and noticed ‘AI’ or ‘computer vision’ then I bet they’d have come over for a chat. Lesson learned.

I’m also going to give a shout out back to Moo in the UK for their cool little business cards. So many people here don’t have any cards yet, which is such a mistake. Everyone needs cards. I’ve used Moo for years and I’d vote you go via them and get the mini cards and a plastic case (they’re robust, mine is >2 years old and is still fine).

Finally – Vivek Wadhwa kicked a bunch of us up the arse two nights ago and again last night talking about self-mentorship (given that there is no formal mentorship out here). I’m going to be organising a group who want to self mentor such that we can meet regularly (maybe every week), set goals, be held accountable and basically focus on getting ready for demo day in 2 months’ time. It’ll be an interesting experiment.

For now this is nearly the end of a crazy 2 months. Tonight I’m going to get a take-out Chinese and settle in front of a movie.



No Comments | Tags: ArtificialIntelligence, Entrepreneur, Python

18 March 2012 - 4:00 High Performance Python 1 from PyCon 2012 (slides, video, src)

This is the follow-on to my ‘PyCon 2012 notes from the end’ post. I gave a 3.5 hour tutorial on High Performance Python 1; below I link to the slides, the video and the source code.

Topics covered:

  1. Profiling with cProfile and line_profiler
  2. Profile visualisations with runsnake
  3. PyPy for quick wins
  4. Cython for C-level speed
  5. ShedSkin for ‘quick wins’ on the right problems
  6. Cython+numpy for multi-core speed-ups (300x on this Mandelbrot problem)
  7. Multiprocessing for multi-core support
  8. ParallelPython for multi-machine support
  9. Numexpr for faster numpy math
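For a flavour of the first topic, here is a minimal sketch of profiling a small pure-Python Mandelbrot calculation with the standard library's cProfile and pstats. The escape_time, mandelbrot and profile_report functions are written for this illustration and are not the tutorial's source code; the grid size is kept tiny so it runs quickly.

```python
import cProfile
import io
import pstats


def escape_time(c, max_iter=100):
    """Iterations before z = z*z + c escapes |z| > 2 (max_iter if it never does)."""
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


def mandelbrot(width=60, height=30, max_iter=100):
    """Escape times over a small grid covering [-2, 1] x [-1.5, 1.5]."""
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            c = complex(-2.0 + 3.0 * x / width, -1.5 + 3.0 * y / height)
            row.append(escape_time(c, max_iter))
        rows.append(row)
    return rows


def profile_report(func, top=5):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return stream.getvalue()


if __name__ == "__main__":
    print(profile_report(mandelbrot))
```

The report lists call counts and cumulative time per function, which is usually enough to see where the time goes before picking one of the other tools in the list.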

The other topics in this high performance track (a part of the tutorial track) are:

and there’s a full set of videos here.

After EuroPython I wrote up my talk with additional material as a 55 page book. I was hoping to update the book this year but things are moving so fast with our new StrongSteam AI/vision startup (presented at StartupRow at PyCon) that I can’t really justify the time right now. I’ll just link to the High Performance Python book from last year: the timings are out of date (but they’re correct in the slides below) and the src is updated a bit, but the method and discussion are still correct.

Github code for HighPerformancePython_PyCon2012.

Slides:

 

Video (3.5 hours) via pyvideo.org:

 



No Comments | Tags: Life, Python

12 March 2012 - 0:26 PyCon 2012 notes from the end

PyCon 2012 is just coming to a close. There were over 2,200 people here and too many talks to choose between. It was a bloody fine conference. Meeting so many of the Names of the Python world was rather grand, teaching High Performance Computing and getting pats on the back for the creation of ShowMeDo was also rather nice.

UPDATE my notes for my High Performance Python 1 tutorial at PyCon 2012 are now online.

I Send A Big Thank You To The Organisers (and I mean all the organisers – including the AV crew who did a very fine job). This was my first US PyCon, it won’t be my last.

On the Thursday morning I ran my High Performance Python 1 tutorial for 60 students. The 3 hours passed in a blur (much as it did at EuroPython last year). I’ll have an updated booklet (here’s last year’s EuroPython booklet) in a couple of weeks. Here’s the github code.

Here’s the 3 hour video of my tutorial: High Performance Python 1 at PyCon 2012:

On Friday Paul Graham gave a nice keynote on startups (“Frighteningly Ambitious Startup Ideas”), this rather set the stage for our attendance with StrongSteam. Keynote:

We won a booth on StartupRow on the Friday in the expo hall, I adorned our stand with some posters and props for our mobile phone demos. With StrongSteam we’re working to give a pair of eyes to mobile phones so phones can ‘see’ the world as a human does. What could you do with an API that lets you build your own Google Goggles?

Kyran and Balthazar had put together some cool demos – OCR on photos of labels to read text and open relevant Wikipedia entries and also artwork recognition for Shardcore’s art. We won a few offers of angel investment, had an acquisition offer, got some users and found some collaborators. Not bad for the second day at the conference.

One criticism I have is that StartupRow wasn’t advertised. We were given small booths at the back of the hall behind the big shiny stands so it looked a bit like we were the poor cousins to the ‘proper’ companies. A banner or other announcement priming folk to the idea that we were early stage would have been handy.

Today the expo hall was cleared for the poster session. This was huge, I was very happy to see a wide selection of science and HPC projects along with a handful of companies.

Now I need to sleep before the 4am wakeup for the return flight to Santiago. Then…finishing off our first StrongSteam client and moving towards inviting users into the API.



No Comments | Tags: Python

20 February 2012 - 16:08 StartupChile, PyCon, StrongSteam

As ever with a startup – there’s always too much to do and the game is all about juggling burning balls whilst figuring out which shouldn’t be dropped. We’re rather busy here.

Yesterday Emily and I finished the paperwork for our first reimbursement round at StartupChile. This is the part of the process that gets the most complaints from the startups here. We spent 5 hours yesterday preparing our first £5k or so for refund (flights, visas, first month’s rent, various expenses). All going well we’ll get 90% of this money back in a few weeks. Quite possibly we’ll have missed a massively important but otherwise minor detail somewhere and the admin team will reject up to 90% of the receipts (which can be resubmitted in a month) – such horror stories abound from Round 1.

The future expenses that we’ve already paid for like my trip to PyCon (next month) and flights can’t be claimed yet as I’ve yet to attend – we can only claim reimbursement for definitely-spent money. The argument is that we could refund a future plane or conference ticket having already claimed it here through StartupChile, so getting ‘money for nothing’. This means I’m carrying another few thousand pounds of expenses that I can’t claim back for at least another 6 weeks. Ho hum. Cashflow is king, I’m glad we had reserves when we flew out here.

Emily notes that the next application round opens soon, I know that Round 3 starts to arrive in a week’s time. I hope everyone who is already here updates the wiki so the obvious newbie questions that we asked don’t get repeated all over again!

Talking of PyCon – I’m pretty excited to be teaching High Performance Python 1 this year. I’ve made some updates from last year’s course and I’ll get to tell some stories this year as we’re using this tech in StrongSteam. Getting to catch up with Travis (numpy originator), Fijal (numpypy in PyPy) and others will be rather awesome. I’ve also accepted a teaching position for EuroSciPy in August.

StrongSteam continues to develop. We’re still not taking on alpha users, we’re focusing on our first client from London until the end of March and then we’ll invite people to come play with our first bit of tech. In April we release our first iPhone app – it’ll let you take photographs of Latin plant labels at botanical gardens, we’ll then match them using Optical Character Recognition and vision techniques to a database of plants and give you information, pictures and videos (via WikiPedia, GeoSpecies and BBC:Wildlife) in return. We’re working with Kasabi (data partner announce) as our data partner.

Everything is backed by Python, our third member (Balthazar Rouberol @baltorouberol) joins us this week and he’ll wrap the client API as a Python package so we can start to distribute it to users who have joined our announce list (see our homepage).

We hope to expand this tech to make a similar app for use at the London Science Museum – getting videos and schematics for all the wonderful devices at the Science Museum direct to the smartphone seems like a wonderful way to enhance a trip (Steam Engines puffing! Babbage’s machines calculating!). We’re really excited to see what devs can do once they can reliably match text from labels, plaques and information cards – despite noise, distortion and obstruction – to a database of matching entries. This should make for some fun mobile apps.

I’m also preparing to declare myself as ‘tribe leader’ for Data Mining here at StartupChile – this means our Data meetups will gather more of the Return Value Agenda points (the points you have to get to qualify for the $40k grant under the programme), it’ll also give me more reasons to go open doors at the local telecomms companies.



No Comments | Tags: ArtificialIntelligence, Python

26 October 2011 - 11:29 StrongSteam alpha, HackerNewsLondon, Startup-Chile

I’m a little behind with the blogging so here’s the short version. StrongSteam has been under constant dev for 2 months, we’re close to putting up the first AI tools behind a few Python demos (hopefully it’ll be up next week). I’m talking on this at HackerNewsLondon tomorrow night.

We haven’t (quite) finished the demos so it’ll be a slideshow. I’m thinking of running a workshop in a month or so to show what’s possible, talk through the limitations and possibilities and help people get comfy with the API.

I’m also very pleased to say that we were accepted into the StartupChile programme alongside RadicalRobot (my better half). In StrongSteam Kyran and I will get 6 months in Santiago with a $40k budget (for no equity!) to build our API and this opens the door to further travel. We’re also very happy to welcome Balthazar Rouberol (linkedin) to our team, he’ll be joining us remotely as an intern for 6 months.

Our biggest priority now is to get the alpha out there. If you’re curious to see what we’re doing please follow us via @strongsteamapi and join the mailing list on the strongsteam homepage.

We also have two surveys – the first is so you can tell us about your general AI interest, the second focuses on some of the points raised in the first to tell us more about your needs. We’d really appreciate your input here if you have 10 minutes to spare.



No Comments | Tags: ArtificialIntelligence, Programming, Python

31 August 2011 - 13:24 strongsteam – an “AppStore for A.I. and data mining tools”

Kyran and I are starting work on a new project – strongsteam offers a web API with artificial intelligence and data mining tools. The goal is to make it easy for you to do things like:

  • get the text out of images using optical character recognition
  • determine whether two images look the same and if one object (e.g. a certain book or a can of coke) can be found in another
  • use natural language processing to analyse, cluster and compare text
  • extract text from audio (e.g. to pull out keywords from podcasts)
  • use machine learning on text to derive new data

If you’d like to join the closed alpha then visit strongsteam and add your email to the announce list on the homepage.

We’ve started with Python bindings which make it easy to talk to the strongsteam web service. Initially we’ll wrap open source tools that we’ve used along with lots of our own A.I. data mining tools from years of work in my Mor Consulting A.I. consultancy.

At EuroSciPy last week I demo’d using O.C.R. to extract the words from plant labels at Wakehurst Place gardens so you can look up the plant on Wikipedia once you’ve taken a photo like this one:

Plant label for Ostrich Plume Fern at Wakehurst Place (Sussex)

Now we’re looking at applying O.C.R. to conference name-badges. This will be a bit of a mash-up of data used in our SocialTies conference app and Lanyrd.com’s data. Next we’ll look at image matching and some text processing tools.



No Comments | Tags: ArtificialIntelligence, Entrepreneur, Python

25 July 2011 - 9:42 High Performance Python tutorial v0.2 (from EuroPython 2011)

My updated High Performance Python tutorial is now available as a 55 page PDF. The goal is to take you on several journeys which show you different ways of making Python code run much faster (up to 75x on the CPU, faster with a GPU).

UPDATE this talk is superseded by my High Performance Python 1 tutorial from PyCon 2012.

This is an update to the 49 page v0.1 I published three weeks ago after running the tutorial at EuroPython 2011 in Florence.

Topics covered:

  • Python profiling (cProfile, RunSnake, line_profiler) – find bottlenecks
  • PyPy – Python’s new Just In Time compiler, a note on the new numpy module
  • Cython – annotate your code and compile to C
  • numpy integration with Cython – fast numerical Python library wrapped by Cython
  • ShedSkin – automatic code annotation and conversion to C
  • numpy vectors – fast vector operations using numpy arrays
  • NumExpr on numpy vectors – automatic numpy compilation to multiple CPUs and vector units
  • multiprocessing – built-in module to use multiple CPUs
  • ParallelPython – run tasks on multiple computers
  • pyCUDA – run tasks on your Graphics Processing Unit
  • Other algorithmic choices and options you have
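Whichever tool you try, it helps to have a repeatable way to time alternatives. The sketch below uses the standard timeit module to compare two pure-Python escape tests for a Mandelbrot point: one using abs(z) and one comparing the squared magnitude against 4. Both functions, the POINTS grid and the compare helper are assumptions made for this illustration rather than code from the report, and no particular speed-up is claimed since results depend on the machine and interpreter.

```python
import timeit


def escape_abs(c, max_iter=100):
    """Escape test using abs(z), which computes a square root each iteration."""
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


def escape_squared(c, max_iter=100):
    """Mathematically equivalent test on the squared magnitude (no square root)."""
    z = 0j
    for i in range(max_iter):
        if z.real * z.real + z.imag * z.imag > 4.0:
            return i
        z = z * z + c
    return max_iter


# Hypothetical sample grid; any list of complex points would do.
POINTS = [complex(x / 17.0, y / 17.0) for x in range(-34, 17) for y in range(-25, 26)]


def run(kernel):
    """Apply one kernel to every sample point."""
    return [kernel(c) for c in POINTS]


def compare(repeats=3):
    """Return best-of-N wall times in seconds as (abs version, squared version)."""
    t_abs = min(timeit.repeat(lambda: run(escape_abs), number=1, repeat=repeats))
    t_sq = min(timeit.repeat(lambda: run(escape_squared), number=1, repeat=repeats))
    return t_abs, t_sq


if __name__ == "__main__":
    t_abs, t_sq = compare()
    print("abs(z): %.4fs  squared: %.4fs" % (t_abs, t_sq))
```

One caveat on the design: the two tests can disagree for an iterate that lands within rounding error of |z| = 2, so compare outputs before swapping one for the other in real code.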

The improvement over the last version (v0.1) is that I’ve filled in all the sections now including pyCUDA (there are still a few IAN_TODOs marked, I hope to finish these in a future v0.3). I’ve also added a short section on Algorithmic Choices, linked to the new Cython prange operator and shown the new numpy module in PyPy.

The source code is on my github page. The original slides are on slideshare too. If you’re after a challenge then at the end of the report I suggest some ported versions of the code that I’d like to see.

The report is licensed Creative Commons by Attribution (please link back here) – I’ll also happily accept a beer if you meet me in person! If you’re curious about this sort of work then note that I offer A.I. and high performance computing consulting and training via my Mor Consulting.

Update – ShedSkin 0.9 adds faster complex number support. I haven’t added it to the report yet, evidence in the ShedSkin Group suggests it gets closer to the non-complex-number version (i.e. you don’t have to do more work but you get a nice speed boost whilst still using complex numbers).

Update (Nov 2011) – Antonio and Armin posted a note which explains some of the slowness in PyPy and shows how it is competitive, under the right conditions. Armin also contributed a C version which shows PyPy to run as fast as C (for their chosen configuration).



10 Comments | Tags: ArtificialIntelligence, Python