Ian Ozsvald picture

This is Ian Ozsvald's blog (@IanOzsvald), I'm an entrepreneurial geek, a Data Science/ML/NLP/AI consultant, author of O'Reilly's High Performance Python book, co-organiser of PyDataLondon, a Pythonista, co-founder of ShowMeDo and also a Londoner. Here's a little more about me.

High Performance Python book with O'Reilly

View Ian Ozsvald's profile on LinkedIn

ModelInsight Data Science Consultancy London Protecting your bits. Open Rights Group

25 November 2014 - 19:11We’re running more Data Science Training in 2015 Q1 in London

A couple of weeks ago Bart and I ran two very successful training courses in London through my ModelInsight, one introduced data science using pandas and numpy to build a recommender engine, the second taught a two-day course on High Performance Python (and yes, that was somewhat based on my book with a lot of hands-on exercises). Based on feedback from those courses we’re looking to introduce up to 5 courses at the start of next year.

If you’d like to hear about our London data science training then sign-up to our (very low volume) announce list. I posted an anonymous survey onto the mailing list, if you’d like to give your vote to the courses we should run then jump over here (no sign-up, there’s only 1 question, there’s no commitment).

If you’d like to talk about these in person then you can find me (probably on-stage) co-running the PyDataLondon meetups.

Here’s the synopses for each of the proposed courses:

“Playing with data – pandas and matplotlib” (1 day)

Aimed at beginner Pythonista data scientists who want to load, manipulate and visualise data
We’ll use pandas with many practical exercises on different sorts of data (including messy data that needs fixing) to manipulate, visualise and join data. You’ll be able to work with your own data sets after this course, we’ll also look at other visualise tools like Seaborn and Bokeh. This will suit people who haven’t used pandas who want a practical introduction such as data journalists, engineers and semi-technical managers.

“Building a recommender system with Python” (1 day)

Aimed at intermediate Pythonistas who want to use pandas and numpy to build a working recommender engine, this covers both using data through to delivering a working data science product. You already know a little linear algebra and you’ve used numpy lightly, you want to see how to deploy a working data science product as a microservice (Flask) that could reliably be put into production.

“Statistics and Big Data using scikit-learn” (2 days)

Aimed at beginner/intermediate Pythonistas with some mathematical background and a desire to learn everyday statistics and to start with machine learning
Day 1 – Probability, distributions, Frequentist and Bayesian approaches, Inference and Regression, Experiment Design – part discussion and part practical
Day 2 – Applying these approaches with scikit-learn to everyday problems, examples may include (note *examples may change* this just gives a flavour) Bayesian spam detection, predicting political campaigns, quality testing, clustering, weather forecasting, tools will include Statsmodels and matplotlib.

“Hands on with Scikit-Learn” (5 days)

Aimed at intermediate Pythonistas who need a practical and comprehensive introduction to machine learning in Python, you’ve already got a basic statistical and linear algebra background
This course will cover all the terminology and stages that make up the machine learning pipeline and the fundamental skills needed to perform machine learning successfully. Aided by many hands on labs with Python scikit-learn the course will enable you to understand the basic concepts, become confident in applying the tools and techniques, and provide a firm foundation from which to dig deeper and explore more advanced methods.

“High Performance Python” (2 days)

Aimed at intermediate Pythonistas whose code is too slow
Day 1 – Profiling (CPU and RAM), compiling with Cython, using Numba, PyPy and Pythran (all the way through to using OpenMP)
Day 2 – Going multicore (multiprocessing) and multi-machine (IPython parallel), fitting more into RAM, probabilitistic counting, storage engines, Test Driven Development and several debugging exercises
A mix of theory and practical exercises, you’ll be able to use the main Python tools to confidently and reliably make your code run faster

Ian applies Data Science as an AI/Data Scientist for companies in ModelInsight, sign-up for Data Science tutorials in London. Historically Ian ran Mor Consulting. He also founded the image and text annotation API Annotate.io, co-authored SocialTies, programs Python, authored The Screencasting Handbook, lives in London and is a consumer of fine coffees.

1 Comment | Tags: Data science, Python