Entrepreneurial Geekiness

Ian is a London-based independent Chief Data Scientist who coaches teams, teaches and creates data products. More about Ian here.

Software Engineering for Data Scientists and Successfully Delivering Data Science Projects – 2 courses for May

In May I’m running 2 courses:

The first is aimed at data scientists who have had a bad or wobbly delivery and want to learn better ways to design projects, derisk their stages and deliver more reliably. After the course you’ll have a guide to writing sensible project specifications, a new process to follow and new tools to help your team write stronger data science solutions, and you’ll have had Q&A time to get your own questions answered.

The second is aimed at data scientists who lack software engineering experience and need practice in writing stronger, more readable and more trustworthy code, along with documentation, testing and collaborative practices, so that they can deliver useful production-ready systems. After the course you’ll write stronger code, you’ll be able to supportively critique the code of your colleagues and you’ll be able to have stronger conversations with the engineering team at your organisation.

Both courses use a slack channel which stays live after the training; past students are still asking questions and getting answers from other students. By attending you’ll have access to this resource, and we have healthy Q&A time in the courses to get all of your uncertainties resolved so you can take answers back to your office.

“One of the highlights from Ian’s Successfully Delivering Data Science Projects course was being introduced to the concept of a specialised project specification document. This provides a systematic framework to directly tackle numerous problems I have experienced when trying to move a project beyond an initial prototyping stage. I have now applied my own tailored specification document at my organisation and it immediately surfaced critical questions and issues that otherwise would not have been realised for months.” – Thomas Brown, Data Scientist at aire.io

Both are linked on my training page, which also has an email list where I first announce new courses.


Ian is a Chief Interim Data Scientist via his Mor Consulting. Sign-up for Data Science tutorials in London and to hear about his data science thoughts and jobs. He lives in London, is walked by his high energy Springer Spaniel and is a consumer of fine coffees.

Second Successfully Delivering Data Science Projects just over

I ran the second iteration of my Successfully Delivering Data Science Projects course last Friday for a happy group. We had a lovely day and good conversation has continued in the teaching slack over the weekend.

Topics covered included the design and derisking of data projects (not just machine learning), building a project plan, communicating effectively with non-data science stakeholders, cost/benefits analysis, tooling, hiring and the process of actually getting R&D models shipped and supported. The conversation was both similar and nicely different to the Q&A topics that came up in the previous iteration.

I haven’t got a date yet for the next iteration, but it’ll be in a couple of months. I may also run a course on Software Engineering for Data Scientists and another on Higher Performance Python, and possibly one on NLP with a colleague. You can sign up for notifications on my very low-volume mailing list if you’d like to hear when courses become available.

I’m doing research into which other topics might be most pressing in London – I’d be curious to know what you’d like to learn in the next 6 months if a course were run in London. The survey attached has 1 question with 1 choice and there’s no sign-up. There is a second question box to take your email (entirely optional) which will get you a discount code valid for the year.



On the Delivery of Data Science Projects – talk for Business, Analytics and Data Science meetup

Last night I spoke at Pivigo’s Business, Analytics and Data Science meetup (thanks for having me!). I covered the key points from my public training (Successfully Delivering Data Science Projects), tailored to the meetup’s audience, many of whom are earlier on in their data science journey.

Audience at Pivigo’s meetup

We covered:

  • Building a Project Specification
  • Improving your coding and project process
  • How your team’s process will improve if you contribute to open source projects rather than just consuming them
  • Python tools that will make your life easier

The slides are linked here. Various folk last night asked me about how they get their first data science job and we had some good conversations. I have a “jobs and thoughts” list with UK-based jobs, if you’re after a role in the UK then you might want to jump on this list (your email is never shared and it is always kept private).



New public course on Successfully Delivering Data Science Projects for March 1st

On Friday February 1st I ran my first Successfully Delivering Data Science Projects course, as part of my new plan to give more training this year. It went really well and I got to both teach and learn a lot from my students. We talked through best practice, project design, derisking strategies and communication plans, and we tried various new tools that’ll improve workflow. Conversation has continued in our private slack channel (which all attendees get access to).

The next iteration of Successfully Delivering Data Science Projects is online for March 1st and the course is already half sold out. If you’d like to improve your confidence around the successful delivery of Python data science projects, you’ll want to get a ticket soon. The material I teach is based on years of helping clients, from start-ups to corporates, to successfully deliver data science projects.

I’m really happy that the discursive format gave room for students to raise their own issues and to add recommendations for tools and books in addition to my own. We continued our conversations in the pub after whilst decompressing – there we got to dig into some of the hard topics (such as mental health, imposter syndrome and running open source projects) in a more relaxed setting.

The topics covered in the next iteration will include:

  • Building a Project Plan that derisks uncertainties and identifies expected deliverables, based on a well-understood problem and data set (but starting from…we don’t know what we have or really what we want!) – you take the project plan template away for use in your own projects
  • Scenarios based on real-world (and sometimes very difficult) experience that have to be solved in small teams
  • Team best practice with practical exercises covering coding standards, code reviews, testing (during R&D and in production) and retrospectives using tools such as nbdime, pandas profiling and discover-feature-relationships – you take away the solutions and a guide to running code reviews to support relentless quality improvements in your team’s solutions
  • Group discussion around the problems everyone faces, to be solved or moved forwards by everyone in the group (the group will have more experience than any single teacher)
  • A slack channel that lives during and after the course for continued support and discussion among the attendees

You’re welcome to get in contact if you have questions. Further announcements will be made on my low-volume training email list. I will also link to upcoming courses from my every-two-weeks data scientist jobs and thoughts email list.

 



“discover feature relationships” – new EDA tool

I’ve built a new Exploratory Data Analysis tool. I used it in a few presentations last year with the code on GitHub and have now (finally) published it to PyPI.

The goal is to quickly check, using machine learning (sklearn’s Random Forests), whether any column in a DataFrame predicts any other column. I’m interested in the question “what relationships exist in my data” – particularly if I’m working in an unknown domain and on new data. I’ve used this on client projects during the discovery phase to learn more about the sort of questions I should ask a client.
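To make the idea concrete, here is a minimal sketch of the general technique, not the tool’s own code or API. It assumes numeric columns only and treats every target as a regression problem; the function name `pairwise_predictability`, its parameters and the toy data are all hypothetical and chosen just for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score


def pairwise_predictability(df, n_estimators=50, cv=3, random_state=0):
    """Score how well each single column predicts each other column.

    Hypothetical helper (not part of any library): returns one row per
    (feature, target) pair with the mean cross-validated R^2 of a
    Random Forest trained on that one feature.
    """
    rows = []
    for target in df.columns:
        for feature in df.columns:
            if feature == target:
                continue
            # Use only the rows where both columns are present
            pair = df[[feature, target]].dropna()
            model = RandomForestRegressor(
                n_estimators=n_estimators, random_state=random_state
            )
            scores = cross_val_score(
                model, pair[[feature]], pair[target], cv=cv, scoring="r2"
            )
            rows.append(
                {"feature": feature, "target": target, "score": scores.mean()}
            )
    return pd.DataFrame(rows)


# Toy data (an assumption for the demo): x determines x_squared,
# but x_squared does not determine the sign of x; noise is unrelated.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=300)
demo = pd.DataFrame(
    {"x": x, "x_squared": x ** 2, "noise": rng.normal(size=300)}
)

results = pairwise_predictability(demo)
# Rows are targets, columns are the single feature used to predict them
print(results.pivot(index="target", columns="feature", values="score").round(2))
```

Scoring each ordered pair separately means the result need not be symmetric: in the toy data `x` should predict `x_squared` well while the reverse direction scores poorly, which is the kind of relationship a plain correlation matrix would miss.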

The GitHub Readme includes a screenshot which will give you an idea using the Titanic classification and Boston regression examples.

This is a very light project at the moment. I think the idea has value and I’m very open to feedback.

