Allergic Rhinitis (“Why do I always sneeze?!”) research project using Machine Learning

Since April my wife (@fluffyemily) and I have been running a research project around her allergies. She sneezes all year and we’re trying to figure out the cause. Allergic Rhinitis affects 10-30% of Westerners; in Emily’s case it is year-round, so it isn’t just pollen-related. We figure that a good data-collection process coupled with robust analysis might reveal some of the causes of her sneezing, so that Emily is in better control of her Rhinitis.

Emily’s a senior iOS developer with Mozilla and she wrote an open source app for her iPhone to log her sneezes, antihistamine use and interactions with “things” like animals. The app gives us a time-stamp and geolocation for each event. Since she’s mostly in London we’ve got a rich source of events to join to other datasets.

This post is just to put down a marker. I’ve made some progress using Machine Learning to predict when an antihistamine might be used. Currently I can out-predict a Dummy (majority-class) classifier across many cross-validation runs. This is hardly brilliant, but we didn’t expect diagnosing a long-term allergy to be a simple affair! Exploratory data analysis shows lots of interesting behaviours and I hope to talk about some of these in the future.
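To make the baseline comparison concrete, here is a minimal sketch in plain Python. It is not the project’s code: the humidity values and antihistamine labels are made up, `fit_threshold` is a hypothetical one-feature rule standing in for whatever model is really being used, and the toy data is deliberately easy to separate. The point is only the shape of the check: score a majority-class predictor and a candidate model on the same cross-validation folds and compare the averages.

```python
from collections import Counter
from statistics import mean


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over n rows."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def fit_majority(X_train, y_train):
    """Dummy baseline: always predict the most common training label."""
    majority = Counter(y_train).most_common(1)[0][0]
    return lambda x: majority


def fit_threshold(X_train, y_train):
    """Hypothetical model: split one feature at the midpoint of the class means."""
    pos = [x for x, y in zip(X_train, y_train) if y == 1]
    neg = [x for x, y in zip(X_train, y_train) if y == 0]
    if not pos or not neg:
        # Only one class seen in training, so fall back to the baseline.
        return fit_majority(X_train, y_train)
    cut = (mean(pos) + mean(neg)) / 2
    high_label = 1 if mean(pos) > mean(neg) else 0
    return lambda x: high_label if x > cut else 1 - high_label


def cross_val_accuracy(fit, X, y, k=5):
    """Mean accuracy of the predictor returned by fit() over k folds."""
    scores = []
    for train, test in k_fold_indices(len(y), k):
        predict = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(X[i]) == y[i] for i in test)
        scores.append(hits / len(test))
    return mean(scores)


# Synthetic data (NOT real measurements): daily humidity and whether an
# antihistamine was taken that day (1 = yes, 0 = no).
humidity = [40, 85, 45, 50, 90, 55, 42, 88, 60, 48, 52, 92, 47, 58, 86]
took_antihistamine = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1]

baseline_acc = cross_val_accuracy(fit_majority, humidity, took_antihistamine)
model_acc = cross_val_accuracy(fit_threshold, humidity, took_antihistamine)
print("baseline accuracy:", round(baseline_acc, 3))
print("model accuracy:", round(model_acc, 3))
```

A model is only worth keeping if it beats the baseline consistently across folds. With imbalanced labels, always predicting the common class can already look respectable on raw accuracy.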

We’ve tried (and so far rejected) airborne particulates as a reason for her allergies, using the LondonAir data from King’s College London (thanks!). Weather data from a local wunderground station is more promising (Emily seems to be a little sensitive to humidity and windspeed). I’ve recently started work on MyFitnessPal logged data (the Python 3.4 port was thankfully easy) to start to look at alcohol (a known histamine modifier) and possibly other food.
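Joining logged events to a weather feed mostly comes down to matching each event’s time-stamp to the nearest reading. The following is a rough sketch of one way to do that with the standard library. It is not the project’s pipeline: the function name `nearest_reading`, the field names, the one-hour `max_gap` and all of the times and values are assumptions made up for illustration.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def nearest_reading(event_time, reading_times, readings,
                    max_gap=timedelta(hours=1)):
    """Return the reading closest in time to event_time.

    reading_times must be sorted ascending and aligned with readings.
    Returns None when the closest reading is further away than max_gap.
    """
    pos = bisect_left(reading_times, event_time)
    # The nearest reading is either just before or just after the event.
    candidates = [i for i in (pos - 1, pos) if 0 <= i < len(reading_times)]
    if not candidates:
        return None
    best = min(candidates, key=lambda i: abs(reading_times[i] - event_time))
    if abs(reading_times[best] - event_time) > max_gap:
        return None
    return readings[best]


# Made-up hourly weather readings (NOT real station data).
reading_times = [
    datetime(2015, 6, 1, 9, 0),
    datetime(2015, 6, 1, 10, 0),
    datetime(2015, 6, 1, 11, 0),
]
readings = [
    {"humidity": 71, "wind_kph": 8},
    {"humidity": 65, "wind_kph": 14},
    {"humidity": 60, "wind_kph": 19},
]

# Made-up sneeze time-stamps, standing in for events logged by the app.
sneezes = [
    datetime(2015, 6, 1, 9, 20),
    datetime(2015, 6, 1, 10, 50),
    datetime(2015, 6, 1, 15, 0),
]

joined = [(t, nearest_reading(t, reading_times, readings)) for t in sneezes]
for when, weather in joined:
    print(when.isoformat(), weather)
```

Capping the gap means an event with no nearby reading is left unmatched rather than quietly paired with stale weather, which would otherwise blur any humidity or windspeed signal.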

Behind the scenes I’ve got a collaborative group (thanks Frank and Giles!) in Slack and a private GitHub repo, and I plan to talk a little about how this works. I think talking about ways we can collaborate on research projects has value; anything that helps us move on from just working in an office seems like a good idea.

If you’re interested in hearing updates about this project and maybe getting involved to log your own allergy data, join this email announce list. Your email will be kept private. I’ll just send you an email every now and again when we’ve made some progress (which will probably appear here) and when we need volunteers.

Ultimately we’d like to help predict the causes of allergies for other folk. We’ve been talking about this for around 2 years, and it is encouraging to see research like this pointing to the use of ML to predict and model the body’s behaviours.

Ian is a Chief Interim Data Scientist via his Mor Consulting. Sign up for Data Science tutorials in London and to hear about his data science thoughts and jobs. He lives in London, is walked by his high-energy Springer Spaniel and is a consumer of fine coffees.


  • Bob K.
    Your wife might also consider going to an immunologist (or allergist) and getting comprehensive blood tests for what is provoking the reaction. In my case I had all sorts of theories (pollens? moulds? ...) but the blood test showed moderate to high sensitivity to dust mite and nothing else. I'm now starting on immunotherapy/desensitisation therapy for dust mite.
    IAN - Hey Bob, many thanks, we've had mixed reports about an immunologist but I'm in favour of at least trying, given that we might learn something useful. Thanks for the piece of evidence :-)
  • Fabio Zadrozny
    Hi Ian, Well, not sure if it helps, but just thought I'd share my own case... for me, I discovered my allergy gets worse when I eat things that are done with milk (such as cheese), even though milk by itself didn't appear as an allergy for me (only dust) -- the only way for me to discover it was taking it out of my eating habits for some time... now when I eat it I know I'll probably be sneezing the next couple days ;)
  • Bryan Lott
    Was reading about light-sensitive rhinitis the other day... perhaps it's light sensitivity that's triggering the sneezing?
    IAN - agreed it might be a thing but, from what I understand, that happens during first exposure to light and doesn't apply to all the sneezes that occur all day (rather than at e.g. first-light).