A first approach to automatic text data cleaning
In October I gave the opening keynote at PyConIreland on The Real Unsolved Problems in Data Science. One of the topics I covered was poor quality data, by some estimates data cleaning occupies 50-80% of a data scientist’s time. Personally I’ve just spent the better part of last year figuring out ways to convert poorly-represented […]