After I spoke at DataScienceLondon in June I was given a set of paper references by a couple of people (the bulk came from Levente Török) – thanks to all. They’re listed below. Along the same lines I have one machine learning paper aimed at beginners to recommend (“A Few Useful Things to Know about Machine Learning” – Pedro Domingos). It gives a set of real-world examples to work from, which is useful for someone short on experience who wants to learn whilst avoiding some of the worst mistakes.
Selection of references in no particular order:
Rethinking LDA: Why priors matter (How to tune the hyperparameters which shouldn’t matter.)
Dynamic Topic Models and the Document Influence Model (in which they deal with the change of the hidden topics (HMM))
Semi-supervised topic model notes:
Bridging the large gap between topic models and the bag-of-words approach:
Beyond Bag of Words (presentation)
Collective Latent Dirichlet Allocation (might be useful for Tweet collections)
R packages (from Levente):
R Text Tools package (noted as most advanced package, website offline when I visited it)
Ian applies Data Science as an AI/Data Scientist for companies through ModelInsight and his Mor Consulting; sign up for Data Science tutorials in London. He also founded the image and text annotation API Annotate.io, lives in London and is a consumer of fine coffees.