ANN: twitter-text-python release (Python Tweet parsing library)

A few weeks back I took over as maintainer of the twitter-text-python library (source on GitHub). This library lets you take a tweet like:

"@ianozsvald, you now support #IvoWertzel's tweet ...

and extract the Twitter entities as defined in the Twitter conformance tests. The entities in the above tweet would be:

  • reply: 'ianozsvald'
  • users: ['ianozsvald']
  • tags: ['IvoWertzel']
  • urls: ['']
  • lists: []  # no lists in this tweet
  • output html: u'<a href="">@ianozsvald</a>, ...
      you now support <a href="">#IvoWertzel</a>\'s
      tweet parser! <a href=""></a>'
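
To make the entity list above concrete, here is a minimal standard-library sketch of the same idea. It is not the library's code or API: the patterns are deliberately simplified, `extract_entities` is a hypothetical helper, and the tweet text is pieced together from the HTML output above with the URL left out. The real conformance rules cover many more edge cases.

```python
import re

# Simplified patterns, assumed for illustration only. The Twitter
# conformance tests also cover URLs, lists, Unicode and many boundary cases.
MENTION = re.compile(r"(?<![A-Za-z0-9_])@([A-Za-z0-9_]{1,15})")
HASHTAG = re.compile(r"(?<![A-Za-z0-9_&])#([A-Za-z0-9_]*[A-Za-z_][A-Za-z0-9_]*)")
REPLY = re.compile(r"^\s*@([A-Za-z0-9_]{1,15})")


def extract_entities(tweet):
    """Return the reply target, mentioned users and hashtags (hypothetical helper)."""
    reply = REPLY.match(tweet)
    return {
        "reply": reply.group(1) if reply else None,
        "users": MENTION.findall(tweet),
        "tags": HASHTAG.findall(tweet),
    }


entities = extract_entities("@ianozsvald, you now support #IvoWertzel's tweet parser!")
print(entities)
```

A tweet counts as a reply here only when the mention opens the text, which is why `reply` and `users` both contain the same name for this example.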

If you’re parsing Tweets or status-update-like entities in Python then this library makes it easy to extract @people, URLs and #hashtags. You can also request the spans (character locations) for each entity, which is very useful if you have repeated phrases and you’re doing a search/replace.
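
Why spans help with repeated phrases can be shown with a small standard-library sketch. Again this is an illustration under assumptions, not the library's implementation: the regular expression is simplified, `mention_spans` and `link_mentions` are hypothetical helpers, and the `/name` link target is made up.

```python
import re

# Simplified mention pattern, assumed for illustration only.
MENTION = re.compile(r"(?<![A-Za-z0-9_])@([A-Za-z0-9_]{1,15})")


def mention_spans(text):
    """Return (username, (start, end)) pairs; each span includes the leading '@'."""
    return [(m.group(1), m.span()) for m in MENTION.finditer(text)]


def link_mentions(text):
    """Wrap each mention in an anchor tag using its span.

    Replacements run right to left so earlier offsets stay valid
    while the string grows.
    """
    for name, (start, end) in reversed(mention_spans(text)):
        link = '<a href="/' + name + '">@' + name + "</a>"  # hypothetical URL scheme
        text = text[:start] + link + text[end:]
    return text


text = "ian thanks @ian, see ian@example.com"
print(mention_spans(text))
print(link_mentions(text))
```

A plain `text.replace("ian", ...)` would rewrite all three occurrences of "ian", including the one inside the email address. Working from the span touches only the characters that were recognised as a mention.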

The library (MIT license) is easily installed from the Python Package Index using “$ pip install twitter-text-python”.

Credit – the library was developed by Ivo Wertzel (BonsiaDan on GitHub). I merged a few pull requests after forking to fix some bugs and have now taken over official maintenance.

Ian is a Chief Interim Data Scientist via his Mor Consulting. He lives in London, is walked by his high energy Springer Spaniel and is a consumer of fine coffees.