ANN: twitter-text-python 1.0.0.2 release (Python Tweet parsing library)

A few weeks back I took over as maintainer of the twitter-text-python library (source on GitHub). This library lets you take a tweet like:

"@ianozsvald, you now support #IvoWertzel's tweet ...
parser! https://github.com/ianozsvald/"

and extract the Twitter entities as defined in the Twitter conformance tests. The entities in the above tweet would be:

  • reply: 'ianozsvald'
  • users: ['ianozsvald']
  • tags: ['IvoWertzel']
  • urls: ['https://github.com/ianozsvald/']
  • lists: []  # no lists in this tweet
  • output html: u'<a href="http://twitter.com/ianozsvald">@ianozsvald</a>, ...
      you now support <a href="http://search.twitter.com/search?q=%23IvoWertzel">#IvoWertzel</a>\'s
      tweet parser! <a href="https://github.com/ianozsvald/">https://github.com/ianozsvald/</a>'
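
To give a feel for what this kind of extraction involves, here is a minimal sketch using only the standard library. It is not the library's implementation or API: the regular expressions and the extract_entities helper are my own simplified assumptions, and they ignore most of the edge cases that the conformance tests cover (lists, unusual URL forms, non-ASCII hashtags and so on).

```python
import re

# Simplified, illustrative patterns (assumptions, far looser than the
# real Twitter conformance rules).
USER_RE = re.compile(r"(?<![\w@])@(\w{1,15})")
TAG_RE = re.compile(r"(?<!\w)#(\w+)")
URL_RE = re.compile(r"https?://\S+")
REPLY_RE = re.compile(r"\s*@(\w{1,15})")


def extract_entities(tweet):
    """Return a dict of entities found in the tweet (hypothetical helper)."""
    reply_match = REPLY_RE.match(tweet)  # a reply starts with an @mention
    return {
        "reply": reply_match.group(1) if reply_match else None,
        "users": USER_RE.findall(tweet),
        "tags": TAG_RE.findall(tweet),
        "urls": URL_RE.findall(tweet),
    }


tweet = ("@ianozsvald, you now support #IvoWertzel's tweet "
         "parser! https://github.com/ianozsvald/")
entities = extract_entities(tweet)
print(entities)
```

For real tweets you would want the library rather than hand-rolled patterns like these, since it is written against the conformance tests.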

If you’re parsing tweets or similar status updates (e.g. from App.net) in Python, this library makes it easy to extract @people, URLs and #hashtags. You can also request the spans (character locations) for each entity, which is very useful if you have repeated phrases and you’re doing a search/replace.
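
To show why spans help with repeated phrases, here is a small standard-library sketch. The helper names and the (text, (start, end)) tuple layout are assumptions for illustration only, not the library's documented output. A plain string replace would change every occurrence of a hashtag, whereas a span lets you rewrite exactly one.

```python
import re

# Simplified hashtag pattern (an assumption, not the conformance rule).
TAG_RE = re.compile(r"(?<!\w)#(\w+)")


def tags_with_spans(text):
    """Return (tag, (start, end)) pairs; the span covers the leading '#'."""
    return [(m.group(1), (m.start(), m.end())) for m in TAG_RE.finditer(text)]


def replace_span(text, span, replacement):
    """Replace only the characters inside span, leaving the rest untouched."""
    start, end = span
    return text[:start] + replacement + text[end:]


text = "#python tips: learn #python"
found = tags_with_spans(text)

# Rewrite only the second occurrence of the repeated hashtag.
_, second_span = found[1]
marked = replace_span(text, second_span, "<b>#python</b>")
print(found)
print(marked)
```

If you rewrite several entities in one pass, work from the last span to the first so that the earlier offsets stay valid.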

The library (MIT license) is easily installed from the Python Package Index using “$ pip install twitter-text-python”; it is currently at version 1.0.0.2.

Credit: the library was developed by Ivo Wertzel (BonsiaDan on GitHub). I merged a few pull requests after forking to fix some bugs and have now taken over official maintenance.


Ian applies Data Science as an AI/Data Scientist for companies through ModelInsight and his Mor Consulting. He also founded the image and text annotation API Annotate.io, lives in London and is a consumer of fine coffees.
