Making “from lxml import etree” work with virtualenv (Python)

Update – these steps are overly complicated and *unnecessary*! See fizyk and Marius’ comments below. I’ll leave this post up in case it helps anyone; hopefully anyone arriving here will realise that installing lxml is no longer hard, as long as the OS dependencies are installed.

I use virtualenv for all development. Recently I needed the lxml module and was stumped: installing it inside a virtualenv on Linux takes a bit of work.

Let’s see the problem first:

$ virtualenv testlibxml
 New python executable in testlibxml/bin/python
 Installing distribute.............................................................................................................................................................................................done.
 Installing pip...............done.
.../virtualenvs/testlibxml $ source bin/activate
$ pip install lxml
gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/home/ian/workspace/virtualenvs/testlibxml/build/lxml/src/lxml/includes -I/usr/include/python2.7 -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o
In file included from src/lxml/lxml.etree.c:254:0:
/home/ian/workspace/virtualenvs/testlibxml/build/lxml/src/lxml/includes/etree_defs.h:9:31: fatal error: libxml/xmlversion.h: No such file or directory
compilation terminated.
error: command 'gcc' failed with exit status 1
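
The fatal error above means the compiler could not find the libxml2 development headers. As a rough sketch (not part of the original steps), the following Python snippet looks for the missing header in a few common include directories. The directory list and the find_header helper are assumptions for illustration; on Debian/Ubuntu the libxml2-dev package normally places the header under /usr/include/libxml2, but other systems may differ.

```python
import os

# Assumed search locations; Debian/Ubuntu's libxml2-dev normally installs
# its headers under /usr/include/libxml2. Adjust for other systems.
DEFAULT_INCLUDE_DIRS = [
    "/usr/include/libxml2",
    "/usr/local/include/libxml2",
    "/usr/include",
    "/usr/local/include",
]


def find_header(relative_path, include_dirs=DEFAULT_INCLUDE_DIRS):
    """Return the first existing path for the header, or None if absent."""
    for directory in include_dirs:
        candidate = os.path.join(directory, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # lxml builds against both libxml2 and libxslt, so check one header of each.
    for header in ("libxml/xmlversion.h", "libxslt/xslt.h"):
        found = find_header(header)
        print("%-22s %s" % (header, found or "NOT FOUND (dev package missing?)"))
```

If either header is reported as not found, installing the OS development packages is likely the simpler fix, as the comments below suggest.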

Following these instructions (and taking care to follow the steps for *both* libxml2 and libxml, further below), I run configure with the Python path changed to point at my local virtualenv:

./configure --with-python=/home/ian/workspace/virtualenvs/testlibxml/bin/python

And now we can start Python and import libxml2:

(testlibxml)ian@ian-Latitude-E6420 ~/workspace/virtualenvs/testlibxml $ python
 Python 2.7.3 (default, Aug  1 2012, 05:14:39)
 [GCC 4.6.3] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import libxml2 # works
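
To check the goal in the title, that "from lxml import etree" works inside the environment, a small script along these lines can be run with the virtualenv's python. It is only a sketch: the in_virtualenv and lxml_versions helper names are made up for illustration, and the virtualenv detection is a best-effort heuristic. The version attributes (LXML_VERSION, LIBXML_VERSION, LIBXSLT_VERSION) are exposed by lxml.etree.

```python
import sys


def in_virtualenv():
    """Best-effort check that the interpreter belongs to a virtualenv."""
    # Old virtualenv releases set sys.real_prefix; newer ones (and venv)
    # make sys.base_prefix differ from sys.prefix.
    return hasattr(sys, "real_prefix") or (
        getattr(sys, "base_prefix", sys.prefix) != sys.prefix
    )


def lxml_versions():
    """Return lxml/libxml2/libxslt versions, or None if lxml is not importable."""
    try:
        from lxml import etree
    except ImportError:
        return None
    return {
        "lxml": etree.LXML_VERSION,
        "libxml2": etree.LIBXML_VERSION,
        "libxslt": etree.LIBXSLT_VERSION,
    }


if __name__ == "__main__":
    print("virtualenv active: %s" % in_virtualenv())
    print("lxml versions: %s" % (lxml_versions(),))
```

A result of None for the versions means the import failed, which usually points back to a failed build or to running the system python instead of the one in the virtualenv.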

Ian is a Chief Interim Data Scientist via his Mor Consulting. Sign-up for Data Science tutorials in London and to hear about his data science thoughts and jobs. He lives in London, is walked by his high energy Springer Spaniel and is a consumer of fine coffees.

3 Comments

  • Can't you just run sudo apt-get build-dep python-lxml followed by pip install lxml?
  • If I remember correctly, installing the header files for libxml and libxslt (libxml2-dev for libxml) would let you install lxml without following these instructions, simply by using pip install lxml out of the box.
  • That was the first thing I tried! I had errors and then followed various processes before getting to the above. I've just tried a new (fresh) virtualenv and it worked as you suggested. Hmmm, maybe I missed a dependency at first? I'll update the blog title to say this is a bit unnecessary - thanks!