:description: lxml - the most feature-rich and easy-to-use library for processing XML and HTML in the Python language
:keywords: Python XML, XML processing, HTML, lxml, simple XML, ElementTree, etree, lxml.etree, objectify, XML parsing, XML validation, XPath, XSLT

| `» lxml takes all the pain out of XML. « <http://thread.gmane.org/gmane.comp.python.lxml.devel/3252/focus=3258>`_

lxml is the most feature-rich
and easy-to-use library
for processing XML and HTML
in the Python language.

The lxml XML toolkit is a Pythonic binding for the C libraries
libxml2_ and libxslt_. It is unique in that it combines the speed and
XML feature completeness of these libraries with the simplicity of a
native Python API, mostly compatible with but superior to the well-known
ElementTree_ API. The latest release works with all CPython versions
from 2.4 to 3.2. See the introduction_ for more information about
the background and goals of the lxml project. Some common questions are
answered in the `FAQ <FAQ.html>`_.

.. _libxml2: http://xmlsoft.org/
.. _libxslt: http://xmlsoft.org/XSLT/
.. _introduction: intro.html
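
As a rough illustration of what ElementTree compatibility means in
practice, the following minimal sketch runs unchanged on either
lxml.etree or the standard library's ElementTree. The sample document
and the variable names are made up for this example.

.. code-block:: python

    # Prefer lxml.etree; fall back to the standard library's ElementTree,
    # which provides the same basic API used in this sketch.
    try:
        from lxml import etree
    except ImportError:
        import xml.etree.ElementTree as etree

    # Parse a small, hypothetical document from a string.
    root = etree.fromstring(
        "<library><book id='1'>Dune</book><book id='2'>Emma</book></library>"
    )

    # Elements act like lists of their children and expose attributes
    # through a dict-like get() method.
    titles = [book.text for book in root.findall("book")]
    first_id = root[0].get("id")

    # Append a new child element and serialise the tree back to bytes.
    new_book = etree.SubElement(root, "book", id="3")
    new_book.text = "Ulysses"
    serialised = etree.tostring(root)

The ``try``/``except`` import only serves to show that the same calls
work with both libraries; code that needs lxml-specific features would
import ``lxml.etree`` directly.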

The complete lxml documentation is available for download as `PDF
documentation`_. The HTML documentation from this web site is part of
the normal `source download <#download>`_.

* the `lxml.etree tutorial for XML processing`_
* John Shipman's tutorial on `Python XML processing with lxml`_
* Fredrik Lundh's `tutorial for ElementTree`_
* compatibility_ and differences of lxml.etree
* `ElementTree performance`_ characteristics and comparison
* `lxml.etree specific API`_ documentation
* the `generated API documentation`_ as a reference
* parsing_ and validating_ XML
* `XPath and XSLT`_ support
* Python `XPath extension functions`_ for XPath and XSLT
* `custom XML element classes`_ for custom XML APIs (see `EuroPython 2008 talk`_)
* a `SAX compliant API`_ for interfacing with other XML tools
* a `C-level API`_ for interfacing with external C/Cython modules
* `lxml.objectify`_ API documentation
* a brief comparison of `objectify and etree`_

lxml.etree follows the ElementTree_ API as much as possible, building
it on top of the native libxml2 tree. If you are new to ElementTree,
start with the `lxml.etree tutorial for XML processing`_. See also the
ElementTree compatibility_ overview and the `ElementTree performance`_
page comparing lxml to the original ElementTree_ and cElementTree_
implementations.

Right after the `lxml.etree tutorial for XML processing`_ and the
ElementTree_ documentation, the next place to look is the `lxml.etree
specific API`_ documentation. It describes how lxml extends the
ElementTree API to expose libxml2 and libxslt specific XML
functionality, such as XPath_, `Relax NG`_, `XML Schema`_, XSLT_, and
`c14n`_. Python code can be called from XPath expressions and XSLT
stylesheets through the use of `XPath extension functions`_. lxml
also offers a `SAX compliant API`_ that works with the SAX support in
the standard library.
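
To give a feel for two of the extensions mentioned above, here is a
minimal sketch that evaluates XPath expressions and applies an XSLT
stylesheet. The sample document and the stylesheet are invented for
illustration, and the example assumes that lxml is installed.

.. code-block:: python

    from lxml import etree

    # Hypothetical sample document.
    doc = etree.XML("<root><item price='3'/><item price='5'/></root>")

    # XPath: evaluate expressions directly against an element.
    prices = doc.xpath("//item/@price")      # list of attribute values
    total = doc.xpath("sum(//item/@price)")  # XPath numbers arrive as floats

    # XSLT: compile a stylesheet once, then call it like a function.
    stylesheet = etree.XML("""
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/">
        <count><xsl:value-of select="count(//item)"/></count>
      </xsl:template>
    </xsl:stylesheet>
    """)
    transform = etree.XSLT(stylesheet)
    result = transform(doc)
    count_text = result.getroot().text

The compiled ``transform`` object can be reused for any number of
input documents.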

There is a separate module `lxml.objectify`_ that implements a data-binding
API on top of lxml.etree. See the `objectify and etree`_ FAQ entry for a
comparison.
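
The data-binding idea can be sketched as follows: child elements are
reached through attribute access, and leaf values are mapped to Python
types. The document structure and tag names below are hypothetical,
and the example assumes that lxml is installed.

.. code-block:: python

    from lxml import objectify

    # Hypothetical sample document.
    root = objectify.fromstring(
        "<order><number>42</number><item>pen</item><item>ink</item></order>"
    )

    # Numeric leaf content is exposed as a Python number.
    number_value = root.number.pyval
    next_number = root.number + 1

    # Repeated children are reachable by index on the first sibling.
    item_count = len(root.item)
    second_item = root.item[1].text
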

In addition to the ElementTree API, lxml also features a sophisticated
API for `custom XML element classes`_. This is a simple way to write
arbitrary XML-driven APIs on top of lxml. lxml.etree also has a
`C-level API`_ that can be used to efficiently extend lxml.etree in
external C modules, including fast custom element class support.
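
As a small sketch of the custom element class mechanism, the example
below makes a parser return instances of a user-defined class. The
class name ``FlagElement``, its ``enabled`` property and the sample
document are made up for this illustration, and the example assumes
that lxml is installed.

.. code-block:: python

    from lxml import etree

    class FlagElement(etree.ElementBase):
        """Hypothetical element class adding a convenience property."""

        @property
        def enabled(self):
            # Interpret the 'enabled' XML attribute as a boolean.
            return self.get("enabled") == "true"

    # Tell a parser to create FlagElement instances for all elements.
    lookup = etree.ElementDefaultClassLookup(element=FlagElement)
    parser = etree.XMLParser()
    parser.set_element_class_lookup(lookup)

    root = etree.fromstring(
        '<settings enabled="true"><cache enabled="false"/></settings>',
        parser,
    )
    root_enabled = root.enabled
    cache_enabled = root[0].enabled

Using one class for every element is the simplest lookup scheme; the
`custom XML element classes`_ documentation covers the other available
lookup schemes.
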

.. _ElementTree: http://effbot.org/zone/element-index.htm
.. _`ElementTree API`: http://effbot.org/zone/element-index.htm#documentation
.. _cElementTree: http://effbot.org/zone/celementtree.htm

.. _`tutorial for ElementTree`: http://effbot.org/zone/element.htm
.. _`lxml.etree tutorial for XML processing`: tutorial.html
.. _`Python XML processing with lxml`: http://www.nmt.edu/tcc/help/pubs/pylxml/
.. _`generated API documentation`: api/index.html
.. _`ElementTree performance`: performance.html
.. _`compatibility`: compatibility.html
.. _`lxml.etree specific API`: api.html
.. _`parsing`: parsing.html
.. _`validating`: validation.html
.. _`XPath and XSLT`: xpathxslt.html
.. _`XPath extension functions`: extensions.html
.. _`custom XML element classes`: element_classes.html
.. _`SAX compliant API`: sax.html
.. _`C-level API`: capi.html
.. _`lxml.objectify`: objectify.html
.. _`objectify and etree`: FAQ.html#what-is-the-difference-between-lxml-etree-and-lxml-objectify
.. _`EuroPython 2008 talk`: s5/lxml-ep2008.html

.. _XPath: http://www.w3.org/TR/xpath/
.. _`Relax NG`: http://www.relaxng.org/
.. _`XML Schema`: http://www.w3.org/XML/Schema
.. _`XSLT`: http://www.w3.org/TR/xslt
.. _`c14n`: http://www.w3.org/TR/xml-c14n

The best way to download lxml is to visit `lxml at the Python Package
Index`_ (PyPI). It has the source that compiles on various platforms.
The source distribution is signed with `this key`_. Binary builds for
MS Windows usually become available through PyPI a few days after a
source release. If you can't wait, consider trying a less recent
release version first.

The latest version is `lxml 2.3.5`_, released 2012-07-31
(`changes for 2.3.5`_). `Older versions`_ are listed below.

Please take a look at the `installation instructions`_!

This complete web site (including the generated API documentation) is
part of the source distribution, so if you want to download the
documentation for offline use, take the source archive and copy the
``doc/html`` directory out of the source tree.

It's also possible to check out the latest development version of lxml
from GitHub directly, using a command like this (assuming you use
Mercurial (hg) and have hg-git installed)::

  hg clone https://github.com/lxml/lxml.git lxml

You can also browse the `source repository`_ and its history through
the web. Please read `how to build lxml from source`_ first. The
`latest CHANGES`_ of the developer version are also accessible. You
can check there if a bug you found has been fixed or a feature you
want has been implemented in the latest trunk version.

.. _`lxml at the Python Package Index`: http://pypi.python.org/pypi/lxml/
.. _`this key`: pubkey.asc
.. _`Older versions`: #old-versions
.. _`installation instructions`: installation.html
.. _`how to build lxml from source`: build.html
.. _`source repository`: https://github.com/lxml/lxml/
.. _`latest CHANGES`: https://github.com/lxml/lxml/blob/master/CHANGES.txt

Questions? Suggestions? Code to contribute? We have a `mailing list`_.

You can search the archive with Gmane_ or Google_.

.. _`mailing list`: http://lxml.de/mailinglist/
.. _Gmane: http://blog.gmane.org/gmane.comp.python.lxml.devel
.. _Google: http://www.google.com/webhp?q=site:comments.gmane.org%2Fgmane.comp.python.lxml.devel+

lxml uses the `Launchpad bug tracker`_. If you are sure you found a
bug in lxml, please file a bug report there. If you are not sure
whether some unexpected behaviour of lxml is a bug or not, please
check the documentation and ask on the `mailing list`_ first. Do not
forget to search the archive (e.g. with Gmane_)!

.. _`launchpad bug tracker`: https://launchpad.net/lxml/

The lxml library is shipped under a `BSD license`_. libxml2 and libxslt
themselves are shipped under the `MIT license`_. There should therefore be no
obstacle to using lxml in your codebase.

.. _`BSD license`: https://github.com/lxml/lxml/blob/master/doc/licenses/BSD.txt
.. _`MIT license`: http://www.opensource.org/licenses/mit-license.html

See the web sites of lxml `1.3 <http://lxml.de/1.3/>`_, `2.0
<http://lxml.de/2.0/>`_, `2.1 <http://lxml.de/2.1/>`_, `2.2
<http://lxml.de/2.2/>`_ and the `latest in-development version
<http://lxml.de/dev/>`_.

.. _`PDF documentation`: lxmldoc-2.3.5.pdf

* `lxml 2.3.4`_, released 2012-03-26 (`changes for 2.3.4`_)
* `lxml 2.3.3`_, released 2012-01-04 (`changes for 2.3.3`_)
* `lxml 2.3.2`_, released 2011-11-11 (`changes for 2.3.2`_)
* `lxml 2.3.1`_, released 2011-09-25 (`changes for 2.3.1`_)
* `lxml 2.3`_, released 2011-02-06 (`changes for 2.3`_)
* `lxml 2.3beta1`_, released 2010-09-06 (`changes for 2.3beta1`_)
* `lxml 2.3alpha2`_, released 2010-07-24 (`changes for 2.3alpha2`_)
* `lxml 2.3alpha1`_, released 2010-06-19 (`changes for 2.3alpha1`_)
* `lxml 2.2.8`_, released 2010-09-02 (`changes for 2.2.8`_)
* `lxml 2.2.7`_, released 2010-07-24 (`changes for 2.2.7`_)
* `lxml 2.2.6`_, released 2010-03-02 (`changes for 2.2.6`_)
* `lxml 2.2.5`_, released 2010-02-28 (`changes for 2.2.5`_)
* `lxml 2.2.4`_, released 2009-11-11 (`changes for 2.2.4`_)
* `lxml 2.2.3`_, released 2009-10-30 (`changes for 2.2.3`_)
* `lxml 2.2.2`_, released 2009-06-21 (`changes for 2.2.2`_)
* `lxml 2.2.1`_, released 2009-06-02 (`changes for 2.2.1`_)
* `lxml 2.2`_, released 2009-03-21 (`changes for 2.2`_)
* `older releases <http://lxml.de/2.2/#old-versions>`_


.. _`lxml 2.3.5`: /files/lxml-2.3.5.tgz
.. _`lxml 2.3.4`: /files/lxml-2.3.4.tgz
.. _`lxml 2.3.3`: /files/lxml-2.3.3.tgz
.. _`lxml 2.3.2`: /files/lxml-2.3.2.tgz
.. _`lxml 2.3.1`: /files/lxml-2.3.1.tgz
.. _`lxml 2.3`: /files/lxml-2.3.tgz
.. _`lxml 2.3beta1`: /files/lxml-2.3beta1.tgz
.. _`lxml 2.3alpha2`: /files/lxml-2.3alpha2.tgz
.. _`lxml 2.3alpha1`: /files/lxml-2.3alpha1.tgz
.. _`lxml 2.2.8`: /files/lxml-2.2.8.tgz
.. _`lxml 2.2.7`: /files/lxml-2.2.7.tgz
.. _`lxml 2.2.6`: /files/lxml-2.2.6.tgz
.. _`lxml 2.2.5`: /files/lxml-2.2.5.tgz
.. _`lxml 2.2.4`: /files/lxml-2.2.4.tgz
.. _`lxml 2.2.3`: /files/lxml-2.2.3.tgz
.. _`lxml 2.2.2`: /files/lxml-2.2.2.tgz
.. _`lxml 2.2.1`: /files/lxml-2.2.1.tgz
.. _`lxml 2.2`: /files/lxml-2.2.tgz

.. _`changes for 2.3.5`: /changes-2.3.5.html
.. _`changes for 2.3.4`: /changes-2.3.4.html
.. _`changes for 2.3.3`: /changes-2.3.3.html
.. _`changes for 2.3.2`: /changes-2.3.2.html
.. _`changes for 2.3.1`: /changes-2.3.1.html
.. _`changes for 2.3`: /changes-2.3.html
.. _`changes for 2.3beta1`: /changes-2.3beta1.html
.. _`changes for 2.3alpha2`: /changes-2.3alpha2.html
.. _`changes for 2.3alpha1`: /changes-2.3alpha1.html
.. _`changes for 2.2.8`: /changes-2.2.8.html
.. _`changes for 2.2.7`: /changes-2.2.7.html
.. _`changes for 2.2.6`: /changes-2.2.6.html
.. _`changes for 2.2.5`: /changes-2.2.5.html
.. _`changes for 2.2.4`: /changes-2.2.4.html
.. _`changes for 2.2.3`: /changes-2.2.3.html
.. _`changes for 2.2.2`: /changes-2.2.2.html
.. _`changes for 2.2.1`: /changes-2.2.1.html
.. _`changes for 2.2`: /changes-2.2.html