:description: lxml - the most feature-rich and easy-to-use library for processing XML and HTML in the Python language
:keywords: Python XML, XML processing, HTML, lxml, simple XML, ElementTree, etree, lxml.etree, objectify, XML parsing, XML validation, XPath, XSLT

| `» lxml takes all the pain out of XML. « <https://mailman-mail5.webfaction.com/pipermail/lxml/20080131/019119.html>`_

lxml is the most feature-rich
and easy-to-use library
for processing XML and HTML
in the Python language.

The lxml XML toolkit is a Pythonic binding for the C libraries
libxml2_ and libxslt_. It is unique in that it combines the speed and
XML feature completeness of these libraries with the simplicity of a
native Python API that is mostly compatible with, but superior to, the
well-known ElementTree_ API. The latest release works with all CPython
versions from 2.7 to 3.8. See the introduction_ for more information
about the background and goals of the lxml project. Some common
questions are answered in the `FAQ <FAQ.html>`_.

.. _libxml2: http://xmlsoft.org/
.. _libxslt: http://xmlsoft.org/XSLT/
.. _introduction: intro.html

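
Because lxml.etree is mostly compatible with the ElementTree API, basic
tree-building and parsing code can usually run against either library.
The following is a minimal sketch of that overlap. The element names and
the fallback import are illustrative assumptions, and only calls that
exist in both libraries are used:

```python
# Prefer lxml.etree, but fall back to the standard library's ElementTree,
# which offers the same core calls used in this sketch.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Build a small tree with the ElementTree-style API.
root = etree.Element("library")
book = etree.SubElement(root, "book", id="b1")
title = etree.SubElement(book, "title")
title.text = "XML in Practice"

# Serialise the tree to bytes and parse the result back.
data = etree.tostring(root)
parsed = etree.fromstring(data)

# Navigate the parsed tree: iter() walks all descendants,
# findall() matches direct children by tag name.
titles = [t.text for t in parsed.iter("title")]
book_ids = [b.get("id") for b in parsed.findall("book")]
```

Features that go beyond this common subset, such as full XPath or XSLT,
are only available when the lxml import succeeds.
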
The complete lxml documentation is available for download as `PDF
documentation`_. The HTML documentation from this web site is part of
the normal `source download <#download>`_.

* the `lxml.etree tutorial for XML processing`_
* John Shipman's tutorial on `Python XML processing with lxml`_
* Fredrik Lundh's `tutorial for ElementTree`_
* compatibility_ and differences of lxml.etree
* `ElementTree performance`_ characteristics and comparison
* `lxml.etree specific API`_ documentation
* the `generated API documentation`_ as a reference
* parsing_ and validating_ XML
* `XPath and XSLT`_ support
* Python `XPath extension functions`_ for XPath and XSLT
* `custom XML element classes`_ for custom XML APIs (see `EuroPython 2008 talk`_)
* a `SAX compliant API`_ for interfacing with other XML tools
* a `C-level API`_ for interfacing with external C/Cython modules
* `lxml.objectify`_ API documentation
* a brief comparison of `objectify and etree`_

lxml.etree follows the ElementTree_ API as much as possible, building
it on top of the native libxml2 tree. If you are new to ElementTree,
start with the `lxml.etree tutorial for XML processing`_. See also the
ElementTree compatibility_ overview and the `ElementTree performance`_
page comparing lxml to the original ElementTree_ and cElementTree_
implementations.

After the `lxml.etree tutorial for XML processing`_ and the
ElementTree_ documentation, the next place to look is the `lxml.etree
specific API`_ documentation. It describes how lxml extends the
ElementTree API to expose libxml2 and libxslt specific XML
functionality, such as XPath_, `Relax NG`_, `XML Schema`_, XSLT_, and
`c14n`_ (including `c14n 2.0`_).
Python code can be called from XPath expressions and XSLT
stylesheets through the use of `XPath extension functions`_. lxml
also offers a `SAX compliant API`_ that works with the SAX support in
the standard library.

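
As a rough illustration of these extensions, the sketch below evaluates
an XPath expression, applies an XSLT stylesheet, and registers a Python
function for use inside XPath. The sample document, the stylesheet, the
``demo`` prefix, the namespace URI and the ``shout`` function are all
made up for this example:

```python
from lxml import etree

doc = etree.fromstring(
    "<catalog>"
    "<item price='4'>pen</item>"
    "<item price='12'>book</item>"
    "</catalog>"
)

# XPath: select the text of all items that cost more than 5.
expensive = doc.xpath("/catalog/item[@price > 5]/text()")

# XSLT: turn each <item> into an <li> inside a <ul>.
stylesheet = etree.fromstring(
    '<xsl:stylesheet version="1.0" '
    'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
    '<xsl:template match="/catalog">'
    '<ul><xsl:for-each select="item">'
    '<li><xsl:value-of select="."/></li>'
    '</xsl:for-each></ul>'
    '</xsl:template>'
    '</xsl:stylesheet>'
)
transform = etree.XSLT(stylesheet)
result = transform(doc)
list_items = [li.text for li in result.getroot()]

# XPath extension function: expose a Python callable under a
# (hypothetical) namespace and give that namespace a global prefix.
ns = etree.FunctionNamespace("http://example.org/demo-functions")
ns.prefix = "demo"


def shout(context, text):
    """Return the given string in upper case."""
    return text.upper()


ns["shout"] = shout
shouted = doc.xpath("demo:shout(string(/catalog/item[1]))")
```

Here ``expensive`` holds the text of the second item, ``list_items`` the
text of both generated ``<li>`` elements, and ``shouted`` the upper-cased
text of the first item.
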
There is a separate module `lxml.objectify`_ that implements a data-binding
API on top of lxml.etree. See the `objectify and etree`_ FAQ entry for a
comparison.

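
To give a feel for the data-binding style, here is a small sketch using
``lxml.objectify``. The ``order`` document and its field names are
invented for illustration:

```python
from lxml import objectify

root = objectify.fromstring(
    "<order>"
    "<customer>Ada</customer>"
    "<quantity>3</quantity>"
    "<price>2.5</price>"
    "</order>"
)

# Child elements are reachable as attributes of their parent.
customer = root.customer.text

# Numeric-looking text is exposed as a number-like element,
# so it can take part in arithmetic directly.
quantity = root.quantity + 1
total = root.quantity * root.price
```

With lxml.etree, the same values would be read as strings via ``find()``
and ``.text`` and converted explicitly.
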
In addition to the ElementTree API, lxml also features a sophisticated
API for `custom XML element classes`_. This is a simple way to write
arbitrary XML-driven APIs on top of lxml. lxml.etree also has a
`C-level API`_ that can be used to efficiently extend lxml.etree in
external C modules, including fast custom element class support.

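
A custom element class can be sketched as follows. The
``TemperatureElement`` class, the ``TemperatureLookup`` scheme and the
``reading`` document are hypothetical names chosen for this example;
other lookup schemes exist and may fit better in real code:

```python
from lxml import etree


class TemperatureElement(etree.ElementBase):
    """Element proxy that adds temperature helpers to <temperature>."""

    @property
    def celsius(self):
        return float(self.text)

    @property
    def fahrenheit(self):
        return self.celsius * 9 / 5 + 32


class TemperatureLookup(etree.CustomElementClassLookup):
    """Pick the custom class for <temperature>, the default otherwise."""

    def lookup(self, node_type, document, namespace, name):
        if node_type == "element" and name == "temperature":
            return TemperatureElement
        return None  # let lxml fall back to its default class


# The lookup is attached to a parser, so it applies to every
# document parsed with that parser.
parser = etree.XMLParser()
parser.set_element_class_lookup(TemperatureLookup())

root = etree.fromstring(
    "<reading><temperature>100</temperature></reading>", parser
)
temp = root[0]
fahrenheit = temp.fahrenheit
```

Only the ``<temperature>`` element is wrapped in the custom class; the
``<reading>`` root keeps the default element class.
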
.. _ElementTree: http://effbot.org/zone/element-index.htm
.. _`ElementTree API`: http://effbot.org/zone/element-index.htm#documentation
.. _cElementTree: http://effbot.org/zone/celementtree.htm

.. _`tutorial for ElementTree`: http://effbot.org/zone/element.htm
.. _`lxml.etree tutorial for XML processing`: tutorial.html
.. _`Python XML processing with lxml`: http://www.nmt.edu/tcc/help/pubs/pylxml/
.. _`generated API documentation`: api/index.html
.. _`ElementTree performance`: performance.html
.. _`compatibility`: compatibility.html
.. _`lxml.etree specific API`: api.html
.. _`parsing`: parsing.html
.. _`validating`: validation.html
.. _`XPath and XSLT`: xpathxslt.html
.. _`XPath extension functions`: extensions.html
.. _`custom XML element classes`: element_classes.html
.. _`SAX compliant API`: sax.html
.. _`C-level API`: capi.html
.. _`lxml.objectify`: objectify.html
.. _`objectify and etree`: FAQ.html#what-is-the-difference-between-lxml-etree-and-lxml-objectify
.. _`EuroPython 2008 talk`: s5/lxml-ep2008.html

.. _XPath: https://www.w3.org/TR/xpath/
.. _`Relax NG`: https://relaxng.org/
.. _`XML Schema`: https://www.w3.org/XML/Schema
.. _`XSLT`: https://www.w3.org/TR/xslt
.. _`c14n`: https://www.w3.org/TR/xml-c14n
.. _`c14n 2.0`: https://www.w3.org/TR/xml-c14n2

The best way to download lxml is to visit `lxml at the Python Package
Index <http://pypi.python.org/pypi/lxml/>`_ (PyPI). It has the source
that compiles on various platforms. The source distribution is signed
with `this key <pubkey.asc>`_.

The latest version is `lxml 4.5.1`_, released 2020-05-19
(`changes for 4.5.1`_). `Older versions <#old-versions>`_
are listed below.

Please take a look at the
`installation instructions <installation.html>`_!

This complete web site (including the generated API documentation) is
part of the source distribution, so if you want to download the
documentation for offline use, take the source archive and copy the
``doc/html`` directory out of the source tree, or use the
`PDF documentation`_.

The latest `installable developer sources <https://github.com/lxml/lxml/archive/master.zip>`_
are available from GitHub. It is also possible to check out
the latest development version of lxml from GitHub directly, using a command
like this (assuming you use hg and have hg-git installed)::

  hg clone git+ssh://git@github.com/lxml/lxml.git lxml

Alternatively, if you use git, this should work as well::

  git clone https://github.com/lxml/lxml.git lxml

You can browse the `source repository`_ and its history through
the web. Please read `how to build lxml from source <build.html>`_
first. The `latest CHANGES`_ of the developer version are also
accessible. You can check there if a bug you found has been fixed
or a feature you want has been implemented in the latest trunk version.

.. _`source repository`: https://github.com/lxml/lxml/
.. _`latest CHANGES`: https://github.com/lxml/lxml/blob/master/CHANGES.txt

Questions? Suggestions? Code to contribute? We have a `mailing list`_.

You can search the archive with Gmane_ or Google_.

.. _`mailing list`: http://lxml.de/mailinglist/
.. _Gmane: http://blog.gmane.org/gmane.comp.python.lxml.devel
.. _Google: http://www.google.com/webhp?q=site:comments.gmane.org%2Fgmane.comp.python.lxml.devel+

lxml uses the `Launchpad bug tracker`_. If you are sure you found a
bug in lxml, please file a bug report there. If you are not sure
whether some unexpected behaviour of lxml is a bug or not, please
check the documentation and ask on the `mailing list`_ first. Do not
forget to search the archive (e.g. with Gmane_)!

.. _`Launchpad bug tracker`: https://launchpad.net/lxml/

The lxml library is shipped under a `BSD license`_. libxml2 and libxslt
themselves are shipped under the `MIT license`_. There should therefore be no
obstacle to using lxml in your codebase.

.. _`BSD license`: https://github.com/lxml/lxml/blob/master/doc/licenses/BSD.txt
.. _`MIT license`: http://www.opensource.org/licenses/mit-license.html

See the websites of lxml
`4.4 <http://lxml.de/4.4/>`_,
`4.3 <http://lxml.de/4.3/>`_,
`4.2 <http://lxml.de/4.2/>`_,
`4.1 <http://lxml.de/4.1/>`_,
`4.0 <http://lxml.de/4.0/>`_,
`3.8 <http://lxml.de/3.8/>`_,
`3.7 <http://lxml.de/3.7/>`_,
`3.6 <http://lxml.de/3.6/>`_,
`3.5 <http://lxml.de/3.5/>`_,
`3.4 <http://lxml.de/3.4/>`_,
`3.3 <http://lxml.de/3.3/>`_,
`3.2 <http://lxml.de/3.2/>`_,
`3.1 <http://lxml.de/3.1/>`_,
`3.0 <http://lxml.de/3.0/>`_,
`2.3 <http://lxml.de/2.3/>`_,
`2.2 <http://lxml.de/2.2/>`_,
`2.1 <http://lxml.de/2.1/>`_,
`2.0 <http://lxml.de/2.0/>`_,
`1.3 <http://lxml.de/1.3/>`_
and the `latest in-development version <http://lxml.de/dev/>`_.

.. _`PDF documentation`: lxmldoc-4.5.1.pdf

* `lxml 4.5.1`_, released 2020-05-19 (`changes for 4.5.1`_)
* `lxml 4.5.0`_, released 2020-01-29 (`changes for 4.5.0`_)
* `lxml 4.4.3`_, released 2020-01-28 (`changes for 4.4.3`_)
* `lxml 4.4.2`_, released 2019-11-25 (`changes for 4.4.2`_)
* `lxml 4.4.1`_, released 2019-08-11 (`changes for 4.4.1`_)
* `lxml 4.4.0`_, released 2019-07-27 (`changes for 4.4.0`_)
* `older releases <http://lxml.de/4.3/#old-versions>`_

.. _`lxml 4.5.1`: /files/lxml-4.5.1.tgz
.. _`lxml 4.5.0`: /files/lxml-4.5.0.tgz
.. _`lxml 4.4.3`: /files/lxml-4.4.3.tgz
.. _`lxml 4.4.2`: /files/lxml-4.4.2.tgz
.. _`lxml 4.4.1`: /files/lxml-4.4.1.tgz
.. _`lxml 4.4.0`: /files/lxml-4.4.0.tgz

.. _`changes for 4.5.1`: /changes-4.5.1.html
.. _`changes for 4.5.0`: /changes-4.5.0.html
.. _`changes for 4.4.3`: /changes-4.4.3.html
.. _`changes for 4.4.2`: /changes-4.4.2.html
.. _`changes for 4.4.1`: /changes-4.4.1.html
.. _`changes for 4.4.0`: /changes-4.4.0.html