CHANGES.txt

   1 ==============
   2 lxml changelog
   3 ==============
   4
   5 4.5.1 (2020-05-19)
   6 ==================
   7
   8 Bugs fixed
   9 ----------
  10
  11 * LP#1570388: Fix failures when serialising documents larger than 2GB in some cases.
  12
  13 * LP#1865141, GH#298: ``QName`` values were not accepted by the ``el.iter()`` method.
  14   Patch by xmo-odoo.
  15
  16 * LP#1863413, GH#297: The build failed to detect libraries on Linux that are only
  17   configured via pkg-config.
  18   Patch by Hugh McMaster.
  19
  20
  21 4.5.0 (2020-01-29)
  22 ==================
  23
  24 Features added
  25 --------------
  26
  27 * A new function ``indent()`` was added to insert tail whitespace for pretty-printing
  28   an XML tree.
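
A minimal sketch of how ``indent()`` might be used.  The fallback import is an
assumption added for illustration, so that the example also runs without lxml
on Python 3.9+, where the standard library offers an equivalent function.

```python
try:
    from lxml import etree  # indent() needs lxml 4.5.0 or later
except ImportError:
    # Assumption: fall back to the standard library's ElementTree,
    # which provides an equivalent indent() in Python 3.9+.
    import xml.etree.ElementTree as etree

root = etree.fromstring("<root><a><b/></a><c/></root>")

# indent() works in place: it rewrites the text and tail whitespace
# of the elements so that the serialised tree is pretty-printed.
etree.indent(root, space="  ")

print(repr(root.text))     # whitespace before the first child
print(repr(root[0].tail))  # whitespace between <a> and <c>
print(etree.tostring(root).decode())
```

Since the tree itself is modified, any later serialisation of ``root`` carries
the inserted whitespace.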
  29
  30 Bugs fixed
  31 ----------
  32
  33 * LP#1857794: Tail text of nodes that get removed from a document using item
  34   deletion disappeared silently instead of sticking with the node that was removed.
  35
  36 Other changes
  37 -------------
  38
  39 * MacOS builds are 64-bit-only by default.
  40   Set CFLAGS and LDFLAGS explicitly to override it.
  41
  42 * Linux/MacOS binary wheels now use libxml2 2.9.10 and libxslt 1.1.34.
  43
  44 * LP#1840234: The package version number is now available as ``lxml.__version__``.
  45
  46
  47 4.4.3 (2020-01-28)
  48 ==================
  49
  50 Bugs fixed
  51 ----------
  52
  53 * LP#1844674: ``itertext()`` was missing tail text of comments and PIs since 4.4.0.
  54
  55
  56 4.4.2 (2019-11-25)
  57 ==================
  58
  59 Bugs fixed
  60 ----------
  61
  62 * LP#1835708: ``ElementInclude`` incorrectly rejected repeated non-recursive
  63   includes as recursive.
  64   Patch by Rainer Hausdorf.
  65
  66
  67 4.4.1 (2019-08-11)
  68 ==================
  69
  70 Bugs fixed
  71 ----------
  72
  73 * LP#1838252: The order of an OrderedDict was lost in 4.4.0 when passing it as
  74   attrib mapping during element creation.
  75
  76 * LP#1838521: The package metadata now lists the supported Python versions.
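
The ``OrderedDict`` fix above can be checked with a short sketch like the
following.  The attribute names are made up for illustration, and the fallback
import is an assumption so that the example also runs without lxml.

```python
from collections import OrderedDict

try:
    from lxml import etree  # ordering is preserved again as of lxml 4.4.1
except ImportError:
    # Assumption: fall back to the standard library's ElementTree,
    # which stores the attributes in a regular (ordered) dict.
    import xml.etree.ElementTree as etree

# Hypothetical attribute names, deliberately not in alphabetical order.
attrs = OrderedDict([("zeta", "1"), ("alpha", "2"), ("mid", "3")])
element = etree.Element("item", attrs)

# The attribute order follows the mapping, not the alphabet.
order = list(element.attrib.keys())
print(order)
```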
  77
  78
  79 4.4.0 (2019-07-27)
  80 ==================
  81
  82 Features added
  83 --------------
  84
  85 * ``Element.clear()`` accepts a new keyword argument ``keep_tail=True`` to clear
  86   everything but the tail text.  This is helpful in some document-style use cases
  87   and for clearing the current element in ``iterparse()`` and pull parsing.
  88
  89 * When creating attributes or namespaces from a dict in Python 3.6+, lxml now
  90   preserves the original insertion order of that dict, instead of always sorting
  91   the items by name.  A similar change was made for ElementTree in CPython 3.8.
  92   See https://bugs.python.org/issue34160
  93
  94 * Integer elements in ``lxml.objectify`` implement the ``__index__()`` special method.
  95
  96 * GH#269: Read-only elements in XSLT were missing the ``nsmap`` property.
  97   Original patch by Jan Pazdziora.
  98
  99 * ElementInclude can now restrict the maximum inclusion depth via a ``max_depth``
 100   argument to prevent content explosion.  It is limited to 6 by default.
 101
 102 * The ``target`` object of the XMLParser can have ``start_ns()`` and ``end_ns()``
 103   callback methods to listen to namespace declarations.
 104
 105 * The ``TreeBuilder`` has new arguments ``comment_factory`` and ``pi_factory`` to
 106   pass factories for creating comments and processing instructions, as well as
 107   flag arguments ``insert_comments`` and ``insert_pis`` to discard them from the
 108   tree when set to false.
 109
 110 * A `C14N 2.0 <https://www.w3.org/TR/xml-c14n2/>`_ implementation was added as
 111   ``etree.canonicalize()``, a corresponding ``C14NWriterTarget`` class, and
 112   a ``c14n2`` serialisation method.
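
As a rough sketch of the C14N 2.0 support, ``canonicalize()`` can be called on
an XML string.  The input document is made up, and the fallback import is an
assumption so that the example also runs without lxml on Python 3.8+, where
the standard library has a function of the same name.

```python
try:
    from lxml import etree  # canonicalize() needs lxml 4.4.0 or later
except ImportError:
    # Assumption: fall back to the standard library's ElementTree,
    # which provides canonicalize() in Python 3.8+.
    import xml.etree.ElementTree as etree

# Unsorted attributes, single quotes and a self-closing tag.
xml_data = "<root  b='2'   a='1'><child/></root>"

# Without an output file, the canonical form is returned as a string.
canonical = etree.canonicalize(xml_data)
print(canonical)
```

The canonical form sorts the attributes, normalises the quoting and writes
empty elements as start/end tag pairs, which makes two serialisations of the
same document comparable as plain strings.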
 113
 114 Bugs fixed
 115 ----------
 116
 117 * When writing to file paths that contain the URL escape character '%', the file
 118   path could wrongly be mangled by URL unescaping, so that the output was written
 119   to a different file or directory.  Code that writes to file paths provided by
 120   untrusted sources, but that must work with previous versions of lxml, should
 121   either reject paths that contain '%' characters, or otherwise make sure that the
 122   path does not contain maliciously injected '%XX' URL hex escapes for paths like '../'.
 123
 124 * Assigning to Element child slices with negative step could insert the slice at
 125   the wrong position, starting too far on the left.
 126
 127 * Assigning to Element child slices with overly large step size could take very
 128   long, regardless of the length of the actual slice.
 129
 130 * Assigning to Element child slices of the wrong size could sometimes fail to
 131   raise a ValueError (like a list assignment would) and instead assign outside
 132   of the original slice bounds or leave parts of it unreplaced.
 133
 134 * The ``comment`` and ``pi`` events in ``iterwalk()`` were never triggered, and
 135   instead, comments and processing instructions in the tree were reported as
 136   ``start`` elements.  Also, when walking an ElementTree (as opposed to its root
 137   element), comments and PIs outside of the root element are now reported.
 138
 139 * LP#1827833: The RelaxNG compact syntax support was broken with recent versions
 140   of ``rnc2rng``.
 141
 142 * LP#1758553: The HTML elements ``source`` and ``track`` were added to the list
 143   of empty tags in ``lxml.html.defs``.
 144
 145 * Registering a prefix other than "xml" for the XML namespace is now rejected.
 146
 147 * Failing to write XSLT output to a file could raise a misleading exception.
 148   It now raises ``IOError``.
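
For code that has to stay compatible with older versions, the advice above on
'%' characters in file paths could be applied roughly as follows.  The helper
name ``check_output_path`` and the example paths are hypothetical; this is a
sketch of the strict "reject every '%'" option, not part of lxml.

```python
from urllib.parse import unquote


def check_output_path(path):
    """Reject file paths that URL unescaping could silently rewrite."""
    if "%" in path:
        raise ValueError("refusing output path containing '%%': %r" % (path,))
    return path


# What URL unescaping would make of an injected escape sequence:
mangled = unquote("reports/%2E%2E/%2E%2E/etc/out.xml")
print(mangled)

# A harmless path passes through unchanged.
safe = check_output_path("reports/out.xml")

# A path with hex escapes is refused before it reaches the serialiser.
try:
    check_output_path("reports/%2E%2E/out.xml")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

Rejecting every '%' is the simpler of the two options; it also refuses
legitimate file names that happen to contain the character.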
 149
 150 Other changes
 151 -------------
 152
 153 * Support for Python 3.4 was removed.
 154
 155 * When using ``Element.find*()`` with prefix-namespace mappings, the empty string
 156   is now accepted to define a default namespace, in addition to the previously
 157   supported ``None`` prefix.  Empty strings are more convenient since they keep
 158   all prefix keys in a namespace dict as strings, which simplifies sorting etc.
 159
 160 * The ``ElementTree.write_c14n()`` method has been deprecated in favour of the
 161   long preferred ``ElementTree.write(f, method="c14n")``.  It will be removed
 162   in a future release.
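
The empty-string prefix for ``Element.find*()`` mentioned above might be used
like this.  The namespace URI is made up, and the fallback import is an
assumption so that the example also runs without lxml on Python 3.8+, where
the standard library accepts the same mapping.

```python
try:
    from lxml import etree  # empty-string prefix needs lxml 4.4.0 or later
except ImportError:
    # Assumption: fall back to the standard library's ElementTree (3.8+).
    import xml.etree.ElementTree as etree

root = etree.fromstring(
    '<root xmlns="http://example.com/ns"><child>text</child></root>'
)

# The empty string maps unprefixed names in the path to the default namespace.
namespaces = {"": "http://example.com/ns"}
child = root.find("child", namespaces=namespaces)

print(child.tag)
print(sorted(namespaces))  # all keys are strings, so sorting just works
```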
 163
 164
 165 4.3.5 (2019-07-27)
 166 ==================
 167
 168 * Rebuilt with Cython 0.29.13 to support Python 3.8.
 169
 170
 171 4.3.4 (2019-06-10)
 172 ==================
 173
 174 * Rebuilt with Cython 0.29.10 to support Python 3.8.
 175
 176
 177 4.3.3 (2019-03-26)
 178 ==================
 179
 180 Bugs fixed
 181 ----------
 182
 183 * Fix leak of output buffer and unclosed files in ``_XSLTResultTree.write_output()``.
 184
 185
 186 4.3.2 (2019-02-29)
 187 ==================
 188
 189 Bugs fixed
 190 ----------
 191
 192 * Crash in 4.3.1 when appending a child subtree with certain text nodes.
 193
 194 Other changes
 195 -------------
 196
 197 * Built with Cython 0.29.6.
 198
 199
 200 4.3.1 (2019-02-08)
 201 ==================
 202
 203 Bugs fixed
 204 ----------
 205
 206 * LP#1814522: Crash when appending a child subtree that contains unsubstituted
 207   entity references.
 208
 209 Other changes
 210 -------------
 211
 212 * Built with Cython 0.29.5.
 213
 214
 215 4.3.0 (2019-01-04)
 216 ==================
 217
 218 Features added
 219 --------------
 220
 221 * The module ``lxml.sax`` is compiled using Cython in order to speed it up.
 222
 223 * GH#267: ``lxml.sax.ElementTreeProducer`` now preserves the namespace prefixes.
 224   If two prefixes point to the same URI, the first prefix in alphabetical order
 225   is used.  Patch by Lennart Regebro.
 226
 227 * Updated ISO-Schematron implementation to 2013 version (now MIT licensed)
 228   and the corresponding schema to the 2016 version (with optional "properties").
 229
 230 Other changes
 231 -------------
 232
 233 * GH#270, GH#271: Support for Python 2.6 and 3.3 was removed.
 234   Patch by hugovk.
 235
 236 * The minimum dependency versions were raised to libxml2 2.9.2 and libxslt 1.1.27,
 237   which were released in 2014 and 2012 respectively.
 238
 239 * Built with Cython 0.29.2.
 240
 241
 242 4.2.6 (2019-01-02)
 243 ==================
 244
 245 Bugs fixed
 246 ----------
 247
 248 * LP#1799755: Fix a DeprecationWarning in Py3.7+.
 249
 250 * Import warnings in Python 3.6+ were resolved.
 251
 252
 253 4.2.5 (2018-09-09)
 254 ==================
 255
 256 Bugs fixed
 257 ----------
 258
 259 * JavaScript URLs that used URL escaping were not removed by the HTML cleaner.
 260   Security problem found by Omar Eissa.  (CVE-2018-19787)
 261
 262
 263 4.2.4 (2018-08-03)
 264 ==================
 265
 266 Features added
 267 --------------
 268
 269 * GH#259: Allow using ``pkg-config`` for build configuration.
 270   Patch by Patrick Griffis.
 271
 272 Bugs fixed
 273 ----------
 274
 275 * LP#1773749, GH#268: Crash when moving an element to another document with
 276   ``Element.insert()``.
 277   Patch by Alexander Weggerle.
 278
 279
 280 4.2.3 (2018-06-27)
 281 ==================
 282
 283 Bugs fixed
 284 ----------
 285
 286 * Reverted GH#265: lxml links against zlib as a shared library again.
 287
 288
 289 4.2.2 (2018-06-22)
 290 ==================
 291
 292 Bugs fixed
 293 ----------
 294
 295 * GH#266: Fix sporadic crash during GC when parse-time schema validation is used
 296   and the parser participates in a reference cycle.
 297   Original patch by Julien Greard.
 298
 299 * GH#265: lxml no longer links against zlib as a shared library; it is linked only in static builds.
 300   Patch by Nehal J Wani.
 301
 302
 303 4.2.1 (2018-03-21)
 304 ==================
 305
 306 Bugs fixed
 307 ----------
 308
 309 * LP#1755825: ``iterwalk()`` failed to return the 'start' event for the initial
 310   element if a tag selector is used.
 311
 312 * LP#1756314: Failure to import 4.2.0 into PyPy due to a missing library symbol.
 313
 314 * LP#1727864, GH#258: Add "-isysroot" linker option on MacOS as needed by XCode 9.
 315
 316
 317 4.2.0 (2018-03-13)
 318 ==================
 319
 320 Features added
 321 --------------
 322
 323 * GH#255: ``SelectElement.value`` returns more standard-compliant and
 324   browser-like defaults for non-multi-selects.  If no option is selected, the
 325   value of the first option is returned (instead of None).  If multiple options
 326   are selected, the value of the last one is returned (instead of that of the
 327   first one).  If no options are present (not standard-compliant),
 328   ``SelectElement.value`` still returns ``None``.
 329
 330 * GH#261: The ``HTMLParser()`` now supports the ``huge_tree`` option.
 331   Patch by stranac.
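
The selection rules described above for ``SelectElement.value`` on
non-multi-selects can be modelled in plain Python.  The function
``single_select_value`` and its ``(value, selected)`` pairs are hypothetical
stand-ins for illustration, not the lxml implementation.

```python
def single_select_value(options):
    """Return the value of a non-multi select.

    ``options`` is a list of (value, selected) pairs in document order.
    """
    if not options:
        return None                # no options present at all
    selected = [value for value, is_selected in options if is_selected]
    if selected:
        return selected[-1]        # several selected: the last one wins
    return options[0][0]           # none selected: first option is the default


print(single_select_value([("a", False), ("b", False)]))
print(single_select_value([("a", True), ("b", True)]))
print(single_select_value([]))
```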
 332
 333 Bugs fixed
 334 ----------
 335
 336 * LP#1551797: Some XSLT messages were not captured by the transform error log.
 337
 338 * LP#1737825: Crash at shutdown after an interrupted iterparse run with XMLSchema
 339   validation.
 340
 341 Other changes
 342 -------------
 343
 344
 345 4.1.1 (2017-11-04)
 346 ==================
 347
 348 * Rebuilt with Cython 0.27.3 to improve support for Py3.7.
 349
 350
 351 4.1.0 (2017-10-13)
 352 ==================
 353
 354 Features added
 355 --------------
 356
 357 * ElementPath supports text predicates for current node, like "[.='text']".
 358
 359 * ElementPath allows spaces in predicates.
 360
 361 * Custom Element classes and XPath functions can now be registered with a
 362   decorator rather than explicit dict assignments.
 363
 364 * Static Linux wheels are now built with link time optimisation (LTO) enabled.
 365   This should have a beneficial impact on the overall performance by providing
 366   a tighter compiler integration between lxml and libxml2/libxslt.
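
The text predicate for the current node listed above could be used along these
lines.  The document content is made up, and the fallback import is an
assumption so that the example also runs without lxml on Python 3.7+, where
the standard library supports the same predicate.

```python
try:
    from lxml import etree  # "[.='text']" needs lxml 4.1.0 or later
except ImportError:
    # Assumption: fall back to the standard library's ElementTree (3.7+).
    import xml.etree.ElementTree as etree

root = etree.fromstring(
    "<list><item>red</item><item>green</item><item>red</item></list>"
)

# Select only those <item> children whose own text is exactly "red".
matches = root.findall("item[.='red']")
print(len(matches))
```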
 367
 368 Bugs fixed
 369 ----------
 370
 371 * LP#1722776: Requesting non-Element objects like comments from a document with
 372   ``PythonElementClassLookup`` could fail with a TypeError.
 373
 374
 375 4.0.0 (2017-09-17)
 376 ==================
 377
 378 Features added
 379 --------------
 380
 381 * The ElementPath implementation is now compiled using Cython,
 382   which speeds up the ``.find*()`` methods quite significantly.
 383
 384 * The modules ``lxml.builder``, ``lxml.html.diff`` and ``lxml.html.clean``
 385   are also compiled using Cython in order to speed them up.
 386
 387 * ``xmlfile()`` supports async coroutines using ``async with`` and ``await``.
 388
 389 * ``iterwalk()`` has a new method ``skip_subtree()`` that prevents walking into
 390   the descendants of the current element.
 391
 392 * ``RelaxNG.from_rnc_string()`` accepts a ``base_url`` argument to
 393   allow relative resource lookups.
 394
 395 * The XSLT result object has a new method ``.write_output(file)`` that serialises
 396   output data into a file according to the ``<xsl:output>`` configuration.
 397
 398 Bugs fixed
 399 ----------
 400
 401 * GH#251: HTML comments were handled incorrectly by the soupparser.
 402   Patch by mozbugbox.
 403
 404 * LP#1654544: The html5parser no longer passes the ``useChardet`` option
 405   if the input is a Unicode string, unless explicitly requested.  When parsing
 406   files, the default is to enable it when a URL or file path is passed (because
 407   the file is then opened in binary mode), and to disable it when reading from
 408   a file(-like) object.
 409
 410   Note: This is a backwards incompatible change of the default configuration.
 411   If your code parses byte strings/streams and depends on character detection,
 412   please pass the option ``guess_charset=True`` explicitly, which already worked
 413   in older lxml versions.
 414
 415 * LP#1703810: ``etree.fromstring()`` failed to parse UTF-32 data with BOM.
 416
 417 * LP#1526522: Some RelaxNG errors were not reported in the error log.
 418
 419 * LP#1567526: Empty and plain text input raised a TypeError in soupparser.
 420
 421 * LP#1710429: Uninitialised variable usage in HTML diff.
 422
 423 * LP#1415643: The closing tags context manager in ``xmlfile()`` could continue
 424   to output end tags even after writing failed with an exception.
 425
 426 * LP#1465357: ``xmlfile.write()`` now accepts and ignores None as input argument.
 427
 428 * Compilation under Py3.7-pre failed due to a modified function signature.
 429
 430 Other changes
 431 -------------
 432
 433 * The main module source files were renamed from ``lxml.*.pyx`` to plain
 434   ``*.pyx`` (e.g. ``etree.pyx``) to simplify their handling in the build
 435   process.  Care was taken to keep the old header files as fallbacks for
 436   code that compiles against the public C-API of lxml, but it might still
 437   be worth validating that third-party code does not notice this change.
 438
 439
 440 3.8.0 (2017-06-03)
 441 ==================
 442
 443 Features added
 444 --------------
 445
 446 * ``ElementTree.write()`` has a new option ``doctype`` that writes out a
 447   doctype string before the serialisation, in the same way as ``tostring()``.
 448
 449 * GH#220: ``xmlfile`` allows switching output methods at an element level.
 450   Patch by Burak Arslan.
 451
 452 * LP#1595781, GH#240: added a PyCapsule Python API and C-level API for
 453   passing externally generated libxml2 documents into lxml.
 454
 455 * GH#244: error log entries have a new property ``path`` with an XPath
 456   expression (if known, None otherwise) that points to the tree element
 457   responsible for the error. Patch by Bob Kline.
 458
 459 * The namespace prefix mapping that can be used in ElementPath now injects
 460   a default namespace when passing a None prefix.
 461
 462 Bugs fixed
 463 ----------
 464
 465 * GH#238: Character escapes were not hex-encoded in the ``xmlfile`` serialiser.
 466   Patch by matejcik.
 467
 468 * GH#229: fix for externally created XML documents.  Patch by Theodore Dubois.
 469
 470 * LP#1665241, GH#228: Form data handling in lxml.html no longer strips the
 471   option values specified in form attributes but only the text values.
 472   Patch by Ashish Kulkarni.
 473
 474 * LP#1551797: revert previous fix for XSLT error logging as it breaks
 475   multi-threaded XSLT processing.
 476
 477 * LP#1673355, GH#233: ``fromstring()`` html5parser failed to parse byte strings.
 478
 479 Other changes
 480 -------------
 481
 482 * The previously undocumented ``docstring`` option in ``ElementTree.write()``
 483   produces a deprecation warning and will eventually be removed.
 484
 485
 486 3.7.4 (2017-??-??)
 487 ==================
 488
 489 Bugs fixed
 490 ----------
 491
 492 * LP#1551797: revert previous fix for XSLT error logging as it breaks
 493   multi-threaded XSLT processing.
 494
 495 * LP#1673355, GH#233: ``fromstring()`` html5parser failed to parse byte strings.
 496
 497
 498 3.7.3 (2017-02-18)
 499 ==================
 500
 501 Bugs fixed
 502 ----------
 503
 504 * The fix for GH#218 was ineffective in Python 3.
 505
 506 * GH#222: ``lxml.html.submit_form()`` failed in Python 3.
 507   Patch by Jakub Wilk.
 508
 509
 510 3.7.2 (2017-01-08)
 511 ==================
 512
 513 * GH#220: ``xmlfile`` allows switching output methods at an element level.
 514   Patch by Burak Arslan.
 515
 516 Bugs fixed
 517 ----------
 518
 519 * Work around installation problems in recent Python 2.7 versions
 520   due to FTP download failures.
 521
 522 * GH#219: ``xmlfile.element()`` was not properly quoting attribute values.
 523   Patch by Burak Arslan.
 524
 525 * GH#218: ``xmlfile.element()`` was not properly escaping text content of
 526   script/style tags.  Patch by Burak Arslan.
 527
 528
 529 3.7.1 (2016-12-23)
 530 ==================
 531
 532 * No source changes, issued only to solve problems with the
 533   binary packages released for 3.7.0.
 534
 535
 536 3.7.0 (2016-12-10)
 537 ==================
 538
 539 Features added
 540 --------------
 541
 542 * GH#217: ``XMLSyntaxError`` now behaves more like its ``SyntaxError``
 543   baseclass.  Patch by Philipp A.
 544
 545 * GH#216: ``HTMLParser()`` now supports the same ``collect_ids`` parameter
 546   as ``XMLParser()``.  Patch by Burak Arslan.
 547
 548 * GH#210: Allow specifying a serialisation method in ``xmlfile.write()``.
 549   Patch by Burak Arslan.
 550
 551 * GH#203: New option ``default_doctype`` in ``HTMLParser`` that allows
 552   disabling the automatic doctype creation.  Patch by Shadab Zafar.
 553
 554 * GH#201: Calling the method ``.set('attrname')`` without value argument
 555   (or ``None``) on HTML elements creates an attribute without value that
 556   serialises like ``<div attrname></div>``.  Patch by Daniel Holth.
 557
 558 * GH#197: Ignore form input fields in ``form_values()`` when they are
 559   marked as ``disabled`` in HTML.  Patch by Kristian Klemon.
 560
 561 Bugs fixed
 562 ----------
 563
 564 * GH#206: File name and line number were missing from XSLT error messages.
 565   Patch by Marcus Brinkmann.
 566
 567 Other changes
 568 -------------
 569
 570 * Log entries no longer allow anything but plain string objects as message text
 571   and file name.
 572
 573 * ``zlib`` is included in the list of statically built libraries.
 574
 575
 576 3.6.4 (2016-08-20)
 577 ==================
 578
 579 * GH#204, LP#1614693: build fix for MacOS-X.
 580
 581
 582 3.6.3 (2016-08-18)
 583 ==================
 584
 585 * LP#1614603: change linker flags to build multi-linux wheels
 586
 587
 588 3.6.2 (2016-08-18)
 589 ==================
 590
 591 * LP#1614603: release without source changes to provide cleanly built Linux wheels
 592
 593
 594 3.6.1 (2016-07-24)
 595 ==================
 596
 597 Features added
 598 --------------
 599
 600 * GH#180: Separate option ``inline_style`` for Cleaner that only removes ``style``
 601   attributes instead of all styles.  Patch by Christian Pedersen.
 602
 603 * GH#196: Windows build support for Python 3.5.  Contribution by Maximilian Hils.
 604
 605 Bugs fixed
 606 ----------
 607
 608 * GH#199: Exclude ``file`` fields from ``FormElement.form_values`` (as browsers do).
 609   Patch by Tomas Divis.
 610
 611 * GH#198, LP#1568167: Try to provide base URL from ``Resolver.resolve_string()``.
 612   Patch by Michael van Tellingen.
 613
 614 * GH#191: More accurate float serialisation in ``objectify.FloatElement``.
 615   Patch by Holger Joukl.
 616
 617 * LP#1551797: Repair XSLT error logging. Patch by Marcus Brinkmann.
 618
 619
 620 3.6.0 (2016-03-17)
 621 ==================
 622
 623 Features added
 624 --------------
 625
 626 * GH#187: Now supports (only) version 5.x and later of PyPy.
 627   Patch by Armin Rigo.
 628
 629 * GH#181: Direct support for ``.rnc`` files in ``RelaxNG()`` if ``rnc2rng``
 630   is installed.  Patch by Dirkjan Ochtman.
 631
 632 Bugs fixed
 633 ----------
 634
 635 * GH#189: Static builds honour FTP proxy configurations when downloading
 636   the external libs.  Patch by Youhei Sakurai.
 637
 638 * GH#186: Soupparser failed to process entities in Python 3.x.
 639   Patch by Duncan Morris.
 640
 641 * GH#185: Rare encoding-related ``TypeError`` on import was fixed.
 642   Patch by Petr Demin.
 643
 644
 645 3.5.0 (2015-11-13)
 646 ==================
 647
 648 Bugs fixed
 649 ----------
 650
 651 * XPath queries that return Unicode string results failed in PyPy.
 652
 653 * LP#1497051: HTML target parser failed to terminate on exceptions
 654   and continued parsing instead.
 655
 656 * Deprecated API usage in doctestcompare.
 657
 658
 659 3.5.0b1 (2015-09-18)
 660 ====================
 661
 662 Features added
 663 --------------
 664
 665 * ``cleanup_namespaces()`` accepts a new argument ``keep_ns_prefixes``
 666   that prevents definitions of the provided prefix-namespace mapping
 667   from being removed from the tree.
 668
 669 * ``cleanup_namespaces()`` accepts a new argument ``top_nsmap`` that
 670   moves definitions of the provided prefix-namespace mapping to the
 671   top of the tree.
 672
 673 * LP#1490451: ``Element`` objects gained a ``cssselect()`` method as
 674   known from ``lxml.html``.  Patch by Simon Sapin.
 675
 676 * API functions and methods behave and look more like Python functions,
 677   which allows introspection on them etc.  One side effect to be aware of
 678   is that the functions now bind as methods when assigned to a class
 679   variable.  A quick fix is to wrap them in ``staticmethod()`` (as for
 680   normal Python functions).
 681
 682 * ISO-Schematron support gained an option ``error_finder`` that allows
 683   passing a filter function for picking validation errors from reports.
 684
 685 * LP#1243600: Elements in ``lxml.html`` gained a ``classes`` property
 686   that provides a set-like interface to the ``class`` attribute.
 687   Original patch by masklinn.
 688
 689 * LP#1341964: The soupparser now handles DOCTYPE declarations, comments
 690   and processing instructions outside of the root element.
 691   Patch by Olli Pottonen.
 692
 693 * LP#1421512: The ``docinfo`` of a tree was made editable to allow
 694   setting and removing the public ID and system ID of the DOCTYPE.
 695   Patch by Olli Pottonen.
 696
 697 * LP#1442427: More work-arounds for quirks and bugs in pypy and pypy3.
 698
 699 * ``lxml.html.soupparser`` now uses BeautifulSoup version 4 instead
 700   of version 3 if available.
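
The method binding side effect noted above, and the ``staticmethod()`` fix,
can be illustrated with a plain Python function.  The function ``shout`` and
both classes are hypothetical stand-ins for an lxml API function and for user
code; lxml itself is not needed for this sketch.

```python
def shout(text):
    """Stand-in for an API function that now behaves like a Python function."""
    return text.upper()


class Bound:
    # Assigned directly: the function binds as a method, so the
    # instance is passed as the first argument.
    convert = shout


class Unbound:
    # Wrapped in staticmethod(): the function is called as written.
    convert = staticmethod(shout)


print(Unbound().convert("abc"))

try:
    Bound().convert("abc")  # effectively shout(instance, "abc")
    bound_failed = False
except TypeError:
    bound_failed = True
print(bound_failed)
```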
 701
 702 Bugs fixed
 703 ----------
 704
 705 * Memory errors that occur during tree adaptations (e.g. moving subtrees
 706   to foreign documents) could leave the tree in a crash prone state.
 707
 708 * Calling ``process_children()`` in an XSLT extension element without
 709   an ``output_parent`` argument failed with a ``TypeError``.
 710   Fix by Jens Tröger.
 711
 712 * GH#162: Image data in HTML ``data`` URLs is considered safe and
 713   no longer removed by ``lxml.html.clean`` JavaScript cleaner.
 714
 715 * GH#166: Static build could link libraries in wrong order.
 716
 717 * GH#172: Rely a bit more on libxml2 for encoding detection rather than
 718   rolling our own in some cases.  Patch by Olli Pottonen.
 719
 720 * GH#159: Validity checks for names and string content were tightened
 721   to detect the use of illegal characters early.  Patch by Olli Pottonen.
 722
 723 * LP#1421921: Comments/PIs before the DOCTYPE declaration were not
 724   serialised.  Patch by Olli Pottonen.
 725
 726 * LP#659367: Some HTML DOCTYPE declarations were not serialised.
 727   Patch by Olli Pottonen.
 728
 729 * LP#1238503: lxml.doctestcompare is now consistent with stdlib's doctest
 730   in how it uses ``+`` and ``-`` to refer to unexpected and missing output.
 731
 732 * Empty prefixes are explicitly rejected when a namespace mapping is used
 733   with ElementPath to avoid hiding bugs in user code.
 734
 735 * Several problems with PyPy were fixed by switching to Cython 0.23.
 736
 737
 738 3.4.4 (2015-04-25)
 739 ==================
 740
 741 Bugs fixed
 742 ----------
 743
 744 * An ElementTree compatibility test added in lxml 3.4.3 that failed in
 745   Python 3.4+ was removed again.
 746
 747
 748 3.4.3 (2015-04-15)
 749 ==================
 750
 751 Bugs fixed
 752 ----------
 753
 754 * Expression cache in ElementPath was ignored.  Fix by Changaco.
 755
 756 * LP#1426868: Passing a default namespace and a prefixed namespace mapping
 757   as nsmap into ``xmlfile.element()`` raised a ``TypeError``.
 758
 759 * LP#1421927: DOCTYPE system URLs were incorrectly quoted when containing
 760   double quotes.  Patch by Olli Pottonen.
 761
 762 * LP#1419354: meta-redirect URLs were incorrectly processed by
 763   ``iterlinks()`` if preceded by whitespace.
 764
 765
 766 3.4.2 (2015-02-07)
 767 ==================
 768
 769 Bugs fixed
 770 ----------
 771
 772 * LP#1415907: Crash when creating an XMLSchema from a non-root element
 773   of an XML document.
 774
 775 * LP#1369362: HTML cleaning failed when hitting processing instructions
 776   with pseudo-attributes.
 777
 778 * ``CDATA()`` wrapped content was rejected for tail text.
 779
 780 * CDATA sections were not serialised as tail text of the top-level element.
 781
 782
 783 3.4.1 (2014-11-20)
 784 ==================
 785
 786 Features added
 787 --------------
 788
 789 * New ``htmlfile`` HTML generator to accompany the incremental ``xmlfile``
 790   serialisation API.  Patch by Burak Arslan.
 791
 792 Bugs fixed
 793 ----------
 794
 795 * ``lxml.sax.ElementTreeContentHandler`` did not initialise its superclass.
 796
 797
 798 3.4.0 (2014-09-10)
 799 ==================
 800
 801 Features added
 802 --------------
 803
 804 * ``xmlfile(buffered=False)`` disables output buffering and flushes the
 805   content after each API operation (starting/ending element blocks or writes).
 806   A new method ``xf.flush()`` can alternatively be used to explicitly flush
 807   the output.
 808
 809 * ``lxml.html.document_fromstring`` has a new option ``ensure_head_body=True``
 810   which will add an empty head and/or body element to the result document if
 811   missing.
 812
 813 * ``lxml.html.iterlinks`` now returns links inside meta refresh tags.
 814
 815 * New ``XMLParser`` option ``collect_ids=False`` to disable ID hash table
 816   creation.  This can substantially speed up parsing of documents with many
 817   different IDs that are not used.
 818
 819 * The parser uses per-document hash tables for XML IDs.  This reduces the
 820   load of the global parser dict and speeds up parsing for documents with
 821   many different IDs.
 822
 823 * ``ElementTree.getelementpath(element)`` returns a structural ElementPath
 824   expression for the given element, which can be used for lookups later.
 825
 826 * ``xmlfile()`` accepts a new argument ``close=True`` to close file(-like)
 827   objects after writing to them.  Before, ``xmlfile()`` only closed the file
 828   if it had opened it internally.
 829
 830 * Allow "bytearray" type for ASCII text input.
 831
 832 Bugs fixed
 833 ----------
 834
 835 Other changes
 836 -------------
 837
 838 * LP#400588: decoding errors have become hard errors even in recovery mode.
 839   Previously, they could lead to an internal tree representation in a mixed
 840   encoding state, which led to very late errors or even silently incorrect
 841   behaviour during tree traversal or serialisation.
 842
 843 * Requires Python 2.6, 2.7, 3.2 or later. No longer supports
 844   Python 2.4, 2.5 and 3.1, use lxml 3.3.x for those.
 845
 846 * Requires libxml2 2.7.0 or later and libxslt 1.1.23 or later,
 847   use lxml 3.3.x with older versions.
 848
 849
 850 3.3.6 (2014-08-28)
 851 ==================
 852
 853 Bugs fixed
 854 ----------
 855
 856 * Prevent tree cycle creation when adding Elements as siblings.
 857
 858 * LP#1361948: crash when deallocating Element siblings without parent.
 859
 860 * LP#1354652: crash when traversing internally loaded documents in XSLT
 861   extension functions.
 862
 863
 864 3.3.5 (2014-04-18)
 865 ==================
 866
 867 Bugs fixed
 868 ----------
 869
 870 * HTML cleaning could fail to strip javascript links that mix control
 871   characters into the link scheme.
 872
 873
 874 3.3.4 (2014-04-03)
 875 ==================
 876
 877 Features added
 878 --------------
 879
 880 * Source line numbers above 65535 are available on Elements when
 881   using libxml2 2.9 or later.
 882
 883 Bugs fixed
 884 ----------
 885
 886 * ``lxml.html.fragment_fromstring()`` failed for bytes input in Py3.
 887
 888 Other changes
 889 -------------
 890
 891
 892 3.3.3 (2014-03-04)
 893 ==================
 894
 895 Bugs fixed
 896 ----------
 897
 898 * LP#1287118: Crash when using Element subtypes with ``__slots__``.
 899
 900 Other changes
 901 -------------
 902
 903 * The internal classes ``_LogEntry`` and ``_Attrib`` can no longer be
 904   subclassed from Python code.
 905
 906
 907 3.3.2 (2014-02-26)
 908 ==================
 909
 910 Bugs fixed
 911 ----------
 912
 913 * The properties ``resolvers`` and ``version``, as well as the methods
 914   ``set_element_class_lookup()`` and ``makeelement()``, were lost from
 915   ``iterparse`` objects in 3.3.0.
 916
 917 * LP#1222132: instances of ``XMLSchema``, ``Schematron`` and ``RelaxNG``
 918   did not clear their local ``error_log`` before running a validation.
 919
 920 * LP#1238500: lxml.doctestcompare mixed up "expected" and "actual" in
 921   attribute values.
 922
 923 * Some file I/O tests were failing in MS-Windows due to non-portable temp
 924   file usage.  Initial patch by Gabi Davar.
 925
 926 * LP#910014: duplicate IDs in a document were not reported by DTD validation.
 927
 928 * LP#1185332: ``tostring(method="html")`` did not use HTML serialisation
 929   semantics for trailing tail text.  Initial patch by Sylvain Viollon.
 930
 931 * LP#1281139: ``.attrib`` value of Comments lost its mutation methods
 932   in 3.3.0.  Even though it is empty and immutable, it should still
 933   provide the same interface as that returned for Elements.
 934
 935
 936 3.3.1 (2014-02-12)
 937 ==================
 938
 939 Features added
 940 --------------
 941
 942 Bugs fixed
 943 ----------
 944
 945 * LP#1014290: HTML documents parsed with ``parser.feed()`` failed to find
 946   elements during tag iteration.
 947
 948 * LP#1273709: Building in PyPy failed due to missing support for
 949   ``PyUnicode_Compare()`` and ``PyByteArray_*()`` in PyPy's C-API.
 950
 951 * LP#1274413: Compilation in MSVC failed due to missing "stdint.h" standard
 952   header file.
 953
 954 * LP#1274118: ``iterparse()`` failed to parse BOM-prefixed files.
 955
 956 Other changes
 957 -------------
 958
 959
 960 3.3.0 (2014-01-26)
 961 ==================
 962
 963 Features added
 964 --------------
 965
 966 Bugs fixed
 967 ----------
 968
 969 * The heuristic that distinguishes file paths from URLs was tightened
 970   to produce fewer false negatives.
 971
 972 Other changes
 973 -------------
 974
 975
 976 3.3.0beta5 (2014-01-18)
 977 =======================
 978
 979 Features added
 980 --------------
 981
 982 * The PEP 393 unicode parsing support gained a fallback for wchar strings
 983   which might still be somewhat common on Windows systems.
 984
 985 Bugs fixed
 986 ----------
 987
 988 * Several error handling problems were fixed throughout the code base that
 989   could previously lead to exceptions being silently swallowed or not
 990   properly reported.
 991
 992 * The C-API function ``appendChild()`` is now deprecated as it does not
 993   propagate exceptions (its return type is ``void``).  The new function
 994   ``appendChildToElement()`` was added as a safe replacement.
 995
 996 * Passing a string into ``fromstringlist()`` raises an exception instead of
 997   parsing the string character by character.
 998
 999 Other changes
1000 -------------
1001
1002 * Document cleanup code was simplified using the new GC features in
1003   Cython 0.20.
1004
1005
1006 3.3.0beta4 (2014-01-12)
1007 =======================
1008
1009 Features added
1010 --------------
1011
1012 Bugs fixed
1013 ----------
1014
1015 * The (empty) value returned by the ``attrib`` property of Entity and Comment
1016   objects was mutable.
1017
1018 * Element class lookup wasn't available for the new pull parsers or when using
1019   a custom parser target.
1020
1021 * Setting Element attributes on instantiation with both the ``attrib`` argument
1022   and keyword arguments could modify the mapping passed as ``attrib``.
1023
1024 * LP#1266171: DTDs instantiated from internal/external subsets (i.e. through
1025   the docinfo property) lost their attribute declarations.
1026
1027 Other changes
1028 -------------
1029
1030 * Built with Cython 0.20pre (gitrev 012ae82eb) to prepare support for
1031   Python 3.4.
1032
1033
1034 3.3.0beta3 (2014-01-02)
1035 =======================
1036
1037 Features added
1038 --------------
1039
1040 * Unicode string parsing was optimised for Python 3.3 (PEP 393).
1041
1042 Bugs fixed
1043 ----------
1044
1045 * HTML parsing of Unicode strings could misdecode the input on some platforms.
1046
1047 * Crash in xmlfile() when closing open elements out of order in an error case.
1048
1049 Other changes
1050 -------------
1051
1052
1053 3.3.0beta2 (2013-12-20)
1054 =======================
1055
1056 Features added
1057 --------------
1058
1059 * ``iterparse()`` supports the ``recover`` option.
1060
1061 Bugs fixed
1062 ----------
1063
1064 * Crash in ``iterparse()`` for HTML parsing.
1065
1066 * Crash in target parsing with attributes.
1067
1068 Other changes
1069 -------------
1070
1071 * The safety check in the read-only tree implementation (e.g. used by
1072   ``PythonElementClassLookup``) raises a more appropriate ``ReferenceError``
1073   for illegal access after tree disposal instead of an ``AssertionError``.
1074   This should only impact test code that specifically checks the original
1075   behaviour.
1076
1077
1078 3.3.0beta1 (2013-12-12)
1079 =======================
1080
1081 Features added
1082 --------------
1083
1084 * New option ``handle_failures`` in ``make_links_absolute()`` and
1085   ``resolve_base_href()`` (lxml.html) that enables ignoring or
1086   discarding links that fail to parse as URLs.
1087
1088 * New parser classes ``XMLPullParser`` and ``HTMLPullParser`` for
1089   incremental parsing, as implemented for ElementTree in Python 3.4.
1090
1091 * ``iterparse()`` enables recovery mode by default for HTML parsing
1092   (``html=True``).
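
A minimal sketch of how the pull parser interface mentioned above might be
used when data arrives in arbitrary chunks.  The chunk boundaries and tag
names are made up for illustration.  The import falls back to the standard
library's ElementTree, which provides the same ``feed()`` /
``read_events()`` interface since Python 3.4; the element that closes last
is taken as the root rather than relying on the return value of ``close()``.

```python
try:
    from lxml import etree
except ImportError:  # assumption: stdlib ElementTree as a stand-in (Python 3.4+)
    import xml.etree.ElementTree as etree

# Hypothetical input, split at arbitrary points (even in the middle of a tag).
chunks = (b"<root><item>one</it", b"em><item>two</item>", b"</root>")

parser = etree.XMLPullParser(events=("start", "end"))
seen = []
root = None

def drain(pull_parser):
    """Collect all events that the parser has produced so far."""
    global root
    for event, element in pull_parser.read_events():
        seen.append((event, element.tag))
        if event == "end":
            root = element  # the last element to close is the root

for chunk in chunks:
    parser.feed(chunk)
    drain(parser)

parser.close()
drain(parser)  # pick up events that were only generated on close()
```

Draining once more after ``close()`` keeps the sketch independent of how
much input the underlying parser buffers before it reports events.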
1093
1094 Bugs fixed
1095 ----------
1096
1097 * LP#1255132: crash when trying to run validation over non-Element (e.g.
1098   comment or PI).
1099
1100 * Error messages in the log and in exception messages that originated
1101   from libxml2 could accidentally be picked up from preceding warnings
1102   instead of the actual error.
1103
1104 * The ``ElementMaker`` in lxml.objectify did not accept a dict as
1105   argument for adding attributes to the element it's building. This
1106   works as in lxml.builder now.
1107
1108 * LP#1228881: ``repr(XSLTAccessControl)`` failed in Python 3.
1109
1110 * Raise ``ValueError`` when trying to append an Element to itself or
1111   to one of its own descendants, instead of running into an infinite
1112   loop.
1113
1114 * LP#1206077: htmldiff discarded whitespace from the output.
1115
1116 * Compressed plain-text serialisation to file-like objects was broken.
1117
1118 * lxml.html.formfill: Fix textarea form filling.
1119   The textarea used to be cleared before the new content was set,
1120   which removed the name attribute.
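
As a small illustration of the ``ValueError`` entry above, the following
sketch tries to append an Element to itself and then to one of its own
descendants.  The element names are arbitrary, and the exact wording of the
exception message is not relied upon.

```python
from lxml import etree

root = etree.Element("root")
child = etree.SubElement(root, "child")

errors = []
for target in (root, child):
    try:
        # First root into itself, then root into its own descendant.
        target.append(root)
    except ValueError as exc:
        errors.append(str(exc))
```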
1121
1122
1123 Other changes
1124 -------------
1125
1126 * Some basic API classes use freelists internally for faster
1127   instantiation.  This can speed up some ``iterparse()`` scenarios,
1128   for example.
1129
1130 * ``iterparse()`` was rewritten to use the new ``*PullParser``
1131   classes internally instead of being a parser itself.
1132
1133
1134 3.2.5 (2014-01-02)
1135 ==================
1136
1137 Features added
1138 --------------
1139
1140 Bugs fixed
1141 ----------
1142
1143 * Crash in xmlfile() when closing open elements out of order in an error case.
1144
1145 * Crash in target parsing with attributes.
1146
1147 * LP#1255132: crash when trying to run validation over non-Element (e.g.
1148   comment or PI).
1149
1150 Other changes
1151 -------------
1152
1153
1154 3.2.4 (2013-11-07)
1155 ==================
1156
1157 Features added
1158 --------------
1159
1160 Bugs fixed
1161 ----------
1162
1163 * Memory leak when creating an XPath evaluator in a thread.
1164
1165 * LP#1228881: ``repr(XSLTAccessControl)`` failed in Python 3.
1166
1167 * Raise ``ValueError`` when trying to append an Element to itself or
1168   to one of its own descendants.
1169
1170 * LP#1206077: htmldiff discarded whitespace from the output.
1171
1172 * Compressed plain-text serialisation to file-like objects was broken.
1173
1174 Other changes
1175 -------------
1176
1177
1178 3.2.3 (2013-07-28)
1179 ==================
1180
1181 Bugs fixed
1182 ----------
1183
1184 * Fix support for Python 2.4 which was lost in 3.2.2.
1185
1186
1187 3.2.2 (2013-07-28)
1188 ==================
1189
1190 Features added
1191 --------------
1192
1193 Bugs fixed
1194 ----------
1195
1196 * LP#1185701: spurious XMLSyntaxError after finishing iterparse().
1197
1198 * Crash in lxml.objectify during xsi annotation.
1199
1200 Other changes
1201 -------------
1202
1203 * Return values of user provided element class lookup methods are now
1204   validated against the type of the XML node they represent to prevent
1205   API class mismatches.
1206
1207
1208 3.2.1 (2013-05-11)
1209 ==================
1210
1211 Features added
1212 --------------
1213
1214 * The methods ``apply_templates()`` and ``process_children()`` of XSLT
1215   extension elements have gained two new boolean options ``elements_only``
1216   and ``remove_blank_text`` that discard either all strings or whitespace-only
1217   strings from the result list.
1218
1219 Bugs fixed
1220 ----------
1221
1222 * When moving Elements to another tree, the namespace cleanup mechanism
1223   no longer drops namespace prefixes from attributes for which it finds
1224   a default namespace declaration, to prevent them from appearing as
1225   unnamespaced attributes after serialisation.
1226
1227 * Returning non-type objects from a custom class lookup method could lead
1228   to a crash.
1229
1230 * Instantiating and using subtypes of Comments and ProcessingInstructions
1231   crashed.
1232
1233 Other changes
1234 -------------
1235
1236
1237 3.2.0 (2013-04-28)
1238 ==================
1239
1240 Features added
1241 --------------
1242
1243 Bugs fixed
1244 ----------
1245
1246 * LP#690319: Leading whitespace could change the behaviour of the string
1247   parsing functions in ``lxml.html``.
1248
1249 * LP#599318: The string parsing functions in ``lxml.html`` are more robust
1250   in the face of uncommon HTML content like framesets or missing body tags.
1251   Patch by Stefan Seelmann.
1252
1253 * LP#712941: I/O errors while trying to access files with paths that contain
1254   non-ASCII characters could raise ``UnicodeDecodeError`` instead of properly
1255   reporting the ``IOError``.
1256
1257 * LP#673205: Parsing from in-memory strings disabled network access in the
1258   default parser and made subsequent attempts to parse from a URL fail.
1259
1260 * LP#971754: lxml.html.clean appends 'nofollow' to 'rel' attributes instead
1261   of overwriting the current value.
1262
1263 * LP#715687: lxml.html.clean no longer discards scripts that are explicitly
1264   allowed by the user provided whitelist.  Patch by Christine Koppelt.
1265
1266 Other changes
1267 -------------
1268
1269
1270 3.1.2 (2013-04-12)
1271 ==================
1272
1273 Features added
1274 --------------
1275
1276 Bugs fixed
1277 ----------
1278
1279 * LP#1136509: Passing attributes through the namespace-unaware API of
1280   the sax bridge (i.e. the ``handler.startElement()`` method) failed
1281   with a ``TypeError``.  Patch by Mike Bayer.
1282
1283 * LP#1123074: Fix serialisation error in XSLT output when converting
1284   the result tree to a Unicode string.
1285
1286 * GH#105: Replace illegal usage of ``xmlBufLength()`` in libxml2 2.9.0
1287   by properly exported API function ``xmlBufUse()``.
1288
1289 Other changes
1290 -------------
1291
1292
1293 3.1.1 (2013-03-29)
1294 ==================
1295
1296 Features added
1297 --------------
1298
1299 Bugs fixed
1300 ----------
1301
1302 * LP#1160386: Write access to ``lxml.html.FormElement.fields`` raised
1303   an AttributeError in Py3.
1304
1305 * Illegal memory access during cleanup in incremental xmlfile writer.
1306
1307 Other changes
1308 -------------
1309
1310 * The externally useless class ``lxml.etree._BaseParser`` was removed
1311   from the module dict.
1312
1313
1314 3.1.0 (2013-02-10)
1315 ==================
1316
1317 Features added
1318 --------------
1319
1320 * GH#89: lxml.html.clean allows overriding the set of attributes that it
1321   considers 'safe'.  Patch by Francis Devereux.
1322
1323 Bugs fixed
1324 ----------
1325
1326 * LP#1104370: ``copy.copy(el.attrib)`` raised an exception.  It now returns
1327   a copy of the attributes as a plain Python dict.
1328
1329 * GH#95: When used with namespace prefixes, the ``el.find*()`` methods
1330   always used the first namespace mapping that was provided for each
1331   path expression instead of using the one that was actually passed
1332   in for the current run.
1333
1334 * LP#1092521, GH#91: Fix undefined C symbol in Python runtimes compiled
1335   without threading support.  Patch by Ulrich Seidl.
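
The ``copy.copy(el.attrib)`` behaviour described above can be sketched as
follows.  The attribute names and values are invented for the example.

```python
import copy

from lxml import etree

el = etree.Element("entry", id="1", lang="en")

# The copy is a plain dict, detached from the element.
snapshot = copy.copy(el.attrib)
snapshot["extra"] = "only in the copy"
```

Because the result is an independent dict, changing it afterwards should
not touch the attributes of the original element.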
1336
1337 Other changes
1338 -------------
1339
1340
1341 3.1beta1 (2012-12-21)
1342 =====================
1343
1344 Features added
1345 --------------
1346
1347 * New build-time option ``--with-unicode-strings`` for Python 2 that
1348   makes the API always return Unicode strings for names and text
1349   instead of byte strings for plain ASCII content.
1350
1351 * New incremental XML file writing API ``etree.xmlfile()``.
1352
1353 * E factory in lxml.objectify is callable to simplify the creation of
1354   tags with non-identifier names without having to resort to getattr().
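
A rough sketch of the incremental writing API named above, assuming an
in-memory ``BytesIO`` target instead of a real file.  The tag names and the
loop are placeholders for whatever produces the data.

```python
import io

from lxml import etree

buf = io.BytesIO()

with etree.xmlfile(buf, encoding="utf-8") as xf:
    # The root element stays open while its children are written one by one.
    with xf.element("log"):
        for i in range(3):
            entry = etree.Element("entry", id=str(i))
            entry.text = "message %d" % i
            xf.write(entry)  # serialised right away, no full tree is kept

data = buf.getvalue()
```

Each ``entry`` can be discarded after it was written, which is what makes
this approach attractive for large outputs.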
1355
1356 Bugs fixed
1357 ----------
1358
1359 * When starting from a non-namespaced element in lxml.objectify, searching
1360   for a child without explicitly specifying a namespace incorrectly found
1361   namespaced elements with the requested local name, instead of restricting
1362   the search to non-namespaced children.
1363
1364 * GH#85: Deprecation warnings were fixed for Python 3.x.
1365
1366 * GH#33: lxml.html.fromstring() failed to accept bytes input in Py3.
1367
1368 * LP#1080792: Static build of libxml2 2.9.0 failed due to missing file.
1369
1370 Other changes
1371 -------------
1372
1373 * The externally useless class ``_ObjectifyElementMakerCaller`` was
1374   removed from the module API of lxml.objectify.
1375
1376 * LP#1075622: lxml.builder is faster for adding text to elements with
1377   many children.  Patch by Anders Hammarquist.
1378
1379
1380 3.0.2 (2012-12-14)
1381 ==================
1382
1383 Features added
1384 --------------
1385
1386 Bugs fixed
1387 ----------
1388
1389 * Fix crash during interpreter shutdown by switching to Cython 0.17.3 for building.
1390
1391 Other changes
1392 -------------
1393
1394
1395 3.0.1 (2012-10-14)
1396 ==================
1397
1398 Features added
1399 --------------
1400
1401 Bugs fixed
1402 ----------
1403
1404 * LP#1065924: Element proxies could disappear during garbage collection
1405   in PyPy without proper cleanup.
1406
1407 * GH#71: Failure to work with libxml2 2.6.x.
1408
1409 * LP#1065139: static MacOS-X build failed in Py3.
1410
1411 Other changes
1412 -------------
1413
1414
1415 3.0 (2012-10-08)
1416 ================
1417
1418 Features added
1419 --------------
1420
1421 Bugs fixed
1422 ----------
1423
1424 * End-of-file handling was incorrect in iterparse() when reading from
1425   a low-level C file stream and failed in libxml2 2.9.0 due to its
1426   improved consistency checks.
1427
1428 Other changes
1429 -------------
1430
1431 * The build no longer uses Cython by default unless the generated C files
1432   are missing.  To use Cython, pass the option "--with-cython".  To ignore
1433   the fatal build error when Cython is required but not available (e.g. to
1434   run special setup.py commands that do not actually run a build), pass
1435   "--without-cython".
1436
1437
1438 3.0beta1 (2012-09-26)
1439 =====================
1440
1441 Features added
1442 --------------
1443
1444 * Python level access to (optional) libxml2 memory debugging features
1445   to simplify debugging of memory leaks etc.
1446
1447 Bugs fixed
1448 ----------
1449
1450 * Fix a memory leak in XPath by switching to Cython 0.17.1.
1451
1452 * Some tests were adapted to work with PyPy.
1453
1454 Other changes
1455 -------------
1456
1457 * The code was adapted to work with the upcoming libxml2 2.9.0 release.
1458
1459
1460 3.0alpha2 (2012-08-23)
1461 ======================
1462
1463 Features added
1464 --------------
1465
1466 * The ``.iter()`` method of elements now accepts ``tag`` arguments like
1467   ``"{*}name"`` to search for elements with a given local name in any
1468   namespace. With this addition, all combinations of wildcards now work
1469   as expected:
1470   ``"{ns}name"``, ``"{}name"``, ``"{*}name"``, ``"{ns}*"``, ``"{}*"``
1471   and ``"{*}*"``.  Note that ``"name"`` is equivalent to ``"{}name"``,
1472   but ``"*"`` is ``"{*}*"``.
1473   The same change applies to the ``.getiterator()``, ``.itersiblings()``,
1474   ``.iterancestors()``, ``.iterdescendants()``, ``.iterchildren()``
1475   and ``.itertext()`` methods; the ``strip_attributes()``,
1476   ``strip_elements()`` and ``strip_tags()`` functions as well as the
1477   ``iterparse()`` class.  Patch by Simon Sapin.
1478
1479 * C14N allows specifying the inclusive prefixes to be promoted
1480   to top-level during exclusive serialisation.
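
To illustrate the wildcard forms accepted by ``.iter()`` as listed above,
here is a small sketch over a made-up document.  The namespace URIs
``urn:a`` and ``urn:b`` and the element names are assumptions chosen only
for this example.

```python
from lxml import etree

root = etree.fromstring(
    '<root xmlns:a="urn:a" xmlns:b="urn:b">'
    '<a:item/><b:item/><item/><a:other/>'
    '</root>'
)

# Local name "item" in any namespace, including no namespace.
any_ns = [el.tag for el in root.iter("{*}item")]

# Local name "item" in no namespace only (same as plain "item").
no_ns = [el.tag for el in root.iter("{}item")]

# Any local name within the "urn:a" namespace.
in_ns_a = [el.tag for el in root.iter("{urn:a}*")]
```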
1481
1482 Bugs fixed
1483 ----------
1484
1485 * Passing long Unicode strings into the ``feed()`` parser interface
1486   failed to read the entire string.
1487
1488 Other changes
1489 -------------
1490
1491
1492 3.0alpha1 (2012-07-31)
1493 ======================
1494
1495 Features added
1496 --------------
1497
1498 * Initial support for building in PyPy (through cpyext).
1499
1500 * DTD objects gained an API that allows read access to their
1501   declarations.
1502
1503 * ``xpathgrep.py`` gained support for parsing line-by-line (e.g.
1504   from grep output) and for surrounding the output with a new root
1505   tag.
1506
1507 * ``E-factory`` in ``lxml.builder`` accepts subtypes of known data
1508   types (such as string subtypes) when building elements around them.
1509
1510 * Tree iteration and ``iterparse()`` with a selective ``tag``
1511   argument supports passing a set of tags.  Tree nodes will be
1512   returned by the iterators if they match any of the tags.
1513
1514 Bugs fixed
1515 ----------
1516
1517 * The ``.find*()`` methods in ``lxml.objectify`` no longer use XPath
1518   internally, which makes them faster in many cases (especially when
1519   short-circuiting after one or a few elements) and fixes
1520   some behavioural differences compared to ``lxml.etree``.  Note that
1521   this means that they no longer support arbitrary XPath expressions
1522   but only the subset that the ``ElementPath`` language supports.
1523   The previous implementation was also redundant with the normal
1524   XPath support, which can be used as a replacement.
1525
1526 * ``el.find('*')`` could accidentally return a comment or processing
1527   instruction that happened to be in the wrong spot.  (Same for the
1528   other ``.find*()`` methods.)
1529
1530 * The error logging is less intrusive and avoids a global setup where
1531   possible.
1532
1533 * Fixed undefined names in html5lib parser.
1534
1535 * ``xpathgrep.py`` did not work in Python 3.
1536
1537 * ``Element.attrib.update()`` did not accept an ``attrib`` of
1538   another Element as parameter.
1539
1540 * For subtypes of ``ElementBase`` that make the ``.text`` or ``.tail``
1541   properties immutable (as in objectify, for example), inserting text
1542   when creating Elements through the E-Factory feature of the class
1543   constructor would fail with an exception, stating that the text
1544   cannot be modified.
1545
1546 Other changes
1547 --------------
1548
1549 * The code base was overhauled to properly use 'const' where the API
1550   of libxml2 and libxslt requests it.  This also has an impact on the
1551   public C-API of lxml itself, as defined in ``etreepublic.pxd``, as
1552   well as the provided declarations in the ``lxml/includes/`` directory.
1553   Code that uses these declarations may have to be adapted.  On the
1554   plus side, this fixes several C compiler warnings, also for user
1555   code, thus making it easier to spot real problems again.
1556
1557 * The functionality of "lxml.cssselect" was moved into a separate PyPI
1558   package called "cssselect".  To continue using it, you must install
1559   that package separately.  The "lxml.cssselect" module is still
1560   available and provides the same interface, provided the "cssselect"
1561   package can be imported at runtime.
1562
1563 * Element attributes passed in as an ``attrib`` dict or as keyword
1564   arguments are now sorted by (namespaced) name before being created
1565   to make their order predictable for serialisation and iteration.
1566   Note that adding or deleting attributes afterwards does not take
1567   that order into account, i.e. setting a new attribute appends it
1568   after the existing ones.
1569
1570 * Several classes that are for internal use only were removed
1571   from the ``lxml.etree`` module dict:
1572   ``_InputDocument, _ResolverRegistry, _ResolverContext, _BaseContext,
1573   _ExsltRegExp, _IterparseContext, _TempStore, _ExceptionContext,
1574   __ContentOnlyElement, _AttribIterator, _NamespaceRegistry,
1575   _ClassNamespaceRegistry, _FunctionNamespaceRegistry,
1576   _XPathFunctionNamespaceRegistry, _ParserDictionaryContext,
1577   _FileReaderContext, _ParserContext, _PythonSaxParserTarget,
1578   _TargetParserContext, _ReadOnlyProxy, _ReadOnlyPIProxy,
1579   _ReadOnlyEntityProxy, _ReadOnlyElementProxy, _OpaqueNodeWrapper,
1580   _OpaqueDocumentWrapper, _ModifyContentOnlyProxy,
1581   _ModifyContentOnlyPIProxy, _ModifyContentOnlyEntityProxy,
1582   _AppendOnlyElementProxy, _SaxParserContext, _FilelikeWriter,
1583   _ParserSchemaValidationContext, _XPathContext,
1584   _XSLTResolverContext, _XSLTContext, _XSLTQuotedStringParam``
1585
1586 * Several internal classes can no longer be inherited from:
1587   ``_InputDocument, _ResolverRegistry, _ExsltRegExp, _ElementUnicodeResult,
1588   _IterparseContext, _TempStore, _AttribIterator, _ClassNamespaceRegistry,
1589   _XPathFunctionNamespaceRegistry, _ParserDictionaryContext,
1590   _FileReaderContext, _PythonSaxParserTarget, _TargetParserContext,
1591   _ReadOnlyPIProxy, _ReadOnlyEntityProxy, _OpaqueDocumentWrapper,
1592   _ModifyContentOnlyPIProxy, _ModifyContentOnlyEntityProxy,
1593   _AppendOnlyElementProxy, _FilelikeWriter, _ParserSchemaValidationContext,
1594   _XPathContext, _XSLTResolverContext, _XSLTContext, _XSLTQuotedStringParam,
1595   _XSLTResultTree, _XSLTProcessingInstruction``
1596
1597
1598 2.3.6 (2012-09-28)
1599 ==================
1600
1601 Features added
1602 --------------
1603
1604 Bugs fixed
1605 ----------
1606
1607 * Passing long Unicode strings into the ``feed()`` parser interface
1608   failed to read the entire string.
1609
1610 Other changes
1611 --------------
1612
1613
1614 2.3.5 (2012-07-31)
1615 ==================
1616
1617 Features added
1618 --------------
1619
1620 Bugs fixed
1621 ----------
1622
1623 * Crash when merging text nodes in ``element.remove()``.
1624
1625 * Crash in sax/target parser when reporting empty doctype.
1626
1627 Other changes
1628 --------------
1629
1630
1631 2.3.4 (2012-03-26)
1632 ==================
1633
1634 Features added
1635 --------------
1636
1637 Bugs fixed
1638 ----------
1639
1640 * Crash when building an nsmap (Element property) with empty
1641   namespace URIs.
1642
1643 * Crash due to race condition when errors (or user messages) occur
1644   during threaded XSLT processing.
1645
1646 * XSLT stylesheet compilation could ignore compilation errors.
1647
1648 Other changes
1649 --------------
1650
1651
1652 2.3.3 (2012-01-04)
1653 ==================
1654
1655 Features added
1656 --------------
1657
1658 * ``lxml.html.tostring()`` gained new serialisation options
1659   ``with_tail`` and ``doctype``.
1660
1661 Bugs fixed
1662 ----------
1663
1664 * Fixed a crash when using ``iterparse()`` for HTML parsing and
1665   requesting start events.
1666
1667 * Fixed parsing of more selectors in cssselect.  Whitespace before
1668   pseudo-elements and pseudo-classes is significant as it is a
1669   descendant combinator.
1670   "E :pseudo" should parse the same as "E \*:pseudo", not "E:pseudo".
1671   Patch by Simon Sapin.
1672
1673 * lxml.html.diff no longer raises an exception when hitting
1674   'img' tags without 'src' attribute.
1675
1676 Other changes
1677 --------------
1678
1679
1680 2.3.2 (2011-11-11)
1681 ==================
1682
1683 Features added
1684 --------------
1685
1686 * ``lxml.objectify.deannotate()`` has a new boolean option
1687   ``cleanup_namespaces`` to remove the objectify namespace
1688   declarations (and generally clean up the namespace declarations)
1689   after removing the type annotations.
1690
1691 * ``lxml.objectify`` gained its own ``SubElement()`` function as a
1692   copy of ``etree.SubElement`` to avoid an otherwise redundant import
1693   of ``lxml.etree`` on the user side.
1694
1695 Bugs fixed
1696 ----------
1697
1698 * Fixed the "descendant" bug in cssselect a second time (after a first
1699   fix in lxml 2.3.1).  The previous change resulted in a serious
1700   performance regression for the XPath based evaluation of the
1701   translated expression.  Note that this breaks the usage of some of
1702   the generated XPath expressions as XSLT location paths that
1703   previously worked in 2.3.1.
1704
1705 * Fixed parsing of some selectors in cssselect. Whitespace after combinators
1706   ">", "+" and "~" is now correctly ignored. Previously it was parsed as
1707   a descendant combinator. For example, "div> .foo" was parsed the same as
1708   "div>* .foo" instead of "div>.foo". Patch by Simon Sapin.
1709
1710 Other changes
1711 --------------
1712
1713
1714 2.3.1 (2011-09-25)
1715 ==================
1716
1717 Features added
1718 --------------
1719
1720 * New option ``kill_tags`` in ``lxml.html.clean`` to remove specific
1721   tags and their content (i.e. their whole subtree).
1722
1723 * ``pi.get()`` and ``pi.attrib`` on processing instructions to parse
1724   pseudo-attributes from the text content of processing instructions.
1725
1726 * ``lxml.get_include()`` returns a list of include paths that can be
1727   used to compile external C code against lxml.etree.  This is
1728   specifically required for statically linked lxml builds when code
1729   needs to compile against the exact same header file versions as lxml
1730   itself.
1731
1732 * ``Resolver.resolve_file()`` takes an additional option
1733   ``close_file`` that configures if the file(-like) object will be
1734   closed after reading or not.  By default, the file will be closed,
1735   as the user is not expected to keep a reference to it.
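
A short sketch of the pseudo-attribute access on processing instructions
mentioned above.  The ``xml-stylesheet`` target and its values are just an
example of a PI that commonly carries such pseudo-attributes.

```python
from lxml import etree

pi = etree.ProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="style.xsl"')

href = pi.get("href")            # parsed from the PI's text content
pseudo = dict(pi.attrib)         # all pseudo-attributes as a dict
missing = pi.get("media", "all") # default for a name that is not present
```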
1736
1737 Bugs fixed
1738 ----------
1739
1740 * HTML cleaning didn't remove 'data:' links.
1741
1742 * The html5lib parser integration now uses the 'official'
1743   implementation in html5lib itself, which makes it work with newer
1744   releases of the library.
1745
1746 * In ``lxml.sax``, ``endElementNS()`` could incorrectly reject a plain
1747   tag name when the corresponding start event inferred the same plain
1748   tag name to be in the default namespace.
1749
1750 * When an open file-like object is passed into ``parse()`` or
1751   ``iterparse()``, the parser will no longer close it after use.  This
1752   reverts a change in lxml 2.3 where all files would be closed.  It is
1753   the user's responsibility to properly close the file(-like) object,
1754   also in error cases.
1755
1756 * Assertion error in lxml.html.cleaner when discarding top-level elements.
1757
1758 * In lxml.cssselect, use the xpath 'A//B' (short for
1759   'A/descendant-or-self::node()/B') instead of 'A/descendant::B' for
1760   the css descendant selector ('A B').  This makes a few edge cases
1761   like ``"div *:last-child"`` consistent with the selector behavior in
1762   WebKit and Firefox, and makes more css expressions valid location
1763   paths (for use in xsl:template match).
1764
1765 * In lxml.html, non-selected ``<option>`` tags no longer show up in the
1766   collected form values.
1767
1768 * Adding/removing ``<option>`` values to/from a multiple select form
1769   field properly selects them and unselects them.
1770
1771 Other changes
1772 --------------
1773
1774 * Static builds can specify the download directory with the
1775   ``--download-dir`` option.
1776
1777
1778 2.3 (2011-02-06)
1779 ================
1780
1781 Features added
1782 --------------
1783
1784 * When looking for children, ``lxml.objectify`` takes '{}tag' as
1785   meaning an empty namespace, as opposed to the parent namespace.
1786
1787 Bugs fixed
1788 ----------
1789
1790 * When finished reading from a file-like object, the parser
1791   immediately calls its ``.close()`` method.
1792
1793 * When finished parsing, ``iterparse()`` immediately closes the input
1794   file.
1795
1796 * Work-around for libxml2 bug that can leave the HTML parser in a
1797   non-functional state after parsing a severely broken document (fixed
1798   in libxml2 2.7.8).
1799
1800 * ``marque`` tag in HTML cleanup code is correctly named ``marquee``.
1801
1802 Other changes
1803 --------------
1804
1805 * Some public functions in the Cython-level C-API have more explicit
1806   return types.
1807
1808
1809 2.3beta1 (2010-09-06)
1810 =====================
1811
1812 Features added
1813 --------------
1814
1815 Bugs fixed
1816 ----------
1817
1818 * Crash in newer libxml2 versions when moving elements between
1819   documents that had attributes on replaced XInclude nodes.
1820
1821 * ``XMLID()`` function was missing the optional ``parser`` and
1822   ``base_url`` parameters.
1823
1824 * Searching for wildcard tags in ``iterparse()`` was broken in Py3.
1825
1826 * ``lxml.html.open_in_browser()`` didn't work in Python 3 due to the
1827   use of os.tempnam.  It now takes an optional 'encoding' parameter.
1828
1829 Other changes
1830 --------------
1831
1832
1833 2.3alpha2 (2010-07-24)
1834 ======================
1835
1836 Features added
1837 --------------
1838
1839 Bugs fixed
1840 ----------
1841
1842 * Crash in XSLT when generating text-only result documents with a
1843   stylesheet created in a different thread.
1844
1845 Other changes
1846 --------------
1847
1848 * ``repr()`` of Element objects shows the hex ID with leading 0x
1849   (following ElementTree 1.3).
1850
1851
1852 2.3alpha1 (2010-06-19)
1853 ======================
1854
1855 Features added
1856 --------------
1857
1858 * Keyword argument ``namespaces`` in ``lxml.cssselect.CSSSelector()``
1859   to pass a prefix-to-namespace mapping for the selector.
1860
1861 * New function ``lxml.etree.register_namespace(prefix, uri)`` that
1862   globally registers a namespace prefix for a namespace that newly
1863   created Elements in that namespace will use automatically.  Follows
1864   ElementTree 1.3.
1865
1866 * Support 'unicode' string name as encoding parameter in
1867   ``tostring()``, following ElementTree 1.3.
1868
1869 * Support 'c14n' serialisation method in ``ElementTree.write()`` and
1870   ``tostring()``, following ElementTree 1.3.
1871
1872 * The ElementPath expression syntax (``el.find*()``) was extended to
1873   match the upcoming ElementTree 1.3 that will ship in the standard
1874   library of Python 3.2/2.7.  This includes extended support for
1875   predicates as well as namespace prefixes (as known from XPath).
1876
1877 * During regular XPath evaluation, various EXSLT functions are
1878   available within their namespace when using libxslt 1.1.26 or later.
1879
1880 * Support passing a readily configured logger instance into
1881   ``PyErrorLog``, instead of a logger name.
1882
1883 * On serialisation, the new ``doctype`` parameter can be used to
1884   override the DOCTYPE (internal subset) of the document.
1885
1886 * New parameter ``output_parent`` to ``XSLTExtension.apply_templates()``
1887   to append the resulting content directly to an output element.
1888
1889 * ``XSLTExtension.process_children()`` to process the content of the
1890   XSLT extension element itself.
1891
1892 * ISO-Schematron support based on the de-facto Schematron reference
1893   'skeleton implementation'.
1894
1895 * XSLT objects now take XPath objects as ``__call__`` stylesheet
1896   parameters.
1897
1898 * Enable path caching in ElementPath (``el.find*()``) to avoid parsing
1899   overhead.
1900
1901 * Setting the value of a namespaced attribute always uses a prefixed
1902   namespace instead of the default namespace even if both declare the
1903   same namespace URI.  This avoids serialisation problems when an
1904   attribute from a default namespace is set on an element from a
1905   different namespace.
1906
1907 * XSLT extension elements: support for XSLT context nodes other than
1908   elements: document root, comments, processing instructions.
1909
1910 * Support for strings (in addition to Elements) in node-sets returned
1911   by extension functions.
1912
1913 * Forms that lack an ``action`` attribute default to the base URL of
1914   the document on submit.
1915
1916 * XPath attribute result strings have an ``attrname`` property.
1917
1918 * Namespace URIs get validated against RFC 3986 at the API level
1919   (required by the XML namespace specification).
1920
1921 * Target parsers show their target object in the ``.target`` property
1922   (compatible with ElementTree).
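
Some of the ElementTree 1.3 compatible additions above could be combined as
in the following sketch.  The namespace URI, the ``ex`` prefix and the
element names are hypothetical.  Since these features follow ElementTree,
the import falls back to the standard library if lxml is not available.

```python
try:
    from lxml import etree
except ImportError:  # assumption: stdlib ElementTree as a stand-in
    import xml.etree.ElementTree as etree

NS = "http://example.com/ns"  # hypothetical namespace URI

# Globally prefer the prefix "ex" for this namespace in new Elements.
etree.register_namespace("ex", NS)

root = etree.Element("{%s}root" % NS)
child = etree.SubElement(root, "{%s}child" % NS, kind="demo")

# encoding="unicode" requests a text string instead of encoded bytes.
text = etree.tostring(root, encoding="unicode")

# ElementPath with an attribute predicate.
match = root.find("{%s}child[@kind='demo']" % NS)
```

Note that ``register_namespace()`` has a global effect, so a library would
usually want to avoid calling it on behalf of its users.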
1923
1924 Bugs fixed
1925 ----------
1926
1927 * API is hardened against invalid proxy instances to prevent crashes
1928   due to incorrectly instantiated Element instances.
1929
1930 * Prevent crash when instantiating ``CommentBase`` and friends.
1931
1932 * Export ElementTree compatible XML parser class as
1933   ``XMLTreeBuilder``, as it is called in ET 1.2.
1934
1935 * ObjectifiedDataElements in lxml.objectify were not hashable.  They
1936   now use the hash value of the underlying Python value (string,
1937   number, etc.) to which they compare equal.
1938
1939 * Parsing broken fragments in lxml.html could fail if the fragment
1940   contained an orphaned closing '</div>' tag.
1941
1942 * Using XSLT extension elements around the root of the output document
1943   crashed.
1944
1945 * ``lxml.cssselect`` did not distinguish between ``x[attr="val"]`` and
1946   ``x [attr="val"]`` (with a space).  The latter now matches the
1947   attribute independent of the element.
1948
1949 * Rewriting multiple links inside of HTML text content could end up
1950   replacing unrelated content as replacements could impact the
1951   reported position of subsequent matches.  Modifications are now
1952   simplified by letting the ``iterlinks()`` generator in ``lxml.html``
1953   return links in reversed order if they appear inside the same text
1954   node.  Thus, replacements and link-internal modifications no longer
1955   change the position of links reported afterwards.
1956
1957 * The ``.value`` attribute of ``textarea`` elements in lxml.html did
1958   not represent the complete raw value (including child tags etc.). It
1959   now serialises the complete content on read and replaces the
1960   complete content by a string on write.
1961
1962 * Target parser didn't call ``.close()`` on the target object if
1963   parsing failed.  Now it is guaranteed that ``.close()`` will be
1964   called after parsing, regardless of the outcome.
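
The ``.close()`` guarantee for target parsers can be observed with a small
sketch like the following. The ``Collector`` class and the broken input are
assumptions made for this example, not part of lxml:

```python
from lxml import etree


class Collector:
    """Hypothetical parser target that records start tags."""

    def __init__(self):
        self.tags = []
        self.closed = False

    def start(self, tag, attrib):
        self.tags.append(tag)

    def end(self, tag):
        pass

    def data(self, data):
        pass

    def close(self):
        # Called by the parser when parsing ends, successfully or not.
        self.closed = True
        return self.tags


target = Collector()
parser = etree.XMLParser(target=target)

try:
    # The <a> element is never closed, so this input is not well-formed.
    etree.fromstring("<root><a></root>", parser)
except etree.XMLSyntaxError:
    pass
```

After the failed parse, ``target.closed`` should be set, so a target can
release resources in ``close()`` without extra error handling.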
1965
1966 Other changes
1967 -------------
1968
1969 * Official support for Python 3.1.2 and later.
1970
1971 * Static MS Windows builds can now download their dependencies
1972   themselves.
1973
1974 * ``Element.attrib`` no longer uses a cyclic reference back to its
1975   Element object.  It therefore no longer requires the garbage
1976   collector to clean up.
1977
1978 * Static builds include libiconv, in addition to libxml2 and libxslt.
1979
1980
1981 2.2.8 (2010-09-02)
1982 ==================
1983
1984 Bugs fixed
1985 ----------
1986
1987 * Crash in newer libxml2 versions when moving elements between
1988   documents that had attributes on replaced XInclude nodes.
1989
1990 * Import fix for urljoin in Python 3.1+.
1991
1992
1993 2.2.7 (2010-07-24)
1994 ==================
1995
1996 Bugs fixed
1997 ----------
1998
1999 * Crash in XSLT when generating text-only result documents with a
2000   stylesheet created in a different thread.
2001
2002
2003 2.2.6 (2010-03-02)
2004 ==================
2005
2006 Bugs fixed
2007 ----------
2008
2009 * Fixed several Python 3 regressions by building with Cython 0.11.3.
2010
2011
2012 2.2.5 (2010-02-28)
2013 ==================
2014
2015 Features added
2016 --------------
2017
2018 * Support for running XSLT extension elements on the input root node
2019   (e.g. in a template matching on "/").
2020
2021 Bugs fixed
2022 ----------
2023
2024 * Crash in XPath evaluation when reading smart strings from a document
2025   other than the original context document.
2026
2027 * Support recent versions of html5lib by not requiring its
2028   ``XHTMLParser`` in ``htmlparser.py`` anymore.
2029
2030 * Manually instantiating the custom element classes in
2031   ``lxml.objectify`` could crash.
2032
2033 * Invalid XML text characters were not rejected by the API when they
2034   appeared in unicode strings directly after non-ASCII characters.
2035
2036 * lxml.html.open_http_urllib() did not work in Python 3.
2037
2038 * The functions ``strip_tags()`` and ``strip_elements()`` in
2039   ``lxml.etree`` did not remove all occurrences of a tag in all cases.
2040
2041 * Crash in XSLT extension elements when the XSLT context node is not
2042   an element.
2043
2044
2045 2.2.4 (2009-11-11)
2046 ==================
2047
2048 Bugs fixed
2049 ----------
2050
2051 * Static build of libxml2/libxslt was broken.
2052
2053
2054 2.2.3 (2009-10-30)
2055 ==================
2056
2057 Features added
2058 --------------
2059
2060 Bugs fixed
2061 ----------
2062
2063 * The ``resolve_entities`` option did not work in the incremental feed
2064   parser.
2065
2066 * Looking up and deleting attributes without a namespace could hit a
2067   namespaced attribute of the same name instead.
2068
2069 * Late errors during calls to ``SubElement()`` (e.g. attribute related
2070   ones) could leave a partially initialised element in the tree.
2071
2072 * Modifying trees that contain parsed entity references could result
2073   in an infinite loop.
2074
2075 * ``ObjectifiedElement.__setattr__`` created an empty-string child element
2076   when the attribute value was rejected as a non-unicode/non-ASCII string.
2077
2078 * Syntax errors in ``lxml.cssselect`` could result in misleading error
2079   messages.
2080
2081 * Invalid syntax in CSS expressions could lead to an infinite loop in
2082   the parser of ``lxml.cssselect``.
2083
2084 * CSS special character escapes were not properly handled in
2085   ``lxml.cssselect``.
2086
2087 * CSS Unicode escapes were not properly decoded in ``lxml.cssselect``.
2088
2089 * Select options in HTML forms that had no explicit ``value``
2090   attribute were not handled correctly.  The HTML standard dictates
2091   that their value is defined by their text content.  This is now
2092   supported by lxml.html.
2093
2094 * XPath raised a TypeError when finding CDATA sections.  This is now
2095   fully supported.
2096
2097 * Calling ``help(lxml.objectify)`` didn't work at the prompt.
2098
2099 * The ``ElementMaker`` in lxml.objectify no longer defines the default
2100   namespaces when annotation is disabled.
2101
2102 * Feed parser failed to honour the 'recover' option on parse errors.
2103
2104 * Diverting the error logging to Python's logging system was broken.
2105
2106 Other changes
2107 -------------
2108
2109
2110 2.2.2 (2009-06-21)
2111 ==================
2112
2113 Features added
2114 --------------
2115
2116 * New helper functions ``strip_attributes()``, ``strip_elements()``,
2117   ``strip_tags()`` in lxml.etree to remove attributes/subtrees/tags
2118   from a subtree.
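
A short sketch of how the three helpers differ, using a made-up document.
The ``with_tail`` keyword is available in current lxml releases and may not
exist in the release described here:

```python
from lxml import etree

# Hypothetical sample document, used only for illustration.
root = etree.fromstring(
    '<doc id="1"><p class="x">Hello <b>bold</b> world<i>gone</i>!</p></doc>'
)

# Remove the "class" attribute from every element in the tree.
etree.strip_attributes(root, "class")

# Remove the <b> tags but keep their text and tail in the parent.
etree.strip_tags(root, "b")

# Remove the <i> elements together with their content, keeping the tail text.
etree.strip_elements(root, "i", with_tail=False)

result = etree.tostring(root, encoding="unicode")
```

``strip_tags()`` merges the content into the parent, whereas
``strip_elements()`` drops the whole subtree.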
2119
2120 Bugs fixed
2121 ----------
2122
2123 * Namespace cleanup on subtree insertions could result in missing
2124   namespace declarations (and potentially crashes) if the element
2125   defining a namespace was deleted and the namespace was not used by
2126   the top element of the inserted subtree but only in deeper subtrees.
2127
2128 * Raising an exception from a parser target callback didn't always
2129   terminate the parser.
2130
2131 * Only {true, false, 1, 0} are accepted as the lexical representation for
2132   BoolElement ({True, False, T, F, t, f} are no longer accepted), restoring
2133   the behaviour of lxml <= 2.0.
2134
2135 Other changes
2136 -------------
2137
2138
2139 2.2.1 (2009-06-02)
2140 ==================
2141
2142 Features added
2143 --------------
2144
2145 * Injecting default attributes into a document during XML Schema
2146   validation (also at parse time).
2147
2148 * Pass ``huge_tree`` parser option to disable parser security
2149   restrictions imposed by libxml2 2.7.
2150
2151 Bugs fixed
2152 ----------
2153
2154 * The script for statically building libxml2 and libxslt didn't work
2155   in Py3.
2156
2157 * ``XMLSchema()`` also passes invalid schema documents on to libxml2
2158   for parsing (which could lead to a crash before release 2.6.24).
2159
2160 Other changes
2161 -------------
2162
2163
2164 2.2 (2009-03-21)
2165 ================
2166
2167 Features added
2168 --------------
2169
2170 * Support for ``standalone`` flag in XML declaration through
2171   ``tree.docinfo.standalone`` and by passing ``standalone=True/False``
2172   on serialisation.
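
A minimal sketch of reading and writing the ``standalone`` flag, assuming a
small in-memory document:

```python
from io import BytesIO

from lxml import etree

# Hypothetical input that declares itself standalone.
source = BytesIO(b'<?xml version="1.0" standalone="yes"?><root/>')
tree = etree.parse(source)

# The parsed flag is exposed on the docinfo object.
flag = tree.docinfo.standalone

# The flag can be overridden when serialising.
xml = etree.tostring(
    tree, xml_declaration=True, encoding="UTF-8", standalone=False
)
```

``flag`` reflects the declaration of the parsed document, while the
serialised ``xml`` carries the value passed to ``tostring()``.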
2173
2174 Bugs fixed
2175 ----------
2176
2177 * Crash when parsing an XML Schema with external imports from a
2178   filename.
2179
2180
2181 2.2beta4 (2009-02-27)
2182 =====================
2183
2184 Features added
2185 --------------
2186
2187 * Support strings and instantiable Element classes as child arguments
2188   to the constructor of custom Element classes.
2189
2190 * GZip compression support for serialisation to files and file-like
2191   objects.
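
The GZip support can be tried with a file-like object as sketched below. The
``compression`` keyword takes a gzip level; the document and buffer are
assumptions for this example:

```python
import gzip
from io import BytesIO

from lxml import etree

tree = etree.ElementTree(etree.fromstring("<root><a/></root>"))

# Serialise into an in-memory buffer with gzip compression level 9.
buffer = BytesIO()
tree.write(buffer, compression=9)

# Decompress again to check that the XML survived the round trip.
restored = gzip.decompress(buffer.getvalue())
```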
2192
2193 Bugs fixed
2194 ----------
2195
2196 * Deep-copying an ElementTree copied neither its sibling PIs and
2197   comments nor its internal/external DTD subsets.
2198
2199 * Soupparser failed on broken attributes without values.
2200
2201 * Crash in XSLT when overwriting an already defined attribute using
2202   ``xsl:attribute``.
2203
2204 * Crash bug in exception handling code under Python 3.  This was due
2205   to a problem in Cython, not lxml itself.
2206
2207 * ``lxml.html.FormElement._name()`` failed for non top-level forms.
2208
2209 * ``TAG`` special attribute in constructor of custom Element classes
2210   was evaluated incorrectly.
2211
2212 Other changes
2213 -------------
2214
2215 * Official support for Python 3.0.1.
2216
2217 * ``Element.findtext()`` now returns an empty string instead of None
2218   for Elements without text content.
2219
2220
2221 2.2beta3 (2009-02-17)
2222 =====================
2223
2224 Features added
2225 --------------
2226
2227 * ``XSLT.strparam()`` class method to wrap quoted string parameters
2228   that require escaping.
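
A sketch of ``XSLT.strparam()`` with a value that contains both kinds of
quotes. The stylesheet and the parameter name ``who`` are made up for this
example:

```python
from lxml import etree

# Hypothetical stylesheet that echoes a string parameter as plain text.
stylesheet = etree.fromstring("""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:param name="who"/>
  <xsl:template match="/">Hello <xsl:value-of select="$who"/></xsl:template>
</xsl:stylesheet>
""")
transform = etree.XSLT(stylesheet)

# Plain keyword parameters are evaluated as XPath expressions, so a string
# containing both quote characters cannot simply be wrapped in quotes.
value = etree.XSLT.strparam('''it's "quoted"''')

result = str(transform(etree.fromstring("<root/>"), who=value))
```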
2229
2230 Bugs fixed
2231 ----------
2232
2233 * Memory leak in XPath evaluators.
2234
2235 * Crash when parsing indented XML in one thread and merging it with
2236   other documents parsed in another thread.
2237
2238 * Setting the ``base`` attribute in ``lxml.objectify`` from a unicode
2239   string failed.
2240
2241 * Fixes following changes in Python 3.0.1.
2242
2243 * Minor fixes for Python 3.
2244
2245 Other changes
2246 -------------
2247
2248 * The global error log (which is copied into the exception log) is now
2249   local to a thread, which fixes some race conditions.
2250
2251 * More robust error handling on serialisation.
2252
2253
2254 2.2beta2 (2009-01-25)
2255 =====================
2256
2257 Bugs fixed
2258 ----------
2259
2260 * Potential memory leak on exception handling.  This was due to a
2261   problem in Cython, not lxml itself.
2262
2263 * ``iterlinks()`` (and related link-rewriting functions) in
2264   ``lxml.html`` would interpret CSS like ``url("link")`` incorrectly
2265   (treating the quotation marks as part of the link).
2266
2267 * Failing import on systems that have an ``io`` module.
2268
2269
2270 2.1.5 (2009-01-06)
2271 ==================
2272
2273 Bugs fixed
2274 ----------
2275
2276 * Potential memory leak on exception handling.  This was due to a
2277   problem in Cython, not lxml itself.
2278
2279 * Failing import on systems that have an ``io`` module.
2280
2281
2282 2.2beta1 (2008-12-12)
2283 =====================
2284
2285 Features added
2286 --------------
2287
2288 * Allow ``lxml.html.diff.htmldiff`` to accept Element objects, not
2289   just HTML strings.
2290
2291 Bugs fixed
2292 ----------
2293
2294 * Crash when using an XPath evaluator in multiple threads.
2295
2296 * Fixed missing whitespace before ``Link:...`` in ``lxml.html.diff``.
2297
2298 Other changes
2299 -------------
2300
2301 * Export ``lxml.html.parse``.
2302
2303
2304 2.1.4 (2008-12-12)
2305 ==================
2306
2307 Bugs fixed
2308 ----------
2309
2310 * Crash when using an XPath evaluator in multiple threads.
2311
2312
2313 2.0.11 (2008-12-12)
2314 ===================
2315
2316 Bugs fixed
2317 ----------
2318
2319 * Crash when using an XPath evaluator in multiple threads.
2320
2321
2322 2.2alpha1 (2008-11-23)
2323 ======================
2324
2325 Features added
2326 --------------
2327
2328 * Support for XSLT result tree fragments in XPath/XSLT extension
2329   functions.
2330
2331 * QName objects have new properties ``namespace`` and ``localname``.
2332
2333 * New options for exclusive C14N and C14N without comments.
2334
2335 * Instantiating a custom Element class creates a new Element.
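
The ``namespace`` and ``localname`` properties of QName objects can be used
as in this small sketch (the XHTML namespace is just an example value):

```python
from lxml import etree

qname = etree.QName("http://www.w3.org/1999/xhtml", "body")

namespace = qname.namespace   # the URI part
localname = qname.localname   # the tag name without the namespace
clark = qname.text            # the combined "{uri}name" notation

# A QName can also be built from the Clark notation of an existing tag.
same = etree.QName(clark)
```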
2336
2337 Bugs fixed
2338 ----------
2339
2340 * XSLT didn't inherit the parse options of the input document.
2341
2342 * 0-bytes could slip through the API when used inside of Unicode
2343   strings.
2344
2345 * With ``lxml.html.clean.autolink``, links with balanced parentheses
2346   that end in a parenthesis are now linked in their entirety (typical
2347   of Wikipedia links).
2348
2349 Other changes
2350 -------------
2351
2352
2353 2.1.3 (2008-11-17)
2354 ==================
2355
2356 Features added
2357 --------------
2358
2359 Bugs fixed
2360 ----------
2361
2362 * Ref-count leaks when lxml enters a try-except statement while an
2363   outside exception lives in sys.exc_*(). This was due to a problem in
2364   Cython, not lxml itself.
2365
2366 * Parser Unicode decoding errors could get swallowed by other
2367   exceptions.
2368
2369 * Name/import errors in some Python modules.
2370
2371 * Internal DTD subsets that did not specify a system or public ID were
2372   not serialised and did not appear in the docinfo property of
2373   ElementTrees.
2374
2375 * Fix a pre-Py3k warning when parsing from a gzip file in Py2.6.
2376
2377 * Test suite fixes for libxml2 2.7.
2378
2379 * Resolver.resolve_string() did not work for non-ASCII byte strings.
2380
2381 * Resolver.resolve_file() was broken.
2382
2383 * Overriding the parser encoding didn't work for many encodings.
2384
2385 Other changes
2386 -------------
2387
2388
2389 2.0.10 (2008-11-17)
2390 ===================
2391
2392 Bugs fixed
2393 ----------
2394
2395 * Ref-count leaks when lxml enters a try-except statement while an
2396   outside exception lives in sys.exc_*(). This was due to a problem in
2397   Cython, not lxml itself.
2398
2399
2400 2.1.2 (2008-09-05)
2401 ==================
2402
2403 Features added
2404 --------------
2405
2406 * lxml.etree now tries to find the absolute path name of files when
2407   parsing from a file-like object.  This helps custom resolvers when
2408   resolving relative URLs, as libxml2 can prepend them with the path
2409   of the source document.
2410
2411 Bugs fixed
2412 ----------
2413
2414 * Memory problem when passing documents between threads.
2415
2416 * Target parser did not honour the ``recover`` option and raised an
2417   exception instead of calling ``.close()`` on the target.
2418
2419 Other changes
2420 -------------
2421
2422
2423 2.0.9 (2008-09-05)
2424 ==================
2425
2426 Bugs fixed
2427 ----------
2428
2429 * Memory problem when passing documents between threads.
2430
2431 * Target parser did not honour the ``recover`` option and raised an
2432   exception instead of calling ``.close()`` on the target.
2433
2434
2435 2.1.1 (2008-07-24)
2436 ==================
2437
2438 Features added
2439 --------------
2440
2441 Bugs fixed
2442 ----------
2443
2444 * Crash when parsing XSLT stylesheets in a thread and using them in
2445   another.
2446
2447 * Encoding problem when including text with ElementInclude under
2448   Python 3.
2449
2450 Other changes
2451 -------------
2452
2453
2454 2.0.8 (2008-07-24)
2455 ==================
2456
2457 Features added
2458 --------------
2459
2460 * ``lxml.html.rewrite_links()`` strips surrounding whitespace from links
2461   to work around documents with whitespace in URL attributes.
2462
2463 Bugs fixed
2464 ----------
2465
2466 * Crash when parsing XSLT stylesheets in a thread and using them in
2467   another.
2468
2469 * CSS selector parser dropped remaining expression after a function
2470   with parameters.
2471
2472 Other changes
2473 -------------
2474
2475
2476 2.1 (2008-07-09)
2477 ================
2478
2479 Features added
2480 --------------
2481
2482 * Smart strings can be switched off in XPath (``smart_strings``
2483   keyword option).
2484
2485 * ``lxml.html.rewrite_links()`` strips surrounding whitespace from links
2486   to work around documents with whitespace in URL attributes.
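
The effect of the ``smart_strings`` option can be seen in a sketch like this
one, which assumes a tiny sample document:

```python
from lxml import etree

root = etree.fromstring("<root><a>text</a></root>")

# Default: text results are smart strings that know their parent element.
smart = root.xpath("//a/text()")[0]
parent_tag = smart.getparent().tag

# With smart strings switched off, plain strings are returned instead.
plain = root.xpath("//a/text()", smart_strings=False)[0]
has_parent_lookup = hasattr(plain, "getparent")
```

Switching smart strings off avoids keeping the tree alive through string
results that are stored for a long time.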
2487
2488 Bugs fixed
2489 ----------
2490
2491 * Custom resolvers were not used for XMLSchema includes/imports and
2492   XInclude processing.
2493
2494 * CSS selector parser dropped remaining expression after a function
2495   with parameters.
2496
2497 Other changes
2498 -------------
2499
2500 * ``objectify.enableRecursiveStr()`` was removed; use
2501   ``objectify.enable_recursive_str()`` instead.
2502
2503 * Speed-up when running XSLTs on documents from other threads.
2504
2505
2506 2.0.7 (2008-06-20)
2507 ==================
2508
2509 Features added
2510 --------------
2511
2512 * Pickling ``ElementTree`` objects in lxml.objectify.
2513
2514 Bugs fixed
2515 ----------
2516
2517 * Descending dot-separated classes in CSS selectors were not resolved
2518   correctly.
2519
2520 * ``ElementTree.parse()`` didn't handle target parser result.
2521
2522 * Potential threading problem in XInclude.
2523
2524 * Crash in Element class lookup classes when the __init__() method of
2525   the super class is not called from Python subclasses.
2526
2527 Other changes
2528 -------------
2529
2530 * Non-ASCII characters in attribute values are no longer escaped on
2531   serialisation.
2532
2533
2534 2.1beta3 (2008-06-19)
2535 =====================
2536
2537 Features added
2538 --------------
2539
2540 * Major overhaul of ``tools/xpathgrep.py`` script.
2541
2542 * Pickling ``ElementTree`` objects in lxml.objectify.
2543
2544 * Support for parsing from file-like objects that return unicode
2545   strings.
2546
2547 * New function ``etree.cleanup_namespaces(el)`` that removes unused
2548   namespace declarations from a (sub)tree (experimental).
2549
2550 * XSLT results support the buffer protocol in Python 3.
2551
2552 * Polymorphic functions in ``lxml.html`` that accept either a tree or
2553   a parsable string will return either a UTF-8 encoded byte string, a
2554   unicode string or a tree, based on the type of the input.
2555   Previously, the result was always a byte string or a tree.
2556
2557 * Support for Python 2.6 and 3.0 beta.
2558
2559 * File name handling now uses a heuristic to convert between byte
2560   strings (usually filenames) and unicode strings (usually URLs).
2561
2562 * Parsing from a plain file object frees the GIL under Python 2.x.
2563
2564 * Running ``iterparse()`` on a plain file (or filename) frees the GIL
2565   on reading under Python 2.x.
2566
2567 * Conversion functions ``html_to_xhtml()`` and ``xhtml_to_html()`` in
2568   lxml.html (experimental).
2569
2570 * Most features in lxml.html work for XHTML namespaced tag names
2571   (experimental).
2572
2573 Bugs fixed
2574 ----------
2575
2576 * ``ElementTree.parse()`` didn't handle target parser result.
2577
2578 * Crash in Element class lookup classes when the __init__() method of
2579   the super class is not called from Python subclasses.
2580
2581 * A number of problems related to unicode/byte string conversion of
2582   filenames and error messages were fixed.
2583
2584 * Building on MacOS-X now passes the "flat_namespace" option to the C
2585   compiler, which reportedly prevents build quirks and crashes on this
2586   platform.
2587
2588 * Windows build was broken.
2589
2590 * Rare crash when serialising to a file object with certain encodings.
2591
2592 Other changes
2593 -------------
2594
2595 * Non-ASCII characters in attribute values are no longer escaped on
2596   serialisation.
2597
2598 * Passing non-ASCII byte strings or invalid unicode strings as .tag,
2599   namespaces, etc. will result in a ValueError instead of an
2600   AssertionError (just like the tag well-formedness check).
2601
2602 * Up to several times faster attribute access (i.e. tree traversal) in
2603   lxml.objectify.
2604
2605
2606 2.0.6 (2008-05-31)
2607 ==================
2608
2609 Features added
2610 --------------
2611
2612 Bugs fixed
2613 ----------
2614
2615 * Incorrect evaluation of ``el.find("tag[child]")``.
2616
2617 * Windows build was broken.
2618
2619 * Moving a subtree from a document created in one thread into a
2620   document of another thread could crash when the rest of the source
2621   document is deleted while the subtree is still in use.
2622
2623 * Rare crash when serialising to a file object with certain encodings.
2624
2625 Other changes
2626 -------------
2627
2628 * lxml should now build without problems on MacOS-X.
2629
2630
2631 2.1beta2 (2008-05-02)
2632 =====================
2633
2634 Features added
2635 --------------
2636
2637 * All parse functions in lxml.html take a ``parser`` keyword argument.
2638
2639 * lxml.html has a new parser class ``XHTMLParser`` and a module
2640   attribute ``xhtml_parser`` that provide XML parsers that are
2641   pre-configured for the lxml.html package.
2642
2643 Bugs fixed
2644 ----------
2645
2646 * Moving a subtree from a document created in one thread into a
2647   document of another thread could crash when the rest of the source
2648   document is deleted while the subtree is still in use.
2649
2650 * Passing an nsmap when creating an Element will no longer strip
2651   redundantly defined namespace URIs.  This prevented the definition
2652   of more than one prefix for a namespace on the same Element.
2653
2654 Other changes
2655 -------------
2656
2657 * If the default namespace is redundantly defined with a prefix on the
2658   same Element, the prefix will now be preferred for subelements and
2659   attributes.  This allows users to work around a problem in libxml2
2660   where attributes from the default namespace could serialise without
2661   a prefix even when they appear on an Element with a different
2662   namespace (i.e. they would end up in the wrong namespace).
2663
2664
2665 2.0.5 (2008-05-01)
2666 ==================
2667
2668 Features added
2669 --------------
2670
2671 Bugs fixed
2672 ----------
2673
2674 * Resolving to a filename in custom resolvers didn't work.
2675
2676 * lxml did not honour libxslt's second error state "STOPPED", which
2677   let some XSLT errors pass silently.
2678
2679 * Memory leak in Schematron with libxml2 >= 2.6.31.
2680
2681 Other changes
2682 -------------
2683
2684
2685 2.1beta1 (2008-04-15)
2686 =====================
2687
2688 Features added
2689 --------------
2690
2691 * Error logging in Schematron (requires libxml2 2.6.32 or later).
2692
2693 * Parser option ``strip_cdata`` for normalising or keeping CDATA
2694   sections.  Defaults to ``True`` as before, thus replacing CDATA
2695   sections by their text content.
2696
2697 * ``CDATA()`` factory to wrap string content as CDATA section.
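
A minimal sketch of the ``strip_cdata`` parser option and the ``CDATA()``
factory, using a made-up input:

```python
from lxml import etree

xml = b"<root><![CDATA[a < b]]></root>"

# Default parser: the CDATA section is replaced by its text content.
default_root = etree.fromstring(xml)
out_default = etree.tostring(default_root)

# With strip_cdata=False, the section is kept on serialisation.
keeping_parser = etree.XMLParser(strip_cdata=False)
kept_root = etree.fromstring(xml, keeping_parser)
out_kept = etree.tostring(kept_root)

# CDATA() wraps new text content so that it is written as a CDATA section.
element = etree.Element("root")
element.text = etree.CDATA("x & y")
out_new = etree.tostring(element)
```

In both parser configurations, ``.text`` returns the plain string; only the
serialised form differs.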
2698
2699 Bugs fixed
2700 ----------
2701
2702 * Resolving to a filename in custom resolvers didn't work.
2703
2704 * lxml did not honour libxslt's second error state "STOPPED", which
2705   let some XSLT errors pass silently.
2706
2707 * Memory leak in Schematron with libxml2 >= 2.6.31.
2708
2709 * lxml.etree accepted non well-formed namespace prefix names.
2710
2711 Other changes
2712 -------------
2713
2714 * Major cleanup in internal ``moveNodeToDocument()`` function, which
2715   takes care of namespace cleanup when moving elements between
2716   different namespace contexts.
2717
2718 * New Elements created through the ``makeelement()`` method of an HTML
2719   parser or through lxml.html now end up in a new HTML document
2720   (doctype HTML 4.01 Transitional) instead of a generic XML document.
2721   This mostly impacts the serialisation and the availability of a DTD
2722   context.
2723
2724
2725 2.0.4 (2008-04-13)
2726 ==================
2727
2728 Features added
2729 --------------
2730
2731 Bugs fixed
2732 ----------
2733
2734 * Hanging thread in conjunction with GTK threading.
2735
2736 * Crash bug in iterparse when moving elements into other documents.
2737
2738 * HTML elements' ``.cssselect()`` method was broken.
2739
2740 * ``ElementTree.find*()`` didn't accept QName objects.
2741
2742 Other changes
2743 -------------
2744
2745
2746 2.1alpha1 (2008-03-27)
2747 ======================
2748
2749 Features added
2750 --------------
2751
2752 * New event types 'comment' and 'pi' in ``iterparse()``.
2753
2754 * ``XSLTAccessControl`` instances have a property ``options`` that
2755   returns a dict of access configuration options.
2756
2757 * Constant instances ``DENY_ALL`` and ``DENY_WRITE`` on
2758   ``XSLTAccessControl`` class.
2759
2760 * Extension elements for XSLT (experimental!)
2761
2762 * ``Element.base`` property returns the xml:base or HTML base URL of
2763   an Element.
2764
2765 * ``docinfo.URL`` property is writable.
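
The new ``iterparse()`` event types might be used as sketched here. The input
document and the way events are recorded are assumptions for this example:

```python
from io import BytesIO

from lxml import etree

# Hypothetical input with a comment and a processing instruction.
data = b'<root><!-- note --><?style x="1"?><a/></root>'

seen = []
events = ("start", "end", "comment", "pi")
for event, node in etree.iterparse(BytesIO(data), events=events):
    if event == "comment":
        seen.append((event, node.text))      # the comment text
    elif event == "pi":
        seen.append((event, node.target))    # the PI target name
    else:
        seen.append((event, node.tag))       # a normal element event
```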
2766
2767 Bugs fixed
2768 ----------
2769
2770 * Default encoding for plain text serialisation was different from
2771   that of XML serialisation (UTF-8 instead of ASCII).
2772
2773 Other changes
2774 -------------
2775
2776 * Minor API speed-ups.
2777
2778 * The benchmark suite now uses tail text in the trees, which makes the
2779   absolute numbers incomparable to previous results.
2780
2781 * Generating the HTML documentation now requires Pygments_, which is
2782   used to enable syntax highlighting for the doctest examples.
2783
2784 .. _Pygments: http://pygments.org/
2785
2786 Most long-time deprecated functions and methods were removed:
2787
2788 - ``etree.clearErrorLog()``, use ``etree.clear_error_log()``
2789
2790 - ``etree.useGlobalPythonLog()``, use
2791   ``etree.use_global_python_log()``
2792
2793 - ``etree.ElementClassLookup.setFallback()``, use
2794   ``etree.ElementClassLookup.set_fallback()``
2795
2796 - ``etree.getDefaultParser()``, use ``etree.get_default_parser()``
2797
2798 - ``etree.setDefaultParser()``, use ``etree.set_default_parser()``
2799
2800 - ``etree.setElementClassLookup()``, use
2801   ``etree.set_element_class_lookup()``
2802
2803   Note that ``parser.setElementClassLookup()`` has not been removed
2804   yet, although ``parser.set_element_class_lookup()`` should be used
2805   instead.
2806
2807 - ``xpath_evaluator.registerNamespace()``, use
2808   ``xpath_evaluator.register_namespace()``
2809
2810 - ``xpath_evaluator.registerNamespaces()``, use
2811   ``xpath_evaluator.register_namespaces()``
2812
2813 - ``objectify.setPytypeAttributeTag``, use
2814   ``objectify.set_pytype_attribute_tag``
2815
2816 - ``objectify.setDefaultParser()``, use
2817   ``objectify.set_default_parser()``
2818
2819
2820 2.0.3 (2008-03-26)
2821 ==================
2822
2823 Features added
2824 --------------
2825
2826 * soupparser.parse() allows passing keyword arguments on to
2827   BeautifulSoup.
2828
2829 * ``fromstring()`` method in ``lxml.html.soupparser``.
2830
2831 Bugs fixed
2832 ----------
2833
2834 * ``lxml.html.diff`` didn't treat empty tags properly (e.g.,
2835   ``<br>``).
2836
2837 * Handle entity replacements correctly in target parser.
2838
2839 * Crash when using ``iterparse()`` with XML Schema validation.
2840
2841 * The BeautifulSoup parser (soupparser.py) did not replace entities,
2842   which made them turn up in text content.
2843
2844 * Attribute assignment of custom PyTypes in objectify could fail to
2845   correctly serialise the value to a string.
2846
2847 Other changes
2848 -------------
2849
2850 * ``lxml.html.ElementSoup`` was replaced by a new module
2851   ``lxml.html.soupparser`` with a more consistent API.  The old module
2852   remains for compatibility with ElementTree's own ElementSoup module.
2853
2854 * Setting the XSLT_CONFIG and XML2_CONFIG environment variables at
2855   build time will let setup.py pick up the ``xml2-config`` and
2856   ``xslt-config`` scripts from the supplied path name.
2857
2858 * Passing ``--with-xml2-config=/path/to/xml2-config`` to setup.py will
2859   override the ``xml2-config`` script that is used to determine the C
2860   compiler options.  The same applies for the ``--with-xslt-config``
2861   option.
2862
2863
2864 2.0.2 (2008-02-22)
2865 ==================
2866
2867 Features added
2868 --------------
2869
2870 * Support passing ``base_url`` to file parser functions to override
2871   the filename of the file(-like) object.
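
A sketch of passing ``base_url`` when parsing from a file-like object. The
URL is a placeholder chosen for this example:

```python
from io import BytesIO

from lxml import etree

# An in-memory file has no name, so the base URL is supplied explicitly.
tree = etree.parse(
    BytesIO(b"<root/>"), base_url="http://example.com/doc.xml"
)

# The value is recorded as the document URL.
url = tree.docinfo.URL
```

Relative references in the document, such as includes, are then resolved
against this URL.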
2872
2873 Bugs fixed
2874 ----------
2875
2876 * The prefix for objectify's pytype namespace was missing from the set
2877   of default prefixes.
2878
2879 * Memory leak in Schematron (fixed only for libxml2 2.6.31+).
2880
2881 * Error type names in RelaxNG were reported incorrectly.
2882
2883 * Slice deletion bug fixed in objectify.
2884
2885 Other changes
2886 -------------
2887
2888 * Enabled doctests for some Python modules (especially ``lxml.html``).
2889
2890 * Add a ``method`` argument to ``lxml.html.tostring()``
2891   (``method="xml"`` for XHTML output).
2892
2893 * Make it clearer that methods like ``lxml.html.fromstring()`` take a
2894   ``base_url`` argument.
2895
2896
2897 2.0.1 (2008-02-13)
2898 ==================
2899
2900 Features added
2901 --------------
2902
2903 * Child iteration in ``lxml.pyclasslookup``.
2904
2905 * Loads of new docstrings reflect the signature of functions and
2906   methods to make them visible in API docs and ``help()``
2907
2908 Bugs fixed
2909 ----------
2910
2911 * The module ``lxml.html.builder`` was duplicated as
2912   ``lxml.htmlbuilder``
2913
2914 * Form elements would return None for ``form.fields.keys()`` if there
2915   was an unnamed input field.  Now unnamed input fields are completely
2916   ignored.
2917
2918 * Setting an element slice in objectify could insert slice-overlapping
2919   elements at the wrong position.
2920
2921 Other changes
2922 -------------
2923
2924 * The generated API documentation was cleaned up and no longer lists
2925   non-public classes etc.
2926
2927 * The previously public module ``lxml.html.setmixin`` was renamed to
2928   ``lxml.html._setmixin`` as it is not an official part of lxml.  If
2929   you want to use it, feel free to copy it over to your own source
2930   base.
2931
2932 * Passing ``--with-xslt-config=/path/to/xslt-config`` to setup.py will
2933   override the ``xslt-config`` script that is used to determine the C
2934   compiler options.
2935
2936
2937 2.0 (2008-02-01)
2938 ================
2939
2940 Features added
2941 --------------
2942
2943 * Passing the ``unicode`` type as ``encoding`` to ``tostring()`` will
2944   serialise to unicode.  The ``tounicode()`` function is now
2945   deprecated.
2946
2947 * ``XMLSchema()`` and ``RelaxNG()`` can parse from StringIO.
2948
2949 * ``makeparser()`` function in ``lxml.objectify`` to create a new
2950   parser with the usual objectify setup.
2951
2952 * Plain ASCII XPath string results are no longer forced into unicode
2953   objects as in 2.0beta1, but are returned as plain strings as before.
2954
2955 * All XPath string results are 'smart' objects that have a
2956   ``getparent()`` method to retrieve their parent Element.
2957
2958 * ``with_tail`` option in serialiser functions.
2959
2960 * More accurate exception messages in validator creation.
2961
2962 * Parse-time XML schema validation (``schema`` parser keyword).
2963
2964 * XPath string results of the ``text()`` function and attribute
2965   selection make their Element container accessible through a
2966   ``getparent()`` method.  As a side-effect, they are now always
2967   unicode objects (even ASCII strings).
2968
2969 * ``XSLT`` objects are usable in any thread - at the cost of a deep
2970   copy if they were not created in that thread.
2971
2972 * Invalid entity names and character references will be rejected by
2973   the ``Entity()`` factory.
2974
2975 * ``entity.text`` returns the textual representation of the entity,
2976   e.g. ``&amp;``.
2977
2978 * New properties ``position`` and ``code`` on ParseError exception (as
2979   in ET 1.3)
2980
2981 * Rich comparison of ``element.attrib`` proxies.
2982
2983 * ElementTree compatible TreeBuilder class.
2984
2985 * Use default prefixes for some common XML namespaces.
2986
2987 * ``lxml.html.clean.Cleaner`` now allows for a ``host_whitelist``, and
2988   two overridable methods: ``allow_embedded_url(el, url)`` and the
2989   more general ``allow_element(el)``.
2990
2991 * Extended slicing of Elements as in ``element[1:-1:2]``, both in
2992   etree and in objectify
2993
2994 * Resolvers can now provide a ``base_url`` keyword argument when
2995   resolving a document as string data.
2996
2997 * When using ``lxml.doctestcompare`` you can give the doctest option
2998   ``NOPARSE_MARKUP`` (like ``# doctest: +NOPARSE_MARKUP``) to suppress
2999   the special checking for one test.
3000
3001 * Separate ``feed_error_log`` property for the feed parser interface.
3002   The normal parser interface and ``iterparse`` continue to use
3003   ``error_log``.
3004
3005 * The normal parsers and the feed parser interface are now separated
3006   and can be used concurrently on the same parser instance.
3007
3008 * ``fromstringlist()`` and ``tostringlist()`` functions as in
3009   ElementTree 1.3
3010
3011 * ``iterparse()`` accepts an ``html`` boolean keyword argument for
3012   parsing with the HTML parser (note that this interface may be
3013   subject to change)
3014
3015 * Parsers accept an ``encoding`` keyword argument that overrides the encoding
3016   of the parsed documents.
3017
3018 * New C-API function ``hasChild()`` to test for children
3019
3020 * ``annotate()`` function in objectify can annotate with Python types and XSI
3021   types in one step.  Accompanied by ``xsiannotate()`` and ``pyannotate()``.
3022
3023 * ``ET.write()``, ``tostring()`` and ``tounicode()`` now accept a keyword
3024   argument ``method`` that can be one of 'xml' (or None), 'html' or 'text' to
3025   serialise as XML, HTML or plain text content.
3026
3027 * ``iterfind()`` method on Elements returns an iterator equivalent to
3028   ``findall()``
3029
3030 * ``itertext()`` method on Elements
3031
3032 * Setting a QName object as value of the .text property or as an attribute
3033   will resolve its prefix in the respective context
3034
3035 * ElementTree-like parser target interface as described in
3036   http://effbot.org/elementtree/elementtree-xmlparser.htm
3037
3038 * ElementTree-like feed parser interface on XMLParser and HTMLParser
3039   (``feed()`` and ``close()`` methods)
3040
3041 * Reimplemented ``objectify.E`` for better performance and improved
3042   integration with objectify.  Provides extended type support based on
3043   registered PyTypes.
3044
3045 * XSLT objects now support deep copying
3046
3047 * New ``makeSubElement()`` C-API function that allows creating a new
3048   subelement straight with text, tail and attributes.
3049
3050 * XPath extension functions can now access the current context node
3051   (``context.context_node``) and use a context dictionary
3052   (``context.eval_context``) from the context provided in their first
3053   parameter
3054
3055 * HTML tag soup parser based on BeautifulSoup in ``lxml.html.ElementSoup``
3056
3057 * New module ``lxml.doctestcompare`` by Ian Bicking for writing simplified
3058   doctests based on XML/HTML output.  Use by importing ``lxml.usedoctest`` or
3059   ``lxml.html.usedoctest`` from within a doctest.
3060
3061 * New module ``lxml.cssselect`` by Ian Bicking for selecting Elements with CSS
3062   selectors.
3063
3064 * New package ``lxml.html`` written by Ian Bicking for advanced HTML
3065   treatment.
3066
3067 * Namespace class setup is now local to the ``ElementNamespaceClassLookup``
3068   instance and no longer global.
3069
3070 * Schematron validation (incomplete in libxml2)
3071
3072 * Additional ``stringify`` argument to ``objectify.PyType()`` takes a
3073   conversion function to strings to support setting text values from arbitrary
3074   types.
3075
3076 * Entity support through an ``Entity`` factory and element classes.  XML
3077   parsers now have a ``resolve_entities`` keyword argument that can be set to
3078   False to keep entities in the document.
3079
3080 * ``column`` field on error log entries to accompany the ``line`` field
3081
3082 * Error-specific messages in XPath parsing and evaluation.
3083   NOTE: for evaluation errors, you will now get an XPathEvalError instead of
3084   an XPathSyntaxError.  To catch both, catch ``XPathError``.
3085
3086 * The regular expression functions in XPath now support passing a node-set
3087   instead of a string
3088
3089 * Extended type annotation in objectify: new ``xsiannotate()`` function
3090
3091 * EXSLT RegExp support in standard XPath (not only XSLT)
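
A few of the features above can be tried together. The following is a minimal
sketch, assuming a current lxml release on Python 3; the sample document and
the variable names are made up for illustration.

```python
from lxml import etree

# Feed parser interface: hand the document over in chunks, then close().
parser = etree.XMLParser()
for chunk in ("<root>Hello <b>wor", "ld</b>!</root>"):
    parser.feed(chunk)
root = parser.close()

# itertext() yields text and tail content in document order.
pieces = list(root.itertext())

# method="text" serialises only the text content of the tree.
plain = etree.tostring(root, method="text", encoding="unicode")

# iterfind() is the lazy counterpart of findall().
bold_tags = [el.tag for el in root.iterfind("b")]
```

Here ``parser.close()`` returns the root Element once all chunks were fed.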
3092
3093 Bugs fixed
3094 ----------
3095
3096 * Missing import in ``lxml.html.clean``.
3097
3098 * Some Python 2.4-isms prevented lxml from building/running under
3099   Python 2.3.
3100
3101 * XPath on ElementTrees could crash when selecting the virtual root
3102   node of the ElementTree.
3103
3104 * Compilation ``--without-threading`` was buggy in alpha5/6.
3105
3106 * Memory leak in the ``parse()`` function.
3107
3108 * Minor bugs in XSLT error message formatting.
3109
3110 * Result document memory leak in target parser.
3111
3112 * Target parser failed to report comments.
3113
3114 * In the ``lxml.html`` ``iter_links`` method, links in ``<object>``
3115   tags weren't recognized.  (Note: plugin-specific link parameters
3116   still aren't recognized.)  Also, the ``<embed>`` tag, though not
3117   standard, is now included in ``lxml.html.defs.special_inline_tags``.
3118
3119 * Using custom resolvers on XSLT stylesheets parsed from a string
3120   could request ill-formed URLs.
3121
3122 * With ``lxml.doctestcompare``, if you write ``<tag xmlns="...">`` in your
3123   output, it is now namespace-neutral (previously, the ellipsis was
3124   treated as a real namespace).
3125
3126 * AttributeError in feed parser on parse errors
3127
3128 * XML feed parser setup problem
3129
3130 * Type annotation for unicode strings in ``DataElement()``
3131
3132 * lxml failed to serialise namespace declarations of elements other than the
3133   root node of a tree
3134
3135 * Race condition in XSLT where the resolver context leaked between concurrent
3136   XSLT calls
3137
3138 * lxml.etree did not check tag/attribute names
3139
3140 * The XML parser did not report undefined entities as error
3141
3142 * The text in exceptions raised by XML parsers, validators and XPath
3143   evaluators now reports the first error that occurred instead of the last
3144
3145 * Passing '' as XPath namespace prefix did not raise an error
3146
3147 * Thread safety in XPath evaluators
3148
3149 Other changes
3150 -------------
3151
3152 * Exceptions carry only the part of the error log that is related to
3153   the operation that caused the error.
3154
3155 * ``XMLSchema()`` and ``RelaxNG()`` now enforce passing the source
3156   file/filename through the ``file`` keyword argument.
3157
3158 * The test suite now skips most doctests under Python 2.3.
3159
3160 * ``make clean`` no longer removes the .c files (use ``make
3161   realclean`` instead)
3162
3163 * Minor performance tweaks for Element instantiation and subelement
3164   creation
3165
3166 * Various places in the XPath, XSLT and iteration APIs now require
3167   keyword-only arguments.
3168
3169 * The argument order in ``element.itersiblings()`` was changed to
3170   match the order used in all other iteration methods.  The second
3171   argument ('preceding') is now a keyword-only argument.
3172
3173 * The ``getiterator()`` method on Elements and ElementTrees was
3174   reverted to return an iterator as it did in lxml 1.x.  The ET API
3175   specification allows it to return either a sequence or an iterator,
3176   and it traditionally returned a sequence in ET and an iterator in
3177   lxml.  However, it is now deprecated in favour of the ``iter()``
3178   method, which should be used in new code wherever possible.
3179
3180 * The 'pretty printed' serialisation of ElementTree objects now
3181   inserts newlines at the root level between processing instructions,
3182   comments and the root tag.
3183
3184 * A 'pretty printed' serialisation is now terminated with a newline.
3185
3186 * Second argument to ``lxml.etree.Extension()`` helper is no longer
3187   required; the third argument is now a keyword-only argument ``ns``.
3188
3189 * ``lxml.html.tostring`` takes an ``encoding`` argument.
3190
3191 * The module source files were renamed to "lxml.*.pyx", such as
3192   "lxml.etree.pyx".  This was changed for consistency with the way
3193   Pyrex commonly handles package imports.  The main effect is that
3194   classes now know about their fully qualified class name, including
3195   the package name of their module.
3196
3197 * Keyword-only arguments in some API functions, especially in the
3198   parsers and serialisers.
3199
3200 * Tag name validation in lxml.etree (and lxml.html) now distinguishes
3201   between HTML tags and XML tags based on the parser that was used to
3202   parse or create them.  HTML tags no longer reject any non-ASCII
3203   characters in tag names but only spaces and the special characters
3204   ``<>&/"'``.
3205
3206 * lxml.etree now emits a warning if you use XPath with libxml2 2.6.27
3207   (which can crash on certain XPath errors)
3208
3209 * Type annotation in objectify now preserves the already annotated type by
3210   default to prevent losing type information that is already there.
3211
3212 * ``element.getiterator()`` returns a list; use ``element.iter()`` to retrieve
3213   an iterator (ElementTree 1.3 compatible behaviour)
3214
3215 * objectify.PyType for None is now called "NoneType"
3216
3217 * ``el.getiterator()`` renamed to ``el.iter()``, following ElementTree 1.3 -
3218   original name is still available as alias
3219
3220 * In the public C-API, ``findOrBuildNodeNs()`` was replaced by the more
3221   generic ``findOrBuildNodeNsPrefix``
3222
3223 * Major refactoring in XPath/XSLT extension function code
3224
3225 * Network access in parsers disabled by default
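
As a rough illustration of two of the changes above (``iter()`` as the
preferred iteration method and the trailing newline of pretty-printed output),
here is a small sketch. It assumes a current lxml release; the sample tree is
invented for the example.

```python
from lxml import etree

root = etree.XML("<root><a><b/></a><c/></root>")

# iter() walks the element itself and all descendants in document order.
tags = [el.tag for el in root.iter()]

# Pretty-printed serialisation is terminated with a newline.
pretty = etree.tostring(root, pretty_print=True)
```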
3226
3227
3228 1.3.6 (2007-10-29)
3229 ==================
3230
3231 Bugs fixed
3232 ----------
3233
3234 * Backported decref crash fix from 2.0
3235
3236 * Well hidden free-while-in-use crash bug in ObjectPath
3237
3238 Other changes
3239 -------------
3240
3241 * The test suites now run ``gc.collect()`` in the ``tearDown()``
3242   methods.  While this makes them take a lot longer to run, it also
3243   makes it easier to link a specific test to garbage collection
3244   problems that would otherwise appear in later tests.
3245
3246
3247 1.3.5 (2007-10-22)
3248 ==================
3249
3250 Features added
3251 --------------
3252
3253 Bugs fixed
3254 ----------
3255
3256 * lxml.etree could crash when adding more than 10000 namespaces to a
3257   document
3258
3259 * lxml failed to serialise namespace declarations of elements other
3260   than the root node of a tree
3261
3262
3263 1.3.4 (2007-08-30)
3264 ==================
3265
3266 Features added
3267 --------------
3268
3269 * The ``ElementMaker`` in ``lxml.builder`` now accepts the keyword arguments
3270   ``namespace`` and ``nsmap`` to set a namespace and nsmap for the Elements it
3271   creates.
3272
3273 * The ``docinfo`` on ElementTree objects has new properties ``internalDTD``
3274   and ``externalDTD`` that return a DTD object for the internal or external
3275   subset of the document respectively.
3276
3277 * Serialising an ElementTree now includes any internal DTD subsets that are
3278   part of the document, as well as comments and PIs that are siblings of the
3279   root node.
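
The ``namespace`` and ``nsmap`` arguments of ``ElementMaker`` might be used as
in the sketch below. The namespace URI and the prefix ``ex`` are hypothetical
values chosen for the example.

```python
from lxml import etree
from lxml.builder import ElementMaker

NS = "http://example.org/ns"  # hypothetical namespace URI

# All elements created by this factory live in NS and use the "ex" prefix.
E = ElementMaker(namespace=NS, nsmap={"ex": NS})
doc = E.root(E.child("text"))

root_tag = doc.tag
child_text = doc[0].text
```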
3280
3281 Bugs fixed
3282 ----------
3283
3284 * Parsing with the ``no_network`` option could fail
3285
3286 Other changes
3287 -------------
3288
3289 * lxml now raises a TagNameWarning about tag names containing ':' instead of
3290   an Error as 1.3.3 did.  The reason is that a number of projects currently
3291   misuse the previous lack of tag name validation to generate namespace
3292   prefixes without declaring namespaces.  Apart from the danger of generating
3293   broken XML this way, it also breaks most of the namespace-aware tools in
3294   XML, including XPath, XSLT and validation.  lxml 1.3.x will continue to
3295   support this bug with a Warning, while lxml 2.0 will be strict about
3296   well-formed tag names (not only regarding ':').
3297
3298 * Serialising an Element no longer includes its comment and PI siblings (only
3299   ElementTree serialisation includes them).
3300
3301
3302 1.3.3 (2007-07-26)
3303 ==================
3304
3305 Features added
3306 --------------
3307
3308 * ElementTree compatible parser ``ETCompatXMLParser`` strips processing
3309   instructions and comments while parsing XML
3310
3311 * Parsers now support stripping PIs (keyword argument 'remove_pis')
3312
3313 * ``etree.fromstring()`` now supports parsing both HTML and XML, depending on
3314   the parser you pass.
3315
3316 * Support ``base_url`` keyword argument in ``HTML()`` and ``XML()``
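
A minimal sketch of the parser options above, assuming a current lxml release;
the input strings are made up. It strips comments and PIs at parse time and
shows ``fromstring()`` switching between XML and HTML based on the parser.

```python
from lxml import etree

xml = "<root><!-- note --><?target data?><a/></root>"

# Drop comments and processing instructions while parsing.
parser = etree.XMLParser(remove_comments=True, remove_pis=True)
root = etree.fromstring(xml, parser)
children = [child.tag for child in root]

# The same function parses HTML when given an HTML parser.
html_root = etree.fromstring("<p>Hello</p>", etree.HTMLParser())
paragraph = html_root.findtext(".//p")
```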
3317
3318 Bugs fixed
3319 ----------
3320
3321 * Parsing from Python Unicode strings failed on some platforms
3322
3323 * ``Element()`` did not raise an exception on tag names containing ':'
3324
3325 * ``Element.getiterator(tag)`` did not accept ``Comment`` and
3326   ``ProcessingInstruction`` as tags. It also accepts ``Element`` now.
3327
3328
3329 1.3.2 (2007-07-03)
3330 ==================
3331
3332 Features added
3333 --------------
3334
3335 Bugs fixed
3336 ----------
3337
3338 * "deallocating None" crash bug
3339
3340
3341 1.3.1 (2007-07-02)
3342 ==================
3343
3344 Features added
3345 --------------
3346
3347 * objectify.DataElement now supports setting values from existing data
3348   elements (not just plain Python types) and reuses defined namespaces etc.
3349
3350 * E-factory support for lxml.objectify (``objectify.E``)
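
The objectify E-factory can be sketched as follows. The element names
``total`` and ``title`` are hypothetical, and the behaviour shown is that of
current lxml releases.

```python
from lxml import objectify

E = objectify.E

# Python values are converted to typed data elements.
root = E.root(E.total(3), E.title("lxml"))

total_value = root.total.pyval   # Python value of the data element
title_text = root.title.text     # raw text content
```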
3351
3352 Bugs fixed
3353 ----------
3354
3355 * Better way to prevent crashes in Element proxy cleanup code
3356
3357 * objectify.DataElement didn't set up None value correctly
3358
3359 * objectify.DataElement didn't check the value against the provided type hints
3360
3361 * Reference-counting bug in ``Element.attrib.pop()``
3362
3363
3364 1.3 (2007-06-24)
3365 ================
3366
3367 Features added
3368 --------------
3369
3370 * Module ``lxml.pyclasslookup`` implements an Element class lookup scheme
3371   that can access the entire tree in read-only mode to help determine a
3372   suitable Element class
3373
3374 * Parsers take a ``remove_comments`` keyword argument that skips over comments
3375
3376 * ``parse()`` function in ``objectify``, corresponding to ``XML()`` etc.
3377
3378 * ``Element.addnext(el)`` and ``Element.addprevious(el)`` methods to support
3379   adding processing instructions and comments around the root node
3380
3381 * ``Element.attrib`` was missing ``clear()`` and ``pop()`` methods
3382
3383 * Extended type annotation in objectify: cleaner annotation namespace setup
3384   plus new ``deannotate()`` function
3385
3386 * Support for custom Element class instantiation in lxml.sax: passing a
3387   ``makeelement`` function to the ElementTreeContentHandler will reuse the
3388   lookup context of that function
3389
3390 * '.' represents empty ObjectPath (identity)
3391
3392 * ``Element.values()`` to accompany the existing ``.keys()`` and ``.items()``
3393
3394 * ``collectAttributes()`` C-function to build a list of attribute
3395   keys/values/items for a libxml2 node
3396
3397 * ``DTD`` validator class (like ``RelaxNG`` and ``XMLSchema``)
3398
3399 * HTML generator helpers by Fredrik Lundh in ``lxml.htmlbuilder``
3400
3401 * ``ElementMaker`` XML generator by Fredrik Lundh in ``lxml.builder.E``
3402
3403 * Support for pickling ``objectify.ObjectifiedElement`` objects to XML
3404
3405 * ``update()`` method on Element.attrib
3406
3407 * Optimised replacement for libxml2's xmlReconciliateNs(). This gives lxml
3408   better handling of namespaces when moving elements between documents.
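
The sibling and attribute additions above could be exercised like this. It is
a sketch under the assumption of a current lxml release; the attribute names
and the comment/PI content are invented.

```python
from lxml import etree

root = etree.Element("root", id="r1", lang="en")

# Comments and PIs can be placed around the root node.
root.addprevious(etree.Comment("header"))
root.addnext(etree.ProcessingInstruction("done", "yes"))
serialised = etree.tostring(root.getroottree())

# Dict-like helpers on the attribute proxy.
root.attrib.update({"version": "1"})
popped = root.attrib.pop("lang")
values = sorted(root.values())
```

Serialising the ElementTree (rather than the Element) is what includes the
top-level comment and processing instruction.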
3409
3410 Bugs fixed
3411 ----------
3412
3413 * Removing Elements from a tree could make them lose their namespace
3414   declarations
3415
3416 * ``ElementInclude`` didn't honour base URL of original document
3417
3418 * Replacing the children slice of an Element would cut off the tails of the
3419   original children
3420
3421 * ``Element.getiterator(tag)`` did not accept ``Comment`` and
3422   ``ProcessingInstruction`` as tags
3423
3424 * API functions now check incoming strings for XML conformity.  Zero bytes or
3425   low ASCII characters are no longer accepted (AssertionError).
3426
3427 * XSLT parsing failed to pass resolver context on to imported documents
3428
3429 * An empty namespace prefix ('') in nsmap could be passed through to libxml2
3430
3431 * Objectify couldn't handle prefixed XSD type names in ``xsi:type``
3432
3433 * More ET compatible behaviour when writing out XML declarations or not
3434
3435 * More robust error handling in ``iterparse()``
3436
3437 * Documents lost their top-level PIs and comments on serialisation
3438
3439 * lxml.sax failed on comments and PIs. Comments are now properly ignored and
3440   PIs are copied.
3441
3442 * Possible memory leaks in namespace handling when moving elements between
3443   documents
3444
3445 Other changes
3446 -------------
3447
3448 * Major restructuring in the documentation
3449
3450
3451 1.2.1 (2007-02-27)
3452 ==================
3453
3454 Bugs fixed
3455 ----------
3456
3457 * Build fixes for MS compiler
3458
3459 * Item assignments to special names like ``element["text"]`` failed
3460
3461 * Renamed ObjectifiedDataElement.__setText() to _setText() to make it easier
3462   to access
3463
3464 * The pattern for attribute names in ObjectPath was too restrictive
3465
3466
3467 1.2 (2007-02-20)
3468 ================
3469
3470 Features added
3471 --------------
3472
3473 * Rich comparison of QName objects
3474
3475 * Support for regular expressions in benchmark selection
3476
3477 * get/set emulation (not .attrib!) for attributes on processing instructions
3478
3479 * ElementInclude Python module for ElementTree compatible XInclude processing
3480   that honours custom resolvers registered with the source document
3481
3482 * ElementTree.parser property holds the parser used to parse the document
3483
3484 * setup.py has been refactored for greater readability and flexibility
3485
3486 * The --rpath flag to setup.py, which induces automatic linking-in of dynamic
3487   library runtime search paths, has been renamed to --auto-rpath. This makes
3488   it possible to pass an --rpath directly to distutils; previously this was
3489   being shadowed.
3490
3491 Bugs fixed
3492 ----------
3493
3494 * Element instantiation now uses locks to prevent race conditions with threads
3495
3496 * ElementTree.write() did not raise an exception when the file was not writable
3497
3498 * Error handling could crash under Python <= 2.4.1 - fixed by disabling thread
3499   support in these environments
3500
3501 * Element.find*() did not accept QName objects as path
3502
3503 Other changes
3504 -------------
3505
3506 * Code cleanup: redundant _NodeBase super class merged into _Element class.
3507   Note: although the impact should be zero in most cases, this change breaks
3508   the compatibility of the public C-API
3509
3510
3511 1.1.2 (2006-10-30)
3512 ==================
3513
3514 Features added
3515 --------------
3516
3517 * Data elements in objectify support repr(), which is now used by dump()
3518
3519 * Source distribution now ships with a patched Pyrex
3520
3521 * New C-API function makeElement() to create new elements with text,
3522   tail, attributes and namespaces
3523
3524 * Reuse original parser flags for XInclude
3525
3526 * Simplified support for handling XSLT processing instructions
3527
3528 Bugs fixed
3529 ----------
3530
3531 * Parser resources were not freed before the next parser run
3532
3533 * Open files and XML strings returned by Python resolvers were not
3534   closed/freed
3535
3536 * Crash in the IDDict returned by XMLDTDID
3537
3538 * Copying Comments and ProcessingInstructions failed
3539
3540 * Memory leak for external URLs in _XSLTProcessingInstruction.parseXSL()
3541
3542 * Memory leak when garbage collecting tailed root elements
3543
3544 * HTML script/style content was not propagated to .text
3545
3546 * Show text xincluded between text nodes correctly in .text and .tail
3547
3548 * 'integer * objectify.StringElement' operation was not supported
3549
3550
3551 1.1.1 (2006-09-21)
3552 ==================
3553
3554 Features added
3555 --------------
3556
3557 * XSLT profiling support (``profile_run`` keyword)
3558
3559 * countchildren() method on objectify.ObjectifiedElement
3560
3561 * Support custom elements for tree nodes in lxml.objectify
3562
3563 Bugs fixed
3564 ----------
3565
3566 * lxml.objectify failed to support long data values (e.g., "123L")
3567
3568 * Error messages from XSLT did not reach ``XSLT.error_log``
3569
3570 * Factories objectify.Element() and objectify.DataElement() were missing
3571   ``attrib`` and ``nsmap`` keyword arguments
3572
3573 * Changing the default parser in lxml.objectify did not update the factories
3574   Element() and DataElement()
3575
3576 * Let lxml.objectify.Element() always generate tree elements (not data
3577   elements)
3578
3579 * Build under Windows failed ('\0' bug in patched Pyrex version)
3580
3581
3582 1.1 (2006-09-13)
3583 ================
3584
3585 Features added
3586 --------------
3587
3588 * Comments and processing instructions return '<!-- comment -->' and
3589   '<?pi-target content?>' for repr()
3590
3591 * Parsers are now the preferred (and default) place where element class lookup
3592   schemes should be registered.  Namespace lookup is no longer supported by
3593   default.
3594
3595 * Support for Python 2.5 beta
3596
3597 * Unlock the GIL for deep copying documents and for XPath()
3598
3599 * New ``compact`` keyword argument for parsing read-only documents
3600
3601 * Support for parser options in iterparse()
3602
3603 * The ``namespace`` axis is supported in XPath and returns (prefix, URI)
3604   tuples
3605
3606 * The XPath expression "/" now returns an empty list instead of raising an
3607   exception
3608
3609 * XML-Object API on top of lxml (lxml.objectify)
3610
3611 * Customizable Element class lookup:
3612
3613   * different pre-implemented lookup mechanisms
3614
3615   * support for externally provided lookup functions
3616
3617 * Support for processing instructions (ET-like, not compatible)
3618
3619 * Public C-level API for independent extension modules
3620
3621 * Module level ``iterwalk()`` function as 'iterparse' for trees
3622
3623 * Module level ``iterparse()`` function similar to ElementTree (see
3624   documentation for differences)
3625
3626 * Element.nsmap property returns a mapping of all namespace prefixes known at
3627   the Element to their namespace URI
3628
3629 * Reentrant threading support in RelaxNG, XMLSchema and XSLT
3630
3631 * Threading support in parsers and serializers:
3632
3633   * All in-memory operations (tostring, parse(StringIO), etc.) free the GIL
3634
3635   * File operations (on file names) free the GIL
3636
3637   * Reading from file-like objects frees the GIL and reacquires it for reading
3638
3639   * Serialisation to file-like objects is single-threaded (high lock overhead)
3640
3641 * Element iteration over XPath axes:
3642
3643   * Element.iterdescendants() iterates over the descendants of an element
3644
3645   * Element.iterancestors() iterates over the ancestors of an element (from
3646     parent to parent)
3647
3648   * Element.itersiblings() iterates over either the following or preceding
3649     siblings of an element
3650
3651   * Element.iterchildren() iterates over the children of an element in either
3652     direction
3653
3654   * All iterators support the ``tag`` keyword argument to restrict the
3655     generated elements
3656
3657 * Element.getnext() and Element.getprevious() return the direct siblings of an
3658   element
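
The axis iterators listed above can be sketched on a tiny tree. The document
is made up for the example, and the calls are shown as they work in current
lxml releases.

```python
from lxml import etree

root = etree.XML("<root><a><b/></a><c/><d/></root>")
a = root.find("a")
b = root.find("a/b")
d = root[2]

ancestors = [el.tag for el in b.iterancestors()]            # parent first
following = [el.tag for el in a.itersiblings()]             # document order
preceding = [el.tag for el in d.itersiblings(preceding=True)]  # nearest first
descendants = [el.tag for el in root.iterdescendants()]
only_c = [el.tag for el in root.iterchildren(tag="c")]      # tag filter
next_tag = a.getnext().tag
```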
3659
3660 Bugs fixed
3661 ----------
3662
3663 * Filenames with local 8-bit encoding were not supported
3664
3665 * 1.1beta did not compile under Python 2.3
3666
3667 * Ignore unknown 'pyval' attribute values in objectify
3668
3669 * objectify.ObjectifiedElement.addattr() failed to accept Elements and Lists
3670
3671 * objectify.ObjectPath.setattr() failed to accept Elements and Lists
3672
3673 * XPathSyntaxError now inherits from XPathError
3674
3675 * Threading race conditions in RelaxNG and XMLSchema
3676
3677 * Crash when mixing elements from XSLT results into other trees.  Concurrent
3678   XSLT is only allowed when the stylesheet was parsed in the main thread
3679
3680 * The EXSLT ``regexp:match`` function now works as defined (except for some
3681   differences in the regular expression syntax)
3682
3683 * Setting element.text to '' returned None on request, not the empty string
3684
3685 * ``iterparse()`` could crash on long XML files
3686
3687 * Creating documents no longer copies the parser for later URL resolving.  For
3688   performance reasons, only a reference is kept.  Resolver updates on the
3689   parser will now be reflected by documents that were parsed before the
3690   change.  Although this should rarely become visible, it is a behavioral
3691   change from 1.0.
3692
3693
3694 1.0.4 (2006-09-09)
3695 ==================
3696
3697 Features added
3698 --------------
3699
3700 * List-like ``Element.extend()`` method
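
A short sketch of the list-like ``extend()`` call, with invented element
names:

```python
from lxml import etree

root = etree.Element("root")

# Append several children in one call, as with list.extend().
root.extend([etree.Element("a"), etree.Element("b")])
tags = [child.tag for child in root]
```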
3701
3702 Bugs fixed
3703 ----------
3704
3705 * Crash in tail handling in ``Element.replace()``
3706
3707
3708 1.0.3 (2006-08-08)
3709 ==================
3710
3711 Features added
3712 --------------
3713
3714 * Element.replace(old, new) method to replace a subelement by another one
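
For illustration, ``replace()`` might be used as follows; the tree is a
made-up example.

```python
from lxml import etree

root = etree.XML("<root><a/><old/><c/></root>")
old = root[1]

# Swap the subelement in place; its position among the siblings is kept.
root.replace(old, etree.Element("new"))
tags = [child.tag for child in root]
```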
3715
3716 Bugs fixed
3717 ----------
3718
3719 * Crash when mixing elements from XSLT results into other trees
3720
3721 * Copying/deepcopying did not work for ElementTree objects
3722
3723 * Setting an attribute to a non-string value did not raise an exception
3724
3725 * Element.remove() deleted the tail text from the removed Element
3726
3727
3728 1.0.2 (2006-06-27)
3729 ==================
3730
3731 Features added
3732 --------------
3733
3734 * Support for setting a custom default Element class as opposed to namespace
3735   specific classes (which still override the default class)
3736
3737 Bugs fixed
3738 ----------
3739
3740 * Rare exceptions in Python list functions were not handled
3741
3742 * Parsing accepted unicode strings with XML encoding declaration in certain
3743   cases
3744
3745 * Parsing 8-bit encoded strings from StringIO objects raised an exception
3746
3747 * Module function ``initThread()`` was removed - useless (and never worked)
3748
3749 * XSLT and parser exception messages include the error line number
3750
3751
3752 1.0.1 (2006-06-09)
3753 ==================
3754
3755 Features added
3756 --------------
3757
3758 * Repeated calls to Element.attrib now efficiently return the same instance
3759
3760 Bugs fixed
3761 ----------
3762
3763 * Document deallocation could crash in certain garbage collection scenarios
3764
3765 * Extension function calls in XSLT variable declarations could break the
3766   stylesheet and crash on repeated calls
3767
3768 * Deep copying Elements could lose namespaces declared in parents
3769
3770 * Deep copying Elements did not copy tail
3771
3772 * Parsing file(-like) objects failed to load external entities
3773
3774 * Parsing 8-bit strings from file(-like) objects raised an exception
3775
3776 * xsl:include failed when the stylesheet was parsed from a file-like object
3777
3778 * lxml.sax.ElementTreeProducer did not call startDocument() / endDocument()
3779
3780 * MSVC compiler complained about long strings (supports only 2048 bytes)
3781
3782
3783 1.0 (2006-06-01)
3784 ================
3785
3786 Features added
3787 --------------
3788
3789 * Element.getiterator() and the findall() methods support finding arbitrary
3790   elements from a namespace (pattern ``{namespace}*``)
3791
3792 * Another speedup in tree iteration code
3793
3794 * General speedup of Python Element object creation and deallocation
3795
3796 * Writing C14N no longer serializes in memory (reduced memory footprint)
3797
3798 * PyErrorLog for error logging through the Python ``logging`` module
3799
3800 * ``Element.getroottree()`` returns an ElementTree for the root node of the
3801   document that contains the element.
3802
3803 * ElementTree.getpath(element) returns a simple, absolute XPath expression to
3804   find the element in the tree structure
3805
3806 * Error logs have a ``last_error`` attribute for convenience
3807
3808 * Comment texts can be changed through the API
3809
3810 * Formatted output via ``pretty_print`` keyword in serialization functions
3811
3812 * XSLT can block access to file system and network via ``XSLTAccessControl``
3813
3814 * ElementTree.write() no longer serializes in memory (reduced memory
3815   footprint)
3816
3817 * Speedup of Element.findall(tag) and Element.getiterator(tag)
3818
3819 * Support for writing the XML representation of Elements and ElementTrees to
3820   Python unicode strings via ``etree.tounicode()``
3821
3822 * Support for writing XSLT results to Python unicode strings via ``unicode()``
3823
3824 * Parsing a unicode string no longer copies the string (reduced memory
3825   footprint)
3826
3827 * Parsing file-like objects reads chunks rather than the whole file (reduced
3828   memory footprint)
3829
3830 * Parsing StringIO objects from the start avoids copying the string (reduced
3831   memory footprint)
3832
3833 * Read-only 'docinfo' attribute in ElementTree class holds DOCTYPE
3834   information, original encoding and XML version as seen by the parser
3835
3836 * etree module can be compiled without libxslt by commenting out the line
3837   ``include "xslt.pxi"`` near the end of the etree.pyx source file
3838
3839 * Better error messages in parser exceptions
3840
3841 * Error reporting also works in XSLT
3842
3843 * Support for custom document loaders (URI resolvers) in parsers and XSLT;
3844   resolvers are registered at parser level
3845
3846 * Implementation of exslt:regexp for XSLT based on the Python 're' module,
3847   enabled by default, can be switched off with 'regexp=False' keyword argument
3848
3849 * Support for exslt extensions (libexslt) and libxslt extra functions
3850   (node-set, document, write, output)
3851
3852 * Substantial speedup in XPath.evaluate()
3853
3854 * HTMLParser for parsing (broken) HTML
3855
3856 * XMLDTDID function parses XML into tuple (root node, ID dict) based on xml:id
3857   implementation of libxml2 (as opposed to ET compatible XMLID)
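
Two of the tree navigation features above, ``getpath()`` and the
``{namespace}*`` pattern, are sketched below. The documents and the namespace
URI are hypothetical.

```python
from lxml import etree

root = etree.XML("<root><a><b/><b/></a></root>")
second_b = root[0][1]

# getroottree() gives the ElementTree; getpath() builds an absolute XPath.
tree = second_b.getroottree()
path = tree.getpath(second_b)
found = tree.xpath(path)

# "{namespace}*" matches any child element from that namespace.
NS = "http://example.org/x"  # hypothetical namespace URI
doc = etree.XML('<r xmlns:x="%s"><x:a/><x:b/><c/></r>' % NS)
in_ns = [el.tag for el in doc.findall("{%s}*" % NS)]
```

Evaluating the generated path against the same tree leads back to the
original element.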
3858
3859 Bugs fixed
3860 ----------
3861
3862 * Memory leak in Element.__setitem__
3863
3864 * Memory leak in Element.attrib.items() and Element.attrib.values()
3865
3866 * Memory leak in XPath extension functions
3867
3868 * Memory leak in unicode related setup code
3869
3870 * Element now raises ValueError on empty tag names
3871
3872 * Namespace fixing after moving elements between documents could fail if the
3873   source document was freed too early
3874
3875 * Setting namespace-less tag names on namespaced elements ('{ns}t' -> 't')
3876   didn't reset the namespace
3877
3878 * Unknown constants from newer libxml2 versions could raise exceptions in the
3879   error handlers
3880
3881 * lxml.etree compiles much faster
3882
3883 * On libxml2 <= 2.6.22, parsing strings with encoding declaration could fail
3884   in certain cases
3885
3886 * Document reference in ElementTree objects was not updated when the root
3887   element was moved to a different document
3888
3889 * Running absolute XPath expressions on an Element now evaluates against the
3890   root tree
3891
3892 * Evaluating absolute XPath expressions (``/*``) on an ElementTree could fail
3893
3894 * Crashes when calling XSLT, RelaxNG, etc. with uninitialized ElementTree
3895   objects
3896
3897 * Removed public function ``initThreadLogging()``, replaced by more general
3898   ``initThread()`` which fixes a number of setup problems in threads
3899
3900 * Memory leak when using iconv encoders in tostring/write
3901
3902 * Deep copying Elements and ElementTrees maintains the document information
3903
3904 * Serialization functions raise LookupError for unknown encodings
3905
3906 * Memory deallocation crash resulting from deep copying elements
3907
3908 * Some ElementTree methods could crash if the root node was not initialized
3909   (neither file nor element passed to the constructor)
3910
3911 * Element/SubElement failed to set attribute namespaces from passed ``attrib``
3912   dictionary
3913
3914 * ``tostring()`` adds an XML declaration for non-ASCII encodings
3915
3916 * ``tostring()`` failed to serialize encodings that contain 0-bytes
3917
3918 * ElementTree.xpath() and XPathDocumentEvaluator were not using the
3919   ElementTree root node as reference point
3920
3921 * Calling ``document('')`` in XSLT failed to return the stylesheet
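
Several of the behaviour changes listed above can be checked from Python.
The following is a minimal sketch, assuming a current lxml release in which
these behaviours still hold. The namespace URI and the encoding name
``no-such-encoding`` are made up for illustration.

```python
from lxml import etree

# Empty tag names are rejected with ValueError.
try:
    etree.Element("")
    empty_tag_rejected = False
except ValueError:
    empty_tag_rejected = True

# Unknown encodings raise LookupError in the serialization functions.
root = etree.Element("root")
try:
    etree.tostring(root, encoding="no-such-encoding")  # made-up encoding name
    unknown_encoding_rejected = False
except LookupError:
    unknown_encoding_rejected = True

# A non-ASCII encoding such as ISO-8859-1 adds an XML declaration.
latin1 = etree.tostring(root, encoding="iso-8859-1")

# Assigning a namespace-less tag name drops the element's namespace.
el = etree.Element("{http://example.com/ns}t")  # hypothetical namespace URI
el.tag = "t"
```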
3922
3923
3924 0.9.2 (2006-05-10)
3925 ==================
3926
3927 Features added
3928 --------------
3929
3930 * Speedup for Element.makeelement(): the new element reuses the original
3931   libxml2 document instead of creating a new empty one
3932
3933 * Speedup for reversed() iteration over element children (Py2.4+ only)
3934
3935 * ElementTree compatible QName class
3936
3937 * RelaxNG and XMLSchema accept any Element, not only ElementTrees
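
As an illustration of the last two entries, the sketch below builds a
``QName`` and validates plain Elements with ``RelaxNG``. It assumes a current
lxml release; the namespace URI and the tiny schema are hypothetical.

```python
from lxml import etree

NS = "http://example.com/ns"  # hypothetical namespace URI

# QName builds the '{namespace}tag' notation used by ElementTree.
qname = etree.QName(NS, "item")
item = etree.Element(qname)

# A RelaxNG schema built from a plain Element rather than an ElementTree.
schema_root = etree.XML(
    '<element name="doc" xmlns="http://relaxng.org/ns/structure/1.0">'
    '<text/>'
    '</element>'
)
relaxng = etree.RelaxNG(schema_root)

# The documents to validate are plain Elements as well.
valid_doc = etree.XML("<doc>hello</doc>")
invalid_doc = etree.XML("<other/>")
is_valid = relaxng.validate(valid_doc)
is_invalid = relaxng.validate(invalid_doc)
```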
3938
3939 Bugs fixed
3940 ----------
3941
3942 * str(xslt_result) was broken for XSLT output other than UTF-8
3943
3944 * Memory leak if write_c14n fails to write the file after conversion
3945
3946 * Crash in XMLSchema and RelaxNG when passing non-schema documents
3947
3948 * Memory leak in RelaxNG() when RelaxNGParseError is raised
3949
3950 0.9.1 (2006-03-30)
3951 ==================
3952
3953 Features added
3954 --------------
3955
3956 * lxml.sax.ElementTreeContentHandler checks closing elements and raises
3957   SaxError on mismatch
3958
3959 * lxml.sax.ElementTreeContentHandler supports namespace-less SAX events
3960   (startElement, endElement) and defaults to empty attributes (keyword
3961   argument)
3962
3963 * Speedup for repeatedly accessing element tag names
3964
3965 * Minor API performance improvements
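
The two ``ElementTreeContentHandler`` entries above can be sketched as
follows, assuming a current lxml release. The element and attribute names
are made up for illustration.

```python
from lxml import etree
from lxml.sax import ElementTreeContentHandler, SaxError

# Build a small tree from namespace-less SAX events.
handler = ElementTreeContentHandler()
handler.startElement("root", {"id": "r1"})
handler.startElement("child")  # attributes default to empty
handler.characters("text")
handler.endElement("child")
handler.endElement("root")
serialized = etree.tostring(handler.etree.getroot())

# Closing an element other than the innermost open one raises SaxError.
broken = ElementTreeContentHandler()
broken.startElement("a")
broken.startElement("b")
try:
    broken.endElement("a")
    mismatch_detected = False
except SaxError:
    mismatch_detected = True
```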
3966
3967 Bugs fixed
3968 ----------
3969
3970 * Memory deallocation bug when using XSLT output method "html"
3971
3972 * sax.py was handling UTF-8 encoded tag names where it shouldn't
3973
3974 * lxml.tests package is no longer installed (it is still in the source tarball)
3975
3976 0.9 (2006-03-20)
3977 ================
3978
3979 Features added
3980 --------------
3981
3982 * Error logging API for libxml2 error messages
3983
3984 * Various performance improvements
3985
3986 * Benchmark script for lxml, ElementTree and cElementTree
3987
3988 * Support for registering extension functions through new FunctionNamespace
3989   class (see doc/extensions.txt)
3990
3991 * ETXPath class for XPath expressions in ElementTree notation ('//{ns}tag')
3992
3993 * Support for variables in XPath expressions (also in XPath class)
3994
3995 * XPath class for compiled XPath expressions
3996
3997 * XMLID module level function (ElementTree compatible)
3998
3999 * XMLParser API for customized libxml2 parser configuration
4000
4001 * Support for custom Element classes through new Namespace API (see
4002   doc/namespace_extensions.txt)
4003
4004 * Common exception base class LxmlError for module exceptions
4005
4006 * Real iterator support in iter(Element), Element.getiterator()
4007
4008 * XSLT objects are callable, result trees support str()
4009
4010 * Added MANIFEST.in for easier creation of RPM files.
4011
4012 * 'getparent' method on elements allows navigation to an element's
4013   parent element.
4014
4015 * Python core compatible SAX tree builder and SAX event generator. See
4016   doc/sax.txt for more information.
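
The XPath-related entries and ``getparent()`` can be sketched as below,
assuming a current lxml release. The document, the namespace URI and the
variable name ``wanted`` are hypothetical.

```python
from lxml import etree

NS = "http://example.com/ns"  # hypothetical namespace URI
root = etree.XML(
    '<root xmlns:ex="http://example.com/ns">'
    '<item id="a"/><item id="b"/><ex:item id="c"/>'
    '</root>'
)

# Compiled XPath expression with a variable that is bound at call time.
find_by_id = etree.XPath("//item[@id = $wanted]")
matches = find_by_id(root, wanted="b")

# ETXPath accepts ElementTree's '{namespace}tag' notation directly.
find_namespaced = etree.ETXPath("//{%s}item" % NS)
namespaced = find_namespaced(root)

# getparent() navigates from an element to its parent element.
parent = matches[0].getparent()
```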
4017
4018 Bugs fixed
4019 ----------
4020
4021 * Segfaults and memory leaks in various API functions of Element
4022
4023 * Segfault in XSLT.tostring()
4024
4025 * ElementTree objects no longer interfere with each other: an Element can
4026   be the root of different ElementTrees at the same time
4027
4028 * document('') works in XSLT documents read from files (in-memory documents
4029   cannot support this due to libxslt deficiencies)
4030
4031 0.8 (2005-11-03)
4032 ================
4033
4034 Features added
4035 --------------
4036
4037 * Support for copy.deepcopy() on elements. copy.copy() works also, but
4038   does the same thing, and does *not* create a shallow copy, as that
4039   makes no sense in the context of libxml2 trees. This means a
4040   potential incompatibility with ElementTree, but there's more chance
4041   that it works than if copy.copy() isn't supported at all.
4042
4043 * Increased compatibility with (c)ElementTree; .parse() on ElementTree is
4044   supported and parsing of gzipped XML files works.
4045
4046 * Implemented index() on elements, allowing one to find the index of a
4047   SubElement.
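
A minimal sketch of ``copy.deepcopy()`` and ``index()`` on elements,
assuming a current lxml release. The tag names are made up for illustration.

```python
import copy

from lxml import etree

root = etree.XML("<root><a/><b><c/></b></root>")
b = root[1]

# deepcopy() yields an independent copy of the subtree.
b_copy = copy.deepcopy(b)
b_copy.append(etree.Element("d"))

# index() reports the position of a child within its parent.
position = root.index(b)
```

Changes to ``b_copy`` do not affect the original ``b``, and the copy is not
attached to ``root``.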
4048
4049 Bugs fixed
4050 ----------
4051
4052 * Use xslt-config instead of xml2-config to find out libxml2
4053   directories to take into account a case where libxslt is installed
4054   in a different directory than libxml2.
4055
4056 * Eliminate crash condition in iteration when text nodes are changed.
4057
4058 * Passing 'None' to tostring() no longer results in a segfault, but
4059   raises an AssertionError.
4060
4061 * Some test fixes for Windows.
4062
4063 * Raise XMLSyntaxError and XPathSyntaxError instead of plain Python
4064   syntax errors. This should be less confusing.
4065
4066 * Fixed error with uncaught exception in Pyrex code.
4067
4068 * Calling lxml.etree.fromstring('') throws XMLSyntaxError instead of a
4069   segfault.
4070
4071 * has_key() works on attrib. 'in' tests also work correctly on attrib.
4072
4073 * INSTALL.txt listed 2.2.16 as a supported libxml2 version where it
4074   should have said 2.6.16.
4075
4076 * Passing a UTF-8 encoded string to the XML() function would fail;
4077   fixed.
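
Three of the fixes above can be checked with a short sketch, assuming a
current lxml release in which they still hold. The element and attribute
names are hypothetical.

```python
from lxml import etree

# Parsing an empty string raises XMLSyntaxError.
try:
    etree.fromstring("")
    empty_input_rejected = False
except etree.XMLSyntaxError:
    empty_input_rejected = True

# Membership tests work on the attribute mapping.
el = etree.XML('<root name="value"/>')
has_name = "name" in el.attrib
has_other = "other" in el.attrib

# UTF-8 encoded byte strings are accepted by XML().
utf8_el = etree.XML("<root>\u00e9</root>".encode("utf-8"))
```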
4078
4079 0.7 (2005-06-15)
4080 ================
4081
4082 Features added
4083 --------------
4084
4085 * Parameters (XPath expressions) can be passed to XSLT using keyword
4086   parameters.
4087
4088 * Simple XInclude support. Calling the xinclude() method on a tree
4089   will process any XInclude statements in the document.
4090
4091 * XMLSchema support. Use the XMLSchema class or the convenience
4092   xmlschema() method on a tree to do XML Schema (XSD) validation.
4093
4094 * Added convenience xslt() method on tree. This is less efficient
4095   than the XSLT object, but makes it easier to write quick code.
4096
4097 * Added convenience relaxng() method on tree. This is less efficient
4098   than the RelaxNG object, but makes it easier to write quick code.
4099
4100 * Make it possible to use XPathEvaluator with elements as well. The
4101   XPathEvaluator in this case will retain the element so multiple
4102   XPath queries can be made against one element efficiently. This
4103   replaces the second argument to the .evaluate() method that existed
4104   previously.
4105
4106 * Allow registerNamespace() to be called on an XPathEvaluator, after
4107   creation, to add additional namespaces. Also allow registerNamespaces(),
4108   which does the same for a namespace dictionary.
4109
4110 * Add 'prefix' attribute to element to be able to read prefix information.
4111   This is entirely read-only.
4112
4113 * It is possible to supply an extra nsmap keyword parameter to
4114   the Element() and SubElement() constructors, which supplies a
4115   prefix to namespace URI mapping. This will create namespace
4116   prefix declarations on these elements and these prefixes will show up
4117   in XML serialization.
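
The ``nsmap``, ``prefix``, ``XPathEvaluator`` and XSLT parameter entries
above can be sketched as follows. This assumes a current lxml release, where
the method that adds a namespace to an evaluator is spelled
``register_namespace()`` rather than the ``registerNamespace()`` named in
this entry. The namespace URI, the prefixes and the stylesheet are made up
for illustration.

```python
from lxml import etree

NS = "http://example.com/ns"  # hypothetical namespace URI

# nsmap declares a prefix on the new element; it shows up when serializing.
root = etree.Element("{%s}root" % NS, nsmap={"ex": NS})
child = etree.SubElement(root, "{%s}child" % NS)
serialized = etree.tostring(root)

# An evaluator bound to an element can serve several queries, and
# namespaces can be registered after it has been created.
evaluator = etree.XPathEvaluator(root)
evaluator.register_namespace("e", NS)
found = evaluator("e:child")

# XSLT parameters are keyword arguments holding XPath expressions,
# so a string value needs its own quotes.
transform = etree.XSLT(etree.XML(
    '<xsl:stylesheet version="1.0"'
    ' xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
    '<xsl:output method="text"/>'
    '<xsl:param name="greeting"/>'
    '<xsl:template match="/">'
    '<xsl:value-of select="$greeting"/>'
    '</xsl:template>'
    '</xsl:stylesheet>'
))
result = transform(root, greeting="'hello'")
```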
4118
4119 Bugs fixed
4120 ----------
4121
4122 * Killed yet another memory management related bug: trees created
4123   using newDoc would not get a libxml2-level dictionary, which caused
4124   problems when deallocating these documents later if they contained a
4125   node that came from a document with a dictionary.
4126
4127 * Moving namespaced elements between documents was problematic as
4128   references to the original document would remain. This has been fixed
4129   by applying xmlReconciliateNs() after each move operation.
4130
4131 * Passing None to 'dump()' no longer causes a segfault.
4132
4133 * tostring() works properly for non-root elements as well.
4134
4135 * Cleaned up the tostring() method so it should handle encoding
4136   correctly.
4137
4138 * Cleaned up the ElementTree.write() method so it should handle encoding
4139   correctly. Writing directly to a file should also be faster, as there is no
4140   need to go through a Python string in that case. Made sure the test cases
4141   test both serializing to StringIO as well as serializing to a real file.
4142
4143 0.6 (2005-05-14)
4144 ================
4145
4146 Features added
4147 --------------
4148
4149 * Changed setup.py so that library_dirs is also guessed. This should
4150   help with compilation on the Mac OS X platform, where otherwise the
4151   wrong library (shipping with the OS) could be picked up.
4152
4153 * Tweaked setup.py so that it picks up the version from version.txt.
4154
4155 Bugs fixed
4156 ----------
4157
4158 * Do the right thing when handling namespaced attributes.
4159
4160 * Fixed a bug where tostring() moved nodes into new documents. tostring()
4161   had very nasty side-effects before this fix, sorry!
4162
4163 0.5.1 (2005-04-09)
4164 ==================
4165
4166 * Python 2.2 compatibility fixes.
4167
4168 * Unicode fixes in Element() and Comment() as well as XML(); Unicode
4169   input was not being properly UTF-8 encoded.
4170
4171 0.5 (2005-04-08)
4172 ================
4173
4174 Initial public release.