when an individual is representing the project or its community.
Instances of abusive, harassing, or otherwise unacceptable behavior may be
-reported by contacting a project maintainer at markdown@freewisdom.org. All
+reported by contacting a project maintainer at python.markdown@gmail.com. All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. Maintainers are
obligated to maintain confidentiality with regard to the reporter of an
Documentation
-------------
-Installation and usage documentation is available in the `docs/` directory
-of the distribution and on the project website at
-<https://Python-Markdown.github.io/>.
+```bash
+pip install markdown
+```
+```python
+import markdown
+html = markdown.markdown(your_text_string)
+```
+
+For more advanced [installation] and [usage] documentation, see the `docs/` directory
+of the distribution or the project website at <https://Python-Markdown.github.io/>.
+
+[installation]: https://python-markdown.github.io/install/
+[usage]: https://python-markdown.github.io/reference/
See the change log at <https://Python-Markdown.github.io/change_log>.
and has been assisting with maintenance, reviewing pull requests and ticket
triage.
-* __[Yuri Takteyev](http://freewisdom.org/)__
+* __[Yuri Takteyev](https://github.com/yuri)__
Yuri wrote most of the code found in version 1.x while procrastinating his
Ph.D. Various pieces of his code still exist, most notably the basic
Python-Markdown Change Log
=========================
+Feb 24, 2021: version 3.3.4 (a bug-fix release).
+
+* Properly parse unclosed tags in code spans (#1066).
+* Properly parse processing instructions in md_in_html (#1070).
+* Properly parse code spans in md_in_html (#1069).
+* Preserve text immediately before an admonition (#1092).
+* Simplified regex for HTML placeholders (#928) addressing (#932).
+* Ensure `permalinks` and `anchorlinks` are not restricted by `toc_depth` (#1107).
+
Oct 25, 2020: version 3.3.3 (a bug-fix release).
* Unify all block-level tags (#1047).
* Fix unescaping of HTML characters `<>` in CodeHilite (#990).
* Fix complex scenarios involving lists and admonitions (#1004).
* Fix complex scenarios with nested ordered and unordered lists in a definition list (#918).
+* Fix corner cases with lists under admonitions.
[spec]: https://www.w3.org/TR/html5/text-level-semantics.html#the-code-element
[fenced_code]: ../extensions/fenced_code_blocks.md
This project and everyone participating in it is governed by the
[Python-Markdown Code of Conduct]. By participating, you are expected to uphold
-this code. Please report unacceptable behavior to [markdown@freewisdom.org][email].
+this code. Please report unacceptable behavior to <python.markdown@gmail.com>.
## Project Organization
[Python-Markdown Organization]: https://github.com/Python-Markdown
[Python-Markdown Code of Conduct]: https://github.com/Python-Markdown/markdown/blob/master/CODE_OF_CONDUCT.md
-[email]: mailto:markdown@freewisdom.org
[Python-Markdown/markdown]: https://github.com/Python-Markdown/markdown
[issue tracker]: https://github.com/Python-Markdown/markdown/issues
[syntax rules]: https://daringfireball.net/projects/markdown/syntax
set_link_class(child) # run recursively on children
```
-For more information about working with ElementTree see the ElementTree
-[Documentation](https://effbot.org/zone/element-index.htm) ([Python
-Docs](https://docs.python.org/3/library/xml.etree.elementtree.html)).
+For more information about working with ElementTree see the [ElementTree
+Documentation][ElementTree].
+
+## Working with Raw HTML {: #working_with_raw_html }
+
+Occasionally an extension may need to call out to a third-party library that returns a pre-made string
+of raw HTML which must be inserted into the document unmodified. Rather than converting such strings into
+`ElementTree` objects, stash them for later retrieval using an `htmlStash` instance. A raw string
+(which may or may not be raw HTML) passed to `self.md.htmlStash.store()` is saved to the stash, and a
+placeholder string is returned, which should be inserted into the tree instead. After the tree is
+serialized, a postprocessor replaces the placeholder with the raw string. This prevents subsequent
+processing steps from modifying the HTML data. For example:
+
+```python
+html = "<p>This is some <em>raw</em> HTML data</p>"
+el = etree.Element("div")
+el.text = self.md.htmlStash.store(html)
+```
+
+For the global `htmlStash` instance to be available from a processor, the `markdown.Markdown` instance must
+be passed to the processor from [extendMarkdown](#extendmarkdown) and will be available as `self.md.htmlStash`.
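
The store-and-restore round trip described above can be sketched without the library. `ToyStash`, its `PLACEHOLDER` format, and its `restore` method are hypothetical stand-ins written for this illustration. They are not Python-Markdown's classes, and the real stash and postprocessor may differ in detail.

```python
class ToyStash:
    """Hypothetical stand-in for an HTML stash (not the library's class)."""

    # Assumed placeholder format, chosen only for this illustration.
    PLACEHOLDER = "\x02stash:%s\x03"

    def __init__(self):
        self.blocks = []

    def store(self, raw):
        """Save a raw string and return the placeholder that stands in for it."""
        self.blocks.append(raw)
        return self.PLACEHOLDER % (len(self.blocks) - 1)

    def restore(self, text):
        """Swap each stored placeholder found in `text` for its raw string."""
        for index, raw in enumerate(self.blocks):
            text = text.replace(self.PLACEHOLDER % index, raw)
        return text


stash = ToyStash()
html = "<p>This is some <em>raw</em> HTML data</p>"
placeholder = stash.store(html)

# The tree (here just a string) carries the placeholder, not the markup,
# so intermediate processing steps never see the raw HTML.
serialized = "<div>%s</div>" % placeholder
restored = stash.restore(serialized)
```

Because only the opaque placeholder travels through the intermediate steps, nothing between `store` and `restore` can alter the stashed markup.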
## Integrating Your Code Into Markdown {: #integrating_into_markdown }
[registerExtension]: #registerextension
[Config Settings]: #configsettings
[makeExtension]: #makeextension
-[ElementTree]: https://effbot.org/zone/element-index.htm
+[ElementTree]: https://docs.python.org/3/library/xml.etree.elementtree.html
[Available Extensions]: index.md
[Footnotes]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/footnotes.py
[Definition Lists]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/definition_lists
test. The defaults can be overridden on individual tests.
Methods
-: `assertMarkdownRenders`: accepts the source text, the expected output,
- and any keywords to pass to Markdown. The `default_kwargs` defined on the
- class are used except where overridden by keyword arguments. The output and
- expected output are passed to `TestCase.assertMultiLineEqual`. An
- `AssertionError` is raised with a diff if the actual output does not equal the
- expected output.
+: `assertMarkdownRenders`: accepts the source text, the expected output, an optional
+ dictionary of `expected_attrs`, and any keywords to pass to Markdown. The
+ `default_kwargs` defined on the class are used except where overridden by
+ keyword arguments. The output and expected output are passed to
+ `TestCase.assertMultiLineEqual`. An `AssertionError` is raised with a diff
+ if the actual output does not equal the expected output. The optional
+ keyword `expected_attrs` accepts a dictionary of attribute names as keys with
+ expected values. Each value is checked against the attribute of that
+ name on the instance of the `Markdown` class using `TestCase.assertEqual`. An
+ `AssertionError` is raised if any value does not match the expected value.
: `dedent`: Dedent triple-quoted strings.
# (1, 2, 0, 'beta', 2) => "1.2b2"
# (1, 2, 0, 'rc', 4) => "1.2rc4"
# (1, 2, 0, 'final', 0) => "1.2"
-__version_info__ = (3, 3, 3, 'final', 0)
+__version_info__ = (3, 3, 4, 'final', 0)
def _get_version(version_info):
else:
return None
- def detab(self, text):
+ def detab(self, text, length=None):
""" Remove a tab from the front of each line of the given text. """
+ if length is None:
+ length = self.tab_length
newtext = []
lines = text.split('\n')
for line in lines:
- if line.startswith(' '*self.tab_length):
- newtext.append(line[self.tab_length:])
+ if line.startswith(' ' * length):
+ newtext.append(line[length:])
elif not line.strip():
newtext.append('')
else:
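
A minimal standalone sketch of the `detab` change above, useful for seeing what the new `length` argument does. It assumes the loop stops at the first line that is neither indented nor blank and that the function returns the de-indented text together with the remainder, as the `block, theRest = self.detab(block)` calls in this patch suggest. The module-level function and its default of 4 are assumptions for illustration, not the library's method.

```python
def detab(text, length=4):
    """Strip `length` leading spaces from each line until a line breaks the run.

    Returns a tuple: (de-indented text, untouched remainder).
    """
    newtext = []
    lines = text.split('\n')
    for line in lines:
        if line.startswith(' ' * length):
            # Indented line: drop exactly `length` spaces.
            newtext.append(line[length:])
        elif not line.strip():
            # Blank line: keep it as an empty line inside the block.
            newtext.append('')
        else:
            # First line that is neither indented nor blank ends the block.
            break
    return '\n'.join(newtext), '\n'.join(lines[len(newtext):])


block, the_rest = detab("    a\n\n    b\nc\n    d")
deeper, leftover = detab("        x\n    y", length=8)
```

Passing an explicit `length` lets a caller remove an indent deeper than the default tab length in one step, which appears to be what `parse_content` relies on for nested admonition content.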
self.current_sibling = None
self.content_indent = 0
- def get_sibling(self, parent, block):
+ def parse_content(self, parent, block):
"""Get sibling admonition.
Retrieve the appropriate sibling element. This can get tricky when
"""
+ old_block = block
+ the_rest = ''
+
# We already acquired the block via test
if self.current_sibling is not None:
sibling = self.current_sibling
- block = block[self.content_indent:]
+ block, the_rest = self.detab(block, self.content_indent)
self.current_sibling = None
self.content_indent = 0
- return sibling, block
+ return sibling, block, the_rest
sibling = self.lastChild(parent)
sibling = None
if sibling is not None:
+ indent += self.tab_length
+ block, the_rest = self.detab(old_block, indent)
self.current_sibling = sibling
self.content_indent = indent
- return sibling, block
+ return sibling, block, the_rest
def test(self, parent, block):
if self.RE.search(block):
return True
else:
- return self.get_sibling(parent, block)[0] is not None
+ return self.parse_content(parent, block)[0] is not None
def run(self, parent, blocks):
block = blocks.pop(0)
m = self.RE.search(block)
if m:
+ if m.start() > 0:
+ self.parser.parseBlocks(parent, [block[:m.start()]])
block = block[m.end():] # removes the first line
+ block, theRest = self.detab(block)
else:
- sibling, block = self.get_sibling(parent, block)
-
- block, theRest = self.detab(block)
+ sibling, block, theRest = self.parse_content(parent, block)
if m:
klass, title = self.get_class_and_title(m)
from ..preprocessors import Preprocessor
from ..postprocessors import RawHtmlPostprocessor
from .. import util
-from ..htmlparser import HTMLExtractor
+from ..htmlparser import HTMLExtractor, blank_line_re
import xml.etree.ElementTree as etree
else: # pragma: no cover
return None
- def at_line_start(self):
- """At line start."""
-
- value = super().at_line_start()
- if not value and self.cleandoc and self.cleandoc[-1].endswith('\n'):
- value = True
- return value
-
def handle_starttag(self, tag, attrs):
# Handle tags that should always be empty and do not specify a closing tag
- if tag in self.empty_tags:
+ if tag in self.empty_tags and (self.at_line_start() or self.intail):
attrs = {key: value if value is not None else key for key, value in attrs}
if "markdown" in attrs:
attrs.pop('markdown')
self.handle_empty_tag(data, True)
return
- if tag in self.block_level_tags:
+ if tag in self.block_level_tags and (self.at_line_start() or self.intail):
# Valueless attr (ex: `<tag checked>`) results in `[('checked', None)]`.
# Convert to `{'checked': 'checked'}`.
attrs = {key: value if value is not None else key for key, value in attrs}
state = self.get_state(tag, attrs)
-
- if self.inraw or (state in [None, 'off'] and not self.mdstack) or not self.at_line_start():
+ if self.inraw or (state in [None, 'off'] and not self.mdstack):
# fall back to default behavior
attrs.pop('markdown', None)
super().handle_starttag(tag, attrs)
self.handle_data(self.md.htmlStash.store(text))
else:
self.handle_data(text)
+ if tag in self.CDATA_CONTENT_ELEMENTS:
+ # This is presumably a standalone tag in a code span (see #1036).
+ self.clear_cdata_mode()
def handle_endtag(self, tag):
if tag in self.block_level_tags:
self.cleandoc.append(self.md.htmlStash.store(element))
self.cleandoc.append('\n\n')
self.state = []
+ # Check if element has a tail
+ if not blank_line_re.match(
+ self.rawdata[self.line_offset + self.offset + len(self.get_endtag_text(tag)):]):
+ # More content exists after endtag.
+ self.intail = True
else:
# Treat orphan closing tag as a span level tag.
text = self.get_endtag_text(tag)
self.handle_empty_tag(data, is_block=self.md.is_block_level(tag))
def handle_data(self, data):
+ if self.intail and '\n' in data:
+ self.intail = False
if self.inraw or not self.mdstack:
super().handle_data(data)
else:
if self.at_line_start() and is_block:
self.handle_data('\n' + self.md.htmlStash.store(data) + '\n\n')
else:
- if self.mdstate and self.mdstate[-1] == "off":
- self.handle_data(self.md.htmlStash.store(data))
- else:
- self.handle_data(data)
+ self.handle_data(self.md.htmlStash.store(data))
+
+ def parse_pi(self, i):
+ if self.at_line_start() or self.intail or self.mdstack:
+ # The same override exists in HTMLExtractor without the check
+ # for mdstack. Therefore, use HTMLExtractor's parent instead.
+ return super(HTMLExtractor, self).parse_pi(i)
+ # This is not the beginning of a raw block so treat as plain data
+ # and avoid consuming any tags which may follow (see #1066).
+ self.handle_data('<?')
+ return i + 2
+
+ def parse_html_declaration(self, i):
+ if self.at_line_start() or self.intail or self.mdstack:
+ # The same override exists in HTMLExtractor without the check
+ # for mdstack. Therefore, use HTMLExtractor's parent instead.
+ return super(HTMLExtractor, self).parse_html_declaration(i)
+ # This is not the beginning of a raw block so treat as plain data
+ # and avoid consuming any tags which may follow (see #1066).
+ self.handle_data('<!')
+ return i + 2
class HtmlBlockPreprocessor(Preprocessor):
for el in doc.iter():
if isinstance(el.tag, str) and self.header_rgx.match(el.tag):
self.set_level(el)
- if int(el.tag[-1]) < self.toc_top or int(el.tag[-1]) > self.toc_bottom:
- continue
text = get_name(el)
# Do not override pre-existing ids
innertext = unescape(stashedHTML2text(text, self.md))
el.attrib["id"] = unique(self.slugify(innertext, self.sep), used_ids)
- toc_tokens.append({
- 'level': int(el.tag[-1]),
- 'id': el.attrib["id"],
- 'name': unescape(stashedHTML2text(
- code_escape(el.attrib.get('data-toc-label', text)),
- self.md, strip_entities=False
- ))
- })
+ if int(el.tag[-1]) >= self.toc_top and int(el.tag[-1]) <= self.toc_bottom:
+ toc_tokens.append({
+ 'level': int(el.tag[-1]),
+ 'id': el.attrib["id"],
+ 'name': unescape(stashedHTML2text(
+ code_escape(el.attrib.get('data-toc-label', text)),
+ self.md, strip_entities=False
+ ))
+ })
# Remove the data-toc-label attribute as it is no longer needed
if 'data-toc-label' in el.attrib:
# so the 'incomplete' functionality is unnecessary. As the entityref regex is run right before incomplete,
# and the two regex are the same, then incomplete will simply never match and we avoid the logic within.
htmlparser.incomplete = htmlparser.entityref
+# Monkeypatch HTMLParser to not accept a backtick in a tag name, attribute name, or bare value.
+htmlparser.locatestarttagend_tolerant = re.compile(r"""
+ <[a-zA-Z][^`\t\n\r\f />\x00]* # tag name <= added backtick here
+ (?:[\s/]* # optional whitespace before attribute name
+ (?:(?<=['"\s/])[^`\s/>][^\s/=>]* # attribute name <= added backtick here
+ (?:\s*=+\s* # value indicator
+ (?:'[^']*' # LITA-enclosed value
+ |"[^"]*" # LIT-enclosed value
+ |(?!['"])[^`>\s]* # bare value <= added backtick here
+ )
+ (?:\s*,)* # possibly followed by a comma
+ )?(?:\s|/(?!>))*
+ )*
+ )?
+ \s* # trailing whitespace
+""", re.VERBOSE)
# Match a blank line at the start of a block of text (two newlines).
# The newlines may be preceded by additional whitespace.
@property
def line_offset(self):
"""Returns char index in self.rawdata for the start of the current line. """
- if self.lineno > 1:
- return re.match(r'([^\n]*\n){{{}}}'.format(self.lineno-1), self.rawdata).end()
+ if self.lineno > 1 and '\n' in self.rawdata:
+ m = re.match(r'([^\n]*\n){{{}}}'.format(self.lineno-1), self.rawdata)
+ if m:
+ return m.end()
+ else: # pragma: no cover
+ # Value of self.lineno must exceed total number of lines.
+ # Find index of beginning of last line.
+ return self.rawdata.rfind('\n')
return 0
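
The guarded lookup above can be exercised in isolation. This is a sketch under the assumption that `lineno` is 1-based, as in `html.parser.HTMLParser`. The free function `line_offset(rawdata, lineno)` is a hypothetical rewrite of the property for illustration only.

```python
import re


def line_offset(rawdata, lineno):
    """Return the index in `rawdata` where 1-based line `lineno` starts."""
    if lineno > 1 and '\n' in rawdata:
        # Match exactly `lineno - 1` complete lines from the start of the text.
        m = re.match(r'([^\n]*\n){{{}}}'.format(lineno - 1), rawdata)
        if m:
            return m.end()
        # `lineno` exceeds the number of lines: fall back to the last newline
        # instead of calling `.end()` on None.
        return rawdata.rfind('\n')
    return 0


text = "ab\ncd\nef"
offsets = [line_offset(text, n) for n in (1, 2, 3)]
past_the_end = line_offset("ab\ncd", 5)
```

Note that in this sketch the fallback yields the index of the final newline character itself, one position before the last line's first character.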
def at_line_start(self):
end = ']]>' if data.startswith('CDATA[') else ']>'
self.handle_empty_tag('<![{}{}'.format(data, end), is_block=True)
+ def parse_pi(self, i):
+ if self.at_line_start() or self.intail:
+ return super().parse_pi(i)
+ # This is not the beginning of a raw block so treat as plain data
+ # and avoid consuming any tags which may follow (see #1066).
+ self.handle_data('<?')
+ return i + 2
+
+ def parse_html_declaration(self, i):
+ if self.at_line_start() or self.intail:
+ return super().parse_html_declaration(i)
+ # This is not the beginning of a raw block so treat as plain data
+ # and avoid consuming any tags which may follow (see #1066).
+ self.handle_data('<!')
+ return i + 2
+
# The rest has been copied from base class in standard lib to address #1036.
# As __starttag_text is private, all references to it must be in this subclass.
# The last few lines of parse_starttag are reversed so that handle_starttag
class ShortReferenceInlineProcessor(ReferenceInlineProcessor):
- """Shorte form of reference: [google]. """
+ """Short form of reference: [google]. """
def evalId(self, data, index, text):
"""Evaluate the id of [ref] """
self.md.htmlStash.get_placeholder(i))] = html
replacements[self.md.htmlStash.get_placeholder(i)] = html
+ def substitute_match(m):
+ key = m.group(0)
+
+ if key not in replacements:
+ if key[3:-4] in replacements:
+ return f'<p>{ replacements[key[3:-4]] }</p>'
+ else:
+ return key
+
+ return replacements[key]
+
if replacements:
- pattern = re.compile("|".join(re.escape(k) for k in replacements))
- processed_text = pattern.sub(lambda m: replacements[m.group(0)], text)
+ base_placeholder = util.HTML_PLACEHOLDER % r'([0-9]+)'
+ pattern = re.compile(f'<p>{ base_placeholder }</p>|{ base_placeholder }')
+ processed_text = pattern.sub(substitute_match, text)
else:
return text
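
The three outcomes of `substitute_match` are easier to see in a standalone sketch. The value of `HTML_PLACEHOLDER` below is an assumption meant to mirror `util.HTML_PLACEHOLDER`, and the `restore` wrapper and sample `replacements` dict are hypothetical, built by hand instead of from a real stash.

```python
import re

STX, ETX = "\u0002", "\u0003"
# Assumed to mirror util.HTML_PLACEHOLDER; treat the exact text as illustrative.
HTML_PLACEHOLDER = STX + "wzxhzdk:%s" + ETX


def restore(text, replacements):
    """Replace stash placeholders in `text` using a single generic pattern."""

    def substitute_match(m):
        key = m.group(0)
        if key not in replacements:
            # key[3:-4] strips a wrapping '<p>' and '</p>' from the match.
            if key[3:-4] in replacements:
                return f'<p>{replacements[key[3:-4]]}</p>'
            # Unknown placeholder: leave the text as it was.
            return key
        return replacements[key]

    base = HTML_PLACEHOLDER % r'([0-9]+)'
    pattern = re.compile(f'<p>{base}</p>|{base}')
    return pattern.sub(substitute_match, text)


ph0, ph1, ph2 = (HTML_PLACEHOLDER % i for i in range(3))
replacements = {
    ph0: '<em>x</em>',              # span-level: only the bare key is stored
    f'<p>{ph1}</p>': '<div>y</div>',  # block-level: wrapped and bare keys
    ph1: '<div>y</div>',
}
result = restore(f'<p>{ph0}</p>\n<p>{ph1}</p>\n{ph2}', replacements)
```

A wrapped span-level placeholder keeps its paragraph, a wrapped block-level placeholder loses it, and a placeholder that was never stashed passes through unchanged, which is the case `test_placeholder_in_source` in this patch checks.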
import sys
import unittest
import textwrap
-from . import markdown, util
+from . import markdown, Markdown, util
try:
import tidylib
default_kwargs = {}
- def assertMarkdownRenders(self, source, expected, **kwargs):
+ def assertMarkdownRenders(self, source, expected, expected_attrs=None, **kwargs):
"""
Test that source Markdown text renders to expected output with given keywords.
+
+ `expected_attrs` accepts a dict. Each key should be the name of an attribute
+ on the `Markdown` instance and the value should be the expected value after
+ the source text is parsed by Markdown. After the expected output is tested,
+ the expected value for each attribute is compared against the actual
+ attribute of the `Markdown` instance using `TestCase.assertEqual`.
"""
+ expected_attrs = expected_attrs or {}
kws = self.default_kwargs.copy()
kws.update(kwargs)
- output = markdown(source, **kws)
+ md = Markdown(**kws)
+ output = md.convert(source)
self.assertMultiLineEqual(output, expected)
+ for key, value in expected_attrs.items():
+ self.assertEqual(getattr(md, key), value)
def dedent(self, text):
"""
"""
if not isinstance(data, util.AtomicString):
startIndex = 0
- while patternIndex < len(self.inlinePatterns):
+ count = len(self.inlinePatterns)
+ while patternIndex < count:
data, matched, startIndex = self.__applyPattern(
self.inlinePatterns[patternIndex], data, patternIndex, startIndex
)
long_description=long_description,
long_description_content_type='text/markdown',
author='Manfred Stienstra, Yuri Takhteyev and Waylan Limberg',
- author_email='waylan.limberg@icloud.com',
+ author_email='python.markdown@gmail.com',
maintainer='Waylan Limberg',
- maintainer_email='waylan.limberg@icloud.com',
+ maintainer_email='python.markdown@gmail.com',
license='BSD License',
packages=['markdown', 'markdown.extensions'],
python_requires='>=3.6',
{'level': 1, 'id': 'some-header-with-markup', 'name': 'Some Header with markup.', 'children': []},
])
- def testAnchorLink(self):
- """ Test TOC Anchorlink. """
- md = markdown.Markdown(
- extensions=[markdown.extensions.toc.TocExtension(anchorlink=True)]
- )
- text = '# Header 1\n\n## Header *2*'
- self.assertEqual(
- md.convert(text),
- '<h1 id="header-1"><a class="toclink" href="#header-1">Header 1</a></h1>\n'
- '<h2 id="header-2"><a class="toclink" href="#header-2">Header <em>2</em></a></h2>'
- )
-
- def testAnchorLinkWithSingleInlineCode(self):
- """ Test TOC Anchorlink with single inline code. """
- md = markdown.Markdown(
- extensions=[markdown.extensions.toc.TocExtension(anchorlink=True)]
- )
- text = '# This is `code`.'
- self.assertEqual(
- md.convert(text),
- '<h1 id="this-is-code">' # noqa
- '<a class="toclink" href="#this-is-code">' # noqa
- 'This is <code>code</code>.' # noqa
- '</a>' # noqa
- '</h1>' # noqa
- )
-
- def testAnchorLinkWithDoubleInlineCode(self):
- """ Test TOC Anchorlink with double inline code. """
- md = markdown.Markdown(
- extensions=[markdown.extensions.toc.TocExtension(anchorlink=True)]
- )
- text = '# This is `code` and `this` too.'
- self.assertEqual(
- md.convert(text),
- '<h1 id="this-is-code-and-this-too">' # noqa
- '<a class="toclink" href="#this-is-code-and-this-too">' # noqa
- 'This is <code>code</code> and <code>this</code> too.' # noqa
- '</a>' # noqa
- '</h1>' # noqa
- )
-
- def testPermalink(self):
- """ Test TOC Permalink. """
- md = markdown.Markdown(
- extensions=[markdown.extensions.toc.TocExtension(permalink=True)]
- )
- text = '# Header'
- self.assertEqual(
- md.convert(text),
- '<h1 id="header">' # noqa
- 'Header' # noqa
- '<a class="headerlink" href="#header" title="Permanent link">¶</a>' # noqa
- '</h1>' # noqa
- )
-
- def testPermalinkWithSingleInlineCode(self):
- """ Test TOC Permalink with single inline code. """
- md = markdown.Markdown(
- extensions=[markdown.extensions.toc.TocExtension(permalink=True)]
- )
- text = '# This is `code`.'
- self.assertEqual(
- md.convert(text),
- '<h1 id="this-is-code">' # noqa
- 'This is <code>code</code>.' # noqa
- '<a class="headerlink" href="#this-is-code" title="Permanent link">¶</a>' # noqa
- '</h1>' # noqa
- )
-
- def testPermalinkWithDoubleInlineCode(self):
- """ Test TOC Permalink with double inline code. """
- md = markdown.Markdown(
- extensions=[markdown.extensions.toc.TocExtension(permalink=True)]
- )
- text = '# This is `code` and `this` too.'
- self.assertEqual(
- md.convert(text),
- '<h1 id="this-is-code-and-this-too">' # noqa
- 'This is <code>code</code> and <code>this</code> too.' # noqa
- '<a class="headerlink" href="#this-is-code-and-this-too" title="Permanent link">¶</a>' # noqa
- '</h1>' # noqa
- )
-
def testTitle(self):
""" Test TOC Title. """
md = markdown.Markdown(
'<h1 id="toc"><em>[TOC]</em></h1>' # noqa
)
- def testMinMaxLevel(self):
- """ Test toc_height setting """
- md = markdown.Markdown(
- extensions=[markdown.extensions.toc.TocExtension(toc_depth='3-4')]
- )
- text = '# Header 1 not in TOC\n\n## Header 2 not in TOC\n\n### Header 3\n\n####Header 4'
- self.assertEqual(
- md.convert(text),
- '<h1>Header 1 not in TOC</h1>\n'
- '<h2>Header 2 not in TOC</h2>\n'
- '<h3 id="header-3">Header 3</h3>\n'
- '<h4 id="header-4">Header 4</h4>'
- )
- self.assertEqual(
- md.toc,
- '<div class="toc">\n'
- '<ul>\n' # noqa
- '<li><a href="#header-3">Header 3</a>' # noqa
- '<ul>\n' # noqa
- '<li><a href="#header-4">Header 4</a></li>\n' # noqa
- '</ul>\n' # noqa
- '</li>\n' # noqa
- '</ul>\n' # noqa
- '</div>\n'
- )
-
- self.assertNotIn("Header 1", md.toc)
-
- def testMaxLevel(self):
- """ Test toc_depth setting """
- md = markdown.Markdown(
- extensions=[markdown.extensions.toc.TocExtension(toc_depth=2)]
- )
- text = '# Header 1\n\n## Header 2\n\n###Header 3 not in TOC'
- self.assertEqual(
- md.convert(text),
- '<h1 id="header-1">Header 1</h1>\n'
- '<h2 id="header-2">Header 2</h2>\n'
- '<h3>Header 3 not in TOC</h3>'
- )
- self.assertEqual(
- md.toc,
- '<div class="toc">\n'
- '<ul>\n' # noqa
- '<li><a href="#header-1">Header 1</a>' # noqa
- '<ul>\n' # noqa
- '<li><a href="#header-2">Header 2</a></li>\n' # noqa
- '</ul>\n' # noqa
- '</li>\n' # noqa
- '</ul>\n' # noqa
- '</div>\n'
- )
-
- self.assertNotIn("Header 3", md.toc)
-
- def testMinMaxLevelwithBaseLevel(self):
- """ Test toc_height setting together with baselevel """
- md = markdown.Markdown(
- extensions=[markdown.extensions.toc.TocExtension(toc_depth='4-6',
- baselevel=3)]
- )
- text = '# First Header\n\n## Second Level\n\n### Third Level'
- self.assertEqual(
- md.convert(text),
- '<h3>First Header</h3>\n'
- '<h4 id="second-level">Second Level</h4>\n'
- '<h5 id="third-level">Third Level</h5>'
- )
- self.assertEqual(
- md.toc,
- '<div class="toc">\n'
- '<ul>\n' # noqa
- '<li><a href="#second-level">Second Level</a>' # noqa
- '<ul>\n' # noqa
- '<li><a href="#third-level">Third Level</a></li>\n' # noqa
- '</ul>\n' # noqa
- '</li>\n' # noqa
- '</ul>\n' # noqa
- '</div>\n'
- )
- self.assertNotIn("First Header", md.toc)
-
- def testMaxLevelwithBaseLevel(self):
- """ Test toc_depth setting together with baselevel """
- md = markdown.Markdown(
- extensions=[markdown.extensions.toc.TocExtension(toc_depth=3,
- baselevel=2)]
- )
- text = '# Some Header\n\n## Next Level\n\n### Too High'
- self.assertEqual(
- md.convert(text),
- '<h2 id="some-header">Some Header</h2>\n'
- '<h3 id="next-level">Next Level</h3>\n'
- '<h4>Too High</h4>'
- )
- self.assertEqual(
- md.toc,
- '<div class="toc">\n'
- '<ul>\n' # noqa
- '<li><a href="#some-header">Some Header</a>' # noqa
- '<ul>\n' # noqa
- '<li><a href="#next-level">Next Level</a></li>\n' # noqa
- '</ul>\n' # noqa
- '</li>\n' # noqa
- '</ul>\n' # noqa
- '</div>\n'
- )
- self.assertNotIn("Too High", md.toc)
-
class TestSmarty(unittest.TestCase):
def setUp(self):
"""
from markdown.test_tools import TestCase
+import markdown
class TestHTMLBlocks(TestCase):
'<p><foo</p>'
)
+ def test_raw_unclosed_tag_in_code_span(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ """
+ `<div`.
+
+ <div>
+ hello
+ </div>
+ """
+ ),
+ self.dedent(
+ """
+ <p><code><div</code>.</p>
+ <div>
+ hello
+ </div>
+ """
+ )
+ )
+
+ def test_raw_unclosed_tag_in_code_span_space(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ """
+ ` <div `.
+
+ <div>
+ hello
+ </div>
+ """
+ ),
+ self.dedent(
+ """
+ <p><code><div</code>.</p>
+ <div>
+ hello
+ </div>
+ """
+ )
+ )
+
def test_raw_attributes(self):
self.assertMarkdownRenders(
'<p id="foo", class="bar baz", style="margin: 15px; line-height: 1.5; text-align: center;">text</p>',
)
)
+ def test_raw_processing_instruction_code_span(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ """
+ `<?php`
+
+ <div>
+ foo
+ </div>
+ """
+ ),
+ self.dedent(
+ """
+ <p><code><?php</code></p>
+ <div>
+ foo
+ </div>
+ """
+ )
+ )
+
def test_raw_declaration_one_line(self):
self.assertMarkdownRenders(
'<!DOCTYPE html>',
)
)
+ def test_raw_declaration_code_span(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ """
+ `<!`
+
+ <div>
+ foo
+ </div>
+ """
+ ),
+ self.dedent(
+ """
+ <p><code><!</code></p>
+ <div>
+ foo
+ </div>
+ """
+ )
+ )
+
def test_raw_cdata_one_line(self):
self.assertMarkdownRenders(
'<![CDATA[ document.write(">"); ]]>',
)
)
+ def test_raw_cdata_code_span(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ """
+ `<![`
+
+ <div>
+ foo
+ </div>
+ """
+ ),
+ self.dedent(
+ """
+ <p><code><![</code></p>
+ <div>
+ foo
+ </div>
+ """
+ )
+ )
+
def test_charref(self):
self.assertMarkdownRenders(
'§',
"""
)
)
+
+ def test_placeholder_in_source(self):
+ # This should never occur, but third party extensions could create weird edge cases.
+ md = markdown.Markdown()
+ # Ensure there is an htmlstash so relevant code (nested in `if replacements`) is run.
+ md.htmlStash.store('foo')
+ # Run with a placeholder which is not in the stash
+ placeholder = md.htmlStash.get_placeholder(md.htmlStash.html_counter + 1)
+ result = md.postprocessors['raw_html'].run(placeholder)
+ self.assertEqual(placeholder, result)
),
extensions=['admonition', 'def_list']
)
+
+ def test_with_preceding_text(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ '''
+ foo
+ **foo**
+ !!! note "Admonition"
+ '''
+ ),
+ self.dedent(
+ '''
+ <p>foo
+ <strong>foo</strong></p>
+ <div class="admonition note">
+ <p class="admonition-title">Admonition</p>
+ </div>
+ '''
+ ),
+ extensions=['admonition']
+ )
+
+ def test_admontion_detabbing(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ '''
+ !!! note "Admonition"
+ - Parent 1
+
+ - Child 1
+ - Child 2
+ '''
+ ),
+ self.dedent(
+ '''
+ <div class="admonition note">
+ <p class="admonition-title">Admonition</p>
+ <ul>
+ <li>
+ <p>Parent 1</p>
+ <ul>
+ <li>Child 1</li>
+ <li>Child 2</li>
+ </ul>
+ </li>
+ </ul>
+ </div>
+ '''
+ ),
+ extensions=['admonition']
+ )
)
)
+ def test_md1_code_span(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ """
+ <div markdown="1">
+ `<h1>code span</h1>`
+ </div>
+ """
+ ),
+ self.dedent(
+ """
+ <div>
+ <p><code><h1>code span</h1></code></p>
+ </div>
+ """
+ )
+ )
+
+ def test_md1_code_span_oneline(self):
+ self.assertMarkdownRenders(
+ '<div markdown="1">`<h1>code span</h1>`</div>',
+ self.dedent(
+ """
+ <div>
+ <p><code><h1>code span</h1></code></p>
+ </div>
+ """
+ )
+ )
+
+ def test_md1_code_span_unclosed(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ """
+ <div markdown="1">
+ `<p>`
+ </div>
+ """
+ ),
+ self.dedent(
+ """
+ <div>
+ <p><code><p></code></p>
+ </div>
+ """
+ )
+ )
+
+ def test_md1_code_span_script_tag(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ """
+ <div markdown="1">
+ `<script>`
+ </div>
+ """
+ ),
+ self.dedent(
+ """
+ <div>
+ <p><code><script></code></p>
+ </div>
+ """
+ )
+ )
+
def test_md1_div_blank_lines(self):
self.assertMarkdownRenders(
self.dedent(
)
)
+ def test_md1_PI_oneliner(self):
+ self.assertMarkdownRenders(
+ '<div markdown="1"><?php print("foo"); ?></div>',
+ self.dedent(
+ """
+ <div>
+ <?php print("foo"); ?>
+ </div>
+ """
+ )
+ )
+
+ def test_md1_PI_multiline(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ """
+ <div markdown="1">
+ <?php print("foo"); ?>
+ </div>
+ """
+ ),
+ self.dedent(
+ """
+ <div>
+ <?php print("foo"); ?>
+ </div>
+ """
+ )
+ )
+
+ def test_md1_PI_blank_lines(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ """
+ <div markdown="1">
+
+ <?php print("foo"); ?>
+
+ </div>
+ """
+ ),
+ self.dedent(
+ """
+ <div>
+ <?php print("foo"); ?>
+ </div>
+ """
+ )
+ )
+
def test_md_span_paragraph(self):
self.assertMarkdownRenders(
'<p markdown="span">*foo*</p>',
class TestTOC(TestCase):
+ maxDiff = None
# TODO: Move the rest of the TOC tests here.
+ def testAnchorLink(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ '''
+ # Header 1
+
+ ## Header *2*
+ '''
+ ),
+ self.dedent(
+ '''
+ <h1 id="header-1"><a class="toclink" href="#header-1">Header 1</a></h1>
+ <h2 id="header-2"><a class="toclink" href="#header-2">Header <em>2</em></a></h2>
+ '''
+ ),
+ extensions=[TocExtension(anchorlink=True)]
+ )
+
+ def testAnchorLinkWithSingleInlineCode(self):
+ self.assertMarkdownRenders(
+ '# This is `code`.',
+ '<h1 id="this-is-code">' # noqa
+ '<a class="toclink" href="#this-is-code">' # noqa
+ 'This is <code>code</code>.' # noqa
+ '</a>' # noqa
+ '</h1>', # noqa
+ extensions=[TocExtension(anchorlink=True)]
+ )
+
+ def testAnchorLinkWithDoubleInlineCode(self):
+ self.assertMarkdownRenders(
+ '# This is `code` and `this` too.',
+ '<h1 id="this-is-code-and-this-too">' # noqa
+ '<a class="toclink" href="#this-is-code-and-this-too">' # noqa
+ 'This is <code>code</code> and <code>this</code> too.' # noqa
+ '</a>' # noqa
+ '</h1>', # noqa
+ extensions=[TocExtension(anchorlink=True)]
+ )
+
+ def testPermalink(self):
+ self.assertMarkdownRenders(
+ '# Header',
+ '<h1 id="header">' # noqa
+ 'Header' # noqa
+ '<a class="headerlink" href="#header" title="Permanent link">¶</a>' # noqa
+ '</h1>', # noqa
+ extensions=[TocExtension(permalink=True)]
+ )
+
+ def testPermalinkWithSingleInlineCode(self):
+ self.assertMarkdownRenders(
+ '# This is `code`.',
+ '<h1 id="this-is-code">' # noqa
+ 'This is <code>code</code>.' # noqa
+ '<a class="headerlink" href="#this-is-code" title="Permanent link">¶</a>' # noqa
+ '</h1>', # noqa
+ extensions=[TocExtension(permalink=True)]
+ )
+
+ def testPermalinkWithDoubleInlineCode(self):
+ self.assertMarkdownRenders(
+ '# This is `code` and `this` too.',
+ '<h1 id="this-is-code-and-this-too">' # noqa
+ 'This is <code>code</code> and <code>this</code> too.' # noqa
+ '<a class="headerlink" href="#this-is-code-and-this-too" title="Permanent link">¶</a>' # noqa
+ '</h1>', # noqa
+ extensions=[TocExtension(permalink=True)]
+ )
+
+ def testMinMaxLevel(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ '''
+ # Header 1 not in TOC
+
+ ## Header 2 not in TOC
+
+ ### Header 3
+
+ #### Header 4
+
+ ##### Header 5 not in TOC
+ '''
+ ),
+ self.dedent(
+ '''
+ <h1 id="header-1-not-in-toc">Header 1 not in TOC</h1>
+ <h2 id="header-2-not-in-toc">Header 2 not in TOC</h2>
+ <h3 id="header-3">Header 3</h3>
+ <h4 id="header-4">Header 4</h4>
+ <h5 id="header-5-not-in-toc">Header 5 not in TOC</h5>
+ '''
+ ),
+ expected_attrs={
+ 'toc': (
+ '<div class="toc">\n'
+ '<ul>\n' # noqa
+ '<li><a href="#header-3">Header 3</a>' # noqa
+ '<ul>\n' # noqa
+ '<li><a href="#header-4">Header 4</a></li>\n' # noqa
+ '</ul>\n' # noqa
+ '</li>\n' # noqa
+ '</ul>\n' # noqa
+ '</div>\n' # noqa
+ ),
+ 'toc_tokens': [
+ {
+ 'level': 3,
+ 'id': 'header-3',
+ 'name': 'Header 3',
+ 'children': [
+ {
+ 'level': 4,
+ 'id': 'header-4',
+ 'name': 'Header 4',
+ 'children': []
+ }
+ ]
+ }
+ ]
+ },
+ extensions=[TocExtension(toc_depth='3-4')]
+ )
+
+ def testMaxLevel(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ '''
+ # Header 1
+
+ ## Header 2
+
+ ### Header 3 not in TOC
+ '''
+ ),
+ self.dedent(
+ '''
+ <h1 id="header-1">Header 1</h1>
+ <h2 id="header-2">Header 2</h2>
+ <h3 id="header-3-not-in-toc">Header 3 not in TOC</h3>
+ '''
+ ),
+ expected_attrs={
+ 'toc': (
+ '<div class="toc">\n'
+ '<ul>\n' # noqa
+ '<li><a href="#header-1">Header 1</a>' # noqa
+ '<ul>\n' # noqa
+ '<li><a href="#header-2">Header 2</a></li>\n' # noqa
+ '</ul>\n' # noqa
+ '</li>\n' # noqa
+ '</ul>\n' # noqa
+ '</div>\n' # noqa
+ ),
+ 'toc_tokens': [
+ {
+ 'level': 1,
+ 'id': 'header-1',
+ 'name': 'Header 1',
+ 'children': [
+ {
+ 'level': 2,
+ 'id': 'header-2',
+ 'name': 'Header 2',
+ 'children': []
+ }
+ ]
+ }
+ ]
+ },
+ extensions=[TocExtension(toc_depth=2)]
+ )
+
+ def testMinMaxLevelwithAnchorLink(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ '''
+ # Header 1 not in TOC
+
+ ## Header 2 not in TOC
+
+ ### Header 3
+
+ #### Header 4
+
+ ##### Header 5 not in TOC
+ '''
+ ),
+ '<h1 id="header-1-not-in-toc">' # noqa
+ '<a class="toclink" href="#header-1-not-in-toc">Header 1 not in TOC</a></h1>\n' # noqa
+ '<h2 id="header-2-not-in-toc">' # noqa
+ '<a class="toclink" href="#header-2-not-in-toc">Header 2 not in TOC</a></h2>\n' # noqa
+ '<h3 id="header-3">' # noqa
+ '<a class="toclink" href="#header-3">Header 3</a></h3>\n' # noqa
+ '<h4 id="header-4">' # noqa
+ '<a class="toclink" href="#header-4">Header 4</a></h4>\n' # noqa
+ '<h5 id="header-5-not-in-toc">' # noqa
+ '<a class="toclink" href="#header-5-not-in-toc">Header 5 not in TOC</a></h5>', # noqa
+ expected_attrs={
+ 'toc': (
+ '<div class="toc">\n'
+ '<ul>\n' # noqa
+ '<li><a href="#header-3">Header 3</a>' # noqa
+ '<ul>\n' # noqa
+ '<li><a href="#header-4">Header 4</a></li>\n' # noqa
+ '</ul>\n' # noqa
+ '</li>\n' # noqa
+ '</ul>\n' # noqa
+ '</div>\n' # noqa
+ ),
+ 'toc_tokens': [
+ {
+ 'level': 3,
+ 'id': 'header-3',
+ 'name': 'Header 3',
+ 'children': [
+ {
+ 'level': 4,
+ 'id': 'header-4',
+ 'name': 'Header 4',
+ 'children': []
+ }
+ ]
+ }
+ ]
+ },
+ extensions=[TocExtension(toc_depth='3-4', anchorlink=True)]
+ )
+
+ def testMinMaxLevelwithPermalink(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ '''
+ # Header 1 not in TOC
+
+ ## Header 2 not in TOC
+
+ ### Header 3
+
+ #### Header 4
+
+ ##### Header 5 not in TOC
+ '''
+ ),
+ '<h1 id="header-1-not-in-toc">Header 1 not in TOC' # noqa
+ '<a class="headerlink" href="#header-1-not-in-toc" title="Permanent link">¶</a></h1>\n' # noqa
+ '<h2 id="header-2-not-in-toc">Header 2 not in TOC' # noqa
+ '<a class="headerlink" href="#header-2-not-in-toc" title="Permanent link">¶</a></h2>\n' # noqa
+ '<h3 id="header-3">Header 3' # noqa
+ '<a class="headerlink" href="#header-3" title="Permanent link">¶</a></h3>\n' # noqa
+ '<h4 id="header-4">Header 4' # noqa
+ '<a class="headerlink" href="#header-4" title="Permanent link">¶</a></h4>\n' # noqa
+ '<h5 id="header-5-not-in-toc">Header 5 not in TOC' # noqa
+ '<a class="headerlink" href="#header-5-not-in-toc" title="Permanent link">¶</a></h5>', # noqa
+ expected_attrs={
+ 'toc': (
+ '<div class="toc">\n'
+ '<ul>\n' # noqa
+ '<li><a href="#header-3">Header 3</a>' # noqa
+ '<ul>\n' # noqa
+ '<li><a href="#header-4">Header 4</a></li>\n' # noqa
+ '</ul>\n' # noqa
+ '</li>\n' # noqa
+ '</ul>\n' # noqa
+ '</div>\n' # noqa
+ ),
+ 'toc_tokens': [
+ {
+ 'level': 3,
+ 'id': 'header-3',
+ 'name': 'Header 3',
+ 'children': [
+ {
+ 'level': 4,
+ 'id': 'header-4',
+ 'name': 'Header 4',
+ 'children': []
+ }
+ ]
+ }
+ ]
+ },
+ extensions=[TocExtension(toc_depth='3-4', permalink=True)]
+ )
+
+ def testMinMaxLevelwithBaseLevel(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ '''
+ # First Header
+
+ ## Second Level
+
+ ### Third Level
+
+ #### Fourth Level
+ '''
+ ),
+ self.dedent(
+ '''
+ <h3 id="first-header">First Header</h3>
+ <h4 id="second-level">Second Level</h4>
+ <h5 id="third-level">Third Level</h5>
+ <h6 id="fourth-level">Fourth Level</h6>
+ '''
+ ),
+ expected_attrs={
+ 'toc': (
+ '<div class="toc">\n'
+ '<ul>\n' # noqa
+ '<li><a href="#second-level">Second Level</a>' # noqa
+ '<ul>\n' # noqa
+ '<li><a href="#third-level">Third Level</a></li>\n' # noqa
+ '</ul>\n' # noqa
+ '</li>\n' # noqa
+ '</ul>\n' # noqa
+ '</div>\n' # noqa
+ ),
+ 'toc_tokens': [
+ {
+ 'level': 4,
+ 'id': 'second-level',
+ 'name': 'Second Level',
+ 'children': [
+ {
+ 'level': 5,
+ 'id': 'third-level',
+ 'name': 'Third Level',
+ 'children': []
+ }
+ ]
+ }
+ ]
+ },
+ extensions=[TocExtension(toc_depth='4-5', baselevel=3)]
+ )
+
+ def testMaxLevelwithBaseLevel(self):
+ self.assertMarkdownRenders(
+ self.dedent(
+ '''
+ # Some Header
+
+ ## Next Level
+
+ ### Too High
+ '''
+ ),
+ self.dedent(
+ '''
+ <h2 id="some-header">Some Header</h2>
+ <h3 id="next-level">Next Level</h3>
+ <h4 id="too-high">Too High</h4>
+ '''
+ ),
+ expected_attrs={
+ 'toc': (
+ '<div class="toc">\n'
+ '<ul>\n' # noqa
+ '<li><a href="#some-header">Some Header</a>' # noqa
+ '<ul>\n' # noqa
+ '<li><a href="#next-level">Next Level</a></li>\n' # noqa
+ '</ul>\n' # noqa
+ '</li>\n' # noqa
+ '</ul>\n' # noqa
+ '</div>\n' # noqa
+ ),
+ 'toc_tokens': [
+ {
+ 'level': 2,
+ 'id': 'some-header',
+ 'name': 'Some Header',
+ 'children': [
+ {
+ 'level': 3,
+ 'id': 'next-level',
+ 'name': 'Next Level',
+ 'children': []
+ }
+ ]
+ }
+ ]
+ },
+ extensions=[TocExtension(toc_depth=3, baselevel=2)]
+ )
+
def test_escaped_code(self):
self.assertMarkdownRenders(
self.dedent(
'<h1 id="unicode-ヘッター">' # noqa
'Unicode ヘッダー' # noqa
'<a class="headerlink" href="#unicode-ヘッター" title="Permanent link">¶</a>' # noqa
- '</h1>', # noqa
+ '</h1>', # noqa
extensions=[TocExtension(permalink=True, slugify=slugify_unicode)]
)
from markdown.extensions.toc import slugify_unicode
self.assertMarkdownRenders(
'# Unicode ヘッダー',
- '<h1 id="unicode-ヘッター">' # noqa
- 'Unicode ヘッダー' # noqa
- '<a class="headerlink" href="#unicode-ヘッター" title="パーマリンク">¶</a>' # noqa
- '</h1>', # noqa
+ '<h1 id="unicode-ヘッター">' # noqa
+ 'Unicode ヘッダー' # noqa
+ '<a class="headerlink" href="#unicode-ヘッター" title="パーマリンク">¶</a>' # noqa
+ '</h1>', # noqa
extensions=[TocExtension(permalink=True, permalink_title="パーマリンク", slugify=slugify_unicode)]
)