author: crayzeewulf <crayzeewulf@gmail.com> 2013-03-21 14:15:52 -0700
committer: crayzeewulf <crayzeewulf@gmail.com> 2013-03-21 14:15:52 -0700
commit: caa2748cb1c774d18a1664c3434cbc7c862bb46f
tree: 1239d0bd62e8eedbf8812cb3b380b33736e79367
parent: ec692af97eea48421c12525bdafd2f20f922bd86
Corrected the sample output of clean_html()
The output of clean_html() does not include html and body tags. The example output in the documentation was corrected.
-rw-r--r-- doc/lxmlhtml.txt 29
1 files changed, 12 insertions, 17 deletions
diff --git a/doc/lxmlhtml.txt b/doc/lxmlhtml.txt
index 776a4ae3..940e65bb 100644
--- a/doc/lxmlhtml.txt
+++ b/doc/lxmlhtml.txt
@@ -515,24 +515,19 @@ To remove the all suspicious content from this unparsed document, use the
.. sourcecode:: pycon
>>> from lxml.html.clean import clean_html
-
>>> print clean_html(html)
- <html>
- <body>
- <div>
- <style>/* deleted */</style>
- <a href="">a link</a>
- <a href="#">another link</a>
- <p>a paragraph</p>
- <div>secret EVIL!</div>
- of EVIL!
- Password:
- annoying EVIL!
- <a href="evil-site">spam spam SPAM!</a>
- <img src="evil!">
- </div>
- </body>
- </html>
+ <div><style>/* deleted */</style><body>
+
+ <a href="">a link</a>
+ <a href="#">another link</a>
+ <p>a paragraph</p>
+ <div>secret EVIL!</div>
+ of EVIL!
+
+
+ Password:
+ annoying EVIL!<a href="evil-site">spam spam SPAM!</a>
+ <img src="evil!"></body></div>
The ``Cleaner`` class supports several keyword arguments to control exactly
which content is removed: