author | scoder <stefan_ml@behnel.de> | 2013-04-28 06:42:45 -0700 |
---|---|---|
committer | scoder <stefan_ml@behnel.de> | 2013-04-28 06:42:45 -0700 |
commit | bdbf8f85ad6e27ac386493e96046938a532a8054 | |
tree | 71940d18ed3f3db560b95df05a4bbffee82dd1e3 | |
parent | 889172388ecb621f5739578ade35faa1879d66df | |
parent | caa2748cb1c774d18a1664c3434cbc7c862bb46f | |
Merge pull request #104 from crayzeewulf/master
Corrected the sample output of clean_html()
-rw-r--r-- | doc/lxmlhtml.txt | 29 |
1 files changed, 12 insertions, 17 deletions
diff --git a/doc/lxmlhtml.txt b/doc/lxmlhtml.txt
index 776a4ae3..940e65bb 100644
--- a/doc/lxmlhtml.txt
+++ b/doc/lxmlhtml.txt
@@ -515,24 +515,19 @@ To remove the all suspicious content from this unparsed document, use the
 .. sourcecode:: pycon
 
     >>> from lxml.html.clean import clean_html
-    >>> print clean_html(html)
-    <html>
-      <body>
-        <div>
-          <style>/* deleted */</style>
-          <a href="">a link</a>
-          <a href="#">another link</a>
-          <p>a paragraph</p>
-          <div>secret EVIL!</div>
-          of EVIL!
-          Password:
-          annoying EVIL!
-          <a href="evil-site">spam spam SPAM!</a>
-          <img src="evil!">
-        </div>
-      </body>
-    </html>
+    <div><style>/* deleted */</style><body>
+
+    <a href="">a link</a>
+    <a href="#">another link</a>
+    <p>a paragraph</p>
+    <div>secret EVIL!</div>
+    of EVIL!
+
+
+    Password:
+    annoying EVIL!<a href="evil-site">spam spam SPAM!</a>
+    <img src="evil!"></body></div>
 
 The ``Cleaner`` class supports several keyword arguments to control
 exactly which content is removed: