author | Leonard Richardson <leonardr@segfault.org> | 2018-07-28 16:58:23 -0400 |
---|---|---|
committer | Leonard Richardson <leonardr@segfault.org> | 2018-07-28 16:58:23 -0400 |
commit | 37e4159cb49d2f7c8fdafa0268adca5a1e2017e4 (patch) | |
tree | 30826a9a744be8f2194c469484618a0326d0488e /NEWS.txt | |
parent | 81f853622f808fba7cd89d02ec524abc8588f196 (diff) | |
download | beautifulsoup4-37e4159cb49d2f7c8fdafa0268adca5a1e2017e4.tar.gz |
Correctly handle invalid HTML numeric character entities like &#147;
which reference code points that are not Unicode code points. Note
that this is only fixed when Beautiful Soup is used with the
html.parser parser -- html5lib already worked and I couldn't fix it
with lxml. [bug=1782933]
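
For illustration only, here is a minimal standalone sketch of the kind of fallback the message describes. It is not the code from this commit: the function name `decode_numeric_charref` and the choice to try Windows-1252 first for values below 256 are assumptions made for the example. The idea is that a reference such as `&#147;` most likely means byte 0x93 in Windows-1252 (a left double quotation mark), and that a number which cannot be decoded at all becomes U+FFFD.

```python
REPLACEMENT = "\N{REPLACEMENT CHARACTER}"


def decode_numeric_charref(name: str) -> str:
    """Decode the body of a numeric reference, e.g. "147" or "x93".

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    try:
        # A leading x or X marks a hexadecimal reference.
        if name[:1] in ("x", "X"):
            code = int(name[1:], 16)
        else:
            code = int(name)
    except ValueError:
        return REPLACEMENT

    if 0 <= code < 256:
        # Assumption: small values are treated as Windows-1252 bytes,
        # so 147 (0x93) becomes U+201C rather than a C1 control character.
        try:
            return bytes([code]).decode("windows-1252")
        except UnicodeDecodeError:
            pass  # byte is undefined in Windows-1252; fall through

    try:
        return chr(code)
    except (ValueError, OverflowError):
        # Negative or beyond U+10FFFF: not a Unicode code point.
        return REPLACEMENT


print(ascii(decode_numeric_charref("147")))      # '\u201c'
print(ascii(decode_numeric_charref("1114112")))  # '\ufffd'
```

Values of 256 and above that are valid code points pass straight through `chr`, so well-formed references such as `&#8220;` are unaffected by the fallback.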
Diffstat (limited to 'NEWS.txt')
-rw-r--r-- | NEWS.txt | 6 |
1 files changed, 6 insertions, 0 deletions
@@ -12,6 +12,12 @@
 * Fixed a problem where the html.parser tree builder interpreted
   a string like "&foo " as the character entity "&foo;" [bug=1728706]
 
+* Correctly handle invalid HTML numeric character entities like &#147;
+  which reference code points that are not Unicode code points. Note
+  that this is only fixed when Beautiful Soup is used with the
+  html.parser parser -- html5lib already worked and I couldn't fix it
+  with lxml. [bug=1782933]
+
 * Improved the warning given when no parser is specified. [bug=1780571]
 
 * Fixed code that was causing deprecation warnings in recent Python 3