path: root/src/encoding/xml/xml.go
Commit message | Author | Age | Files | Lines
* encoding/xml: replace comments inside directives with a space | Filippo Valsorda | 2021-03-15 | 1 | -0/+6

A Directive (like <!ENTITY xxx []>) can't have other nodes nested inside it (in our data structure representation), so there is no way to preserve comments. The previous behavior was to just elide them, which however might change the semantic meaning of the surrounding markup. Instead, replace them with a space which hopefully has the same semantic effect of the comment.

Directives are not actually a node type in the XML spec, which instead specifies each of them separately (<!ENTITY, <!DOCTYPE, etc.), each with its own grammar. The rules for where and when the comments are allowed are not straightforward, and can't be implemented without implementing custom logic for each of the directives.

Simply preserving the comments in the body of the directive would be problematic, as there can be unmatched quotes inside the comment. Whether those quotes are considered meaningful semantically or not, other parsers might disagree and interpret the output differently.

This issue was reported by Juho Nurminen of Mattermost as it leads to round-trip mismatches. See #43168. It's not being fixed in a security release because round-trip stability is not a currently supported security property of encoding/xml, and we don't believe these fixes would be sufficient to reliably guarantee it in the future.

Fixes CVE-2020-29510
Updates #43168

Change-Id: Icd86c75beff3e1e0689543efebdad10ed5178ce3
Reviewed-on: https://go-review.googlesource.com/c/go/+/277893
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
* encoding/xml: handle leading, trailing, or double colons in names | Filippo Valsorda | 2021-03-15 | 1 | -2/+3

Before this change, <:name> would parse as <name>, which could cause issues in applications that rely on the parse-encode cycle to round-trip. Similarly, <x name:=""> would parse as expected but then have the attribute dropped when serializing because its name was empty. Finally, <a:b:c> would parse and get serialized incorrectly. All these values are invalid XML, but to minimize the impact of this change, we parse them whole into Name.Local.

This issue was reported by Juho Nurminen of Mattermost as it leads to round-trip mismatches. See #43168. It's not being fixed in a security release because round-trip stability is not a currently supported security property of encoding/xml, and we don't believe these fixes would be sufficient to reliably guarantee it in the future.

Fixes CVE-2020-29509
Fixes CVE-2020-29511
Updates #43168

Change-Id: I68321c4d867305046f664347192948a889af3c7f
Reviewed-on: https://go-review.googlesource.com/c/go/+/277892
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
* all: use HTML5 br tags | John Bampton | 2021-03-13 | 1 | -1/+1

In HTML5 br tags don't need a closing slash

Change-Id: Ic53c43faee08c5b1267daa9a02cc186b1c255ca1
GitHub-Last-Rev: 652208116944d01b23b8af8f1af485da5e916d32
GitHub-Pull-Request: golang/go#44283
Reviewed-on: https://go-review.googlesource.com/c/go/+/292370
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
* encoding/xml: prevent infinite loop while decoding | Katie Hockman | 2021-03-10 | 1 | -9/+10

This change properly handles a TokenReader which returns an EOF in the middle of an open XML element.

Thanks to Sam Whited for reporting this.

Fixes CVE-2021-27918
Fixes #44913

Change-Id: Id02a3f3def4a1b415fa2d9a8e3b373eb6cb0f433
Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/1004594
Reviewed-by: Russ Cox <rsc@google.com>
Reviewed-by: Roland Shoemaker <bracewell@google.com>
Reviewed-by: Filippo Valsorda <valsorda@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/300391
Trust: Katie Hockman <katie@golang.org>
Run-TryBot: Katie Hockman <katie@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Alexander Rakoczy <alex@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
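The situation this commit addresses can be reproduced with a custom TokenReader. The sketch below assumes a Go release that includes the fix (older releases may not terminate on this input); the truncated type is invented for the illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
)

// truncated is a TokenReader that yields one start element and then
// reports io.EOF without ever closing it. (Hypothetical type.)
type truncated struct{ sent bool }

func (t *truncated) Token() (xml.Token, error) {
	if !t.sent {
		t.sent = true
		return xml.StartElement{Name: xml.Name{Local: "doc"}}, nil
	}
	return nil, io.EOF
}

// decodeTruncated runs a Decoder over the truncated token stream.
func decodeTruncated() error {
	var v struct{}
	return xml.NewTokenDecoder(&truncated{}).Decode(&v)
}

func main() {
	// With the fix, decoding stops with an error instead of looping.
	fmt.Println("error:", decodeTruncated())
}
```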
* all: avoid string(i) where i has type int | Ian Lance Taylor | 2020-02-26 | 1 | -2/+2

Instead use string(r) where r has type rune.

This is in preparation for a vet warning for string(i).

Updates #32479

Change-Id: Ic205269bba1bd41723950219ecfb67ce17a7aa79
Reviewed-on: https://go-review.googlesource.com/c/go/+/220844
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Akhil Indurti <aindurti@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Toshihiro Shiino <shiino.toshihiro@gmail.com>
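For readers unfamiliar with the distinction, a short sketch of the two conversions follows. The function names asRune and asDecimal are illustrative only.

```go
package main

import (
	"fmt"
	"strconv"
)

// asRune converts a code point to the one-character string it denotes.
// The explicit rune conversion makes the intent clear to readers and to vet.
func asRune(i int) string {
	return string(rune(i))
}

// asDecimal converts an integer to its decimal text, which is what
// string(i) is often mistakenly expected to produce.
func asDecimal(i int) string {
	return strconv.Itoa(i)
}

func main() {
	fmt.Println(asRune(65))    // A
	fmt.Println(asDecimal(65)) // 65
}
```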
* encoding/xml: fix token decoder on early EOF | Sam Whited | 2019-10-30 | 1 | -1/+4

The documentation for TokenReader suggests that implementations of the interface may return a token and io.EOF together, indicating that it is the last token in the stream. This is similar to io.Reader. However, if you wrap such a TokenReader in a Decoder it complained about the EOF. A test was added to ensure this behavior on Decoder's.

Change-Id: I9083c91d9626180d3bcf5c069a017050f3c7c4a8
Reviewed-on: https://go-review.googlesource.com/c/go/+/130556
Run-TryBot: Sam Whited <sam@samwhited.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
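A sketch of a TokenReader that returns its last token together with io.EOF, wrapped in a Decoder. The sliceReader and greeting types are assumptions made for the example, and the behavior shown assumes a Go release that includes this change.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
)

// sliceReader replays a fixed list of tokens and returns the final
// token together with io.EOF. (Hypothetical type.)
type sliceReader struct{ toks []xml.Token }

func (r *sliceReader) Token() (xml.Token, error) {
	switch len(r.toks) {
	case 0:
		return nil, io.EOF
	case 1:
		t := r.toks[0]
		r.toks = nil
		return t, io.EOF // last token and EOF in the same call
	}
	t := r.toks[0]
	r.toks = r.toks[1:]
	return t, nil
}

type greeting struct {
	Text string `xml:",chardata"`
}

// decodeGreeting decodes <greeting>hello</greeting> from the token stream.
func decodeGreeting() (greeting, error) {
	name := xml.Name{Local: "greeting"}
	r := &sliceReader{toks: []xml.Token{
		xml.StartElement{Name: name},
		xml.CharData("hello"),
		xml.EndElement{Name: name},
	}}
	var g greeting
	err := xml.NewTokenDecoder(r).Decode(&g)
	return g, err
}

func main() {
	g, err := decodeGreeting()
	fmt.Printf("%q %v\n", g.Text, err)
}
```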
* encoding/xml: document HTMLAutoClose and HTMLEntity more | Brad Fitzpatrick | 2018-08-02 | 1 | -4/+8

They didn't even have public types, which made them pretty mysterious. Give them types and reference the Decoder, which uses them.

Also, refer them qualified by their package name in the examples, as we usually do in example*.go files, which usually use package foo_test specifically so we can show the package names along with the symbols.

Change-Id: I50ebbbf43778c1627bfa526f8824f52c7953454f
Reviewed-on: https://go-review.googlesource.com/127663
Reviewed-by: Bryan C. Mills <bcmills@google.com>
* encoding/xml: remove some primordial semicolons | Brad Fitzpatrick | 2018-08-02 | 1 | -2/+2

Change-Id: I23e5d87648a4091fb4f6616bf80aa6c800974900
Reviewed-on: https://go-review.googlesource.com/127662
Reviewed-by: Bryan C. Mills <bcmills@google.com>
* all: update comment URLs from HTTP to HTTPS, where possible | Tim Cooper | 2018-06-01 | 1 | -6/+6

Each URL was manually verified to ensure it did not serve up incorrect content.

Change-Id: I4dc846227af95a73ee9a3074d0c379ff0fa955df
Reviewed-on: https://go-review.googlesource.com/115798
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
* encoding/xml: fix valid character range | Artyom Pervukhin | 2018-05-09 | 1 | -1/+1

Section 2.2 of the referenced spec http://www.xml.com/axml/testaxml.htm defines 0xD7FF as a (sub)range boundary, not 0xDF77.

Fixes #25172

Change-Id: Ic5a3328cd46ef6474b8e93c4a343dcfba0e6511f
Reviewed-on: https://go-review.googlesource.com/109495
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
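The Char production in section 2.2 of the XML 1.0 specification can be written directly as a predicate. This is a standalone sketch with an illustrative name (isXMLChar), not a copy of the package's unexported code.

```go
package main

import "fmt"

// isXMLChar reports whether r is allowed by the XML 1.0 Char production:
//   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
func isXMLChar(r rune) bool {
	return r == 0x09 || r == 0x0A || r == 0x0D ||
		(r >= 0x20 && r <= 0xD7FF) ||
		(r >= 0xE000 && r <= 0xFFFD) ||
		(r >= 0x10000 && r <= 0x10FFFF)
}

func main() {
	// 0xD800 through 0xDFFF are surrogates and must be rejected, which is
	// why an upper bound of 0xDF77 would have been too permissive.
	for _, r := range []rune{0xD7FF, 0xD800, 0xDF77, 0xE000} {
		fmt.Printf("%#x valid=%v\n", r, isXMLChar(r))
	}
}
```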
* encoding/xml: remove unnecessary if conditions | tengufromsky | 2018-04-15 | 1 | -8/+4

Fixes gosimple warning "if err != nil { return err }; return nil' can be simplified to 'return err"

Change-Id: Ibbc717fb066ff41ab35c481b6d44980ac809ae09
Reviewed-on: https://go-review.googlesource.com/107018
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* all: remove duplicate word "the" | Ryuma Yoshida | 2018-02-20 | 1 | -1/+1

Change-Id: Ia5908e94a6bd362099ca3c63f6ffb7e94457131d
GitHub-Last-Rev: 545a40571a912f433546d8c94a9d63459313515d
GitHub-Pull-Request: golang/go#23942
Reviewed-on: https://go-review.googlesource.com/95435
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/xml: simplify slice-growing logic in rawToken | Alberto Donizetti | 2018-02-19 | 1 | -12/+2

It appears that old code (from 2009) in xml.(*Decoder).rawToken replicates append's slice-growing functionality by allocating a new, bigger backing array and then calling copy.

Simplifying the code by replacing it with a single append call does not seem to hurt performance:

    name         old time/op    new time/op    delta
    Marshal-4    11.2µs ± 1%    11.3µs ±10%    ~        (p=0.069 n=19+17)
    Unmarshal-4  28.6µs ± 1%    28.4µs ± 1%    -0.60%   (p=0.000 n=20+18)

    name         old alloc/op   new alloc/op   delta
    Marshal-4    5.78kB ± 0%    5.78kB ± 0%    ~        (all equal)
    Unmarshal-4  8.61kB ± 0%    8.27kB ± 0%    -3.90%   (p=0.000 n=20+20)

    name         old allocs/op  new allocs/op  delta
    Marshal-4    23.0 ± 0%      23.0 ± 0%      ~        (all equal)
    Unmarshal-4  189 ± 0%       190 ± 0%       +0.53%   (p=0.000 n=20+20)

Change-Id: Ie580d1216a44760e611e63dee2c339af5465aea5
Reviewed-on: https://go-review.googlesource.com/86655
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/xml: move unexported const out of exported const block | Russ Cox | 2017-11-15 | 1 | -1/+2

CL 58210 introduced this constant for reasons I don't understand. It should not be in the exported const block, which will pollute godoc output with a "... unexported" notice.

Also since we already have a constant named xmlnsPrefix for "xmlns", it is very confusing to also have xmlNamespacePrefix for "xml". If we must have the constant at all, rename it to xmlPrefix.

Change-Id: I15f937454d730005816fcd32b1acca703acf1e51
Reviewed-on: https://go-review.googlesource.com/78121
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/xml: don't panic when custom Unmarshaler sees StartElement | Sam Whited | 2017-10-30 | 1 | -3/+3

Change-Id: I90aa0a983abd0080f3de75d3340fdb15c1f9ca35
Reviewed-on: https://go-review.googlesource.com/70891
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Sam Whited <sam@samwhited.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
* all: revert "all: prefer strings.IndexByte over strings.Index" | Marvin Stenger | 2017-10-05 | 1 | -1/+1

This reverts https://golang.org/cl/65930.

Fixes #22148

Change-Id: Ie0712621ed89c43bef94417fc32de9af77607760
Reviewed-on: https://go-review.googlesource.com/68430
Reviewed-by: Ian Lance Taylor <iant@golang.org>
* all: prefer strings.IndexByte over strings.Index | Marvin Stenger | 2017-09-25 | 1 | -1/+1

strings.IndexByte was introduced in go1.2 and it can be used effectively wherever the second argument to strings.Index is exactly one byte long. This avoids generating unnecessary string symbols and saves a few calls to strings.Index.

Change-Id: I1ab5edb7c4ee9058084cfa57cbcc267c2597e793
Reviewed-on: https://go-review.googlesource.com/65930
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
* encoding/xml: add decode wrapper | Sam Whited | 2017-09-13 | 1 | -0/+37

Fixes #19480

Change-Id: I5a621507279d5bb1f3991b7a412d9a63039d464b
Reviewed-on: https://go-review.googlesource.com/38791
Run-TryBot: Sam Whited <sam@samwhited.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
* encoding/xml: improve package based on the suggestions from metalinter | Karel Pazdera | 2017-08-24 | 1 | -37/+43

Existing code in encoding/xml packages contains code which breaks various linter rules (comments, constant and variable naming, variable shadowing, etc).

Fixes #21578

Change-Id: Id4bd9a9be6d5728ce88fb6efe33030ef943c078c
Reviewed-on: https://go-review.googlesource.com/58210
Reviewed-by: Sam Whited <sam@samwhited.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Sam Whited <sam@samwhited.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
* all: single space after period. | Brad Fitzpatrick | 2016-03-02 | 1 | -3/+3

The tree's pretty inconsistent about single space vs double space after a period in documentation. Make it consistently a single space, per earlier decisions. This means contributors won't be confused by misleading precedence.

This CL doesn't use go/doc to parse. It only addresses // comments. It was generated with:

    $ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])')
    $ go test go/doc -update

Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
* all: remove public named return values when useless | Brad Fitzpatrick | 2016-02-29 | 1 | -5/+6

Named returned values should only be used on public funcs and methods when it contributes to the documentation.

Named return values should not be used if they're only saving the programmer a few lines of code inside the body of the function, especially if that means there's stutter in the documentation or it was only there so the programmer could use a naked return statement. (Naked returns should not be used except in very small functions)

This change is a manual audit & cleanup of public func signatures.

Signatures were not changed if:

* the func was private (wouldn't be in public godoc)
* the documentation referenced it
* the named return value was an interesting name. (i.e. it wasn't simply stutter, repeating the name of the type)

There should be no changes in behavior. (At least: none intended)

Change-Id: I3472ef49619678fe786e5e0994bdf2d9de76d109
Reviewed-on: https://go-review.googlesource.com/20024
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
* encoding/xml: update docs for Token | Russ Cox | 2016-01-24 | 1 | -1/+2

Fixes #13757.

Change-Id: I1b52593df8df0e98ce7342767eb34eccecc11761
Reviewed-on: https://go-review.googlesource.com/18854
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/xml: case-insensitive encoding recognition | Giulio Iotti | 2015-11-25 | 1 | -1/+1

From the XML spec: "XML processors should match character encoding names in a case-insensitive way"

Fixes #12417.

Change-Id: I678c50152a49c14364be62b3f21ab9b9b009b24b
Reviewed-on: https://go-review.googlesource.com/14084
Reviewed-by: Russ Cox <rsc@golang.org>
* encoding/xml: reject invalid comments | Michal Bohuslávek | 2015-11-25 | 1 | -1/+6

Fixes #11112.

Change-Id: I16e7363549a0dec8c61addfa14af0866c1fd7c40
Reviewed-on: https://go-review.googlesource.com/14173
Reviewed-by: Russ Cox <rsc@golang.org>
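The commit message does not say which comments are invalid. The XML specification forbids the sequence "--" inside a comment body, and the sketch below assumes that is the case being rejected; the helper parse is illustrative.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// parse reads every token in doc and returns the first error other
// than io.EOF. (Hypothetical helper.)
func parse(doc string) error {
	d := xml.NewDecoder(strings.NewReader(doc))
	for {
		_, err := d.Token()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	fmt.Println(parse(`<doc><!-- a - b --></doc>`))  // well-formed comment
	fmt.Println(parse(`<doc><!-- a -- b --></doc>`)) // "--" inside the comment
}
```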
* encoding/xml: Add CDATA-wrapper output support to xml.Marshal. | Charles Weill | 2015-11-25 | 1 | -0/+40

Fixes #12963

Change-Id: Icc50dfb6130fe1e189d45f923c2f7408d3cf9401
Reviewed-on: https://go-review.googlesource.com/16047
Reviewed-by: Russ Cox <rsc@golang.org>
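In released versions of encoding/xml this feature is exposed through the ",cdata" struct tag option. A minimal sketch follows; the note type and marshalNote helper are invented for the example.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// note is a hypothetical type whose body is written as a CDATA section.
type note struct {
	XMLName xml.Name `xml:"note"`
	Body    string   `xml:",cdata"`
}

// marshalNote marshals a note with the given body text.
func marshalNote(body string) (string, error) {
	b, err := xml.Marshal(note{Body: body})
	return string(b), err
}

func main() {
	out, err := marshalNote("a < b")
	fmt.Println(out, err)
}
```

The body text is wrapped in a CDATA section rather than being entity-escaped.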
* encoding/xml: Return SyntaxError for unmatched root start elements. | Robert Stepanek | 2015-09-10 | 1 | -0/+3

Currently, the xml.Decoder's Token routine returns successfully for XML input that does not properly close root start elements (and any unclosed descendants). For example, all the following inputs

    <root>
    <root><foo>
    <root><foo></foo>

cause Token to return with nil and io.EOF, indicating a successful parse.

This change fixes that. It leaves the semantics of RawToken intact.

Fixes #11405

Change-Id: I6f1328c410cf41e17de0a93cf357a69f12c2a9f7
Reviewed-on: https://go-review.googlesource.com/14315
Reviewed-by: Nigel Tao <nigeltao@golang.org>
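The three inputs listed in this commit can be checked with a short program. This is a sketch assuming a Go release that includes the change; drain and isSyntaxError are hypothetical helper names.

```go
package main

import (
	"encoding/xml"
	"errors"
	"fmt"
	"io"
	"strings"
)

// drain reads tokens until the input ends and returns the first error
// other than io.EOF. (Hypothetical helper.)
func drain(doc string) error {
	d := xml.NewDecoder(strings.NewReader(doc))
	for {
		_, err := d.Token()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

// isSyntaxError reports whether err is (or wraps) an *xml.SyntaxError.
func isSyntaxError(err error) bool {
	var se *xml.SyntaxError
	return errors.As(err, &se)
}

func main() {
	for _, doc := range []string{"<root>", "<root><foo>", "<root><foo></foo>", "<root/>"} {
		err := drain(doc)
		fmt.Printf("%-20s syntaxError=%v\n", doc, isSyntaxError(err))
	}
}
```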
* encoding/xml: restore Go 1.4 name space behavior | Russ Cox | 2015-07-27 | 1 | -38/+5

There is clearly work to do here with respect to xml name spaces, but I don't believe the changes in this cycle are clearly correct. The changes in this cycle have visible impact on the generated xml, possibly breaking existing programs, and yet it's not clear that they are the end of the story: there is still significant confusion about how name spaces work or should work (see #9519, #9775, #8167, #7113).

I would like to wait to make breaking changes until we completely understand what the behavior should be and can evaluate the benefit of those breaking changes. My main concern here is that we will break programs in Go 1.5 for the sake of name space adjustments and then while trying to fix those other bugs we'll break programs in Go 1.6 too. Let's wait until we know all the changes we want to make before we decide whether or how to break existing programs.

This CL reverts:

    5ae822b encoding/xml: minor changes
    bb7e665 encoding/xml: fix xmlns= behavior
    9f9d66d encoding/xml: fix default namespace of tags
    b69ea01 encoding/xml: fix namespaces in a>b tags
    3be158d encoding/xml: encoding name spaces correctly

and adjusts tests from

    a9dddb5 encoding/xml: add more EncodeToken tests.

to expect Go 1.4 behavior.

I have confirmed that the name space parts of the test suite as of this CL passes against the Go 1.4 encoding/xml package, indicating that this CL successfully restores the Go 1.4 behavior. (Other tests do not, but that's because there were some real bug fixes in this cycle that are being kept. Specifically, the tests that don't pass in Go 1.4 are TestMarshal's NestedAndComment case, TestEncodeToken's encoding of newlines, and TestSimpleUseOfEncodeToken returning an error for invalid token types.)

I also checked that the Go 1.4 tests pass when run against this copy of the sources.

Fixes #11841.

Change-Id: I97de06761038b40388ef6e3a55547ff43edee7cb
Reviewed-on: https://go-review.googlesource.com/12570
Reviewed-by: Nigel Tao <nigeltao@golang.org>
* encoding/xml: fix xmlns= behavior | Roger Peppe | 2015-06-30 | 1 | -0/+6

When an xmlns="..." attribute was explicitly generated, it was being ignored because the name space on the attribute was assumed to have been explicitly set (to the empty name space) and it's not possible to have an element in the empty name space when there is a non-empty name space set.

We fix this by recording when a default name space has been explicitly set and setting the name space of the element to that so printer.defineNS can do its work correctly.

We do not attempt to add our own xmlns="..." attribute when one is explicitly set.

We also add tests for EncodeElement, as that's the only way to attain coverage of some of the changed behaviour. Some other test coverage is also increased, although more work remains to be done in this area.

This change was jointly developed with Martin Hilton (mhilton on github).

Fixes #11431.

Change-Id: I7b85e06eea5b18b2c15ec16dcbd92a8e1d6a9a4e
Reviewed-on: https://go-review.googlesource.com/11635
Reviewed-by: Russ Cox <rsc@golang.org>
* xml: add check of version in document declaration | Giulio Iotti | 2015-06-18 | 1 | -6/+12

Check that if a version is declared, for example in '<?xml version="XX" ?>', version must be '1.0'.

Change-Id: I16ba9f78873a5f31977dcf75ac8e671fe6c08280
Reviewed-on: https://go-review.googlesource.com/8961
Reviewed-by: Russ Cox <rsc@golang.org>
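A short sketch of the version check as seen by a caller, assuming a Go release that includes this change. The helper decodeWithVersion is hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// decodeWithVersion unmarshals a tiny document that declares the given
// XML version. (Hypothetical helper.)
func decodeWithVersion(version string) error {
	doc := `<?xml version="` + version + `"?><doc/>`
	var v struct{}
	return xml.Unmarshal([]byte(doc), &v)
}

func main() {
	fmt.Println("1.0:", decodeWithVersion("1.0"))
	fmt.Println("1.1:", decodeWithVersion("1.1"))
}
```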
* encoding/xml: do not escape newlines | Roger Peppe | 2015-04-27 | 1 | -0/+10

There is no need to escape newlines in char data - it makes the XML larger and harder to read.

Change-Id: I1c1fcee1bdffc705c7428f89ca90af8085d6fb73
Reviewed-on: https://go-review.googlesource.com/9310
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
* encoding/xml: encoding name spaces correctly | Roger Peppe | 2015-02-13 | 1 | -5/+32

The current XML printer does not understand the xmlns attribute. This change changes it so that it interprets the xmlns attributes in the tokens being printed, and uses appropriate prefixes.

Fixes #7535.

Change-Id: I20fae291d20602d37deb41ed42fab4c9a50ec85d
Reviewed-on: https://go-review.googlesource.com/2660
Reviewed-by: Nigel Tao <nigeltao@golang.org>
* encoding/xml: avoid an allocation for tags without attributes | Brian Smith | 2015-02-07 | 1 | -2/+6

Before, an array of size 4 would always be allocated even if a tag doesn't have any attributes. Now that array is allocated only if needed.

    benchmark           old allocs  new allocs  delta
    BenchmarkUnmarshal  191         176         -8.5%

Change-Id: I4d214b228883d0a6e892c0d6eb00dfe2da84c116
Reviewed-on: https://go-review.googlesource.com/4160
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/xml: remove unnecessary memory allocation in Unmarshal | Dmitry Vyukov | 2015-01-15 | 1 | -4/+4

    benchmark           old ns/op  new ns/op  delta
    BenchmarkUnmarshal  75256      72626      -3.49%

    benchmark           old allocs  new allocs  delta
    BenchmarkUnmarshal  259         219         -15.44%

Change-Id: I7fd30739b045e35b95e6ef6a8ef2f15b0dd6839c
Reviewed-on: https://go-review.googlesource.com/2758
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/xml: remove SyntaxError.Byte | Russ Cox | 2014-12-05 | 1 | -1/+0

It is unused. It was introduced in the CL that added InputOffset. I suspect it was an editing mistake.

LGTM=bradfitz
R=bradfitz
CC=golang-codereviews
https://golang.org/cl/182580043
* build: move package sources from src/pkg to src | Russ Cox | 2014-09-08 | 1 | -0/+1946

Preparation was in CL 134570043. This CL contains only the effect of 'hg mv src/pkg/* src'. For more about the move, see golang.org/s/go14nopkg.