diff options
author | Jeremy Evans <code@jeremyevans.net> | 2020-01-15 12:27:18 -0800 |
---|---|---|
committer | Samuel Williams <samuel.williams@oriontransfer.co.nz> | 2020-01-24 10:03:20 +1300 |
commit | ecdf6fb41996892fcaa01e25452b8647450672ff (patch) | |
tree | 3766544f683a60924b51c3e3e1e46c270266c658 | |
parent | 73b61551ae8093bf3ffaf389045187c349bc12aa (diff) | |
download | rack-ecdf6fb41996892fcaa01e25452b8647450672ff.tar.gz |
Update SPEC and Lint to require ASCII-8BIT encoding for CGI values with non-ASCII characters
Fixes warning when binary Regexp is used to match against a
CGI value (QUERY_STRING).
ASCII-8BIT encoding is already required for response bodies.
Requiring this encoding for all CGI values would break usage
with puma and potentially other rack servers. So it seems
safest to only require ASCII-8BIT encoding when the value
actually contains non-ASCII. We could further limit it to
just QUERY_STRING, but I'm not sure we want to do that.
Testing with a few rack servers:
* Unicorn always uses ASCII-8BIT for all CGI values.
* Puma will use UTF-8 for some values, but uses ASCII-8BIT
for CGI values contianing non-ASCII characters.
* Webrick doesn't support non-ASCII chracters, at least for
PATH_INFO or QUERY_STRING.
If we don't do this, the only way to avoid the warning would be
to duplicate QUERY_STRING if it is not already binary. Checking
encoding will slow down parsing even if it already uses
ASCII-8BIT encoding.
Fixes #1183
-rw-r--r-- | SPEC | 2 | ||||
-rw-r--r-- | lib/rack/lint.rb | 6 | ||||
-rw-r--r-- | test/spec_lint.rb | 14 |
3 files changed, 22 insertions, 0 deletions
@@ -122,6 +122,8 @@ The environment must not contain the keys <tt>HTTP_CONTENT_TYPE</tt> or <tt>HTTP_CONTENT_LENGTH</tt> (use the versions without <tt>HTTP_</tt>). The CGI keys (named without a period) must have String values. +If the string values for CGI keys contain non-ASCII characters, +they should use ASCII-8BIT encoding. There are the following restrictions: * <tt>rack.version</tt> must be an array of Integers. * <tt>rack.url_scheme</tt> must either be +http+ or +https+. diff --git a/lib/rack/lint.rb b/lib/rack/lint.rb index 20491c36..a5365e57 100644 --- a/lib/rack/lint.rb +++ b/lib/rack/lint.rb @@ -278,11 +278,17 @@ module Rack } ## The CGI keys (named without a period) must have String values. + ## If the string values for CGI keys contain non-ASCII characters, + ## they should use ASCII-8BIT encoding. env.each { |key, value| next if key.include? "." # Skip extensions assert("env variable #{key} has non-string value #{value.inspect}") { value.kind_of? String } + next if value.encoding == Encoding::ASCII_8BIT + assert("env variable #{key} has value containing non-ASCII characters and has non-ASCII-8BIT encoding #{value.inspect} encoding: #{value.encoding}") { + value.b !~ /[\x80-\xff]/n + } } ## There are the following restrictions: diff --git a/test/spec_lint.rb b/test/spec_lint.rb index 82de1c1d..66339c68 100644 --- a/test/spec_lint.rb +++ b/test/spec_lint.rb @@ -114,6 +114,20 @@ describe Rack::Lint do message.must_match(/Invalid CONTENT_LENGTH/) lambda { + Rack::Lint.new(nil).call(env("QUERY_STRING" => nil)) + }.must_raise(Rack::Lint::LintError). + message.must_include('env variable QUERY_STRING has non-string value nil') + + lambda { + Rack::Lint.new(nil).call(env("QUERY_STRING" => "\u1234")) + }.must_raise(Rack::Lint::LintError). + message.must_include('env variable QUERY_STRING has value containing non-ASCII characters and has non-ASCII-8BIT encoding') + + Rack::Lint.new(lambda { |env| + [200, {}, []] + }).call(env("QUERY_STRING" => "\u1234".b)).first.must_equal 200 + + lambda { e = env e.delete("PATH_INFO") e.delete("SCRIPT_NAME") |