summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorJeremy Evans <code@jeremyevans.net>2020-01-15 12:27:18 -0800
committerSamuel Williams <samuel.williams@oriontransfer.co.nz>2020-01-24 10:03:20 +1300
commitecdf6fb41996892fcaa01e25452b8647450672ff (patch)
tree3766544f683a60924b51c3e3e1e46c270266c658
parent73b61551ae8093bf3ffaf389045187c349bc12aa (diff)
downloadrack-ecdf6fb41996892fcaa01e25452b8647450672ff.tar.gz
Update SPEC and Lint to require ASCII-8BIT encoding for CGI values with non-ASCII characters
Fixes warning when binary Regexp is used to match against a CGI value (QUERY_STRING). ASCII-8BIT encoding is already required for response bodies. Requiring this encoding for all CGI values would break usage with puma and potentially other rack servers. So it seems safest to only require ASCII-8BIT encoding when the value actually contains non-ASCII. We could further limit it to just QUERY_STRING, but I'm not sure we want to do that. Testing with a few rack servers: * Unicorn always uses ASCII-8BIT for all CGI values. * Puma will use UTF-8 for some values, but uses ASCII-8BIT for CGI values contianing non-ASCII characters. * Webrick doesn't support non-ASCII chracters, at least for PATH_INFO or QUERY_STRING. If we don't do this, the only way to avoid the warning would be to duplicate QUERY_STRING if it is not already binary. Checking encoding will slow down parsing even if it already uses ASCII-8BIT encoding. Fixes #1183
-rw-r--r--SPEC2
-rw-r--r--lib/rack/lint.rb6
-rw-r--r--test/spec_lint.rb14
3 files changed, 22 insertions, 0 deletions
diff --git a/SPEC b/SPEC
index 59ec9d72..3efc2f7c 100644
--- a/SPEC
+++ b/SPEC
@@ -122,6 +122,8 @@ The environment must not contain the keys
<tt>HTTP_CONTENT_TYPE</tt> or <tt>HTTP_CONTENT_LENGTH</tt>
(use the versions without <tt>HTTP_</tt>).
The CGI keys (named without a period) must have String values.
+If the string values for CGI keys contain non-ASCII characters,
+they should use ASCII-8BIT encoding.
There are the following restrictions:
* <tt>rack.version</tt> must be an array of Integers.
* <tt>rack.url_scheme</tt> must either be +http+ or +https+.
diff --git a/lib/rack/lint.rb b/lib/rack/lint.rb
index 20491c36..a5365e57 100644
--- a/lib/rack/lint.rb
+++ b/lib/rack/lint.rb
@@ -278,11 +278,17 @@ module Rack
}
## The CGI keys (named without a period) must have String values.
+ ## If the string values for CGI keys contain non-ASCII characters,
+ ## they should use ASCII-8BIT encoding.
env.each { |key, value|
next if key.include? "." # Skip extensions
assert("env variable #{key} has non-string value #{value.inspect}") {
value.kind_of? String
}
+ next if value.encoding == Encoding::ASCII_8BIT
+ assert("env variable #{key} has value containing non-ASCII characters and has non-ASCII-8BIT encoding #{value.inspect} encoding: #{value.encoding}") {
+ value.b !~ /[\x80-\xff]/n
+ }
}
## There are the following restrictions:
diff --git a/test/spec_lint.rb b/test/spec_lint.rb
index 82de1c1d..66339c68 100644
--- a/test/spec_lint.rb
+++ b/test/spec_lint.rb
@@ -114,6 +114,20 @@ describe Rack::Lint do
message.must_match(/Invalid CONTENT_LENGTH/)
lambda {
+ Rack::Lint.new(nil).call(env("QUERY_STRING" => nil))
+ }.must_raise(Rack::Lint::LintError).
+ message.must_include('env variable QUERY_STRING has non-string value nil')
+
+ lambda {
+ Rack::Lint.new(nil).call(env("QUERY_STRING" => "\u1234"))
+ }.must_raise(Rack::Lint::LintError).
+ message.must_include('env variable QUERY_STRING has value containing non-ASCII characters and has non-ASCII-8BIT encoding')
+
+ Rack::Lint.new(lambda { |env|
+ [200, {}, []]
+ }).call(env("QUERY_STRING" => "\u1234".b)).first.must_equal 200
+
+ lambda {
e = env
e.delete("PATH_INFO")
e.delete("SCRIPT_NAME")