Unicode handling in header location #110

@noraj

WEBrick doesn't handle Unicode in the HTTP Location header, e.g. when redirecting to a URL like http://dxczjjuegupb.cloudfront.net/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg.

[2023-02-17 16:41:33] ERROR URI::InvalidURIError: URI must be ascii only "http://dxczjjuegupb.cloudfront.net/wp-content/uploads/2017/10/\u041E\u0443\u044D\u043D-\u041C\u044D\u0442\u044C\u044E\u0441.jpg"                                                                                                              
        /usr/local/lib/ruby/3.2.0/uri/rfc3986_parser.rb:20:in `split'                                                                                                                                                
        /usr/local/lib/ruby/3.2.0/uri/rfc3986_parser.rb:71:in `parse'                                                                                                                                                
        /usr/local/lib/ruby/3.2.0/uri/rfc3986_parser.rb:111:in `convert_to_uri'                                                                                                                                      
        /usr/local/lib/ruby/3.2.0/uri/generic.rb:1110:in `merge'                                                                                                                                                     
        /usr/local/bundle/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:320:in `setup_header'                                                                                                                       
        /usr/local/bundle/gems/webrick-1.8.1/lib/webrick/httpresponse.rb:240:in `send_response'                                                                                                                      
        /usr/local/bundle/gems/webrick-1.8.1/lib/webrick/httpserver.rb:112:in `run'                                                                                                                                  
        /usr/local/bundle/gems/webrick-1.8.1/lib/webrick/server.rb:310:in `block in start_thread'

The following code is responsible:

@header['location'] = @request_uri.merge(location).to_s

This is because methods such as URI.parse or, as here, URI#merge only accept ASCII input.

require 'uri'

uri = URI.parse('http://dxczjjuegupb.cloudfront.net')
# Merging a non-ASCII relative reference raises URI::InvalidURIError
uri.merge('/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg').to_s
/home/noraj/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/uri/rfc3986_parser.rb:20:in `split': URI must be ascii only "/wp-content/uploads/2017/10/\u041E\u0443\u044D\u043D-\u041C\u044D\u0442\u044C\u044E\u0441.jpg" (URI::InvalidURIError)                                                                                                                                                                      
        from /home/noraj/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/uri/rfc3986_parser.rb:71:in `parse'                                                                                   
        from /home/noraj/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/uri/rfc3986_parser.rb:111:in `convert_to_uri'                                                                         
        from /home/noraj/.asdf/installs/ruby/3.2.0/lib/ruby/3.2.0/uri/generic.rb:1110:in `merge'                                                                                        
        from (irb):9:in `<main>'                                                                                                                                                        
        from /home/noraj/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/irb-1.6.2/exe/irb:11:in `<top (required)>'                                                                  
        from /home/noraj/.asdf/installs/ruby/3.2.0/bin/irb:25:in `load'                                                                                                                 
        from /home/noraj/.asdf/installs/ruby/3.2.0/bin/irb:25:in `<main>'
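
The same failure can be reproduced in a standalone script. This is a minimal sketch, assuming only Ruby's bundled uri library; the variable names (`base`, `location`, `merge_failed`) are illustrative and do not come from WEBrick.

```ruby
require 'uri'

base     = URI.parse('http://dxczjjuegupb.cloudfront.net')
location = '/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg'

begin
  base.merge(location)
  merge_failed = false
rescue URI::InvalidURIError => e
  # The parser rejects the string before any merging takes place.
  merge_failed = true
  puts "merge failed: #{e.message}"
end

# String#ascii_only? is a cheap way to tell in advance whether the
# location would be rejected for this reason.
puts "ascii_only? => #{location.ascii_only?}"
```
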

So the URL or its components should be escaped first: CGI.escape for a single URL component, and URI::Parser.new.escape for a full URL or path.
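
The difference between the two escapers can be sketched as follows. This is only an illustration, assuming Ruby's bundled cgi and uri libraries; the variable names are made up for the example, and `URI::RFC2396_Parser` is used explicitly because it is the class that `URI::Parser` refers to on Ruby 3.2.

```ruby
require 'cgi'
require 'uri'

component = 'Оуэн-Мэтьюс.jpg'
path      = "/wp-content/uploads/2017/10/#{component}"

# CGI.escape is meant for a single component: it percent-encodes
# everything outside a small safe set, including "/".
escaped_component = CGI.escape(component)
slashes_encoded   = CGI.escape(path)

# The RFC 2396 parser's escape keeps URL delimiters such as "/" and only
# encodes characters that are not allowed in a URI.
escaped_path = URI::RFC2396_Parser.new.escape(path)

puts escaped_component # %D0%9E%D1%83%D1%8D%D0%BD-%D0%9C%D1%8D%D1%82%D1%8C%D1%8E%D1%81.jpg
puts slashes_encoded   # starts with %2Fwp-content%2Fuploads%2F...
puts escaped_path      # /wp-content/uploads/2017/10/%D0%9E%D1%83%D1%8D%D0%BD-%D0%9C%D1%8D%D1%82%D1%8C%D1%8E%D1%81.jpg
```
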

Examples in https://github.com/noraj/ctf-party/blob/master/lib/ctf_party/cgi.rb.

cf. https://stackoverflow.com/questions/46849219/ruby-uriinvalidurierror-uri-must-be-ascii-only/75487328

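Patched code, escaping the whole path with URI::Parser.new.escape (CGI.escape would also encode the slashes as %2F, so it only suits a single component):

uri.merge(URI::Parser.new.escape('/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg')).to_s
# => "http://dxczjjuegupb.cloudfront.net/wp-content/uploads/2017/10/%D0%9E%D1%83%D1%8D%D0%BD-%D0%9C%D1%8D%D1%82%D1%8C%D1%8E%D1%81.jpg"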
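
One way the merge in `setup_header` could be guarded is sketched below. This is not WEBrick's code and not a proposed patch as such: `absolute_location` is a hypothetical helper, and it deliberately percent-encodes only the non-ASCII characters (a slightly different technique from calling the parser's `escape`), on the assumption that any `%XX` sequences already present in the location should be left untouched rather than double-encoded.

```ruby
require 'uri'

# Hypothetical helper (not part of WEBrick): resolve a Location value
# against the request URI, percent-encoding non-ASCII characters first.
def absolute_location(request_uri, location)
  location = location.to_s
  unless location.ascii_only?
    # Encode each non-ASCII character as the %XX form of its bytes;
    # ASCII characters, including existing %XX escapes, are left alone.
    location = location.gsub(/[^[:ascii:]]/) do |ch|
      ch.bytes.map { |b| format('%%%02X', b) }.join
    end
  end
  request_uri.merge(location).to_s
end

request_uri = URI.parse('http://dxczjjuegupb.cloudfront.net/old')
puts absolute_location(request_uri, '/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg')
```

Locations that are already pure ASCII take the same path through `merge` as before, so existing redirects would behave as they do today under this sketch.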
