Preface

These notes summarize what RFC 7230 says about parsing HTTP header fields.

I. Header Fields

https://tools.ietf.org/html/rfc7230#section-3.2.1

  • ":" separates the field name from the field value
  • OWS = *( SP / HTAB )
     header-field   = field-name ":" OWS field-value OWS

     field-name     = token
     field-value    = *( field-content / obs-fold )
     field-content  = field-vchar [ 1*( SP / HTAB ) field-vchar ]
     field-vchar    = VCHAR / obs-text

     obs-fold       = CRLF 1*( SP / HTAB )
                    ; obsolete line folding
                    ; see Section 3.2.4
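
As a rough illustration of the header-field rule above, the following minimal sketch splits one header line at the first ":" and trims the optional whitespace (OWS) around the value. The function name `split_header_field` is hypothetical, and the sketch assumes the line has already been decoded to `str` and stripped of its CRLF; it does not validate the field name.

```python
OWS = " \t"  # optional whitespace: SP / HTAB


def split_header_field(line: str) -> tuple[str, str]:
    """Split 'name: value' at the first colon and trim OWS around the value."""
    name, sep, value = line.partition(":")
    if not sep:
        raise ValueError("missing ':' separator")
    # Only SP and HTAB count as OWS, so strip exactly those two characters.
    return name, value.strip(OWS)
```

Splitting at the first colon matters because a field value may itself contain ":" (for example a URL or a time).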

1. Parsing Header Fields

https://tools.ietf.org/html/rfc7230#section-3.2.4

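  • No whitespace is allowed between the field name and the ":". A server that receives a request containing such whitespace MUST reject it with 400 (Bad Request); this kind of whitespace has led to security vulnerabilities (see https://www.mozilla.org/en-US/security/advisories/mfsa2006-33/).
  • The field value may be preceded or followed by optional whitespace (OWS), which is not part of the value.
  • Historically, a field value could be extended over multiple lines (obs-fold). RFC 7230 deprecates this form. A server that receives an obs-fold in a request either rejects it with 400 (Bad Request) or replaces each obs-fold with one or more SP octets before interpreting the value.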
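
A minimal sketch of these checks on the server side, assuming the header section arrives as raw bytes without the terminating blank line. `BadRequest` and `parse_header_block` are hypothetical names, and the sketch takes the stricter option of rejecting obs-fold with 400 rather than replacing it with spaces.

```python
class BadRequest(Exception):
    """Signals that the request should be answered with 400 (Bad Request)."""
    status = 400


def parse_header_block(block: bytes) -> list[tuple[bytes, bytes]]:
    """Parse CRLF-separated header lines into (name, value) pairs."""
    fields = []
    for line in block.split(b"\r\n"):
        if not line:
            continue  # tolerate a trailing CRLF at the end of the block
        if line[:1] in (b" ", b"\t"):
            # A line starting with SP/HTAB is an obs-fold continuation.
            raise BadRequest("obsolete line folding")
        name, sep, value = line.partition(b":")
        if not sep or not name:
            raise BadRequest("malformed header line")
        if name[-1:] in (b" ", b"\t"):
            raise BadRequest("whitespace between field name and colon")
        fields.append((name, value.strip(b" \t")))
    return fields
```

For example, `parse_header_block(b"Host : a")` raises `BadRequest`, while `b"Host: a"` parses normally. Full validation of the field name against the token rule is left out here.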

2. Field Value Components

https://tools.ietf.org/html/rfc7230#section-3.2.6

Most HTTP header field values are defined using common syntax components (token, quoted-string, and comment) separated by whitespace or specific delimiting characters. Delimiters are chosen from the set of US-ASCII visual characters not allowed in a token (DQUOTE and "(),/:;<=>?@[\]{}").

 token          = 1*tchar

 tchar          = "!" / "#" / "$" / "%" / "&" / "'" / "*"
                / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
                / DIGIT / ALPHA
                ; any VCHAR, except delimiters
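
The tchar rule translates almost directly into a character set. A minimal sketch, with `TCHAR` and `is_token` as hypothetical names:

```python
import string

# tchar: the listed punctuation plus DIGIT and ALPHA
TCHAR = frozenset("!#$%&'*+-.^_`|~" + string.digits + string.ascii_letters)


def is_token(s: str) -> bool:
    """Return True if s matches token = 1*tchar (one or more tchar)."""
    return len(s) > 0 and all(c in TCHAR for c in s)
```

Since field-name = token, the same check can be used to validate header field names such as `Content-Type`.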

A string of text is parsed as a single value if it is quoted using double-quote marks.

 quoted-string  = DQUOTE *( qdtext / quoted-pair ) DQUOTE
 qdtext         = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text
 obs-text       = %x80-FF

Comments can be included in some HTTP header fields by surrounding the comment text with parentheses. Comments are only allowed in fields containing "comment" as part of their field value definition.

 comment        = "(" *( ctext / quoted-pair / comment ) ")"
 ctext          = HTAB / SP / %x21-27 / %x2A-5B / %x5D-7E / obs-text
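
Because a comment may contain nested comments, a scanner has to track parenthesis depth and skip over backslash-escaped octets. The following is a minimal sketch with a hypothetical function name `skip_comment`; it only finds where the comment ends and does not check that each character is valid ctext.

```python
def skip_comment(s: str, start: int = 0) -> int:
    """Return the index just past the comment that opens at s[start]."""
    if start >= len(s) or s[start] != "(":
        raise ValueError("comment must start with '('")
    depth = 0
    i = start
    while i < len(s):
        c = s[i]
        if c == "\\":
            i += 1  # backslash escape: skip the escaped octet as well
        elif c == "(":
            depth += 1  # opening of a (possibly nested) comment
        elif c == ")":
            depth -= 1
            if depth == 0:
                return i + 1  # outermost comment closed
        i += 1
    raise ValueError("unterminated comment")
```

For the input `"(a (b) c) rest"` the function returns 9, so `s[:9]` is the whole comment `(a (b) c)`.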

The backslash octet ("\") can be used as a single-octet quoting mechanism within quoted-string and comment constructs. Recipients that process the value of a quoted-string MUST handle a quoted-pair as if it were replaced by the octet following the backslash.

 quoted-pair    = "\" ( HTAB / SP / VCHAR / obs-text )
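
On the receiving side, processing a quoted-string means dropping the surrounding DQUOTEs and replacing each backslash escape with the character that follows the backslash. A minimal sketch, with `unquote` as a hypothetical name; it does not check that the remaining characters fall within the qdtext ranges.

```python
def unquote(qs: str) -> str:
    """Return the value of a quoted-string with backslash escapes resolved."""
    if len(qs) < 2 or qs[0] != '"' or qs[-1] != '"':
        raise ValueError("not a quoted-string")
    out = []
    i, end = 1, len(qs) - 1  # scan only the text between the two DQUOTEs
    while i < end:
        c = qs[i]
        if c == "\\":
            i += 1
            if i >= end:
                # the backslash escaped the closing DQUOTE: string is unterminated
                raise ValueError("dangling backslash")
            out.append(qs[i])  # keep the escaped character literally
        elif c == '"':
            raise ValueError("unescaped DQUOTE inside quoted-string")
        else:
            out.append(c)
        i += 1
    return "".join(out)
```

For example, the quoted-string `"a\"b"` yields the three-character value `a"b`.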

A sender SHOULD NOT generate a quoted-pair in a quoted-string except where necessary to quote DQUOTE and backslash octets occurring within that string. A sender SHOULD NOT generate a quoted-pair in a comment except where necessary to quote parentheses ["(" and ")"] and backslash octets occurring within that comment.
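
On the sending side, this rule suggests escaping only DQUOTE and backslash when producing a quoted-string. A minimal sketch, with `quote` as a hypothetical name; it assumes the input already contains only characters that are legal inside a quoted-string.

```python
def quote(value: str) -> str:
    """Wrap value in DQUOTEs, escaping only backslash and DQUOTE."""
    # Escape backslashes first so the backslashes added for DQUOTE
    # are not themselves doubled.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'
```

The order of the two replacements matters: escaping DQUOTE first would insert backslashes that the second pass would then double.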