Allowed characters in cookies


This one's a quickie:

What are the allowed characters in both cookie name and value? Are they the same as in a URL, or some common subset?

The reason I'm asking is that I've hit some strange behavior with cookies that have - in their name, and I'm wondering if it's something browser specific or if my code is faulty.

this one's a quickie:

You might think it should be, but really it's not at all!

What are the allowed characters in both cookie name and value?

According to the ancient Netscape cookie_spec, the entire NAME=VALUE string is:

a sequence of characters excluding semi-colon, comma and white space.

So - should work, and it does seem to be OK in the browsers I've got here; where are you having trouble with it?

By implication of the above:

  • = is legal to include, but potentially ambiguous. Browsers always split the name and value on the first = symbol in the string, so in practice you can put an = symbol in the VALUE but not the NAME.
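To illustrate the split-on-first-= rule, here is a minimal Python sketch. The function name split_on_first_equals is made up for this example, and it only covers strings that actually contain an =.

```python
def split_on_first_equals(pair):
    """Split a NAME=VALUE string the way the text describes browsers doing it.

    Hypothetical helper: str.partition cuts at the first '=' only, so any
    later '=' characters stay inside the value.
    """
    name, _, value = pair.partition("=")
    return name, value


print(split_on_first_equals("token=a=b=c"))  # ('token', 'a=b=c')
```

An = in the name cannot survive this: "a=b=c" always comes back as name "a" and value "b=c".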

What isn't mentioned, because Netscape were terrible at writing specs, but seems to be consistently supported by browsers:

  • either the NAME or the VALUE may be empty strings

  • if there is no = symbol in the string at all, browsers treat it as the cookie with the empty-string name, ie Set-Cookie: foo is the same as Set-Cookie: =foo.

  • when browsers output a cookie with an empty name, they omit the equals sign. So Set-Cookie: =bar begets Cookie: bar.

  • commas and spaces in names and values do actually seem to work, though spaces around the equals sign are trimmed

  • control characters (\x00 to \x1F plus \x7F) aren't allowed
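The behaviours in this list can be mimicked in a few lines of Python. This is only a sketch of what is described above, not a copy of any browser's parser; parse_pair and serialize_pair are hypothetical names, and raising on control characters is one possible way to treat them as "not allowed".

```python
import re

# \x00 to \x1F plus \x7F, as listed above.
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def parse_pair(pair):
    """Hypothetical helper: turn one NAME=VALUE string into (name, value)."""
    if CONTROL_CHARS.search(pair):
        raise ValueError("control characters are not allowed")
    if "=" in pair:
        name, _, value = pair.partition("=")
    else:
        # No '=' at all: treated as a cookie with the empty-string name.
        name, value = "", pair
    # Simplification: trim spaces at both ends of name and value, which
    # also covers the spaces around the equals sign.
    return name.strip(" "), value.strip(" ")


def serialize_pair(name, value):
    """Hypothetical helper: an empty name is sent without the equals sign."""
    return f"{name}={value}" if name else value
```

With this, parse_pair("foo") and parse_pair("=foo") both give ("", "foo"), and serialize_pair("", "bar") gives "bar".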

What isn't mentioned, and what browsers are totally inconsistent about, is non-ASCII (Unicode) characters:

  • in Opera and Google Chrome, they are encoded to Cookie headers with UTF-8;
  • in IE, the machine's default code page is used (locale-specific and never UTF-8);
  • Firefox (and other Mozilla-based browsers) use the low byte of each UTF-16 code point on its own (so ISO-8859-1 is OK but anything else is mangled);
  • Safari simply refuses to send any cookie containing non-ASCII characters.
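To see how far apart those strategies end up, here is a rough Python sketch that produces the bytes each one would give for the same two characters. It assumes cp1252 as a stand-in for "the machine's default code page" (that varies by locale), and the low-byte line only models characters in the Basic Multilingual Plane.

```python
text = "é€"  # U+00E9 and U+20AC

# UTF-8, as described for Opera and Chrome.
utf8 = text.encode("utf-8")

# One possible IE result, assuming a Western European Windows code page.
cp1252 = text.encode("cp1252")

# Low byte of each UTF-16 code unit, as described for Firefox.
low_bytes = bytes(ord(ch) & 0xFF for ch in text)

print(utf8.hex())       # c3a9e282ac
print(cp1252.hex())     # e980
print(low_bytes.hex())  # e9ac
```

Three different byte strings for the same input, and only the é survives the low-byte treatment in a form ISO-8859-1 can read back.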

So in practice you cannot use non-ASCII characters in cookies at all. If you want to use Unicode, control codes or other arbitrary byte sequences, the cookie_spec demands you use an ad-hoc encoding scheme of your own choosing, and suggests URL-encoding (as produced by JavaScript's encodeURIComponent) as a reasonable choice.
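For a server-side equivalent, a minimal Python sketch of that URL-encoding approach could look like this. The function names are hypothetical. Note that urllib.parse.quote with safe="" escapes a few more characters than encodeURIComponent does (such as ! ' ( ) *), but either output decodes the same way.

```python
from urllib.parse import quote, unquote


def encode_cookie_value(raw):
    """Hypothetical helper: percent-encode a value as UTF-8 bytes."""
    # safe="" means even '/' is escaped; only letters, digits and _.-~ pass.
    return quote(raw, safe="")


def decode_cookie_value(encoded):
    """Hypothetical helper: reverse of encode_cookie_value."""
    return unquote(encoded)


encoded = encode_cookie_value("naïve; café, ok")
print(encoded)  # na%C3%AFve%3B%20caf%C3%A9%2C%20ok
print(decode_cookie_value(encoded))  # naïve; café, ok
```

The encoded form is plain ASCII with no spaces, commas or semicolons, so it fits even the original cookie_spec rule.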

In terms of actual standards, there have been a few attempts to codify cookie behaviour, but none thus far actually reflect the real world.

  • RFC 2109 was an attempt to codify and fix the original Netscape cookie_spec. In this standard many more special characters are disallowed, as it uses RFC 2616 tokens (a - is still allowed there), and only the value may be specified in a quoted-string with other characters. No browser ever implemented the limitations, the special handling of quoted strings and escaping, or the new features in this spec.

  • RFC 2965 was another go at it, tidying up 2109 and adding more features under a ‘version 2 cookies’ scheme. Nobody ever implemented any of that either. This spec has the same token-and-quoted-string limitations as the earlier version and it's just as much a load of nonsense.

  • RFC 6265 is an HTML5-era attempt to clear up the historical mess. It still doesn't match reality exactly, but it's much better than the earlier attempts: it is at least a proper subset of what browsers support, not introducing any syntax that is supposed to work but doesn't (like the previous quoted-string).

In 6265 the cookie name is still specified as an RFC 2616 token, which means you can pick from the alphanums plus:

!#$%&'*+-.^_`|~ 
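As a quick sanity check for names, something like the following Python sketch works. The helper name is invented here, and the character set is just the alphanums plus the punctuation listed above; a token also has to be at least one character long.

```python
import string

# Alphanumerics plus the punctuation allowed in an RFC 2616 token.
TOKEN_CHARS = set(string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~")


def is_valid_cookie_name(name):
    """Hypothetical helper: True if name is a non-empty token."""
    return bool(name) and all(ch in TOKEN_CHARS for ch in name)


print(is_valid_cookie_name("my-cookie"))  # True
print(is_valid_cookie_name("my cookie"))  # False
```

So a - in the name passes, which matches what the question was about.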

In the cookie value it formally bans the (filtered by browsers) control characters and the (inconsistently-implemented) non-ASCII characters. It retains cookie_spec's prohibition on space, comma and semicolon, plus, for compatibility with any poor idiots who actually implemented the earlier RFCs, it also bans backslash and quotes, other than quotes wrapping the whole value (but in that case the quotes are still considered part of the value, not an encoding scheme). So that leaves you with the alphanums plus:

!#$%&'()*+-./:<=>?@[]^_`{|}~ 
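The same kind of check can be sketched for values. Again the helper name is made up, and the allowed set is simply the alphanums plus the punctuation listed above, with one pair of double quotes tolerated around the whole value.

```python
import string

# Alphanumerics plus the punctuation listed above for cookie values.
VALUE_CHARS = set(
    string.ascii_letters + string.digits + "!#$%&'()*+-./:<=>?@[]^_`{|}~"
)


def is_valid_cookie_value(value):
    """Hypothetical helper: True if value fits the RFC 6265 subset."""
    # Quotes are only acceptable when they wrap the entire value.
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    return all(ch in VALUE_CHARS for ch in value)


print(is_valid_cookie_value("a=b"))       # True
print(is_valid_cookie_value('"quoted"'))  # True
print(is_valid_cookie_value("a b"))       # False
```

Unlike the name, the value may be empty and may contain =.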

In the real world we are still using the original-and-worst Netscape cookie_spec, so code that consumes cookies should be prepared to encounter pretty much anything, but for code that produces cookies it is advisable to stick with the subset in RFC 6265.
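Put together, "strict when producing, lenient when consuming" might look like the Python sketch below. Both function names are hypothetical, and URL-encoding the value is just the encoding choice suggested earlier, not something RFC 6265 requires.

```python
import re
from urllib.parse import quote

# RFC 2616 token: alphanumerics plus the punctuation allowed in names.
TOKEN = re.compile(r"[A-Za-z0-9!#$%&'*+\-.^_`|~]+")


def build_cookie_pair(name, raw_value):
    """Hypothetical producer: refuse odd names, percent-encode the value."""
    if not TOKEN.fullmatch(name):
        raise ValueError("cookie name is not a valid token: %r" % name)
    return name + "=" + quote(raw_value, safe="")


def read_cookie_header(header):
    """Hypothetical consumer: accept whatever arrives in a Cookie header."""
    cookies = []
    for part in header.split(";"):
        part = part.strip()
        if not part:
            continue
        if "=" in part:
            name, _, value = part.partition("=")
        else:
            name, value = "", part  # nameless cookie, sent without '='
        cookies.append((name.strip(), value.strip()))
    return cookies


print(build_cookie_pair("session-id", "a b;c"))  # session-id=a%20b%3Bc
print(read_cookie_header("a=1; weird name=x=y; bare"))
# [('a', '1'), ('weird name', 'x=y'), ('', 'bare')]
```

The reader never raises on strange names or values; it just hands back what it found and leaves the judgement to the caller.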

