I was reading RFC 4627 and I can't figure out if the following is valid JSON or not. Consider this minimalistic JSON text:
["\u005c"]
The problem is the lowercase c.
According to the text of the RFC it is allowed:
Any character may be escaped. If the character is in the Basic Multilingual Plane (U+0000 through U+FFFF), then it may be represented as a six-character sequence: a reverse solidus, followed by the lowercase letter u, followed by four hexadecimal digits that encode the character's code point. The hexadecimal letters A though F can be upper or lowercase. So, for example, a string containing only a single reverse solidus character may be represented as "\u005C".
(Emphasis mine)
The problem is that the RFC also contains the grammar for this:
char = unescaped /
escape (
%x22 / ; " quotation mark U+0022
%x5C / ; \ reverse solidus U+005C
%x2F / ; / solidus U+002F
%x62 / ; b backspace U+0008
%x66 / ; f form feed U+000C
%x6E / ; n line feed U+000A
%x72 / ; r carriage return U+000D
%x74 / ; t tab U+0009
%x75 4HEXDIG ) ; uXXXX U+XXXX
where HEXDIG is defined in the referenced RFC 4234 as
HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
which includes only uppercase letters.
FWIW, from what I researched most JSON parsers accept both upper and lowercase letters.
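As a quick check of that observation, here is a minimal sketch using Python's standard json module (picked only for illustration; other parsers may behave differently). It parses the same text with a lowercase and an uppercase hex letter and compares the results:

```python
import json

# The same JSON text, differing only in the case of the final hex digit.
# The doubled backslash is Python string escaping; the parser sees \u005c.
lower = json.loads('["\\u005c"]')
upper = json.loads('["\\u005C"]')

# Both should decode to a one-element list holding a single backslash.
print(lower == upper == ["\\"])
```

This only shows what one parser does in practice; it does not settle what the grammar in the RFC formally allows.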
Question(s): What is actually correct? Is there a contradiction, and should the grammar in the RFC be fixed?
I think it's explained by this part of RFC 4234:
ABNF strings are case-insensitive and the character set for these strings is us-ascii.
Hence:
rulename = "abc"
and:
rulename = "aBc"
will match "abc", "Abc", "aBc", "abC", "ABc", "aBC", "AbC", and "ABC".
On the other hand, the follow-on part is not terribly clear:
To specify a rule that IS case SENSITIVE, specify the characters individually.
For example:
rulename = %d97 %d98 %d99
or
rulename = %d97.98.99
In the case of the HEXDIG rule, they're individual characters to start with, but they're specified literally as "A" etc. rather than as numeric values such as %x41, so I suspect that means they're case-insensitive. It's not the clearest spec I've read :(
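To make that reading concrete, here is a small sketch of the two matching behaviours, assuming the interpretation above is right. The helper names (match_quoted, match_numeric, is_hexdig) are hypothetical, and this is not a full ABNF implementation:

```python
def match_quoted(literal, text):
    """Quoted ABNF string such as "abc": compared case-insensitively (us-ascii)."""
    if not all(ord(ch) < 128 for ch in text):
        return False
    return text.lower() == literal.lower()


def match_numeric(codes, text):
    """Numeric ABNF values such as %d97.98.99: exact code points, case-sensitive."""
    return [ord(ch) for ch in text] == list(codes)


def is_hexdig(ch):
    """HEXDIG = DIGIT / "A" / ... / "F", with the letters treated as quoted strings."""
    if len(ch) != 1:
        return False
    if ch in "0123456789":
        return True
    return any(match_quoted(letter, ch) for letter in "ABCDEF")


print(match_quoted("abc", "aBC"))          # quoted literal: case ignored
print(match_numeric([97, 98, 99], "abc"))  # numeric form matches lowercase only
print(match_numeric([97, 98, 99], "ABC"))  # uppercase does not match
print(is_hexdig("c"), is_hexdig("C"), is_hexdig("g"))
```

Under this interpretation both "\u005c" and "\u005C" satisfy 4HEXDIG, which would make the grammar consistent with the prose of RFC 4627.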