Description
In line with the repository's published vulnerability-reporting guidance, I am reporting this as a public issue.
Summary
When parsing crafted XML containing an out-of-range numeric character reference such as �, XML#toJSONObject() throws an uncaught IllegalArgumentException instead of a controlled parsing exception such as JSONException.
As a result, applications that parse attacker-controlled XML may encounter an uncaught runtime exception. Depending on the integration, this may result in request failure or denial of service.
I reproduced this in release 20251224.
Details
The apparent root cause is in XMLTokener#unescapeEntity(), where a decoded numeric character reference is passed to string construction without first validating that it is a valid Unicode code point:
JSON-java/src/main/java/org/json/XMLTokener.java
Lines 162 to 171 in cf65368
    if (e.charAt(0) == '#') {
        int cp;
        if (e.charAt(1) == 'x' || e.charAt(1) == 'X') {
            // hex encoded unicode
            cp = Integer.parseInt(e.substring(2), 16);
        } else {
            // decimal encoded unicode
            cp = Integer.parseInt(e.substring(1));
        }
        return new String(new int[] {cp},0,1);
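One possible shape for a fix is to validate the decoded value before building the string. The following is only a sketch, not the project's code: the class `NumericEntityDecoder`, the method `decode`, and the exception `EntityParseException` are hypothetical names, and `EntityParseException` merely stands in for whatever controlled exception the library would choose (for example `JSONException`). Converting `NumberFormatException` and rejecting too-short input are extra choices made here so the sketch is self-contained.

```java
public class NumericEntityDecoder {

    /** Hypothetical stand-in for the library's controlled parsing exception. */
    public static class EntityParseException extends RuntimeException {
        public EntityParseException(String message) {
            super(message);
        }
    }

    /**
     * Decodes the body of a numeric character reference, i.e. the text
     * between '&' and ';', such as "#65" or "#x41".
     */
    public static String decode(String e) {
        if (e == null || e.length() < 2 || e.charAt(0) != '#') {
            throw new EntityParseException("Not a numeric character reference: " + e);
        }
        int cp;
        try {
            if (e.charAt(1) == 'x' || e.charAt(1) == 'X') {
                // hex encoded unicode
                cp = Integer.parseInt(e.substring(2), 16);
            } else {
                // decimal encoded unicode
                cp = Integer.parseInt(e.substring(1));
            }
        } catch (NumberFormatException ex) {
            throw new EntityParseException("Malformed numeric character reference: " + e);
        }
        // Reject values outside U+0000..U+10FFFF before constructing the string,
        // so String's constructor never sees an invalid code point.
        if (!Character.isValidCodePoint(cp)) {
            throw new EntityParseException("Code point out of range: " + cp);
        }
        return new String(new int[] {cp}, 0, 1);
    }

    public static void main(String[] args) {
        System.out.println(decode("#x41")); // prints "A"
        try {
            decode("#x110000");
        } catch (EntityParseException ex) {
            // prints "rejected: Code point out of range: 1114112"
            System.out.println("rejected: " + ex.getMessage());
        }
    }
}
```

With a check like this, the caller sees a single, documented exception type for bad references instead of an `IllegalArgumentException` escaping from `String` construction.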
Minimal PoC
    XML.toJSONObject("<a>�</a>");

I also checked a few closely related inputs while narrowing this down:
- � reproduces the same behavior.
- The same behavior is also reachable from an attribute value, e.g. <a b="�"/>.
- � did not reproduce the same uncaught exception in my testing.
This suggests that the immediate issue here is specifically the handling of out-of-range Unicode code points during string construction, rather than XML-invalid numeric character references in general.
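The distinction can be checked with the JDK alone, without the library. The sketch below uses a hypothetical helper, `canConstruct`, to show that `String(int[], int, int)` rejects only values outside the Unicode range, while a value that XML forbids but Unicode permits (U+0000 is used here purely as an illustration, not as one of the inputs tested above) is accepted.

```java
public class CodePointRangeDemo {

    /** Returns true if a String can be built from the single code point cp. */
    public static boolean canConstruct(int cp) {
        try {
            new String(new int[] {cp}, 0, 1);
            return true;
        } catch (IllegalArgumentException ex) {
            // Thrown by the String constructor for invalid Unicode code points.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canConstruct(0x10FFFF)); // true: highest valid code point
        System.out.println(canConstruct(0x110000)); // false: 1114112 is out of range
        System.out.println(canConstruct(0x0));      // true: valid in Unicode, though not allowed in XML
    }
}
```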
Observed Result
java.lang.IllegalArgumentException: 1114112
    at java.base/java.lang.StringUTF16.toBytes(Unknown Source)
    at java.base/java.lang.String.<init>(Unknown Source)
    at org.json.XMLTokener.unescapeEntity(XMLTokener.java:171)
    at org.json.XMLTokener.nextEntity(XMLTokener.java:148)
    at org.json.XMLTokener.nextContent(XMLTokener.java:117)
    at org.json.XML.parse(XML.java:407)
    at org.json.XML.toJSONObject(XML.java:780)
    at org.json.XML.toJSONObject(XML.java:866)
    at org.json.XML.toJSONObject(XML.java:665)
    at PoC.main(PoC.java:7)