fix: std.parseYaml YAML 1.2 octal (0o777) and document marker handling#968

Draft
He-Pin wants to merge 3 commits into databricks:master from He-Pin:fix/parseyaml-0o-octal

@He-Pin He-Pin commented Jun 18, 2026

Contributor

Motivation

std.parseYaml diverges from go-jsonnet in two ways:

  1. YAML 1.2 octal: SnakeYAML's SafeConstructor uses YAML 1.1 type resolution, which does not recognize the YAML 1.2 0o prefix for octal integers, so unquoted 0o777 is parsed as the string "0o777" instead of 511.
  2. Document markers: With an explicit document start marker (---), single-doc YAML is returned directly instead of being wrapped in an array as go-jsonnet does.

Modification

  • JVM (src-jvm/Platform.scala): Replaced SafeConstructor-based parsing with composeAll() + custom yamlNodeToJson() that handles YAML 1.2 octal (0o prefix), hex (0x), binary (0b), sexagesimal, and special float values for plain scalars while preserving quoted values as strings. Added YamlDocStartPattern regex to detect explicit --- and wrap single-doc results in an array.
  • JS (src-js/Platform.scala): Added Yaml12OctalPattern matching and an isQuotedScalar position check to detect YAML 1.2 octal in the scala-yaml Node representation. Added the same document marker detection as on the JVM.

Result

std.parseYaml now matches go-jsonnet for both YAML 1.2 octal syntax and document start marker handling.

YAML 1.2 Octal

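| YAML input | go-jsonnet v0.22.0 | sjsonnet (before) | sjsonnet (after) |
|------------|--------------------|-------------------|------------------|
| a: 0777    | 511                | 511               | 511              |
| b: 0o777   | 511                | "0o777" (bug)     | 511              |
| d: 0o10    | 8                  | "0o10" (bug)      | 8                |
| e: -0o777  | -511               | "-0o777" (bug)    | -511             |
| f: "0o777" | "0o777"            | "0o777"           | "0o777"          |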
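
For illustration only (this is not the code in the PR), the 0o rows above can be reproduced with a minimal Scala sketch. The object name, the pattern, and the `resolveScalar` signature are hypothetical. The sketch covers only the YAML 1.2 `0o` form for plain scalars, so the legacy `0777` row and the other numeric forms are out of scope here and simply fall through as strings.

```scala
object Yaml12OctalSketch {
  // Hypothetical pattern: optional sign, the 0o prefix, then one or more octal digits.
  private val Yaml12Octal = """([-+]?)0o([0-7]+)""".r

  /** Returns Right(number) for a plain 0o-octal scalar, Left(text) otherwise.
    * Quoted scalars are never reinterpreted, so "0o777" in quotes stays a string.
    */
  def resolveScalar(text: String, isQuoted: Boolean): Either[String, Double] =
    if (isQuoted) Left(text)
    else
      text match {
        // A regex extractor only matches when the whole string matches the pattern.
        case Yaml12Octal(sign, digits) =>
          // BigInt avoids overflow on long digit runs; Jsonnet numbers are doubles.
          val magnitude = BigInt(digits, 8).toDouble
          Right(if (sign == "-") -magnitude else magnitude)
        case _ => Left(text)
      }
}
```

Under these assumptions, `resolveScalar("0o777", isQuoted = false)` yields a number and `resolveScalar("0o777", isQuoted = true)` keeps the string, which is the distinction the last table row checks.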

Document Markers

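| YAML input  | go-jsonnet v0.22.0 | sjsonnet (before) | sjsonnet (after) |
|-------------|--------------------|-------------------|------------------|
| "---"       | [null]             | null (bug)        | [null]           |
| "---\na: 1" | [{a:1}]            | {a:1} (bug)       | [{a:1}]          |
| "a: 1"      | {a:1}              | {a:1}             | {a:1}            |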
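
As a rough sketch of the wrapping rule these rows describe (again, not the PR's implementation), the decision can be expressed over an already-parsed list of documents. The `DocStart` regex is a hypothetical stand-in for YamlDocStartPattern, whose exact definition is not shown here, and returning an empty array for zero documents is an assumption.

```scala
object YamlDocMarkerSketch {
  // Hypothetical stand-in for YamlDocStartPattern: a line that begins with "---"
  // followed by whitespace or the end of the line.
  private val DocStart = """(?m)^---(\s|$)""".r

  /** Left(doc): a single document with no explicit marker is returned as-is.
    * Right(docs): everything else (explicit marker or several documents) is an array.
    */
  def shape[A](input: String, docs: List[A]): Either[A, List[A]] =
    docs match {
      case single :: Nil if DocStart.findFirstIn(input).isEmpty => Left(single)
      case _                                                     => Right(docs)
    }
}
```

With this rule, `"a: 1"` stays unwrapped while `"---\na: 1"` becomes a one-element array, matching the go-jsonnet column. A regex over the raw input is a simplification: it would also fire on a `---` line inside a block scalar.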

Test plan

  • All ParseYaml tests pass (14 tests)
  • All FileTests pass (including parseyaml_yaml12_octal.jsonnet and parseyaml_doc_marker.jsonnet)
  • All EvaluatorTests pass
  • All go_test_suite golden tests pass
  • Code passes scalafmt
  • Verified against go-jsonnet v0.22.0 and jrsonnet v0.5.0-pre99

@He-Pin He-Pin changed the title fix: std.parseYaml handles YAML 1.2 modern octal syntax (0o777) fix: std.parseYaml YAML 1.2 octal (0o777) and document marker handling Jun 18, 2026
@He-Pin He-Pin force-pushed the fix/parseyaml-0o-octal branch from 3d07aba to 58d6d9c Compare June 18, 2026 10:24
@He-Pin He-Pin changed the title fix: std.parseYaml YAML 1.2 octal (0o777) and document marker handling fix: std.parseYaml handles YAML 1.2 modern octal syntax (0o777) Jun 18, 2026
@He-Pin He-Pin force-pushed the fix/parseyaml-0o-octal branch from 58d6d9c to 2240e17 Compare June 18, 2026 10:30
@He-Pin He-Pin changed the title fix: std.parseYaml handles YAML 1.2 modern octal syntax (0o777) fix: std.parseYaml YAML 1.2 octal (0o777) and document marker handling Jun 18, 2026
@He-Pin He-Pin marked this pull request as draft June 18, 2026 10:42
Motivation:
Two issues in std.parseYaml diverging from go-jsonnet:
1. SnakeYAML's SafeConstructor uses YAML 1.1 type resolution which does
   not recognize the 0o prefix for octal integers (YAML 1.2), causing
   unquoted 0o777 to be parsed as string "0o777" instead of 511.
2. Explicit document start markers (---) caused single-doc YAML to be
   returned directly instead of wrapped in an array as go-jsonnet does.

Modification:
Replaced SafeConstructor-based parsing with composeAll() + custom
yamlNodeToJson() that handles YAML 1.2 octal (0o prefix) for plain
scalars while preserving quoted values as strings. Added
YamlDocStartPattern regex to detect explicit --- and wrap single-doc
results in an array.

Result:
std.parseYaml now matches go-jsonnet for both YAML 1.2 octal syntax
and document start marker handling.

| YAML input | go-jsonnet v0.22.0 | jrsonnet 0.5.0-pre99 | sjsonnet (before) | sjsonnet (after) |
|-----------|-------------------|---------------------|-------------------|-----------------|
| 0o777     | 511               | 511                 | "0o777" (bug)     | 511             |
| -0o777    | -511              | -511                | "-0o777" (bug)    | -511            |
| "0o777"   | "0o777"           | "0o777"             | "0o777"           | "0o777"         |
| "---"     | [null]            | null                | null (bug)        | [null]          |
| "---\na: 1" | [{a:1}]         | {a:1}               | {a:1} (bug)       | [{a:1}]         |
| "a: 1"    | {a:1}             | {a:1}               | {a:1}             | {a:1}           |

Note: jrsonnet does NOT wrap --- in array; sjsonnet aligns with go-jsonnet.
@He-Pin He-Pin force-pushed the fix/parseyaml-0o-octal branch from 2240e17 to cfb4b5b Compare June 18, 2026 11:25