fix: CRLF newline handling and error span rendering #262

Open
metalurgical wants to merge 1 commit into BlockstreamResearch:master from metalurgical:fix_issue_260

@metalurgical
Contributor

@metalurgical metalurgical commented Mar 30, 2026

Previously, CRLF was treated as two newlines, causing incorrect line numbers, extra blank lines, and misaligned spans on Windows.

  • treat CRLF as a single newline
  • use consistent newline handling for line/column calculation and display
  • fix trailing space rendering on empty lines
  • preserve UTF-16 column alignment and span rendering
  • handle lone CR for compatibility with current lexer behavior
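
The first two points can be illustrated with a minimal sketch. It is not the code from this PR: `line_col` is a hypothetical helper, and it assumes the offset is a byte offset on a character boundary. It counts `\r\n` as one line break and reports the column in UTF-16 code units. Lone CR is left out here on purpose.

```rust
/// Hypothetical helper (not taken from the PR): returns the 1-based
/// (line, column) of a byte offset, counting "\r\n" as ONE line break.
/// The column is measured in UTF-16 code units.
/// Assumes `offset` lies on a character boundary.
fn line_col(file: &str, offset: usize) -> (usize, usize) {
    let bytes = file.as_bytes();
    let offset = offset.min(bytes.len());
    let (mut line, mut line_start, mut i) = (1usize, 0usize, 0usize);

    while i < offset {
        // Width of the line break starting at `i`, or 0 if there is none.
        let width = match bytes[i] {
            b'\r' if bytes.get(i + 1) == Some(&b'\n') => 2,
            b'\n' => 1,
            _ => 0,
        };
        // Only count a break that ends at or before the offset.
        if width > 0 && i + width <= offset {
            line += 1;
            line_start = i + width;
        }
        i += width.max(1);
    }

    let col = file[line_start..offset].encode_utf16().count() + 1;
    (line, col)
}
```

With this sketch, `line_col("a\r\nbc", 3)` lands on line 2 rather than line 3, which is the behavior the description asks for.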

@metalurgical metalurgical requested a review from delta1 as a code owner March 30, 2026 16:45
@metalurgical
Contributor Author

resolves #261
closes #261

@metalurgical metalurgical force-pushed the fix_issue_260 branch 2 times, most recently from 40661b5 to ab4848a Compare March 31, 2026 14:49
@metalurgical metalurgical requested a review from KyrylR April 1, 2026 14:11
@KyrylR
Collaborator

KyrylR commented Apr 2, 2026

#257 was merged, could you rebase pls

Also, I have to ask you to narrow the changes down to CRLF only, because the lexer is limited to LF/CRLF/CR (\n, \r\n, \r). That said, I do not see a reason to support CR, as it seems to be a legacy thing.

@apoelstra
Contributor

ab4848a needs rebase

@metalurgical metalurgical force-pushed the fix_issue_260 branch 2 times, most recently from ea07842 to fcbc7bc Compare April 2, 2026 12:21
@metalurgical
Contributor Author

> #257 was merged, could you rebase pls
>
> Also, I have to ask you to narrow the changes down to CRLF only, because the lexer is limited to LF/CRLF/CR (\n, \r\n, \r). That said, I do not see a reason to support CR, as it seems to be a legacy thing.

This has been narrowed to an explicit check instead of going through is_newline().

@KyrylR
Collaborator

KyrylR commented Apr 7, 2026

I need to take a deeper look. There could be an issue with the lexer, which still uses .padded(): Chumsky treats \r as whitespace there, so lone-\r input is still accepted.
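
The concern can be checked with the standard library alone. This is a rough sketch under one assumption: that the padding step relies on a predicate equivalent to `char::is_whitespace`. It does not call Chumsky, and `skip_padding` and `has_lone_cr` are hypothetical names used only for illustration.

```rust
/// Hypothetical stand-in for whitespace padding: strips leading whitespace
/// using the standard `char::is_whitespace` predicate.
/// This does NOT call Chumsky; it only mimics the assumed behavior.
fn skip_padding(input: &str) -> &str {
    input.trim_start_matches(char::is_whitespace)
}

/// Returns true if `input` contains a CR that is not followed by LF.
fn has_lone_cr(input: &str) -> bool {
    let bytes = input.as_bytes();
    bytes
        .iter()
        .enumerate()
        .any(|(i, &b)| b == b'\r' && bytes.get(i + 1) != Some(&b'\n'))
}
```

Because `'\r'.is_whitespace()` is true, `skip_padding` silently consumes a bare CR. If the real lexer behaves the same way, error rendering still has to decide what a bare CR means for line numbers.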

@metalurgical
Contributor Author

> I need to take a deeper look. There could be an issue with the lexer, which still uses .padded(): Chumsky treats \r as whitespace there, so lone-\r input is still accepted.

I suspected CR still needed to be handled. So in this instance should it be erased or treated as a newline?

@KyrylR
Collaborator

KyrylR commented Apr 10, 2026

Not sure if we ever fall back into this, but let's proceed with restoring bare-\r recognition inside next_newline().

Also, could you leave a short comment saying that this is for compatibility with current lexer behavior, not a statement that CR-only files are preferred?

I believe we can add those two tests:

#[test]
fn display_with_cr_only_newlines() {
    let file = "let a: u8 = 0;\rlet b: u8 = 65536;";
    let error = Error::CannotParse("number too large to fit in target type".to_string())
        .with_span(Span::new(27, 32))
        .with_file(Arc::from(file));

    let expected = r#"
  |
2 | let b: u8 = 65536;
  |             ^^^^^ Cannot parse: number too large to fit in target type"#;

    assert_eq!(&expected[1..], &error.to_string());
}

#[test]
fn display_span_as_point_on_trailing_cr_only_empty_line() {
    let file = "fn main(){\r    let a:\r";
    let error = Error::CannotParse("eof".to_string())
        .with_span(Span::new(file.len(), file.len()))
        .with_file(Arc::from(file));

    let expected = r#"
  |
3 |
  | ^ Cannot parse: eof"#;

    assert_eq!(&expected[1..], &error.to_string());
}
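
For orientation, here is a minimal sketch of the line-break recognition these tests exercise. The real `next_newline()` signature is not shown in this thread, so the signature below and the `line_number` helper are assumptions made for illustration only.

```rust
/// Hypothetical sketch; the actual signature in the PR may differ.
/// Finds the next line break in `s` and returns
/// (byte index where the break starts, width of the break in bytes).
fn next_newline(s: &str) -> Option<(usize, usize)> {
    let bytes = s.as_bytes();
    let start = bytes.iter().position(|&b| b == b'\n' || b == b'\r')?;
    // CRLF is a single break of width 2. LF is width 1.
    // Bare CR (width 1) is recognized only for compatibility with the
    // current lexer behavior, not because CR-only files are preferred.
    let is_crlf = bytes[start] == b'\r' && bytes.get(start + 1) == Some(&b'\n');
    Some((start, if is_crlf { 2 } else { 1 }))
}

/// Hypothetical helper: 1-based number of the line containing byte `offset`.
fn line_number(file: &str, offset: usize) -> usize {
    let mut line = 1;
    let mut pos = 0;
    while let Some((start, width)) = next_newline(&file[pos..]) {
        let end = pos + start + width;
        if end > offset {
            break;
        }
        line += 1;
        pos = end;
    }
    line
}
```

Under these assumptions, offset 27 in the first test file falls on line 2, and the end-of-file offset in the second test file falls on line 3, matching the line numbers in the expected output above.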

@metalurgical
Contributor Author

metalurgical commented Apr 10, 2026

> Not sure if we ever fall back into this, but let's proceed with restoring bare-\r recognition inside next_newline().
>
> Also, could you leave a short comment saying that this is for compatibility with current lexer behavior, not a statement that CR-only files are preferred?

Lone CR support was added back in for compatibility with current lexer behavior only.

The proposed tests were also added in.
