
Python 3.12+ parser crash on f-string conversion specifiers like {x!r} in Scenic files #464

@thshao2

System Details

  1. Python 3.12.13
  2. Scenic 3.1.0 (main branch as of commit ffa0aba)
  3. Ubuntu 24.04 LTS (Noble Numbat), Linux x86_64
  4. N/A

Detailed Description

When running under Python 3.12+, Scenic crashes while parsing a .scenic file that contains a valid Python f-string conversion specifier such as {x!r}, {x!s}, or {x!a}. This syntax works under Python versions older than 3.12, but under Python 3.12+ the Scenic parser enters its own f-string grammar path and fails while handling the conversion field.

The issue appears to be in src/scenic/syntax/scenic.gram, specifically the fstring_conversion[int] rule and the helper check_fstring_conversion.

The relevant grammar path is:

strings (memo): a=(fstring|STRING)+ {
    self.concatenate_strings(a) if sys.version_info >= (3, 12) else self.generate_ast_for_string(a)
}

Under Python < 3.12, Scenic uses generate_ast_for_string, which reconstructs the string source and delegates f-string parsing to CPython via ast.parse. In that path, Scenic does not appear to enter its own fstring_conversion rule for ordinary f-string conversions like {x!r}.

Under Python 3.12, PEP 701 formalized f-strings into the grammar/token stream. Python 3.12 exposes f-string internals through tokens such as FSTRING_START, FSTRING_MIDDLE, and FSTRING_END, so Scenic’s own fstring, fstring_replacement_field, and fstring_conversion rules are now exercised.
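The token-stream change can be observed with the standard library tokenizer alone. The following is a minimal sketch, independent of Scenic; the helper name token_names is made up for this illustration.

```python
import io
import sys
import token
import tokenize


def token_names(source: str) -> list:
    """Return the token type names the stdlib tokenizer emits for `source`."""
    readline = io.StringIO(source).readline
    return [token.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


names = token_names('f"{x!r}"\n')

if sys.version_info >= (3, 12):
    # PEP 701: the f-string is split into several tokens, so a PEG grammar
    # that consumes this stream has to parse the replacement field itself.
    print("f-string pieces are separate tokens:", names)
else:
    # Before 3.12 the whole f-string arrives as one STRING token.
    print("f-string is a single token:", names)
```

On Python 3.12+ the "!" and the conversion character arrive as separate tokens, which is why a rule like fstring_conversion is only reached on those versions.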

This creates two related issues:

  1. check_fstring_conversion receives tokenize.TokenInfo objects but accesses .lineno and .col_offset:

    if mark.lineno != name.lineno or mark.col_offset != name.col_offset:
    

    However, tokenize.TokenInfo does not have .lineno or .col_offset. Its position fields are .start and .end.

  2. After changing that locally to use .start / .end, the parser then crashes because fstring_conversion[int] is declared as returning an int, but check_fstring_conversion returns the TokenInfo object name. Later, fstring_replacement_field calls:

    conversion.decode()[0]
    

    which fails because conversion is a TokenInfo, not bytes/string-like data.
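Both failures follow from the shape of tokenize.TokenInfo, which can be checked without Scenic. The sketch below builds the two tokens by hand, using the positions that "!" and "r" occupy in f"{x!r}", rather than taking them from Scenic's parser.

```python
import token
import tokenize

# Hand-built tokens for the "!" and the "r" in the line f"{x!r}".
# TokenInfo is a namedtuple with fields: type, string, start, end, line.
line = 'f"{x!r}"\n'
mark = tokenize.TokenInfo(token.OP, "!", (1, 4), (1, 5), line)
name = tokenize.TokenInfo(token.NAME, "r", (1, 5), (1, 6), line)

# AST nodes carry lineno/col_offset; TokenInfo does not.
has_lineno = hasattr(name, "lineno")

# TokenInfo is not bytes-like, so .decode() is not available either.
has_decode = hasattr(name, "decode")

# Positions are (row, column) tuples, so "the conversion character directly
# follows the exclamation mark" can be written as a tuple comparison.
adjacent = mark.end == name.start

# The conversion character itself is in the .string field.
conversion_char = name.string

print(has_lineno, has_decode, adjacent, conversion_char)
```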

The Python AST representation expects ast.FormattedValue.conversion to be an integer: -1 for no conversion, or the ASCII code of the conversion character (97 for !a, 114 for !r, 115 for !s).
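The expected integer values can be confirmed by asking CPython for the AST of a plain f-string. This is a small sketch; conversion_code is a name chosen for this example only.

```python
import ast


def conversion_code(fstring_source: str) -> int:
    """Parse a one-field f-string and return its FormattedValue.conversion."""
    tree = ast.parse(fstring_source, mode="eval")
    joined = tree.body            # ast.JoinedStr
    formatted = joined.values[0]  # ast.FormattedValue
    return formatted.conversion


no_conversion = conversion_code('f"{x}"')
codes = {c: conversion_code('f"{x!' + c + '}"') for c in "sra"}
print(no_conversion, codes)
```

The values for s, r, and a match ord("s"), ord("r"), and ord("a"), so a fixed rule can produce them from the conversion character directly.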

The f-string syntax itself is valid Python, and Python supports !s, !r, and !a conversion specifiers in f-string replacement fields.

Steps To Reproduce

Save the following as a .scenic file, for example python312_fstring_conversion.scenic (the name used in the command below).

x = 3
param y = f"{x!r}"

Run scenic python312_fstring_conversion.scenic -b

Scenic crashes while parsing the f-string conversion field with:

AttributeError: 'TokenInfo' object has no attribute 'lineno'. Did you mean: 'line'?

After changing the location check to use .start / .end, the parser reaches a second issue:

AttributeError: 'TokenInfo' object has no attribute 'decode'
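One possible direction for a fix, sketched outside of Scenic: have the conversion helper compare token positions via .start / .end and return the integer code instead of the token. The function name conversion_from_tokens and the error messages are hypothetical; this is not the actual Scenic helper or an agreed patch, and fstring_replacement_field would need to accept the integer rather than call .decode().

```python
import token
import tokenize


def conversion_from_tokens(mark, name):
    """Return the ast.FormattedValue conversion code for tokens "!" and NAME.

    Hypothetical helper: `mark` is the "!" token and `name` is the token
    holding the conversion character, both tokenize.TokenInfo objects.
    """
    # The conversion character must directly follow the "!" (no whitespace).
    if mark.end != name.start:
        raise SyntaxError(
            "f-string: conversion type must come right after the exclamation mark"
        )
    if name.string not in ("s", "r", "a"):
        raise SyntaxError(
            f"f-string: invalid conversion character {name.string!r}: "
            "expected 's', 'r', or 'a'"
        )
    # ast.FormattedValue.conversion stores the ASCII code of the character.
    return ord(name.string)


# Usage with hand-built tokens for the line f"{x!r}".
line = 'f"{x!r}"\n'
mark = tokenize.TokenInfo(token.OP, "!", (1, 4), (1, 5), line)
name = tokenize.TokenInfo(token.NAME, "r", (1, 5), (1, 6), line)
code = conversion_from_tokens(mark, name)
print(code)
```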

Issue Submission Checklist

  • I am reporting an issue, not asking a question
  • I checked the open and closed issues, forum, etc. and have not found any solution
  • I have provided all necessary code, etc. to reproduce the issue

Metadata

Labels

status: valid (Issue is valid and will be worked on)
type: bug (Something isn't working)
