Use UTF-8 for filename encoding instead of ASCII by slash-proc · Pull Request #168 · jrast/littlefs-python

slash-proc · 2026-06-04T14:24:47Z

littlefs stores names as opaque byte strings, so the API-level encoding is a free choice. The previous default of ASCII rejected any non-ASCII filename with UnicodeEncodeError (and decoded directory entries as ASCII as well).

Switch the default FILENAME_ENCODING to UTF-8: non-ASCII names now round-trip through open/stat/listdir/mkdir/rename/remove, and because ASCII is a strict subset of UTF-8 all existing filenames are unaffected.
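The behaviour described above can be reproduced with Python's built-in codecs alone. The following is a minimal sketch that does not call littlefs; the sample filenames and the `try_encode` helper are illustrative assumptions, not part of the PR.

```python
# Illustrative filenames: plain ASCII, Latin-1 range, CJK, and emoji.
names = ["readme.txt", "café.txt", "日本語.txt", "📁.txt"]


def try_encode(name: str, encoding: str):
    """Return the encoded bytes, or None if the codec rejects the name."""
    try:
        return name.encode(encoding)
    except UnicodeEncodeError:
        return None


for name in names:
    ascii_bytes = try_encode(name, "ascii")
    utf8_bytes = try_encode(name, "utf-8")
    # UTF-8 encodes every sample name and decodes back to the same string.
    assert utf8_bytes is not None and utf8_bytes.decode("utf-8") == name
    # Wherever ASCII succeeds, it produces exactly the same bytes as UTF-8,
    # which is why existing ASCII filenames are unaffected by the switch.
    if ascii_bytes is not None:
        assert ascii_bytes == utf8_bytes
```

Only the first sample name survives the ASCII codec; the other three are the kind of input that previously raised UnicodeEncodeError.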

Adds regression tests covering Latin-1, CJK and emoji filenames.

Copilot

Pull request overview

This PR changes the default filename encoding used by the Python bindings from ASCII to UTF-8 so that non-ASCII filenames can be created and round-tripped through the LittleFS API (while keeping ASCII behavior unchanged).

Changes:

- Switch FILENAME_ENCODING default from 'ascii' to 'utf-8'.
- Add regression tests for Unicode filenames (Latin-1, CJK, emoji) covering open/stat/listdir/mkdir/rename/remove.
- Document the rationale for UTF-8 as the default filename encoding.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
test/test_unicode_filenames.py	Adds regression coverage ensuring non-ASCII filenames round-trip across core filesystem operations.
src/littlefs/lfs.pyx	Changes the default filename encoding constant to UTF-8 and documents the reasoning.

Build on the UTF-8 default by letting callers choose the filename encoding per filesystem instead of relying on the process-wide lfs.FILENAME_ENCODING global.

- Low-level lfs.* functions take an optional filename_encoding that falls back to the module global, keeping existing callers unaffected.
- LittleFS accepts filename_encoding= and threads it through every path-handling call (open, stat, listdir/scandir, mkdir, remove, rename, *attr).
- Useful for images whose names were written with a non-UTF-8 encoding (e.g. latin-1, shift-jis), which would otherwise mis-decode or raise.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
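The fallback pattern in this commit message can be sketched in plain Python. This is not the code from lfs.pyx: `resolve_encoding` and `decode_name` are hypothetical helpers, and the module-level `FILENAME_ENCODING` below is only a stand-in for `lfs.FILENAME_ENCODING`.

```python
from typing import Optional

# Stand-in for the module-level default (lfs.FILENAME_ENCODING in the PR).
FILENAME_ENCODING = "utf-8"


def resolve_encoding(filename_encoding: Optional[str] = None) -> str:
    """Hypothetical helper: prefer the explicit argument, else the global."""
    return FILENAME_ENCODING if filename_encoding is None else filename_encoding


def decode_name(raw: bytes, filename_encoding: Optional[str] = None) -> str:
    """Decode a raw directory-entry name with the resolved encoding."""
    return raw.decode(resolve_encoding(filename_encoding))


# A name written by a tool that used Latin-1: the byte 0xE9 is not valid UTF-8
# in this position, so the default codec raises instead of returning a name.
raw = "café.txt".encode("latin-1")
try:
    decode_name(raw)
    decoded_with_default = True
except UnicodeDecodeError:
    decoded_with_default = False

# Passing the encoding explicitly recovers the original name.
name = decode_name(raw, "latin-1")
```

Using `None` as the sentinel keeps existing callers unaffected: a call that omits the argument behaves exactly as it did when only the global existed.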

Add a --filename-encoding option to the shared CLI parser so create, extract, list, and repl can encode/decode image filenames with a non-UTF-8 codec (e.g. latin-1, shift-jis). Defaults to None so the default lives solely in lfs.FILENAME_ENCODING (utf-8). Unlike name_max/attr_max/file_max, this is a host-side encode/decode choice and is never stored in the image, so it may differ freely between create and extract.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
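A rough sketch of how such an option might be declared with argparse follows. The parser name, the help text, and the `effective_encoding` helper are assumptions for illustration; the real parser lives in `__main__.py` and may be structured differently.

```python
import argparse
from typing import List

# Stand-in for lfs.FILENAME_ENCODING; the CLI itself carries no default.
DEFAULT_FILENAME_ENCODING = "utf-8"


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing only the option discussed here."""
    parser = argparse.ArgumentParser(prog="littlefs-sketch")
    parser.add_argument(
        "--filename-encoding",
        default=None,  # None means "defer to the library default"
        help="Codec used to encode/decode image filenames (e.g. latin-1).",
    )
    return parser


def effective_encoding(argv: List[str]) -> str:
    """Resolve the encoding the way a library-side fallback would."""
    args = build_parser().parse_args(argv)
    if args.filename_encoding is None:
        return DEFAULT_FILENAME_ENCODING
    return args.filename_encoding
```

Keeping the CLI default at `None` means the UTF-8 default is defined in one place only. A later commit in this PR clarifies that extract must use the same --filename-encoding as create.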

Confirmed against the upstream C source: lfs_path_namelen() (strcspn) measures the name in bytes and lfs.c checks `nlen > name_max`, so a multi-byte UTF-8 character counts as 2-4 bytes against the limit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
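The byte-versus-character distinction is easy to check from Python. In the sketch below, `name_byte_length` and `fits_name_max` are hypothetical helpers, and the `name_max=255` default is an assumption standing in for whatever limit the filesystem was configured with.

```python
def name_byte_length(name: str, encoding: str = "utf-8") -> int:
    """Length of the name as littlefs would see it: encoded bytes."""
    return len(name.encode(encoding))


def fits_name_max(name: str, name_max: int = 255, encoding: str = "utf-8") -> bool:
    """Mirror the `nlen > name_max` rejection: a name fits if nlen <= name_max."""
    return name_byte_length(name, encoding) <= name_max


# One character each, but 1, 2, 3 and 4 bytes respectively in UTF-8.
byte_lengths = [name_byte_length(ch) for ch in ("a", "é", "日", "📁")]

# 128 characters, yet 256 bytes: over a 255-byte limit.
too_long = "é" * 128
```

A name can therefore be rejected well before it reaches name_max characters, so length checks on the host side should be done on the encoded bytes.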

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

- lfs.pyi: dir_read returns Optional[LFSStat] (None at end-of-directory)
- __init__.py: scandir wraps iteration in try/finally so dir_close always runs even if dir_read raises (e.g. UnicodeDecodeError)
- __main__.py: clarify that extract must use the same --filename-encoding as create; it cannot differ freely
- lfs.pyx: file_sync now returns the error code, matching its -> int stub and every other file_* function (was the lone outlier returning None)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
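The scandir fix follows a common generator pattern. The sketch below imitates it with an in-memory stand-in: `FakeDir`, and these simplified `dir_read`, `dir_close`, and `scandir` functions, are assumptions for illustration and do not reproduce the signatures in the library.

```python
class FakeDir:
    """In-memory stand-in for an open directory handle."""

    def __init__(self, raw_names):
        self.raw_names = list(raw_names)
        self.closed = False


def dir_read(handle, encoding="utf-8"):
    """Return the next decoded name, or None at end-of-directory."""
    if not handle.raw_names:
        return None
    return handle.raw_names.pop(0).decode(encoding)


def dir_close(handle):
    handle.closed = True


def scandir(handle, encoding="utf-8"):
    """Yield names; the finally clause closes the handle on any exit path."""
    try:
        while True:
            name = dir_read(handle, encoding)
            if name is None:
                break
            yield name
    finally:
        dir_close(handle)


# The second entry is Latin-1 encoded and fails to decode as UTF-8.
handle = FakeDir([b"ok.txt", b"caf\xe9.txt"])
seen = []
try:
    for entry in scandir(handle):
        seen.append(entry)
except UnicodeDecodeError:
    pass
```

Without the try/finally, the decode error would propagate out of the generator and leave the directory handle open.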

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.

BrianPugh self-assigned this Jun 4, 2026

BrianPugh requested a review from Copilot June 4, 2026 14:29

Copilot started reviewing on behalf of BrianPugh June 4, 2026 14:29

Copilot AI reviewed Jun 4, 2026

BrianPugh and others added 3 commits June 4, 2026 10:56

BrianPugh requested a review from Copilot June 4, 2026 23:37

Copilot started reviewing on behalf of BrianPugh June 4, 2026 23:37

Copilot AI reviewed Jun 4, 2026

Comment thread src/littlefs/lfs.pyi Outdated

Comment thread src/littlefs/__main__.py Outdated

Comment thread src/littlefs/__init__.py Outdated

Comment thread test/test_unicode_filenames.py

BrianPugh requested a review from Copilot June 4, 2026 23:52

Copilot started reviewing on behalf of BrianPugh June 4, 2026 23:52

Copilot AI reviewed Jun 4, 2026

BrianPugh merged commit fcd02ba into jrast:master Jun 5, 2026
21 checks passed

BrianPugh merged 5 commits into jrast:master from slash-proc:utf8-filename-encoding

3 participants
