Bug: string.Template.substitute() crashes on $ in config values like regex file_pattern #2349

@Lewis-404

Description

The config loader in graphrag-common uses string.Template.substitute() to process environment variables in settings.yaml. However, substitute() treats every $ character as a placeholder prefix — including $ that happens to be part of a regex pattern or any other string value. This causes a hard crash with an unhelpful ValueError.

Reproduction

Create a settings.yaml with:

input:
  type: markitdown
  file_pattern: ".*\\.md$"

Run graphrag index --root <dir>.

Error

ValueError: Invalid placeholder in string: line 50, col 25

This happens because $ at the end of .*\.md$ (a valid regex anchor) is treated by string.Template as a template placeholder, and a bare $ without a valid identifier causes substitute() to raise ValueError.
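To confirm this outside graphrag, here is a minimal sketch that applies string.Template to the single offending line. The helper name try_substitute is made up for illustration, and an empty mapping stands in for os.environ.

```python
from string import Template


def try_substitute(text):
    """Run Template.substitute and report a ValueError instead of raising it."""
    try:
        return Template(text).substitute({})  # empty mapping stands in for os.environ
    except ValueError as error:
        return f"ValueError: {error}"


# Raw settings.yaml line as the loader sees it, before YAML parsing.
raw_line = r'  file_pattern: ".*\\.md$"'
result = try_substitute(raw_line)
print(result)
```

The line and column in the message depend on where the $ sits in the full file, which is why the report above points at line 50 of settings.yaml.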

Root Cause

File: graphrag-common/graphrag_common/config/load_config.py, the _parse_env_variables function:

def _parse_env_variables(text: str) -> str:
    """Parse environment variables in the configuration text."""
    try:
        return Template(text).substitute(os.environ)
    except KeyError as error:
        msg = f"Environment variable not found: {error}"
        raise ConfigParsingError(msg) from error

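substitute() raises KeyError only when a well-formed placeholder such as $VAR or ${VAR} has no matching environment variable. For a malformed placeholder it raises ValueError instead, and the except clause above does not catch that. So a bare $ in a regex pattern, or a $ followed by a character that cannot start an identifier (for example $5), escapes as a raw ValueError rather than a ConfigParsingError and crashes the config loader.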

This affects:

  • file_pattern with regex ending in $ (very common, e.g. .*\.md$)
  • Any other config value that happens to contain $
  • prompt paths containing $ characters
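The cases above can be told apart with a short sketch. It classifies a few sample values by what Template.substitute does with them; the classify helper, the sample values, and the API_KEY variable are all illustrative and not taken from graphrag.

```python
from string import Template


def classify(text, env):
    """Return (outcome, detail) for Template.substitute on one value."""
    try:
        return ("ok", Template(text).substitute(env))
    except KeyError as error:
        return ("KeyError", str(error))
    except ValueError as error:
        return ("ValueError", str(error))


env = {"API_KEY": "secret"}  # stand-in for os.environ
samples = [r".*\.md$", "cost: $5", "${API_KEY}", "$MISSING", "price in $$"]

outcomes = {s: classify(s, env) for s in samples}
for sample, (outcome, _detail) in outcomes.items():
    print(f"{sample!r:>16} -> {outcome}")
```

Only the $MISSING case reaches the existing except KeyError branch. The regex and the $5 value both raise ValueError, and $$ is the one form that produces a literal $.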

Proposed Fix

Replace substitute() with safe_substitute(), which silently leaves unrecognized placeholders as literal text instead of raising an error:

def _parse_env_variables(text: str) -> str:
    """Parse environment variables in the configuration text."""
    return Template(text).safe_substitute(os.environ)

safe_substitute() handles $$ escape sequences but does not raise an error on bare $ or missing keys — it leaves them unchanged as literal text. This is the expected behavior: a $ in a regex pattern should stay as $, not crash the config loader.
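A small sketch of what safe_substitute() does with a mixed config snippet. The variable names GRAPHRAG_API_KEY and UNSET_VAR and the snippet itself are assumptions for illustration only.

```python
from string import Template

env = {"GRAPHRAG_API_KEY": "secret"}  # stand-in for os.environ

text = r'''api_key: ${GRAPHRAG_API_KEY}
file_pattern: ".*\\.md$"
model: ${UNSET_VAR}
note: "costs $$5"
'''

result = Template(text).safe_substitute(env)
print(result)
```

The known variable is replaced, the regex anchor survives untouched, and $$ collapses to a single $. The unset variable also stays as the literal text ${UNSET_VAR}, so a misspelled variable name would no longer be reported at this stage.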

A KeyError or ValueError from substitute() is the wrong place to validate config values anyway — Pydantic schema validation in the config models already handles that properly.
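If maintainers would rather keep the "Environment variable not found" error, one possible middle ground (not part of the proposal above) is to check well-formed placeholders first and then call safe_substitute(). This is only a sketch: the ConfigParsingError class here is a local stand-in for the loader's own exception, and the env parameter is added for testability.

```python
import os
from string import Template


class ConfigParsingError(ValueError):
    """Local stand-in for the loader's own exception type (assumption)."""


def parse_env_variables(text, env=None):
    """Substitute $VAR / ${VAR}, tolerate bare $, but report unset variables."""
    env = os.environ if env is None else env
    missing = []
    # Template.pattern is the compiled regex that string.Template itself uses.
    for match in Template.pattern.finditer(text):
        name = match.group("named") or match.group("braced")
        if name is not None and name not in env and name not in missing:
            missing.append(name)
    if missing:
        msg = f"Environment variable not found: {', '.join(missing)}"
        raise ConfigParsingError(msg)
    # Malformed placeholders such as a trailing "$" are left as literal text.
    return Template(text).safe_substitute(env)


parsed = parse_env_variables('key: ${KEY}\npattern: ".*\\.md$"', {"KEY": "v"})
print(parsed)
```

The trade-off is that a regex such as foo$bar would still be read as a reference to a variable named bar, so such values would need the $$ escape.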

Environment

  • graphrag version: 3.0.9
  • Python: 3.12
