Fix parsing of agent search space file #1293

Merged
alexdewar merged 10 commits into main from fix-agent-search-space-all
May 19, 2026

Conversation

@alexdewar
Member

Description

It seems like the parsing and validation for the agent search space file was pretty broken and we just didn't notice because we don't actually use it anywhere! The default values that you get if the file doesn't exist seem to work fine, though.

The problem with parsing is that if you specify all in the search_space field, it's treated as meaning all processes in the simulation, rather than just the ones that apply to this entry. This causes a panic later on when we try to get the flow for a specific commodity from the process, which it may not produce (#1290). The fix for this is to limit the search space to processes which are actually feasible options, i.e. they produce the given commodity in the given year.
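The fix described above can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: the `Process` struct, the `produces` helper, the `resolve_search_space` function and the semicolon delimiter are all assumptions made for the example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, simplified stand-in for a process: the set of
/// (commodity, year) pairs it produces.
struct Process {
    outputs: BTreeSet<(String, u32)>,
}

impl Process {
    /// Whether this process produces `commodity` in `year`.
    fn produces(&self, commodity: &str, year: u32) -> bool {
        self.outputs.contains(&(commodity.to_string(), year))
    }
}

/// Resolve a `search_space` value for a single (commodity, year) entry.
///
/// `all` expands only to feasible producers; an explicit list is rejected
/// if it names an unknown process or one that is not a feasible producer.
fn resolve_search_space(
    search_space: &str,
    commodity: &str,
    year: u32,
    processes: &BTreeMap<String, Process>,
) -> Result<Vec<String>, String> {
    if search_space.eq_ignore_ascii_case("all") {
        // "all" means every process that can produce this commodity in
        // this year, not every process in the simulation.
        return Ok(processes
            .iter()
            .filter(|(_, process)| process.produces(commodity, year))
            .map(|(id, _)| id.clone())
            .collect());
    }

    // Assumption: explicit process IDs are separated by semicolons.
    search_space
        .split(';')
        .map(str::trim)
        .map(|id| -> Result<String, String> {
            let process = processes
                .get(id)
                .ok_or_else(|| format!("Unknown process ID: {id}"))?;
            if process.produces(commodity, year) {
                Ok(id.to_string())
            } else {
                Err(format!("Process {id} does not produce {commodity} in {year}"))
            }
        })
        .collect()
}
```

With this shape, a process that never outputs the commodity cannot end up in the search space, so a later lookup of its flow for that commodity cannot fail.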

The current validation of the input file is very basic (e.g. just checking that IDs are relevant) so there are many ways to get MUSE2 to panic by putting in a commodity that an agent isn't responsible for or processes that don't produce it etc. I've essentially rewritten this code to handle all these cases correctly and I've added a bunch of tests.

I noticed that we're calling this file agent_search_space.csv but the other agent-related files have plurals at the end (agent_commodity_portions.csv, agent_objectives.csv). The inconsistency is a bit annoying, so I figured we should change it now, before too many people are actually using MUSE2. Hopefully it won't break too many models.

Unrelated change: The Agent::iter_possible_producers_of method was checking whether processes in the agent's search space actually produce the given commodity in the given year, but this is unnecessary, as all the processes in the search space should have been checked for this anyway. So I've changed it to just check the region and renamed it.

Fixes #1290.

Type of change

  • Bug fix (non-breaking change to fix an issue)
  • New feature (non-breaking change to add functionality)
  • Refactoring (non-breaking, non-functional change to improve maintainability)
  • Optimization (non-breaking change to speed up the code)
  • Breaking change (whatever its nature)
  • Documentation (improve or add documentation)

Key checklist

  • All tests pass: $ cargo test
  • The documentation builds and looks OK: $ cargo doc
  • Update release notes for the latest release if this PR adds a new feature or fixes a bug
    present in the previous release

Further checks

  • Code is commented, particularly in hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

Copilot AI review requested due to automatic review settings May 18, 2026 16:07
Contributor

Copilot AI left a comment

Pull request overview

Fixes a panic in agent search-space parsing (issue #1290) when the search_space field is set to all (or empty), and tightens validation so unsupported combinations of agent/commodity/process/year are rejected up-front. The PR also renames the input file from agent_search_space.csv to agent_search_spaces.csv for consistency with other agent CSVs, and simplifies Agent::iter_possible_producers_of (renamed to iter_search_space) since the commodity-production check is now enforced at search-space construction time. An unrelated cleanup removes a redundant base-file existence check in FilePatch::apply.

Changes:

  • Restrict the all search space to processes that actually produce the relevant commodity in the relevant region/year, and add validation for unknown IDs, overlapping entries, and agents not responsible for the commodity.
  • Rename agent_search_space.csv → agent_search_spaces.csv (with matching schema and loader symbols) and simplify Agent::iter_search_space to filter only by region.
  • Add a suite of tests using FilePatch to cover both happy paths and validation failures.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| src/input/agent/search_space.rs | Rewrites parsing/validation; introduces a producers map keyed by (agent, commodity, year); adds comprehensive tests. |
| src/input/agent.rs | Updates import to the renamed read_agent_search_spaces function. |
| src/agent.rs | Renames iter_possible_producers_of to iter_search_space and removes redundant flow-direction filtering. |
| src/simulation/investment.rs | Updates call site to the renamed method. |
| schemas/input/agent_search_spaces.yaml | New schema file documenting the renamed CSV. |
| src/patch.rs | Removes redundant base-file existence check and its test. |

Comments suppressed due to low confidence (1)

src/input/agent/search_space.rs:131

  • The docstring still refers to the old filename agent_search_space.csv. Since the file was renamed to agent_search_spaces.csv (and the constant updated accordingly), this comment is now out-of-date.
/// Read agent search space info from the `agent_search_space.csv` file.

Comment thread src/patch.rs Outdated
Comment thread src/input/agent/search_space.rs
Comment thread src/input/agent/search_space.rs Outdated
Comment thread src/input/agent.rs
Comment thread src/input/agent/search_space.rs Outdated
Comment thread src/input/agent/search_space.rs Outdated
agent_id: &AgentID,
commodity_id: &CommodityID,
years: &[u32],
processes: &ProcessMap,
Comment thread src/agent.rs Outdated
@codecov

codecov Bot commented May 18, 2026

Codecov Report

❌ Patch coverage is 98.14815% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.73%. Comparing base (0af4399) to head (0b56234).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
src/input/agent/search_space.rs 98.06% 1 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1293      +/-   ##
==========================================
+ Coverage   89.46%   89.73%   +0.27%     
==========================================
  Files          57       57              
  Lines        8361     8418      +57     
  Branches     8361     8418      +57     
==========================================
+ Hits         7480     7554      +74     
+ Misses        581      564      -17     
  Partials      300      300              

alexdewar added 3 commits May 18, 2026 17:20
This method was pointlessly checking whether the relevant processes in the agent's search space have the given commodity as an output in the given region and year, when we already know that the flow must be an output for any of these processes. It's enough to check that the region matches.
@alexdewar alexdewar force-pushed the fix-agent-search-space-all branch from 84c7873 to c6a3d42 Compare May 18, 2026 16:25
Collaborator

@tsmbland tsmbland left a comment

Looks good!

Do you think it would be cleaner to add a region key to AgentSearchSpaceMap? I.e. the current approach is to add processes to the search space if they exist in any region that the agent operates in, then filter based on region at investment time. Versus assembling the region-level search space at read time, then it's just a simple retrieval at investment time. Could make the code a bit cleaner, at the cost of some redundancy in the data structure.

Probably not worth it for now, but I wonder if in the future it could be limiting that we don't allow users to specify different search spaces for different regions...

@alexdewar
Member Author

> Do you think it would be cleaner to add a region key to AgentSearchSpaceMap? I.e. the current approach is to add processes to the search space if they exist in any region that the agent operates in, then filter based on region at investment time. Versus assembling the region-level search space at read time, then it's just a simple retrieval at investment time. Could make the code a bit cleaner, at the cost of some redundancy in the data structure.
>
> Probably not worth it for now, but I wonder if in the future it could be limiting that we don't allow users to specify different search spaces for different regions...

Good idea! I just opened an issue about allowing users to vary agent properties by region (#1294) because it seems odd to not allow it. Anyway, breaking it down by region now will make that easier when we get round to it, so I'll just do it now.
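For readers following the discussion, here is a minimal sketch of what a region-keyed search space might look like. The key layout, the `Process` struct and both function names are hypothetical and chosen for illustration; the actual `AgentSearchSpaceMap` in MUSE2 may be structured differently.

```rust
use std::collections::HashMap;

/// Hypothetical key: (commodity ID, region ID, year).
type SearchSpaceKey = (String, String, u32);

/// Hypothetical region-level search space: key -> process IDs.
type SearchSpaceMap = HashMap<SearchSpaceKey, Vec<String>>;

/// Hypothetical, simplified process: an ID plus the regions it operates in.
struct Process {
    id: String,
    regions: Vec<String>,
}

/// Build the region-level search space at read time.
///
/// `candidates` is assumed to already contain only processes that produce
/// `commodity` in `year`; here they are only split up by region.
fn build_search_space(
    commodity: &str,
    year: u32,
    agent_regions: &[String],
    candidates: &[Process],
) -> SearchSpaceMap {
    let mut map = SearchSpaceMap::new();
    for region in agent_regions {
        let ids: Vec<String> = candidates
            .iter()
            .filter(|process| process.regions.contains(region))
            .map(|process| process.id.clone())
            .collect();
        map.insert((commodity.to_string(), region.clone(), year), ids);
    }
    map
}

/// At investment time the search space is a plain lookup with no filtering.
fn search_space_for<'a>(
    map: &'a SearchSpaceMap,
    commodity: &str,
    region: &str,
    year: u32,
) -> &'a [String] {
    map.get(&(commodity.to_string(), region.to_string(), year))
        .map(Vec::as_slice)
        .unwrap_or(&[])
}
```

The trade-off is the one raised in the review: a process that operates in several regions is stored once per region, in exchange for a simpler retrieval path.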

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.

Comment thread src/input/agent/search_space.rs
Comment thread src/input/agent/search_space.rs
Comment thread src/input/agent/search_space.rs Outdated
Comment thread src/agent.rs Outdated
@alexdewar alexdewar force-pushed the fix-agent-search-space-all branch from 3884e45 to be05852 Compare May 19, 2026 11:28
@alexdewar alexdewar requested a review from Copilot May 19, 2026 11:28
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

src/patch.rs:230

  • FilePatch::apply no longer validates that the base file exists when replacement_content is set. This allows patches to silently create new files on typoed filenames, which can make model patching failures harder to detect. If creating new files is intended, consider making it explicit (e.g., an allow_create flag) and/or adding a targeted test asserting the intended behavior.
    /// Apply this patch to a base model and return the modified CSV as a string.
    fn apply(&self, base_model_dir: &Path) -> Result<String> {
        // Read and validate the base file path
        let base_path = base_model_dir.join(&self.filename);

        // If this patch is a full replacement, return the replacement content.
        if let Some(content) = &self.replacement_content {
            return Ok(content.clone());
        }

        // Read the base file to string
        let base = fs::read_to_string(&base_path)?;

Comment thread src/agent.rs
Comment on lines +92 to +96
ensure!(!search_space.is_empty(), "No processes provided");

let regions_and_years = iproduct!(agent.regions.iter(), years.iter().copied());
if search_space.eq_ignore_ascii_case("all") {
// Iterate over all possible producers for each year
Member Author

Nah.

I forgot to mention this in the PR description, but I changed this behaviour on purpose. We used to do the same thing in other places -- treat an empty string as synonymous with all -- but decided it was better to be explicit. So I've changed this code to do the same thing.
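As a rough illustration of the explicit behaviour described here, the sketch below rejects an empty field instead of treating it as all. The `SearchSpaceSpec` enum and `parse_search_space` function are hypothetical names, and both the whitespace trimming and the semicolon delimiter are assumptions rather than something taken from the PR.

```rust
/// Hypothetical parsed form of the `search_space` field.
#[derive(Debug, PartialEq)]
enum SearchSpaceSpec {
    /// The user explicitly wrote `all` (in any letter case).
    All,
    /// The user listed specific process IDs.
    Processes(Vec<String>),
}

/// Parse the raw field, treating an empty value as an error rather than
/// as a synonym for `all`.
fn parse_search_space(raw: &str) -> Result<SearchSpaceSpec, String> {
    // Assumption: surrounding whitespace is not significant.
    let raw = raw.trim();
    if raw.is_empty() {
        return Err("No processes provided".to_string());
    }
    if raw.eq_ignore_ascii_case("all") {
        return Ok(SearchSpaceSpec::All);
    }

    // Assumption: process IDs are separated by semicolons.
    let ids = raw.split(';').map(|id| id.trim().to_string()).collect();
    Ok(SearchSpaceSpec::Processes(ids))
}
```

Keeping `All` as its own variant makes the user's intent visible in the parsed data, so a blank cell caused by a typo or a missing column surfaces as an error at load time.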

@alexdewar alexdewar merged commit 09a2236 into main May 19, 2026
8 checks passed
@alexdewar alexdewar deleted the fix-agent-search-space-all branch May 19, 2026 20:51
Development

Successfully merging this pull request may close these issues.

Setting search_space to all in agent_search_space.csv causes panic

3 participants