Fix parsing of agent search space file#1293
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes a panic in agent search-space parsing (issue #1290) when the search_space field is set to all (or empty), and tightens validation so unsupported combinations of agent/commodity/process/year are rejected up-front. The PR also renames the input file from agent_search_space.csv to agent_search_spaces.csv for consistency with other agent CSVs, and simplifies Agent::iter_possible_producers_of (renamed to iter_search_space) since the commodity-production check is now enforced at search-space construction time. An unrelated cleanup removes a redundant base-file existence check in FilePatch::apply.
Changes:
- Restrict the
allsearch space to processes that actually produce the relevant commodity in the relevant region/year, and add validation for unknown IDs, overlapping entries, and agents not responsible for the commodity. - Rename
agent_search_space.csv→agent_search_spaces.csv(with matching schema and loader symbols) and simplifyAgent::iter_search_spaceto filter only by region. - Add a suite of tests using
FilePatchto cover both happy paths and validation failures.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| src/input/agent/search_space.rs | Rewrites parsing/validation; introduces a producers map keyed by (agent, commodity, year); adds comprehensive tests. |
| src/input/agent.rs | Updates import to the renamed read_agent_search_spaces function. |
| src/agent.rs | Renames iter_possible_producers_of to iter_search_space and removes redundant flow-direction filtering. |
| src/simulation/investment.rs | Updates call site to the renamed method. |
| schemas/input/agent_search_spaces.yaml | New schema file documenting the renamed CSV. |
| src/patch.rs | Removes redundant base-file existence check and its test. |
Comments suppressed due to low confidence (1)
src/input/agent/search_space.rs:131
- The docstring still refers to the old filename
agent_search_space.csv. Since the file was renamed toagent_search_spaces.csv(and the constant updated accordingly), this comment is now out-of-date.
/// Read agent search space info from the `agent_search_space.csv` file.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| agent_id: &AgentID, | ||
| commodity_id: &CommodityID, | ||
| years: &[u32], | ||
| processes: &ProcessMap, |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1293 +/- ##
==========================================
+ Coverage 89.46% 89.73% +0.27%
==========================================
Files 57 57
Lines 8361 8418 +57
Branches 8361 8418 +57
==========================================
+ Hits 7480 7554 +74
+ Misses 581 564 -17
Partials 300 300 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This method was pointlessly checking whether the relevant processes in the agent's search space have the given commodity as an output in the given region and year, when we already know that the flow must be an output for any of these processes. It's enough to check that the region matches.
9e17a66 to
84c7873
Compare
…istency Also update var names etc. where appropriate.
84c7873 to
c6a3d42
Compare
tsmbland
left a comment
There was a problem hiding this comment.
Looks good!
Do you think it would be cleaner to add a region key to AgentSearchSpaceMap? I.e. the current approach is to add processes to the search space if they exist in any region that the agent operates in, then filter based on region at investment time. Versus assembling the region-level search space at read time, then it's just a simple retrieval at investment time. Could make the code a bit cleaner, at the cost of some redundancy in the data structure.
Probably not worth it for now, but I wonder if in the future it could be limiting that we don't allow users to specify different search spaces for different regions...
Good idea! I just opened an issue about allowing users to vary agent properties by region (#1294) because it seems odd to not allow it. Anyway, breaking it down by region now will make that easier when we get round to it, so I'll just do it now. |
3884e45 to
be05852
Compare
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
src/patch.rs:230
FilePatch::applyno longer validates that the base file exists whenreplacement_contentis set. This allows patches to silently create new files on typoed filenames, which can make model patching failures harder to detect. If creating new files is intended, consider making it explicit (e.g., anallow_createflag) and/or adding a targeted test asserting the intended behavior.
/// Apply this patch to a base model and return the modified CSV as a string.
fn apply(&self, base_model_dir: &Path) -> Result<String> {
// Read and validate the base file path
let base_path = base_model_dir.join(&self.filename);
// If this patch is a full replacement, return the replacement content.
if let Some(content) = &self.replacement_content {
return Ok(content.clone());
}
// Read the base file to string
let base = fs::read_to_string(&base_path)?;
| ensure!(!search_space.is_empty(), "No processes provided"); | ||
|
|
||
| let regions_and_years = iproduct!(agent.regions.iter(), years.iter().copied()); | ||
| if search_space.eq_ignore_ascii_case("all") { | ||
| // Iterate over all possible producers for each year |
There was a problem hiding this comment.
Nah.
I forgot to mention this in the PR description, but I changed this behaviour on purpose. We used to do the same thing in other places -- treat an empty string as synonymous with all -- but decided it was better to be explicit. So I've changed this code to do the same thing.
Description
It seems like the parsing and validation for the agent search space file was pretty broken and we just didn't notice because we don't actually use it anywhere! The default values that you get if the file doesn't exist seem to work fine, though.
The problem with parsing is that if you specify
allin thesearch_spacefield, it's treated as meaning all processes in the simulation, rather than just the ones that apply to this entry. This causes a panic later on when we try to get the flow for a specific commodity from the process, which it may not produce (#1290). The fix for this is to limit the search space to processes which are actually feasible options, i.e. they produce the given commodity in the given year.The current validation of the input file is very basic (e.g. just checking that IDs are relevant) so there are many ways to get MUSE2 to panic by putting in a commodity that an agent isn't responsible for or processes that don't produce it etc. I've essentially rewritten this code to handle all these cases correctly and I've added a bunch of tests.
I noticed that we're calling this file
agent_search_space.csvbut the other agent-related files have plurals at the end (agent_commodity_portions.csv,agent_objectives.csv). The inconsistency is a bit annoying, so I figured we should change it now, before too many people are actually using MUSE2. Hopefully it won't break too many models.Unrelated change: The
Agent::iter_possible_producers_ofmethod was checking whether processes in the agent's search space actually produce the given commodity in the given year, but this is unncessary, as all the processes in the search space should have been checked for this anyway. So I've changed it to just check the region and renamed it.Fixes #1290.
Type of change
Key checklist
$ cargo test$ cargo docpresent in the previous release
Further checks