Skip to content

feat: dp test stress#5588

Queued
OutisLi wants to merge 2 commits into
deepmodeling:masterfrom
OutisLi:pr/teststress
Queued

feat: dp test stress#5588
OutisLi wants to merge 2 commits into
deepmodeling:masterfrom
OutisLi:pr/teststress

Conversation

@OutisLi

@OutisLi OutisLi commented Jun 25, 2026

Copy link
Copy Markdown
Collaborator

Summary by CodeRabbit

  • New Features

    • Energy test outputs now include stress (sigma) detail files and stress MAE/RMSE metrics when periodic boundary conditions and virial data are available.
    • Aggregated energy test summaries now report Stress MAE/RMSE alongside existing energy/virial metrics.
  • Bug Fixes

    • Stress values are now derived consistently from virial and per-frame cell volume, with matching accuracy reporting.
  • Tests

    • Expanded dp test coverage to validate stress/virial consistency and stress metric correctness.

@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

Energy testing now computes stress from virial and cell volume for PBC cases, writes stress detail files, reports stress MAE/RMSE in per-test and aggregate summaries, and adds PT and TF regression tests for the new outputs.

Changes

Energy-test stress support

Layer / File(s) Summary
Stress detail output
deepmd/entrypoints/test.py
_write_energy_test_details accepts stress arrays and writes a .s.out detail file when both reference and prediction stress values are present.
Stress metrics and logging
deepmd/entrypoints/test.py
test_ener computes stress from virial and cell volume, stores stress MAE/RMSE, passes stress data to the detail writer, and logs per-test and aggregate stress summaries.
PyTorch test harness updates
source/tests/pt/test_dp_test.py
PyTorch dp_test helpers switch TorchScript temp files to mkstemp/os.close() and unlink them after use.
PyTorch stress regression
source/tests/pt/test_dp_test.py
TestDPTestStress prepares a virial-backed temporary dataset, runs dp_test_ener, checks .s.out against .v.out / volume, and verifies the reported stress MAE/RMSE.
TensorFlow stress regression
source/tests/tf/test_dp_test.py
TestDPTestEner seeds virial.npy files and asserts the generated stress output matches pred_v / volume.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 17.65% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title is concise and accurately reflects the main change: adding stress handling to dp test.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@source/tests/pt/test_dp_test.py`:
- Around line 411-430: The temporary model file handle in the test setup is left
open after tempfile.NamedTemporaryFile(delete=False), which breaks Windows
compatibility for torch.jit.save and later os.unlink. In the test around
dp_test_ener, close the NamedTemporaryFile handle immediately after creating
tmp_model and before calling torch.jit.save, then continue using tmp_model.name
for DeepEval and cleanup so the file can be written and deleted reliably on all
platforms.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: e22fbaf7-cd5a-43b7-b046-8b88ff0292bf

📥 Commits

Reviewing files that changed from the base of the PR and between 5733301 and 6cca3c8.

📒 Files selected for processing (3)
  • deepmd/entrypoints/test.py
  • source/tests/pt/test_dp_test.py
  • source/tests/tf/test_dp_test.py

Comment thread source/tests/pt/test_dp_test.py Outdated

@njzjz-bot njzjz-bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding stress reporting to dp test — this is a useful output to have. I found a couple of things that I think should be addressed before merging:

  1. Stress sign convention looks inconsistent with the existing DeePMD API. The new code currently writes

    prediction_stress = virial / volume
    reference_stress = test_data["virial"][:numb_test] / volume

    but deepmd/calculator.py converts virial to ASE stress as stress = -virial / volume (with the usual tensile-positive convention). The sign will cancel out in MAE/RMSE because both prediction and reference are transformed the same way, but the new .s.out detail file is user-facing and would report the opposite sign from the existing stress API. If this file is meant to be mechanical stress, please use -virial / volume and update the PT/TF tests accordingly; if it is intentionally virial density, I would avoid naming it stress.

  2. Please close the temporary model file before saving in the new PT test. tempfile.NamedTemporaryFile(delete=False, suffix=".pth") leaves the file handle open, and torch.jit.save(model, tmp_model.name) can fail on Windows or make later cleanup unreliable. Something like tmp_model.close() immediately after creation (or using mkstemp/TemporaryDirectory and closing the fd) should be enough.

CI note: at the time of review, the C++ test aggregate is failing because one C++ job exited with 143, and several Python jobs are still pending.

Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)

@codecov

codecov Bot commented Jun 25, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 94.73684% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 82.29%. Comparing base (5733301) to head (069c328).
⚠️ Report is 3 commits behind head on master.

Files with missing lines Patch % Lines
deepmd/entrypoints/test.py 94.73% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5588      +/-   ##
==========================================
+ Coverage   82.27%   82.29%   +0.01%     
==========================================
  Files         887      887              
  Lines      100331   100452     +121     
  Branches     4060     4059       -1     
==========================================
+ Hits        82550    82667     +117     
+ Misses      16320    16318       -2     
- Partials     1461     1467       +6     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
source/tests/pt/test_dp_test.py (2)

54-73: 🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Make temp-model cleanup exception-safe.

If torch.jit.save, dp_test, DeepEval, or dp_test_ener raises, os.unlink(tmp_model_path) is skipped and the .pth file leaks into the temp directory. This pattern is repeated at every new temp-model call site here, so it should be wrapped in try/finally (or a small helper/context manager) once and reused.

Suggested pattern
 tmp_fd, tmp_model_path = tempfile.mkstemp(suffix=".pth")
 os.close(tmp_fd)
- torch.jit.save(model, tmp_model_path)
- dp_test(...)
- os.unlink(tmp_model_path)
+ try:
+     torch.jit.save(model, tmp_model_path)
+     dp_test(...)
+ finally:
+     os.unlink(tmp_model_path)

Also applies to: 202-217, 258-273, 327-374, 415-435, 486-500

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/tests/pt/test_dp_test.py` around lines 54 - 73, The temporary model
cleanup in the test helpers is not exception-safe, so a failed torch.jit.save,
dp_test, DeepEval, or dp_test_ener call can leave .pth files behind. Update the
repeated temp-model call sites in test_dp_test.py to use a try/finally block or
a shared helper/context manager so os.unlink(tmp_model_path) always runs after
torch.jit.save and the downstream test invocation. Apply the same pattern
consistently across the dp_test, DeepEval, and dp_test_ener sections to remove
the duplicated leak-prone cleanup logic.

382-382: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Update the stress sign in the docstring.

The class docstring still documents sigma = virial / volume, but the test now validates -virial / volume. Leaving the sign convention stale here will mislead the next person touching this regression.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/tests/pt/test_dp_test.py` at line 382, Update the docstring for the
stress regression test to match the current sign convention: it should describe
the ``dp test`` stress output as ``-virial / volume`` instead of ``virial /
volume``. Locate the test by its stress-verification docstring in the ``dp
test`` regression case and adjust the wording so it stays aligned with the
assertion logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@source/tests/pt/test_dp_test.py`:
- Around line 54-73: The temporary model cleanup in the test helpers is not
exception-safe, so a failed torch.jit.save, dp_test, DeepEval, or dp_test_ener
call can leave .pth files behind. Update the repeated temp-model call sites in
test_dp_test.py to use a try/finally block or a shared helper/context manager so
os.unlink(tmp_model_path) always runs after torch.jit.save and the downstream
test invocation. Apply the same pattern consistently across the dp_test,
DeepEval, and dp_test_ener sections to remove the duplicated leak-prone cleanup
logic.
- Line 382: Update the docstring for the stress regression test to match the
current sign convention: it should describe the ``dp test`` stress output as
``-virial / volume`` instead of ``virial / volume``. Locate the test by its
stress-verification docstring in the ``dp test`` regression case and adjust the
wording so it stays aligned with the assertion logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 82ab4add-838f-4120-81fa-49bbeac8a153

📥 Commits

Reviewing files that changed from the base of the PR and between 6cca3c8 and 069c328.

📒 Files selected for processing (3)
  • deepmd/entrypoints/test.py
  • source/tests/pt/test_dp_test.py
  • source/tests/tf/test_dp_test.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • source/tests/tf/test_dp_test.py
  • deepmd/entrypoints/test.py

@OutisLi OutisLi added this pull request to the merge queue Jun 26, 2026

@njzjz-bot njzjz-bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. The core implementation now looks consistent with the tensile-positive stress convention (stress = -virial / volume), and the current CI matrix is green. I have two remaining comments, both relatively small but worth cleaning up for maintainability:

  1. The PT stress regression docstring is still stale. source/tests/pt/test_dp_test.py still says sigma = virial / volume, while the implementation and assertions now use -virial / volume. Please update that docstring so future readers do not infer the wrong sign convention from the test description.

  2. The temporary model handling was improved by switching away from an open NamedTemporaryFile, but the cleanup is still not exception-safe. Several call sites now follow the pattern:

tmp_fd, tmp_model_path = tempfile.mkstemp(suffix=".pth")
os.close(tmp_fd)
torch.jit.save(model, tmp_model_path)
...
os.unlink(tmp_model_path)

If torch.jit.save, dp_test, DeepEval, dp_test_ener, or an assertion raises, the temporary .pth file is leaked. Since this PR already touches these call sites, it would be cleaner to wrap the model use in try/finally, or better, factor a small helper/context manager for creating and cleaning up the scripted model path. That would also reduce the repeated boilerplate across the PT tests.

The stress detail output and metrics path itself look reasonable after the sign correction.

Authored by OpenClaw 2026.6.8 (844f405) (model: custom-chat-jinzhezeng-group/gpt-5.5)

@njzjz-bot njzjz-bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up inline review. The implementation path now looks consistent with the stress sign convention and CI is green; I only left the remaining maintainability/test-hygiene items inline.

Authored by OpenClaw 2026.6.8 (844f405) (model: custom-chat-jinzhezeng-group/gpt-5.5)

torch.jit.save(model, tmp_model.name)
tmp_fd, tmp_model_path = tempfile.mkstemp(suffix=".pth")
os.close(tmp_fd)
torch.jit.save(model, tmp_model_path)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the Windows issue from the open NamedTemporaryFile, but the cleanup is still not exception-safe: if torch.jit.save, dp_test, DeepEval, dp_test_ener, or an assertion raises, the later os.unlink(tmp_model_path) is skipped and the .pth file leaks. Since this pattern is now repeated in several test cases, please wrap the use site in try/finally or factor a small helper/context manager that saves the scripted model and always unlinks it.

The same pattern appears at the other new mkstemp call sites in this file.



class TestDPTestStress(DPTest, unittest.TestCase):
"""Verify the stress output of ``dp test`` (sigma = virial / volume, eV/Å^3)."""

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring is now stale after the sign-convention fix. The implementation and assertions below validate sigma = -virial / volume, so the class description should say the same thing; otherwise the regression test itself documents the wrong convention for future readers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants