Refine complex buffer copies and add round-trip tests for module_pw#7412
Refine complex buffer copies and add round-trip tests for module_pw#7412Aunixt wants to merge 7 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Refactors repeated complex-buffer copy loops in the plane-wave transform code into shared helper templates (detail::copy_complex_buffer and detail::copy_complex_buffer_parallel) that operate on the underlying scalar stream to aid vectorization. Adds round-trip serial unit tests for PW_Basis and PW_Basis_K.
Changes:
- Introduced
detail::copy_complex_buffer[_parallel]helpers inpw_gatherscatter.husing__restrict__and ivdep hints over the interleaved real/imag scalar stream. - Replaced inline copy loops in
pw_transform.cpp,pw_transform_k.cpp, and gather/scatter routines with calls to the new helpers. - Added
ComplexTransformRoundTripserial tests forPW_BasisandPW_Basis_K.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| source/source_basis/module_pw/pw_gatherscatter.h | Adds shared copy helpers and refactors gather/scatter loops to use them. |
| source/source_basis/module_pw/pw_transform.cpp | Replaces manual OpenMP copy loops with copy_complex_buffer_parallel. |
| source/source_basis/module_pw/pw_transform_k.cpp | Same refactor applied to k-point variant. |
| source/source_basis/module_pw/test_serial/pw_basis_test.cpp | New round-trip test for recip2real/real2recip. |
| source/source_basis/module_pw/test_serial/pw_basis_k_test.cpp | New round-trip test for the k-point variant. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for (int ig = 0; ig < pwb.npw; ++ig) | ||
| { | ||
| EXPECT_NEAR(recip_in[ig].real(), recip_out[ig].real(), 1e-10); | ||
| EXPECT_NEAR(recip_in[ig].imag(), recip_out[ig].imag(), 1e-10); | ||
| } |
| std::vector<std::complex<double>> recip_in(basis_k.npwk[0]); | ||
| std::vector<std::complex<double>> real_space(basis_k.nrxx); | ||
| std::vector<std::complex<double>> recip_out(basis_k.npwk[0]); |
|
This refactoring idea is quite interesting. Feel free to move forward with your implementation. |
|
Reply to the comments if you have solved them or known how to answer. |
Qianruipku
left a comment
There was a problem hiding this comment.
Please provide a comparison of the execution time before and after optimization.
| @@ -0,0 +1,284 @@ | |||
| # `feat/simd` 优化过程报告 | |||
There was a problem hiding this comment.
Please do not include report revisions in the source code. If necessary, you can post them on GitHub instead.
Reminder
Linked Issue
Fix #...
Unit Tests and/or Case Tests for my changes
What's changed?
This pull request refactors the copying of complex buffers in the plane-wave basis module to use new helper functions that improve performance and code clarity, especially with respect to vectorization and parallelization. It also adds comprehensive round-trip tests for complex-to-complex plane-wave transforms to ensure correctness. The most important changes are summarized below.
Performance and Code Quality Improvements
detail::copy_complex_bufferanddetail::copy_complex_buffer_parallelhelper functions inpw_gatherscatter.hto efficiently copy complex buffers, enabling better compiler vectorization and OpenMP parallelization. All manual loops copying complex arrays have been replaced with calls to these helpers.pw_gatherscatter.h,pw_transform.cpp, andpw_transform_k.cppto use the new helper functions for copying complex buffers, replacing explicit for-loops and OpenMP pragmas with the new, more maintainable approach.Testing and Validation
PW_BasisandPW_Basis_Kclasses to verify that a reciprocal-to-real followed by a real-to-reciprocal transform accurately recovers the original complex data, ensuring the correctness of the new buffer copy implementation.These changes collectively improve the performance, maintainability, and reliability of the plane-wave basis transformation routines.
Any changes of core modules? (ignore if not applicable)