Rest of GPR Library by vtommasini · Pull Request #8 · sxs-collaboration/SimulationSupport

vtommasini · 2026-05-12T22:31:56Z

Complete GPR library with the remaining functions beyond standalone ML utilities pushed previously. Added the functions to run the full pipeline, plot kernel eigenvalue decay, leave-one-out cross-validation, and the functions to restore saved models from disk.

Copilot

Pull request overview

Adds a new Gaussian Process Regression (GPR) module to the SimulationSupport package, aiming to support Post-Newtonian initial-parameter generation/corrections plus an end-to-end GPR training/prediction/cross-validation workflow, along with unit tests and new runtime dependencies.

Changes:

Added SimulationSupport.gpr with GPytorch-based training/prediction utilities, PN helper functions, cross-validation, plotting, and checkpoint loading.
Added a large new unit test module covering normalization, training/prediction, parsing, correction application, LOO CV, plotting residuals, and checkpoint restore.
Updated pyproject.toml to include ML/plotting dependencies.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
`src/SimulationSupport/gpr.py`	New GPR/PN library implementation (training, prediction, CV, plotting, checkpoint restore).
`tests/test_gpr.py`	New unit tests for the GPR module and related utilities.
`pyproject.toml`	Adds new dependencies required by the new GPR functionality.

Comments suppressed due to low confidence (7)

src/SimulationSupport/gpr.py:44

normalize_data always reshapes X and Y with .reshape(-1, 1), which breaks multidimensional inputs (N, D) and contradicts the docstring claim that multi-D X is passed through unchanged. Handle X.ndim == 1/Y.ndim == 1 separately and only reshape in that case (otherwise fit/transform the 2D arrays directly).

    # Reshape X and Y into 2D arrays before fitting
    # as StandardScaler expects shape (N, 1) for 1D inputs
    X_normalized = scaler_X.fit_transform(X.reshape(-1, 1))
    Y_normalized = scaler_Y.fit_transform(Y.reshape(-1, 1))

src/SimulationSupport/gpr.py:680

predict_with_gpr_model does not reshape 1D raw_X to (N, 1) before converting to a tensor, while train_gpr_model reshapes 1D inputs during training. If callers pass a 1D array at prediction time, this can produce a shape mismatch in the GP model (kernels/LinearMean expect 2D inputs). Reshape 1D inputs consistently in both training and prediction paths.

    # Normalize the input using the model's stored parameters
    normalized_X = (raw_X - model.input_mean) / model.input_std
    X_tensor = torch.from_numpy(normalized_X).float()

src/SimulationSupport/gpr.py:813

train_model_and_eigenvalue_analysis defaults gpr=None but then unconditionally calls gpr.train_gpr_model, gpr.normalize_data, and gpr.predict_with_gpr_model, which will raise an AttributeError unless the caller always passes a module object. Either default gpr to the current module (or remove the parameter) and call the local functions directly, or raise a clear error if gpr is not provided.

    # Train GPR
    model, likelihood = gpr.train_gpr_model(X, Y)

    # Normalize values - needed for eigenvalue analysis
    X_normalized, Y_normalized, scaler_X, scaler_Y = gpr.normalize_data(X, Y)

tests/test_gpr.py:223

setUpClass writes to a hard-coded path under the repo (Path("SimulationSupport") / "gpr_tests"). Tests should avoid creating/cleaning directories in the working tree because failures can leave artifacts or clash with user checkouts. Prefer using tempfile.TemporaryDirectory() (like other tests here) and keep all filesystem writes inside it.

        cls.test_dir = Path("SimulationSupport") / "gpr_tests"
        shutil.rmtree(cls.test_dir, ignore_errors=True)
        cls.test_dir.mkdir(parents=True, exist_ok=True)
        cls.X, cls.Y = cls.make_data(n=20)

src/SimulationSupport/gpr.py:625

The training-loop comment states that “early stopping terminates training once the loss plateaus”, but the implementation always runs the full 200 iterations (it only tracks best_state). Either implement actual early stopping (break when no improvement for N steps) or adjust the comment/docstring so it doesn’t claim behavior that isn’t present.

    # Training loop for 200 iterations; 200 was chosen empirically for this dataset.
    # In practice, early stopping terminates training once the loss
    # plateaus, so the exact value isn't critical. If utilizing a larger
    # dataset where more iterations may be needed, increase this value if the loss
    # is still decreasing at iteration 200.

src/SimulationSupport/gpr.py:1076

save_gpr_corrected has a malformed docstring opener (""" ") which leaves a stray quote at the start of the docstring. Clean this up to a standard triple-quoted docstring to avoid confusing generated docs/help output.

):
    """ "
    Write and save GPR corrected runs to a text file.
    Each line contains RunID, ZwickyDays, q, chiA, chiB, D0, Omega0, adot0.

src/SimulationSupport/gpr.py:567

train_gpr_model normalizes by input_std / output_std computed from the training data without guarding against zeros. If a feature column (or the target) is constant, this will divide by zero and propagate NaNs through training/prediction. Add a small epsilon or use a safer scaling strategy (e.g., replace zeros in std with 1.0, similar to StandardScaler).

    input_mean = raw_X.mean(axis=0)
    input_std = raw_X.std(axis=0)
    output_mean = raw_Y.mean()
    output_std = raw_Y.std()

    # Normalize data column-wise; needed for all dimension > 1
    normalized_X = (raw_X - input_mean) / input_std
    normalized_Y = (raw_Y - output_mean) / output_std

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

    "scipy",
+    "matplotlib",
+    "torch",
+    "gpytorch"


nilsvu

Please also rebase on the merged PR #4 .

nilsvu

Please remove these other eccentricity-specific functions. Also, we have quite a few diagnostic functions now that do validations and plots (eigenvalue analysis, LOO analysis). Let's put these in a separate file to keep the code manageable. I suggest you move gpr.py -> grp/core.py and put the diagnostic functions in a new file gpr/diagnostics.py. You can also add gpr/__init__.py with the code from .core import *. This way people can still just do from SimulationSupport.gpr import GPRegressionModel without having to import from core.

Copilot AI review requested due to automatic review settings May 12, 2026 22:31

Copilot started reviewing on behalf of vtommasini May 12, 2026 22:32 View session

Copilot AI reviewed May 12, 2026

View reviewed changes

Comment thread src/SimulationSupport/gpr.py Outdated

Comment thread pyproject.toml

"scipy",

"matplotlib",

"torch",

"gpytorch"

Comment thread tests/test_gpr.py Outdated

nilsvu requested changes May 13, 2026

View reviewed changes

Comment thread src/SimulationSupport/gpr.py Outdated

Vittoria Tommasini added 2 commits May 18, 2026 12:55

Add rest of GPR library and unit tests

7a630c6

remove eccentricity related functions and keep only GPR

daafdf5

vtommasini force-pushed the rest-of-gpr-lib branch from 00e1240 to daafdf5 Compare May 18, 2026 20:02

nilsvu requested changes May 18, 2026

View reviewed changes

vtommasini closed this May 19, 2026

vtommasini reopened this May 19, 2026

Split gpr.py into core.py and diagnostics.py, and pushing core.py now

8bcb555

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rest of GPR Library#8

Rest of GPR Library#8
vtommasini wants to merge 3 commits into
sxs-collaboration:mainfrom
vtommasini:rest-of-gpr-lib

vtommasini commented May 12, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

nilsvu left a comment

Uh oh!

Uh oh!

nilsvu left a comment •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

vtommasini commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

nilsvu left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

nilsvu left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

vtommasini commented May 12, 2026 •

edited

Loading

nilsvu left a comment •

edited

Loading