
Improve the GMM testing by leveraging the SMT #45

Open

CB-quakemodel wants to merge 29 commits into master from gmm_testing

Conversation

Contributor

CB-quakemodel commented Apr 3, 2026

Expand the initial GMM testing to compute, per GMM and per TRT, the total, inter-event and intra-event residuals, plus some overall summary plots. Addresses #20

I extend the same ContextDB used in the SMT, so we can use the SMT's existing capabilities to compute the partitioned random-effects residuals. We can then also use its plotting functions to provide summary plots of the residuals. Right now it is hardcoded to compute the GRM IMTs (PGA, SA(0.3), SA(0.6) and SA(1.0)).
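For readers unfamiliar with the residual partitioning the SMT performs, here is an illustrative sketch (not this PR's actual code; the function and argument names are hypothetical) of splitting total residuals into inter- and intra-event components using the Abrahamson & Youngs (1992) random-effects estimate of the event term:

```python
import numpy as np

def partition_residuals(ln_obs, ln_pred, event_ids, tau, phi):
    """Split total residuals into inter-event (constant per event) and
    intra-event (per record) components, given the GMM's inter-event
    std (tau) and intra-event std (phi)."""
    total = ln_obs - ln_pred
    inter = np.zeros_like(total)
    for eq in np.unique(event_ids):
        mask = event_ids == eq
        n = mask.sum()
        # Random-effects estimate of the event term (Abrahamson &
        # Youngs, 1992): shrinks the event-mean residual toward zero.
        eta = tau**2 * total[mask].sum() / (n * tau**2 + phi**2)
        inter[mask] = eta
    intra = total - inter
    return total, inter, intra
```

By construction, total = inter + intra for every record, and the inter-event term is identical for all records of the same event.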

Examples of the plots are provided here (these are the ones generated in the added unit test for the new residual analysis functions). Note that I made a fake ground-motion dataset for each TRT, given that there are no ground motions available for the region covered by the source model used in the SSC testing QA. This is made clear in the sample flatfile, to ensure a user does not mistake it for real ground-motion metadata for the test SSC area.

I have updated the generation of the residual analysis HTML and the documentation.

This PR will break the current tests because the MBTK needs to be installed in the same environment that the CI tests create.

Example plots

[Attached plots: residual_means_stds_vs_period; AkkarEtAlRjb2014_kind_rjb__adjustment_factor_0.0_PGA_vs_dist]

Contributor

cossatot commented Apr 9, 2026

@CB-quakemodel, thanks tremendously for this.

This is excellent work, but I have some reservations. Maybe you can explain some of the choices to me a bit better.

First, let me say that I think the use of large custom classes to store data is a mistake, unless there are specific performance or other reasons to do so. It makes the code harder to understand and harder to use in an ad-hoc way: you can easily import a function that operates on a numpy array, but if the function is instead a method of some class, you have to instantiate the whole class to use it, even if you aren't interested in 90% of the class, and/or you have to have a lot of extraneous data on hand to build the class.

The classes are also a lot more work to test (for the same set-up/tear-down reasons). You have to write a lot of specific code to save an instantiated class to a file, whereas saving a dict is easy. And you have to dig hard in the REPL to see all the data and methods, so they are hard to debug.

Hamlet used to use a lot more of these, because I was following what was then the standard practice at GEM, and I spent months redoing things to use more standard Python data structures (dicts, dataframes, etc.) instead. A few cases remain for performance reasons or to integrate better with the OQ Engine.

So for some of the structure of the PR, I don't think that we want to create a single large class that holds the evaluation data and results. The rest of the evals manage fine with a dictionary that has config, data, results, etc., and it is best to continue to use the same pattern throughout.
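As a minimal sketch of the dict-based pattern described here (the keys and values are illustrative, based only on the "config, data, results" description in this thread, not on Hamlet's actual schema):

```python
import json

# An evaluation as a plain dict: trivially inspectable, serializable,
# and usable piecemeal without instantiating a large class.
evaluation = {
    "config": {"imts": ["PGA", "SA(0.3)", "SA(0.6)", "SA(1.0)"]},
    "data": {"n_events": 12, "n_records": 240},          # illustrative counts
    "results": {"total_residual_mean": 0.02},            # illustrative value
}

# Saving and reloading is a one-liner, unlike a custom class,
# which would need bespoke (de)serialization code.
restored = json.loads(json.dumps(evaluation))
assert restored == evaluation
```

Any function that needs only the results can take `evaluation["results"]` directly, which is the ad-hoc usability argument above.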

As we've discussed, I also don't want to require the MBTK as a dependency, even an optional one, if it can be avoided. However, there may be real drawbacks to not using the SMT. (I don't think performance is one, as there are generally a few tens of EQs max in a given model with observations--though I haven't run Japan yet...)

  1. What functionality do we gain by using the SMT code instead of what was already present--basically making rupture objects and calculating the GMFs for each at the relevant sites?

  2. Would having a separate implementation of the residual analysis here lead to a major future divergence with the methods in the SMT over a medium term (the next few years, say)? Are any planned or desired features particularly hard to implement in the streamlined framework already present?

  3. Are there any types of plots, metrics, or statistical evals/tests that are not present in the SMT that could be implemented here as a test case? For example, we already have interactive plotting for some evals using d3.js.

Again, thanks a lot. I think this will really help us get to the forefront on whole-model evaluations.

Also I'm happy to have a call to discuss if you'd like.
