Improve the GMM testing by leveraging the SMT#45
CB-quakemodel wants to merge 29 commits into master from
Conversation
Force-pushed from 5d1219f to fb86e36
@CB-quakemodel, thanks tremendously for this. This is excellent work, but I have some reservations. Maybe you can explain some of the choices to me a bit better.

First, let me say that I think the use of large custom classes to store data is a mistake, unless there are specific performance or other reasons for it. It makes the code harder to understand and harder to use in an ad-hoc way: you can easily import a function that operates on a numpy array, but if the function is a method of some class, you have to instantiate the whole class to use it, even if you aren't interested in 90% of the class, and you may need a lot of extraneous data on hand to build the class. Classes are also a lot more work to test (for the same set-up/tear-down reasons). You have to write a lot of specific code to save an instantiated class to a file, whereas saving a dict is easy. And you have to dig hard in the REPL to see all the data and methods, so they are hard to debug.

Hamlet used to use a lot more of these, because I was following what was then the standard practice at GEM, and I spent months redoing things to use more standard Python data structures (dicts, dataframes, etc.) instead. A few cases remain for performance reasons or to integrate better with the OQ Engine. So for some of the structure of this PR, I don't think we want to create a single large class that holds the evaluation data and results. The rest of the evals manage fine with a dictionary that has config, data, results, etc., and it is best to continue with the same pattern throughout.

As we've discussed, I also don't want to require the MBTK as a dependency, even an optional one, if it can be avoided. However, there may be real drawbacks to not using the SMT. (I don't think performance is one, as there are generally a few tens of EQs max in a given model with observations--though I haven't run Japan yet...)
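To make the point concrete, here is a minimal sketch of the dict-based pattern described above. The names `run_gmm_eval` and `compute_bias` are hypothetical, purely for illustration, and not the actual Hamlet API:

```python
import numpy as np


def compute_bias(residuals):
    """Mean of an array of (ln) residuals.

    A free function like this can be imported and applied to any numpy
    array directly, with no class to instantiate first.
    """
    return float(np.mean(residuals))


def run_gmm_eval(config, residuals):
    """Return results in the standard {config, data, results} dict shape
    used by the other evals, which is trivially serializable."""
    return {
        "config": config,
        "data": {"residuals": residuals},
        "results": {"bias": compute_bias(residuals)},
    }
```

Because the container is a plain dict, it can be dumped to JSON or inspected in the REPL with no custom save/load or `__repr__` machinery.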
Again, thanks a lot. I think this will really help us get to the forefront of whole-model evaluations. I'm also happy to have a call to discuss if you'd like.
Expand the initial GMM testing to compute, per GMM and per TRT, the total, inter-event and intra-event residuals, plus some overall summary plots. Addresses #20
I extend the same ContextDB used in the SMT so we can use the SMT's existing capabilities to compute the partitioned random-effects residuals. We can then also use its plotting functions to provide summary plots of the residuals. Right now it is hardcoded to compute the GRM IMTs (PGA, SA(0.3), SA(0.6) and SA(1.0)).
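For reference, the random-effects partition of this kind can be sketched in plain numpy. This is an illustrative implementation of the classic Abrahamson & Youngs (1992) closed-form estimator, assuming the inter-event (tau) and intra-event (phi) standard deviations are known; it is not the SMT's actual code:

```python
import numpy as np


def partition_residuals(total, event_ids, tau, phi):
    """Split total (ln) residuals into inter- and intra-event parts.

    For event i with n_i records and total residuals delta_ij, the
    inter-event term is estimated as
        eta_i = tau**2 * sum_j(delta_ij) / (n_i * tau**2 + phi**2)
    and the intra-event residual is delta_ij - eta_i.
    """
    total = np.asarray(total, dtype=float)
    event_ids = np.asarray(event_ids)
    inter = np.empty_like(total)
    for ev in np.unique(event_ids):
        mask = event_ids == ev
        n = mask.sum()
        inter[mask] = tau**2 * total[mask].sum() / (n * tau**2 + phi**2)
    intra = total - inter
    return inter, intra
```

The inter-event term is constant across all records of an event, and the two parts always sum back to the total residual.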
Examples of the plots are provided here (these are the ones generated in the added unit test for the new residual analysis functions). Note that I made a fake ground-motion dataset for each TRT, given there are no ground motions available for the region covered by the source model used in the SSC testing QA. This is made clear in the sample flatfile to ensure a user does not mistake it for real ground-motion metadata for the test SSC area.
I have updated the generation of the residual analysis HTML and the documentation.
This PR will break the current tests because the MBTK needs to be installed in the environment the CI creates for testing.
Example plots