Add an evaluation for model robustness #36

@aron0093

Description

We need to add an evaluation that tests the robustness of programs across multiple runs (different random seeds) and across multiple values of K.

  1. A weak test can assess the similarity of the overall information captured by each run.
  2. A stronger test would compare individual programs across runs and assess their consistency.
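
As a starting point for discussion, here is a minimal sketch of both tests. It rests on assumptions that are not part of this issue: each run is represented as a list of programs, each program is a vector of feature weights, and cosine similarity is the comparison metric. The function names (`cosine`, `coverage_score`, `matched_consistency`) are hypothetical and do not refer to existing code in this repository.

```python
import math
from itertools import permutations


def cosine(u, v):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coverage_score(run_a, run_b):
    """Weak test (assumed formulation): for every program in one run, take its
    best cosine match in the other run, then average in both directions.
    Matches need not be one-to-one, so the two runs may use different K."""
    def one_way(src, dst):
        return sum(max(cosine(p, q) for q in dst) for p in src) / len(src)

    return 0.5 * (one_way(run_a, run_b) + one_way(run_b, run_a))


def matched_consistency(run_a, run_b):
    """Strong test (assumed formulation): find the one-to-one pairing of
    programs that maximises mean cosine similarity. Requires equal K.
    Brute force over permutations, so only practical for small K."""
    k = len(run_a)
    if k != len(run_b):
        raise ValueError("one-to-one matching needs the same K in both runs")
    best_score, best_perm = -math.inf, None
    for perm in permutations(range(k)):
        score = sum(cosine(run_a[i], run_b[j]) for i, j in enumerate(perm)) / k
        if score > best_score:
            best_score, best_perm = score, perm
    return best_score, best_perm


# Toy data: run_b holds the same two programs as run_a, reordered and rescaled.
run_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
run_b = [[0.0, 2.0, 0.0], [3.0, 0.0, 0.0]]
# run_c collapses onto a single program, so it only partly covers run_a.
run_c = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

print(coverage_score(run_a, run_b))
print(matched_consistency(run_a, run_b))
print(coverage_score(run_a, run_c))
print(matched_consistency(run_a, run_c))
```

The permutation search is only meant to make the idea concrete. For realistic K, an assignment solver such as `scipy.optimize.linear_sum_assignment` applied to the negated similarity matrix would be the more scalable choice. Because the weak score does not need a one-to-one pairing, it could also be used to compare runs with different K.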
