MindSet Large on Kaggle (~0.5 GB)
MindSet Lite on Kaggle (~150 MB)
The MindSet: Vision datasets are designed to facilitate the testing of DNNs against controlled experiments in psychology. They focus on a range of low-, middle-, and high-level visual findings that provide important constraints for computational theories. MindSet: Vision also provides materials for DNN testing and demonstrates how to evaluate a DNN on each experiment using DNNs pretrained on ImageNet.
Paper: arXiv:2404.05290
pip install -e . # core package
pip install -e ".[notebook]" # + Jupyter notebook explorer

Requires Python >= 3.10.
mindset list
jupyter lab examples/explorer.ipynb

The notebook has an interactive widget with sliders for every parameter, plus a catalog of all 33 generators with sample images.
# with defaults
mindset generate ebbinghaus -o data/ebbinghaus
# override specific parameters
mindset generate ebbinghaus --num-samples-scrambled 5000 --num-samples-illusory 50 -o data/ebbinghaus
# dump defaults to yaml, edit, then generate from config
mindset generate ebbinghaus --save-config
# edit ebbinghaus.yaml
mindset generate ebbinghaus --config ebbinghaus.yaml -o data/ebbinghaus
# generate all 33 datasets
mindset generate all

from mindset.cli import _load_registry
from mindset.generators import get_generator

registry = _load_registry()
get_generator("ebbinghaus")["func"](
    output_folder="data/ebbinghaus",
    num_samples_scrambled=5000,
    num_samples_illusory=50,
)

MindSet: Vision datasets are divided into three categories:
| Category | Generators |
|---|---|
| Low and mid-level vision (9) | amodal_completion, decomposition, depth_drawings, emergent_features, nap_vs_mp_2d, nap_vs_mp_3d, relational_vs_coordinate, uncrowding, weber_law |
| Visual illusions (10) | adelson_checkerboard, ebbinghaus, grayscale_shapes, jastrow, lightness_contrast, muller_lyer, ponzo, thatcher_face, thatcher_words, tilt |
| Shape and object recognition (14) | dotted_linedrawings, embedded_figures, global_change, global_change_baker2022, leuven_embedded, linedrawings, same_different, segmented_images, silhouettes, texturized_blobs, texturized_chars, texturized_lines, transformations_2d, viewpoint_invariance |
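To work with a whole category programmatically, the table can be transcribed into a plain mapping. The sketch below is only an illustration: the dictionary keys and the `category_of` helper are hypothetical names and not part of the package, while the generator names are copied from the table.

```python
# Hypothetical mapping transcribed from the category table.
# The keys are illustrative labels, not identifiers used by the package.
CATEGORIES = {
    "low_mid_level_vision": [
        "amodal_completion", "decomposition", "depth_drawings",
        "emergent_features", "nap_vs_mp_2d", "nap_vs_mp_3d",
        "relational_vs_coordinate", "uncrowding", "weber_law",
    ],
    "visual_illusions": [
        "adelson_checkerboard", "ebbinghaus", "grayscale_shapes", "jastrow",
        "lightness_contrast", "muller_lyer", "ponzo", "thatcher_face",
        "thatcher_words", "tilt",
    ],
    "shape_and_object_recognition": [
        "dotted_linedrawings", "embedded_figures", "global_change",
        "global_change_baker2022", "leuven_embedded", "linedrawings",
        "same_different", "segmented_images", "silhouettes",
        "texturized_blobs", "texturized_chars", "texturized_lines",
        "transformations_2d", "viewpoint_invariance",
    ],
}


def category_of(generator):
    """Return the category label for a generator name, or None if unknown."""
    for category, names in CATEGORIES.items():
        if generator in names:
            return category
    return None


# Flatten the mapping to check it against the totals given in the table.
ALL_GENERATORS = [name for names in CATEGORIES.values() for name in names]
print(len(ALL_GENERATORS), category_of("ebbinghaus"))
```

The flattened list contains 9 + 10 + 14 = 33 names, matching the number of generators stated elsewhere in this README.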
A detailed description of each dataset can be found in the related paper (arXiv:2404.05290): refer to Section 2 for an overview, or to Appendix C for more detailed information, including the psychological significance of each dataset, references to relevant papers, and details on the structure of each dataset.
The datasets are structured into subfolders (conditions), which are organized based on the dataset's specific characteristics. At the root of each dataset, there's an annotation.csv file. This file lists the paths to individual images (starting from the dataset folder) along with their associated parameters. This organization lets users consume the datasets either by exploiting their folder structure (e.g. through PyTorch's ImageFolder) or by directly referencing the annotation file.
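Both access routes can be sketched with the standard library alone. The snippet below is a minimal illustration, not part of the package: the helper names are hypothetical, the list of image suffixes is an assumption about the file format, and the column names of annotation.csv are read from the file rather than assumed, since they differ between datasets.

```python
import csv
from collections import Counter
from pathlib import Path

# Assumption: images are stored in one of these common formats.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def count_images_per_condition(dataset_root):
    """Count image files under each top-level subfolder (condition)."""
    root = Path(dataset_root)
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            rel = path.relative_to(root)
            # The first path component below the root is taken as the condition;
            # files sitting directly in the root are ignored.
            if len(rel.parts) > 1:
                counts[rel.parts[0]] += 1
    return dict(counts)


def read_annotations(dataset_root):
    """Return (column names, rows) from annotation.csv; rows are dicts of strings."""
    with open(Path(dataset_root) / "annotation.csv", newline="") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        return reader.fieldnames, rows
```

For a dataset generated into `data/ebbinghaus`, `count_images_per_condition("data/ebbinghaus")` would return a dictionary keyed by condition folder, and `read_annotations("data/ebbinghaus")` would return the header of the annotation file together with one dictionary per image.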
MindSet: Vision is model-agnostic and offers flexibility in the way each dataset is employed. Depending on the testing method, you may need a few samples or several thousand images. To cater to these needs, we provide two variants of the dataset on Kaggle:
- Large Version with ~5000 samples for each condition.
- Lite Version with ~100 samples for each condition.
Both versions of the MindSet: Vision dataset are structured into folders, each containing a specific dataset. Due to Kaggle's current limitations, it's not possible to download these folders individually. Hence, if you need access to a specific dataset, you'll have to download the entire collection of datasets. Alternatively, you can generate the desired dataset on your own using the CLI or notebook.
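When only one dataset is needed at a custom size, the CLI call can be assembled from Python. The following is a rough sketch under one stated assumption: that every keyword argument maps to a CLI flag by replacing underscores with hyphens, as `num_samples_scrambled` and `--num-samples-scrambled` do in the ebbinghaus example. `build_generate_command` is a hypothetical helper, not a function provided by the package.

```python
import shlex


def build_generate_command(generator, output_folder, **overrides):
    """Build the argument list for `mindset generate <generator>`."""
    cmd = ["mindset", "generate", generator]
    for name, value in overrides.items():
        # Assumption: a keyword maps to a flag by swapping "_" for "-",
        # e.g. num_samples_scrambled -> --num-samples-scrambled.
        cmd += ["--" + name.replace("_", "-"), str(value)]
    cmd += ["-o", output_folder]
    return cmd


cmd = build_generate_command(
    "ebbinghaus",
    "data/ebbinghaus",
    num_samples_scrambled=100,
    num_samples_illusory=100,
)
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))

# To actually run it (requires the package to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when it is later passed to `subprocess.run`.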
mindset/
  generators/          # 33 stimulus generators (decorator + config dataclass pattern)
  drawing/             # shared drawing infrastructure (base classes, geometry, shapes)
  cli.py               # CLI entry point
  utils.py             # shared utilities
examples/
  explorer.ipynb       # interactive notebook with widget explorer + catalog
  data_generation/     # CLI usage examples
tests/
  test_smoke.py        # smoke tests (33 generators registered, generation works)
Tested on macOS, Ubuntu, and Windows with Python 3.10+.


