Developed by Ahmed Elghazi and Rabih Neouchi at the University of Texas at Dallas.
This KNIME Python extension provides a suite of statistical analysis nodes for hypothesis testing, variance analysis, and regression diagnostics — packaged for seamless use inside KNIME Analytics Platform.
.
├── icons/
│   ├── utd.png                  # Extension category icon
│   ├── curve.jpg                # Normality Tests icon
│   ├── post_hoc.jpg             # Post-Hoc Analysis icon
│   ├── heteroskedasticity.png   # Heteroskedasticity Tests icon
│   ├── factorial.jpg            # Factorial ANOVA icon
│   ├── manova3.png              # One-Way MANOVA icon
│   └── rm_anova.png             # Repeated Measures ANOVA icon
├── src/
│   ├── __init__.py
│   ├── utils.py                 # Shared helpers and parameter definitions
│   ├── normality_node.py
│   ├── normality_tests/
│   ├── post_hoc_node.py
│   ├── post_hoc/
│   ├── heteroskedasticity_node.py
│   ├── heteroskedasticity/
│   ├── factorial_anova_node.py
│   ├── factorial_anova/
│   ├── manova_node.py
│   ├── manova/
│   ├── repeated_measures_anova_node.py
│   └── repeated_measures_anova/
├── golden_tables/               # Golden reference tables and CSV fixtures
│   ├── data/                    # CSV test datasets
│   ├── generate_normality_golden.py
│   ├── generate_post_hoc_golden.py
│   ├── generate_heteroskedasticity_golden.py
│   ├── generate_factorial_anova_golden.py
│   ├── generate_manova_golden.py
│   └── generate_rm_anova_golden.py
├── workflows/                   # Demo KNIME workflows (.knwf)
├── knime.yml                    # Extension metadata
├── pixi.toml                    # Dependency management
├── pixi.lock                    # Locked dependency versions
├── ruff.toml                    # Code formatting config
├── pytest.ini                   # Pytest configuration
└── LICENSE.TXT
The Normality Tests node checks whether one or more numeric columns follow a normal distribution.
- Methods: Anderson-Darling, Cramér-von Mises
- Input: Table with numeric column(s)
- Output: Per-column results — Test Statistic, P-Value, Statistical Decision
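
As a rough illustration of what the Anderson-Darling option measures, the sketch below computes the A² statistic for a normality check with the mean and standard deviation estimated from the sample. It uses only the standard library; the function name and sample data are assumptions made for this example, and the node's own implementation (including how it derives the P-Value) may differ.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_statistic(values):
    """Return the A^2 statistic for normality with estimated mean and std."""
    n = len(values)
    mu, sigma = mean(values), stdev(values)  # sample std (n - 1 denominator)
    std_normal = NormalDist()
    # Normal CDF of each standardized observation, in ascending order
    cdf = [std_normal.cdf((x - mu) / sigma) for x in sorted(values)]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n


# Illustrative inputs: evenly spaced normal quantiles vs. a right-skewed transform
quantiles = [NormalDist().inv_cdf((i + 0.5) / 50) for i in range(50)]
skewed = [math.exp(q) for q in quantiles]

a2_normal = anderson_darling_statistic(quantiles)
a2_skewed = anderson_darling_statistic(skewed)
print(f"near-normal A^2 = {a2_normal:.3f}, skewed A^2 = {a2_skewed:.3f}")
```

A larger A² indicates a stronger departure from normality, so the skewed sample should score well above the near-normal one.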
The Post-Hoc Analysis node runs a one-way ANOVA and, if the result is significant, identifies which pairs of groups differ.
- Methods: Tukey HSD, Holm-Bonferroni
- Input: Numeric dependent variable + categorical grouping variable
- Outputs: ANOVA Summary · Pairwise Comparisons (Mean Difference, Corrected P-Value)
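
To make the "Corrected P-Value" column more concrete, here is a minimal sketch of the Holm-Bonferroni step-down adjustment in plain Python. The function name, the pair labels, and the raw p-values are hypothetical; the node may rely on a statistics library rather than code like this.

```python
def holm_bonferroni(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Multiply by the number of hypotheses still under test, cap at 1
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three pairwise comparisons
raw = {"A vs B": 0.01, "A vs C": 0.04, "B vs C": 0.03}
corrected = dict(zip(raw, holm_bonferroni(list(raw.values()))))
for pair, p in corrected.items():
    print(f"{pair}: corrected p = {p:.3f}")
```

The smallest p-value is multiplied by the number of comparisons, the next by one fewer, and so on, which makes the procedure less conservative than a plain Bonferroni correction.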
The Heteroskedasticity Tests node checks whether the residual variance of an OLS regression model is constant.
- Methods: Breusch-Pagan, White, Goldfeld-Quandt
- Input: Numeric target + predictor variables
- Outputs: Test Result · Model Summary · Data with Predictions and Residuals
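
The idea behind the Breusch-Pagan option can be sketched for the single-predictor case with the standard library alone: fit the OLS model, regress the squared residuals on the predictor, and compare n·R² of that auxiliary regression against a chi-square distribution. This is the studentized (Koenker) form; whether the node uses this variant is an assumption, and the helper names and data below are invented for illustration.

```python
import math


def simple_ols(x, y):
    """Fit y = a + b*x by least squares; return (a, b, r_squared)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b, (sxy * sxy) / (sxx * syy)


def breusch_pagan_single(x, y):
    """Return (LM statistic, p-value) for one predictor, studentized form."""
    a, b, _ = simple_ols(x, y)
    squared_resid = [(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    # Auxiliary regression of the squared residuals on the predictor
    _, _, r2_aux = simple_ols(x, squared_resid)
    lm = len(x) * r2_aux
    # Chi-square survival function with 1 degree of freedom
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


x = list(range(1, 41))
# Noise whose magnitude grows with x (heteroskedastic) vs. constant magnitude
y_hetero = [2 * xi + (-1) ** xi * 0.5 * xi for xi in x]
y_homo = [2 * xi + (-1) ** xi * 1.0 for xi in x]

lm_het, p_het = breusch_pagan_single(x, y_hetero)
lm_hom, p_hom = breusch_pagan_single(x, y_homo)
print(f"heteroskedastic: LM={lm_het:.2f}, p={p_het:.4g}")
print(f"homoskedastic:   LM={lm_hom:.2f}, p={p_hom:.4g}")
```

A small p-value rejects the hypothesis of constant residual variance. With several predictors the auxiliary regression includes all of them and the degrees of freedom equal the number of predictors.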
The Factorial ANOVA node tests whether categorical factors, alone or in combination, significantly affect a continuous outcome.
- Features: Up to N-way interactions, Type I/II/III sums of squares, partial eta squared
- Input: Numeric response variable + one or more categorical factor variables
- Outputs: ANOVA Results (Basic or Advanced) · Model Coefficients with confidence intervals
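
Partial eta squared, listed among the features above, is the share of variance attributable to a term once the other terms are set aside: SS_effect / (SS_effect + SS_error). The short sketch below applies that formula to made-up sums of squares; the term names and numbers are purely illustrative and not output from the node.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Effect size for one ANOVA term: SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)


# Hypothetical sums of squares for a two-factor design with interaction
sums_of_squares = {"A": 30.0, "B": 10.0, "A:B": 5.0}
ss_error = 70.0

effect_sizes = {
    term: partial_eta_squared(ss, ss_error) for term, ss in sums_of_squares.items()
}
for term, value in effect_sizes.items():
    print(f"{term}: partial eta squared = {value:.3f}")
```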
The One-Way MANOVA node tests whether group means differ across multiple dependent variables simultaneously.
- Test statistic: Pillai's Trace (comparatively robust to violations of multivariate normality and equal covariance matrices)
- Input: Two or more numeric dependent variables + one categorical grouping variable
- Outputs: Multivariate Results · Reliability Report (Box's M test)
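
For intuition about Pillai's Trace, the sketch below computes it for the smallest interesting case, two dependent variables, as trace(H (H + E)⁻¹), where H is the between-group and E the within-group sums-of-squares-and-cross-products matrix. It is restricted to two variables so the matrix inverse can be written by hand; the function names and data are assumptions for this example only.

```python
def mean_vector(rows):
    """Column means of a list of (y1, y2) observations."""
    n = len(rows)
    return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]


def outer_add(matrix, dev, weight=1.0):
    """Add weight * dev * dev^T to a 2x2 matrix in place."""
    for i in range(2):
        for j in range(2):
            matrix[i][j] += weight * dev[i] * dev[j]


def pillai_trace(groups):
    """Pillai's trace for a one-way design with exactly two dependent variables."""
    all_rows = [row for rows in groups.values() for row in rows]
    grand = mean_vector(all_rows)
    h = [[0.0, 0.0], [0.0, 0.0]]  # between-group (hypothesis) SSCP
    e = [[0.0, 0.0], [0.0, 0.0]]  # within-group (error) SSCP
    for rows in groups.values():
        centre = mean_vector(rows)
        outer_add(h, [centre[0] - grand[0], centre[1] - grand[1]], weight=len(rows))
        for row in rows:
            outer_add(e, [row[0] - centre[0], row[1] - centre[1]])
    t = [[h[i][j] + e[i][j] for j in range(2)] for i in range(2)]
    det = t[0][0] * t[1][1] - t[0][1] * t[1][0]
    t_inv = [[t[1][1] / det, -t[0][1] / det], [-t[1][0] / det, t[0][0] / det]]
    # trace(H @ T^-1) = sum over i, j of H[i][j] * T^-1[j][i]
    return sum(h[i][j] * t_inv[j][i] for i in range(2) for j in range(2))


control = [(1, 2), (2, 1), (3, 3), (2, 2)]
shifted = [(y1 + 10, y2 + 10) for y1, y2 in control]

v_separated = pillai_trace({"control": control, "treated": shifted})
v_identical = pillai_trace({"control": control, "treated": list(control)})
print(f"separated groups: V = {v_separated:.4f}")
print(f"identical groups: V = {v_identical:.4f}")
```

With two groups the statistic ranges from 0 (no separation) to 1, so well-separated group means give a value close to 1.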
The Repeated Measures ANOVA node tests whether the same participants respond differently across conditions or time points.
- Correction: Greenhouse-Geisser sphericity correction applied automatically
- Input format: Long format only — one row per measurement, with columns for the measured value, condition/time point, and participant ID
- Output: Basic summary or full Advanced breakdown (sphericity diagnostics included)
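
Because only long format is accepted, data recorded with one column per condition needs reshaping first. The sketch below shows one way to do that in plain Python; the column names (participant, condition, value) and the measurements are hypothetical, and inside KNIME an Unpivot node would achieve the same result.

```python
# Hypothetical wide-format data: one row per participant, one column per time point
wide_rows = [
    {"participant": "P1", "baseline": 5.1, "week_4": 4.8, "week_8": 4.2},
    {"participant": "P2", "baseline": 6.0, "week_4": 5.5, "week_8": 5.1},
]


def wide_to_long(rows, id_column, condition_columns):
    """Return one record per (participant, condition) measurement."""
    long_rows = []
    for row in rows:
        for condition in condition_columns:
            long_rows.append(
                {
                    "participant": row[id_column],
                    "condition": condition,
                    "value": row[condition],
                }
            )
    return long_rows


long_rows = wide_to_long(wide_rows, "participant", ["baseline", "week_4", "week_8"])
for record in long_rows:
    print(record)
```

Each participant now contributes one row per condition, which matches the three columns the node expects.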
- Clone this repository:

      git clone <repository-url>
      cd knime-utd-statistics

- Review knime.yml to confirm the extension metadata (name, group ID, author, version).

- Inspect the src/ directory to explore or modify node implementations. Each node is implemented as a Python file (e.g. factorial_anova_node.py) backed by a dedicated submodule (e.g. factorial_anova/).

- Install the Python environment:

      pixi install

  This installs all dependencies as defined in pixi.toml. The resulting environment is locked in pixi.lock; commit both files whenever you add or update packages.

- (Optional) Add packages to the environment:

      pixi add <package_name>

- Register the extension in debug mode with your local KNIME Analytics Platform:

      pixi run register-debug-in-knime

  This command auto-detects your KNIME installation and appends the -Dknime.python.extension.debug_knime_yaml_list argument to the knime.ini file, so no manual file editing is required. You can then test the nodes in KNIME (see workflows/ for a demo).

- Bundle your extension:

      pixi run build

  To place the update site at a custom path (default is ./local-update-site):

      pixi run build dest=<path_to_your_update_site>

- Install the bundled extension in KNIME via:

      File > Install KNIME Extensions... > Available Software Sites > Add...

  Enter the path to your local update site, then install the extension and restart KNIME.

- To publish on KNIME Hub, follow the KNIME Hub documentation.
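
For readers who want to see what the debug registration step changes, the sketch below shows the kind of line that ends up in knime.ini. It works on the file content as a string and is only an approximation: the function name, the sample ini content, and the example path are assumptions, and the actual register-debug-in-knime task may locate and edit the file differently.

```python
DEBUG_FLAG = "-Dknime.python.extension.debug_knime_yaml_list"


def with_debug_yaml(ini_text, knime_yml_path):
    """Return knime.ini content with the debug flag pointing at knime_yml_path."""
    lines = ini_text.splitlines()
    # Drop any earlier entry for the same flag so repeated runs do not duplicate it
    lines = [line for line in lines if not line.startswith(DEBUG_FLAG + "=")]
    # JVM arguments belong after -vmargs, which is why the flag goes at the end
    lines.append(f"{DEBUG_FLAG}={knime_yml_path}")
    return "\n".join(lines) + "\n"


# Hypothetical knime.ini content and extension path
sample_ini = "-startup\nplugins/launcher.jar\n-vmargs\n-Xmx4g\n"
updated_ini = with_debug_yaml(sample_ini, "/path/to/knime-utd-statistics/knime.yml")
print(updated_ini)
```

Applying the function twice yields the same content, which mirrors the goal of registering the extension without piling up duplicate entries.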
Golden reference tables and CSV fixtures live in golden_tables/, alongside scripts to regenerate them. Each node has a corresponding KNIME workflow in workflows/ (e.g. golden_NWay_ANOVA.knwf, golden_MANOVA.knwf) that can be used to validate output end-to-end inside KNIME Analytics Platform.
Run all tests with:
pixi run test
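
A golden-table check typically compares freshly computed results with a stored reference, allowing a small numeric tolerance. The sketch below illustrates that idea with in-memory CSV text; the column names, values, and tolerance are assumptions, and the repository's actual tests may be organised differently.

```python
import csv
import io
import math


def tables_match(actual_csv, golden_csv, rel_tol=1e-6):
    """Compare two CSV streams cell by cell, numerically where possible."""
    actual = list(csv.DictReader(actual_csv))
    golden = list(csv.DictReader(golden_csv))
    if len(actual) != len(golden):
        return False
    for a_row, g_row in zip(actual, golden):
        if a_row.keys() != g_row.keys():
            return False
        for column in g_row:
            try:
                a_val, g_val = float(a_row[column]), float(g_row[column])
            except ValueError:
                # Non-numeric cells (e.g. a decision label) must match exactly
                if a_row[column] != g_row[column]:
                    return False
                continue
            if not math.isclose(a_val, g_val, rel_tol=rel_tol, abs_tol=1e-12):
                return False
    return True


# Hypothetical golden table and two candidate outputs
golden_text = "column,statistic,decision\nheight,0.4321,normal\n"
close_text = "column,statistic,decision\nheight,0.43210000001,normal\n"
wrong_text = "column,statistic,decision\nheight,0.9,not normal\n"

close_ok = tables_match(io.StringIO(close_text), io.StringIO(golden_text))
wrong_ok = tables_match(io.StringIO(wrong_text), io.StringIO(golden_text))
print(close_ok, wrong_ok)
```

Comparing with a tolerance keeps the tests stable across platforms and library versions, where the last few floating-point digits can differ.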