This project implements an automated scoring system for the Elicited Imitation Task (EIT) as part of the HumanAI AutoEIT GSoC project.
The system evaluates learner transcriptions against prompt sentences and assigns a score (0–4) based on meaning preservation and accuracy, following a rubric-based approach.
Below is a sample output generated by the system:
| Stimulus | Response | Score | Explanation |
|---|---|---|---|
| Quiero cortarme el pelo | Quiero cortarme el pelo | 4 | Exact or near-exact reproduction |
| ¿Qué dice usted que va a hacer hoy? | Que dices ustedes se que van a hacer hoy | 3 | Meaning preserved with minor differences |
| El carro lo tiene Pedro | gibberish perro | 0 | Response unrelated or incorrect |
- ✅ Multi-sheet Excel processing
- ✅ Text preprocessing and normalization
- ✅ Feature engineering:
  - Word overlap
  - Missing words
  - Length ratio
  - Sequence similarity
- ✅ Semantic similarity using Sentence Transformers
- ✅ Hybrid rule-based scoring engine
- ✅ Explainable AI (score + reasoning)
- ✅ Automated Excel output generation
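
The preprocessing and feature-engineering items above can be illustrated with a short sketch. This is not the project's actual code: the function names `normalize` and `lexical_features`, the decision to keep accents, and the exact feature definitions are all assumptions made for illustration, using only the Python standard library.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace (accents are kept)."""
    text = unicodedata.normalize("NFC", text).lower()
    # \w is Unicode-aware in Python 3, so letters such as "é" and "ñ" survive.
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


def lexical_features(stimulus: str, response: str) -> dict:
    """Compute four surface features for one stimulus/response pair (assumed definitions)."""
    stim_tokens = normalize(stimulus).split()
    resp_tokens = normalize(response).split()
    stim_set, resp_set = set(stim_tokens), set(resp_tokens)

    # Share of distinct stimulus words that also appear in the response.
    overlap = len(stim_set & resp_set) / len(stim_set) if stim_set else 0.0
    # Stimulus words the learner left out, in stimulus order.
    missing = [w for w in stim_tokens if w not in resp_set]
    # Response length relative to stimulus length, in tokens.
    length_ratio = len(resp_tokens) / len(stim_tokens) if stim_tokens else 0.0
    # Order-sensitive similarity over the two token sequences, in [0, 1].
    seq_sim = SequenceMatcher(None, stim_tokens, resp_tokens).ratio()

    return {
        "word_overlap": overlap,
        "missing_words": missing,
        "length_ratio": length_ratio,
        "sequence_similarity": seq_sim,
    }


if __name__ == "__main__":
    print(lexical_features("El carro lo tiene Pedro", "gibberish perro"))
```

Comparing token lists rather than raw characters keeps the sequence similarity sensitive to word order, which matters for imitation tasks; a character-level comparison would be an equally valid choice.
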
The system combines:
- Lexical similarity (word overlap, missing words)
- Structural similarity (sequence similarity)
- Semantic similarity (Sentence Transformers)
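
As a rough sketch of the semantic component, the snippet below embeds the stimulus and the response and compares them with cosine similarity. The function names and the model name `paraphrase-multilingual-MiniLM-L12-v2` are assumptions for illustration; the README does not state which Sentence Transformers model the project uses.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def semantic_similarity(
    stimulus: str,
    response: str,
    model_name: str = "paraphrase-multilingual-MiniLM-L12-v2",  # assumed model
) -> float:
    """Embed both sentences and return their cosine similarity."""
    # Imported lazily so cosine_similarity works without the dependency installed.
    from sentence_transformers import SentenceTransformer

    # A real pipeline would load the model once and reuse it across rows.
    model = SentenceTransformer(model_name)
    stim_vec, resp_vec = model.encode([stimulus, response])
    return cosine_similarity(stim_vec.tolist(), resp_vec.tolist())
```

Calling `semantic_similarity("Quiero cortarme el pelo", "Quiero cortar mi pelo")` downloads the model on first use, so it needs network access and the `sentence-transformers` package.
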
Final scores are determined primarily by semantic similarity and aligned with the EIT scoring rubric:
| Score | Description |
|---|---|
| 4 | Exact or near-exact reproduction |
| 3 | Meaning preserved with minor differences |
| 2 | Partial meaning captured |
| 1 | Limited meaning retained |
| 0 | Incorrect or unrelated response |
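
One way to turn a similarity value into a rubric score is a simple threshold ladder, sketched below. The cut-off values and the `exact_match` shortcut are hypothetical, chosen only to illustrate the idea; the project's actual rules and thresholds are not documented here.

```python
from typing import Tuple

# Rubric descriptions, copied from the table above.
EXPLANATIONS = {
    4: "Exact or near-exact reproduction",
    3: "Meaning preserved with minor differences",
    2: "Partial meaning captured",
    1: "Limited meaning retained",
    0: "Incorrect or unrelated response",
}

# Hypothetical cut-offs, ordered from highest to lowest; tune against rated data.
THRESHOLDS = [(0.90, 4), (0.75, 3), (0.50, 2), (0.30, 1)]


def score_response(semantic_sim: float, exact_match: bool = False) -> Tuple[int, str]:
    """Return (score, explanation) for one response."""
    # Rule-based shortcut: an exact reproduction always receives the top score.
    if exact_match:
        return 4, EXPLANATIONS[4]
    # Otherwise take the first threshold the similarity clears.
    for cutoff, score in THRESHOLDS:
        if semantic_sim >= cutoff:
            return score, EXPLANATIONS[score]
    return 0, EXPLANATIONS[0]


if __name__ == "__main__":
    print(score_response(0.8))
```

Returning the explanation together with the score keeps every prediction traceable to a rubric row.
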
AutoEIT/
├── data/
├── outputs/
├── src/
├── requirements.txt
└── README.md
flowchart LR
A[📥 Input Excel] --> B[🧹 Preprocessing]
B --> C[🧠 Feature Engineering]
C --> D[🔍 Semantic Similarity]
D --> E[⚖️ Scoring Engine]
E --> F[📤 Output Excel]
- Create and activate a virtual environment (the activation command below is for Windows):
python -m venv venv
venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
Run the pipeline:
python src/main.py
The output file is written to:
outputs/scored_output.xlsx
Each sheet contains:
- Original stimulus
- Learner response
- Predicted score
- Explanation of score
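
The multi-sheet read, score, and write loop could look roughly like the sketch below. It assumes the input sheets have `Stimulus` and `Response` columns and that `score_fn` is any callable returning a `(score, explanation)` pair; the function names are hypothetical and not taken from `src/main.py`.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
ScoreFn = Callable[[str, str], Tuple[int, str]]


def score_rows(rows: List[Row], score_fn: ScoreFn) -> List[Row]:
    """Append Score and Explanation to each row (assumes Stimulus/Response keys)."""
    scored = []
    for row in rows:
        score, explanation = score_fn(str(row["Stimulus"]), str(row["Response"]))
        scored.append({**row, "Score": score, "Explanation": explanation})
    return scored


def process_workbook(in_path: str, out_path: str, score_fn: ScoreFn) -> None:
    """Score every sheet of an Excel workbook and write the results to a new file."""
    # Imported lazily; reading and writing .xlsx also requires openpyxl.
    import pandas as pd

    # sheet_name=None loads all sheets as a dict of sheet name -> DataFrame.
    sheets = pd.read_excel(in_path, sheet_name=None)
    with pd.ExcelWriter(out_path) as writer:
        for name, frame in sheets.items():
            rows = score_rows(frame.to_dict(orient="records"), score_fn)
            pd.DataFrame(rows).to_excel(writer, sheet_name=name, index=False)
```

Keeping the row-level scoring separate from the Excel I/O makes it possible to test the scoring logic without creating workbook files.
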
- Fine-tune semantic models on EIT datasets
- Learn scoring function from human-rated data
- Add grammatical error classification (omission, substitution, word order)
- Build web-based scoring interface (API + UI)
- Add evaluation metrics (correlation with human scores)
Ansh Shrivastava GSoC 2026 Applicant — HumanAI AutoEIT