
# AutoEIT: Automated Scoring for Elicited Imitation Task

## 📌 Overview

This project implements an automated scoring system for the Elicited Imitation Task (EIT).

The system evaluates learner transcriptions against prompt sentences and assigns a score (0–4) based on meaning preservation and accuracy, following a rubric-based approach.


## 📊 Example Output

Below is a sample output generated by the system:

| Stimulus | Response | Score | Explanation |
| --- | --- | --- | --- |
| Quiero cortarme el pelo | Quiero cortarme el pelo | 4 | Exact or near-exact reproduction |
| ¿Qué dice usted que va a hacer hoy? | Que dices ustedes se que van a hacer hoy | 3 | Meaning preserved with minor differences |
| El carro lo tiene Pedro | gibberish perro | 0 | Response unrelated or incorrect |

## 🎯 Key Features

- ✅ Multi-sheet Excel processing
- ✅ Text preprocessing and normalization
- ✅ Feature engineering:
  - Word overlap
  - Missing words
  - Length ratio
  - Sequence similarity
- ✅ Semantic similarity using Sentence Transformers
- ✅ Hybrid rule-based scoring engine
- ✅ Explainable AI (score + reasoning)
- ✅ Automated Excel output generation
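The lexical and structural features listed above can be sketched with the standard library alone. The function and field names below are illustrative, not the project's actual implementation:

```python
import difflib

def lexical_features(stimulus: str, response: str) -> dict:
    """Illustrative feature extraction: word overlap, missing words,
    length ratio, and order-sensitive sequence similarity."""
    s_words = stimulus.lower().split()
    r_words = response.lower().split()
    s_set, r_set = set(s_words), set(r_words)
    return {
        # Fraction of stimulus words that appear in the response
        "word_overlap": len(s_set & r_set) / len(s_set) if s_set else 0.0,
        # Stimulus words the learner failed to reproduce
        "missing_words": sorted(s_set - r_set),
        # Response length relative to stimulus length
        "length_ratio": len(r_words) / len(s_words) if s_words else 0.0,
        # Order-sensitive similarity over the token sequences
        "sequence_similarity": difflib.SequenceMatcher(
            None, s_words, r_words
        ).ratio(),
    }
```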

## 🧠 Scoring Logic

The system combines:

- Lexical similarity (word overlap, missing words)
- Structural similarity (sequence similarity)
- Semantic similarity (Sentence Transformers)

Final scores are determined primarily by semantic similarity and aligned with the EIT scoring rubric:

| Score | Description |
| --- | --- |
| 4 | Exact or near-exact reproduction |
| 3 | Meaning preserved with minor differences |
| 2 | Partial meaning captured |
| 1 | Limited meaning retained |
| 0 | Incorrect or unrelated response |
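One way to realize a hybrid rule-based mapping from similarity signals to rubric bands is simple thresholding. The thresholds below are illustrative placeholders, not the project's calibrated values:

```python
def rubric_score(semantic_sim: float, word_overlap: float) -> tuple:
    """Map similarity signals to the 0-4 EIT rubric.

    Thresholds are illustrative, not the project's calibrated values."""
    if semantic_sim >= 0.95 and word_overlap >= 0.90:
        return 4, "Exact or near-exact reproduction"
    if semantic_sim >= 0.80:
        return 3, "Meaning preserved with minor differences"
    if semantic_sim >= 0.60:
        return 2, "Partial meaning captured"
    if semantic_sim >= 0.40:
        return 1, "Limited meaning retained"
    return 0, "Incorrect or unrelated response"
```

Returning the rubric text alongside the numeric score is what makes the output explainable: each row carries both a score and its reasoning.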

## 📂 Project Structure

```
AutoEIT/
├── data/
├── outputs/
├── src/
├── requirements.txt
└── README.md
```


## ⚙️ How It Works

```mermaid
flowchart LR
    A[📥 Input Excel] --> B[🧹 Preprocessing]
    B --> C[🧠 Feature Engineering]
    C --> D[🔍 Semantic Similarity]
    D --> E[⚖️ Scoring Engine]
    E --> F[📤 Output Excel]
```
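The stages in this flow can be sketched end to end. In this stand-in, `difflib` replaces the Sentence Transformers model so the sketch runs without the real dependencies, and every name is hypothetical:

```python
import difflib

def score_sheet(pairs):
    """Score a list of (stimulus, response) pairs from one sheet.

    difflib stands in for the semantic model here; the real pipeline
    would compute embedding similarity instead."""
    rows = []
    for stimulus, response in pairs:
        # Preprocessing: lowercase and collapse whitespace
        s = " ".join(stimulus.lower().split())
        r = " ".join(response.lower().split())
        # Similarity stand-in for the semantic model
        sim = difflib.SequenceMatcher(None, s, r).ratio()
        # Crude stand-in for the rubric-based scoring engine
        score = round(sim * 4)
        rows.append({"stimulus": stimulus, "response": response,
                     "similarity": sim, "score": score})
    return rows
```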

## ⚙️ Installation

1. Create and activate a virtual environment:

   ```
   python -m venv venv
   venv\Scripts\activate
   ```

2. Install dependencies:

   ```
   pip install -r requirements.txt
   ```

## ▶️ Usage

Run the pipeline:

```
python src/main.py
```

The output file will be generated at:

```
outputs/scored_output.xlsx
```

## 📊 Output

Each sheet contains:

- Original stimulus
- Learner response
- Predicted score
- Explanation of score

## 🚀 Future Work

- Fine-tune semantic models on EIT datasets
- Learn the scoring function from human-rated data
- Add grammatical error classification (omission, substitution, word order)
- Build a web-based scoring interface (API + UI)
- Add evaluation metrics (correlation with human scores)

## 👨‍💻 Author

Ansh Shrivastava