RiyaMathew-11/bug-classifier

Bug Classifier

A machine learning project that classifies GitHub issue reports from popular ML frameworks as performance bugs (class 1) or non-performance bugs (class 0).

Dataset

Issues were collected from five open-source ML frameworks:

| Project | Total | Performance | Non-Performance |
|---|---|---|---|
| TensorFlow | 1,490 | 279 | 1,211 |
| PyTorch | 752 | 95 | 657 |
| Keras | 668 | 135 | 533 |
| Apache MXNet | 516 | 65 | 451 |
| Caffe | 286 | 33 | 253 |
| Total | 3,712 | 607 | 3,105 |

The raw data lives in data/, with one CSV per project. Each CSV contains issue metadata (title, body, labels, comments, code snippets, etc.) plus a class column (0/1). The processed, training-ready file is data/final_dataset.csv, which has two columns: report_text and class.
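As a quick sanity check on the processed file, the sketch below validates the two-column layout and reports the class balance. The helper name `summarize_dataset` is hypothetical and not part of the repository; only the column names and the file path come from this README. If data/final_dataset.csv keeps every row from the table above, roughly 16% of rows (607 of 3,712) should be class 1.

```python
import pandas as pd

# Column names as described in this README.
EXPECTED_COLUMNS = ["report_text", "class"]


def summarize_dataset(df: pd.DataFrame) -> dict:
    """Check the expected columns exist and report the class balance.

    Hypothetical helper for illustration; not part of the repository.
    """
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    counts = df["class"].value_counts().to_dict()
    total = len(df)
    performance = int(counts.get(1, 0))
    return {
        "rows": total,
        "performance": performance,
        "non_performance": int(counts.get(0, 0)),
        "performance_share": performance / total if total else 0.0,
    }


# Usage, run from the repository root:
#   df = pd.read_csv("data/final_dataset.csv")
#   print(summarize_dataset(df))
```

The skew toward class 0 is the reason accuracy alone would be a misleading metric here.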

Notebooks

Run the notebooks in this order:

| Notebook | Purpose |
|---|---|
| notebooks/exploration.ipynb | Initial EDA: class distributions, text/word length, label analysis |
| notebooks/relevance_checks.ipynb | Data quality and relevance checks |
| notebooks/data_prep.ipynb | Combines all project CSVs, cleans text, outputs final_dataset.csv |
| notebooks/baseline_model_training.ipynb | Naive Bayes + TF-IDF baseline (two configs) |
| notebooks/other_model_experiments.ipynb | SVM and Logistic Regression with TF-IDF and sentence embeddings |
| notebooks/statistical_tests.ipynb | Wilcoxon signed-rank tests comparing all models |

Models

Trained models are saved as .pkl files in notebooks/models/:

| File | Description |
|---|---|
| baseline_default.pkl | Naive Bayes, course-provided config (TF-IDF, 1k features, ROC AUC scoring) |
| baseline_self.pkl | Naive Bayes, custom config (TF-IDF, 18k features, F1-macro scoring) |
| svm.pkl | LinearSVC + TF-IDF |
| logistic_regression.pkl | Logistic Regression + TF-IDF |
| svm_st.pkl | LinearSVC + all-MiniLM-L6-v2 sentence embeddings |
| svm_st_balanced.pkl | LinearSVC (class-balanced) + MiniLM embeddings |
| svm_st_balanced_enhanced.pkl | RBF SVC (class-balanced) + all-mpnet-base-v2 embeddings |
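For orientation, here is a minimal sketch of how a TF-IDF + LinearSVC model like svm.pkl could be built, saved, and reloaded. It rests on several assumptions: the README does not say whether the notebooks use `pickle` or `joblib`, whether each file holds a full pipeline or only the classifier, or which hyperparameters were used. The function names below are hypothetical.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def build_tfidf_svm(max_features=None, class_weight=None):
    """TF-IDF features feeding a linear SVM (illustrative settings only)."""
    return make_pipeline(
        TfidfVectorizer(max_features=max_features),
        LinearSVC(class_weight=class_weight),
    )


def save_model(model, path):
    """Serialize a fitted model with pickle (assumed from the .pkl suffix)."""
    with open(path, "wb") as fh:
        pickle.dump(model, fh)


def load_model(path):
    """Load a pickled model. Only unpickle files from a source you trust."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Usage, assuming the file stores a full text-to-label pipeline:
#   model = load_model("notebooks/models/svm.pkl")
#   model.predict(["Training is 10x slower after upgrading"])
```

If a file turns out to hold only the classifier, the matching vectorizer or sentence-embedding model has to be applied to the text first.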

Results

All models were evaluated with 5×5 repeated stratified cross-validation (5 folds, 5 repeats) and scored on macro-averaged F1:

| Model | Mean F1 |
|---|---|
| NB + TF-IDF (course baseline) | 0.443 |
| NB + TF-IDF (reimplemented) | 0.429 |
| LR + TF-IDF | 0.661 |
| SVM + TF-IDF | 0.653 |
| SVM + MiniLM | 0.752 |
| SVM + MiniLM (balanced) | 0.755 |
| SVM + MiniLM (enhanced) | 0.779 |
| SVM + MPNet (best) | 0.797 |
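The evaluation protocol can be sketched with scikit-learn as follows. This is an illustration, not the notebooks' code: the helper names, the random seed, and the Logistic Regression settings are assumptions. Each call returns one macro-F1 score per fold, so a 5×5 scheme yields 25 scores whose mean corresponds to the "Mean F1" column.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline


def repeated_cv_f1(model, texts, labels, n_splits=5, n_repeats=5, seed=0):
    """Return per-fold macro-F1 scores from repeated stratified k-fold CV.

    Stratification keeps the class ratio roughly equal in every fold,
    which matters because performance bugs are the minority class.
    """
    cv = RepeatedStratifiedKFold(
        n_splits=n_splits, n_repeats=n_repeats, random_state=seed
    )
    return cross_val_score(model, texts, labels, cv=cv, scoring="f1_macro")


def make_lr_tfidf():
    """Example model: TF-IDF + Logistic Regression (settings are assumed)."""
    return make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))


# Usage:
#   scores = repeated_cv_f1(make_lr_tfidf(), df["report_text"], df["class"])
#   print(scores.mean())
```

Keeping the vectorizer inside the pipeline means TF-IDF is refit on each training fold, so no vocabulary statistics leak from the held-out fold.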

Wilcoxon signed-rank tests on the cross-validation scores indicate that every sentence-transformer model scores significantly higher (p < 0.05) than both Naive Bayes baselines.
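A paired comparison of this kind might look like the sketch below, which uses `scipy.stats.wilcoxon` (SciPy is installed as a scikit-learn dependency). The helper name is hypothetical, and the two-sided alternative is an assumption, since the README does not say whether the notebook runs one-sided or two-sided tests. Both score arrays must come from the same folds, in the same order, for the pairing to be valid.

```python
import numpy as np
from scipy.stats import wilcoxon


def compare_paired_scores(scores_a, scores_b, alpha=0.05):
    """Wilcoxon signed-rank test on paired per-fold scores.

    Hypothetical helper: scores_a[i] and scores_b[i] must be the scores
    of two models on the same cross-validation fold.
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("score arrays must have the same shape")

    # Two-sided test (assumed); raises ValueError if all differences are zero.
    statistic, p_value = wilcoxon(a, b)
    return {
        "statistic": float(statistic),
        "p_value": float(p_value),
        "mean_difference": float((a - b).mean()),
        "significant": bool(p_value < alpha),
    }


# Usage with two 25-element arrays of per-fold macro-F1 scores:
#   result = compare_paired_scores(svm_minilm_scores, nb_baseline_scores)
```

A significant p-value alone does not say which model is better, so the sign of `mean_difference` should be read alongside it.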

Setup

Requires Python 3.11+.

pip install pipenv
pipenv install
pipenv shell
jupyter notebook

Dependencies: numpy, pandas, matplotlib, seaborn, scikit-learn, sentence-transformers, ipykernel.

About

An experiment on bug classification using ML
