A PyTorch-based deep learning pipeline that classifies environmental sounds from the UrbanSound8K dataset. The architecture is a custom 3-layer Convolutional Neural Network (CNN).
- Training Set: Folds 1–8 (~7,000 audio samples)
- Testing Set: Folds 9 & 10 (~1,700 audio samples, strictly isolated for evaluation)
- Training Epochs: 100
- Final Evaluation Accuracy: 76.77%
- Total Parameter Count: 2.8M
Note on Data Integrity: This project uses the official UrbanSound8K fold structure without reshuffling. Folds 9 and 10 were held out entirely during training, so no test clips leak into the training set and the reported accuracy measures generalization to unseen data.
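The hold-out split described above can be sketched as follows. This is a minimal illustration and not necessarily the project's own loader: it assumes the standard UrbanSound8K metadata CSV with its `fold` column, and the helper names `load_metadata` and `split_by_fold` are hypothetical.

```python
import csv
from pathlib import Path

# Folds reserved for evaluation, matching the split listed above.
TEST_FOLDS = frozenset({9, 10})


def load_metadata(csv_path):
    """Read the UrbanSound8K metadata CSV into a list of dict rows."""
    with Path(csv_path).open(newline="") as f:
        return list(csv.DictReader(f))


def split_by_fold(rows, test_folds=TEST_FOLDS):
    """Partition metadata rows into (train, test) using the `fold` column.

    Splitting on the predefined folds, rather than shuffling clips,
    keeps every clip of a held-out fold out of the training set.
    """
    train, test = [], []
    for row in rows:
        target = test if int(row["fold"]) in test_folds else train
        target.append(row)
    return train, test
```

Typical usage would be `train, test = split_by_fold(load_metadata(path))`, where `path` points at the dataset's `UrbanSound8K.csv` metadata file on your machine.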
The model is built on a custom sequential CNN backbone designed to balance classification accuracy against a lightweight parameter footprint. By downsampling through 3 distinct convolutional blocks, the model keeps its total parameter count at around 2.8M.
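To show why downsampling matters for the parameter budget, here is a rough parameter-count calculation. Every specific in it is an assumption made for illustration, not this model's actual configuration: 3x3 kernels, channel widths of 32/64/128, 2x2 pooling after each block, a 64x64 single-channel input, and one linear layer for the 10 UrbanSound8K classes. It therefore does not reproduce the 2.8M figure.

```python
def conv2d_params(c_in, c_out, kernel=3, bias=True):
    """Weights plus biases of one 2D convolution layer."""
    return c_out * c_in * kernel * kernel + (c_out if bias else 0)


def linear_params(n_in, n_out, bias=True):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + (n_out if bias else 0)


def cnn_params(input_hw, channels, n_classes, pool=2, in_channels=1):
    """Count parameters of conv blocks followed by one linear classifier.

    Assumes 'same' padding, so only pooling changes the spatial size.
    """
    h, w = input_hw
    total, c_in = 0, in_channels
    for c_out in channels:
        total += conv2d_params(c_in, c_out)
        h, w = h // pool, w // pool  # each block shrinks the feature map
        c_in = c_out
    # The flattened feature map feeds the classifier.
    total += linear_params(c_in * h * w, n_classes)
    return total


# Hypothetical configuration, for illustration only.
with_pooling = cnn_params((64, 64), (32, 64, 128), n_classes=10, pool=2)
without_pooling = cnn_params((64, 64), (32, 64, 128), n_classes=10, pool=1)
print(with_pooling)     # 174602
print(without_pooling)  # 5335562
```

The convolution layers cost the same in both cases. The difference comes entirely from the classifier, whose input size shrinks by a factor of 4 with each 2x2 pooling step. For a real PyTorch model, `sum(p.numel() for p in model.parameters())` gives the exact count.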