Stereo audio compression and decompression using a Mid/Side MDCT pipeline with a desktop GUI and test suite.
- Goal: Compress stereo audio into a compact binary format and reconstruct it with controlled quality loss.
- Core idea: Convert Left/Right (stereo channels) to Mid/Side (sum/difference), apply MDCT on overlapped blocks, quantize coefficients, and discard the last M coefficients per block.
- Load audio (WAV/FLAC/AIFF/OGG) and normalize channels to stereo.
- Mid/Side transform: mid = (L + R) / 2, side = (L - R) / 2.
- Framing: Overlapping blocks of length 2N with hop N (50% overlap), zero-padded on both ends.
- Sine window: Apply the MDCT analysis window to each block.
- MDCT: Multiply with the cached cosine basis matrix.
- Quantize: Scale by 32768 and round to int32.
- Compression: Zero the last M coefficients per block (where 0 ≤ M ≤ N).
- Bit-packing: Store coefficients with sign-magnitude encoding into a .bin file.
- Decompression: Inverse steps via IMDCT + overlap-add + Mid/Side to Left/Right.
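The forward steps listed above can be sketched in NumPy as follows. This is a minimal illustration, not the project's actual code: the function names (mid_side, frame_blocks, sine_window, mdct_matrix, compress_channel) are hypothetical, and the exact padding amount and MDCT phase convention are assumptions.

```python
import numpy as np

def mid_side(left, right):
    """Mid/Side transform: mid = (L + R) / 2, side = (L - R) / 2."""
    return (left + right) / 2.0, (left - right) / 2.0

def frame_blocks(signal, N):
    """Zero-pad on both ends and slice into blocks of length 2N with hop N.

    Assumption: N zeros in front and at least N zeros behind the signal.
    """
    n_hops = int(np.ceil(len(signal) / N))
    padded = np.zeros((n_hops + 2) * N)
    padded[N:N + len(signal)] = signal
    return np.stack([padded[i * N:(i + 2) * N] for i in range(n_hops + 1)])

def sine_window(N):
    """Sine window of length 2N."""
    n = np.arange(2 * N)
    return np.sin(np.pi * (n + 0.5) / (2 * N))

def mdct_matrix(N):
    """Cosine basis of shape (N, 2N); row k is the k-th MDCT basis function."""
    n = np.arange(2 * N)
    k = np.arange(N)[:, None]
    return np.cos(np.pi / N * (n + 0.5 + N / 2) * (k + 0.5))

def compress_channel(signal, N, M):
    """Window, transform, quantize to int32, and zero the last M coefficients."""
    blocks = frame_blocks(signal, N) * sine_window(N)
    coeffs = blocks @ mdct_matrix(N).T            # shape (n_blocks, N)
    quantized = np.round(coeffs * 32768).astype(np.int32)
    if M > 0:
        quantized[:, N - M:] = 0
    return quantized

# Example on synthetic data (not a real audio file).
rng = np.random.default_rng(0)
left, right = rng.uniform(-1, 1, (2, 1000))
mid, side = mid_side(left, right)
q_mid = compress_channel(mid, N=64, M=16)
q_side = compress_channel(side, N=64, M=16)
```

Zeroing the last M columns removes the highest-frequency coefficients of every block, which is where the controlled quality loss comes from.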
- GUI.py: Main GUI (file selection, parameters, log, playback).
- compression.py: Full compression pipeline + helpers (blocking, window, MDCT, bit-writer).
- decompression.py: Full decompression pipeline + helpers (bit-reader, IMDCT, overlap-add).
- audio_stream.py: Low-level audio streaming for playback.
- playback.py: Playback track UI logic (play/pause/seek + time display).
- logging_configuration.py: Color console + file logging setup.
- Test/test_compression.py: Compression unit + integration tests.
- Test/test_decompression.py: Decompression unit + integration tests.
- Analysis/performance_analysis.py: Benchmarking and plotting.
- Data/: Input audio assets.
- Compressed/, Decompressed/, Channels/: Outputs created during runs/tests.
Compression (compression.py):
- load_stereo_and_compute_mid_side() loads audio, normalizes channels, and computes Mid/Side.
- divide_signal_into_blocks() pads and creates overlapping blocks of size 2N.
- generate_sine_window() and apply_sine_window_to_blocks() apply the analysis window.
- generate_mdct_cosine_matrix() and apply_mdct_transform_to_blocks() compute the MDCT and quantize.
- save_compressed_mdct() writes a custom header and bit-packed coefficients.
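To illustrate sign-magnitude bit-packing, here is a small pure-Python sketch. The layout is an assumption made for the example (one sign bit followed by a fixed number of magnitude bits per coefficient, most significant bit first); the actual header and bitstream format written by save_compressed_mdct() may differ. The names pack_sign_magnitude and unpack_sign_magnitude are hypothetical.

```python
def pack_sign_magnitude(values, mag_bits):
    """Pack integers as 1 sign bit + mag_bits magnitude bits each, MSB first."""
    width = mag_bits + 1
    acc = 0          # big-integer accumulator holding the bitstream
    n_bits = 0
    for v in values:
        mag = abs(int(v))
        if mag >= (1 << mag_bits):
            raise ValueError(f"{v} does not fit in {mag_bits} magnitude bits")
        sign = 1 if v < 0 else 0
        acc = (acc << width) | (sign << mag_bits) | mag
        n_bits += width
    pad = (-n_bits) % 8          # pad the tail with zeros to a whole byte
    acc <<= pad
    return acc.to_bytes((n_bits + pad) // 8, "big")

def unpack_sign_magnitude(data, count, mag_bits):
    """Inverse of pack_sign_magnitude; count is the number of stored values."""
    width = mag_bits + 1
    acc = int.from_bytes(data, "big")
    total_bits = len(data) * 8
    out = []
    for i in range(count):
        shift = total_bits - (i + 1) * width
        word = (acc >> shift) & ((1 << width) - 1)
        mag = word & ((1 << mag_bits) - 1)
        out.append(-mag if (word >> mag_bits) else mag)
    return out

# Example: choose the smallest magnitude width that fits every coefficient.
coeffs = [0, 5, -3, 1023, -1023]
mag_bits = max(abs(c) for c in coeffs).bit_length()   # 10 bits here
packed = pack_sign_magnitude(coeffs, mag_bits)
restored = unpack_sign_magnitude(packed, len(coeffs), mag_bits)
```

The big-integer accumulator keeps the sketch short; a streaming bit-writer that flushes whole bytes as it goes would scale better to long coefficient streams.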
Decompression (decompression.py):
- load_compressed_mdct() reads header + bitstream and reconstructs coefficient matrices.
- apply_imdct_transform_to_blocks() performs IMDCT and synthesis windowing.
- reconstruct_signal_from_blocks() overlap-adds and trims padding.
- sound_decompression() converts Mid/Side back to Left/Right and clips to [-1, 1].
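The inverse path can be sketched the same way. The code below is an assumption-laden illustration rather than the project's implementation: it uses a 2/N IMDCT normalization (which pairs with a sine window applied at both analysis and synthesis), assumes N zeros of leading padding, and uses hypothetical names (imdct_blocks, overlap_add, forward). A compact forward transform is included only so the round trip can be checked.

```python
import numpy as np

def sine_window(N):
    n = np.arange(2 * N)
    return np.sin(np.pi * (n + 0.5) / (2 * N))

def mdct_matrix(N):
    n = np.arange(2 * N)
    k = np.arange(N)[:, None]
    return np.cos(np.pi / N * (n + 0.5 + N / 2) * (k + 0.5))   # (N, 2N)

def forward(x, N):
    """Compact forward path: pad, frame, window, MDCT, quantize (no zeroing)."""
    n_hops = -(-len(x) // N)                     # ceiling division
    padded = np.zeros((n_hops + 2) * N)
    padded[N:N + len(x)] = x
    blocks = np.stack([padded[i * N:(i + 2) * N] for i in range(n_hops + 1)])
    return np.round((blocks * sine_window(N)) @ mdct_matrix(N).T * 32768).astype(np.int32)

def imdct_blocks(quantized, N):
    """Dequantize, apply the IMDCT, then the synthesis (sine) window."""
    coeffs = quantized.astype(np.float64) / 32768
    return (2.0 / N) * (coeffs @ mdct_matrix(N)) * sine_window(N)

def overlap_add(blocks, N, length):
    """Overlap-add blocks with hop N and trim the padding."""
    out = np.zeros((blocks.shape[0] + 1) * N)
    for i, block in enumerate(blocks):
        out[i * N:(i + 2) * N] += block
    return out[N:N + length]

# Round trip on synthetic Mid/Side signals.
N = 32
rng = np.random.default_rng(1)
mid = rng.uniform(-0.5, 0.5, 200)
side = rng.uniform(-0.5, 0.5, 200)
mid_hat = overlap_add(imdct_blocks(forward(mid, N), N), N, len(mid))
side_hat = overlap_add(imdct_blocks(forward(side, N), N), N, len(side))

# Mid/Side back to Left/Right, clipped to [-1, 1].
left_hat = np.clip(mid_hat + side_hat, -1.0, 1.0)
right_hat = np.clip(mid_hat - side_hat, -1.0, 1.0)
```

With no coefficients zeroed, the only error left is quantization noise, because the time-domain aliasing of adjacent blocks cancels in the overlap-add.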
- Audio file and compressed file browsing.
- Adjustable Block size (N) and Compression factor (M).
- Compress, Decompress, and Combined workflow.
- Activity log with timestamped messages.
- Playback section:
  - Original and Decompressed tracks.
  - Play/Pause, click/drag seek, and time display.
- UI remains responsive during processing; non-playback controls are disabled while processing.
- logging_configuration.py creates two loggers: compression and decompression.
- Logs go to:
  - Console (colorized levels via colorlog).
  - Files in Logs/: compression_log.txt, decompression_log.txt.
- GUI has an independent activity log for user-visible actions and errors.
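A logger with a colorized console handler and a file handler could be set up roughly as below. This is a sketch under assumptions, not the contents of logging_configuration.py: the function name make_logger, the format strings, and the fallback to plain console output when colorlog is not installed are all illustrative choices.

```python
import logging
from pathlib import Path

try:
    import colorlog            # optional: colorized console output
except ImportError:
    colorlog = None

def make_logger(name, log_dir="Logs"):
    """Return a logger writing to the console and to <log_dir>/<name>_log.txt."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if logger.handlers:        # already configured; avoid duplicate handlers
        return logger

    Path(log_dir).mkdir(parents=True, exist_ok=True)
    file_handler = logging.FileHandler(Path(log_dir) / f"{name}_log.txt", encoding="utf-8")
    file_handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

    if colorlog is not None:
        console = colorlog.StreamHandler()
        console.setFormatter(colorlog.ColoredFormatter(
            "%(log_color)s%(levelname)-8s%(reset)s %(message)s"))
    else:
        console = logging.StreamHandler()
        console.setFormatter(logging.Formatter("%(levelname)-8s %(message)s"))

    logger.addHandler(console)
    logger.addHandler(file_handler)
    return logger
```

Calling make_logger("compression") and make_logger("decompression") would then produce Logs/compression_log.txt and Logs/decompression_log.txt.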
Located in Test/:
- Compression tests cover block framing, sine windowing, MDCT output, coefficient zeroing, caching immutability, and end-to-end compression.
- Decompression tests cover header decoding, IMDCT formula correctness, overlap-add reconstruction, error handling, and round-trip checks.
Analysis/performance_analysis.py:
- Runs compression + decompression across multiple N/M combinations.
- Writes sound_performance_metrics.csv.
- Generates timing plots:
  - sound_compression_times.png
  - sound_decompression_times.png
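The timing sweep described above might look like the sketch below. It is illustrative only: the benchmark function, its callable parameters (compress_fn, decompress_fn), and the CSV column names are hypothetical and not taken from performance_analysis.py, and plotting is omitted.

```python
import csv
import itertools
import time

def benchmark(compress_fn, decompress_fn, block_sizes, zero_fractions, csv_path):
    """Time compress/decompress for each (N, M) pair and write the rows to CSV.

    compress_fn(N, M) returns a payload; decompress_fn(payload) consumes it.
    M is derived as a fraction of N so that 0 <= M <= N always holds.
    """
    rows = []
    for N, fraction in itertools.product(block_sizes, zero_fractions):
        M = int(N * fraction)
        t0 = time.perf_counter()
        payload = compress_fn(N, M)
        t1 = time.perf_counter()
        decompress_fn(payload)
        t2 = time.perf_counter()
        rows.append({"N": N, "M": M,
                     "compress_s": t1 - t0, "decompress_s": t2 - t1})

    with open(csv_path, "w", newline="") as fh:
        writer = csv.DictWriter(
            fh, fieldnames=["N", "M", "compress_s", "decompress_s"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Passing the pipeline's compression and decompression entry points as callables keeps the sweep independent of how the audio itself is loaded.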
- Required packages: numpy, soundfile, customtkinter (optional), sounddevice (playback), colorlog, matplotlib (analysis).
- Run GUI: GUI.py or Sound compressor.exe.