8Altair/MDCT-sound-compression

MDCT Sound Compression

Stereo audio compression and decompression using a Mid/Side MDCT pipeline with a desktop GUI and test suite.

Overview

  • Goal: Compress stereo audio into a compact binary format and reconstruct it with controlled quality loss.
  • Core idea: Convert Left/Right (stereo channels) to Mid/Side (sum/difference), apply the MDCT to overlapped blocks, quantize the coefficients, and zero the last M (highest-frequency) coefficients of each block.

Algorithm (High-Level)

  1. Load audio (WAV/FLAC/AIFF/OGG) and normalize channels to stereo.
  2. Mid/Side transform: mid = (L + R) / 2, side = (L - R) / 2.
  3. Framing: Zero-pad the signal at both ends, then split it into overlapping blocks of length 2N with hop N (50% overlap).
  4. Sine window: Multiply each block by the sine analysis window.
  5. MDCT: Multiply each windowed block by the cached cosine basis matrix, giving N coefficients per block.
  6. Quantize: Scale by 32768 and round to int32.
  7. Compression: Zero the last M coefficients per block (where 0 ≤ M ≤ N).
  8. Bit-packing: Store coefficients with sign-magnitude encoding into a .bin file.
  9. Decompression: Reverse the steps: unpack the coefficients, apply the IMDCT, overlap-add the blocks, and convert Mid/Side back to Left/Right.
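
The framing, windowing, MDCT, and IMDCT + overlap-add steps can be sketched in NumPy as follows. This is a minimal illustration assuming the standard MDCT definition, a sine window on both the analysis and synthesis side, and a 2/N inverse scaling. The function names are hypothetical and are not taken from compression.py or decompression.py. Quantization and coefficient zeroing are left out, so the round trip is exact up to floating-point error.

```python
import numpy as np

def mdct_basis(N):
    """Cosine basis of shape (N, 2N): row k, column n (assumed standard MDCT)."""
    n = np.arange(2 * N)
    k = np.arange(N)
    return np.cos(np.pi / N * np.outer(k + 0.5, n + 0.5 + N / 2))

def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return np.sin(np.pi * (np.arange(2 * N) + 0.5) / (2 * N))

def analyze(x, N):
    """Pad, cut into blocks of 2N with hop N, window, and transform."""
    tail = (-len(x)) % N  # extra zeros so the length is a multiple of N
    padded = np.concatenate([np.zeros(N), x, np.zeros(tail + N)])
    n_blocks = len(padded) // N - 1
    blocks = np.stack([padded[i * N : i * N + 2 * N] for i in range(n_blocks)])
    return (blocks * sine_window(N)) @ mdct_basis(N).T  # (n_blocks, N)

def synthesize(coeffs, N, length):
    """IMDCT, synthesis window, overlap-add, then trim the padding."""
    blocks = (2.0 / N) * (coeffs @ mdct_basis(N)) * sine_window(N)
    out = np.zeros((len(coeffs) + 1) * N)
    for i, block in enumerate(blocks):
        out[i * N : i * N + 2 * N] += block  # aliasing cancels between neighbours
    return out[N : N + length]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signal = rng.uniform(-1.0, 1.0, 1000)
    restored = synthesize(analyze(signal, 64), 64, len(signal))
    print("max reconstruction error:", np.max(np.abs(restored - signal)))
```

Each block of 2N samples yields only N coefficients, so the transform is critically sampled: the 50% overlap does not increase the amount of data to store.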

Project Structure

  • GUI.py: Main GUI (file selection, parameters, log, playback).
  • compression.py: Full compression pipeline + helpers (blocking, window, MDCT, bit-writer).
  • decompression.py: Full decompression pipeline + helpers (bit-reader, IMDCT, overlap-add).
  • audio_stream.py: Low-level audio streaming for playback.
  • playback.py: Playback track UI logic (play/pause/seek + time display).
  • logging_configuration.py: Color console + file logging setup.
  • Test/test_compression.py: Compression unit + integration tests.
  • Test/test_decompression.py: Decompression unit + integration tests.
  • Analysis/performance_analysis.py: Benchmarking and plotting.
  • Data/: Input audio assets.
  • Compressed/, Decompressed/, Channels/: Outputs created during runs/tests.

Compression & Decompression Implementation

Compression (compression.py):

  • load_stereo_and_compute_mid_side() loads audio, normalizes channels, computes Mid/Side.
  • divide_signal_into_blocks() pads and creates overlapping blocks of size 2N.
  • generate_sine_window() and apply_sine_window_to_blocks() apply the analysis window.
  • generate_mdct_cosine_matrix() and apply_mdct_transform_to_blocks() compute MDCT and quantize.
  • save_compressed_mdct() writes a custom header and bit-packed coefficients.
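
The quantization, coefficient zeroing, and sign-magnitude packing stages might look roughly like the sketch below. The scale of 32768 and the int32 type come from the algorithm description above. Everything else is an assumption made for illustration: the function names, the fixed bit width per coefficient, and the bit order. The real .bin layout, including its header, is whatever save_compressed_mdct() writes.

```python
import numpy as np

def quantize_and_truncate(coeffs, M, scale=32768):
    """Scale, round to int32, and zero the last M coefficients of each block."""
    q = np.rint(np.asarray(coeffs, dtype=np.float64) * scale).astype(np.int32)
    N = q.shape[1]
    if not 0 <= M <= N:
        raise ValueError("M must satisfy 0 <= M <= N")
    q[:, N - M:] = 0  # empty slice when M == 0, so nothing is zeroed
    return q

def pack_sign_magnitude(values, bits):
    """Hypothetical layout: 1 sign bit, then (bits - 1) magnitude bits, MSB first."""
    if bits < 2:
        raise ValueError("bits must be at least 2")
    limit = (1 << (bits - 1)) - 1
    stream = []
    for v in values:
        v = int(v)
        if abs(v) > limit:
            raise ValueError(f"{v} does not fit in {bits} sign-magnitude bits")
        stream.append(1 if v < 0 else 0)
        stream.extend((abs(v) >> s) & 1 for s in range(bits - 2, -1, -1))
    # packbits pads the final byte with zero bits
    return np.packbits(np.array(stream, dtype=np.uint8)).tobytes()

def unpack_sign_magnitude(data, count, bits):
    """Inverse of pack_sign_magnitude for `count` values."""
    stream = np.unpackbits(np.frombuffer(data, dtype=np.uint8))[: count * bits]
    fields = stream.reshape(count, bits)
    weights = 1 << np.arange(bits - 2, -1, -1)
    magnitudes = fields[:, 1:].astype(np.int64) @ weights
    return np.where(fields[:, 0] == 1, -magnitudes, magnitudes).astype(np.int32)

if __name__ == "__main__":
    q = quantize_and_truncate(np.array([[0.5, -0.25, 0.001, 0.9]]), M=2)
    packed = pack_sign_magnitude(q.ravel(), bits=16)
    print(q, len(packed), unpack_sign_magnitude(packed, q.size, 16))
```

In this sketch the zeroed coefficients are still written out, which saves nothing by itself. A format that records M in its header could skip them and store only N - M values per block.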

Decompression (decompression.py):

  • load_compressed_mdct() reads header + bitstream and reconstructs coefficient matrices.
  • apply_imdct_transform_to_blocks() performs IMDCT and synthesis windowing.
  • reconstruct_signal_from_blocks() overlap-adds and trims padding.
  • sound_decompression() converts Mid/Side back to Left/Right and clips to [-1, 1].
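
The final conversion back to stereo follows directly from the Mid/Side definition: since mid = (L + R) / 2 and side = (L - R) / 2, the inverse is L = mid + side and R = mid - side. The sketch below shows that step together with undoing the 32768 scaling. The function names are hypothetical and the code is not copied from decompression.py.

```python
import numpy as np

def left_right_to_mid_side(left, right):
    """Forward transform, included only to check the inverse."""
    return (left + right) / 2.0, (left - right) / 2.0

def dequantize(q, scale=32768):
    """Undo the integer scaling applied at compression time."""
    return np.asarray(q, dtype=np.float64) / scale

def mid_side_to_left_right(mid, side):
    """Invert the Mid/Side transform and clip the result to [-1, 1]."""
    left = np.clip(mid + side, -1.0, 1.0)
    right = np.clip(mid - side, -1.0, 1.0)
    return left, right

if __name__ == "__main__":
    L = np.array([0.5, -0.2, 1.0])
    R = np.array([0.1, 0.4, 1.0])
    mid, side = left_right_to_mid_side(L, R)
    print(mid_side_to_left_right(mid, side))
```

Clipping matters because quantization error and zeroed coefficients can push a reconstructed sample slightly outside the valid range even when the original stayed inside it.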

GUI Features

  • Audio file and compressed file browsing.
  • Adjustable Block size (N) and Compression factor (M).
  • Compress, Decompress, and Combined workflow.
  • Activity log with timestamped messages.
  • Playback section:
    • Original and Decompressed tracks.
    • Play/Pause, click/drag seek, and time display.
  • UI remains responsive during processing; non-playback controls are disabled while processing.
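
One common way to keep a GUI responsive during long jobs is to run the job in a worker thread and hand the result back through a queue that the UI polls. The sketch below shows only that pattern with the standard library. It is not taken from GUI.py, and run_in_background, poll, and the stand-in job are hypothetical names.

```python
import queue
import threading

def run_in_background(job, *args):
    """Run job(*args) in a daemon thread.

    Returns (thread, results); the queue receives exactly one item,
    either ("ok", value) or ("error", exception).
    """
    results = queue.Queue()

    def worker():
        try:
            results.put(("ok", job(*args)))
        except Exception as exc:  # report failures instead of losing them
            results.put(("error", exc))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread, results

def poll(results):
    """Non-blocking check, suitable for a periodic UI timer callback."""
    try:
        return results.get_nowait()
    except queue.Empty:
        return None

if __name__ == "__main__":
    thread, results = run_in_background(sum, [1, 2, 3])
    thread.join()  # a GUI would poll from a timer instead of joining
    print(poll(results))
```

In a Tk-based interface the polling function would typically be rescheduled with the widget's after() method, and the controls would be re-enabled once an item arrives.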

Logging (Console + GUI)

  • logging_configuration.py creates two loggers: compression and decompression.
  • Logs go to:
    • Console (colorized levels via colorlog).
    • Files in Logs/: compression_log.txt, decompression_log.txt.
  • GUI has an independent activity log for user-visible actions and errors.
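
A console-plus-file logger of this kind can be sketched with the standard logging module. The example is an assumption-laden illustration, not the contents of logging_configuration.py: make_logger and the format string are hypothetical, and a plain formatter stands in for the colorized one so the code needs no third-party package.

```python
import logging
from pathlib import Path

def make_logger(name, log_dir="Logs"):
    """Return a logger that writes to the console and to <log_dir>/<name>_log.txt."""
    directory = Path(log_dir)
    directory.mkdir(parents=True, exist_ok=True)

    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        fmt = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")

        console = logging.StreamHandler()
        console.setFormatter(fmt)  # the project colorizes this output via colorlog

        file_handler = logging.FileHandler(
            directory / f"{name}_log.txt", encoding="utf-8"
        )
        file_handler.setFormatter(fmt)

        logger.addHandler(console)
        logger.addHandler(file_handler)
    return logger

if __name__ == "__main__":
    make_logger("compression").info("compression logger ready")
    make_logger("decompression").info("decompression logger ready")
```

The handler check keeps the function safe to call from several modules without producing repeated log lines.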

Tests

Located in Test/:

  • Compression tests cover block framing, sine windowing, MDCT output, coefficient zeroing, caching immutability, and end-to-end compression.
  • Decompression tests cover header decoding, IMDCT formula correctness, overlap-add reconstruction, error handling, and round-trip checks.

Performance Analysis

Analysis/performance_analysis.py:

  • Runs compression + decompression across multiple N/M combinations.
  • Writes sound_performance_metrics.csv.
  • Generates timing plots:
    • sound_compression_times.png
    • sound_decompression_times.png
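
The measurement loop behind such a benchmark can be sketched as below. This is a hypothetical outline, not the code in Analysis/performance_analysis.py: the benchmark function, the CSV column names, and the callables passed in are assumptions, and plotting is omitted.

```python
import csv
import time

FIELDS = ["N", "M", "compression_s", "decompression_s"]

def benchmark(compress, decompress, n_values, m_values, csv_path):
    """Time compress(N, M) and decompress(result) for every valid N/M pair."""
    rows = []
    for N in n_values:
        for M in m_values:
            if M > N:  # the algorithm requires 0 <= M <= N
                continue
            start = time.perf_counter()
            packed = compress(N, M)
            middle = time.perf_counter()
            decompress(packed)
            end = time.perf_counter()
            rows.append({
                "N": N,
                "M": M,
                "compression_s": middle - start,
                "decompression_s": end - middle,
            })
    with open(csv_path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows

if __name__ == "__main__":
    # Stand-in callables; a real run would wrap the project's pipeline functions.
    rows = benchmark(lambda N, M: bytes(N - M), len, [256, 512], [0, 128],
                     "sound_performance_metrics.csv")
    print(len(rows), "measurements written")
```

time.perf_counter() is used because it is a monotonic, high-resolution clock intended for measuring short durations.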

Usage Notes

  • Python packages: numpy, soundfile, colorlog, customtkinter (optional), sounddevice (playback), matplotlib (analysis).
  • Run the GUI: python GUI.py, or launch Sound compressor.exe.

About

An application for sound compression using the modified discrete cosine transform (MDCT). It compresses and decompresses audio and can save the processed result.
