Version Control System Core is a from-scratch implementation of a local version control system written in Java, designed to demonstrate core data structures, algorithms, and systems-level reasoning.
Unlike typical “Git clone” projects that focus on surface-level commands, this project focuses on the internal mechanics of a VCS:
- Content-addressable storage
- Immutable snapshots
- Commit graph traversal
- Branching and merging
- Three-way merge with conflict detection
- Disk persistence
The system is built incrementally: first fully in-memory, then extended with on-disk persistence, mirroring real-world systems engineering practices inspired conceptually by :contentReference[oaicite:0]{index=0}.
-
Immutability
- Blobs, trees, and commits are immutable once created.
- Any modification produces a new object.
-
Content-Addressable Storage
- Objects are identified by cryptographic hashes.
- Identical content is stored only once.
-
Separation of Concerns
- In-memory logic is cleanly separated from disk persistence.
- Commands do not perform raw file I/O directly.
-
Data-Structure-Driven Design
- Every VCS concept maps directly to a fundamental data structure.
Purpose
Acts as the single source of truth for the VCS state.
Responsibilities
- Maintains the staging area
- Tracks the current HEAD commit
- Coordinates commit creation
Data Structures Used
HashMap<String, Blob>for staging (O(1) access)- Linked commit references for history traversal
Purpose
Represents an immutable snapshot of a file’s content.
Key Properties
- Stores raw content as
byte[] - Computes a SHA-1 hash from content
- Enforces immutability (no setters)
Why this matters
Files are treated as content, not filenames, enabling deduplication and integrity checks.
Purpose
Represents the complete directory structure at a commit.
Implementation
- N-ary tree using
HashMap<String, TreeNode> - Nodes represent either directories or files (Blob references)
Why a tree
Filesystem structure is hierarchical; trees allow efficient snapshotting and traversal.
Purpose
Represents a snapshot in time.
Contains
- Commit message
- Timestamp
- Parent commit reference
- Root tree snapshot
Graph Model
- Commits form a Directed Acyclic Graph (DAG)
- Enables history traversal, branching, and merging
Concept
A branch is a named pointer to a commit, not a copy of history.
Implementation
HashMap<String, Commit>mapping branch names to commit heads- Branch switching only reassigns pointers
Result
- O(1) branch operations
- No data duplication
This project implements a true three-way merge, not a naïve overwrite.
- BASE: Lowest Common Ancestor (LCA)
- OURS: Current branch HEAD
- THEIRS: Target branch HEAD
- Find the LCA using graph traversal
- Compare changes:
- BASE → OURS
- BASE → THEIRS
- Apply deterministic merge rules
- Detect conflicts when both sides diverge differently
- Graph traversal with
HashSetfor LCA detection - Three-way comparison logic
- Standard conflict markers
Purpose
Compares different versions of file content.
Algorithm
- Longest Common Subsequence (LCS)
Complexity
- Time:
O(m × n) - Space:
O(m × n)
This demonstrates dynamic programming fundamentals.
After completing the in-memory system, disk persistence was added.