SQLite Concurrency Best Practices for sqlite-graph

Overview

This document outlines best practices for handling concurrency in sqlite-graph, informed by real-world production experience from projects like Jellyfin's SQLite implementation.

Understanding SQLite's Concurrency Model

Core Limitations

SQLite's concurrency model has fundamental constraints:

Single Writer: Only one write transaction at a time
Multiple Readers: Unlimited concurrent reads (with WAL mode)
Lock Contention: Writers block readers in rollback mode, but not in WAL mode

Lock Types

Lock Type	Scope	Impact	Duration
SHARED	Read operations	Blocks writes	Query duration
RESERVED	Intent to write	Blocks other writers	Transaction duration
PENDING	Waiting for write	Blocks new readers	Short (transition)
EXCLUSIVE	Active write	Blocks all operations	Write completion

WAL Mode (Write-Ahead Logging)

What is WAL Mode?

WAL (Write-Ahead Logging) changes how SQLite handles transactions:

Traditional Rollback Mode:

Writes directly to database file
Creates rollback journal for recovery
Writers block readers
Readers block writers

WAL Mode:

Writes go to separate WAL file first
Multiple readers can proceed during writes
WAL periodically checkpointed to main database
Better concurrency for read-heavy workloads

Enabling WAL Mode

import { GraphDatabase } from 'sqlite-graph';

const db = new GraphDatabase('./graph.db');

// Enable WAL mode for better concurrency
db.db.pragma('journal_mode = WAL');

// Verify it's enabled
const mode = db.db.pragma('journal_mode', { simple: true });
console.log(`Journal mode: ${mode}`); // Should output: "wal"

WAL Mode Configuration

// Full WAL optimization
db.db.pragma('journal_mode = WAL');
db.db.pragma('synchronous = NORMAL'); // Faster, still safe with WAL
db.db.pragma('wal_autocheckpoint = 1000'); // Checkpoint every 1000 pages
db.db.pragma('journal_size_limit = 6144000'); // 6MB WAL size limit

When to Use WAL Mode

✅ Use WAL when:

Read-heavy workloads (80%+ reads)
Multiple concurrent readers expected
Write latency is acceptable
Disk space available for WAL file (can grow to ~2x database size)

❌ Avoid WAL when:

Write-heavy workloads (>50% writes)
Network filesystems (NFS, SMB) - WAL requires local filesystem
Very small databases (<100KB) - overhead not worth it
Multiple processes need to write simultaneously (use client-server DB instead)

Locking Strategies

Based on Jellyfin's experience, there are three primary strategies for handling lock contention:

1. No-Lock Strategy (Default)

When to use: 99% of use cases with low contention

Characteristics:

No synchronization overhead
Fast for single-writer scenarios
Relies on SQLite's internal locking
Fails fast on lock conflicts

Implementation:

// sqlite-graph uses this by default
const db = new GraphDatabase('./graph.db');
db.db.pragma('journal_mode = WAL');

try {
  db.createNode('Job', { title: 'Engineer' });
} catch (error) {
  if (error.message.includes('SQLITE_BUSY')) {
    console.error('Database locked, try again later');
  }
  throw error;
}

Best for:

Single-process applications
Low write concurrency (<10 writes/sec)
Embedded applications (desktop, mobile)
Background processing with single writer

2. Optimistic Locking with Retry

When to use: Moderate write contention, can tolerate brief delays

Characteristics:

"Try and Retry" approach
Automatic retry on lock conflicts
Exponential backoff
Maximum retry attempts

Implementation:

/**
 * Retry wrapper for write operations
 */
async function withRetry<T>(
  operation: () => T,
  maxRetries: number = 5,
  initialDelayMs: number = 10
): Promise<T> {
  let lastError: Error | undefined;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return operation();
    } catch (error: any) {
      lastError = error;

      // Only retry on lock errors
      if (!error.message.includes('SQLITE_BUSY') &&
          !error.message.includes('database is locked')) {
        throw error;
      }

      // Exponential backoff: 10ms, 20ms, 40ms, 80ms, 160ms
      const delayMs = initialDelayMs * Math.pow(2, attempt);
      await new Promise(resolve => setTimeout(resolve, delayMs));

      console.warn(`Retry attempt ${attempt + 1}/${maxRetries} after ${delayMs}ms`);
    }
  }

  throw new Error(`Operation failed after ${maxRetries} retries: ${lastError?.message}`);
}

// Usage
await withRetry(() => {
  db.createNode('Job', { title: 'Engineer' });
});

await withRetry(() => {
  db.mergeNode('Company', { name: 'TechCorp' }, { industry: 'SaaS' });
});

Advanced Retry with jitter:

function exponentialBackoffWithJitter(attempt: number, baseMs: number = 10): number {
  const exponential = baseMs * Math.pow(2, attempt);
  const jitter = Math.random() * exponential * 0.1; // ±10% jitter
  return exponential + jitter;
}

Best for:

Web applications with moderate concurrency (10-100 concurrent users)
Background job processing with multiple workers
API servers with bursty write patterns
Batch import operations

3. Pessimistic Locking with Queue

When to use: High write contention, strict ordering requirements

Characteristics:

Single write queue
Guaranteed write ordering
Higher latency but predictable
Prevents lock conflicts entirely

Implementation:

/**
 * Write queue for pessimistic locking
 */
class WriteQueue {
  private queue: Array<() => void> = [];
  private processing = false;

  async enqueue<T>(operation: () => T): Promise<T> {
    return new Promise((resolve, reject) => {
      this.queue.push(() => {
        try {
          const result = operation();
          resolve(result);
        } catch (error) {
          reject(error);
        }
      });

      this.processQueue();
    });
  }

  private async processQueue(): Promise<void> {
    if (this.processing || this.queue.length === 0) {
      return;
    }

    this.processing = true;

    while (this.queue.length > 0) {
      const operation = this.queue.shift()!;
      operation();

      // Small delay between operations to allow checkpoint
      await new Promise(resolve => setTimeout(resolve, 1));
    }

    this.processing = false;
  }
}

// Usage
const writeQueue = new WriteQueue();

// All writes go through queue
await writeQueue.enqueue(() => {
  db.createNode('Job', { title: 'Engineer' });
});

await writeQueue.enqueue(() => {
  db.mergeNode('Company', { name: 'TechCorp' }, { industry: 'SaaS' });
});

Best for:

High-concurrency web applications (100+ concurrent users)
Real-time systems requiring strict ordering
Financial systems requiring ACID guarantees
Multi-tenant applications with shared database

Transaction Best Practices

Batch Operations in Transactions

Problem: Individual writes are slow

// ❌ SLOW: 1000 individual transactions
for (let i = 0; i < 1000; i++) {
  db.createNode('Job', { title: `Job ${i}` });
} // ~1000ms

// ✅ FAST: Single transaction
db.transaction(() => {
  for (let i = 0; i < 1000; i++) {
    db.createNode('Job', { title: `Job ${i}` });
  }
}); // ~50ms (20x faster!)

Keep Transactions Short

Problem: Long transactions increase lock contention

// ❌ BAD: Long-running transaction blocks other operations
db.transaction(() => {
  const jobs = db.nodes('Job').exec(); // Fast

  jobs.forEach(job => {
    // Slow operation inside transaction
    const analysis = expensiveAnalysis(job); // 100ms each
    db.updateNode(job.id, { analysis });
  });
}); // Holds lock for seconds!

// ✅ GOOD: Do expensive work outside transaction
const jobs = db.nodes('Job').exec();

const updates = jobs.map(job => ({
  id: job.id,
  analysis: expensiveAnalysis(job) // Outside transaction
}));

// Fast batch update
db.transaction(() => {
  updates.forEach(({ id, analysis }) => {
    db.updateNode(id, { analysis });
  });
}); // Holds lock for milliseconds

Use Savepoints for Partial Rollback

db.transaction(() => {
  // Create base data
  const company = db.createNode('Company', { name: 'TechCorp' });

  // Savepoint before risky operation
  const sp = db.savepoint();

  try {
    // Risky operation that might fail
    const job = db.createNode('Job', { title: 'Engineer' });
    db.createEdge(job.id, 'POSTED_BY', company.id);
  } catch (error) {
    // Rollback just the risky part
    db.rollbackTo(sp);
    console.warn('Job creation failed, keeping company');
  }

  // Transaction continues with company created
});

Index Strategy for Merge Operations

Indexes are critical for merge operation performance (7.11x speedup):

// ❌ Without index: 4,196 ops/sec
db.mergeNode('Job', { url: 'https://example.com/job/123' });

// ✅ With index: 29,844 ops/sec (7.11x faster!)
db.createPropertyIndex('Job', 'url');
db.mergeNode('Job', { url: 'https://example.com/job/123' });

Index Creation Strategy

/**
 * Create indexes for frequently merged properties
 */
function setupProductionIndexes(db: GraphDatabase): void {
  // Indexes for unique identifiers
  db.createPropertyIndex('Job', 'url');
  db.createPropertyIndex('Company', 'name');
  db.createPropertyIndex('Company', 'domain');
  db.createPropertyIndex('User', 'email');

  // Indexes for frequently queried properties
  db.createPropertyIndex('Job', 'status');
  db.createPropertyIndex('Application', 'status');

  console.log('Production indexes created');
}

// Run on database initialization
const db = new GraphDatabase('./graph.db');
db.db.pragma('journal_mode = WAL');
setupProductionIndexes(db);

Multi-Process Architecture Patterns

Pattern 1: Single Writer, Multiple Readers

Use case: API server with read replicas

// writer.ts (single process)
const writerDb = new GraphDatabase('./graph.db');
writerDb.db.pragma('journal_mode = WAL');

app.post('/jobs', async (req, res) => {
  const job = writerDb.createNode('Job', req.body);
  res.json(job);
});

// reader.ts (multiple processes)
const readerDb = new GraphDatabase('./graph.db');
readerDb.db.pragma('journal_mode = WAL');

app.get('/jobs', async (req, res) => {
  const jobs = readerDb.nodes('Job').where(req.query).exec();
  res.json(jobs);
});

Pattern 2: Write Queue Service

Use case: Multiple services need to write

// write-service.ts (dedicated write process)
import express from 'express';

const db = new GraphDatabase('./graph.db');
db.db.pragma('journal_mode = WAL');
const writeQueue = new WriteQueue();

app.post('/write', async (req, res) => {
  const result = await writeQueue.enqueue(() => {
    return db.createNode(req.body.type, req.body.properties);
  });
  res.json(result);
});

// application-service.ts (calls write service)
async function createJob(data: JobProperties) {
  const response = await fetch('http://write-service/write', {
    method: 'POST',
    body: JSON.stringify({ type: 'Job', properties: data })
  });
  return response.json();
}

Pattern 3: Message Queue with Worker Pool

Use case: High-throughput batch processing

import { Worker } from 'worker_threads';

// main.ts
const writeWorker = new Worker('./write-worker.ts');

messageQueue.on('message', async (msg) => {
  writeWorker.postMessage({ operation: 'createNode', data: msg });
});

// write-worker.ts (single thread owns database connection)
import { parentPort } from 'worker_threads';

const db = new GraphDatabase('./graph.db');
db.db.pragma('journal_mode = WAL');

parentPort?.on('message', (msg) => {
  db.transaction(() => {
    const result = db.createNode(msg.data.type, msg.data.properties);
    parentPort?.postMessage({ success: true, result });
  });
});

Monitoring and Debugging

Detect Lock Contention

/**
 * Monitor database lock wait times
 */
function monitorLockWaits(db: GraphDatabase): void {
  let lockWaits = 0;

  const originalExec = db.db.prepare.bind(db.db);
  db.db.prepare = function(sql: string) {
    const stmt = originalExec(sql);
    const originalRun = stmt.run.bind(stmt);

    stmt.run = function(...args: any[]) {
      const start = Date.now();
      try {
        return originalRun(...args);
      } catch (error: any) {
        if (error.message.includes('SQLITE_BUSY')) {
          lockWaits++;
          console.warn(`Lock wait detected (${lockWaits} total)`);
        }
        throw error;
      } finally {
        const duration = Date.now() - start;
        if (duration > 100) {
          console.warn(`Slow query: ${sql} took ${duration}ms`);
        }
      }
    };

    return stmt;
  };
}

WAL Checkpoint Monitoring

/**
 * Monitor WAL file size and checkpoint frequency
 */
function monitorWAL(db: GraphDatabase): void {
  setInterval(() => {
    const walSize = db.db.pragma('wal_checkpoint(PASSIVE)', { simple: true });
    console.log(`WAL checkpoint: ${walSize} pages`);
  }, 60000); // Check every minute
}

Performance Metrics

interface ConcurrencyMetrics {
  totalWrites: number;
  lockWaits: number;
  avgWriteTime: number;
  maxWriteTime: number;
  retryCount: number;
}

class MetricsCollector {
  private metrics: ConcurrencyMetrics = {
    totalWrites: 0,
    lockWaits: 0,
    avgWriteTime: 0,
    maxWriteTime: 0,
    retryCount: 0
  };

  recordWrite(durationMs: number, hadRetry: boolean): void {
    this.metrics.totalWrites++;
    this.metrics.avgWriteTime =
      (this.metrics.avgWriteTime * (this.metrics.totalWrites - 1) + durationMs) /
      this.metrics.totalWrites;
    this.metrics.maxWriteTime = Math.max(this.metrics.maxWriteTime, durationMs);
    if (hadRetry) this.metrics.retryCount++;
  }

  recordLockWait(): void {
    this.metrics.lockWaits++;
  }

  report(): ConcurrencyMetrics {
    return { ...this.metrics };
  }
}

Production Recommendations

Startup Configuration

/**
 * Production-ready database initialization
 */
function initProductionDatabase(path: string): GraphDatabase {
  const db = new GraphDatabase(path);

  // Enable WAL mode for better concurrency
  db.db.pragma('journal_mode = WAL');

  // Optimize for mixed workload
  db.db.pragma('synchronous = NORMAL'); // Safe with WAL
  db.db.pragma('cache_size = -64000'); // 64MB cache
  db.db.pragma('temp_store = MEMORY'); // Temp tables in RAM
  db.db.pragma('mmap_size = 30000000000'); // 30GB mmap
  db.db.pragma('page_size = 4096'); // Standard page size

  // WAL-specific optimizations
  db.db.pragma('wal_autocheckpoint = 1000'); // Checkpoint every 1000 pages
  db.db.pragma('journal_size_limit = 6144000'); // 6MB WAL limit

  // Set busy timeout (milliseconds to wait on lock)
  db.db.pragma('busy_timeout = 5000'); // 5 second timeout

  return db;
}

Error Handling

/**
 * Handle database errors with appropriate retry logic
 */
async function executeWithRetry<T>(
  db: GraphDatabase,
  operation: () => T,
  maxRetries: number = 3
): Promise<T> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return operation();
    } catch (error: any) {
      // Handle specific SQLite errors
      if (error.code === 'SQLITE_BUSY' || error.code === 'SQLITE_LOCKED') {
        if (attempt < maxRetries - 1) {
          await new Promise(resolve =>
            setTimeout(resolve, exponentialBackoffWithJitter(attempt))
          );
          continue;
        }
      }

      // Non-retryable error or max retries reached
      console.error(`Database operation failed:`, error);
      throw error;
    }
  }

  throw new Error('Unreachable');
}

Decision Matrix

Scenario	Write Frequency	Concurrency	Recommended Strategy
Desktop app	Low (<1/sec)	Single process	No-Lock + WAL
CLI tool	Batch	Single process	No-Lock + Transactions
API server	Medium (1-10/sec)	10-50 users	Optimistic + WAL
API server	High (10-100/sec)	50-500 users	Pessimistic Queue + WAL
Background jobs	Batch	2-10 workers	Optimistic + WAL + Batch transactions
Real-time system	High (>100/sec)	>500 users	⚠️ Consider client-server DB
Multi-tenant SaaS	Variable	>1000 users	⚠️ Consider client-server DB

When to Upgrade

If you're hitting these limits, consider upgrading to a client-server database (PostgreSQL, Neo4j):

Upgrade signals:

Lock waits >5% of operations
Write latency >100ms consistently
Need true multi-writer (not just read concurrency)
Need horizontal scaling/replication
Database size >10GB
Write throughput >1000 ops/sec sustained

Summary

Key Takeaways:

Enable WAL mode for production use - it's free performance for read-heavy workloads
Choose locking strategy based on write concurrency:
- No-Lock: Single process, low contention
- Optimistic: Moderate concurrency, retry acceptable
- Pessimistic: High concurrency, strict ordering
Batch operations in transactions for 10-20x speedup
Index merge properties for 7x speedup on merge operations
Keep transactions short to minimize lock contention
Monitor metrics to detect when you're approaching limits
Know when to upgrade to a client-server database

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SQLite Concurrency Best Practices for sqlite-graph

Overview

Understanding SQLite's Concurrency Model

Core Limitations

Lock Types

WAL Mode (Write-Ahead Logging)

What is WAL Mode?

Enabling WAL Mode

WAL Mode Configuration

When to Use WAL Mode

Locking Strategies

1. No-Lock Strategy (Default)

2. Optimistic Locking with Retry

3. Pessimistic Locking with Queue

Transaction Best Practices

Batch Operations in Transactions

Keep Transactions Short

Use Savepoints for Partial Rollback

Index Strategy for Merge Operations

Index Creation Strategy

Multi-Process Architecture Patterns

Pattern 1: Single Writer, Multiple Readers

Pattern 2: Write Queue Service

Pattern 3: Message Queue with Worker Pool

Monitoring and Debugging

Detect Lock Contention

WAL Checkpoint Monitoring

Performance Metrics

Production Recommendations

Startup Configuration

Error Handling

Decision Matrix

When to Upgrade

Summary

References

FilesExpand file tree

CONCURRENCY-BEST-PRACTICES.md

Latest commit

History

CONCURRENCY-BEST-PRACTICES.md

File metadata and controls

SQLite Concurrency Best Practices for sqlite-graph

Overview

Understanding SQLite's Concurrency Model

Core Limitations

Lock Types

WAL Mode (Write-Ahead Logging)

What is WAL Mode?

Enabling WAL Mode

WAL Mode Configuration

When to Use WAL Mode

Locking Strategies

1. No-Lock Strategy (Default)

2. Optimistic Locking with Retry

3. Pessimistic Locking with Queue

Transaction Best Practices

Batch Operations in Transactions

Keep Transactions Short

Use Savepoints for Partial Rollback

Index Strategy for Merge Operations

Index Creation Strategy

Multi-Process Architecture Patterns

Pattern 1: Single Writer, Multiple Readers

Pattern 2: Write Queue Service

Pattern 3: Message Queue with Worker Pool

Monitoring and Debugging

Detect Lock Contention

WAL Checkpoint Monitoring

Performance Metrics

Production Recommendations

Startup Configuration

Error Handling

Decision Matrix

When to Upgrade

Summary

References