10 changes: 8 additions & 2 deletions README.md
@@ -20,6 +20,12 @@ This repository holds the [MMTk](https://www.mmtk.io/) bindings for Ruby. The bi
 After building Ruby and the MMTk bindings, run Ruby with `RUBY_GC_LIBRARY=mmtk` environment variable. You can also configure the following environment variables:

 - `MMTK_PLAN=<NoGC|MarkSweep|Immix>`: Configures the GC algorithm used by MMTk. Defaults to `Immix`.
-- `MMTK_HEAP_MODE=<fixed|dynamic>`: Configures the MMTk heap used. `fixed` is a fixed size heap, `dynamic` is a dynamic sized heap that will grow and shrink in size based on heuristics using the [MemBalancer](https://dl.acm.org/doi/pdf/10.1145/3563323) algorithm. Defaults to `dynamic`.
-- `MMTK_HEAP_MIN=<size>`: Configures the lower bound in heap memory usage by MMTk. Only valid when `MMTK_HEAP_MODE=dynamic`. `size` is in bytes, but you can also append `KiB`, `MiB`, `GiB` for larger sizes. Defaults to 1MiB.
+- `MMTK_HEAP_MODE=<fixed|dynamic|ruby|cpu>`: Configures the MMTk heap used. Defaults to `dynamic`.
+  - `fixed`: a fixed size heap.
+  - `dynamic`: a dynamic sized heap that will grow and shrink in size based on heuristics using the [MemBalancer](https://dl.acm.org/doi/pdf/10.1145/3563323) algorithm.
+  - `ruby`: a dynamic sized heap that grows and shrinks based on the ratio of free to used slots, using the same `RUBY_GC_HEAP_FREE_SLOTS_*_RATIO` env vars as the default Ruby GC.
+  - `cpu`: a dynamic sized heap that adjusts itself to hit a target GC CPU overhead, using the algorithm from [Tavakolisomeh et al., "Heap Size Adjustment with CPU Control" (MPLR '23)](https://dl.acm.org/doi/10.1145/3617651.3622988). Tunable via `MMTK_GC_CPU_TARGET` and `MMTK_GC_CPU_WINDOW` (see below).
+- `MMTK_HEAP_MIN=<size>`: Configures the lower bound in heap memory usage by MMTk. Only valid when `MMTK_HEAP_MODE` is `dynamic`, `ruby`, or `cpu`. `size` is in bytes, but you can also append `KiB`, `MiB`, `GiB` for larger sizes. Defaults to 1MiB.
 - `MMTK_HEAP_MAX=<size>`: Configures the upper bound in heap memory usage by MMTk. Once this limit is reached and no objects can be garbage collected, it will crash with an out-of-memory. `size` is in bytes, but you can also append `KiB`, `MiB`, `GiB` for larger sizes. Defaults to 80% of your system RAM.
+- `MMTK_GC_CPU_TARGET=<percent>`: Target GC CPU overhead, as a percentage, when `MMTK_HEAP_MODE=cpu`. After each GC cycle, the heap is grown if the measured GC CPU overhead exceeds this target and shrunk if it falls below. Defaults to `5`. The paper recommends `15` for the concurrent collector it targets (ZGC), but on MMTk-Ruby's stop-the-world Immix every percent of GC CPU also blocks the mutator, so a smaller budget gives better throughput. Empirical sweeps across ruby-bench find 5 Pareto-optimal vs. the `ruby` heap mode (~6% geomean speedup at essentially equal peak RSS).
+- `MMTK_GC_CPU_WINDOW=<n>`: Number of recent GC cycles averaged when measuring GC CPU overhead for `MMTK_HEAP_MODE=cpu`. Larger values smooth the signal at the cost of responsiveness. Defaults to `3`.
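
The decision rule described for `MMTK_HEAP_MODE=cpu` (average the GC CPU overhead over the last `MMTK_GC_CPU_WINDOW` cycles, then grow the heap when the average is above `MMTK_GC_CPU_TARGET` and shrink it when below) could be sketched in Ruby roughly as follows. This is an illustration only, not the binding's implementation: the class name `CpuTargetSketch`, its methods, and the sample overhead values are hypothetical, and because the README does not say how far the heap moves per step, the sketch reports only a direction.

```ruby
# Hypothetical sketch of the grow/shrink rule described for MMTK_HEAP_MODE=cpu.
# It reports a direction only; the real resize amounts are not modelled here.
class CpuTargetSketch
  def initialize(target_percent: 5.0, window: 3)
    @target = target_percent.to_f
    @window = window
    @samples = []
  end

  # Record the GC CPU overhead (in percent) measured for one GC cycle,
  # keeping only the most recent `window` samples.
  def record(overhead_percent)
    @samples << overhead_percent.to_f
    @samples.shift while @samples.size > @window
    self
  end

  # Mean overhead across the retained samples (0.0 before any sample).
  def average
    return 0.0 if @samples.empty?

    @samples.sum / @samples.size
  end

  # :grow above the target, :shrink below it, :hold when equal or unmeasured.
  def decision
    return :hold if @samples.empty?

    avg = average
    if avg > @target
      :grow
    elsif avg < @target
      :shrink
    else
      :hold
    end
  end
end

controller = CpuTargetSketch.new(target_percent: 5.0, window: 3)
[12.0, 9.0, 6.0].each { |sample| controller.record(sample) }
puts controller.decision
controller.record(1.0).record(1.0)
puts controller.decision
```

A larger window makes `average` react more slowly to a single expensive cycle, which is the smoothing-versus-responsiveness trade-off the `MMTK_GC_CPU_WINDOW` entry mentions.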
98 changes: 98 additions & 0 deletions bin/compare-heap-modes
@@ -0,0 +1,98 @@
#!/usr/bin/env bash
# Compare MMTk heap modes on ruby-bench.
#
# Runs the ruby-bench suite (expected checked out at $RUBY_BENCH_DIR, default
# ../ruby-bench) with two Ruby "executables" that are the same modular-GC
# Ruby, but wrapped so each sets a different MMTK_HEAP_MODE.
#
# Required env:
#   RUBY_BIN            Path to a Ruby built with --with-modular-gc and the
#                       MMTk binding installed. (e.g. ~/.rubies/ruby-mmtk/bin/ruby)
#
# Optional env:
#   RUBY_BENCH_DIR      Path to ruby-bench checkout (default: ../ruby-bench)
#   MODES               Space-separated list of heap modes to compare
#                       (default: "ruby cpu"). Others: "fixed dynamic".
#   BENCHES             Space-separated list of benchmarks to run (default: a
#                       curated small-but-GC-sensitive set). Pass empty string
#                       "" to run the whole default suite.
#   WARMUP              WARMUP_ITRS (default 5)
#   BENCH               MIN_BENCH_ITRS (default 10)
#   TIME                MIN_BENCH_TIME (default 20)
#   MMTK_GC_CPU_TARGET  CPU overhead target for `cpu` mode (default 5)
#   MMTK_GC_CPU_WINDOW  averaging window for `cpu` mode (default 3)
#
# Example:
#   RUBY_BIN=~/.rubies/ruby-mmtk/bin/ruby \
#     bin/compare-heap-modes
#
#   RUBY_BIN=~/.rubies/ruby-mmtk/bin/ruby MODES="ruby cpu dynamic" \
#     BENCHES="liquid-render psych-load railsbench" \
#     bin/compare-heap-modes

set -euo pipefail

if [ -z "${RUBY_BIN:-}" ]; then
  echo "error: RUBY_BIN must be set to a Ruby built with --with-modular-gc" >&2
  exit 64
fi
if [ ! -x "$RUBY_BIN" ]; then
  echo "error: RUBY_BIN=$RUBY_BIN is not executable" >&2
  exit 64
fi

REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
BENCH_DIR="${RUBY_BENCH_DIR:-$REPO_ROOT/../ruby-bench}"
if [ ! -d "$BENCH_DIR" ]; then
  echo "error: ruby-bench checkout not found at $BENCH_DIR" >&2
  echo " clone it with: git clone https://github.com/ruby/ruby-bench $BENCH_DIR" >&2
  exit 66
fi

# Put RUBY_BIN's bin directory first on PATH so `bundle exec`, `ruby`, and any
# shebangs invoking `ruby` resolve to the modular-GC Ruby instead of whatever
# system Ruby comes first in the caller's environment.
RUBY_BIN_DIR="$(cd "$(dirname "$RUBY_BIN")" && pwd)"
export PATH="$RUBY_BIN_DIR:$PATH"

# Clear gem-path vars that might still point at a different Ruby's gems.
unset GEM_HOME GEM_PATH BUNDLE_PATH RUBYLIB RUBYOPT 2>/dev/null || true

MODES=${MODES:-"ruby cpu"}
# A curated GC-sensitive subset. Override with BENCHES="".
DEFAULT_BENCHES="liquid-render psych-load railsbench lee binarytrees"
if [ -z "${BENCHES+x}" ]; then
  BENCHES="$DEFAULT_BENCHES"
fi

export WARMUP_ITRS="${WARMUP:-5}"
export MIN_BENCH_ITRS="${BENCH:-10}"
export MIN_BENCH_TIME="${TIME:-20}"

# Export tunables so all wrapped runs see the same values. The `ruby` mode
# ignores MMTK_GC_CPU_*; the `cpu` mode ignores RUBY_GC_HEAP_*.
export MMTK_GC_CPU_TARGET="${MMTK_GC_CPU_TARGET:-5}"
export MMTK_GC_CPU_WINDOW="${MMTK_GC_CPU_WINDOW:-3}"

WRAPPER="$REPO_ROOT/bin/ruby-mmtk-mode"
RUBY_ARGS=()
for mode in $MODES; do
  RUBY_ARGS+=(-e "mmtk-$mode::$WRAPPER $mode -- ")
done

cd "$BENCH_DIR"

echo "== compare-heap-modes =="
echo "ruby_bin: $RUBY_BIN"
echo "modes: $MODES"
echo "benches: ${BENCHES:-<all>}"
echo "warmup: $WARMUP_ITRS"
echo "bench: $MIN_BENCH_ITRS iters / $MIN_BENCH_TIME s min"
echo "cpu target: $MMTK_GC_CPU_TARGET% window=$MMTK_GC_CPU_WINDOW"
echo "---"

# `--rss` records peak RSS per run, essential for comparing memory footprint
# between heap-sizing policies.
# `--no-sudo` skips CPU governor / turbo tweaks that would need root.
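# Export an absolute path: the wrapper runs after the `cd` above, so a
# relative RUBY_BIN would no longer resolve.
export RUBY_BIN="$RUBY_BIN_DIR/$(basename "$RUBY_BIN")"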
exec bundle exec ./run_benchmarks.rb --no-sudo --rss "${RUBY_ARGS[@]}" ${BENCHES:-}
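
The way the script above assembles the `run_benchmarks.rb` command line (one `-e "name::command"` entry per heap mode, plus the benchmark names) can be mirrored in a short Ruby sketch. The function `benchmark_command` and the wrapper path in the demo are hypothetical and exist only for illustration; the `name::command` shape and the flags are copied from the script, not from ruby-bench's documentation.

```ruby
# Hypothetical Ruby rendering of how bin/compare-heap-modes builds its command.
DEFAULT_BENCHES = "liquid-render psych-load railsbench lee binarytrees"

def benchmark_command(env, wrapper)
  # Like ${MODES:-"ruby cpu"}: unset or empty falls back to the default.
  modes = env["MODES"].to_s
  modes = "ruby cpu" if modes.empty?

  # Like ${BENCHES+x}: only an unset BENCHES selects the curated subset.
  # An empty string yields no names, leaving run_benchmarks.rb to run its
  # whole default suite.
  benches = env.key?("BENCHES") ? env["BENCHES"].split : DEFAULT_BENCHES.split

  # One `-e "name::command"` pair per mode, as in the RUBY_ARGS loop.
  ruby_args = modes.split.flat_map do |mode|
    ["-e", "mmtk-#{mode}::#{wrapper} #{mode} -- "]
  end

  ["bundle", "exec", "./run_benchmarks.rb", "--no-sudo", "--rss", *ruby_args, *benches]
end

cmd = benchmark_command({ "MODES" => "ruby cpu dynamic", "BENCHES" => "" },
                        "/repo/bin/ruby-mmtk-mode")
puts cmd.inspect
```

Building the command as an array rather than a single string keeps each `-e` specification intact as one argument, which is what the quoted `"${RUBY_ARGS[@]}"` expansion does in the shell version.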
48 changes: 48 additions & 0 deletions bin/ruby-mmtk-mode
@@ -0,0 +1,48 @@
#!/bin/sh
# Wrapper that invokes a modular-GC Ruby with MMTk + a specific MMTK_HEAP_MODE.
#
# ruby-bench's run_benchmarks.rb compares Ruby executables passed via `-e`.
# Each `-e` entry is a single command line, so to compare two MMTk heap modes
# we need one executable per mode with the relevant env vars already baked in.
#
# Usage:
#   bin/ruby-mmtk-mode <mode> [VAR=VAL ...] -- <ruby args...>
#
# The caller is expected to set RUBY_BIN to the path of a Ruby built with
# --with-modular-gc and the MMTk binding installed (or to have it on $PATH).
#
# Examples:
#   RUBY_BIN=$HOME/.rubies/ruby-mmtk/bin/ruby bin/ruby-mmtk-mode ruby -- -e 'puts GC.config'
#   RUBY_BIN=$HOME/.rubies/ruby-mmtk/bin/ruby bin/ruby-mmtk-mode cpu -- -e 'puts GC.config'

set -eu

if [ $# -lt 1 ]; then
  echo "usage: $0 <heap_mode> [VAR=VAL ...] -- <ruby args>" >&2
  exit 64
fi

MODE=$1
shift

# Optional additional env vars before the `--` separator.
while [ $# -gt 0 ] && [ "$1" != "--" ]; do
  case "$1" in
    *=*) export "$1" ;;
    *)
      echo "$0: expected VAR=VAL or --, got: $1" >&2
      exit 64
      ;;
  esac
  shift
done
if [ $# -gt 0 ] && [ "$1" = "--" ]; then
  shift
fi

RUBY=${RUBY_BIN:-ruby}

exec env \
  RUBY_GC_LIBRARY=mmtk \
  MMTK_HEAP_MODE="$MODE" \
  "$RUBY" "$@"
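
The wrapper's argument handling (a heap mode, then optional `VAR=VAL` pairs, then `--`, then the arguments passed through to Ruby) can be restated as a small Ruby sketch. The function `parse_wrapper_args` is hypothetical and only models the parsing; it does not launch anything. As in the shell version, `RUBY_GC_LIBRARY` and `MMTK_HEAP_MODE` are applied last, so they override any same-named pair supplied by the caller.

```ruby
# Hypothetical Ruby model of bin/ruby-mmtk-mode's argument parsing.
# Returns [env_hash, ruby_args] instead of exec'ing a process.
def parse_wrapper_args(argv)
  args = argv.dup
  raise ArgumentError, "usage: <heap_mode> [VAR=VAL ...] -- <ruby args>" if args.empty?

  mode = args.shift
  env = {}

  # Consume VAR=VAL pairs up to (but not including) the `--` separator.
  until args.empty? || args.first == "--"
    pair = args.shift
    raise ArgumentError, "expected VAR=VAL or --, got: #{pair}" unless pair.include?("=")

    key, value = pair.split("=", 2)
    env[key] = value
  end
  args.shift if args.first == "--"

  # Forced settings go last so they win, matching `exec env ... "$RUBY"`.
  env["RUBY_GC_LIBRARY"] = "mmtk"
  env["MMTK_HEAP_MODE"] = mode
  [env, args]
end

env, ruby_args = parse_wrapper_args(
  ["cpu", "MMTK_GC_CPU_TARGET=10", "--", "-e", "puts GC.config"]
)
p env
p ruby_args
```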
113 changes: 113 additions & 0 deletions bin/smoke-test
@@ -0,0 +1,113 @@
#!/usr/bin/env ruby
# frozen_string_literal: true

# Smoke test for MMTk heap modes.
#
# Runs an allocation-heavy loop under a given MMTK_HEAP_MODE and reports:
# - the mode Ruby actually booted with (GC.config)
# - GC cycle count triggered during the loop
# - wall-clock time and process CPU time
# - peak resident set size (peak RSS)
#
# Usage (after `rake install:release` against a modular-GC Ruby):
#
# bin/smoke-test # defaults to MMTK_HEAP_MODE=cpu
# MMTK_HEAP_MODE=ruby bin/smoke-test
# MMTK_HEAP_MODE=cpu MMTK_GC_CPU_TARGET=10 bin/smoke-test
# SMOKE_ITERATIONS=2_000_000 bin/smoke-test # longer run for trigger to adapt
#
# If this script is run without RUBY_GC_LIBRARY=mmtk set, it will re-exec
# itself with that env var plus whatever other MMTK_* vars you passed.

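# RbConfig is normally preloaded by RubyGems; require it explicitly so the
# script also works when gems are disabled.
require "rbconfig"

unless ENV["RUBY_GC_LIBRARY"] == "mmtk"
  ENV["RUBY_GC_LIBRARY"] = "mmtk"
  ENV["MMTK_HEAP_MODE"] ||= "cpu"
  exec(RbConfig.ruby, __FILE__, *ARGV)
end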

impl = GC.config[:implementation]
unless impl == "mmtk"
  abort "smoke-test: expected GC implementation 'mmtk', got #{impl.inspect}. " \
        "Is your Ruby built with --with-modular-gc and is the binding installed?"
end

require "fiddle"

# getrusage(RUSAGE_SELF) returns peak RSS in ru_maxrss. On macOS the value is
# in bytes; on Linux it's in kilobytes.
module Rusage
  extend self

  RUSAGE_SELF = 0

  # struct rusage on 64-bit macOS/Linux starts with two struct timeval fields
  # (ru_utime, ru_stime), each 16 bytes wide, followed by 14 long integers.
  # ru_maxrss is the first of those longs, so viewed as an array of 64-bit
  # longs the layout is:
  #   [0..1] ru_utime (sec, usec)
  #   [2..3] ru_stime (sec, usec)
  #   [4]    ru_maxrss <-- what we want
  #   ...    more fields we don't use
  STRUCT_LONGS = 18

  def peak_rss_bytes
    libc = Fiddle::Handle::DEFAULT
    getrusage = Fiddle::Function.new(
      libc["getrusage"], [Fiddle::TYPE_INT, Fiddle::TYPE_VOIDP], Fiddle::TYPE_INT
    )
    buf = Fiddle::Pointer.malloc(STRUCT_LONGS * Fiddle::SIZEOF_LONG, Fiddle::RUBY_FREE)
    raise "getrusage failed" unless getrusage.call(RUSAGE_SELF, buf) == 0
    # "l!" unpacks a native-width long, matching Fiddle::SIZEOF_LONG.
    maxrss = buf[4 * Fiddle::SIZEOF_LONG, Fiddle::SIZEOF_LONG].unpack1("l!")
    # macOS reports bytes, Linux reports kilobytes.
    RbConfig::CONFIG["host_os"].include?("darwin") ? maxrss : maxrss * 1024
  end
end

puts "== MMTk smoke test =="
puts "implementation: #{GC.config[:implementation]}"
puts "mmtk_plan: #{GC.config[:mmtk_plan]}"
puts "mmtk_heap_mode: #{GC.config[:mmtk_heap_mode]}"
puts "mmtk_heap_min: #{GC.config[:mmtk_heap_min]}" if GC.config[:mmtk_heap_min]
puts "mmtk_heap_max: #{GC.config[:mmtk_heap_max]}"
puts "mmtk_worker_count: #{GC.config[:mmtk_worker_count]}"
if GC.config[:mmtk_heap_mode] == "cpu"
  puts "cpu target (env): #{ENV.fetch('MMTK_GC_CPU_TARGET', '5')}%"
  puts "cpu window (env): #{ENV.fetch('MMTK_GC_CPU_WINDOW', '3')}"
end
puts "---"

ITERATIONS = Integer(ENV.fetch("SMOKE_ITERATIONS", 500_000))
OBJECT_SIZE = Integer(ENV.fetch("SMOKE_OBJECT_SIZE", 256))
LIVE_SET = Integer(ENV.fetch("SMOKE_LIVE_SET", 2_000))

# The workload: maintain a rolling working set of LIVE_SET objects, each
# OBJECT_SIZE bytes. Each iteration allocates a new object and drops an old
# one. This produces a steady stream of garbage and a predictable live-set
# size, so the CPU trigger has a stable signal to converge on.

gc_before = GC.count
t_wall_start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
t_cpu_start = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID)

sink = Array.new(LIVE_SET) { String.new("x" * OBJECT_SIZE) }
i = 0
while i < ITERATIONS
  sink[i % LIVE_SET] = String.new("x" * OBJECT_SIZE)
  i += 1
end

t_wall_end = Process.clock_gettime(Process::CLOCK_MONOTONIC)
t_cpu_end = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID)
gc_after = GC.count

wall_s = t_wall_end - t_wall_start
cpu_s = t_cpu_end - t_cpu_start
rss = Rusage.peak_rss_bytes

printf "iterations: %d (live set %d x %dB)\n", ITERATIONS, LIVE_SET, OBJECT_SIZE
printf "gc cycles: %d (before=%d, after=%d)\n", (gc_after - gc_before), gc_before, gc_after
printf "wall time: %.3fs\n", wall_s
printf "cpu time: %.3fs (%.1f%% of wall)\n", cpu_s, (cpu_s / wall_s) * 100.0
printf "peak rss: %.1f MiB (%d bytes)\n", rss / 1024.0 / 1024.0, rss
puts "OK"
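
The unit handling in `Rusage.peak_rss_bytes` (macOS reports `ru_maxrss` in bytes, Linux in kilobytes) can be isolated into a pure function that is easy to check without calling `getrusage`. The helpers `maxrss_to_bytes` and `format_mib` below are hypothetical names used only for this sketch, and the raw values and `host_os` strings are made-up inputs.

```ruby
# Hypothetical helper: normalize a raw ru_maxrss reading to bytes, using the
# same darwin check as the smoke test.
def maxrss_to_bytes(raw, host_os)
  host_os.include?("darwin") ? raw : raw * 1024
end

# Format a byte count the way the smoke test prints peak RSS.
def format_mib(bytes)
  format("%.1f MiB", bytes / 1024.0 / 1024.0)
end

# The same 50 MiB peak as each platform would report it.
puts format_mib(maxrss_to_bytes(51_200, "linux-gnu"))     # kilobytes on Linux
puts format_mib(maxrss_to_bytes(52_428_800, "darwin23"))  # bytes on macOS
```

Keeping the conversion separate from the Fiddle call means a mistaken unit assumption shows up in a plain comparison rather than as an implausible RSS figure at the end of a benchmark run.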