Skip to content

Latest commit

 

History

History
600 lines (388 loc) · 15.9 KB

File metadata and controls

600 lines (388 loc) · 15.9 KB

UPLC-CAPE Metrics Reference

Comprehensive documentation for all metrics collected and calculated by the UPLC-CAPE benchmarking framework.

Overview

The metrics.json file (generated by cape submission measure) contains both raw measurements from UPLC evaluation and derived metrics calculated from Conway mainnet protocol parameters. These metrics help assess both script efficiency and real-world production impact.

Metrics Categories


Raw Measurements

These metrics are directly measured from UPLC script evaluation:

cpu_units

Type: Integer aggregations (maximum, sum, minimum, median, sum_positive, sum_negative)

Description: CPU execution units consumed during script evaluation. Represents computational cost.

Aggregations:

  • maximum: Worst-case CPU cost across all test cases
  • sum: Total CPU cost across all test cases
  • minimum: Best-case CPU cost across all test cases
  • median: Median CPU cost across all test cases
  • sum_positive: Total CPU cost from successful evaluations only
  • sum_negative: Total CPU cost from failed evaluations only

Usage: Primary metric for comparing computational efficiency between implementations.

Example:

"cpu_units": {
  "maximum": 500000000,
  "sum": 1500000000,
  "minimum": 450000000,
  "median": 500000000,
  "sum_positive": 1500000000,
  "sum_negative": 0
}

memory_units

Type: Integer aggregations (maximum, sum, minimum, median, sum_positive, sum_negative)

Description: Memory units consumed during script evaluation. Represents memory cost.

Aggregations: Same as cpu_units

Usage: Secondary metric for comparing memory efficiency between implementations.

Example:

"memory_units": {
  "maximum": 1000000,
  "sum": 3000000,
  "minimum": 900000,
  "median": 1000000,
  "sum_positive": 3000000,
  "sum_negative": 0
}

script_size_bytes

Type: Integer (constant)

Description: Size of the serialized UPLC script in bytes. This is the on-chain storage size when deployed as a reference script.

Usage: Indicates deployment cost and reference script fees. Smaller is better for storage costs.

Example:

"script_size_bytes": 10000

term_size

Type: Integer (constant)

Description: Number of AST (Abstract Syntax Tree) nodes in the UPLC term. Represents script complexity.

Usage: Internal metric for script complexity analysis. Not directly related to execution cost.

Example:

"term_size": 1234

Aggregation Strategy for Derived Metrics

Derived metrics use a hybrid aggregation strategy based on the semantic meaning of each metric:

Budget and Capacity Metrics: maximum aggregation

Rationale: These metrics answer "Can my validator handle the worst-case input?"

  • Validators process adversarial, user-controlled inputs in production
  • Worst-case performance matters most (DoS attacks, griefing scenarios)
  • A validator cheap on average but expensive on edge cases = security risk
  • maximum captures real-world deployment constraints

Metrics using maximum:

  • tx_memory_budget_pct, tx_cpu_budget_pct
  • block_memory_budget_pct, block_cpu_budget_pct
  • scripts_per_tx, scripts_per_block

Fee Metrics: sum aggregation

Rationale: These metrics answer "What is the total cost of running the complete test suite?"

  • Represents actual cost incurred during benchmarking verification
  • Provides aggregate cost comparison across implementations
  • Useful for estimating verification overhead

Metrics using sum:

  • execution_fee_lovelace
  • reference_script_fee_lovelace
  • total_fee_lovelace

Gaming Resistance

This hybrid approach makes optimization gaming significantly harder:

  • Cannot optimize fees without improving individual executions
  • Raw metrics (all aggregations) remain transparent for scrutiny
  • Community review via source code submission
  • Distribution anomalies (e.g., high variance) are detectable

Example: A submission with normal sum but suspicious max (or vice versa) would be flagged during review.


Derived Metrics

These metrics are calculated from raw measurements using Conway mainnet protocol parameters.

Fee Calculations

execution_fee_lovelace

Type: Integer (lovelace)

Aggregation: Uses sum of CPU and memory units

Description: Total cost of executing all test cases in the benchmark suite.

Formula:

execution_fee = ceiling((memory_units.sum × 0.0577) + (cpu_units.sum × 0.0000721))

Protocol Parameters:

  • Memory price: 0.0577 lovelace/unit (577/10000)
  • CPU price: 0.0000721 lovelace/step (721/10000000)

Usage: Represents the total execution cost for running the complete test suite.

Example:

"execution_fee_lovelace": 93750

For 1,000,000 memory units and 500,000,000 CPU steps:

  • Memory cost: 1,000,000 × 0.0577 = 57,700 lovelace
  • CPU cost: 500,000,000 × 0.0000721 = 36,050 lovelace
  • Total: 93,750 lovelace

reference_script_fee_lovelace

Type: Integer (lovelace)

Description: Cost of storing the script as a reference script using Conway era tiered pricing.

Formula: Tiered pricing where each 25 KiB tier costs 1.2× the previous tier.

Tier Structure:

Tier Size Range (bytes) Price/Byte Cumulative Cost
1 0 - 25,600 15 384,000
2 25,600 - 51,200 18 844,800
3 51,200 - 76,800 21.6 1,397,760
4 76,800 - 102,400 25.92 2,061,312
5 102,400 - 128,000 31.104 2,857,574
6 128,000 - 153,600 37.3248 3,813,089
7 153,600 - 179,200 44.78976 4,959,707
8 179,200 - 204,800 53.747712 6,335,648

Maximum: 200 KiB per transaction (hard limit)

Usage: Represents the fixed cost of deploying the script as a reference script.

Example:

"reference_script_fee_lovelace": 150000

For a 10 KB (10,000 bytes) script:

  • Falls entirely in Tier 1
  • 10,000 × 15 = 150,000 lovelace

For a 75 KB (75,000 bytes) script:

  • Tier 1: 25,600 × 15 = 384,000
  • Tier 2: 25,600 × 18 = 460,800
  • Tier 3: 23,800 × 21.6 = 514,080
  • Total: 1,358,880 lovelace

total_fee_lovelace

Type: Integer (lovelace)

Aggregation: Uses sum (via execution_fee) + reference_script_fee

Description: Combined execution and reference script fee for the complete test suite.

Formula:

total_fee = execution_fee + reference_script_fee

Important: This does NOT include the base transaction fee (txFeeFixed + txFeePerByte × tx_size), as we're comparing script performance, not full transaction costs.

Usage: Total script-related cost for deploying and running all test cases.

Example:

"total_fee_lovelace": 243750

For the 10 KB script example:

  • Execution: 93,750 lovelace
  • Reference script: 150,000 lovelace
  • Total: 243,750 lovelace (~0.244 ADA)

Budget Utilization

These metrics show what percentage of transaction and block execution budgets are consumed by the worst-case execution.

tx_memory_budget_pct

Type: Double (percentage)

Aggregation: Uses maximum memory units

Description: Percentage of the transaction memory budget used in the worst-case execution.

Formula:

tx_memory_budget_pct = (memory_units.maximum / 14,000,000) × 100

Limit: 14,000,000 memory units per transaction

Usage: Indicates how much of the transaction's memory budget this script consumes in the worst case. Values >50% may limit how many scripts can fit in a transaction.

Example:

"tx_memory_budget_pct": 7.14

For 1,000,000 memory units:

  • 1,000,000 / 14,000,000 × 100 = 7.14%

tx_cpu_budget_pct

Type: Double (percentage)

Aggregation: Uses maximum CPU units

Description: Percentage of the transaction CPU budget used in the worst-case execution.

Formula:

tx_cpu_budget_pct = (cpu_units.maximum / 10,000,000,000) × 100

Limit: 10,000,000,000 CPU steps per transaction

Usage: Indicates how much of the transaction's CPU budget this script consumes in the worst case.

Example:

"tx_cpu_budget_pct": 5.0

For 500,000,000 CPU steps:

  • 500,000,000 / 10,000,000,000 × 100 = 5.0%

block_memory_budget_pct

Type: Double (percentage)

Aggregation: Uses maximum memory units

Description: Percentage of the block memory budget used in the worst-case execution.

Formula:

block_memory_budget_pct = (memory_units.maximum / 62,000,000) × 100

Limit: 62,000,000 memory units per block

Usage: Indicates block-level resource consumption in the worst case. Useful for understanding blockchain capacity impact.

Example:

"block_memory_budget_pct": 1.61

For 1,000,000 memory units:

  • 1,000,000 / 62,000,000 × 100 = 1.61%

block_cpu_budget_pct

Type: Double (percentage)

Aggregation: Uses maximum CPU units

Description: Percentage of the block CPU budget used in the worst-case execution.

Formula:

block_cpu_budget_pct = (cpu_units.maximum / 40,000,000,000) × 100

Limit: 40,000,000,000 CPU steps per block

Usage: Indicates block-level CPU consumption in the worst case.

Example:

"block_cpu_budget_pct": 1.25

For 500,000,000 CPU steps:

  • 500,000,000 / 40,000,000,000 × 100 = 1.25%

Capacity Calculations

These metrics indicate how many script executions can fit within transaction and block limits, based on worst-case resource usage.

scripts_per_tx

Type: Integer

Aggregation: Uses maximum memory and CPU units

Description: Maximum number of times this script can be executed in a single transaction, based on worst-case resource usage.

Formula:

scripts_per_tx = min(
  floor(14,000,000 / memory_units.maximum),
  floor(10,000,000,000 / cpu_units.maximum)
)

Usage: Indicates transaction-level scalability in the worst case. The limiting resource (memory or CPU) determines the result.

Example:

"scripts_per_tx": 14

For 1,000,000 memory units and 500,000,000 CPU steps:

  • Memory limit: 14,000,000 / 1,000,000 = 14
  • CPU limit: 10,000,000,000 / 500,000,000 = 20
  • Result: min(14, 20) = 14 (memory-limited)

scripts_per_block

Type: Integer

Aggregation: Uses maximum memory and CPU units

Description: Maximum number of times this script can be executed in a single block, based on worst-case resource usage.

Formula:

scripts_per_block = min(
  floor(62,000,000 / memory_units.maximum),
  floor(40,000,000,000 / cpu_units.maximum)
)

Usage: Indicates block-level scalability and blockchain throughput impact in the worst case.

Example:

"scripts_per_block": 62

For 1,000,000 memory units and 500,000,000 CPU steps:

  • Memory limit: 62,000,000 / 1,000,000 = 62
  • CPU limit: 40,000,000,000 / 500,000,000 = 80
  • Result: min(62, 80) = 62 (memory-limited)

Aggregation Strategies

For benchmarks with multiple test cases, raw measurements use various aggregation strategies:

Strategy Description
maximum Worst-case value across all test cases
sum Total value across all test cases
minimum Best-case value across all test cases
median Median value across all test cases
sum_positive Sum of values from successful evaluations only
sum_negative Sum of values from failed evaluations only

Derived metrics use a hybrid aggregation strategy:

  • Budget/Capacity metrics (budget percentages, scripts per tx/block) use maximum for worst-case semantics
  • Fee metrics (execution fee, total fee) use sum to represent total test suite cost
  • See Aggregation Strategy for Derived Metrics for detailed rationale

Protocol Parameters Reference

All derived metrics use Conway mainnet protocol parameters (as of 2025):

Transaction Limits

  • Max memory per transaction: 14,000,000 units
  • Max CPU per transaction: 10,000,000,000 steps

Block Limits

  • Max memory per block: 62,000,000 units
  • Max CPU per block: 40,000,000,000 steps

Execution Unit Pricing

  • Memory price: 0.0577 lovelace/unit (577/10000)
  • CPU price: 0.0000721 lovelace/step (721/10000000)

Reference Script Pricing

  • Base price (Tier 1): 15 lovelace/byte
  • Tier size: 25,600 bytes (25 KiB)
  • Tier multiplier: 1.2 (6/5)
  • Max reference script size: 204,800 bytes (200 KiB)

Other Parameters

  • Base transaction fee (fixed): 155,381 lovelace
  • Transaction fee per byte: 44 lovelace/byte

Note: Base transaction fees are NOT included in the reported total_fee_lovelace as we're comparing script performance, not full transaction costs.


Implementation Reference

Source code: lib/Cape/Protocol/Parameters.hs

Schema definition: submissions/TEMPLATE/metrics.schema.json

Test suite: test/Cape/Protocol/ParametersSpec.hs


Interpretation Guidelines

Cost Comparison

When comparing implementations:

  1. Execution cost (execution_fee_lovelace) shows runtime efficiency
  2. Reference script cost (reference_script_fee_lovelace) shows deployment cost
  3. Total cost (total_fee_lovelace) shows combined impact

Trade-offs exist: a larger script might have lower execution cost, or vice versa.

Capacity Analysis

Budget percentages and capacity metrics help identify production constraints:

  • < 10% budget: Very efficient, many instances per tx/block
  • 10-50% budget: Reasonable efficiency, moderate capacity
  • > 50% budget: High resource usage, limited capacity (highlighted in reports)
  • > 100% budget: Exceeds transaction/block limits, not deployable

Resource Limiting

The limiting resource (memory vs CPU) is shown by which budget percentage is higher:

  • Memory-limited: tx_memory_budget_pct > tx_cpu_budget_pct
  • CPU-limited: tx_cpu_budget_pct > tx_memory_budget_pct

This indicates which optimization (reducing memory vs reducing CPU) would have greater impact.


Example: Complete Metrics

{
  "scenario": "fibonacci",
  "version": "1.0.0",
  "measurements": {
    "cpu_units": {
      "maximum": 500000000,
      "sum": 1500000000,
      "minimum": 450000000,
      "median": 500000000,
      "sum_positive": 1500000000,
      "sum_negative": 0
    },
    "memory_units": {
      "maximum": 1000000,
      "sum": 3000000,
      "minimum": 900000,
      "median": 1000000,
      "sum_positive": 3000000,
      "sum_negative": 0
    },
    "script_size_bytes": 10000,
    "term_size": 1234,
    "execution_fee_lovelace": 281250,
    "reference_script_fee_lovelace": 150000,
    "total_fee_lovelace": 431250,
    "tx_memory_budget_pct": 21.43,
    "tx_cpu_budget_pct": 15.0,
    "block_memory_budget_pct": 4.84,
    "block_cpu_budget_pct": 3.75,
    "scripts_per_tx": 4,
    "scripts_per_block": 13
  },
  "evaluations": [
    {
      "name": "test_case_1",
      "description": "Fibonacci of 10",
      "cpu_units": 500000000,
      "memory_units": 1000000,
      "execution_result": "success"
    }
  ],
  "execution_environment": {
    "evaluator": "PlutusTx.Eval-1.52.0.0"
  },
  "timestamp": "2025-10-23T10:00:00Z"
}

Interpretation:

  • This script uses 21.43% of transaction memory budget (memory-limited)
  • Can fit 4 executions per transaction, 13 per block
  • Total cost per execution: ~0.43 ADA (assuming sum aggregation)
  • Memory is the limiting resource (21.43% vs 15% CPU)