Comprehensive documentation for all metrics collected and calculated by the UPLC-CAPE benchmarking framework.
The `metrics.json` file (generated by `cape submission measure`) contains both raw measurements from UPLC evaluation and derived metrics calculated from Conway mainnet protocol parameters. These metrics help assess both script efficiency and real-world production impact.
- Raw Measurements - Direct outputs from UPLC evaluation
- Derived Metrics - Calculated cost and capacity metrics
- Aggregation Strategies - How metrics are combined across multiple test cases
## Raw Measurements

These metrics are directly measured from UPLC script evaluation:
### cpu_units

Type: Integer aggregations (`maximum`, `sum`, `minimum`, `median`, `sum_positive`, `sum_negative`)
Description: CPU execution units consumed during script evaluation. Represents computational cost.
Aggregations:
- `maximum`: Worst-case CPU cost across all test cases
- `sum`: Total CPU cost across all test cases
- `minimum`: Best-case CPU cost across all test cases
- `median`: Median CPU cost across all test cases
- `sum_positive`: Total CPU cost from successful evaluations only
- `sum_negative`: Total CPU cost from failed evaluations only
Usage: Primary metric for comparing computational efficiency between implementations.
Example:

```json
"cpu_units": {
  "maximum": 500000000,
  "sum": 1500000000,
  "minimum": 450000000,
  "median": 500000000,
  "sum_positive": 1500000000,
  "sum_negative": 0
}
```

### memory_units

Type: Integer aggregations (`maximum`, `sum`, `minimum`, `median`, `sum_positive`, `sum_negative`)
Description: Memory units consumed during script evaluation. Represents memory cost.
Aggregations: Same as `cpu_units`
Usage: Secondary metric for comparing memory efficiency between implementations.
Example:

```json
"memory_units": {
  "maximum": 1000000,
  "sum": 3000000,
  "minimum": 900000,
  "median": 1000000,
  "sum_positive": 3000000,
  "sum_negative": 0
}
```

### script_size_bytes

Type: Integer (constant)
Description: Size of the serialized UPLC script in bytes. This is the on-chain storage size when deployed as a reference script.
Usage: Indicates deployment cost and reference script fees. Smaller is better for storage costs.
Example:

```json
"script_size_bytes": 10000
```

### term_size

Type: Integer (constant)
Description: Number of AST (Abstract Syntax Tree) nodes in the UPLC term. Represents script complexity.
Usage: Internal metric for script complexity analysis. Not directly related to execution cost.
Example:

```json
"term_size": 1234
```

## Aggregation Strategy for Derived Metrics

Derived metrics use a hybrid aggregation strategy based on the semantic meaning of each metric:
### Budget/Capacity Metrics: `maximum`

Rationale: These metrics answer "Can my validator handle the worst-case input?"
- Validators process adversarial, user-controlled inputs in production
- Worst-case performance matters most (DoS attacks, griefing scenarios)
- A validator that is cheap on average but expensive on edge cases is a security risk
- `maximum` captures real-world deployment constraints
Metrics using maximum:
- `tx_memory_budget_pct`, `tx_cpu_budget_pct`
- `block_memory_budget_pct`, `block_cpu_budget_pct`
- `scripts_per_tx`, `scripts_per_block`
### Fee Metrics: `sum`

Rationale: These metrics answer "What is the total cost of running the complete test suite?"
- Represents actual cost incurred during benchmarking verification
- Provides aggregate cost comparison across implementations
- Useful for estimating verification overhead
Metrics using sum:
- `execution_fee_lovelace`
- `reference_script_fee_lovelace`
- `total_fee_lovelace`
This hybrid approach makes optimization gaming significantly harder:
- Cannot optimize fees without improving individual executions
- Raw metrics (all aggregations) remain transparent for scrutiny
- Community review via source code submission
- Distribution anomalies (e.g., high variance) are detectable
Example: A submission with normal sum but suspicious max (or vice versa) would be flagged during review.
## Derived Metrics

These metrics are calculated from raw measurements using Conway mainnet protocol parameters.
### execution_fee_lovelace

Type: Integer (lovelace)
Aggregation: Uses sum of CPU and memory units
Description: Total cost of executing all test cases in the benchmark suite.
Formula:

```
execution_fee = ceiling((memory_units.sum × 0.0577) + (cpu_units.sum × 0.0000721))
```
Protocol Parameters:
- Memory price: 0.0577 lovelace/unit (577/10000)
- CPU price: 0.0000721 lovelace/step (721/10000000)
Usage: Represents the total execution cost for running the complete test suite.
Example:

```json
"execution_fee_lovelace": 93750
```

For 1,000,000 memory units and 500,000,000 CPU steps:
- Memory cost: 1,000,000 × 0.0577 = 57,700 lovelace
- CPU cost: 500,000,000 × 0.0000721 = 36,050 lovelace
- Total: 93,750 lovelace
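The worked example above can be reproduced with a short Python sketch of the formula. This is illustrative only (the framework's actual implementation is the Haskell code in `lib/Cape/Protocol/Parameters.hs`); it keeps the prices as exact rationals to avoid floating-point drift before the final ceiling:

```python
from fractions import Fraction
from math import ceil

# Conway mainnet execution prices (from this document):
# 577/10000 lovelace per memory unit, 721/10000000 lovelace per CPU step.
MEM_PRICE = Fraction(577, 10_000)
CPU_PRICE = Fraction(721, 10_000_000)

def execution_fee_lovelace(memory_units_sum: int, cpu_units_sum: int) -> int:
    """Ceiling of the combined memory and CPU cost, in lovelace."""
    return ceil(memory_units_sum * MEM_PRICE + cpu_units_sum * CPU_PRICE)

# 57,700 + 36,050 = 93,750 lovelace, matching the worked example.
print(execution_fee_lovelace(1_000_000, 500_000_000))  # 93750
```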
### reference_script_fee_lovelace

Type: Integer (lovelace)
Description: Cost of storing the script as a reference script using Conway era tiered pricing.
Formula: Tiered pricing where each 25 KiB tier costs 1.2× the previous tier.
Tier Structure:
| Tier | Size Range (bytes) | Price/Byte | Cumulative Cost |
|---|---|---|---|
| 1 | 0 - 25,600 | 15 | 384,000 |
| 2 | 25,600 - 51,200 | 18 | 844,800 |
| 3 | 51,200 - 76,800 | 21.6 | 1,397,760 |
| 4 | 76,800 - 102,400 | 25.92 | 2,061,312 |
| 5 | 102,400 - 128,000 | 31.104 | 2,857,574 |
| 6 | 128,000 - 153,600 | 37.3248 | 3,813,089 |
| 7 | 153,600 - 179,200 | 44.78976 | 4,959,707 |
| 8 | 179,200 - 204,800 | 53.747712 | 6,335,648 |
Maximum: 200 KiB per transaction (hard limit)
Usage: Represents the fixed cost of deploying the script as a reference script.
Example:

```json
"reference_script_fee_lovelace": 150000
```

For a 10 KB (10,000 bytes) script:
- Falls entirely in Tier 1
- 10,000 × 15 = 150,000 lovelace
For a 75 KB (75,000 bytes) script:
- Tier 1: 25,600 × 15 = 384,000
- Tier 2: 25,600 × 18 = 460,800
- Tier 3: 23,800 × 21.6 = 514,080
- Total: 1,358,880 lovelace
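The tier walk above can be sketched in Python. This is an illustrative reading of the tier structure, not the framework's Haskell implementation; the final ceiling is an assumption for scripts whose size lands mid-tier at a fractional price:

```python
from fractions import Fraction
from math import ceil

TIER_SIZE = 25_600                 # 25 KiB per pricing tier
BASE_PRICE = Fraction(15)          # lovelace per byte in tier 1
TIER_MULTIPLIER = Fraction(6, 5)   # each tier costs 1.2x the previous
MAX_REF_SCRIPT_SIZE = 204_800      # 200 KiB hard limit per transaction

def reference_script_fee_lovelace(size_bytes: int) -> int:
    """Walk the 25 KiB tiers, billing each chunk at its tier's price."""
    if not 0 <= size_bytes <= MAX_REF_SCRIPT_SIZE:
        raise ValueError("size outside the 0..200 KiB reference script range")
    fee = Fraction(0)
    remaining = size_bytes
    price = BASE_PRICE
    while remaining > 0:
        chunk = min(remaining, TIER_SIZE)  # bytes billed at this tier
        fee += chunk * price
        remaining -= chunk
        price *= TIER_MULTIPLIER
    return ceil(fee)

print(reference_script_fee_lovelace(10_000))  # 150000
print(reference_script_fee_lovelace(75_000))  # 1358880
```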
### total_fee_lovelace

Type: Integer (lovelace)
Aggregation: Uses sum (via execution_fee) + reference_script_fee
Description: Combined execution and reference script fee for the complete test suite.
Formula:

```
total_fee = execution_fee + reference_script_fee
```
Important: This does NOT include the base transaction fee (txFeeFixed + txFeePerByte × tx_size), as we're comparing script performance, not full transaction costs.
Usage: Total script-related cost for deploying and running all test cases.
Example:

```json
"total_fee_lovelace": 243750
```

For the 10 KB script example:
- Execution: 93,750 lovelace
- Reference script: 150,000 lovelace
- Total: 243,750 lovelace (~0.244 ADA)
These metrics show what percentage of transaction and block execution budgets are consumed by the worst-case execution.
### tx_memory_budget_pct

Type: Double (percentage)
Aggregation: Uses maximum memory units
Description: Percentage of the transaction memory budget used in the worst-case execution.
Formula:

```
tx_memory_budget_pct = (memory_units.maximum / 14,000,000) × 100
```
Limit: 14,000,000 memory units per transaction
Usage: Indicates how much of the transaction's memory budget this script consumes in the worst case. Values >50% may limit how many scripts can fit in a transaction.
Example:

```json
"tx_memory_budget_pct": 7.14
```

For 1,000,000 memory units:
- 1,000,000 / 14,000,000 × 100 = 7.14%
### tx_cpu_budget_pct

Type: Double (percentage)
Aggregation: Uses maximum CPU units
Description: Percentage of the transaction CPU budget used in the worst-case execution.
Formula:

```
tx_cpu_budget_pct = (cpu_units.maximum / 10,000,000,000) × 100
```
Limit: 10,000,000,000 CPU steps per transaction
Usage: Indicates how much of the transaction's CPU budget this script consumes in the worst case.
Example:

```json
"tx_cpu_budget_pct": 5.0
```

For 500,000,000 CPU steps:
- 500,000,000 / 10,000,000,000 × 100 = 5.0%
### block_memory_budget_pct

Type: Double (percentage)
Aggregation: Uses maximum memory units
Description: Percentage of the block memory budget used in the worst-case execution.
Formula:

```
block_memory_budget_pct = (memory_units.maximum / 62,000,000) × 100
```
Limit: 62,000,000 memory units per block
Usage: Indicates block-level resource consumption in the worst case. Useful for understanding blockchain capacity impact.
Example:

```json
"block_memory_budget_pct": 1.61
```

For 1,000,000 memory units:
- 1,000,000 / 62,000,000 × 100 = 1.61%
### block_cpu_budget_pct

Type: Double (percentage)
Aggregation: Uses maximum CPU units
Description: Percentage of the block CPU budget used in the worst-case execution.
Formula:

```
block_cpu_budget_pct = (cpu_units.maximum / 40,000,000,000) × 100
```
Limit: 40,000,000,000 CPU steps per block
Usage: Indicates block-level CPU consumption in the worst case.
Example:

```json
"block_cpu_budget_pct": 1.25
```

For 500,000,000 CPU steps:
- 500,000,000 / 40,000,000,000 × 100 = 1.25%
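All four budget percentages are the same calculation against different limits. A minimal Python sketch, using the limits stated in this document (illustrative, not the framework's implementation):

```python
# Conway mainnet execution limits (from this document).
TX_MEM_LIMIT = 14_000_000
TX_CPU_LIMIT = 10_000_000_000
BLOCK_MEM_LIMIT = 62_000_000
BLOCK_CPU_LIMIT = 40_000_000_000

def budget_pct(units_maximum: int, limit: int) -> float:
    """Worst-case consumption as a percentage of a budget."""
    return units_maximum / limit * 100

# The four worked examples: 1,000,000 memory units, 500,000,000 CPU steps.
print(round(budget_pct(1_000_000, TX_MEM_LIMIT), 2))      # 7.14
print(round(budget_pct(500_000_000, TX_CPU_LIMIT), 2))    # 5.0
print(round(budget_pct(1_000_000, BLOCK_MEM_LIMIT), 2))   # 1.61
print(round(budget_pct(500_000_000, BLOCK_CPU_LIMIT), 2)) # 1.25
```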
These metrics indicate how many script executions can fit within transaction and block limits, based on worst-case resource usage.
### scripts_per_tx

Type: Integer
Aggregation: Uses maximum memory and CPU units
Description: Maximum number of times this script can be executed in a single transaction, based on worst-case resource usage.
Formula:

```
scripts_per_tx = min(
  floor(14,000,000 / memory_units.maximum),
  floor(10,000,000,000 / cpu_units.maximum)
)
```
Usage: Indicates transaction-level scalability in the worst case. The limiting resource (memory or CPU) determines the result.
Example:

```json
"scripts_per_tx": 14
```

For 1,000,000 memory units and 500,000,000 CPU steps:
- Memory limit: 14,000,000 / 1,000,000 = 14
- CPU limit: 10,000,000,000 / 500,000,000 = 20
- Result: min(14, 20) = 14 (memory-limited)
### scripts_per_block

Type: Integer
Aggregation: Uses maximum memory and CPU units
Description: Maximum number of times this script can be executed in a single block, based on worst-case resource usage.
Formula:

```
scripts_per_block = min(
  floor(62,000,000 / memory_units.maximum),
  floor(40,000,000,000 / cpu_units.maximum)
)
```
Usage: Indicates block-level scalability and blockchain throughput impact in the worst case.
Example:

```json
"scripts_per_block": 62
```

For 1,000,000 memory units and 500,000,000 CPU steps:
- Memory limit: 62,000,000 / 1,000,000 = 62
- CPU limit: 40,000,000,000 / 500,000,000 = 80
- Result: min(62, 80) = 62 (memory-limited)
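Both capacity metrics share one shape: floor-divide each limit by the worst-case usage and take the smaller result. A Python sketch of that calculation (illustrative only):

```python
TX_MEM_LIMIT = 14_000_000
TX_CPU_LIMIT = 10_000_000_000
BLOCK_MEM_LIMIT = 62_000_000
BLOCK_CPU_LIMIT = 40_000_000_000

def max_scripts(mem_limit: int, cpu_limit: int,
                memory_units_maximum: int, cpu_units_maximum: int) -> int:
    """How many worst-case executions fit before either budget is exhausted."""
    return min(mem_limit // memory_units_maximum,
               cpu_limit // cpu_units_maximum)

print(max_scripts(TX_MEM_LIMIT, TX_CPU_LIMIT, 1_000_000, 500_000_000))        # 14
print(max_scripts(BLOCK_MEM_LIMIT, BLOCK_CPU_LIMIT, 1_000_000, 500_000_000))  # 62
```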
## Aggregation Strategies

For benchmarks with multiple test cases, raw measurements use various aggregation strategies:

| Strategy | Description |
|---|---|
| `maximum` | Worst-case value across all test cases |
| `sum` | Total value across all test cases |
| `minimum` | Best-case value across all test cases |
| `median` | Median value across all test cases |
| `sum_positive` | Sum of values from successful evaluations only |
| `sum_negative` | Sum of values from failed evaluations only |
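The six strategies can be sketched over per-test-case measurements. This is an illustrative Python sketch; the input values and the "success"/"failure" labels are hypothetical (the schema's `execution_result` field shows "success" in the example below):

```python
from statistics import median

def aggregate(per_case_units: list[int], results: list[str]) -> dict:
    """Apply the six aggregation strategies to per-test-case measurements."""
    return {
        "maximum": max(per_case_units),
        "sum": sum(per_case_units),
        "minimum": min(per_case_units),
        "median": median(per_case_units),
        "sum_positive": sum(u for u, r in zip(per_case_units, results)
                            if r == "success"),
        "sum_negative": sum(u for u, r in zip(per_case_units, results)
                            if r != "success"),
    }

# Hypothetical three-case suite: two successes and one expected failure.
print(aggregate([450_000_000, 500_000_000, 550_000_000],
                ["success", "success", "failure"]))
```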
Derived metrics use a hybrid aggregation strategy:

- Budget/Capacity metrics (budget percentages, scripts per tx/block) use `maximum` for worst-case semantics
- Fee metrics (execution fee, total fee) use `sum` to represent total test suite cost
- See Aggregation Strategy for Derived Metrics for detailed rationale
## Protocol Parameters

All derived metrics use Conway mainnet protocol parameters (as of 2025):
- Max memory per transaction: 14,000,000 units
- Max CPU per transaction: 10,000,000,000 steps
- Max memory per block: 62,000,000 units
- Max CPU per block: 40,000,000,000 steps
- Memory price: 0.0577 lovelace/unit (577/10000)
- CPU price: 0.0000721 lovelace/step (721/10000000)
- Base price (Tier 1): 15 lovelace/byte
- Tier size: 25,600 bytes (25 KiB)
- Tier multiplier: 1.2 (6/5)
- Max reference script size: 204,800 bytes (200 KiB)
- Base transaction fee (fixed): 155,381 lovelace
- Transaction fee per byte: 44 lovelace/byte
Note: Base transaction fees are NOT included in the reported total_fee_lovelace as we're comparing script performance, not full transaction costs.
- Source code: `lib/Cape/Protocol/Parameters.hs`
- Schema definition: `submissions/TEMPLATE/metrics.schema.json`
- Test suite: `test/Cape/Protocol/ParametersSpec.hs`
When comparing implementations:
- Execution cost (`execution_fee_lovelace`) shows runtime efficiency
- Reference script cost (`reference_script_fee_lovelace`) shows deployment cost
- Total cost (`total_fee_lovelace`) shows combined impact
Trade-offs exist: a larger script might have lower execution cost, or vice versa.
Budget percentages and capacity metrics help identify production constraints:
- < 10% budget: Very efficient, many instances per tx/block
- 10-50% budget: Reasonable efficiency, moderate capacity
- > 50% budget: High resource usage, limited capacity (highlighted in reports)
- > 100% budget: Exceeds transaction/block limits, not deployable
The limiting resource (memory vs CPU) is shown by which budget percentage is higher:
- Memory-limited: `tx_memory_budget_pct` > `tx_cpu_budget_pct`
- CPU-limited: `tx_cpu_budget_pct` > `tx_memory_budget_pct`
This indicates which optimization (reducing memory vs reducing CPU) would have greater impact.
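As a trivial sketch of that comparison (illustrative; the tie-breaking choice is an assumption, not framework behavior):

```python
def limiting_resource(tx_memory_budget_pct: float, tx_cpu_budget_pct: float) -> str:
    """Name the budget that binds first; ties are reported as memory here."""
    return "memory" if tx_memory_budget_pct >= tx_cpu_budget_pct else "cpu"

print(limiting_resource(7.14, 5.0))  # memory
```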
## Complete Example

```json
{
  "scenario": "fibonacci",
  "version": "1.0.0",
  "measurements": {
    "cpu_units": {
      "maximum": 500000000,
      "sum": 1500000000,
      "minimum": 450000000,
      "median": 500000000,
      "sum_positive": 1500000000,
      "sum_negative": 0
    },
    "memory_units": {
      "maximum": 1000000,
      "sum": 3000000,
      "minimum": 900000,
      "median": 1000000,
      "sum_positive": 3000000,
      "sum_negative": 0
    },
    "script_size_bytes": 10000,
    "term_size": 1234,
    "execution_fee_lovelace": 281250,
    "reference_script_fee_lovelace": 150000,
    "total_fee_lovelace": 431250,
    "tx_memory_budget_pct": 7.14,
    "tx_cpu_budget_pct": 5.0,
    "block_memory_budget_pct": 1.61,
    "block_cpu_budget_pct": 1.25,
    "scripts_per_tx": 14,
    "scripts_per_block": 62
  },
  "evaluations": [
    {
      "name": "test_case_1",
      "description": "Fibonacci of 10",
      "cpu_units": 500000000,
      "memory_units": 1000000,
      "execution_result": "success"
    }
  ],
  "execution_environment": {
    "evaluator": "PlutusTx.Eval-1.52.0.0"
  },
  "timestamp": "2025-10-23T10:00:00Z"
}
```

Interpretation:

- This script uses 7.14% of the transaction memory budget in the worst case
- Can fit 14 executions per transaction, 62 per block
- Total script-related cost for the full test suite: ~0.43 ADA (`sum` aggregation)
- Memory is the limiting resource (7.14% vs 5.0% CPU)