- SubmitResults transaction batching: Wrap the per-status loop in one transaction; cuts per-RPC `Begin`/`Commit` overhead from N to 1
- MQTT response router deadlock: Fix a hang in `mqserver` that wedged all subsequent `CheckNTP`/`Metrics` requests when a receiver gave up on a late response
- Higher RPC timeouts: 30s response-header / 60s overall (was 10s/30s) so `SubmitResults` rides out database contention
- trace_id on batch errors: Batch error logs now carry the matching `trace_id`, and failed batches surface as errored spans
- Journald improvements: Pick up upstream `go.ntppool.org/common` v0.10.2 with better systemd journal support
- Stop client/config tests from depending on the live devel API; route HTTP through an in-process fake and add a tripwire test that fails if any test reaches an external network
- Systemd journal integration: When stderr is connected to journald, log records are delivered via the native journal protocol with per-record `PRIORITY=`, so `journalctl -p` and `LogLevelMax=` in the unit file filter by severity. `DEBUG_INVOCATION` (from `RestartMode=debug`) automatically raises stderr verbosity to debug on a failed-restart attempt.
- MQTT session takeover backoff: When the broker reports a session takeover (reasonCode `0x8E`), the agent escalates the reconnect backoff (2m → 5m → 10m → 15m) instead of reconnecting on the 10s default, and pauses NTP checks and result submission while another client holds the session. The backoff resets after 30 minutes of stable connection. The disconnect reasonCode is now always logged.
- Configurable log levels: New `--log-level` flag and `MONITOR_LOG_LEVEL` env var for stderr; OTLP log level is server-controlled via gRPC config and cached to `state.json`; `--debug` overrides both to DEBUG
- OTEL service name: Set explicit `OTEL_SERVICE_NAME=ntppool-agent` so logs appear in Loki with the correct service name
- Systemd detection: Postinstall script exits gracefully on non-systemd systems and in containers where systemctl exists but systemd isn't PID 1
- New internal CA
- Build with Go 1.26 and refresh dependencies
- Migrate CI from Drone to Woodpecker
- Replace legacy units on upgrade: Postinstall disables all `ntppool-monitor@*` systemd units and clears failed unit records; goreleaser `conflicts` directive ensures clean package replacement
- Upgrade goreleaser
- Access logs: Include real client IP (via Fastly + RFC1918 XFF) and monitor name from the auth context
- Middleware ordering: Run authentication before logging so the certificate name is available when logs are generated (fixes `monitor=unknown` and `certificateKey didn't return a string` errors)
- Score deduplication: Reject out-of-order timestamps, fix zero-score edge cases in percentage comparison, and extract magic numbers to named constants
- Skip unresolvable old log_scores: Prevents the scorer from falling more than 3 hours behind when older entries can't be scored
- SQL update metrics: New `scorer_sql_updates_total` counter broken down by operation type to identify frequent update patterns
- Fewer stratum updates: Reduce stratum update queries that were putting unnecessary load on the database
- Replace legacy package: Add `replaces` directive so deb/rpm/apk packages automatically uninstall the old `ntppool-monitor` package on upgrade
- Build with Go 1.25
- Allow manual build triggers; disable MySQL/MariaDB cert verification in test config
- Handler registration fix: Restored MQTT ad hoc NTP query functionality broken in router migration
- MQTT connection fix: Fix immediate disconnections in the client
- Version filtering: Ad hoc requests now only sent to clients on version 4.0.4+ or exactly 3.8.6
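The version filter could look roughly like the sketch below. `acceptsAdHoc` and `parseVersion` are hypothetical names; the sketch assumes plain `major.minor.patch` strings with an optional `v` prefix and rejects anything else, including pre-release suffixes.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits "major.minor.patch" (optionally prefixed with "v")
// into three integers. It reports false for any other shape.
func parseVersion(s string) ([3]int, bool) {
	var v [3]int
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return v, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, false
		}
		v[i] = n
	}
	return v, true
}

// acceptsAdHoc reports whether a client version is 4.0.4 or newer, or exactly 3.8.6.
func acceptsAdHoc(version string) bool {
	v, ok := parseVersion(version)
	if !ok {
		return false
	}
	if v == [3]int{3, 8, 6} {
		return true
	}
	minimum := [3]int{4, 0, 4}
	for i := range v {
		if v[i] != minimum[i] {
			// The first differing component decides the comparison.
			return v[i] > minimum[i]
		}
	}
	return true // exactly 4.0.4
}

func main() {
	for _, ver := range []string{"3.8.6", "3.8.7", "4.0.3", "4.0.4", "4.1.0"} {
		fmt.Println(ver, acceptsAdHoc(ver))
	}
}
```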
- Performance-based replacement: Testing monitors can replace worse-performing active monitors
- Safety thresholds: Prevent removing all monitors when performance is universally poor
- Auto-pause for constraint violations: Monitors with unchangeable constraints (subnet/account) are paused instead of repeatedly failing
- Data point requirements: Minimum 9 measurements for testing promotion, 60 for active promotion
- Account limit fixes: Allow monitor swaps at account boundaries
- Priority 0 fix: Optimal performance monitors can now be promoted correctly
- Deadlock retry: Exponential backoff for database deadlocks with telemetry
- Paused server scores: Added support for paused status in scoring
- Simulation mode: `selector simulate` command for safe algorithm testing
- Server targeting: `--server-id` parameter for processing specific servers
- 32-bit support: Added 386 architecture builds with softfloat
- RPM dependency alternatives: Enabled dependency alternatives for RPM packages to improve installation compatibility
- APK dependency cleanup: Removed unnecessary APK package dependencies for Alpine Linux builds
- NTP query debugging: Added `MONITOR_DEBUG_NTP_QUERIES` environment variable for detailed NTP query logging
- Mutex crash fix: Updated common package dependency to resolve critical mutex-related crashes
- Authentication fixes: Improved dual mTLS/JWT authentication support with proper `RequestClientCert` handling
- Legacy API blocking: Twirp API now blocked for monitors that have upgraded to v4.0.0 or newer
- v3.x monitors must re-register using the new `ntppool-agent setup` process
- Old registration methods no longer supported
- New API key system replaces Vault-based authentication
- New `setup` command for monitor provisioning
- Simplified dual-stack operation: single API key manages both IPv4 and IPv6 monitoring
- Improved configuration management with persistent state directory
- Enhanced error handling and retry logic for API operations
- Much improved setup flow for new monitors on the website
- Enhanced authentication and API communication
- Reduced registration delays between IPv4/IPv6 protocols
- Better handling of network connectivity issues
- Improved logging and diagnostic capabilities
- Added RISC-V 64-bit architecture support
- Updated to Go 1.24 with latest dependencies
- Dynamic testing pool sizing: Testing pool now adjusts based on active monitor availability
- Performance-based replacement: Monitors are selected based on performance metrics
- Network diversity constraints: Improved geographic and network distribution
- Per-status change limits: Prevents mass monitor changes that could affect server scores
- Multi-segment offset scoring: Updated scoring algorithm with stricter optimal performance thresholds (25ms vs previous 75ms) and two-segment linear degradation ranges
- Stratum validation: Raised threshold to 10
- Response selection: Prefers valid NTP responses over timeout errors
- Bootstrap logic: Emergency override system helps new servers get initial monitoring coverage
- Scheduler optimization: Uses queue timestamps for more efficient server scheduling
- Improved constraint checking for monitor promotions
- Better handling of monitors with no historical scores
- New Connect RPC API replacing legacy Twirp
- Enhanced authentication with JWT tokens and bearer authorization
- OpenTelemetry integration for improved monitoring and debugging
- Updated command-line interface using Kong for better flag parsing
- Uses systemd StateDirectory for persistent configuration storage
- Hot-reloading configuration changes without service restart
- Improved environment detection and API endpoint resolution
Migration Note: Operators running v3.x monitors should plan for re-registration as the old authentication system is no longer supported in v4.0.0.