fix/weekly cache metrics endpoint #1942

Merged
alanpeixinho merged 1 commit into kernelci:main from profusion:fix/weekly-cache-metrics-endpoint
Jun 24, 2026

Conversation

@alanpeixinho

Contributor

What it is

Makes the metrics page cache its response for a week, so we can:

  1. Keep parity with email
  2. Avoid slow endpoint calls

How to test:

Test warming cache crontab

  1. First, install the crontabs via `poetry run python3 manage.py crontab add`.
  2. Make sure they are installed: `poetry run python3 manage.py crontab show`.
  3. Execute the cron job via `poetry run python3 manage.py crontab run <hash of cronjob>`.
    * Note that the cache warm-up is meant to run at the same time as the email metrics (Saturday), so a manual run might not align to the same days as frontend requests.
    * Alternatively, we could change the system clock.
  4. Verify that the function that queries and stores the results in Redis runs properly.
  5. Verify that Redis now has an updated cache under the keys metricsTotalObjects, metricsBuildIncidents, metricsNewBuildIncidents, and metricsLabSummary (via redis-cli or a Python script).
  6. Remember to remove the crontab from your system: `poetry run python3 manage.py crontab remove`.
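
For the Redis check in step 5, a script along these lines could be used. This is a minimal sketch, not code from this PR: the helper name `report_metrics_cache` is hypothetical, it assumes a redis-py-style client exposing `exists` and `ttl`, and it assumes the entries are stored under exactly the key names listed above (a cache backend may add its own prefix).

```python
from typing import Iterable, Optional

# Key names as listed in the testing steps; the real stored keys may carry
# a prefix added by the cache backend (assumption, not verified here).
METRICS_KEYS = (
    "metricsTotalObjects",
    "metricsBuildIncidents",
    "metricsNewBuildIncidents",
    "metricsLabSummary",
)


def report_metrics_cache(client, keys: Iterable[str] = METRICS_KEYS) -> dict:
    """Return, per key, whether it exists and its remaining TTL in seconds.

    `client` is any object with redis-py style `exists(*names)` and
    `ttl(name)` methods.
    """
    report = {}
    for key in keys:
        present = client.exists(key) == 1
        ttl: Optional[int] = client.ttl(key) if present else None
        report[key] = {"present": present, "ttl_seconds": ttl}
    return report
```

With redis-py, the client would typically be created with something like `redis.Redis(host="localhost", port=6379)`; the host and port here are placeholders. A weekly cache should report a TTL of at most 604800 seconds.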

Testing endpoint

  1. Go to the metrics page
  2. Make sure it loads data from the range [lastSaturday - 7 days, lastSaturday] for the "previous week" and [lastSaturday - 14 days, lastSaturday] for the "2 previous weeks".
  3. Make sure subsequent requests are served from the cache.
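
To know which dates to expect in step 2, the ranges can be computed by hand. The sketch below is illustrative only: `last_saturday` and `expected_range` are hypothetical helpers, and it assumes that when today is itself a Saturday, that day counts as "lastSaturday".

```python
from datetime import date, timedelta

SATURDAY = 5  # date.weekday(): Monday is 0, so Saturday is 5


def last_saturday(today: date) -> date:
    """Most recent Saturday on or before `today` (assumption: today counts)."""
    return today - timedelta(days=(today.weekday() - SATURDAY) % 7)


def expected_range(today: date, days: int) -> tuple[date, date]:
    """Return (lastSaturday - days, lastSaturday) for a 7- or 14-day period."""
    end = last_saturday(today)
    return end - timedelta(days=days), end
```

For example, on the merge date (Wednesday, 2026-06-24), the "previous week" would run from 2026-06-13 to 2026-06-20 under these assumptions.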

Closes #1940

)
end_datetime = datetime.combine(
    today - timedelta(days=end_days_ago),
    time.max,

Member

Trying to work out the SQL below, I think this will be an 8-day window? It puts the SQL upper bound at the end of the last day of the 7-day window (i.e. on the last day it runs up to midnight).

Contributor Author

Yes, you are correct.
The cached query runs at the start of the day (just like the mail notification), so the 8th day would end up empty.
Still, relying on the time of the call is not robust.
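
The off-by-one discussed in this thread can be reproduced in isolation. This is a sketch under assumptions: only the `end_datetime` line with `time.max` appears in the hunk, so the start bound using `time.min` and the 7/0 offsets are guesses, and all function names are hypothetical.

```python
from datetime import date, datetime, time, timedelta


def closed_window(today: date, start_days_ago: int, end_days_ago: int):
    """Bounds as in the hunk: the end is pushed to the last instant of its day."""
    start = datetime.combine(today - timedelta(days=start_days_ago), time.min)
    end = datetime.combine(today - timedelta(days=end_days_ago), time.max)
    return start, end


def half_open_window(today: date, start_days_ago: int, end_days_ago: int):
    """Alternative: both bounds at midnight, to be queried as [start, end)."""
    start = datetime.combine(today - timedelta(days=start_days_ago), time.min)
    end = datetime.combine(today - timedelta(days=end_days_ago), time.min)
    return start, end


def days_touched(start: datetime, end: datetime, inclusive_end: bool) -> int:
    """Number of calendar days that the window overlaps."""
    last_instant = end if inclusive_end else end - timedelta(microseconds=1)
    return (last_instant.date() - start.date()).days + 1
```

With offsets 7 and 0, the closed window overlaps 8 calendar days while the half-open one overlaps exactly 7, regardless of the time of day at which the query runs.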

@alanpeixinho alanpeixinho force-pushed the fix/weekly-cache-metrics-endpoint branch 3 times, most recently from 2ed607d to 98b56ae Compare June 22, 2026 11:32

@mentonin mentonin left a comment

Contributor

Added some comments. I think this works in general, but it should be seen as a palliative solution. A better solution might be a rollup table. I think we could also use a set of Prometheus metrics (we are tracking differences between timestamps of low-cardinality time series).

(SELECT COUNT(*) FROM checkouts WHERE _timestamp BETWEEN
NOW() - INTERVAL %(start_days_ago)s
AND NOW() - INTERVAL %(end_days_ago)s)
(SELECT COUNT(*) FROM checkouts WHERE _timestamp >=

Contributor

I believe you can merge this SELECT with the previous one for (minor) performance gains

Contributor Author

Good point
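
One possible reading of the suggestion is to replace two scalar subqueries over the same table with a single scan using conditional aggregation. The sketch below uses Python's built-in `sqlite3` as a stand-in for the production database; the `checkouts` table and `_timestamp` column come from the hunk, while the bound names, the sample rows, and the assumption that the two SELECTs count current and previous windows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkouts (_timestamp TEXT)")
conn.executemany(
    "INSERT INTO checkouts VALUES (?)",
    [("2026-06-08",), ("2026-06-14",), ("2026-06-15",), ("2026-06-19",)],
)

# ISO-8601 strings compare in chronological order, so text comparison works.
bounds = {"prev_start": "2026-06-06", "start": "2026-06-13", "end": "2026-06-20"}

# Two separate subqueries, one per window.
two_scans = conn.execute(
    """
    SELECT
      (SELECT COUNT(*) FROM checkouts
         WHERE _timestamp >= :start AND _timestamp < :end),
      (SELECT COUNT(*) FROM checkouts
         WHERE _timestamp >= :prev_start AND _timestamp < :start)
    """,
    bounds,
).fetchone()

# One scan over the union of both windows, split with CASE expressions.
one_scan = conn.execute(
    """
    SELECT
      COALESCE(SUM(CASE WHEN _timestamp >= :start THEN 1 ELSE 0 END), 0),
      COALESCE(SUM(CASE WHEN _timestamp <  :start THEN 1 ELSE 0 END), 0)
    FROM checkouts
    WHERE _timestamp >= :prev_start AND _timestamp < :end
    """,
    bounds,
).fetchone()
```

Both queries return the same pair of counts; whether the merged form is actually faster depends on the database and its indexes, which this sketch does not measure.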

from django.utils import timezone as django_timezone


def seeded_timestamp(*, days_ago: int = 1) -> datetime:

Contributor

It would be great if we could include some specific tests related to the timestamp: out-of-order rows, cases before/after query intervals, check open or closed interval boundaries, etc.

Contributor

GH won't let me mark this as resolved, but tests are looking very robust now 👍🏼

const today = new Date();
const weekEndDaysAgo = (today.getUTCDay() + 1) % DAYS_IN_WEEK;
const endDaysAgo = Math.max(weekEndDaysAgo - 1, 0);
const startDaysAgo = endDaysAgo + activeDays + (weekEndDaysAgo > 0 ? 1 : 0);

Contributor

IIUC this gets an interval of activeDays + 1 complete days (e.g. it includes 2 Fridays instead of 1 when activeDays=7)

Contributor

On a broader note, I don't like the interface here. /metrics?i=n returns metrics from n days before the last Saturday, which is not really intuitive. I think you can only set i=7 or i=14 from the UI, but then why do we expose that as an integer for the user to edit?

Contributor Author

Nice catch, I changed the backend behaviour and forgot to update here at some point.

Contributor Author

On the topic of the URL query param, I think I am with you.
I am removing this parameter altogether, unless users actually show a need to share specific ranges via URL. And even if they do, we might limit the options.
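
The interval arithmetic from the snippet at the top of this thread can be tabulated per weekday. This is a direct Python port for illustration: `DAYS_IN_WEEK = 7` is an assumed value for the constant, the function names are hypothetical, and the sketch only compares the two offsets; how many complete days that difference covers depends on how the backend interprets the bounds, which is not modelled here.

```python
DAYS_IN_WEEK = 7  # assumed value of the constant referenced in the snippet


def interval_offsets(utc_day: int, active_days: int) -> tuple[int, int]:
    """Port of the frontend arithmetic.

    `utc_day` follows JavaScript's getUTCDay(): 0 is Sunday, 6 is Saturday.
    Returns (start_days_ago, end_days_ago).
    """
    week_end_days_ago = (utc_day + 1) % DAYS_IN_WEEK
    end_days_ago = max(week_end_days_ago - 1, 0)
    start_days_ago = end_days_ago + active_days + (1 if week_end_days_ago > 0 else 0)
    return start_days_ago, end_days_ago


def span_by_weekday(active_days: int) -> dict[int, int]:
    """Map each weekday to start_days_ago - end_days_ago."""
    spans = {}
    for day in range(DAYS_IN_WEEK):
        start, end = interval_offsets(day, active_days)
        spans[day] = start - end
    return spans
```

For `active_days=7` the difference is 8 on every weekday except Saturday, where it is 7, which matches the extra day pointed out in the review.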

Comment thread dashboard/src/locales/messages/index.ts Outdated
'This is the legacy version of the {page}, please refer to the new, optimized version {newPageLink}. If you find any bugs or divergences, please report to {gitHubLink}.',
'metricsPage.computedAt': 'Computed {computedAt}',
'metricsPage.period.previousTwoWeeks': 'Previous 2 weeks',
'metricsPage.period.previousWeek': 'Previous week',

Contributor

Suggested change
- 'metricsPage.period.previousWeek': 'Previous week',
+ 'metricsPage.period.previousWeek': 'Last week',

@alanpeixinho alanpeixinho force-pushed the fix/weekly-cache-metrics-endpoint branch from 98b56ae to 4b6cdd5 Compare June 23, 2026 19:40

<p>
  {formatMessage(
    { id: 'metricsPage.computedAt' },
    { computedAt: formatDate(data.created_at, false, true) },

Contributor

Should we display "Created at" or the actual interval spanned by the data?

Contributor Author

In this case I really want to show the time the data was computed, mostly because it is a long-lived cache.
We could also show the range of the last week, but I am not sure we gain much from doing so.

Contributor

Since we are slicing by _timestamp, I don't think the computed time is useful: the metrics for a specific time interval (in the past) should be constant.

Contributor Author

Fair. I am removing it, since it was introducing unnecessary complexity.

# UTC midnight, so each lands exactly on an interval_params day boundary. This lets
# adjacent metrics windows return distinct, non-zero counts to exercise the half-open
# [start, end) boundaries on both sides.
SEED_DAY_SPAN = 8

Contributor

IIUC, this creates data at 00:00:00 on each day from the current day back to 7 days ago. The default query fetches events from 7 days ago, 00:00:00, up to yesterday, 23:59:59, so this does not test the start date.

Contributor Author

Good point. I have extended the span further back, not only to cover the prev responses, but also to make the test more robust, so that it fails if we do not properly filter by start date.
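
The point raised in this thread can be shown with a small model of the seed data. This is a sketch, not the project's test code: the helper names are hypothetical, and it assumes one seeded row per day at UTC midnight and a half-open [start, end) query window of 7 days ending at today's midnight.

```python
from datetime import date, datetime, time, timedelta, timezone


def seed_midnights(today: date, span: int) -> list[datetime]:
    """One timestamp per day at UTC midnight, from today back `span - 1` days."""
    midnight = datetime.combine(today, time.min, tzinfo=timezone.utc)
    return [midnight - timedelta(days=d) for d in range(span)]


def count_window(rows, start: datetime, end: datetime) -> int:
    """Correct filter: half-open [start, end)."""
    return sum(start <= ts < end for ts in rows)


def count_ignoring_start(rows, end: datetime) -> int:
    """Buggy filter that forgets the lower bound."""
    return sum(ts < end for ts in rows)
```

With a span of 8, every seeded row before `end` also lies on or after `start`, so the correct and the buggy filter both return 7 and a test cannot tell them apart. With a longer span such as 12, the buggy filter returns 11 while the correct one still returns 7, so the start boundary is actually exercised.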

"n_tests": 26,
"n_issues": 17,
"n_incidents": 7,
"prev_n_trees": 0,

Contributor

Previous values should probably also be seeded and tested for.

Contributor

👍🏼

Comment thread dashboard/src/locales/messages/index.ts Outdated
'messages.olderPageVersion':
'This is the legacy version of the {page}, please refer to the new, optimized version {newPageLink}. If you find any bugs or divergences, please report to {gitHubLink}.',
'metricsPage.computedAt': 'Computed {computedAt}',
'metricsPage.period.previousTwoWeeks': 'Previous 2 weeks',

Contributor

Suggested change
- 'metricsPage.period.previousTwoWeeks': 'Previous 2 weeks',
+ 'metricsPage.period.previousTwoWeeks': 'Last 2 weeks',

For consistency, since you accepted my suggestion.

Contributor

👍🏼

@alanpeixinho alanpeixinho force-pushed the fix/weekly-cache-metrics-endpoint branch 2 times, most recently from 917fb5f to 7bed5fe Compare June 24, 2026 18:13
@alanpeixinho alanpeixinho requested a review from mentonin June 24, 2026 18:24
  * Use fixed UTC date bounds for metrics intervals so cache keys stay stable across the week
  * Expose created_at on the API
  * Align the dashboard with the email window, and warm the cache after the Saturday metrics email
  * Change seeds to include older timestamps

  Closes kernelci#1940

Signed-off-by: Alan Peixinho <alan.peixinho@profusion.mobi>
@alanpeixinho alanpeixinho force-pushed the fix/weekly-cache-metrics-endpoint branch from 7bed5fe to 9ab53c0 Compare June 24, 2026 21:10
@alanpeixinho alanpeixinho added this pull request to the merge queue Jun 24, 2026
Merged via the queue into kernelci:main with commit 5ef7b2b Jun 24, 2026
7 checks passed
Development

Successfully merging this pull request may close these issues.

Weekly caching on metrics page

3 participants