LOG-8697: Use event.lastTimestamp instead of metadata.creationTimestamp for Kubernetes event logs#3226
Conversation
|
@vparfonov: This pull request references LOG-8697 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.8.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/retest |
| } | ||
|
|
||
| ExpectedLogTemplateBuilder = func(event, oldEvent *corev1.Event) types.EventRouterLog { | ||
| ExpectedLogTemplateBuilder = func(event, oldEvent *corev1.Event, outputType obs.OutputType) types.EventRouterLog { |
There was a problem hiding this comment.
Do we know of any case where first/last timestamp would be nil for which we should test if we get the creation timestamp? I see creation time default to epoc so I'm going to assume it will always be something, and hopefully not epoc. This otherwise LGTM
There was a problem hiding this comment.
Great question! Investigating this actually led me to the better way to handle these checks.
Kubernetes has two Event APIs. The old style (v1 core) uses firstTimestamp/lastTimestamp, while the new style (events.k8s.io/v1) uses eventTime for better precision and scaling.
eventrouter watches the v1.Event objects, it usually sees lastTimestamp. However, if a component has migrated to the new API internally, the API server has to 'downgrade' the event for us. During that conversion, lastTimestamp can end up being zero if the event is part of a series.
I’ve updated the logic to include a fallback chain:
lastTimestamp -> firstTimestamp -> eventTime -> creationTimestamp
I also added a check for the zero timestamp (0001-01-01...).
|
/hold |
|
/approve |
|
@vparfonov I defer to you to release the hold. I believe the coverage is OK, but wanted to check with you first |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jcantrill, vparfonov The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…ernetes event logs EventRouter updates existing Event objects instead of creating new ones. The original metadata.creationTimestamp may therefore be days/weeks old while lastTimestamp reflects the most recent occurrence. Using creationTimestamp caused Loki (others sinks possibly) to reject records with "greater_than_max_sample_age". Use lastTimestamp for @timestamp and timestamp fields so event updates are ingested with the correct time. The fallback chain is: lastTimestamp -> firstTimestamp -> eventTime -> creationTimestamp
|
@vparfonov: This pull request references LOG-8697 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.8.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/hold cancel |
|
/lgtm |
|
@vparfonov: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Description
The currently uses
metadata.creationTimestampas the log event timestamp for Kubernetes Events, but Kubernetes updates existing Event objects when the same event repeats, instead of creating new ones.In this case:
metadata.creationTimestampremains the original time the Event object was createdlastTimestampis updated to reflect the most recent occurrenceIn long-running clusters, this means an Event object may have a
creationTimestampthat is weeks/months old, even though the event is still actively occurring. When Vector sends such records to Loki usingcreationTimestampas the log timestamp, Loki rejects them with:greater_than_max_sample_age.This PR address to use
event.lastTimestampas the log timestamp for EventRouter events instead ofmetadata.creationTimestamp.The logic to include a fallback chain:
lastTimestamp -> firstTimestamp -> eventTime -> creationTimestamp/cc @Clee2691 @cahartma
/assign @jcantrill
Links