Sorting Events¶
The sorting feature sorts incoming events from the BPF programs chronologically to ensure events are output in the order they occurred.
Enabling Sorting¶
Enable event sorting using the sort-events option:
tracee --output option:sort-events
Or in a configuration file:
output:
options:
sort-events: true
Information
There are 3 known sources of sorting issues:
-
In perf buffer, events are read in round robing order from CPUs buffers (and not according to invocation time).
-
Syscall events are invoked after internal events of the syscall (though the syscall happened before the internal events).
-
Virtual CPUs might enter sleep mode by host machine scheduler and send events after some delay.
How Sorting Works¶
To address the perf buffer issues, events are divided into queues according to the source CPU. This way the events are almost sorted (except for syscalls). Syscall events are inserted to their correct chronological position manually.
This way, all events which occurred before the last event of the most delaying CPU can be sent forward with guaranteed order.
To make sure syscall events are not missed when sending, a small delay is needed. Lastly, to address the vCPU sleep issue (which might cause up to 2 events to be received with a delay), the events need to be sent after a delay which is larger than the maximum possible vCPU sleep time (which is just an increase of the syscall events delay).
Sorting Algorithm¶
To summarize the algorithm's main logic, here is a textual simulation of the operation (assume that 2 scheduler ticks are larger than the maximum possible vCPU sleep time):
Tn = Timestamp (n == TOD)
#m = Event's Source CPU
-
Initial State
[ CPU 0 ] [ CPU 1 ] [ CPU 2 ] HEAD T1 T2 T4 T3 T5 T6 TAIL T8 -
Scheduler Tick #1
Incoming events: T9#1, T11#2, T13#1, T10#2, T12#2 Queues state after insert: [ CPU 0 ] [ CPU 1 ] [ CPU 2 ] HEAD T1 T2 T4 T3 T5 T10 + T6 T9 + T11 + TAIL T8 T13 + T12 + - No event sent. - Oldest timestamp = T1. - T8 is oldest timestamp in most recent timestamps. - In 2 ticks from now: send all events up to T8. - Bigger timestamps than T8 (+) will be sent in future scheduling. -
Scheduler Tick #2
Incoming events: T7#0, T22#1, T23#2, T20#0, T25#1, T24#2, T21#0 Queues state after insert: [ CPU 0 ] [ CPU 1 ] [ CPU 2 ] HEAD T1 ^ T2 ^ T4 ^ T3 ^ T5 ^ T10 T6 ^ T9 T11 T7 +^ T13 T12 T8 ^ T22 + T23 + T20 + T25 + T24 + TAIL T21 + - No event sent. - Oldest timestamp = T1. - T21 is oldest timestamp in most recent timestamps. - In 2 ticks from now: send all events up to T21. - T8 is previous oldest timestamp in most recent timestamps. - Next tick: send all events up to T8. - Bigger timestamps than T21 (+) will be sent in future scheduling. -
Scheduler Tick #3
Incoming events: T30#0, T34#1, T35#2, T31#0, T36#2, T32#0, T37#2, T33#0, T38#2, T50#1, T51#1 Queues state after insert: [ CPU 0 ] [ CPU 1 ] [ CPU 2 ] HEAD T20 ^ T9 ^ T10 ^ T21 ^ T13 ^ T11 ^ T30 + T22 T12 ^ T31 + T23 T24 T32 + T25 T35 + T33 + T34 + T36 + T50 + T37 + TAIL T51 + T38 + - Max sent timestamp = T8. - Oldest timestamp = T9. - T33 is oldest timestamp in most recent timestamps. - In 2 ticks from now: send all events up to T33. - T21 is previous oldest timestamp in most recent timestamps. - Next tick: send all events up to T21. - Bigger timestamps than T33 (+) will be sent in future scheduling.