and this enables optimizations that span multiple instructions.
Historically, the adaptive interpreter was referred to as `tier 1` and
the JIT as `tier 2`. You will see remnants of this in the code.

## The Trace Recorder and Executors

There are two interpreters in this section:
1. Adaptive interpreter (the default behavior)
2. Trace recording interpreter (enabled on JIT builds)

The program begins running on the adaptive interpreter, until a `JUMP_BACKWARD` or
`RESUME` instruction determines that it is "hot" because the counter in its
[inline cache](interpreter.md#inline-cache-entries) indicates that it
executed more than some threshold number of times (see
[`backoff_counter_triggers`](../Include/internal/pycore_backoff.h)).
It then calls the function `_PyJit_TryInitializeTracing` in
[`Python/optimizer.c`](../Python/optimizer.c), passing it the current
[frame](frames.md), instruction pointer, and state.
The interpreter then switches into "tracing mode" via the macro
`ENTER_TRACING()`. On platforms that support computed gotos and tail-calling
interpreters, the dispatch table is swapped out; platforms that support
neither use a single flag in the opcode.
Execution between the normal interpreter and the tracing interpreter is
interleaved via this dispatch mechanism. This means that while logically
there are two interpreters, the implementation appears to be a single
interpreter.

During tracing mode, after each interpreter instruction's `DISPATCH()`,
the interpreter jumps to the `TRACE_RECORD` instruction. This instruction
records the previous instruction executed, along with any live values the next
operation may require. It then translates the previous instruction into
a sequence of micro-ops using `_PyJit_translate_single_bytecode_to_trace`.
To ensure that the adaptive interpreter instructions
and cache entries are up-to-date, the trace recording interpreter always resets
the adaptive counters of adaptive instructions it sees.
This forces a re-specialization should an instruction later deoptimize,
feeding the trace recorder up-to-date information.
Finally, the `TRACE_RECORD` instruction decides when to stop tracing
using various heuristics.

Once trace recording concludes, `LEAVE_TRACING()` restores the dispatch
table or clears the opcode flag set earlier by `ENTER_TRACING()`.
`stop_tracing_and_jit()` then calls `_PyOptimizer_Optimize()`, which optimizes
the trace and constructs an
[`_PyExecutorObject`](../Include/internal/pycore_optimizer.h).

JIT execution is set up
to either return to the adaptive interpreter and resume execution, or
transfer control to another executor (see `_PyExitData` in
Include/internal/pycore_optimizer.h). When resuming to the adaptive interpreter,
a "side exit", generated by an `EXIT_IF`, may trigger recording of another trace,
while a "deopt", generated by a `DEOPT_IF`, does not.

The executor is stored on the [`code object`](code_objects.md) of the frame,
in the `co_executors` field which is an array of executors. The start
executor in `co_executors`.

The micro-op (abbreviated `uop` to approximate `μop`) optimizer is defined in
[`Python/optimizer.c`](../Python/optimizer.c) as `_PyOptimizer_Optimize`.
It takes a micro-op sequence from the trace recorder and optimizes it with
`_Py_uop_analyze_and_optimize` in
[`Python/optimizer_analysis.c`](../Python/optimizer_analysis.c)
and an instance of `_PyUOpExecutor_Type` is created to contain it.