not logged in | [Login]
A single-pass parser transforms Lua source code into LuaJIT Bytecode. The bytecode operates on dynamically-typed values and has semantics very close to the source code. This is the reason why relatively few optimizations can be performed on the bytecode itself -- most of them wouldn't be valid transforms in a dynamic language.
Similarly, virtual register allocation for the bytecode closely follows scoping of local variables in the Lua source code. No attempt is made to minimize or overlap the lifetimes of local variables (virtual registers are cheap and plenty).
Constant folding is performed for all arithmetic operators (+ - * / % ^
and unary minus) and the logical not
operator. If the input consists
solely of constants, the output is turned into a constant, too. E.g.
this is the bytecode for return 1.5 * 3
:
0001 KNUM 0 0 ; 4.5
0002 RET1 0 2
The multiplication has been eliminated and only its result is returned.
Composite conditional expressions like x < 10 and x or 10
are turned
into a series of branches, instead of evaluating the intermediate
expressions into booleans. The logical operators and
and or
never
appear in the bytecode. not
only appears when the result is used
outside of a logical result context. E.g. this is the bytecode for
return a < b and x or y
:
0001 GGET 0 0 ; "a"
0002 GGET 1 1 ; "b"
0003 ISGE 0 1
0004 JMP 0 => 0008
0005 GGET 0 2 ; "x"
0006 IST 0
0007 JMP 1 => 0009
0008 => GGET 0 3 ; "y"
0009 => RET1 0 2
Conditionals can be eliminated if their input is a constant. This is
only done for 'truthness' tests. E.g. this is the bytecode for
return "a" and 99
:
0001 KSHORT 0 99
0002 RET1 0 2
The check for the truthness of "a"
has been eliminated.
A more common idiom is cond and x or y
, which is useful as a ternary
conditional. But the behavior is not the same, since x
is evaluated
and must not be false
or nil
(or the idiom would fail). However x
is very often a constant. E.g. this is the bytecode for
return a and 10 or 20
:
0001 GGET 0 0 ; "a"
0002 ISF 0
0003 JMP 1 => 0006
0004 KSHORT 0 10
0005 JMP 1 => 0007
0006 => KSHORT 0 20
0007 => RET1 0 2
The test for the truthness of 10
has been eliminated.
Logical operators compute an actual result, e.g. print("a" and 99)
prints 99
. But this result is not needed in a logical result context,
such as an if
statement. E.g. this is the bytecode for
if "a" and 99 then end
:
0001 RET0 0 1
The conditional as well as its result has been eliminated.
Unneeded results of calls are eliminated, too: e.g. the CALL*
instructions are patched to not return any results.
Unconditional jumps that lead to other jumps are folded into a jump to
the final target. Similarly, closing upvalues at the end of a scope is
often followed by a jump. The jump is then folded into the UCLO
, which
takes a branch target.
Lua can be used as data description language, which often involves
creating tables that are initialized with many constants. Instead of
allocating an empty table and generating lots of explicit stores, a
template table is generated by the parser. The template table can be
duplicated very fast with the TDUP
instruction (duplication is
necessary, because the resulting table may be modified).
Mixed constant and variable initializers are allowed, too. The template
table is only created, if at least one constant initializer (constant
key and constant value) is present in the table constructor expression.
E.g. this is the bytecode for return { foo=1, bar=2, 1,2,x,4 }
:
0001 TDUP 0 0
0002 GGET 1 1 ; "x"
0003 TSETB 1 0 3
0004 RET1 0 2
All of the constant initializers are merged into the template table.
Only the store operation t[3] = x
remains in the bytecode.
Many bytecode instructions have specialized forms for commonly used operand types or number of operands. This reduces dispatch costs, since it eliminates (often unpredictable) branches inside of their implementation:
KSTR
loads string constants. Or KSHORT
loads 16 bit signed integer constants, whereas KNUM
loads all other numeric constants.IST
and ISF
, which jump either on truthness or falseness. This saves a dispatch compared to using a single instruction plus an operand for the same purpose.ISEQN
is an equality comparison with a numeric constant.ADDVN
is an addition with a numeric constant on the right.USETN
is used to store a number constant.TGETB
is for loads with constant integer indices 0 to 255 or TSETS
is for stores with a constant string key.CALL
and tailcalls CALLT
.RET0
or RET1
instead of the generic RET
.CALLM
is a call that passes on multiple results as arguments.*FOR*
instructions are used for numeric 'for' loops.FUNCV
is the function header for a vararg Lua function.For more details see the description of the Bytecode instructions.
TODO