not logged in | [Login]
So you think you've found a bug in LuaJIT?
Hold your breath for a minute - are you sure it's not your mistake?
Run your code with PUC lua
, and with luajit -joff
and check that you don't
see the bug, and that you do with luajit
.
Check that you're not relying on any undefined behaviour.
I once thought I had found a bug, but in fact I was relying on the order in
which pairs
traversed a hash table.
Both lua
and luajit -joff
always traversed the table in exactly the same
order, but sometimes luajit
legitimately traversed the table in a different
order.
If you still think it might be LuaJIT's fault, read on...
JIT bugs can be hard to catch. The JIT compiler uses a number of heuristics to decide which traces to compile, and your bug showing up will probably depend on a particular trace starting at the right line being chosen for compilation at the right time.
Almost anything can tickle the trace-choosing heuristics and make the bug disappear, so the process of reducing your code to a smaller testcase can be slow. Even removing unused code can move or hide the problem - but sometimes code you couldn't remove at one stage can be removed later after other changes.
Persistence does win in the end.
print
If you're like me, you'll be tempted to sprinkle print
s in your code to see
if you can find out what's going on.
print
isn't implemented as a fast function, so traces containing print
are
aborted.
So if you put a print
in the middle of the problem trace,
you'll stop the trace being compiled and hide your problem.
If you want to get some output use io.write
which doesn't abort traces,
but beware - it does tickle the heuristics.
Try to get a dump of your code:
luajit -jdump=+rsx,<dump_file_name> <file.lua>
.
Unfortunately recording a dump can also influence the heuristics and so move or
hide the problem.
The smaller the dumpfile, the easier it'll be to finally track down the problem, use the size of the file as a metric of how well you're doing at reducing your case.
Different compiler options to LuaJIT can move, hide or expose the problem,
so it's worth trying a few of them to ferret it out.
I've found that there's usually a value of -Ohotloop=X
that'll set the bug
off so it's worth trying a few.
Here's a shell script that'll run your code with a whole range of compiler options. It assumes your code has a non-zero exit code when the bug occurs.
lua $1 > /dev/null
echo "lua OK"
luajit -joff $1 > /dev/null
echo "luajit -joff OK"
for i in {1..50}
do
echo -Ohotloop=$i
luajit -Ohotloop=$i -jdump=+rsx,test.dump $1 > /dev/null
if [ $? -ne 0 ]
then
echo "Error on " luajit -Ohotloop=$i -jdump=+rsx,test.dump $1 "> /dev/null"
mv test.dump error.dump
for o in "" "1" "2" "3" "-fold" "-cse" "-dce" "-narrow" "-loop" "-fwd" "-dse" "-abc" "-fuse"
do
echo -O$o -Ohotloop=$i
luajit -O$o -Ohotloop=$i -jdump=+rsx,test.dump $1 > /dev/null
done
break
fi
done
rm test.dump
Even if you have a very large test case and can't reduce it much,
the problem trace could still be a short piece of code.
You can reduce the size of the dump and track down the problem by turning
the jit compiler off for your modules one-by-one.
Put if jit then jit.off(true, true) end
at the top of your modules,
and see if the problem goes away.
If it does go away, remove the line.
If the problem persists, that module doesn't need to be compiled to exhibit
the problem, and you've just made the dump smaller.
Sorry for the shameless plug, but I use
luatrace
to find unused code that can (possibly) be removed, again reducing the size of
the test case.
Run lua -luatrace.profile <file.lua>
and look at annotated-source.txt
-
you'll see which lines aren't executed.
Of course, as I said earlier, not all unused code can be removed.