HUGE congrats to Mike Pall. No amount of praise to him is enough.
Hopefully, sponsors will continue to donate to keep Mike fully employed and ensure the project's long-term future.
I encourage sponsors to consider donating [1].
On a side note, I wish the Programming Language Shootout [2] would add back in LuaJIT to demonstrate just how fast LuaJIT is in comparison to other languages. It's just amazing.
When the Programming Language Shootout used to include LuaJIT, it had benchmarks against Java.
If you want to see just LuaJIT vs Lua, [1] is a good comparison.
The reason I liked the inclusion of LuaJIT in the Programming Language Shootout was that it compared LuaJIT to other language implementations not just on speed, but also on memory consumption and lines of code, all of which LuaJIT typically dominated.
https://github.com/darius/superbench is not a big fancy cross-language shootout, at all, but I tried a few little programs I actually cared about. LuaJIT 'won' in the sense that of the languages I tried it had the best combo of performance and pleasant productivity, informally. I didn't try to quantify the latter.
(Also, no Java because I wouldn't seriously consider it for hacking-for-fun.)
Yeah I agree it's a very interesting piece of software.
But do you understand the prerequisites -- namely processor microarchitecture? I don't, and I know that's why I don't understand it. I imagine if I did, it might not be that hard to understand.
Most people's knowledge of the stack stops somewhere around C. C hasn't changed for 40 years. It's a portable assembly language, i.e. the common mental model is that each construct in C is a handful of assembly instructions on any processor. But to get LuaJIT-type performance you have to understand exactly how the CPU works, i.e. pipelining and reordering, internal caching algorithms, internal branch prediction algorithms, and maybe to some extent multicore, although I guess LuaJIT is single-threaded like Lua.
Mike Pall's mental model is for sure not C (since a lot of it isn't even written in C), but the lower level CPU architecture. And that has changed a lot in the last 10 years. I think even if you coded a lot of assembly language in the 80's it wouldn't necessarily translate to working on something like LuaJIT. My impression from skimming some online posts is that he has read hundreds of pages of Intel/ARM reference documentation from cover to cover, for multiple CPUs. I think there are a limited number of people with the patience to do that, since it's non-portable and changes relatively quickly.
Another problem is I think that the open source tools that are available don't work at the right level. Like when you are running profilers, they are helping you profile assembly code generated by a C compiler. The performance characteristics he's taking advantage of are at a lower level. I think you need CPU-specific performance counters and so forth, and there's a different way of getting those for each make/model. Are there even open source tools that allow you to get this information? I'd appreciate a pointer.
It's just a different level of abstraction than most programmers are working with. Not that many people are even writing "low level" C these days, e.g. something like redis. A lot of "application level" C code you see these days is old and/or not particularly fast.
I would be interested if anyone has any pointers for this style of code other than "read a bunch of CPU manuals". :)
I think that most sampling profilers like oprofile (http://oprofile.sourceforge.net/news/) or perf (https://perf.wiki.kernel.org/index.php/Main_Page) will give you CPU-level profiling and performance counters. The UI of those tools isn't super-friendly; if you're on OS X the "Instruments" app gives the same information with a far superior UI.
I have written a JIT that uses LuaJIT's dynamic assembly engine DynASM (http://luajit.org/dynasm.html). DynASM is an incredible piece of engineering: it lets you write really readable code for generating machine code at runtime, and it is extremely small and low-overhead.
I keep meaning to write an article that demonstrates how to use LuaJIT for a small but interesting JIT. I just never quite get around to it. I always wish that I could use it to implement the Universal Machine from ICFP 2006 (http://www.boundvariable.org/task.shtml), which is absolutely the most delightful problem ever, but as I recall it makes extensive use of self-modifying code, which makes JITting much more difficult and less effective.
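The starting point for an article like that would be a plain interpreter loop, which is what a JIT then specializes. Here's a minimal sketch of the shape of such a loop — a toy register VM in Python, with made-up opcodes and encoding that have nothing to do with the actual UM spec:

```python
# A toy register VM: the kind of interpreter loop you would target with a JIT.
# Opcodes, encoding, and the sample program are all made up for illustration.
LOADI, ADD, JNZ, HALT = range(4)

def run(program):
    regs = [0] * 4
    pc = 0
    while True:
        op, a, b, c = program[pc]
        if op == LOADI:            # regs[a] = immediate b
            regs[a] = b
        elif op == ADD:            # regs[a] = regs[b] + regs[c]
            regs[a] = regs[b] + regs[c]
        elif op == JNZ:            # if regs[a] != 0: jump to instruction b
            if regs[a] != 0:
                pc = b
                continue
        elif op == HALT:
            return regs
        pc += 1

# Sum 5+4+3+2+1 into r0 by counting r1 down to zero.
prog = [
    (LOADI, 0, 0, 0),    # r0 = 0   (accumulator)
    (LOADI, 1, 5, 0),    # r1 = 5   (counter)
    (LOADI, 2, -1, 0),   # r2 = -1  (decrement)
    (ADD, 0, 0, 1),      # r0 += r1
    (ADD, 1, 1, 2),      # r1 -= 1
    (JNZ, 1, 3, 0),      # loop while r1 != 0
    (HALT, 0, 0, 0),
]
print(run(prog)[0])  # 15
```

A tracing JIT would watch the hot loop at pc 3-5 and emit straight-line machine code for it; self-modifying code breaks this, because the compiled trace can become stale when the program rewrites itself.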
- Linux's perf tool allows you to read hardware performance counters. It's pretty self-explanatory; some interesting counters are platform-neutral, whereas for others you have to specify the CPU manufacturer's hex event code (see the Intel manuals, for example).
- For pipelining and HW/CPU details, I suggest you grab a copy of Hennessy and Patterson's Computer Architecture: A Quantitative Approach. There is also Computer Systems: A Programmer's Perspective, which is good, but I think for what you're interested in, CAAQA is the better book.
- If you just want to play around with hardware performance counters, you might want to try Intel's VTune, which comes (or at least did, two years ago) as an Eclipse plugin/RCP workbench.
> Another problem is I think that the open source tools that are available don't work at the right level. Like when you are running profilers, they are helping you profile assembly code generated by a C compiler. The performance characteristics he's taking advantage of are at a lower level. I think you need CPU-specific performance counters and so forth, and there's a different way of getting those for each make/model. Are there even open source tools that allow you to get this information? I'd appreciate a pointer.
The way forward is a sampling profiler.
By sampling the stack every millisecond or so, you can build a picture of what is taking the longest.
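A toy version of the idea in pure Python, peeking at another thread's stack with `sys._current_frames()` (all names here are made up; real samplers like perf work at the CPU level, below the interpreter):

```python
import collections
import sys
import threading
import time

def busy():
    # A deliberately hot function for the profiler to catch.
    deadline = time.time() + 0.3
    while time.time() < deadline:
        sum(range(1000))

def sample(thread_id, counts, stop, interval=0.001):
    # Every `interval` seconds, record which function the target
    # thread is currently executing.
    while not stop.is_set():
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            counts[frame.f_code.co_name] += 1
        time.sleep(interval)

counts = collections.Counter()
stop = threading.Event()
worker = threading.Thread(target=busy)
worker.start()
sampler = threading.Thread(target=sample, args=(worker.ident, counts, stop))
sampler.start()
worker.join()
stop.set()
sampler.join()
print(counts.most_common(3))  # busy() should dominate the samples
```

The per-function sample counts approximate where wall-clock time is going, which is exactly the picture a sampling profiler builds.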
A sampling profiler still doesn't tell you what specific instructions are taking longer, how much memory contention is happening between CPUs, how many cache and TLB misses there are, etc.
> But do you understand the prerequisites -- namely processor microarchitecture?
I do not, but I'm trying to learn more about it (Coursera FTW).
LuaJIT is awesome on many levels. First, as you have pointed out, the attention to detail at the processor level is quite special, and not many people can do that.
But good use of the processor is not the only reason LuaJIT is fast. At the compiler level, the optimizations LuaJIT performs are extensive, and there are plenty of novel ideas in there that people writing papers should read about (Mike Pall has posted a list of things he considers somewhat novel). The nature of the trace compiler makes many of these optimizations less complicated than they would otherwise be.
There are many compiler books, but no books about JITs in general. There are some good blog posts about LuaJIT, but not that many. Following Mike Pall (MikeMike) on HN or Reddit is interesting too.
Particularly interested in Android integration - has anyone used it for any projects? From my admittedly small experience using it, I love the language, and would really like to use it in some way practically.
Thank you! I'm not developing the site in it, I'm using it for allowing the users to write their own analytics reports in an easy, fast and secure manner. It's just the bare LuaJIT interpreter with some simple libraries I'm developing for analyzing reports more efficiently.
Something tells me there is an MC Hammer joke in here regarding being 2.0-Lua-Jit-To-Quit, but I'm not sure I can find the phrasing to excite the level of humor I was hoping for...
If you're referring to LuaJIT 1 vs 2, the big differences are that LuaJIT 2 has a trace-tree JIT, and interpreters for each platform written in assembly. The improvement in performance is significant: LuaJIT2's interpreter alone is faster than LuaJIT1's JIT.
It's hard to compare LuaJIT1 and 2, since LuaJIT2 is the most advanced JIT for a dynamically-typed language on the planet. The performance it can achieve with zero type annotations still blows me away.
I think you're forgetting the gold standard in dynamic languages from the 80s: Common Lisp. It's still around, and it's still faster than almost every other dynamically typed language out there.
Yes, on synthetic benchmarks for LuaJIT-2.0.0-beta10 and PyPy 1.9. Those numbers need updating for the latest versions, but the basics are as follows. LuaJIT is incredibly fast to warm-up. On short benchmarks (around a second or less), it is way ahead of PyPy. As code runs for longer and longer, PyPy tends to catch up, and sometimes overtake, LuaJIT.
My experience of RPython and PyPy suggests that there is a fair bit of scope for reducing some of the warm-up cost. I don't think PyPy will ever match LuaJIT's warm-up time, but it may well move quite a bit closer over time. It'll be interesting to see.
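One simple way to see warm-up effects like this is to time each run separately instead of only reporting a total. A minimal harness sketch (the function names and workload are made up; under a JIT, the first runs come out slower than the later ones):

```python
import time

def bench(f, reps=5):
    """Time each run of f separately; with a JIT underneath, warm-up
    shows up as the first runs being slower than the later ones."""
    times = []
    for _ in range(reps):
        t0 = time.perf_counter()
        f()
        times.append(time.perf_counter() - t0)
    return times

runs = bench(lambda: sum(i * i for i in range(100_000)))
print(["%.3f ms" % (t * 1e3) for t in runs])
```

This also shows why short benchmarks favor fast-warming JITs like LuaJIT: if the program finishes in one "run", the first column is the whole story.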
Are you sure PyPy can overtake LuaJIT after warmup? I think maybe it can in a micro-benchmark, but I doubt that PyPy has any chance in a complex benchmark.
LuaJIT uses more advanced elimination of loads and other optimizations; also, Lua is easier to optimize in general.
I don't know the numbers, but I would be really astonished if that were true.
To me, the real deciding factor that tells me that LuaJIT will win in the long run is that LuaJIT traces the high-level semantics of Lua, whereas PyPy has to work with Python bytecode, which may have a considerable amount of lost information. There are some good examples in the lambda-the-ultimate thread (for example, optimizing based on the knowledge that HLOAD and ASTORE will never alias).
PyPy can't feasibly optimize the high-level Python AST, because the language is very large compared to Lua, and more or less behaviorally defined by Python's bytecode compiler.
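You can get a feel for how much structure the bytecode loses by looking at CPython's bytecode with the `dis` module (PyPy's bytecode is roughly CPython-compatible, so the picture is similar):

```python
import dis

def total(xs):
    acc = 0
    for x in xs:
        acc += x
    return acc

# The high-level "for" loop is flattened into GET_ITER/FOR_ITER plus
# jumps; the bytecode no longer carries the source-level facts (what is
# being iterated, what can alias what) that a tracer could exploit.
names = [ins.opname for ins in dis.get_instructions(total)]
print(names)
```

A tracer working on this stream has to rediscover the loop and re-infer the types, which is part of the information loss being described above.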
This is one of the interesting things about LuaJIT: it demonstrates the value of a small, well-designed language in a shockingly awesome way. The care that the Lua designers put into making Lua simple and regular is one of the reasons why a single programmer (a demigod, yes, but still — one demigod) is able to make so impressive an implementation.
> Are you sure PyPy can overtake LuaJIT after warmup? I think maybe it can in a micro
> benchmark, but I doubt that PyPy has any chance in a complex benchmark.
When comparing the performance of different languages, micro benchmarks are about all we have, for better or worse. The best we can do is to run a fair number of different such benchmarks and make the comparison over them. I made my statement on the basis of such a comparison.
Both LuaJIT and RPython / PyPy are very clever systems and there's a surprising amount of overlap between the way they do things. But every system has its own strengths and weaknesses, and it doesn't surprise me personally that neither one is a winner 100% of the time.
I don't think PyPy is faster. Their design is significantly more complicated and thus takes longer to tune. Additionally, I think Python is a bit more dynamic than Lua, which complicates things.
I remember seeing a comparison between LuaJIT and V8, both of which are highly tuned. LuaJIT won some of the benchmarks while V8 won others, and there was actually quite a bit of difference in performance. That's the problem with JITting: it's all tradeoffs. What's good for a JIT in a browser is different from what's good in a server environment. So you have to run a lot of benchmarks, testing widely different things, while at the same time not spending more time optimizing one benchmark implementation in one language than in another.
Also the addition of the FFI library. It totally changes the way you integrate with C code (for the better); it's the best FFI I've seen in any language.
I'd be surprised if it were easier than LLVM, which is pretty easy.
But if you want to make a language you should absolutely not get hung up on technology. Even though it's very dated, Crenshaw's "Let's Build a Compiler" can teach you everything you need to get started building languages from scratch and by hand, and it moves pretty fast too:
There are other tutorials out there which will get you places. The first thing to do, really, is make a sketch of the language as you'd like to use it, and get excited about that, and then sit down and implement bits of it at a time. Compiler/interpreter writing is extremely rewarding—you get a lot of payback for your investment. Good luck!
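As a concrete starting point, here is roughly what "implement bits of it at a time" looks like for the smallest possible language, a calculator, as a hand-written recursive-descent evaluator (in Python for brevity; everything here is illustrative, not from any particular tutorial):

```python
import re

# A tiny expression language: integers, + and *, parentheses.
# Recursive descent: one function per grammar rule.
TOKENS = re.compile(r"\s*(\d+|[()+*])")

def tokenize(src):
    return TOKENS.findall(src) + ["<eof>"]

def parse_expr(toks, i):             # expr := term ('+' term)*
    val, i = parse_term(toks, i)
    while toks[i] == "+":
        rhs, i = parse_term(toks, i + 1)
        val += rhs
    return val, i

def parse_term(toks, i):             # term := atom ('*' atom)*
    val, i = parse_atom(toks, i)
    while toks[i] == "*":
        rhs, i = parse_atom(toks, i + 1)
        val *= rhs
    return val, i

def parse_atom(toks, i):             # atom := number | '(' expr ')'
    if toks[i] == "(":
        val, i = parse_expr(toks, i + 1)
        assert toks[i] == ")", "expected )"
        return val, i + 1
    return int(toks[i]), i + 1

def evaluate(src):
    val, _ = parse_expr(tokenize(src), 0)
    return val

print(evaluate("2 + 3 * (4 + 1)"))  # 17
```

From here you grow the sketch a rule at a time: variables, then functions, then an AST instead of immediate evaluation, and eventually code generation — the same incremental path Crenshaw's tutorial takes.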
It's actually really easy to take the JIT-compilation engine from LuaJIT and use it as the backend for your own programming language. The LuaJIT code is surprisingly modular, and you can pretty much take what you need.
For scanner/parser generation Lemon and Ragel are terrific. That takes care of building the AST for you. LuaJIT does the heavy lifting and takes care of the architecture specific edge cases. That just leaves the fun part in the middle where you decide on the semantics and syntax of your language.
I got curious and looked into this. The relevant effort seems to be this, a new backend for the ClojureScript compiler which compiles to Lua instead of JavaScript:
Clojure's data structures do not require multiple threads. Clojure separates the shape of the data from the way the data is processed. All of Clojure's data structures are completely feasible in a single-threaded language.
I'd like to use Lua, partly to take advantage of LuaJIT and partly because it's an interesting language, but it drives me crazy that it lacks things like 'continue' and '+='. Maybe I should use MoonScript, but that completely discards Lua's aesthetics, which is hardly better.
[1] http://luajit.org/sponsors.html
[2] http://shootout.alioth.debian.org/