Android has actually employed a hybrid JIT/AOT compilation model for a long time.
The application bytecode is only interpreted on first run and afterwards if there’s no cached JIT compilation for it. The runtime AOT compiles well-known methods and then profiles the application to identify targets for asynchronous JIT compilation when the device is idle and charging (so no excess battery drain): https://source.android.com/docs/core/runtime/configure#how_art_works
Compiling on the device allows the use of profile-guided optimizations (PGO), as well as the use of any non-baseline CPU features the device has, like instruction set extensions or later revisions (e.g. ARMv8.5-A vs ARMv8).
If apps had to be distributed entirely as compiled object code, you’d either have to pre-compile artifacts for every different architecture and revision you plan to support, or choose a baseline to compile against and then use feature detection at runtime, which adds branches to potentially hot code paths.
It would also require the developer to manually gather profiling data if they wanted to utilize PGO, which may limit them to just the devices they have on-hand, or paying through the nose for a cloud testing service like that offered by Firebase.
This is not to mention the massive improvement to the developer experience from not having to wait several minutes for your app to compile to test out each change. Call it laziness all you want, but it’s risky to launch a platform when no one wants to develop apps for it.
Any experienced Android dev will tell you it does kinda suck anyways, but it’d suck way worse if it was all C++ instead. I’d take Android development over iOS development any day of the week though. XCode is one of the worst software products ever conceived, and you’re forced to use it to build anything for iOS.
I know about all this — I actually began implementing my own JVM language a few days ago. I know Android uses Dalvik btw. But I guess a lot of people can use this info; infodump is always good. I do that.
btw I actually have messed around with libgcc-jit and I think at least on x86, it makes zero difference. I once did a test:
– Find /e/ with MAWK -> 0.9s
– Find /e/ with JAWK -> 50s.
No shit! It’s seriously slow.
Now compare this with go-awk: 19s.
Go has reference counting and heap etc, basically a ‘compiled VM’. I think if you want fast code, ditch runtime.
Actually, Android doesn’t really use Dalvik anymore. They still use the bytecode format, but built a new runtime. The architecture of that runtime is detailed on the page I linked. IIRC, Dalvik didn’t cache JIT compilation results and had to redo it every time the application was run.
FWIW, I’ve heard libgcc-jit doesn’t generate particularly high quality code. If the AOT compiled code was compiled with aggressive optimizations and a specific CPU in mind, of course it’ll be faster. JIT compiled code can meet or exceed native performance, but it depends on a lot of variables.
As for mawk vs JAWK vs go-awk, a JIT is not going to fix bad code. If it were a true apples to apples comparison, I’d expect a difference of maybe 30-50%, not ~2 orders of magnitude. A performance gap that wide suggests fundamental differences between the different implementations, maybe bad cache locality or inefficient use of syscalls in the latter two.
On top of that, you’re not really comparing the languages or runtimes so much as their regular expression engines. Java’s isn’t particularly fast, and neither is Go’s. Compare that to Javascript and Perl, both languages with heavyweight runtimes, but which perform extraordinarily well on this benchmark thanks to their heavily optimized regex engines.
It looks like mawk uses its own bespoke regex engine, which is honestly quite impressive in that it performs that well. However, it only supports POSIX regular expressions, and doesn’t even implement braces, at least in the latest release listed on the site: https://github.com/ThomasDickey/mawk-20140914
(The author creates a new Github repo to mirror each release, which shows just how much they refuse to learn to use Git. That’s a respectable level of contempt right there.)
Meanwhile, Java’s regex engine is a lot more complex with more features, such as lookahead/behind and backreferences, but that complexity comes at a cost. Similarly, if go-awk is using Go’s regexp package, it’s using a much more complex regex engine than is strictly necessary. And Golang admits in their own FAQ that it’s not nearly as optimized as other engines like PCRE.
Thus, it’s really not an apples to apples comparison. I suspect that’s where most of the performance difference arises.
Go has reference counting and heap etc, basically a ‘compiled VM’.
This statement is completely wrong. Like, to a baffling degree. It kinda makes me wonder if you’re trolling.
Go doesn’t use any kind of VM, and has never used reference counting for memory management as far as I can tell. It compiles directly to native machine code which is executed directly by the processor, but the binary comes with a runtime baked in. This runtime includes a tracing garbage collector and manages the execution of goroutines and related things like non-blocking sockets.
Additionally, heap management is a core function of any program compiled for a modern operating system. Programs written in C and C++ use heap allocations constantly unless they’re specifically written to avoid them. And depending on what you’re doing and what you need, a C or C++ program could end up with a more heavyweight collective of runtime dependencies than the JVM itself.
At the end of the day, trying to write the fastest code possible isn’t usually the most productive approach. When you have a job to do, you’re going to welcome any tool that makes that job easier.
Android has actually employed a hybrid JIT/AOT compilation model for a long time.
The application bytecode is only interpreted on first run and afterwards if there’s no cached JIT compilation for it. The runtime AOT compiles well-known methods and then profiles the application to identify targets for asynchronous JIT compilation when the device is idle and charging (so no excess battery drain): https://source.android.com/docs/core/runtime/configure#how_art_works
Compiling on the device allows the use of profile-guided optimizations (PGO), as well as the use of any non-baseline CPU features the device has, like instruction set extensions or later revisions (e.g. ARMv8.5-A vs ARMv8).
If apps had to be distributed entirely as compiled object code, you’d either have to pre-compile artifacts for every different architecture and revision you plan to support, or choose a baseline to compile against and then use feature detection at runtime, which adds branches to potentially hot code paths.
It would also require the developer to manually gather profiling data if they wanted to utilize PGO, which may limit them to just the devices they have on-hand, or paying through the nose for a cloud testing service like that offered by Firebase.
This is not to mention the massive improvement to the developer experience from not having to wait several minutes for your app to compile to test out each change. Call it laziness all you want, but it’s risky to launch a platform when no one wants to develop apps for it.
Any experienced Android dev will tell you it does kinda suck anyways, but it’d suck way worse if it was all C++ instead. I’d take Android development over iOS development any day of the week though. XCode is one of the worst software products ever conceived, and you’re forced to use it to build anything for iOS.
I know about all this — I actually began implementing my own JVM language a few days ago. I know Android uses Dalvik btw. But I guess a lot of people can use this info; infodump is always good. I do that.
btw I actually have messed around with libgcc-jit and I think at least on x86, it makes zero difference. I once did a test:
– Find /e/ with MAWK -> 0.9s – Find /e/ with JAWK -> 50s.
No shit! It’s seriously slow.
Now compare this with go-awk: 19s.
Go has reference counting and heap etc, basically a ‘compiled VM’. I think if you want fast code, ditch runtime.
Actually, Android doesn’t really use Dalvik anymore. They still use the bytecode format, but built a new runtime. The architecture of that runtime is detailed on the page I linked. IIRC, Dalvik didn’t cache JIT compilation results and had to redo it every time the application was run.
FWIW, I’ve heard libgcc-jit doesn’t generate particularly high quality code. If the AOT compiled code was compiled with aggressive optimizations and a specific CPU in mind, of course it’ll be faster. JIT compiled code can meet or exceed native performance, but it depends on a lot of variables.
As for mawk vs JAWK vs go-awk, a JIT is not going to fix bad code. If it were a true apples to apples comparison, I’d expect a difference of maybe 30-50%, not ~2 orders of magnitude. A performance gap that wide suggests fundamental differences between the different implementations, maybe bad cache locality or inefficient use of syscalls in the latter two.
On top of that, you’re not really comparing the languages or runtimes so much as their regular expression engines. Java’s isn’t particularly fast, and neither is Go’s. Compare that to Javascript and Perl, both languages with heavyweight runtimes, but which perform extraordinarily well on this benchmark thanks to their heavily optimized regex engines.
It looks like mawk uses its own bespoke regex engine, which is honestly quite impressive in that it performs that well. However, it only supports POSIX regular expressions, and doesn’t even implement braces, at least in the latest release listed on the site: https://github.com/ThomasDickey/mawk-20140914
(The author creates a new Github repo to mirror each release, which shows just how much they refuse to learn to use Git. That’s a respectable level of contempt right there.)
Meanwhile, Java’s regex engine is a lot more complex with more features, such as lookahead/behind and backreferences, but that complexity comes at a cost. Similarly, if go-awk is using Go’s
regexp
package, it’s using a much more complex regex engine than is strictly necessary. And Golang admits in their own FAQ that it’s not nearly as optimized as other engines like PCRE.Thus, it’s really not an apples to apples comparison. I suspect that’s where most of the performance difference arises.
This statement is completely wrong. Like, to a baffling degree. It kinda makes me wonder if you’re trolling.
Go doesn’t use any kind of VM, and has never used reference counting for memory management as far as I can tell. It compiles directly to native machine code which is executed directly by the processor, but the binary comes with a runtime baked in. This runtime includes a tracing garbage collector and manages the execution of goroutines and related things like non-blocking sockets.
Additionally, heap management is a core function of any program compiled for a modern operating system. Programs written in C and C++ use heap allocations constantly unless they’re specifically written to avoid them. And depending on what you’re doing and what you need, a C or C++ program could end up with a more heavyweight collective of runtime dependencies than the JVM itself.
At the end of the day, trying to write the fastest code possible isn’t usually the most productive approach. When you have a job to do, you’re going to welcome any tool that makes that job easier.