Building a Native Java Application for ARM64 with Quarkus

The Graal native image compiler is a great tool for significantly reducing the startup time and memory consumption of Java applications by compiling them into native executables. However, getting a compilation to work is sometimes hard and can take considerable effort.

With Red Hat’s Quarkus framework, many of the problems with native-image can be eliminated. The Quarkus team provides numerous out-of-the-box solutions for the sometimes hard-to-configure native build process. With Quarkus, implementing fast Java apps can be achieved with manageable effort.

By using Graal native image and Quarkus, Java applications become an interesting option for embedded use cases where fast start times and low memory consumption are key. Many embedded systems run on ARM hardware as it is more energy efficient than x86 hardware. With Java applications natively built for ARM hardware, a new dimension of code reuse becomes possible. Java code can be shared between all kinds of platforms and architectures, meeting the performance requirements of embedded systems, desktop computers and cloud environments in one go.

This article is about natively building Quarkus-based Java applications for ARM hardware and testing the result as part of a container-based development workflow.

Background

The first thing that comes to mind when targeting ARM is cross-compilation. If you are used to working with C++ compilers, you probably know that most of them support compilation for target architectures that differ from the build machine. This would be great for Java apps as well. However, Graal native image does not currently support cross-compilation, which is a problem when you are working on an x86 machine.

So, if you want to build a native executable for ARM, you must run the ARM version of the Graal native image compiler. To be precise: You must run the aarch64 version of native-image on an aarch64 Linux operating system.
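The requirement above can be checked before starting a build. The following is a minimal sketch, not part of the Quarkus tooling: the helper name is_aarch64 is hypothetical, and accepting "arm64" in addition to "aarch64" is an assumption for systems that report the alternative name.

```shell
# is_aarch64 MACHINE: succeeds when MACHINE names a 64-bit ARM architecture.
# Linux reports "aarch64"; "arm64" is accepted as an assumed alias.
is_aarch64() {
  case "$1" in
    aarch64|arm64) return 0 ;;
    *) return 1 ;;
  esac
}

# Report whether the current host can run the aarch64 native-image directly.
machine=$(uname -m)
if is_aarch64 "$machine"; then
  echo "$machine: aarch64 host, native-image can target ARM here"
else
  echo "$machine: not aarch64, ARM hardware or an emulated environment is needed"
fi
```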

There are two ways to achieve this:

  • Build on real ARM hardware running an aarch64 version of Linux.
  • Build in a virtual ARM environment based on QEMU executing an aarch64 version of Linux.

Now, let’s look into both options and their pros and cons:

Building on ARM Hardware

The first option is to get a powerful ARM-based computer that is able to run the resource-intensive native-image compile process. If you plan to build your apps in minutes and not in hours, you might look for something more powerful than a Raspberry Pi. If you still want to stick with your Pi, consider some additional configuration. We have had good experiences with NVIDIA’s Jetson boards for building our apps.

Docker Image for Building Quarkus

Quarkus allows you to build your application with a Docker image containing the native-image executable. The Quarkus team provides some pre-configured images that are used internally by Maven or Gradle to build the Linux x86_64 image of your application. To allow building with Docker, you must configure your Quarkus project to use the extension container-image-docker. If you are building with Gradle, this simply means that you add the following dependency:

implementation 'io.quarkus:quarkus-container-image-docker'

Now, the problem is that the Quarkus build image is only available for x86. As a result, you cannot use it to build your app on ARM hardware. What you need is an image that behaves like the Quarkus build image and is available for Linux aarch64.

Good news: Oracle provides an image for our desired architecture. The oracle/graalvm-ce image is a multi-architecture image that has an aarch64 variant. So, let’s build our Quarkus build image based on the one from the Graal team. This can be done by creating a Dockerfile with the following content:

Dockerfile.build.aarch64

FROM oracle/graalvm-ce:20.2.0-java11 AS build
RUN gu install native-image
WORKDIR /project
VOLUME ["/project"]
ENTRYPOINT ["native-image"] 

Please note that the project volume is used by the Quarkus builder to mount your app’s sources into the image. The ENTRYPOINT is necessary because Quarkus expects the image to execute the native-image tool when being run.

To use this image in your build process on an ARM machine, you have to build it there, which requires Docker to be installed on that machine. Then, you can run:

docker build -f Dockerfile.build.aarch64 -t nevernull/quarkus-build-aarch64 .

This will create the image with the name nevernull/quarkus-build-aarch64.
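Before using the image, it may be worth confirming that Docker recorded it as a linux/arm64 image. The sketch below assumes Docker is installed and the image has been built as shown; check_platform is a hypothetical helper name, and the Docker call is left as a usage comment so the snippet itself has no side effects.

```shell
# check_platform ACTUAL EXPECTED: compares two "os/architecture" strings.
# Hypothetical helper; it only compares text and does not call Docker itself.
check_platform() {
  if [ "$1" = "$2" ]; then
    echo "ok: $1"
  else
    echo "mismatch: got $1, expected $2" >&2
    return 1
  fi
}

# Usage (requires Docker and the image built above):
#   check_platform \
#     "$(docker image inspect --format '{{.Os}}/{{.Architecture}}' nevernull/quarkus-build-aarch64)" \
#     linux/arm64
```

Docker reports the architecture as arm64 rather than aarch64, which is why the expected value is written as linux/arm64.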

Now, use this image when building Quarkus. Quarkus kindly provides a property for this: with quarkus.native.builder-image, you can provide the name of the image when building. If you are using Gradle, you pass it along with the other properties that enable the native container-based build, as follows:

./gradlew build \
-Dquarkus.package.type=native \
-Dquarkus.native.container-build=true \
-Dquarkus.native.builder-image=nevernull/quarkus-build-aarch64:latest

When you execute this, you may go and get a coffee. On success, your ARM-ready native executable will be located in the build folder as *-runner application.
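To confirm that the resulting runner really targets ARM, the architecture field of its ELF header can be read directly. This is a rough sketch under stated assumptions: elf_machine is a hypothetical helper, it assumes a little-endian ELF file, and it only distinguishes the two architectures discussed in this article.

```shell
# elf_machine FILE: prints the target architecture recorded in an ELF header.
# Assumes a little-endian ELF file; e_machine is the 2-byte field at offset 18.
elf_machine() {
  code=$(od -An -tx1 -j18 -N2 "$1" | tr -d ' \t\n')
  case "$code" in
    b700) echo "aarch64" ;;          # EM_AARCH64 = 183 (0xb7)
    3e00) echo "x86_64" ;;           # EM_X86_64 = 62 (0x3e)
    *)    echo "unknown ($code)" ;;  # anything else is not interpreted here
  esac
}

# Usage, assuming the Gradle build above has produced exactly one runner:
#   elf_machine build/*-runner
```

Where the file utility is installed, running it on the runner gives the same information in a more readable form.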

Compiling a simple Quarkus-based microservice application with several REST endpoints:

Building on x86 with QEMU ARM emulation

With the above approach, you need ARM hardware to build your app. But what if you don’t have ARM hardware available? You might be able to simulate an ARM environment. For this, you can use the QEMU virtualization technology and Docker, as it has QEMU support built in. Let’s go!

Create an AARCH64 image for building

If you are running Docker Desktop, all you need to do is start an image that has an aarch64 architecture. QEMU will automatically step in and simulate an ARM environment. So, we can simply use the aarch64 version of our build image above (oracle/graalvm-ce) and run the native-image compiler with it, right?

The problem is that you need to tell Docker explicitly to pick up an aarch64 image. Some images have the architecture in their names (pointing to a single definition), but many images are multi-architecture images. This also applies to Oracle’s Graal CE image. In this case, Docker will select the image with the same architecture as your build machine. So how do you force Docker to select an aarch64 image?

Currently, the simplest approach is to point to a concrete image by its hash value. At the time of writing, the hash value of the aarch64 variant of the Graal CE 20.2 image is 494222b828e6096bd00b16b9626b54665546fc5b60a8080c99be8d29af829638, which can be found on its Docker Hub webpage. Pointing to this version is done by appending the hash value to the image reference in the Dockerfile as shown below:

FROM oracle/graalvm-ce:20.2.0-java11@sha256:494222b828e6096bd00b16b9626b54665546fc5b60a8080c99be8d29af829638 AS build
RUN gu install native-image
WORKDIR /project
VOLUME ["/project"]
ENTRYPOINT ["native-image"]
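The pinning rule used in the FROM line above (image reference, then @sha256:, then the hash value) can be sketched as a small shell helper. pin_image is a hypothetical name, and the validation it performs (64 lowercase hex characters) is an assumption about what a well-formed sha256 digest looks like, not something Docker requires you to check yourself.

```shell
# pin_image IMAGE DIGEST: prints IMAGE pinned to a sha256 digest.
# DIGEST is the bare 64-character lowercase hex value, without "sha256:".
pin_image() {
  case "$2" in
    ""|*[!0-9a-f]*) echo "digest must be lowercase hex" >&2; return 1 ;;
  esac
  if [ "${#2}" -ne 64 ]; then
    echo "digest must have 64 characters" >&2
    return 1
  fi
  printf '%s@sha256:%s\n' "$1" "$2"
}

# Prints the image reference used in the FROM line of the Dockerfile above.
pin_image oracle/graalvm-ce:20.2.0-java11 \
  494222b828e6096bd00b16b9626b54665546fc5b60a8080c99be8d29af829638
```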

Use the aarch64 image for building on x86

Now, you can build this image on your x86 machine and use it for compiling with Quarkus/native-image. First, build the image with Docker in the same way as before:

docker build -f Dockerfile.build.aarch64 -t nevernull/quarkus-build-aarch64 .

Then compile your Java app with this image:

./gradlew build \
-Dquarkus.package.type=native \
-Dquarkus.native.container-build=true \
-Dquarkus.native.builder-image=nevernull/quarkus-build-aarch64:latest

Now, go and get a coffee or two… or three…

There is a problem: The ARM emulation by QEMU is significantly slower than the bare-metal performance on ARM hardware. While the build process will actually run through, it might not finish on the same day. Let’s see:

Compiling a simple Quarkus-based microservice application with several REST endpoints:

This is about 28 times slower than compiling on the Jetson board. And yes, we configured Docker to have access to all 4 cores and all 8GB of RAM as described in the manual.

In summary: please don’t try this at all. Get ARM hardware for building Java apps for aarch64! Maybe take a second look at NVIDIA’s Jetson boards?

Creating a Docker image for execution on AARCH64

Okay, so it is not a great idea to build the native executable with QEMU-based ARM simulation. But it might be useful to put the resulting aarch64 executable into a corresponding container for local integration tests. With this approach, it is possible to run the Quarkus-based unit tests against the executable. Remember: natively compiled Java apps start amazingly fast. Is this true in a QEMU environment as well?

So, let’s imagine a remote build machine or CI pipeline has built the native executable for aarch64 for us. We fetch it and want to put it into a container for integration tests. Again, we are running on an x86 machine. Therefore, we need to force Docker to use an aarch64 version of an image.

The Quarkus team recommends using the Red Hat Universal Base Image (ubi8/ubi-minimal) for executing native Java apps. That’s great, as it is really small. However, how can we convince Docker to use the aarch64 version of this multi-architecture image on an x86 machine? It’s currently not possible to find the hash value of the image on the corresponding webpage. But this hash value exists. And here is how to find it:

Enable experimental Docker CLI features. Then run:

docker manifest inspect registry.access.redhat.com/ubi8/ubi-minimal

This outputs the following information about the different components of the image:

{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 737,
         "digest": "sha256:5f931273a2b9250318a45914228c25e8e3ea0feec846edd67c92f03af1596a8a",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 737,
         "digest": "sha256:c6592eb9cdd7ea7fa43beddf507ca2a8c2127f13ef66d49baea2fd28e37f62ba",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 737,
         "digest": "sha256:d32fa019f53a718e47714b50d91a19431e40de0174c0d522e5bf703da4235608",
         "platform": {
            "architecture": "ppc64le",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 737,
         "digest": "sha256:c16e3df0de887329613549ca94934353bc3d2212bb8e51eb7c0af94805d410ea",
         "platform": {
            "architecture": "s390x",
            "os": "linux"
         }
      }
   ]
}

Do you see the SHA hash of the arm64 variant? As described above, you can append it to the image reference in your Dockerfile. Here is the default Quarkus Dockerfile in the aarch64 version:

FROM registry.access.redhat.com/ubi8/ubi-minimal@sha256:c6592eb9cdd7ea7fa43beddf507ca2a8c2127f13ef66d49baea2fd28e37f62ba
WORKDIR /work/
COPY build/*-runner /work/application
RUN chmod 775 /work
EXPOSE 8080
CMD ["./application", "-Dquarkus.http.host=0.0.0.0"]

When you build this image, make sure the aarch64 variant of the executable (*-runner) is located in the build folder. Alternatively, you can work with a different folder (e.g. an architecture-specific sub-folder).
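The digest lookup described above can also be scripted instead of read off by eye. The following is a minimal sketch: arm64_digest is a hypothetical helper, and it assumes the pretty-printed layout shown in the manifest output, with one key per line and each "digest" line appearing before the "architecture" line of the same entry.

```shell
# arm64_digest: reads `docker manifest inspect` output on stdin and prints
# the digest of the first entry whose architecture is arm64.
# Assumes one JSON key per line, with "digest" before "architecture".
arm64_digest() {
  awk -F'"' '
    /"digest"/ { digest = $4 }                            # remember latest digest
    /"architecture"/ && $4 == "arm64" { print digest; exit }
  '
}

# Usage (requires Docker with the manifest command enabled):
#   docker manifest inspect registry.access.redhat.com/ubi8/ubi-minimal | arm64_digest
```

Because this relies on line layout rather than real JSON parsing, a JSON-aware tool such as jq, where available, would be the more robust choice, for example with the filter .manifests[] | select(.platform.architecture == "arm64") | .digest.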

When you build and run this image, Docker will again simulate an ARM environment. Is it feasible to test with it?

Starting a natively compiled Quarkus-based microservice application with several REST endpoints:

Although the app starts about 27 times slower in the simulation environment, it is still so extraordinarily fast that the slowdown does not matter. As a result, running integration tests with this setup is feasible, as long as there is no heavy computation involved.

Conclusion

  • The Quarkus framework and tools and the Graal native-image technology are great for building fast Java apps for ARM hardware. Fast startup times and low memory consumption make this combination a go-to solution for embedded environments.
  • To get a productive workflow with these technologies, it is necessary to build on real ARM hardware.
  • The QEMU simulation environment on x86 introduces a performance penalty. Execution times grow by a factor of more than 25, which makes this approach unusable for Graal native-image compilation.
  • Running integration tests of native aarch64 executables with QEMU is feasible. If no heavy computations are involved, the resulting apps start and execute fast enough for testing and development purposes.

About the Author

Dr. Daniel Thommes is CEO of NeverNull GmbH. He is an expert in AOT-compiled high-performance Java applications. NeverNull GmbH provides expert knowledge for software projects, architecture and design.