Building Portable Binaries for Single Board Computers with diozero and GraalVM

Matt Lewis
7 min read · May 8, 2021


Introduction

diozero is a portable device I/O library written in Java that provides support for GPIO, I2C, SPI and Serial devices. To date it is known to work on all versions of the Raspberry Pi, from the very first ARMv6 based Model A through to the latest ARMv8-A Pi 4 variants, on both 32-bit and 64-bit Operating Systems. Support is not limited to Raspberry Pis either: the Java write once, run anywhere design philosophy means that diozero will theoretically run on any Linux based device that can run Java. It is also tested on the following Single Board Computers, using both the manufacturer provided O/S and the excellent Armbian O/S:

  • ASUS TinkerBoard
  • HardKernel Odroid C2
  • Allwinner H3 boards (including NanoPi Neo and Duo2)
  • Beaglebone Green / Black
  • The now defunct NEXT Thing Co CHIP

This portability is achieved by separating the exposed API from the internal Service Provider Interface that is responsible for all actual device communication. The Java Service Loader API is used to load a Service Provider implementation dynamically at run-time. The diozero Concepts page describes this in more detail.
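
As a rough illustration of the Service Loader mechanism (a minimal sketch, not diozero's actual code: the DeviceFactory interface, DefaultDeviceFactory class and ProviderLookup class are hypothetical names made up for this example), a library can ask the Service Loader for any provider found on the classpath and fall back to a built-in one when none is registered:

```java
import java.util.ServiceLoader;

class ProviderLookup {
    // Returns the first provider registered on the classpath, or the built-in default.
    static DeviceFactory load() {
        // ServiceLoader looks for META-INF/services/<interface name> entries at run-time.
        return ServiceLoader.load(DeviceFactory.class)
                .findFirst()
                .orElseGet(DefaultDeviceFactory::new);
    }

    public static void main(String[] args) {
        System.out.println("Using provider: " + load().getName());
    }
}

// Hypothetical Service Provider Interface (not the real diozero SPI).
interface DeviceFactory {
    String getName();
}

// Hypothetical built-in provider used when nothing else is registered.
class DefaultDeviceFactory implements DeviceFactory {
    @Override
    public String getName() {
        return "built-in";
    }
}
```

Run on its own, with no provider registered under META-INF/services, the lookup falls through to the built-in default. Because the provider is discovered at run-time rather than referenced directly in code, this is exactly the kind of lookup that needs extra care in a statically compiled image.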

diozero provides a default built-in Service Provider implementation that works via a thin C layer that is invoked using JNI for maximum performance. The diozero-core JAR file is packaged with pre-built shared libraries for ARMv6 and ARMv7 32-bit CPUs, AArch64 / ARM 64-bit CPUs as well as x86-64.
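
To give a feel for how a JAR might pick one of several bundled shared libraries, here is a minimal sketch that maps the JVM's os.arch system property to a resource path. Everything in it is an assumption for illustration: the NativeLibSelector class, the directory names and the library name are hypothetical and are not taken from the diozero source. Telling ARMv6 apart from ARMv7 would also need more than os.arch, which this sketch does not attempt.

```java
import java.util.Locale;
import java.util.Map;

class NativeLibSelector {
    // Hypothetical directory layout inside the JAR, keyed by os.arch value.
    private static final Map<String, String> DIRS = Map.of(
            "arm", "linux-arm",
            "aarch64", "linux-aarch64",
            "amd64", "linux-x86_64",
            "x86_64", "linux-x86_64");

    // Builds the classpath resource path of the shared library for the given architecture.
    static String resourcePathFor(String osArch, String libName) {
        String dir = DIRS.get(osArch.toLowerCase(Locale.ROOT));
        if (dir == null) {
            throw new UnsupportedOperationException("No pre-built library for " + osArch);
        }
        // mapLibraryName adds the platform prefix and suffix, e.g. lib<name>.so on Linux.
        return "/lib/" + dir + "/" + System.mapLibraryName(libName);
    }

    public static void main(String[] args) {
        String arch = System.getProperty("os.arch");
        try {
            System.out.println(resourcePathFor(arch, "example"));
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```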

In general, performance of diozero is pretty good — GPIO toggle on a Pi4 has been observed at frequencies of just under 28MHz (partly due to the built-in memory-mapped GPIO implementation). However, one problem remains — the Java Virtual Machine has to load a lot of classes on start-up and the Just-in-time Compiler will always take time to get “warmed up”. This can be a real problem for short-lived applications.

Fortunately GraalVM has been designed specifically to address these concerns — by statically compiling a Java application ahead-of-time into a native executable. The GraalVM website lists Linux ARM64 support as “experimental” at the time of writing — note that there is no support for ARM32 meaning that this will only work on Pi3 and Pi4 variants that are running with a 64-bit Operating System (Raspberry Pi OS or Ubuntu).

Building a Native diozero Application

This is the first time that I’ve used GraalVM, so I thought it would be useful to document my experience.

The diozero-sampleapps project includes a number of test applications that are used to test the various devices and features provided by diozero. For this article, I will focus on the simple GpioPerfTest application that I use to measure GPIO toggle performance. I used a Pi4 Model B with 2GB RAM running Raspberry Pi OS 64-bit “lite” (console only).

First of all, the GraalVM compiler needs to be installed on the device:

# Download the AArch64 GraalVM binary
wget https://github.com/graalvm/graalvm-ce-builds/releases/download/vm-21.0.0.2/graalvm-ce-java11-linux-aarch64-21.0.0.2.tar.gz
# Extract...
tar zxf graalvm-ce-java11-linux-aarch64-21.0.0.2.tar.gz
# ...and install in /usr/lib/jvm
sudo mv graalvm-ce-java11-21.0.0.2 /usr/lib/jvm/.

Make sure that GraalVM executables are available on the PATH:

export GRAALVM_HOME=/usr/lib/jvm/graalvm-ce-java11-21.0.0.2
export PATH=${PATH}:${GRAALVM_HOME}/bin

Install required Operating System dependencies:

sudo apt update && sudo apt -y install build-essential libz-dev zlib1g-dev zlibc

Install the native image builder:

gu install native-image

And test:

cat <<EOF > HelloWorld.java
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, World!");
    }
}
EOF

javac HelloWorld.java
native-image HelloWorld
./helloworld

As per the simple example above, creating a native image can be extremely straightforward. However, a GraalVM native image behaves very differently to a normal Java application — dynamically loaded dependencies (JNI libraries, the Java Service Loader and reflection) will not, by default, be available to the statically compiled native application.
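
The following minimal sketch (the DynamicLookupDemo class is a hypothetical name used only for this example) shows the kind of code that works on a normal JVM but is invisible to ahead-of-time analysis. The class name is only known at run-time, so the native image builder cannot tell from the code alone which class has to be included and made available for reflection:

```java
class DynamicLookupDemo {
    // Creates an instance of a class whose name is only known at run-time.
    static Object instantiate(String className) throws ReflectiveOperationException {
        return Class.forName(className).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // The name could equally come from a configuration file or a service descriptor.
        String name = args.length > 0 ? args[0] : "java.util.ArrayList";
        System.out.println("Created: " + instantiate(name).getClass().getName());
    }
}
```

On a regular JVM this resolves any class on the classpath. In a native image, a lookup like this would typically only succeed for classes that have been registered for reflection at build time.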

The GpioPerfTest application could be converted into a native image via this command:

native-image --allow-incomplete-classpath --no-fallback \
-cp diozero-sampleapps-1.2.1.jar:diozero-core-1.2.1.jar:tinylog-api-2.2.1.jar:tinylog-impl-2.2.1.jar \
com.diozero.sampleapps.perf.GpioPerfTest

However, the resulting application would not run as a number of key dependencies would be missing and the JNI library would not function correctly.

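Fortunately, the GraalVM native image builder includes a number of options to address this, specifically the following command line flags, each of which points to a JSON configuration file:

  • -H:JNIConfigurationFiles
  • -H:ReflectionConfigurationFiles
  • -H:ResourceConfigurationFiles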

The next problem is knowing what to put in these files. Again, GraalVM has an answer to this — you can run the application as you would normally but with a “native-image-agent” that captures all reflection, JNI and resources that are loaded while the application is running:

$GRAALVM_HOME/bin/java -agentlib:native-image-agent=config-output-dir=config \
-cp diozero-sampleapps-1.2.1.jar:diozero-core-1.2.1.jar:tinylog-api-2.2.1.jar:tinylog-impl-2.2.1.jar \
com.diozero.sampleapps.perf.GpioPerfTest

Note that this will only capture the resources that are accessed by this specific application. As GpioPerfTest only does GPIO output, the agent does not capture the resources required for the other scenarios: I2C, SPI and Serial. I have manually tweaked these files to include all known diozero configurations.

The GpioPerfTest application can now be converted to a static native image:

native-image -H:JNIConfigurationFiles=./config/jni-config.json \
-H:ReflectionConfigurationFiles=./config/reflect-config.json \
-H:ResourceConfigurationFiles=./config/resource-config.json \
-H:+TraceServiceLoaderFeature -H:+ReportExceptionStackTraces \
--allow-incomplete-classpath --no-fallback \
-cp diozero-sampleapps-1.2.1.jar:diozero-core-1.2.1.jar:tinylog-api-2.2.1.jar:tinylog-impl-2.2.1.jar \
com.diozero.sampleapps.perf.GpioPerfTest

Results

Default behaviour with “normal” Java VM:

> java -cp diozero-sampleapps-1.2.1.jar com.diozero.sampleapps.perf.GpioPerfTest 21 50000000
Starting GPIO performance test using GPIO 21 with 50000000 iterations
Duration for 50,000,000 iterations: 1.847s, frequency: 27,070,926 Hz
Duration for 50,000,000 iterations: 1.828s, frequency: 27,352,298 Hz
Duration for 50,000,000 iterations: 1.799s, frequency: 27,793,218 Hz
Duration for 50,000,000 iterations: 1.797s, frequency: 27,824,151 Hz
Duration for 50,000,000 iterations: 1.798s, frequency: 27,808,676 Hz
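
The reported frequency is simply the iteration count divided by the duration. As a quick sanity check of the first line above (a minimal sketch; the ToggleFrequency class is a hypothetical helper, not part of diozero-sampleapps):

```java
import java.util.Locale;

class ToggleFrequency {
    // Frequency in Hz = number of iterations / elapsed time in seconds.
    static long frequencyHz(long iterations, double durationSeconds) {
        return Math.round(iterations / durationSeconds);
    }

    public static void main(String[] args) {
        long hz = frequencyHz(50_000_000L, 1.847);
        // Prints "27,070,926 Hz", matching the first result line.
        System.out.println(String.format(Locale.US, "%,d Hz", hz));
    }
}
```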

With a smaller number of iterations you see the impact of the JVM JIT compiler in the early results:

> java -cp diozero-sampleapps-1.2.1.jar com.diozero.sampleapps.perf.GpioPerfTest 21 500000
Starting GPIO performance test using GPIO 21 with 500000 iterations
Duration for 500,000 iterations: 0.053s, frequency: 9,433,962 Hz
Duration for 500,000 iterations: 0.043s, frequency: 11,627,907 Hz
Duration for 500,000 iterations: 0.020s, frequency: 25,000,000 Hz
Duration for 500,000 iterations: 0.018s, frequency: 27,777,778 Hz
Duration for 500,000 iterations: 0.018s, frequency: 27,777,778 Hz

With the GraalVM JVMCICompiler the speed really ramps up (I haven’t verified the frequency other than by using an LED):

> java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+UseJVMCICompiler -cp diozero-sampleapps-1.2.1.jar com.diozero.sampleapps.perf.GpioPerfTest 21 50000000
Starting GPIO performance test using GPIO 21 with 50000000 iterations
Duration for 50,000,000 iterations: 3.488s, frequency: 14,334,862 Hz
Duration for 50,000,000 iterations: 3.469s, frequency: 14,413,376 Hz
Duration for 50,000,000 iterations: 1.296s, frequency: 38,580,247 Hz
Duration for 50,000,000 iterations: 0.498s, frequency: 100,401,606 Hz
Duration for 50,000,000 iterations: 0.480s, frequency: 104,166,667 Hz

With a smaller number of iterations you see that the GraalVM JVMCICompiler doesn’t get going:

> java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+UseJVMCICompiler -cp diozero-sampleapps-1.2.1.jar com.diozero.sampleapps.perf.GpioPerfTest 21 500000
Starting GPIO performance test using GPIO 21 with 500000 iterations
Duration for 500,000 iterations: 0.112s, frequency: 4,464,286 Hz
Duration for 500,000 iterations: 0.101s, frequency: 4,950,495 Hz
Duration for 500,000 iterations: 0.100s, frequency: 5,000,000 Hz
Duration for 500,000 iterations: 0.100s, frequency: 5,000,000 Hz
Duration for 500,000 iterations: 0.100s, frequency: 5,000,000 Hz

With the GraalVM native image:

> ./com.diozero.sampleapps.perf.gpioperftest 21 10000000
Starting GPIO performance test using GPIO 21 with 10000000 iterations
Duration for 10,000,000 iterations: 1.162s, frequency: 8,605,852 Hz
Duration for 10,000,000 iterations: 1.203s, frequency: 8,312,552 Hz
Duration for 10,000,000 iterations: 1.163s, frequency: 8,598,452 Hz
Duration for 10,000,000 iterations: 1.163s, frequency: 8,598,452 Hz
Duration for 10,000,000 iterations: 1.163s, frequency: 8,598,452 Hz

Native image and smaller number of iterations:

> ./com.diozero.sampleapps.perf.gpioperftest 21 1000000
Starting GPIO performance test using GPIO 21 with 1000000 iterations
Duration for 1,000,000 iterations: 0.153s, frequency: 6,535,948 Hz
Duration for 1,000,000 iterations: 0.117s, frequency: 8,547,009 Hz
Duration for 1,000,000 iterations: 0.116s, frequency: 8,620,690 Hz
Duration for 1,000,000 iterations: 0.116s, frequency: 8,620,690 Hz
Duration for 1,000,000 iterations: 0.116s, frequency: 8,620,690 Hz

A separate application was developed that measured the time to toggle a GPIO on and off over time to see the JVM JIT “warm-up” in action.

The duration (in nanoseconds) for the first 100 toggles only is displayed in the following chart to illustrate the initial “warm-up” period:

The rolling average duration in nanoseconds was also displayed every 500th iteration to track the longer-term JVM JIT “warm-up” period:
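
The measurement approach can be sketched roughly as follows. This is not the application that produced the charts: the WarmUpProbe class is a hypothetical stand-in, it flips a volatile boolean instead of a real GPIO pin so that it runs anywhere, and it interprets the rolling average as the mean over each block of 500 iterations, which is an assumption.

```java
class WarmUpProbe {
    // Stand-in for a GPIO pin; volatile so the writes are not optimised away.
    private static volatile boolean pin;

    static void toggle() {
        pin = true;
        pin = false;
    }

    // Duration in nanoseconds of each of the first `count` toggles.
    static long[] firstToggles(int count) {
        long[] durations = new long[count];
        for (int i = 0; i < count; i++) {
            long start = System.nanoTime();
            toggle();
            durations[i] = System.nanoTime() - start;
        }
        return durations;
    }

    // Average toggle duration in nanoseconds, recorded once per `interval` iterations.
    static double[] blockAverages(int iterations, int interval) {
        double[] averages = new double[iterations / interval];
        long blockTotal = 0;
        for (int i = 1; i <= iterations; i++) {
            long start = System.nanoTime();
            toggle();
            blockTotal += System.nanoTime() - start;
            if (i % interval == 0) {
                averages[i / interval - 1] = (double) blockTotal / interval;
                blockTotal = 0;
            }
        }
        return averages;
    }

    public static void main(String[] args) {
        for (long ns : firstToggles(100)) {
            System.out.println(ns);
        }
        for (double avg : blockAverages(50_000, 500)) {
            System.out.println(avg);
        }
    }
}
```

On a JIT-based JVM the early values would be expected to be noticeably larger than the later ones, although the calls to System.nanoTime() add their own overhead to every sample.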

Conclusion

The results are interesting but mainly as you would expect:

  • Using the JVM JIT provides good performance but takes time to get “warmed up”.
  • The new experimental GraalVM JVMCICompiler in JDK11 appeared to bring significant performance benefits for the longer running test (over 100 MHz once warmed up), but was slower for the shorter test runs (around 5 MHz against up to 27.8 MHz with the default JIT). These results need to be verified by attaching a digital oscilloscope to the GPIO output pin.
  • GraalVM compiled native images provide slower overall performance (around 8.6 MHz against roughly 27.8 MHz for the warmed-up JVM in the runs above), but they are much more consistent and almost completely eliminate startup delays.
