Contents
- How CRaC Works
- Dependency & CRaC JDK
- Taking a Checkpoint
- Custom Resource Hooks
- Spring Boot Auto-Configuration for CRaC
- Docker Integration
- Kubernetes Deployment
- CRaC vs GraalVM vs SnapStart
CRaC uses Linux CRIU (Checkpoint/Restore In Userspace) at its core. The lifecycle has two phases:
- Checkpoint phase: the application starts normally, loads classes, wires the Spring context, and, if it is exercised with warm-up traffic before the snapshot, JIT-compiles its hot paths. At a designated point (triggered via the JVM API or a signal), CRIU freezes all threads and writes the entire process state (heap, stack, file descriptors, sockets, JIT-compiled code) to image files on disk. The process then exits.
- Restore phase: at deployment time, CRIU reads the image files and reconstructs the process in memory. The JVM resumes from exactly where it left off. Startup time is typically 50–200 ms because class loading, JIT compilation, and Spring context initialisation are already baked into the image.
| Aspect | Normal JVM startup | CRaC restore |
| --- | --- | --- |
| Class loading | ✖ Done at startup | ✔ Already in image |
| Spring context init | ✖ Done at startup | ✔ Already in image |
| JIT warm-up | ✖ Happens gradually under load | ✔ Pre-compiled code in image |
| File descriptors / sockets | ✔ Fresh connections | ✖ Must be recreated on restore |
| Startup latency | 5–15 seconds | 50–200 ms |
Unlike GraalVM Native Image, CRaC works with any JVM language and framework — no compilation changes, no reflection configuration, no build-time class analysis. The trade-off is the extra checkpoint step in the build pipeline.
CRaC requires a CRaC-enabled JDK (Azul Zulu builds for Linux x64 are the most commonly used) and the org.crac API library on the classpath. Spring activates its checkpoint/restore lifecycle hooks when it finds that library, so no separate Spring Boot starter is needed.
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- CRaC API: a thin wrapper that delegates to the CRaC JDK at runtime, so it must be
     packaged with the application (default compile scope). Spring Framework documents
     org.crac:crac 1.4.0 or later as the supported library. -->
<dependency>
    <groupId>org.crac</groupId>
    <artifactId>crac</artifactId>
    <version>1.4.0</version>
</dependency>
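# Download Azul Zulu JDK with CRaC support (Linux x64)
curl -L -o zulu-crac.tar.gz \
  https://cdn.azul.com/zulu/bin/zulu21.36.17-ca-crac-jdk21.0.4-linux_x64.tar.gz
# Unpack into ./zulu21-crac, dropping the versioned top-level directory
mkdir -p zulu21-crac
tar xzf zulu-crac.tar.gz -C zulu21-crac --strip-components=1
# Verify CRaC support: a JDK without CRaC rejects the -XX:CRaCCheckpointTo option
./zulu21-crac/bin/java -XX:CRaCCheckpointTo=/tmp/test -version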
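The same check can be made from inside the application, which is handy for guarding checkpoint-only code paths. This is a minimal sketch, assuming that CRaC-enabled builds expose the class jdk.crac.Core (worth confirming for your JDK build). CracProbe and its methods are illustrative names, not part of any library.

```java
/** Illustrative helper (hypothetical name): detects CRaC support by class lookup. */
final class CracProbe {
    private CracProbe() {}

    /** Returns true if the named class can be located, without initialising it. */
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false, CracProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Assumption: CRaC-enabled JDK builds ship jdk.crac.Core and stock builds do not. */
    static boolean runningOnCracJdk() {
        return isClassPresent("jdk.crac.Core");
    }
}
```

Calling CracProbe.runningOnCracJdk() at startup lets the application log a clear message, instead of failing later, when it is launched on a JDK without CRaC.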
A checkpoint can be triggered in two ways: programmatically via the CRaC API inside the application, or externally via the jcmd tool or a UNIX signal. The programmatic approach is simpler and reproducible in build pipelines.
# Run the application with checkpoint output directory specified
java -XX:CRaCCheckpointTo=/path/to/checkpoint \
  -jar target/myapp.jar &
APP_PID=$!   # PID of the JVM just started (jps -q would also list jps itself and other JVMs)
# Wait for application to fully start and reach steady state
sleep 10
# Trigger checkpoint externally using jcmd
jcmd "$APP_PID" JDK.checkpoint
# The JVM writes checkpoint files and exits
# /path/to/checkpoint/ now contains core image files
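A fixed sleep is either too short on a slow build agent or wastes time on a fast one. One alternative is to poll a readiness endpoint before triggering the checkpoint. The sketch below uses only the JDK; ReadinessWaiter is an illustrative name, and the choice of URL, timeout and interval is left to the caller.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Illustrative helper (hypothetical name): waits for readiness instead of sleeping blindly. */
final class ReadinessWaiter {
    private ReadinessWaiter() {}

    /** Polls the probe until it returns true or the timeout elapses. */
    static boolean awaitReady(BooleanSupplier probe, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (probe.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without a successful probe
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
    }

    /** Probe that reports true when an HTTP GET on the URL returns status 200. */
    static BooleanSupplier httpOk(URI url) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(1))
                .build();
        HttpRequest request = HttpRequest.newBuilder(url)
                .timeout(Duration.ofSeconds(2))
                .GET()
                .build();
        return () -> {
            try {
                return client.send(request, HttpResponse.BodyHandlers.discarding())
                        .statusCode() == 200;
            } catch (java.io.IOException e) {
                return false; // connection refused or timed out: not ready yet
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        };
    }
}
```

For a Spring Boot service with Actuator enabled, the probe could be ReadinessWaiter.httpOk(URI.create("http://localhost:8080/actuator/health/readiness")), with jcmd run only once awaitReady returns true.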
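import java.util.Arrays;
import org.crac.Core;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// Programmatic checkpoint: trigger once the application context is fully started
@SpringBootApplication
public class MyApp implements CommandLineRunner {
    private static final Logger log = LoggerFactory.getLogger(MyApp.class);
    public static void main(String[] args) {
        SpringApplication.run(MyApp.class, args);
    }
    @Override
    public void run(String... args) throws Exception {
        // App is fully initialised here, so this is a good time to checkpoint
        if (Arrays.asList(args).contains("--checkpoint")) {
            log.info("Taking CRaC checkpoint...");
            Core.checkpointRestore(); // static method on Core, not on the Context
            // Execution resumes here after restore
            log.info("Restored from checkpoint!");
        }
    }
}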
# Restore from checkpoint — instant startup
java -XX:CRaCRestoreFrom=/path/to/checkpoint
File descriptors, network connections, and any time-sensitive state become stale after restore. Register a Resource hook to close these before checkpoint and re-open them after restore. Spring Boot 3.2+ does this automatically for JDBC pools, caches, and embedded servers.
import org.crac.Context;
import org.crac.Core;
import org.crac.Resource;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;
import org.springframework.stereotype.Component;

@Component
public class RedisConnectionHook implements Resource {
    private static final Logger log = LoggerFactory.getLogger(RedisConnectionHook.class);
    private final LettuceConnectionFactory connectionFactory;

    public RedisConnectionHook(LettuceConnectionFactory connectionFactory) {
        this.connectionFactory = connectionFactory;
        // Register with the global CRaC context
        Core.getGlobalContext().register(this);
    }

    @Override
    public void beforeCheckpoint(Context<? extends Resource> context) {
        // Called just before checkpoint: close connections so they aren't captured in the image
        log.info("CRaC beforeCheckpoint: closing Redis connections");
        // stop() (Spring Data Redis 3.2+) releases connections and, unlike destroy(), allows a restart
        connectionFactory.stop();
    }

    @Override
    public void afterRestore(Context<? extends Resource> context) {
        // Called immediately after restore: re-establish connections with fresh sockets
        log.info("CRaC afterRestore: reconnecting to Redis");
        connectionFactory.start();
    }
}
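Time-sensitive or secret state deserves the same treatment as sockets, because the heap is written into the image files. The sketch below uses a hypothetical CheckpointAware interface as a stand-in for org.crac.Resource so that it compiles without the CRaC library. CachedToken and its fetcher are illustrative; in a real application the class would implement Resource and register itself as in the hook above.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for org.crac.Resource so the sketch needs no CRaC library. */
interface CheckpointAware {
    void beforeCheckpoint();
    void afterRestore();
}

/** Illustrative cache that drops its token before a checkpoint and refetches it lazily. */
final class CachedToken implements CheckpointAware {
    private final Supplier<String> fetcher;
    private String token; // null means "fetch on next use"

    CachedToken(Supplier<String> fetcher) {
        this.fetcher = fetcher;
    }

    /** Returns the cached token, fetching it first if none is held. */
    synchronized String get() {
        if (token == null) {
            token = fetcher.get();
        }
        return token;
    }

    @Override
    public synchronized void beforeCheckpoint() {
        token = null; // drop the reference so the cache does not carry it into the image
    }

    @Override
    public void afterRestore() {
        // Nothing to do here: the next get() fetches a fresh token.
    }
}
```

Fetching lazily after restore, rather than inside afterRestore, keeps the restore path fast and avoids failing the restore when the token service is briefly unavailable.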
Spring Boot 3.2+ builds on Spring Framework 6.1, which adds CRaC-aware lifecycle management. When the org.crac library is on the classpath, Spring registers a checkpoint/restore hook that stops lifecycle-managed components before the checkpoint and restarts them after restore, so the following components need no custom code.
| Component | beforeCheckpoint action | afterRestore action |
| --- | --- | --- |
| HikariCP connection pool | Evict all connections (closes idle + active) | Refill pool with fresh connections |
| Embedded Tomcat / Netty | Pause acceptors (stop accepting new connections) | Resume acceptors |
| Spring Cache (Caffeine) | Clear caches (stale data may be wrong after restore) | Caches refill on demand |
| Scheduled tasks | Cancel pending scheduled executions | Reschedule tasks |
# application.yml: no CRaC-specific properties needed for built-in components
# Spring enables its CRaC support when the org.crac API is on the classpath
spring:
  datasource:
    hikari:
      max-lifetime: 1800000 # 30 min: upper bound on a connection's age before the pool replaces it
  lifecycle:
    timeout-per-shutdown-phase: 30s
The standard pattern is a multi-stage Docker build: a build stage produces the JAR, a "checkpoint" stage runs the app and takes a snapshot, and a "runtime" stage copies the snapshot into a lean image. The runtime image restores from the checkpoint instead of starting the JVM fresh.
# Stage 1: build the JAR
FROM maven:3.9-eclipse-temurin-21 AS build
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline -q
COPY src ./src
RUN mvn package -DskipTests -q

# Stage 2: create CRaC checkpoint (this step needs elevated privileges)
FROM azul/zulu-openjdk-debian:21-crac AS checkpoint
WORKDIR /app
COPY --from=build /app/target/myapp.jar .
# Run app in the background, wait for it to start, trigger checkpoint by PID, wait for the JVM to exit
RUN java -XX:CRaCCheckpointTo=/checkpoint -jar myapp.jar & \
    sleep 15 && \
    jcmd "$!" JDK.checkpoint && \
    wait

# Stage 3: lean runtime image with checkpoint baked in
FROM azul/zulu-openjdk-debian:21-crac
WORKDIR /app
# The restored JVM reopens the JAR, so keep it at the same path as at checkpoint time
COPY --from=checkpoint /app/myapp.jar .
COPY --from=checkpoint /checkpoint /checkpoint
CMD ["java", "-XX:CRaCRestoreFrom=/checkpoint"]
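The checkpoint stage needs elevated privileges so that CRIU can use Linux kernel capabilities such as CAP_SYS_PTRACE. Plain docker build has no --privileged flag, so in practice this step is either executed by a builder configured to allow privileged RUN steps or moved into a docker run --privileged container whose checkpoint directory is then copied into the runtime image. Run it in a trusted CI/CD environment, not on a shared runner. The resulting runtime image does NOT need privileged mode.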
The JVM in a CRaC-restored pod is back up in under 200 ms, which suits Kubernetes horizontal pod autoscaling: once a pod is scheduled and its image pulled, the new replica becomes ready almost instantly during traffic spikes. The runtime container does not need elevated privileges; only the checkpoint creation step does.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: product-service
spec:
  replicas: 3
  selector: # required by apps/v1; must match the pod template labels (label name is illustrative)
    matchLabels:
      app: product-service
  template:
    metadata:
      labels:
        app: product-service
    spec:
      containers:
        - name: product-service
          image: myregistry/product-service:crac-1.0
          # No privileged: true needed for restore; only for checkpoint creation
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "1"
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 1 # CRaC: ready almost immediately
            periodSeconds: 2
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 2
            periodSeconds: 10
---
# HPA: CRaC makes scale-out fast enough for bursty traffic
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: product-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: product-service
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
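With restore times this short, the probe schedule rather than the JVM tends to dominate time-to-ready. The sketch below is an idealised model (first probe at initialDelaySeconds, later probes every periodSeconds, no jitter, success threshold of 1), not a description of exact kubelet behaviour. ProbeTiming is an illustrative name.

```java
/** Idealised model of when a readiness probe first succeeds (illustrative, no kubelet jitter). */
final class ProbeTiming {
    private ProbeTiming() {}

    /**
     * @param appReadyMillis     time after container start at which the app can answer the probe
     * @param initialDelayMillis the probe's initialDelaySeconds, in milliseconds
     * @param periodMillis       the probe's periodSeconds, in milliseconds
     * @return time of the first probe that fires at or after appReadyMillis
     */
    static long firstReadyMillis(long appReadyMillis, long initialDelayMillis, long periodMillis) {
        if (periodMillis <= 0) {
            throw new IllegalArgumentException("periodMillis must be positive");
        }
        if (appReadyMillis <= initialDelayMillis) {
            return initialDelayMillis; // the very first probe already succeeds
        }
        long late = appReadyMillis - initialDelayMillis;
        long probesNeeded = (late + periodMillis - 1) / periodMillis; // ceiling division
        return initialDelayMillis + probesNeeded * periodMillis;
    }
}
```

Under this model, with the readiness settings above (1 s delay, 2 s period), a 200 ms restore is marked ready at 1000 ms, while a 10 s cold start is marked ready at 11000 ms. Lowering initialDelaySeconds further is what lets the pod benefit fully from the fast restore.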
CRaC is one of three approaches to solving the JVM cold start problem. Each makes different trade-offs.
| Aspect | CRaC | GraalVM Native Image | AWS SnapStart |
| --- | --- | --- | --- |
| Cold start | 50–200 ms | 10–100 ms | <1 s |
| Peak throughput | Full JIT — same as JVM | Lower (AOT, no JIT) | Full JIT |
| Build complexity | Medium (checkpoint step) | High (native build, config) | Low (SAM flag) |
| Framework compatibility | High — any JVM code | Medium — reflection/proxies need config | High — any JVM code |
| OS requirement | Linux only (CRIU) | Linux, macOS, Windows | Lambda managed runtime (Linux) |
| Cloud portability | Any cloud / on-prem | Any cloud / on-prem | AWS Lambda only |
| Best for | K8s auto-scaling, fast rollouts | Ultra-low memory, CLI tools | Lambda serverless only |