
Virtual threads are lightweight threads introduced by Project Loom and finalized as JEP 444 in Java 21. Unlike traditional platform threads, which map one-to-one with OS threads, virtual threads are managed entirely by the JVM. The JVM multiplexes potentially millions of virtual threads onto a small pool of OS-level carrier threads.

Before virtual threads, scaling a thread-per-request server meant either limiting concurrency to a few thousand OS threads or rewriting everything with reactive/async APIs. Virtual threads eliminate that tradeoff: you write straightforward blocking code, and the JVM handles the scheduling transparently.

```java
/*
The concurrency landscape before and after virtual threads:

Before (Java < 21):
─────────────────────────────────────────────────────────
Option A: Thread-per-request with platform threads
  → Simple code, but limited to ~2,000–5,000 concurrent threads
  → Each thread consumes ~1 MB of stack memory

Option B: Reactive / async (CompletableFuture, RxJava, WebFlux)
  → Scales well, but code becomes callback spaghetti
  → Stack traces are useless, debugging is painful

After (Java 21+):
─────────────────────────────────────────────────────────
Virtual threads: Thread-per-request that actually scales
  → Simple blocking code (looks like Option A)
  → Millions of concurrent threads (scales like Option B)
  → Each virtual thread costs ~1 KB of memory
  → Full stack traces, easy debugging
─────────────────────────────────────────────────────────
*/
```

Java 21 provides several ways to create and run virtual threads. The simplest is Thread.startVirtualThread(), which creates and immediately starts a virtual thread. For more control, use the builder API Thread.ofVirtual(). For production workloads, Executors.newVirtualThreadPerTaskExecutor() gives you an executor that spawns a new virtual thread for every submitted task.

```java
// 1. Quick one-liner — create and start immediately
Thread vt = Thread.startVirtualThread(() -> {
    System.out.println("Hello from virtual thread: " + Thread.currentThread());
});
vt.join();

// 2. Builder API — name, uncaught exception handler, etc.
Thread named = Thread.ofVirtual()
        .name("worker-", 0)   // names: worker-0, worker-1, ...
        .uncaughtExceptionHandler((t, e) ->
                System.err.println(t.getName() + " failed: " + e.getMessage()))
        .start(() -> {
            System.out.println("Running on: " + Thread.currentThread().getName());
        });
named.join();

// 3. Unstarted thread — useful when you need the Thread reference first
Thread unstarted = Thread.ofVirtual()
        .name("lazy-worker")
        .unstarted(() -> doWork());
unstarted.start();   // start later
unstarted.join();
```

For production server code, the executor-based approach is usually the best fit. It integrates with existing code that expects an ExecutorService and ensures proper lifecycle management with try-with-resources.

```java
// 4. Virtual-thread-per-task executor — the production workhorse
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    // Submit 100,000 tasks — each gets its own virtual thread
    List<Future<String>> futures = new ArrayList<>();
    for (int i = 0; i < 100_000; i++) {
        final int taskId = i;
        futures.add(executor.submit(() -> {
            Thread.sleep(Duration.ofSeconds(1));   // simulates I/O
            return "Result-" + taskId;
        }));
    }
    // Collect results
    for (Future<String> f : futures) {
        f.get();   // blocks, but that's fine — we're on a virtual thread too
    }
}
System.out.println("All 100,000 tasks completed");
```

Thread.currentThread().isVirtual() returns true on a virtual thread. Use it to verify your code is actually running on a virtual thread when needed.

Understanding the differences between virtual and platform threads helps you decide when and where to use each. Platform threads are still the right choice for long-running CPU-bound work, while virtual threads shine for I/O-bound tasks with high concurrency.

```java
/*
Virtual Threads vs Platform Threads — Side by Side
═══════════════════════════════════════════════════════════════════
Property              Platform Thread           Virtual Thread
───────────────────────────────────────────────────────────────────
Managed by            OS kernel                 JVM runtime
Stack size            ~1 MB (default)           ~1 KB (grows as needed)
Creation cost         Expensive (~1 ms)         Cheap (~1 μs)
Max practical count   ~5,000 per JVM            Millions per JVM
Scheduling            OS preemptive             JVM cooperative (yield at blocking)
Blocking I/O          Blocks OS thread          Releases carrier thread
CPU-bound work        Good (OS time-slicing)    Avoid (starves carrier pool)
Thread pooling        Essential                 Unnecessary (create new ones)
ThreadLocal           Works (but leaky)         Works (prefer ScopedValue)
synchronized          Works                     Works (but may pin — use locks)
═══════════════════════════════════════════════════════════════════
Rule of thumb:
→ I/O-bound tasks (HTTP, DB, file I/O)  → Virtual threads
→ CPU-bound tasks (crypto, compression) → Platform threads
→ Mixed workload → Virtual threads for I/O, platform threads for CPU
*/
```

```java
// Quick benchmark: creating 100,000 threads
// Platform threads — takes seconds and may crash with OutOfMemoryError
// Virtual threads — completes in milliseconds
long start = System.nanoTime();
List<Thread> threads = new ArrayList<>();
for (int i = 0; i < 100_000; i++) {
    threads.add(Thread.startVirtualThread(() -> {
        try {
            Thread.sleep(Duration.ofSeconds(1));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }));
}
for (Thread t : threads) {
    t.join();
}
long elapsed = Duration.ofNanos(System.nanoTime() - start).toMillis();
System.out.println("100K virtual threads completed in " + elapsed + " ms");
// Typical output: 100K virtual threads completed in ~1050 ms
```

Virtual threads use a continuation-based scheduling model. When a virtual thread performs a blocking operation (I/O, sleep, lock acquisition), the JVM unmounts it from its carrier thread, saving its stack as a heap-allocated continuation. The carrier thread is then free to run another virtual thread. When the blocking operation completes, the JVM mounts the virtual thread back onto an available carrier.

The carrier thread pool is a ForkJoinPool managed by the JVM. By default, its parallelism equals the number of available processors. You can tune it with the system property -Djdk.virtualThreadScheduler.parallelism=N.
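One rough way to see the carrier pool at work is to record which carrier each virtual thread is mounted on. The sketch below (my own illustration, not from the original) parses the carrier name out of the thread's toString() output, which is an implementation detail rather than a stable API:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class CarrierObserver {
    public static void main(String[] args) throws InterruptedException {
        // Each virtual thread records the carrier it is currently mounted on.
        // toString() looks like: VirtualThread[#23]/runnable@ForkJoinPool-1-worker-2
        Set<String> carriers = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[1_000];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = Thread.startVirtualThread(() -> {
                String s = Thread.currentThread().toString();
                int at = s.lastIndexOf('@');
                if (at >= 0) carriers.add(s.substring(at + 1));
            });
        }
        for (Thread t : threads) t.join();
        // With the default scheduler, this should not exceed the processor count
        System.out.println("Distinct carriers observed: " + carriers.size());
    }
}
```

Running the same program with -Djdk.virtualThreadScheduler.parallelism=2 should cap the observed carrier count at 2.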

```java
/*
How virtual threads are scheduled on carrier threads:
─────────────────────────────────────────────────────────────────
Carrier threads (OS threads managed by ForkJoinPool):
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Carrier-0│ │ Carrier-1│ │ Carrier-2│ │ Carrier-3│
└──────────┘ └──────────┘ └──────────┘ └──────────┘

Virtual threads (millions possible):
VT-1: [run]→[blocked on I/O → unmount]→[I/O done → remount]→[run]→[done]
VT-2: [run]→[sleep → unmount]→[wake → remount]→[run]→[done]
VT-3: [run]→[blocked on lock → unmount]→[lock acquired → remount]→[done]
...
VT-N: [waiting to mount]→[run]→[done]

Timeline on Carrier-0:
┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐
│ VT-1│ │ VT-5│ │VT-12│ │ VT-1│ │VT-99│  ← different VTs over time
└─────┘ └─────┘ └─────┘ └─────┘ └─────┘

Key: When VT-1 blocks, it unmounts. Carrier-0 picks up VT-5.
When VT-1's I/O completes, it remounts on any available carrier.
─────────────────────────────────────────────────────────────────
*/
```

```java
// Observing mounting and unmounting in practice
Thread.startVirtualThread(() -> {
    try {
        // Carrier info is visible via thread toString()
        System.out.println("Before sleep: " + Thread.currentThread());
        // Output: VirtualThread[#21]/runnable@ForkJoinPool-1-worker-1
        Thread.sleep(Duration.ofMillis(100));   // unmounts from carrier
        System.out.println("After sleep: " + Thread.currentThread());
        // Output: VirtualThread[#21]/runnable@ForkJoinPool-1-worker-3
        // Notice: may be on a DIFFERENT carrier after remounting!
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}).join();
```

The continuation (saved stack) of a virtual thread lives on the heap and is garbage-collected when the virtual thread terminates. This is why virtual threads use so little memory compared to platform threads, whose stacks are allocated in native memory.

The entire point of virtual threads is that blocking operations become cheap. When a virtual thread calls Thread.sleep(), reads from a Socket, or makes a JDBC query, it unmounts from the carrier thread instead of holding onto an expensive OS thread. This means your simple, sequential, blocking code can handle massive concurrency.

```java
// A simple HTTP server handling 100K concurrent connections
// using virtual threads — each connection gets its own thread
import java.io.*;
import java.net.ServerSocket;
import java.net.Socket;
import java.time.Duration;
import java.util.concurrent.Executors;

public class VirtualThreadHttpServer {

    public static void main(String[] args) throws Exception {
        try (var serverSocket = new ServerSocket(8080);
             var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            System.out.println("Server listening on port 8080");
            while (true) {
                Socket client = serverSocket.accept();
                executor.submit(() -> handleRequest(client));
            }
        }
    }

    static void handleRequest(Socket client) {
        try (client;
             var reader = new BufferedReader(new InputStreamReader(client.getInputStream()));
             var writer = new PrintWriter(client.getOutputStream(), true)) {

            // Read the HTTP request line — blocking call, but cheap on a virtual thread
            String requestLine = reader.readLine();

            // Simulate a slow database query (500ms)
            Thread.sleep(Duration.ofMillis(500));
            // Simulate calling an external API (200ms)
            Thread.sleep(Duration.ofMillis(200));

            // Write response
            String body = "{\"status\":\"ok\",\"thread\":\"" + Thread.currentThread() + "\"}";
            writer.println("HTTP/1.1 200 OK");
            writer.println("Content-Type: application/json");
            writer.println("Content-Length: " + body.length());
            writer.println();
            writer.print(body);
        } catch (Exception e) {
            System.err.println("Error handling request: " + e.getMessage());
        }
    }
}
```

The same pattern applies to database access. Blocking JDBC calls work naturally with virtual threads — each query runs on its own virtual thread, and the carrier threads are freed during I/O waits.

```java
// Database access with virtual threads — no reactive driver needed
public class UserRepository {

    private final DataSource dataSource;

    public UserRepository(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public User findById(long id) {
        // All blocking calls here — getConnection(), executeQuery(), ResultSet.next()
        // Virtual thread unmounts during each blocking I/O operation
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT id, name, email FROM users WHERE id = ?")) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    return new User(rs.getLong("id"),
                            rs.getString("name"),
                            rs.getString("email"));
                }
            }
        } catch (SQLException e) {
            throw new RuntimeException("Failed to fetch user " + id, e);
        }
        return null;
    }
}
```

```java
// Fetch 10,000 users concurrently — each on its own virtual thread
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    List<Future<User>> futures = LongStream.rangeClosed(1, 10_000)
            .mapToObj(id -> executor.submit(() -> repo.findById(id)))
            .toList();

    List<User> users = futures.stream()
            .map(f -> {
                try {
                    return f.get();
                } catch (Exception e) {
                    return null;
                }
            })
            .filter(Objects::nonNull)
            .toList();

    System.out.println("Fetched " + users.size() + " users concurrently");
}
```

Virtual threads do not magically speed up CPU-bound work. If your task is purely computational (no I/O), a virtual thread offers no benefit over a platform thread. The performance gains come from freeing carrier threads during blocking I/O.
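For mixed workloads, the rule of thumb above suggests combining both thread types. Here is a minimal sketch of that pattern: the virtual thread does the blocking I/O and delegates the compute-heavy step to a small platform pool. The helper methods (fetchFromNetwork, expensiveHash) are stand-ins invented for this example:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MixedWorkload {
    // Fixed platform-thread pool sized to the CPU count for compute-heavy steps
    private static final ExecutorService cpuPool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    public static void main(String[] args) throws Exception {
        try (var ioExecutor = Executors.newVirtualThreadPerTaskExecutor()) {
            Future<String> result = ioExecutor.submit(() -> {
                byte[] data = fetchFromNetwork();   // blocking I/O, cheap on a virtual thread
                // Hand the CPU-heavy step to the platform pool; blocking on
                // the Future is cheap because we are on a virtual thread
                Future<String> digest = cpuPool.submit(() -> expensiveHash(data));
                return digest.get();
            });
            System.out.println("digest = " + result.get());
        } finally {
            cpuPool.shutdown();
        }
    }

    // Stand-in for a network call
    static byte[] fetchFromNetwork() throws InterruptedException {
        Thread.sleep(50);
        return "payload".getBytes();
    }

    // Stand-in for heavy computation (a toy rolling hash)
    static String expensiveHash(byte[] data) {
        long h = 1125899906842597L;
        for (int i = 0; i < 1_000_000; i++) {
            h = 31 * h + data[i % data.length];
        }
        return Long.toHexString(h);
    }
}
```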

Pinning occurs when a virtual thread cannot unmount from its carrier thread during a blocking operation. This defeats the purpose of virtual threads because the carrier is held hostage. The most common cause is blocking inside a synchronized block or method: in Java 21 the JVM cannot unmount a virtual thread that holds a monitor lock (a limitation that JEP 491 removes in JDK 24).

```java
// BAD — synchronized pins the virtual thread to its carrier
public class PinningExample {

    private final Object lock = new Object();

    public void doWork() throws InterruptedException {
        synchronized (lock) {
            // Virtual thread is PINNED here — carrier thread is blocked
            Thread.sleep(Duration.ofSeconds(1));   // carrier cannot be reused!
            callExternalApi();                     // still pinned
        }
    }
}

// GOOD — ReentrantLock allows unmounting during blocking operations
public class NoPinningExample {

    private final ReentrantLock lock = new ReentrantLock();

    public void doWork() throws InterruptedException {
        lock.lock();
        try {
            // Virtual thread can unmount here — carrier thread is freed
            Thread.sleep(Duration.ofSeconds(1));   // carrier is reused!
            callExternalApi();                     // not pinned
        } finally {
            lock.unlock();
        }
    }
}
```

To detect pinning in your application, use the JVM diagnostic flag -Djdk.tracePinnedThreads=full. This prints a stack trace whenever a virtual thread is pinned, helping you identify synchronized blocks that need to be converted to ReentrantLock.

```java
/*
Detecting pinned threads:
─────────────────────────────────────────────────────────────
JVM flag: -Djdk.tracePinnedThreads=full

Output when pinning occurs:
Thread[#25,ForkJoinPool-1-worker-1,5,CarrierThreads]
    com.example.PinningExample.doWork(PinningExample.java:8) <== monitors:1
    java.lang.VirtualThread.parkOnCarrierThread(VirtualThread.java:187)

Options:
-Djdk.tracePinnedThreads=full  → full stack trace
-Djdk.tracePinnedThreads=short → one-line summary

Common sources of pinning:
─────────────────────────────────────────────────────────────
1. Your own synchronized blocks → Replace with ReentrantLock
2. Third-party libraries        → Check for updated versions
3. JDK internals (rare)         → Usually fixed in newer JDK releases
4. Native methods (JNI)         → Always pin, no workaround
─────────────────────────────────────────────────────────────
*/
```

Not all synchronized usage causes problems. Short synchronized blocks that do not perform I/O (e.g., incrementing a counter) are fine — pinning only matters when the virtual thread would block inside the synchronized region.
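To make that distinction concrete, here is a small sketch (my own illustration) contrasting a harmless synchronized counter with a variant that blocks while holding the monitor:

```java
public class Counters {

    private long count = 0;
    private final Object lock = new Object();

    // Fine on virtual threads: the monitor is held only briefly,
    // and nothing inside the block can park or perform I/O
    public void increment() {
        synchronized (lock) {
            count++;
        }
    }

    // Problematic on virtual threads: sleeping while holding the
    // monitor pins the carrier for the entire 100 ms
    public void slowUpdate() throws InterruptedException {
        synchronized (lock) {
            Thread.sleep(100);
            count++;
        }
    }

    public long get() {
        synchronized (lock) {
            return count;
        }
    }
}
```

Only slowUpdate() needs rewriting with a ReentrantLock; increment() and get() can stay as they are.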

Structured concurrency (JEP 453, a preview API in Java 21) pairs naturally with virtual threads. StructuredTaskScope ensures that concurrent subtasks are treated as a unit — if one fails, the others are cancelled. This prevents thread leaks and makes error handling predictable.

```java
import java.util.concurrent.StructuredTaskScope;
import java.util.concurrent.StructuredTaskScope.Subtask;

// ShutdownOnFailure — cancel everything if any subtask fails
public record OrderDetails(Order order, User user, Inventory inventory) {}

public OrderDetails fetchOrderDetails(long orderId) throws Exception {
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        // Each fork() runs on its own virtual thread
        Subtask<Order> orderTask = scope.fork(() -> orderService.findById(orderId));
        Subtask<User> userTask = scope.fork(() -> userService.findByOrderId(orderId));
        Subtask<Inventory> inventoryTask = scope.fork(() -> inventoryService.check(orderId));

        // Wait for all subtasks to complete (or one to fail)
        scope.join();
        // Propagate the first exception if any subtask failed
        scope.throwIfFailed();

        // All succeeded — combine results
        return new OrderDetails(
                orderTask.get(),
                userTask.get(),
                inventoryTask.get()
        );
    }   // scope.close() cancels any still-running subtasks
}
```

ShutdownOnSuccess is the opposite pattern: it returns as soon as the first subtask succeeds, cancelling the rest. This is useful for racing multiple strategies — for example, querying a cache and a database simultaneously.

```java
// ShutdownOnSuccess — return the first successful result
public User findUserFast(long userId) throws Exception {
    try (var scope = new StructuredTaskScope.ShutdownOnSuccess<User>()) {
        // Race: cache vs database — whoever answers first wins
        scope.fork(() -> cache.getUser(userId));       // might be instant
        scope.fork(() -> database.findUser(userId));   // slower but reliable

        scope.join();
        // Returns the first successful result; the other is cancelled
        return scope.result();
    }
}
```

Structured concurrency enforces that child threads cannot outlive their parent scope. When the try block exits, close() is called automatically, which interrupts any subtasks that are still running. This eliminates the risk of orphaned threads.

Migrating from thread pools to virtual threads is usually straightforward. The key insight is that virtual threads are cheap and disposable — you do not pool them. Here is a step-by-step guide for common migration scenarios.

```java
// BEFORE: Fixed thread pool — limits concurrency to 200 threads
ExecutorService executor = Executors.newFixedThreadPool(200);

// AFTER: Virtual thread executor — unlimited concurrency
ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();

// The rest of your code stays the same!
executor.submit(() -> handleRequest(request));
```

While the executor swap is simple, there are several things to watch out for during migration. The following checklist covers the most common pitfalls.

```java
/*
Migration Checklist — Thread Pools to Virtual Threads
═══════════════════════════════════════════════════════════════════
✅ DO:
───────────────────────────────────────────────────────────────────
• Replace Executors.newFixedThreadPool(N) with
  Executors.newVirtualThreadPerTaskExecutor()
• Replace synchronized blocks (that contain I/O) with ReentrantLock
• Use try-with-resources for the executor:
  try (var exec = Executors.newVirtualThreadPerTaskExecutor()) { ... }
• Keep blocking I/O code as-is — it works great on virtual threads
• Use ScopedValue instead of ThreadLocal for request-scoped data
• Use semaphores to limit concurrent access to scarce resources:
  Semaphore dbPool = new Semaphore(50);   // limit DB connections

❌ DON'T:
───────────────────────────────────────────────────────────────────
• Don't pool virtual threads — create a new one for each task
• Don't use ThreadLocal for object pooling (e.g., SimpleDateFormat)
  — millions of virtual threads means millions of pooled objects
• Don't run CPU-intensive work on virtual threads — use platform
  threads for crypto, compression, or heavy computation
• Don't assume thread identity — a virtual thread may run on
  different carrier threads over its lifetime
• Don't use Thread.yield() — it has no useful effect on virtual threads
═══════════════════════════════════════════════════════════════════
*/
```
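The ScopedValue replacement for ThreadLocal mentioned above can be sketched as follows. Note that ScopedValue is itself a preview API in Java 21 (JEP 446), so this sketch requires compiling and running with --enable-preview; the REQUEST_ID name is invented for the example:

```java
public class RequestContext {

    // Immutable, request-scoped context value replacing a ThreadLocal.
    // Bound only for the dynamic extent of run() — no cleanup or remove() needed.
    private static final ScopedValue<String> REQUEST_ID = ScopedValue.newInstance();

    public static void main(String[] args) throws InterruptedException {
        Thread vt = Thread.startVirtualThread(() ->
                ScopedValue.where(REQUEST_ID, "req-42").run(RequestContext::handle));
        vt.join();
    }

    static void handle() {
        // Readable anywhere within the binding's dynamic extent
        System.out.println("Handling " + REQUEST_ID.get());
    }
}
```

Unlike a ThreadLocal, the binding cannot leak: once run() returns, REQUEST_ID is unbound again, which matters when millions of short-lived virtual threads come and go.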

When your application needs to limit concurrency (e.g., to avoid overwhelming a database connection pool), use a Semaphore instead of limiting the thread pool size. With virtual threads, the thread count is not the right knob for controlling resource usage.

```java
// Limiting concurrency to scarce resources with a Semaphore
public class RateLimitedService {

    // Only allow 50 concurrent database connections
    private final Semaphore dbPermits = new Semaphore(50);
    private final DataSource dataSource;

    public RateLimitedService(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public String queryDatabase(String sql) throws Exception {
        dbPermits.acquire();   // virtual thread unmounts while waiting
        try (Connection conn = dataSource.getConnection();
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            return rs.next() ? rs.getString(1) : null;
        } finally {
            dbPermits.release();
        }
    }
}
```

Spring Boot 3.2+ supports virtual threads out of the box. Set spring.threads.virtual.enabled=true in your application.properties and that single property switches Tomcat to virtual threads for all request handling — your entire web layer uses them automatically, with no other code changes needed.