Java Networking with Netty

When building networked applications in Java, it helps to understand how TCP, HTTP, Netty, and gRPC fit together. This article walks through the layers of the stack, how request/response boundaries work, and how frameworks like Netty and gRPC help you build fast systems without wrestling with raw bytes.

Transport Layer

At its core, TCP (Transmission Control Protocol) is a reliable, ordered stream of bytes. A TCP connection is easy to picture as two pipes: one from the client to the server and one from the server back to the client. Each pipe just carries bytes.

TCP does NOT know about request or response boundaries. It delivers bytes in order, and it takes care of retransmissions, acknowledgements, and flow control so you do not have to. Following our pipe analogy, when you read from a TCP connection, imagine placing a bucket under the pipe; bytes drip in and you accumulate them until you have enough to parse a complete message. That "enough" part is defined by the protocol on top of TCP, not by TCP itself.

Key takeaway: TCP is just a constant stream of raw bytes. The request and response boundaries are defined by the application layer, not by TCP itself.
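To make the bucket analogy concrete, here is a minimal sketch of application-level framing in plain Java. It assumes a hypothetical protocol where every message is a 4-byte length prefix followed by that many payload bytes; the class and the protocol are illustrative, not part of any library:

import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Reads one message from a TCP stream using a hypothetical length-prefixed
// protocol: a 4-byte length followed by that many payload bytes. The message
// boundary comes from our protocol, not from TCP.
public class LengthPrefixedReader {

    public static byte[] readMessage(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();   // accumulate until the 4-byte prefix has arrived
        byte[] payload = new byte[length];
        data.readFully(payload);       // accumulate until the whole message has arrived
        return payload;
    }
}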

Netty and TCP

Netty is a Java library that gives you a higher-level API around TCP without making you manage threads, selectors, or connection lifecycles by hand.

At the transport layer abstraction level, Netty handles:

  • Connection management: handling opens, closes, pooling, and failures.

  • Non-blocking I/O and event loops: letting you serve many connections without one thread per connection.

  • Pipelines and handlers: letting you register callbacks for events such as data received or connection closed.

With Netty, you think in terms of protocols, messages, and callbacks instead of byte buffers and sockets.
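As a sketch of what this callback style looks like at the raw TCP level, here is a minimal echo handler (the class name is illustrative). It would be installed in a pipeline via a ChannelInitializer, just like the HTTP example further below:

import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;

// A minimal sketch of Netty's callback style at the TCP level: channelRead
// fires whenever bytes arrive on the connection, with no framing applied yet.
public class EchoHandler extends ChannelInboundHandlerAdapter {

    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) {
        ByteBuf in = (ByteBuf) msg;
        ctx.writeAndFlush(in); // echo the bytes back; Netty releases the buffer after the write
    }

    @Override
    public void channelInactive(ChannelHandlerContext ctx) {
        System.out.println("Connection closed: " + ctx.channel().remoteAddress());
    }
}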

Application Layer

The application layer is where we define the message boundaries, i.e., the framing. HTTP is a good example of this. Below we see how different versions of HTTP work on top of TCP.

HTTP Versions

HTTP/1.0 (default)

TCP SYN       ------------------>
              <------------------  SYN-ACK
ACK           ------------------>
                                   TCP ESTABLISHED

HTTP REQUEST  ---------------->    (End = TCP close)
HTTP RESPONSE <----------------    (End = TCP close)

TCP FIN       ------------------>
              <------------------  FIN-ACK
ACK           ------------------>
                                   TCP CLOSED

End of request/response: TCP connection close

With HTTP/1.0, the end of a message is signaled by closing the connection. This works but is expensive: one connection per request/response.

HTTP/1.0 + keep-alive

HTTP REQUEST #1  --------------> (End = Content-Length)
HTTP RESPONSE #1 <-------------- (End = Content-Length)
HTTP REQUEST #2  --------------> (End = Content-Length)
HTTP RESPONSE #2 <-------------- (End = Content-Length)

End of request/response: Content-Length header

This adds persistent connections so you can reuse a TCP connection for multiple requests. The Content-Length header tells the receiver how many bytes belong to each message. Note that not every client or server supported this extension, and chunked encoding was still not available.

HTTP/1.1 (persistent by default)

HTTP REQUEST #1  --------------> (End = Content-Length / chunked)
HTTP RESPONSE #1 <-------------- (End = Content-Length / chunked)
HTTP REQUEST #2  --------------> (End = Content-Length / chunked)
HTTP RESPONSE #2 <-------------- (End = Content-Length / chunked)

When a sender does not know the full body size in advance, it can send chunks. The receiver knows the message ends when it sees a chunk with size 0.

4\r\n
Wiki\r\n
5\r\n
pedia\r\n
0\r\n
\r\n

HTTP/1.1 with Connection: close

We can still use TCP close to signal end of request/response like before.
HTTP REQUEST          ----------------> (End = TCP close)
HTTP RESPONSE (close) <---------------- (End = TCP close)

HTTP/2 (single TCP, multiplexed, interleaved responses)

STREAM 1 (ID 1, client): HEADERS -----------> (GET /index.html)
STREAM 2 (ID 3, client): HEADERS -----------> (POST /submit)
STREAM 3 (ID 5, client): HEADERS -----------> (GET /style.css)
STREAM 2: DATA ---------------> (request body, chunk 1)
STREAM 1: DATA <--------------- (response body, End response = END_STREAM)
STREAM 3: DATA <--------------- (response body, End response = END_STREAM)
STREAM 2: DATA ---------------> (request body, chunk 2, End request = END_STREAM)
STREAM 2: DATA <--------------- (response body, chunk 1)
STREAM 2: DATA <--------------- (response body, chunk 2, End response = END_STREAM)

In HTTP/2, each stream has a numeric stream ID. The rule is simple:

  • Client-initiated streams have odd IDs: 1, 3, 5, …

  • Server-initiated streams have even IDs: 2, 4, 6, …

Stream IDs help the connection track which frames belong to which logical stream. Multiplexing lets multiple streams share a single TCP connection. Interleaving allows faster responses to arrive before slower ones, removing HTTP-level head-of-line blocking. The END_STREAM flag signals when a request or response is complete.

Netty and HTTP

At the application layer, Netty provides protocol-level building blocks. It includes decoders and encoders for HTTP, so you can think in terms of “HTTP request received” or “send HTTP response” instead of “parse bytes from a buffer.” Instead of registering callbacks for raw bytes, you register them at a higher level of abstraction, for example when a complete HTTP request arrives, and Netty handles the details of HTTP framing in both directions, so you never write raw bytes yourself.

The following shows an example HTTP server with Netty:

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.*;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.codec.http.*;

public class MinimalHttpServer {

    public static void main(String[] args) throws InterruptedException {
        int port = 8080;

        // EventLoopGroup handles I/O operations (boss = accepts connections, worker = handles traffic)
        EventLoopGroup bossGroup = new NioEventLoopGroup(1);
        EventLoopGroup workerGroup = new NioEventLoopGroup();

        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(bossGroup, workerGroup)
             .channel(NioServerSocketChannel.class)
             .childHandler(new ChannelInitializer<SocketChannel>() {
                 @Override
                 protected void initChannel(SocketChannel ch) {
                     ChannelPipeline p = ch.pipeline();
                     p.addLast(new HttpServerCodec());           // HTTP encoder/decoder
                     p.addLast(new HttpObjectAggregator(65536)); // Aggregate HTTP message fragments
                     p.addLast(new SimpleHttpHandler());         // Your handler
                 }
             });

            ChannelFuture f = b.bind(port).sync();
            System.out.println("Server started at http://127.0.0.1:" + port);
            f.channel().closeFuture().sync();
        } finally {
            bossGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }

    static class SimpleHttpHandler extends SimpleChannelInboundHandler<FullHttpRequest> {
        @Override
        protected void channelRead0(ChannelHandlerContext ctx, FullHttpRequest msg) {
            FullHttpResponse response = new DefaultFullHttpResponse(
                    HttpVersion.HTTP_1_1,
                    HttpResponseStatus.OK,
                    ctx.alloc().buffer().writeBytes("Hello, World".getBytes()));
            response.headers().set(HttpHeaderNames.CONTENT_TYPE, "text/plain");
            response.headers().set(HttpHeaderNames.CONTENT_LENGTH, response.content().readableBytes());
            ctx.writeAndFlush(response).addListener(ChannelFutureListener.CLOSE);
        }
    }
}

Note that with a ChannelPipeline, we register a chain of handlers that are called in order for inbound events. HttpServerCodec handles the HTTP encoding and decoding. HttpObjectAggregator does the aggregation, since HTTP messages can arrive in fragments: headers arrive first as an HttpRequest, and the body may follow in one or more HttpContent objects. For large requests or chunked transfers, the message is split into many pieces. Without aggregation, your handler would need to process each fragment individually and assemble the full message itself.

Netty and gRPC

gRPC is an RPC framework that runs on top of HTTP/2. It uses HTTP/2 streams to carry messages encoded with Protocol Buffers (protobuf), and it supports streaming in both directions.

In Java, gRPC uses Netty as the HTTP/2 transport. Netty handles:

  • TCP connections

  • HTTP/2 framing and streams

  • Non-blocking I/O

On top of that, gRPC adds:

  • RPC semantics (methods, services)

  • Message serialization (protobuf)

  • Streaming abstractions

Together, they provide efficient, multiplexed, binary communication for client-server systems.
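As a sketch of how this layering looks in code, the service below assumes the usual stubs (GreeterGrpc, HelloRequest, HelloReply) generated by protoc from a hypothetical greeter.proto; in Java, gRPC uses its Netty-based transport under the hood. This is an illustration, not a complete project:

import io.grpc.Server;
import io.grpc.ServerBuilder;
import io.grpc.stub.StreamObserver;

// GreeterGrpc, HelloRequest, and HelloReply are assumed to be generated by
// protoc from a hypothetical greeter.proto.
public class GreeterServer extends GreeterGrpc.GreeterImplBase {

    @Override
    public void sayHello(HelloRequest request, StreamObserver<HelloReply> responseObserver) {
        HelloReply reply = HelloReply.newBuilder()
                .setMessage("Hello, " + request.getName())
                .build();
        responseObserver.onNext(reply); // one protobuf message on the HTTP/2 stream
        responseObserver.onCompleted(); // completes the stream (END_STREAM on the wire)
    }

    public static void main(String[] args) throws Exception {
        Server server = ServerBuilder.forPort(50051) // Netty transport when grpc-netty is on the classpath
                .addService(new GreeterServer())
                .build()
                .start();
        server.awaitTermination();
    }
}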

Multiplexing

Multiplexing simply means sharing a resource between multiple uses instead of dedicating a resource to each use. People mean different things when they talk about multiplexing at different layers.

At the transport layer, the shared resource is the thread handling the requests. At this layer, multiplexing usually refers to handling many TCP connections efficiently on a small number of threads, as in Java NIO or Netty. A single thread monitors multiple channels with a selector and reacts when data is ready to read or write, avoiding one thread per connection. Transport-layer multiplexing is like a receptionist watching many phones and picking up each one only when it rings.
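The following is a minimal sketch of this selector pattern using plain Java NIO (the port and buffer size are arbitrary); Netty's event loops are built on the same mechanism:

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.util.Iterator;

// One thread, one Selector, many connections: the thread sleeps in select()
// and only wakes up for channels that actually have work to do.
public class SelectorServer {

    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select(); // block until at least one channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    // New connection: register it with the same selector
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    // Bytes ready on one of the many connections
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(1024);
                    if (client.read(buf) == -1) {
                        client.close(); // peer closed the connection
                    }
                    // A real server would hand buf to protocol-specific parsing here
                }
            }
        }
    }
}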

At the application layer, the shared resource is the underlying TCP connection used by HTTP requests and responses. At this layer, multiplexing means sending multiple logical streams over a single TCP connection, as in HTTP/2 or gRPC. Each stream has a unique ID, and frames from different streams can be interleaved on the wire, allowing fast responses to proceed without waiting for slower ones. Application-layer multiplexing is like a single phone line carrying multiple conversations at once, with each caller identified by a unique ID.

Sending Large Files to a Netty Server

Netty makes it easy to handle large file uploads efficiently, even if the client isn’t using Netty. Let’s say the client is using curl or any standard HTTP library to upload a big file.

At the client side, the file is streamed over HTTP, either with a Content-Length header if the size is known, or using chunked transfer encoding if it isn't (see HTTP/1.1 above). The client library reads the file incrementally and sends bytes over TCP, so you never have to load the entire file into memory. For example, with curl:

curl -X POST http://server/upload --data-binary @largefile.bin

On the server side, Netty receives the request over the same TCP connection. Its HttpObjectDecoder breaks the incoming bytes into three parts:

  1. HttpRequest – the headers and metadata.

  2. HttpContent chunks – each chunk of the file body as it arrives.

  3. LastHttpContent – marks the end of the file.

Your Netty handler can then process each chunk as it arrives, writing directly to disk or another sink. This way, only a small chunk is in memory at any time, making it possible to handle files of gigabytes in size. Once LastHttpContent is received, the server knows the upload is complete and can send a response back to the client.

Here’s a simplified illustration of what’s happening under the hood:

Client (curl)                               Server (Netty)
-------------                               --------------
Open file / stream bytes
      |
      v
TCP connection
      |
POST /upload HTTP/1.1
      |
[File chunk 1] ------------------------->   HttpContent 1
[File chunk 2] ------------------------->   HttpContent 2
[File chunk 3] ------------------------->   HttpContent 3
...
[File chunk N] ------------------------->   HttpContent N
      |
[End of file]  ------------------------->   LastHttpContent
                                             |
                                             Server writes chunks to disk incrementally
                                             |
                                             Server responds HTTP 200 OK

Key points:

  • The client doesn’t need Netty; any HTTP client works.

  • The server handles large files without loading the entire file into memory.

  • Netty’s pipeline and chunked handling make uploads efficient and backpressure-aware.

This approach scales easily to gigabytes of data and integrates seamlessly with your existing Netty HTTP server.

The following code shows an example:

import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.SimpleChannelInboundHandler;
import io.netty.handler.codec.http.*;

import java.io.RandomAccessFile;

// Streams an upload to disk chunk by chunk. This handler belongs after
// HttpServerCodec but WITHOUT HttpObjectAggregator, so fragments arrive
// individually instead of being buffered into one full message.
public class FileUploadHandler extends SimpleChannelInboundHandler<HttpObject> {
    private RandomAccessFile file;

    @Override
    protected void channelRead0(ChannelHandlerContext ctx, HttpObject msg) throws Exception {
        if (msg instanceof HttpRequest) {
            // Headers arrived: open the destination file
            file = new RandomAccessFile("uploaded.bin", "rw");
        }
        if (msg instanceof HttpContent) {
            // Write this chunk to disk in bulk; only one chunk is in memory at a time
            ByteBuf content = ((HttpContent) msg).content();
            while (content.isReadable()) {
                content.readBytes(file.getChannel(), content.readableBytes());
            }
            if (msg instanceof LastHttpContent) {
                // End of the body: close the file and acknowledge the upload
                file.close();
                ctx.writeAndFlush(new DefaultFullHttpResponse(
                        HttpVersion.HTTP_1_1, HttpResponseStatus.OK));
            }
        }
    }
}

Conclusion

TCP provides a reliable byte stream, but no message boundaries. Netty abstracts TCP with asynchronous I/O and pipelines so you can focus on protocols. HTTP (1.x and HTTP/2) defines how to frame messages and where request/response ends. gRPC uses HTTP/2 streams and protobuf to give you a high-level RPC mechanism.

Understanding where each piece fits makes it easier to build networked systems that are both correct and performant, and to pick the right abstractions for your code.
