Building Your Own Nginx — Part 1: The Shape of the Web
Setting the table — the client-server model, DNS, TCP, HTTP, reverse proxies, load balancers, and the smallest possible web server in Go.
We're working through a series on building our own nginx from scratch (the video we're following along with is here), and instead of just letting the lessons evaporate the moment the video ends, we're going to write them down. This first post is the "set the table" post: before we can build a reverse proxy, we have to understand the world it lives in. So we'll walk through the client-server model, how clients actually find servers, what HTTP and TCP really are underneath the buzzwords, and then we'll write the smallest possible web server in Go and read every line of it together.
No prior knowledge required. If you know that websites exist and that Go is a programming language, you're qualified.
The basic shape: clients and servers
Almost everything on the internet is two computers talking to each other, and one of them started the conversation. The one that started it is the client (your browser, your phone app, curl in a terminal). The one that answered is the server (some machine sitting in a datacenter waiting for someone to talk to it).
That's it. That's the whole client-server model.
The client says, "hey, can I have the homepage?" and the server says, "sure, here it is." A server doesn't just randomly send you stuff — it waits. It's a very patient program whose entire personality is "I will sit here until someone asks me something."
How does a client even find a server? — DNS
Servers don't have names like "google" — they have IP addresses, which are just numbers like 142.250.190.46. Humans are bad at remembering numbers, so we invented a phonebook for the internet called DNS (Domain Name System).
When you type google.com into your browser, your computer doesn't magically know where Google lives. It first asks a DNS server, "hey, what's the IP for google.com?" The DNS server looks it up and responds with the number. Only then can your browser actually connect to Google.
DNS is the part everyone forgets exists, right up until it breaks and the whole internet seems down. It's the translation layer between human-friendly names and machine-friendly numbers.
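You can watch this translation happen from Go itself. Here's a small sketch that just asks the standard library's resolver (which in turn asks your configured DNS server) for the addresses behind a name — example.com is used here purely as a stand-in:

package main

import (
    "fmt"
    "log"
    "net"
)

func main() {
    // ask DNS: "what are the IPs for example.com?"
    ips, err := net.LookupIP("example.com")
    if err != nil {
        log.Fatal(err)
    }
    for _, ip := range ips {
        // the machine-friendly numbers behind the human-friendly name
        fmt.Println(ip)
    }
}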
The pipe between them — TCP
Once your computer knows the server's IP address, it needs an actual connection to it. That connection is built on top of TCP (Transmission Control Protocol).
TCP is a reliable, ordered pipe of bytes between two computers. "Reliable" means if a packet gets lost on the way, TCP notices and resends it. "Ordered" means the bytes come out the other side in the same order they went in. "Pipe of bytes" means TCP itself doesn't care what you're sending — it's just moving raw bytes around. It has no idea what a webpage is, what a header is, or what JSON is. It just delivers your bytes faithfully.
Think of TCP as a really diligent postal service that guarantees your letters arrive, and arrive in order. It does not read your letters. It does not know what's inside.
What you send through the pipe — HTTP
So we have a reliable pipe of bytes. We need a language to speak through that pipe so the client and server understand each other. That language, on the web, is HTTP (HyperText Transfer Protocol).
HTTP is a text-based protocol. When your browser asks Google for the homepage, it literally sends a chunk of text down the TCP pipe. The server reads that text, figures out what was being asked, and sends a chunk of text back.
That's all "the web" really is: text being shuffled through TCP pipes, with HTTP as the agreed-upon shape of that text.
The shape of an HTTP message
Here's roughly what a request from your browser looks like when you load a page:
GET /index.html HTTP/1.1\r\n
Host: example.com\r\n
User-Agent: Mozilla/5.0\r\n
Accept: text/html\r\n
\r\n
Let's break this down, because every part is doing a job.
The request line — GET /index.html HTTP/1.1. This is always the very first line. It has three parts: the method (what the client wants to do — GET, POST, PUT, DELETE), the path (the resource being asked for — /index.html), and the version (which dialect of HTTP we're speaking — HTTP/1.1).
Carriage returns and newlines — \r\n. Every line in an HTTP message ends with these two characters. \r is a carriage return, \n is a newline, and together they're the official "line break" of the HTTP world. They're invisible when you read the message, but they're the seams holding the whole thing together. Forget them and your parser breaks.
Headers — Host: example.com, User-Agent: ..., etc. After the request line comes a list of Key: Value pairs called headers. Headers are metadata: who's calling, what they accept, what kind of content they're sending, cookies, auth tokens, anything that isn't the actual payload.
The empty line — that lonely \r\n at the end. This is the most important "blank space" in computing. It's how HTTP says, "headers are done, anything that comes after this is the body." Without that empty line, the server would have no idea where the headers end and the body begins.
The body — for a GET there usually isn't one, but for a POST or PUT the body is where you put the actual payload: a form submission, a JSON object, an uploaded file, whatever.
The response from the server has the exact same shape, except instead of a request line it has a status line — something like HTTP/1.1 200 OK — and then the same headers / empty line / body structure.
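For concreteness, a minimal response to that request could look something like this (the exact headers vary from server to server; these are just illustrative):

HTTP/1.1 200 OK\r\n
Content-Type: text/plain\r\n
Content-Length: 13\r\n
\r\n
Hello, world!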
JSON: the way we usually shape the body
The body is just bytes. HTTP doesn't care what's in there. But two computers exchanging data need to agree on a format, and these days that format is almost always JSON.
JSON is a way of writing structured data as text. {"name": "Ama", "age": 24} is something both your Go server and a JavaScript browser can read with zero ambiguity. We "encode" data into JSON before stuffing it into the HTTP body, and the receiver "decodes" it back into something their language understands.
Why JSON and not, say, just sending raw memory? Because the two computers might be running totally different languages, on different hardware, with different ways of laying out data in memory. JSON is the neutral middle ground — boring, text-based, and universally understood.
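In Go, that encode/decode step is handled by the standard library's encoding/json package. Here's a small sketch; the Person type and its fields are just made up for illustration:

package main

import (
    "encoding/json"
    "fmt"
)

// Person is a made-up example type; the struct tags tell the encoder
// what the JSON keys should be called.
type Person struct {
    Name string `json:"name"`
    Age  int    `json:"age"`
}

func main() {
    // encode: Go value -> JSON text, ready to be stuffed into an HTTP body
    body, _ := json.Marshal(Person{Name: "Ama", Age: 24})
    fmt.Println(string(body)) // {"name":"Ama","age":24}

    // decode: JSON text from an HTTP body -> Go value
    var p Person
    _ = json.Unmarshal(body, &p)
    fmt.Println(p.Name, p.Age) // Ama 24
}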
Life before load balancers
Now we have enough vocabulary to understand why load balancers exist. So let's rewind to the bad old days.
Picture this: you've written a web app. You rent a single server somewhere. You point your domain name at that server's IP. People type yourapp.com into their browser, DNS sends them to your server's IP, and the server answers on port 80 (the default port for HTTP) or port 443 (the default port for HTTPS, the encrypted version).
Ports are like apartment numbers — the IP gets you to the building, the port gets you to the right unit. By convention, port 80 means "hi I'm a website, please talk HTTP to me," and port 443 means "hi I'm a website but encrypted, please talk HTTPS to me."
In this setup, one server runs one app. That's it. That's all you can do. If you want a second app, you need a second server with a second IP. If your one server gets overwhelmed by traffic, your one app goes down. If your one server crashes, your one app dies. If you want to deploy a new version, your users see downtime while you swap the binary.
This works fine when you're tiny. It stops working the moment you're not.
Enter the reverse proxy
The first idea is: what if we put a second server in front of our real server, and that second server's only job is to forward requests?
That's a reverse proxy. It sits between the client and the actual application server. The client thinks it's talking to your app — really it's talking to the proxy, and the proxy is talking to your app on the client's behalf.
This sounds pointless until you realize what it unlocks. The proxy can:
- Route different URLs to different backend apps. yourapp.com/api goes to one server, yourapp.com/blog goes to another. Suddenly one domain can host many apps.
- Handle HTTPS for you. The proxy decrypts incoming traffic, talks plain HTTP to your backend, and re-encrypts the response on the way out. Your app servers don't need to deal with TLS certificates at all.
- Cache responses. If a thousand people ask for the same page in a second, the proxy can answer 999 of them from memory and only bother the real server once.
- Add security. Block bad IPs, rate-limit abusers, sanitize headers, hide what's actually running behind it.
The "reverse" in reverse proxy is just because a normal (forward) proxy sits in front of the client (think corporate VPN). A reverse proxy sits in front of the server. Same idea, opposite direction.
And then: the load balancer
Reverse proxies open the door to the next idea. Once you're forwarding requests anyway, why forward to one backend? Why not forward to one of many?
That's a load balancer. It's a reverse proxy with multiple backends behind it, and it spreads incoming traffic across them. If you have ten copies of your app running on ten servers, the load balancer hands request #1 to server A, request #2 to server B, request #3 to server C, and so on. (The strategy can be smarter — round-robin, least-connections, weighted, etc. — but the spirit is "spread the load.")
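As a taste of how simple the dumbest version of that strategy is, here's a toy round-robin sketch in Go (not necessarily how we'll structure ours later; the backend names are invented):

package main

import (
    "fmt"
    "sync"
)

// Balancer hands out backends in round-robin order.
// A toy sketch: no health checks, and a mutex is the simplest way to stay
// correct when many goroutines ask for a backend at the same time.
type Balancer struct {
    mu       sync.Mutex
    backends []string
    next     int
}

func (b *Balancer) Pick() string {
    b.mu.Lock()
    defer b.mu.Unlock()
    backend := b.backends[b.next]
    b.next = (b.next + 1) % len(b.backends) // wrap back around to the first backend
    return backend
}

func main() {
    lb := &Balancer{backends: []string{"app-1:8080", "app-2:8080", "app-3:8080"}}
    for i := 0; i < 5; i++ {
        fmt.Println(lb.Pick()) // app-1, app-2, app-3, app-1, app-2
    }
}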
Now your single point of failure isn't a single server. If server B catches fire, the load balancer notices it stopped responding and just stops sending traffic to it. The other nine keep serving. Users notice nothing.
This is the moment your app stops being a fragile single-server thing and becomes something that can actually survive in the real world.
Where everything sits
Putting it all together, here's the modern shape of a serving stack:
Client (browser)
│
│ 1. DNS: "what's the IP for yourapp.com?"
▼
DNS server ──► returns IP of the load balancer
│
│ 2. TCP connection to that IP, port 443
▼
Load Balancer / Reverse Proxy ◄── HTTPS terminates here
│
│ 3. forwards request over plain HTTP
▼
┌──────────┬──────────┬──────────┐
│ App #1 │ App #2 │ App #3 │ ◄── your actual servers
└──────────┴──────────┴──────────┘

The DNS record points to the load balancer, not your real servers. Your real servers are usually on a private network the public internet can't even reach directly. The only public-facing thing is the load balancer.
Route tables
The load balancer needs to know which backend to send a given request to, and it figures that out from a route table — a set of rules that map incoming requests to backends.
A route table is conceptually as simple as it sounds:
| If the request looks like… | …send it to |
|---|---|
| yourapp.com/api/* | the api server pool |
| yourapp.com/static/* | the cdn / static pool |
| admin.yourapp.com/* | the admin server pool |
| anything else | the main web app pool |
When a request hits the load balancer, it scans this table top to bottom, finds the first rule that matches the request's host and path, and forwards the request accordingly. That's how a single domain can power dozens of separate services without the user ever knowing.
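To make that concrete, here's a hedged sketch of what such a table could look like in Go. The hosts, prefixes, and pool names are invented, and real matchers get fancier (wildcards, regexes, priorities), but the top-to-bottom first-match-wins scan is the heart of it:

package main

import (
    "fmt"
    "strings"
)

// Route maps "requests that look like this" to a backend pool.
type Route struct {
    Host       string // e.g. "yourapp.com" (made-up example)
    PathPrefix string // e.g. "/api/"
    Pool       string // which group of backends to forward to
}

var routeTable = []Route{
    {"yourapp.com", "/api/", "api pool"},
    {"yourapp.com", "/static/", "static pool"},
    {"admin.yourapp.com", "/", "admin pool"},
    {"", "/", "main web app pool"}, // empty host = match any host (the fallback rule)
}

// pickPool scans the table top to bottom and returns the first rule that matches.
func pickPool(host, path string) string {
    for _, r := range routeTable {
        if (r.Host == "" || r.Host == host) && strings.HasPrefix(path, r.PathPrefix) {
            return r.Pool
        }
    }
    return "main web app pool"
}

func main() {
    fmt.Println(pickPool("yourapp.com", "/api/users"))   // api pool
    fmt.Println(pickPool("admin.yourapp.com", "/login")) // admin pool
    fmt.Println(pickPool("yourapp.com", "/pricing"))     // main web app pool
}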
Beefing up the load balancer instead of the servers
Here's a thing that surprised us: you generally invest in making the load balancer fast and powerful, not the individual application servers.
The reasoning is: your application servers are cattle. If one falls over, you replace it. You make them small, identical, disposable, and you run lots of them. Scaling up just means adding more of them.
The load balancer, on the other hand, sees every single request before anything else does. It's the bottleneck of the whole system. So it gets the beefy machine, the fast network card, the careful tuning. A modern load balancer can handle hundreds of thousands of requests per second on one box.
Beefing up an application server gives you a faster single point of failure. Beefing up a load balancer gives you a faster system, because it lets you fan out to many cheap workers behind it.
Why we let the load balancer decode responses
A related move: things like decompression, decryption, and even some response transformation get done at the load balancer rather than at the application server.
The application server's job is to do the work — query the database, run the business logic, generate the HTML or JSON. We want every spare CPU cycle on that server going toward that. If we made it also handle TLS handshakes, gzip compression, header rewriting, and so on, we'd be burning expensive cycles on plumbing.
So we offload that plumbing to the load balancer. It terminates HTTPS (decrypts on the way in, encrypts on the way out), it handles compression, it normalizes headers, and it hands the application a clean, simple, plain-HTTP request. The application server just answers in plain HTTP. The load balancer wraps the response back up appropriately for the client.
This is sometimes called TLS termination or SSL offloading when it's about encryption specifically, but the general principle — put the boring expensive plumbing at the edge, keep the application servers focused on application work — shows up everywhere in this kind of architecture.
What is nginx, then?
nginx (pronounced "engine-x") is one of the most popular pieces of software for being that load balancer / reverse proxy / web server at the edge. It's fast, it's battle-tested, and it powers a huge chunk of the internet. When people say "I put nginx in front of my app," they mean exactly what we just described: a reverse proxy / load balancer sitting between clients and the real application servers.
That's what we're going to build, piece by piece, over this series. Not a full nginx — that thing is the result of decades of engineering — but a working version of one, in Go, that we understand top to bottom.
Building the smallest possible web server in Go
Before we can build a proxy, we have to be able to build a plain server. Because under the hood, a reverse proxy is "a server that accepts a request and then turns around and acts like a client to a different server" — so server-building is the foundation.
Here's the whole thing in one shot. It's a single Go file. We'll walk through every section right after.
package main

import (
    "bufio"
    "fmt"
    "log"
    "net"
    "strings"
)

func main() {
    // set up a server to listen for connections on port 8080.
    // net.Listen opens a tcp socket bound to that port and gives us back a "listener",
    // which is essentially a doorway that clients can knock on to talk to us.
    ln, err := net.Listen("tcp", ":8080")
    if err != nil {
        // if we cannot even bind to the port (maybe something else is already using it),
        // there is nothing meaningful we can do, so we just stop the program.
        log.Fatal("an error occurred")
    }
    fmt.Println("Listening on port 8080")

    // the server's job is never "done", it should keep accepting new connections forever,
    // so we sit in an infinite loop and wait for clients to show up.
    for {
        // Accept blocks (pauses the program) until a client actually connects.
        // when one does, conn is the raw tcp pipe between us and that specific client.
        // every client gets their own conn.
        conn, err := ln.Accept()
        if err != nil {
            log.Fatal(err)
        }

        // hand the connection off to a goroutine so the main loop can immediately
        // go back to accepting more clients. without this, we could only serve
        // one client at a time while everyone else waits in line.
        go handleConnection(conn)
    }
}

// Request is our own little representation of an HTTP request.
// an HTTP request is just text that follows a specific shape, and these are the pieces
// we care about pulling out of that text:
//   - method: the verb, like GET or POST
//   - path: the resource the client is asking for, like /index.html
//   - version: the http version, like HTTP/1.1
//   - headers: extra metadata the client sends along (Host, User-Agent, etc.).
//     a header key can technically appear more than once, so we store a slice of values.
type Request struct {
    method  string
    path    string
    version string
    headers map[string][]string
}

func handleConnection(conn net.Conn) *Request {
    // make sure that after we do what we want on each connection, we close it.
    // each connection is essentially a sequence of bytes, and we need to make sense
    // of those bytes, hence the parsing.
    defer conn.Close()

    // a raw connection just gives us bytes one at a time, which is annoying.
    // bufio.NewReader wraps the connection and gives us nicer helpers like ReadString,
    // so we can read the request line by line instead of byte by byte.
    reader := bufio.NewReader(conn)

    // ReadString reads until the first occurrence of '\n', which gives us the request line.
    // whatever the reader reads is consumed — it won't show up again on the next read.
    requestLine, err := reader.ReadString('\n')
    if err != nil {
        // if we couldn't even read the first line, the client probably disconnected
        // or sent us garbage. nothing useful to do, so we just bail out.
        return nil
    }

    // HTTP lines end with "\r\n" (carriage return + newline). those characters are part
    // of the protocol but not part of the content, so we strip them before going further.
    requestLine = strings.TrimRight(requestLine, "\r\n")

    // every request line has 3 parts: method, path, version. split into 3.
    // if we don't get three parts, the line is malformed and we give up.
    parts := strings.SplitN(requestLine, " ", 3)
    if len(parts) != 3 {
        return nil
    }

    // fill in the first three fields. headers come later because they live on the
    // lines after this one.
    req := Request{
        method:  parts[0],
        path:    parts[1],
        version: parts[2],
    }

    // after the request line, the client sends a bunch of header lines.
    // we'll collect them into this map as we go.
    headers := make(map[string][]string)
    for {
        // because the reader consumes what it reads, the next ReadString call
        // gives us the next line — i.e. the next header.
        headerLine, err := reader.ReadString('\n')
        if err != nil {
            return nil
        }
        headerLine = strings.TrimRight(headerLine, "\r\n")

        // HTTP marks the end of the headers section with a completely empty line.
        // once we see one, we know there are no more headers and we stop looping.
        if headerLine == "" {
            break
        }

        // each header line looks like "Key: Value", so the colon is our split point.
        // TrimSpace cleans up leftover whitespace (especially the space after the colon).
        colonIndex := strings.Index(headerLine, ":")
        if colonIndex == -1 {
            // a line with no colon at all is malformed; skip it rather than crash.
            continue
        }
        headerKey := strings.TrimSpace(headerLine[:colonIndex])
        headerValue := strings.TrimSpace(headerLine[colonIndex+1:])

        // append rather than assign — the same header key can legitimately appear
        // more than once in a single request (e.g. multiple Set-Cookie or Accept entries).
        headers[headerKey] = append(headers[headerKey], headerValue)
    }

    // stitch the parsed headers onto the Request now that we are done collecting them.
    req.headers = headers

    // for now we just print the parsed request so we can eyeball what we received.
    fmt.Println(req)

    // craft a minimal valid HTTP response by hand:
    //   - status line: "HTTP/1.1 200 OK"
    //   - a blank line (\r\n\r\n) to mark the end of the headers
    //   - the body: "Hello User"
    // then write those bytes back through the same connection the client used.
    responseBytes := []byte("HTTP/1.1 200 OK\r\n\r\nHello User")
    conn.Write(responseBytes)

    // we return the parsed request mostly for completeness/debugging. nothing actually
    // consumes this return value yet, since handleConnection runs in its own goroutine.
    return &req
}
That's the whole program. Now let's actually understand it.
Opening the door — net.Listen
ln, err := net.Listen("tcp", ":8080")
net.Listen is from Go's standard library net package, the package for raw network I/O. The first argument is the protocol — "tcp" — and the second is the address we want to bind to. :8080 means "any IP on this machine, port 8080."
Why 8080 and not 80? Because on most operating systems, ports below 1024 are privileged — you need root/admin to bind to them. Port 80 is one of those. 8080 is the unofficial "I'm just a developer messing around" port and any user can bind to it.
What net.Listen actually does under the hood is ask the operating system to set up a TCP socket bound to that port. The OS now knows: "if anyone shows up at this machine asking for port 8080, wake this program up." The ln (listener) we get back is our handle on that socket. It's the doorway.
If something else is already using port 8080, this call returns an error and we kill the program — there's nothing useful we can do without our doorway.
The accept loop
for {
    conn, err := ln.Accept()
    ...
    go handleConnection(conn)
}
A server's whole life is "wait for someone, talk to them, repeat." That's why we're in an infinite for loop with no condition.
ln.Accept() is the interesting part. It blocks — meaning it pauses the program right there — until an actual client opens a TCP connection to our port. The moment that happens, Accept returns a conn, which is the dedicated TCP pipe between us and that specific client. Every client gets their own conn. They're independent.
Then we do something important:
go handleConnection(conn)
The go keyword launches handleConnection(conn) in a goroutine. Goroutines are Go's lightweight concurrency primitive — think of them as very cheap threads. The function runs concurrently while main immediately loops back to Accept and waits for the next client.
Without that go, we could only handle one client at a time. Client #2 would have to wait until we'd finished talking to client #1. With the go, we hand each client off to their own little worker and the front door stays open.
Goroutines are cheap enough that "one per connection" is a perfectly reasonable strategy in Go. In other languages this would be wildly expensive — in Go it's idiomatic.
Why we model a Request ourselves
type Request struct {
    method  string
    path    string
    version string
    headers map[string][]string
}
An HTTP request that arrives on the wire is just text. To do anything useful with it, we need to parse it into a structured value our program can manipulate. That's what the Request struct is for: it's our in-memory shape for the parts of the request we care about.
headers is a map[string][]string — a map from a header name to a slice of strings, not just one string. Why? Because in HTTP a single header name can legitimately appear multiple times in one request (think Set-Cookie, Accept, Cache-Control). If we used a plain map[string]string, the second occurrence would silently overwrite the first. Using a slice lets us keep all of them.
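A tiny illustration of the difference (the header values here are just examples):

single := map[string]string{}
single["Accept"] = "text/html"
single["Accept"] = "application/json" // overwrites: "text/html" is silently gone

multi := map[string][]string{}
multi["Accept"] = append(multi["Accept"], "text/html")
multi["Accept"] = append(multi["Accept"], "application/json") // both values survive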
defer — running something at the end, no matter what
defer conn.Close()
defer is one of Go's signature features. It schedules a function call to run when the surrounding function returns — whether it returns normally or via an early return. We're saying: "no matter how this function ends, close the connection."
This matters because connections are limited resources. If we forget to close them, the OS keeps them open forever and eventually we run out and the server stops accepting new clients. Putting the cleanup right next to the setup is much harder to mess up than scattering conn.Close() calls before every return.
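Here's a toy sketch of that guarantee, separate from our server (it assumes the usual os and bufio imports, and the file name is made up):

func firstLine(name string) (string, error) {
    f, err := os.Open(name)
    if err != nil {
        return "", err
    }
    defer f.Close() // scheduled now, runs whenever firstLine returns

    line, err := bufio.NewReader(f).ReadString('\n')
    if err != nil {
        return "", err // early return: f.Close() still runs
    }
    return line, nil // normal return: f.Close() still runs too
}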
What is a buffered reader?
Before we look at the code, let's slow down on this one. The phrase "buffered reader" gets thrown around a lot, and it's worth pinning down what it actually means.
A reader in Go is anything you can call Read on to pull bytes out of. A file, a network connection, a string, an open HTTP response — they're all readers. The interface is intentionally tiny: "give me some bytes."
A buffer is just a chunk of memory the program holds onto temporarily. A scratchpad.
A buffered reader is a reader that wraps another reader and keeps a buffer in between. Instead of fetching bytes from the underlying source one-at-a-time on demand, it grabs a big chunk into its buffer the first time you ask, and then serves your subsequent small reads out of that buffer until the buffer runs dry. When the buffer is empty, it goes back to the source for another big chunk.
Two reasons this is useful:
1. Performance. Asking the operating system for bytes is expensive. Each call crosses the boundary between your program and the kernel — that's a syscall, and syscalls are not free. If you call Read a thousand times to get a thousand bytes, that's a thousand syscalls. If a buffered reader fetches 4096 bytes once and serves your thousand small reads from its in-memory buffer, that's one syscall plus a bunch of cheap memory copies. Same end result, dramatically less overhead.
2. Higher-level helpers. Once the reader has its own buffer, it can offer operations that a raw reader can't, because those operations need to peek ahead. Things like:
- ReadString(delim) — read up to and including a specific byte, like a newline.
- ReadBytes(delim) — same idea but returns []byte instead of string.
- ReadLine — read one line at a time.
- Peek(n) — look at the next n bytes without consuming them.
A raw reader can't do Peek because once you've read a byte, it's gone — you can't unread it. A buffered reader can, because the byte is sitting in its buffer and the buffer cursor just doesn't advance.
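Here's a small standalone sketch of that difference: Peek looks ahead without consuming, then ReadString consumes through the newline and the reader keeps its place.

package main

import (
    "bufio"
    "fmt"
    "strings"
)

func main() {
    // strings.NewReader is a plain reader; bufio wraps it with a buffer.
    r := bufio.NewReader(strings.NewReader("GET / HTTP/1.1\r\nHost: example.com\r\n"))

    peeked, _ := r.Peek(3)        // look at the next 3 bytes without consuming them
    fmt.Printf("%s\n", peeked)    // GET
    line, _ := r.ReadString('\n') // the peeked bytes are still there for the real read
    fmt.Printf("%q\n", line)      // "GET / HTTP/1.1\r\n"
    next, _ := r.ReadString('\n') // the reader kept its place
    fmt.Printf("%q\n", next)      // "Host: example.com\r\n"
}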
So back to our code:
reader := bufio.NewReader(conn)
bufio is short for buffered I/O, and it's part of Go's standard library. bufio.NewReader(conn) wraps the raw net.Conn in a buffered reader. The reason we need this is that a raw net.Conn exposes bytes, but doesn't care about structure. If you just call Read on a conn, you get whatever bytes happened to arrive — could be 3, could be 3000. There's no concept of "give me one line."
HTTP, however, is a line-based protocol. The request line is one line. Each header is one line. The boundary between headers and body is a blank line. We want to read this thing line by line, and ReadString('\n') lets us do exactly that.
It reads bytes from the connection until it hits a newline, then hands us back the whole line as a string. The bytes it consumed are gone from its internal buffer — the next call to ReadString will continue from where the last one left off. That's how we walk down the request one line at a time without losing our place.
Parsing the request line
requestLine, err := reader.ReadString('\n')
...
requestLine = strings.TrimRight(requestLine, "\r\n")
parts := strings.SplitN(requestLine, " ", 3)
The first line of any HTTP request is the request line and it has exactly three parts separated by spaces: method, path, version. So we read that first line, strip the trailing \r\n (which is part of the protocol but not the content), and split on spaces into 3 parts.
strings.SplitN(..., 3) is Split but with a cap on the number of pieces — at most 3. We use SplitN instead of plain Split because we want to be safe: if the path or version somehow contained a space, plain Split would over-split and we'd get a corrupt parse. Capping at 3 keeps everything in the right slot.
If we get fewer than 3 parts, the request was malformed and we just give up by returning nil.
Reading headers in a loop
for {
    headerLine, err := reader.ReadString('\n')
    if err != nil {
        return nil
    }
    headerLine = strings.TrimRight(headerLine, "\r\n")
    if headerLine == "" {
        break
    }
    colonIndex := strings.Index(headerLine, ":")
    if colonIndex == -1 {
        continue // no colon at all means a malformed header line; skip it
    }
    headerKey := strings.TrimSpace(headerLine[:colonIndex])
    headerValue := strings.TrimSpace(headerLine[colonIndex+1:])
    headers[headerKey] = append(headers[headerKey], headerValue)
}
Same pattern as before — read a line, trim the \r\n. The key insight is the empty-line check. HTTP says the headers section ends with a completely blank line. So if we read a line and it's empty after trimming, we're done.
For non-empty lines, we find the first : and split: everything before is the header name, everything after is the value (a line with no colon at all is malformed, so we just skip it rather than crash). TrimSpace cleans up the customary space that comes after the colon (Host: example.com — note the space).
We append to the slice rather than overwrite, for the multi-value header reason explained earlier.
Writing the response
responseBytes := []byte("HTTP/1.1 200 OK\r\n\r\nHello User")
conn.Write(responseBytes)
A response is also just text in a specific shape: status line, headers, blank line, body. We're building the bare minimum here — a status line followed immediately by \r\n\r\n (which is the blank line marking "no more headers") and then Hello User as the body.
[]byte(...) converts the string to a byte slice because that's what conn.Write expects — raw bytes to push down the TCP pipe. The client (curl, browser, whatever) reads those bytes, recognizes the HTTP shape, and shows you "Hello User."
Then defer conn.Close() fires when this function returns, the connection closes, and that goroutine evaporates. Meanwhile the main accept loop is already off accepting the next client.
Try it
That's a real HTTP server. You can curl http://localhost:8080/ and it will answer with Hello User. We didn't import a web framework. We didn't even import net/http (Go's built-in HTTP package). We built it directly on top of TCP because we wanted to see HTTP, not abstract over it.
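If you want to see the raw exchange instead of curl's polished view, here's a hedged little client sketch that does the client side of the same dance by hand: open a TCP connection, write an HTTP request as text, and print whatever bytes come back.

package main

import (
    "fmt"
    "io"
    "log"
    "net"
)

func main() {
    // dial the server we just built; this is our end of the TCP pipe.
    conn, err := net.Dial("tcp", "localhost:8080")
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    // write an HTTP request by hand: request line, one header, blank line.
    fmt.Fprintf(conn, "GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")

    // read until the server closes the connection, then print the raw response.
    raw, err := io.ReadAll(conn)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("%s", raw)
}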
Where we go from here
Right now this server only knows how to say "Hello User" to anyone who asks. In the next post we'll teach it to actually look at the request and respond differently based on the path — which is the first step toward writing a real router. After that we'll teach it to forward requests to another server instead of answering them itself, and at that point we'll officially have built a reverse proxy. Then we add multiple backends, a route table, some basic load balancing, and we've got our own little nginx.
The thing we keep coming back to is how shockingly little magic there is in any of this. HTTP is just text in a specific shape. TCP is just a pipe of bytes. A server is just a program in an infinite loop. Once you've written one by hand, the whole web stops feeling like a black box.
See you in part 2.