How HTTP Requests Actually Work: A Low-Level Deep Dive

Written by Claude

You type curl http://www.example.com/hello.txt and get a response. But what actually happens under the hood? Let's trace the complete flow from your terminal to the bytes on the wire and back.

The Complete Picture

Here's the full journey:

  1. Application Layer: curl constructs an HTTP request
  2. DNS Resolution: Domain name → IP address
  3. TCP Connection: 3-way handshake to establish connection
  4. HTTP Request: Actual request bytes sent over TCP
  5. HTTP Response: Server sends response bytes back
  6. Connection Management: Keep-alive or close
  7. Application Layer: curl displays the result

Let's go deep on each step.

Step 1: The curl Command

curl http://www.example.com/hello.txt

What curl Does Internally

  1. Parse the URL:

    • Scheme: http (vs https)
    • Host: www.example.com
    • Port: 80 (default for HTTP, would be 443 for HTTPS)
    • Path: /hello.txt
  2. Construct the HTTP Request (in memory, not sent yet):

    GET /hello.txt HTTP/1.1\r\n
    Host: www.example.com\r\n
    User-Agent: curl/7.16.3 libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3\r\n
    Accept: */*\r\n
    \r\n
    

    Critical detail: Those \r\n are actual carriage return + line feed bytes (0x0D 0x0A in hexadecimal). The last header line ends with \r\n and is followed by an empty line, so the sequence \r\n\r\n on the wire signals the end of the headers.

  3. Prepare to send: But first, we need to know WHERE to send it...
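The parsing and request-building steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not curl's actual implementation: `build_request`, `DEFAULT_PORTS`, and the `example-client/0.1` User-Agent are made-up names, and the sketch ignores details such as adding a non-default port to the Host header.

```python
from urllib.parse import urlsplit

# Hypothetical helper table: default port per scheme
DEFAULT_PORTS = {"http": 80, "https": 443}

def build_request(url, user_agent="example-client/0.1"):
    """Return (host, port, request_bytes) for a simple GET request."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {parts.hostname}",
        f"User-Agent: {user_agent}",
        "Accept: */*",
        "",  # the two empty strings produce the final \r\n\r\n
        "",
    ]
    return parts.hostname, port, "\r\n".join(lines).encode("ascii")

host, port, request = build_request("http://www.example.com/hello.txt")
print(host, port)
print(request)
```

Nothing has touched the network at this point: the request is only a byte string in memory.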

Step 2: DNS Resolution

Before sending anything, curl needs the IP address of www.example.com.

DNS Query Flow

curl → OS resolver → DNS server → response → OS → curl

In practice:

  1. Check local cache: OS maintains DNS cache

    # On older macOS releases you can dump cached entries (newer releases removed
    # the -cachedump option, and Linux does not ship dscacheutil)
    dscacheutil -cachedump -entries Host
    
  2. If not cached, query DNS server:

    • OS sends UDP packet to DNS server (usually port 53)
    • DNS query: "What's the IP for www.example.com?"
    • DNS response: "93.184.216.34" (example IP)

What a DNS query looks like (UDP packet, simplified):

DNS Query Packet:
┌─────────────────────────────────┐
│ Transaction ID: 0x1234          │
│ Flags: Standard query           │
│ Questions: 1                    │
│ Question: www.example.com       │
│ Type: A (IPv4 address)          │
│ Class: IN (Internet)            │
└─────────────────────────────────┘

Response:

DNS Response Packet:
┌─────────────────────────────────┐
│ Transaction ID: 0x1234          │
│ Flags: Response, no error       │
│ Answers: 1                      │
│ www.example.com → 93.184.216.34 │
│ TTL: 86400 seconds              │
└─────────────────────────────────┘

Now curl knows: send HTTP request to 93.184.216.34:80
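To make the query packet above concrete, here is a minimal sketch that encodes it by hand. The function name `build_dns_query` is an assumption for illustration. The flags value 0x0100 is a standard query with the "recursion desired" bit set, which is what a stub resolver usually sends. The sketch only builds the bytes; it does not send them.

```python
import struct

def build_dns_query(hostname, transaction_id=0x1234):
    """Encode a standard DNS query for an A record (IPv4 address)."""
    # Header: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (six 16-bit fields)
    flags = 0x0100  # standard query, recursion desired
    header = struct.pack("!HHHHHH", transaction_id, flags, 1, 0, 0, 0)
    # Question name: each label is length-prefixed; a zero byte ends the name
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN, Internet)
    question = qname + struct.pack("!HH", 1, 1)
    return header + question

query = build_dns_query("www.example.com")
print(len(query), query.hex(" "))
```

In everyday code you would not build this yourself: a call such as `socket.getaddrinfo("www.example.com", 80)` asks the OS resolver to do the whole exchange.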

Step 3: TCP Connection Establishment

HTTP runs over TCP, which guarantees reliable, ordered delivery. But first, we need to establish the connection.

The 3-Way Handshake

Client (curl)                    Server (93.184.216.34:80)
     │                                    │
     │  SYN (seq=1000)                    │
     │───────────────────────────────────>│
     │                                    │
     │  SYN-ACK (seq=5000, ack=1001)      │
     │<───────────────────────────────────│
     │                                    │
     │  ACK (seq=1001, ack=5001)          │
     │───────────────────────────────────>│
     │                                    │
     │       Connection ESTABLISHED       │

What actually gets sent (TCP segment in an IP packet):

SYN packet (client → server):

IP Header:
  Source IP: 192.168.1.100 (your machine)
  Dest IP: 93.184.216.34 (example.com)
  Protocol: TCP (6)

TCP Header:
  Source Port: 54321 (random high port chosen by OS)
  Dest Port: 80
  Sequence Number: 1000 (random initial sequence)
  Acknowledgment: 0
  Flags: SYN
  Window Size: 65535
  Checksum: [calculated]

SYN-ACK packet (server → client):

IP Header:
  Source IP: 93.184.216.34
  Dest IP: 192.168.1.100
  Protocol: TCP (6)

TCP Header:
  Source Port: 80
  Dest Port: 54321
  Sequence Number: 5000 (server's random initial)
  Acknowledgment: 1001 (client's seq + 1)
  Flags: SYN, ACK
  Window Size: 65535
  Checksum: [calculated]

ACK packet (client → server):

TCP Header:
  Source Port: 54321
  Dest Port: 80
  Sequence Number: 1001
  Acknowledgment: 5001 (server's seq + 1)
  Flags: ACK

Connection is now ESTABLISHED. Both sides have a socket ready to send/receive data.

What's a Socket?

A socket is a file descriptor that represents one end of the TCP connection. On your machine:

// Simplified version of what curl does internally
int sockfd = socket(AF_INET, SOCK_STREAM, 0);  // Create socket
// server_addr is a struct sockaddr_in holding 93.184.216.34 and port 80
connect(sockfd, (struct sockaddr *)&server_addr, sizeof(server_addr));  // 3-way handshake happens here
// sockfd is now ready to send/receive
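The same sequence can be tried without leaving your machine. The sketch below is a loopback demo and an assumption of this example, not what curl does: it plays both client and server in one process, so you can see that `connect()` returns once the kernel has finished the handshake and that the OS picks the client's source port.

```python
import socket

# Listening socket on the loopback interface; port 0 asks the OS for any free port
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_addr = server.getsockname()

# connect() returns once the kernel has completed the 3-way handshake
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server_addr)

# accept() hands the server a new socket dedicated to this connection
conn, peer_addr = server.accept()

client_addr = client.getsockname()  # (ip, source port chosen by the OS)
print("client:", client_addr, "server:", server_addr)
print("client file descriptor:", client.fileno())

client.sendall(b"ping")
received = conn.recv(4)
print("server received:", received)

for s in (client, conn, server):
    s.close()
```

The address the server sees in `peer_addr` is the client's own address and source port, which is how the kernel tells connections apart.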

Step 4: Sending the HTTP Request

Now curl has an established TCP connection. Time to send the HTTP request!

The HTTP Request as Bytes

Remember our HTTP request:

GET /hello.txt HTTP/1.1\r\n
Host: www.example.com\r\n
User-Agent: curl/7.16.3 libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3\r\n
Accept: */*\r\n
\r\n

As actual bytes (in hexadecimal):

47 45 54 20 2F 68 65 6C 6C 6F 2E 74 78 74 20 48   G E T   / h e l l o . t x t   H
54 54 50 2F 31 2E 31 0D 0A 48 6F 73 74 3A 20 77   T T P / 1 . 1 \r \n H o s t :   w
77 77 2E 65 78 61 6D 70 6C 65 2E 63 6F 6D 0D 0A   w w . e x a m p l e . c o m \r \n
...

Note:

  • 0x47 = 'G', 0x45 = 'E', 0x54 = 'T'
  • 0x0D = '\r' (carriage return)
  • 0x0A = '\n' (line feed)
  • 0x0D 0x0A 0x0D 0x0A = the blank line that ends headers
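You can reproduce the hex listing above yourself. The `hexdump` helper below is a hypothetical name written for this article; it prints unprintable bytes such as CR and LF as dots rather than as \r and \n.

```python
def hexdump(data, width=16):
    """Format bytes as rows of hex values followed by printable characters."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Show printable ASCII as-is and everything else (like CR/LF) as '.'
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"{hex_part:<{width * 3 - 1}}  {text_part}")
    return rows

request = (b"GET /hello.txt HTTP/1.1\r\n"
           b"Host: www.example.com\r\n"
           b"\r\n")
for row in hexdump(request):
    print(row)
```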

Sending via TCP

// Simplified C code (what curl does)
char request[] = "GET /hello.txt HTTP/1.1\r\n"
                 "Host: www.example.com\r\n"
                 "User-Agent: curl/7.16.3\r\n"
                 "Accept: */*\r\n"
                 "\r\n";

ssize_t bytes_sent = send(sockfd, request, strlen(request), 0);
// send() may accept fewer bytes than requested; real code loops until all are sent

What happens in the OS:

  1. Application buffer → Kernel buffer: The request bytes are copied to the kernel's TCP send buffer

  2. TCP segmentation: If the data is larger than MSS (Maximum Segment Size, typically ~1460 bytes), TCP splits it into multiple segments. Our request is small, so it fits in one segment.

  3. TCP wrapping: TCP adds its header

  4. IP wrapping: IP layer adds IP header

  5. Ethernet framing (if on Ethernet)

  6. Physical transmission: The frame is converted to electrical signals (Ethernet), radio waves (WiFi), or light pulses (fiber optic) and transmitted.
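The segmentation step is easy to picture in code. This is a simplified sketch, assuming a fixed MSS of 1460 bytes and the initial sequence number 1001 from the handshake example; the function name `segment` is made up, and a real TCP stack also accounts for windows, options, and retransmission.

```python
def segment(payload, mss=1460, initial_seq=1001):
    """Split a byte stream into (sequence_number, chunk) pairs of at most mss bytes."""
    segments = []
    for offset in range(0, len(payload), mss):
        chunk = payload[offset:offset + mss]
        # A segment's sequence number is the stream position of its first byte
        segments.append((initial_seq + offset, chunk))
    return segments

small_request = b"x" * 100      # fits in one segment, like our GET request
large_upload = b"x" * 3000      # needs three segments

print([(seq, len(chunk)) for seq, chunk in segment(small_request)])
print([(seq, len(chunk)) for seq, chunk in segment(large_upload)])
```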

The Journey Across the Network

Your Machine → Router → ISP → Internet Backbone → Destination ISP → Server

Each router:

  1. Receives the frame
  2. Examines the destination IP (93.184.216.34)
  3. Looks up routing table: "Where do I forward this?"
  4. Decrements TTL (Time To Live)
  5. Recalculates the IPv4 header checksum (the TTL change invalidates the old one)
  6. Forwards to next hop

Key point: To forward a packet, a router only needs the IP header. It doesn't see or care about TCP or HTTP (middleboxes such as NAT devices and firewalls are the exception). That's the beauty of layering.
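The per-hop work on the IP header can be sketched as follows. This is an illustration, assuming a plain 20-byte IPv4 header with no options; the names `HEADER_FORMAT`, `build_ipv4_header`, and `forward` are invented for the example. The checksum is the standard 16-bit ones' complement sum over the header.

```python
import socket
import struct

HEADER_FORMAT = "!BBHHHBBH4s4s"  # 20-byte IPv4 header without options

def internet_checksum(header):
    """16-bit ones' complement of the ones' complement sum of the header words."""
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_ipv4_header(src, dst, ttl, payload_len, protocol=6):
    """Build a header with a valid checksum (protocol 6 = TCP)."""
    fields = [0x45, 0, 20 + payload_len, 0, 0, ttl, protocol, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    fields[7] = internet_checksum(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields)

def forward(header):
    """What a router changes: decrement TTL, then recompute the checksum."""
    fields = list(struct.unpack(HEADER_FORMAT, header))
    if fields[5] <= 1:
        raise ValueError("TTL expired; a real router would send ICMP Time Exceeded")
    fields[5] -= 1   # TTL
    fields[7] = 0    # checksum field is zero while computing the new value
    fields[7] = internet_checksum(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields)

header = build_ipv4_header("192.168.1.100", "93.184.216.34", ttl=64, payload_len=100)
forwarded = forward(header)
print("TTL before/after:", header[8], forwarded[8])
print("checksum still valid:", internet_checksum(forwarded) == 0)
```

A header with a correct checksum sums to zero when the checksum field is included, which is the check each receiver performs.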

Step 5: Server Receives and Processes

On the Server Side

  1. Network card receives frame: Interrupt generated

  2. Kernel processes:

    • Strips Ethernet header
    • Validates IP checksum, examines destination IP
    • Strips IP header
    • Validates TCP checksum, examines destination port (80)
    • Looks up socket for (dest_port=80, client_IP, client_port)
    • Adds data to socket's receive buffer
    • Wakes up the server process (e.g., nginx, apache)
  3. Server application (nginx/apache) reads:

    char buffer[4096];
    ssize_t bytes_received = recv(sockfd, buffer, sizeof(buffer), 0);
    
    // buffer now contains:
    // "GET /hello.txt HTTP/1.1\r\nHost: www.example.com\r\n..."
    
  4. Server parses the HTTP request:

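    # Simplified sketch of what the server does with the received bytes
    buffer = b'GET /hello.txt HTTP/1.1\r\nHost: www.example.com\r\n\r\n'  # from recv()
    request_line, rest = buffer.split(b'\r\n', 1)
    method, path, version = request_line.split(b' ')
    
    # Parse headers
    headers = {}
    while True:
        line, rest = rest.split(b'\r\n', 1)
        if line == b'':  # Empty line = end of headers
            break
        name, value = line.split(b':', 1)
        # Header names are case-insensitive; whitespace around the value is optional
        headers[name.lower()] = value.strip()
    
    # Process request (a real server maps the path onto a document root
    # and rejects anything that escapes it)
    if path == b'/hello.txt':
        with open('hello.txt', 'rb') as f:
            response = f.read()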
    
  5. Server constructs HTTP response:

    HTTP/1.1 200 OK\r\n
    Date: Mon, 27 Jul 2009 12:28:53 GMT\r\n
    Server: nginx/1.18.0\r\n
    Content-Type: text/plain\r\n
    Content-Length: 51\r\n
    Connection: keep-alive\r\n
    \r\n
    Hello World! My payload includes a trailing CRLF.
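Here is a minimal sketch of how a server might assemble that response. `build_response` is a hypothetical helper, not nginx's code, and it leaves out the Server header. The point to notice is that Content-Length is computed from the body's byte length, which is why the trailing CRLF counts toward the 51.

```python
from email.utils import formatdate

def build_response(body, content_type="text/plain"):
    """Assemble a minimal HTTP/1.1 200 response around a bytes body."""
    headers = [
        "HTTP/1.1 200 OK",
        f"Date: {formatdate(usegmt=True)}",   # same format as the Date header above
        f"Content-Type: {content_type}",
        f"Content-Length: {len(body)}",       # counted in bytes, not characters
        "Connection: keep-alive",
    ]
    head = "\r\n".join(headers).encode("ascii") + b"\r\n\r\n"
    return head + body

body = b"Hello World! My payload includes a trailing CRLF.\r\n"
response = build_response(body)
print(response.decode("ascii"))
```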
    

Step 6: Server Sends Response

Same process as request, but in reverse:

  1. Server calls send()
  2. TCP segments (might be multiple if response is large)
  3. IP packets with Source: 93.184.216.34:80, Dest: 192.168.1.100:54321
  4. Routed back through the internet
  5. Client receives: NIC receives frame, kernel strips headers, data lands in socket receive buffer, curl's recv() call returns the data

How does curl know when the response is complete?

Options:

  1. Content-Length header: Read exactly that many bytes for the body
  2. Transfer-Encoding: chunked: Read chunks until 0\r\n\r\n
  3. Connection: close: Read until server closes connection
  4. No body: Some responses never have a body (any response to a HEAD request, and 1xx, 204, and 304 responses)

In our example, Content-Length: 51, so curl reads headers until \r\n\r\n, then reads exactly 51 more bytes.
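The framing rules above translate into a short decision procedure. The sketch below is a simplification, assuming the whole response is already in memory (a real client reads from the socket incrementally); `split_head` and `read_body` are illustrative names, and the HEAD/204/304 case is left out.

```python
def split_head(raw):
    """Separate the status line and headers from whatever follows \\r\\n\\r\\n."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.split(b"\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, rest

def read_body(headers, rest):
    """Return the body, assuming `rest` holds every byte after the headers."""
    if headers.get(b"transfer-encoding", b"").lower() == b"chunked":
        body = b""
        while True:
            size_line, _, rest = rest.partition(b"\r\n")
            size = int(size_line.split(b";")[0], 16)  # chunk size is hexadecimal
            if size == 0:                             # zero-length chunk ends the body
                return body
            body += rest[:size]
            rest = rest[size + 2:]                    # skip the chunk and its CRLF
    if b"content-length" in headers:
        return rest[:int(headers[b"content-length"])]
    return rest  # no framing header: the body runs until the connection closes

raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/plain\r\n"
       b"Content-Length: 51\r\n"
       b"\r\n"
       b"Hello World! My payload includes a trailing CRLF.\r\n")
status, headers, rest = split_head(raw)
print(status, len(read_body(headers, rest)))
```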

Step 7: Connection Management

After the response, what happens to the connection?

HTTP/1.1 Default: Keep-Alive

If neither side sends Connection: close, the TCP connection stays open:

Client                          Server
  │                               │
  │  GET /page1.html              │
  │──────────────────────────────>│
  │                               │
  │  200 OK (page1)               │
  │<──────────────────────────────│
  │                               │
  │  GET /page2.html              │
  │  (SAME connection)            │
  │──────────────────────────────>│
  │                               │
  │  200 OK (page2)               │
  │<──────────────────────────────│

Benefits:

  • No 3-way handshake overhead for subsequent requests
  • Especially important for HTTPS (avoids expensive TLS handshake)
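Connection reuse can be observed locally. The following sketch is a demo setup of my own, not part of the curl flow: it starts Python's built-in HTTP server on a loopback port, sends two requests through one `http.client.HTTPConnection`, and records the client's source port each time. The same port for both requests means the same TCP connection was used.

```python
import http.client
import http.server
import threading

class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep connections open

    def do_GET(self):
        body = f"you asked for {self.path}\n".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))  # needed to frame the body
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
local_ports = []
bodies = []
for path in ("/page1.html", "/page2.html"):
    conn.request("GET", path)
    response = conn.getresponse()
    bodies.append(response.read())  # drain the body before reusing the connection
    local_ports.append(conn.sock.getsockname()[1])

conn.close()
server.shutdown()
server.server_close()
print("client source ports:", local_ports)
```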

Closing the Connection

If either side sent Connection: close (or the exchange used HTTP/1.0 without keep-alive), TCP teardown happens:

4-Way Handshake (TCP close):

Client                    Server
  │                         │
  │  FIN (seq=X)            │
  │────────────────────────>│
  │                         │
  │  ACK (ack=X+1)          │
  │<────────────────────────│
  │                         │
  │  FIN (seq=Y)            │
  │<────────────────────────│
  │                         │
  │  ACK (ack=Y+1)          │
  │────────────────────────>│
  │                         │
  │    Connection CLOSED    │
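Each direction of a TCP connection closes independently, which is why teardown takes two FINs. The loopback sketch below is an illustrative setup, not curl's behavior. The client calls `shutdown(SHUT_WR)` to send its FIN, the server sees an empty read and can still reply, and then the server's `close()` sends the second FIN.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)

client = socket.create_connection(server.getsockname())
conn, _ = server.accept()

client.sendall(b"last request")
client.shutdown(socket.SHUT_WR)  # client's FIN: "I have nothing more to send"

data = conn.recv(1024)           # the bytes sent before the FIN
eof = conn.recv(1024)            # b"" means the peer's FIN has arrived

conn.sendall(b"final reply")     # the other direction is still open
conn.close()                     # server's FIN

reply = client.recv(1024)
client_eof = client.recv(1024)   # b"" again: the server's FIN has arrived
client.close()
server.close()

print(data, eof, reply, client_eof)
```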

Capturing This in Action

You can actually see all of this happen!

Using tcpdump/Wireshark

# Capture packets on interface en0 (port 53 is included so the DNS lookup is captured too)
sudo tcpdump -i en0 -w capture.pcap 'host www.example.com or port 53'

# In another terminal
curl http://www.example.com/hello.txt

# Stop tcpdump (Ctrl+C)
# Open capture.pcap in Wireshark to see every packet

What you'll see in Wireshark:

No.  Time    Source           Dest             Protocol  Info
1    0.000   192.168.1.100    8.8.8.8          DNS       Standard query A www.example.com
2    0.015   8.8.8.8          192.168.1.100    DNS       Standard query response A 93.184.216.34
3    0.016   192.168.1.100    93.184.216.34    TCP       54321 → 80 [SYN]
4    0.045   93.184.216.34    192.168.1.100    TCP       80 → 54321 [SYN, ACK]
5    0.045   192.168.1.100    93.184.216.34    TCP       54321 → 80 [ACK]
6    0.046   192.168.1.100    93.184.216.34    HTTP      GET /hello.txt HTTP/1.1
7    0.085   93.184.216.34    192.168.1.100    TCP       80 → 54321 [ACK]
8    0.120   93.184.216.34    192.168.1.100    HTTP      HTTP/1.1 200 OK (text/plain)
9    0.120   192.168.1.100    93.184.216.34    TCP       54321 → 80 [ACK]

Breaking it down:

  • Packets 1-2: DNS query/response
  • Packets 3-5: TCP 3-way handshake
  • Packet 6: HTTP GET request
  • Packet 7: Server ACKs the request
  • Packet 8: HTTP 200 response
  • Packet 9: Client ACKs the response

Using strace to See System Calls

strace -e trace=socket,connect,sendto,recvfrom,close curl http://www.example.com/hello.txt

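Output (abridged to the calls on the HTTP connection; a real trace also shows DNS-related sockets and other close() calls):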

socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("93.184.216.34")}, 16) = 0
sendto(3, "GET /hello.txt HTTP/1.1\r\nHost: "..., 147, MSG_NOSIGNAL, NULL, 0) = 147
recvfrom(3, "HTTP/1.1 200 OK\r\nDate: Mon, 27 "..., 16384, 0, NULL, NULL) = 326
close(3) = 0

This shows:

  1. socket(): Create TCP socket (fd=3)
  2. connect(): Establish connection (3-way handshake happens here)
  3. sendto(): Send 147 bytes (our HTTP request)
  4. recvfrom(): Receive 326 bytes (HTTP response)
  5. close(): Close socket (4-way teardown)

Timing Breakdown

For a typical HTTP request to a nearby server:

Activity                        Time
────────────────────────────────────────
DNS lookup (cached)             ~0 ms
DNS lookup (uncached)           ~20-50 ms
TCP handshake (3-way)           ~30-100 ms (RTT)
TLS handshake (HTTPS)           ~60-200 ms (2 RTTs with TLS 1.2; 1 RTT with TLS 1.3)
HTTP request sent               ~1 ms
Server processing               ~10-100 ms
HTTP response received          ~30 ms
Total (HTTP, no TLS)            ~70-280 ms
Total (HTTPS, first time)       ~130-480 ms

RTT (Round Trip Time): Time for packet to go from client → server → client

For a server on the other side of the world, RTT might be 200-300ms, making the handshakes very expensive.
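A back-of-the-envelope model shows why RTT dominates. The function below is a rough estimate under simplifying assumptions (one round trip for the TCP handshake, one for the request and response, a fixed server processing time, and no transfer time for large bodies); `estimate_latency_ms` and its default values are made up for illustration.

```python
def estimate_latency_ms(rtt, dns=0, server=50, tls_round_trips=0):
    """Rough time until the response arrives; every handshake costs whole round trips."""
    tcp_handshake = rtt                    # SYN out, SYN-ACK back
    tls_handshake = tls_round_trips * rtt  # 0 for plain HTTP
    request_response = rtt + server        # request out, processing, response back
    return dns + tcp_handshake + tls_handshake + request_response

for label, rtt in (("nearby server", 30), ("other side of the world", 250)):
    plain = estimate_latency_ms(rtt)
    https = estimate_latency_ms(rtt, tls_round_trips=2)
    print(f"{label}: HTTP ~{plain} ms, HTTPS ~{https} ms")
```

With these assumptions the distant server costs roughly a second for a first HTTPS request, almost all of it spent waiting on round trips rather than on server work.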

HTTPS: What Changes?

For HTTPS (curl https://www.example.com/hello.txt):

  1. DNS resolution: Same
  2. TCP connection: Same (port 443 instead of 80)
  3. TLS handshake: NEW! Happens before HTTP
  4. HTTP request/response: Encrypted in TLS records

TLS Handshake (Simplified)

Client                          Server
  │                               │
  │  ClientHello                  │
  │  (supported ciphers, random)  │
  │──────────────────────────────>│
  │                               │
  │  ServerHello                  │
  │  (chosen cipher, random,      │
  │   certificate)                │
  │<──────────────────────────────│
  │                               │
  │  ClientKeyExchange            │
  │  (encrypted pre-master secret)│
  │──────────────────────────────>│
  │                               │
  │  ChangeCipherSpec             │
  │  Finished (encrypted)         │
  │──────────────────────────────>│
  │                               │
  │  ChangeCipherSpec             │
  │  Finished (encrypted)         │
  │<──────────────────────────────│
  │                               │
  │  Encrypted HTTP traffic       │
  │<─────────────────────────────>│

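After the TLS handshake, all HTTP data is encrypted before being passed to TCP. The flow shown is the classic TLS 1.2 handshake with RSA key exchange; TLS 1.3 uses a shorter handshake that completes in one round trip.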
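In code, the extra step is a thin wrapper around the TCP socket. This is a minimal sketch using Python's standard `ssl` module, not what curl does internally; `make_context` and `fetch_https` are illustrative names, and the function is only defined here, not called, because it needs network access.

```python
import socket
import ssl

def make_context():
    """Default client context: verifies the certificate chain and the hostname."""
    return ssl.create_default_context()

def fetch_https(host, path="/", port=443, timeout=10):
    """Send one GET over TLS and return the raw (decrypted) response bytes."""
    request = (f"GET {path} HTTP/1.1\r\n"
               f"Host: {host}\r\n"
               "Connection: close\r\n"
               "\r\n").encode("ascii")
    # create_connection() performs the TCP handshake
    with socket.create_connection((host, port), timeout=timeout) as tcp:
        # wrap_socket() performs the TLS handshake on top of it
        with make_context().wrap_socket(tcp, server_hostname=host) as tls:
            tls.sendall(request)        # encrypted before it reaches TCP
            chunks = []
            while True:
                chunk = tls.recv(4096)
                if not chunk:           # server closed the connection
                    break
                chunks.append(chunk)
    return b"".join(chunks)

context = make_context()
print(context.verify_mode, context.check_hostname)
```

Calling `fetch_https("www.example.com", "/hello.txt")` on a machine with network access would run the whole sequence: DNS, TCP handshake, TLS handshake, then the encrypted request and response.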

Summary: The Full Journey

  1. You type: curl http://www.example.com/hello.txt
  2. curl does: Parses URL, constructs HTTP request in memory
  3. DNS resolution: www.example.com → 93.184.216.34
  4. TCP connection: socket() → file descriptor, connect() → 3-way handshake
  5. send() HTTP request: Application → kernel buffer → TCP segment → IP packet → Ethernet frame → electrical signals
  6. Routing: Through multiple routers/switches to destination
  7. Server receives: NIC → interrupt → kernel → socket receive buffer → server application reads via recv()
  8. Server processes: Parses HTTP request, reads file, constructs HTTP response
  9. send() HTTP response: Same process in reverse
  10. Client receives: recv() returns response bytes, curl parses response, displays to terminal
  11. Connection management: Keep-alive stays open, or close() triggers 4-way teardown

Key Insights

  1. Layering is everything: Each layer adds headers, does its job, and passes to the next layer. Each layer treats what it carries as an opaque payload and relies only on the service the layer below provides.

  2. TCP handles reliability: HTTP doesn't worry about lost packets, retransmission, ordering. TCP handles all that.

  3. Stateless HTTP on stateful TCP: HTTP is stateless (each request independent), but it runs on TCP (which maintains connection state).

  4. The network is just dumb pipes: Routers forward based on IP addresses. They don't know or care about HTTP.

  5. Everything is bytes: At the end of the day, it's all just bytes flowing through wires/air/fiber. The structure (HTTP, TCP, IP, Ethernet) is just how we organize those bytes.

Tools worth trying:

  • tcpdump: Command-line packet capture
  • Wireshark: GUI packet analyzer
  • strace/dtrace: System call tracing
  • netstat/ss: View open connections
  • curl -v: Verbose mode shows HTTP headers
  • nc (netcat): Manually send/receive TCP data

Try this experiment:

# Start a simple HTTP server
python3 -m http.server 8000

# In another terminal, make a manual HTTP request
nc localhost 8000
GET / HTTP/1.1
Host: localhost

[press Enter twice]

You'll see the raw HTTP response! This really drives home that HTTP/1.1 is just text over TCP.