HTTP/1.1 Explained: A Digestible Guide to RFC 7230

Written by Claude

RFC 7230 defines the message syntax and routing for HTTP/1.1. It's dense and formal, so this guide distills the most important concepts for practical use.

Core Philosophy

HTTP is a stateless request/response protocol. Each request is independent and self-contained. The protocol is designed to:

  • Be implementation-agnostic (clients and servers can vary wildly in complexity)
  • Hide service implementation details behind a uniform interface
  • Work through intermediaries (proxies, caches, gateways)
  • Support incremental processing and pipelining

Key insight: HTTP defines the syntax and expected behavior of messages, but not what happens behind the interface. A server receiving a DELETE request might do nothing, might delete the resource, or might do something entirely different. HTTP only defines the message format and what the response should indicate.

Message Structure

Every HTTP/1.1 message has the same basic structure:

start-line CRLF
*(header-field CRLF)
CRLF
[message-body]

That's it: start line, headers, blank line, optional body.
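
As a minimal sketch of that structure, the following Python splits a raw message at the first blank line. The function name `split_message` is hypothetical, and the sketch assumes the whole message head is already in memory as bytes.

```python
def split_message(raw: bytes):
    """Split a raw HTTP/1.1 message into (start_line, header_lines, body)."""
    # The first empty line (CRLF CRLF) separates the head from the body.
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete message: no blank line after headers")
    # The first line of the head is the start-line; the rest are header fields.
    start_line, *header_lines = head.split(b"\r\n")
    return start_line, header_lines, body


raw = (
    b"GET /hello.txt HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"\r\n"
)
start_line, header_lines, body = split_message(raw)
```

Whether `body` is complete still depends on the framing rules covered later; this only locates where the body starts.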

Request Format

GET /hello.txt HTTP/1.1
Host: www.example.com
User-Agent: curl/7.16.3
Accept-Language: en, mi

Structure:

  • Request line: method SP request-target SP HTTP-version CRLF
  • Headers: Zero or more header fields
  • Blank line: Indicates end of headers
  • Body: Optional (common for POST, PUT, PATCH)

Important details:

  • The method is case-sensitive (GET vs get)
  • No whitespace is allowed between the request line and the first header field (security issue!)
  • Recommended minimum request-line length support: 8000 octets
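
A request-line parser along these lines might look like the sketch below. The names `parse_request_line` and `MAX_REQUEST_LINE` are hypothetical, and using 8000 as the hard limit is an assumption; a real server may accept longer lines.

```python
import re

# Characters allowed in a method token (RFC 7230 section 3.2.6).
TOKEN = re.compile(rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
MAX_REQUEST_LINE = 8000  # assumed limit for this sketch


def parse_request_line(line: bytes):
    """Parse 'method SP request-target SP HTTP-version' (without the CRLF)."""
    if len(line) > MAX_REQUEST_LINE:
        raise ValueError("request line too long")
    parts = line.split(b" ")
    if len(parts) != 3:
        raise ValueError("expected: method SP request-target SP HTTP-version")
    method, target, version = parts
    if not TOKEN.fullmatch(method):
        raise ValueError("method is not a valid token")
    if not target:
        raise ValueError("empty request-target")
    if not re.fullmatch(rb"HTTP/\d\.\d", version):
        raise ValueError("malformed HTTP-version")
    # The method is returned untouched: b"get" is not b"GET".
    return method, target, version
```

The parser deliberately does no case folding or whitespace repair, since auto-correcting malformed request lines is how parsers end up disagreeing with each other.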

Response Format

HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 12:28:53 GMT
Server: Apache
Content-Length: 51
Content-Type: text/plain

Hello World! My payload includes a trailing CRLF.

Structure:

  • Status line: HTTP-version SP status-code SP reason-phrase CRLF
  • Headers: Zero or more header fields
  • Blank line: Indicates end of headers
  • Body: Optional

Note: The reason-phrase (e.g., "OK") is purely informational. Clients SHOULD ignore it and only use the status code.
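
A sketch of status-line parsing that treats the reason-phrase as opaque text follows. The function name `parse_status_line` is hypothetical.

```python
def parse_status_line(line: bytes):
    """Parse 'HTTP-version SP status-code SP reason-phrase' (without the CRLF)."""
    # Split at most twice: the reason-phrase may itself contain spaces.
    parts = line.split(b" ", 2)
    if len(parts) < 2:
        raise ValueError("malformed status line")
    version, code = parts[0], parts[1]
    reason = parts[2] if len(parts) == 3 else b""
    if len(code) != 3 or not code.isdigit():
        raise ValueError("status-code must be exactly three digits")
    # Callers should branch on the integer code only; reason is informational.
    return version, int(code), reason
```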

Header Fields

Headers follow the format:

field-name ":" OWS field-value OWS

Where OWS is optional whitespace. No whitespace is allowed between the field name and the colon.

Key Rules

  1. Field names are case-insensitive: Content-Type == content-type == CONTENT-TYPE

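  2. The order of fields with different names doesn't matter; the order of repeated fields with the same name does

  3. Multiple fields with the same name can be combined, provided the field is defined as a comma-separated list: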

    Accept: text/html
    Accept: application/json
    

    Is equivalent to:

    Accept: text/html, application/json
    
  4. Exception: Set-Cookie violates this rule and must be handled specially. Don't combine multiple Set-Cookie headers.

  5. Proxies MUST forward unknown headers unless they're listed in the Connection header.
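
The combining rules above can be sketched as follows. The function name `combine_headers` and the choice to represent headers as a list of `(name, value)` string pairs are assumptions made for illustration.

```python
from collections import defaultdict


def combine_headers(fields):
    """Fold repeated header fields; fields is a list of (name, value) pairs."""
    grouped = defaultdict(list)
    for name, value in fields:
        # Field names are case-insensitive, so normalize before grouping.
        grouped[name.lower()].append(value.strip(" \t"))
    combined = {}
    for name, values in grouped.items():
        if name == "set-cookie":
            # Set-Cookie values may contain commas; keep them separate.
            combined[name] = values
        else:
            # Received order is preserved when joining.
            combined[name] = ", ".join(values)
    return combined


headers = combine_headers([
    ("Accept", "text/html"),
    ("ACCEPT", "application/json"),
    ("Set-Cookie", "session=abc"),
    ("Set-Cookie", "theme=dark"),
])
```

Note that this joins every repeated field except Set-Cookie, which is only correct for fields defined as comma-separated lists; a stricter version would check the field name against a known set.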

Message Body and Transfer Encoding

The tricky part of HTTP parsing is figuring out where the message body ends. There are several mechanisms:

1. Content-Length

Content-Length: 51

Simple: the body is exactly 51 bytes. The recipient reads 51 bytes and stops.

Rules:

  • If present, Content-Length MUST match the actual body length
  • A sender MUST NOT send both Content-Length and Transfer-Encoding; if a recipient gets both, Transfer-Encoding wins
  • Multiple Content-Length headers with different values are an error (the message must be rejected)

2. Chunked Transfer Encoding

Used when you don't know the content size in advance (e.g., streaming responses).

Transfer-Encoding: chunked

5\r\n
Hello\r\n
6\r\n
 World\r\n
0\r\n
\r\n

Each chunk format:

chunk-size (in hex) CRLF
chunk-data CRLF

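The last chunk always has size 0 (0\r\n). It is followed by optional trailer fields and a final CRLF, so a body without trailers ends in 0\r\n\r\n.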

Why it matters:

  • Allows streaming without knowing total size
  • Enables keep-alive connections (can send multiple requests/responses without closing)
  • Can include trailer headers after the final chunk
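
A minimal sketch of decoding a chunked body is shown below. The function name `decode_chunked` is hypothetical, and the sketch assumes the full body is already buffered; it ignores chunk extensions and does not parse trailer fields.

```python
HEX_DIGITS = b"0123456789abcdefABCDEF"


def decode_chunked(data: bytes) -> bytes:
    """Decode a fully buffered chunked body and return the payload bytes."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)  # raises ValueError if truncated
        # Anything after ";" is a chunk extension, ignored in this sketch.
        size_field = data[pos:line_end].split(b";", 1)[0]
        if not size_field or any(c not in HEX_DIGITS for c in size_field):
            raise ValueError("chunk-size must be hexadecimal digits")
        size = int(size_field, 16)
        pos = line_end + 2
        if size == 0:
            break  # last chunk; trailer fields (if any) and a CRLF follow
        if pos + size > len(data):
            raise ValueError("truncated chunk data")
        body += data[pos:pos + size]
        if data[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk data not terminated by CRLF")
        pos += size + 2
    return bytes(body)
```

The chunk-size is validated strictly as hex digits rather than handed straight to `int`, which would also accept forms such as a leading sign.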

3. Connection Close

If neither Content-Length nor Transfer-Encoding is present, a response body ends when the server closes the connection. This only works for responses: a request without either header is treated as having no body.

4. No Body

Some messages never have a body:

  • All 1xx (Informational) responses
  • 204 (No Content)
  • 304 (Not Modified)
  • All responses to HEAD requests
  • CONNECT responses with 2xx status

Determining Message Body Length

RFC 7230 Section 3.3.3 defines the precedence order:

  1. Responses to HEAD, or with a 1xx, 204, or 304 status: No body, ever
  2. 2xx responses to CONNECT: No body; the connection becomes a tunnel
  3. Transfer-Encoding present: Use chunked framing if the final encoding is "chunked"; otherwise a response is read until the connection closes, and a request is rejected with 400
  4. Multiple Content-Length with different values, or an invalid value: Reject as invalid
  5. Content-Length present: Read exactly that many bytes
  6. Request with none of the above: No message body
  7. Response with none of the above: Read until the server closes the connection

Critical security note: A sender MUST NOT include both Content-Length and Transfer-Encoding, because the combination enables request smuggling attacks. If a recipient gets both, Transfer-Encoding takes precedence, the message ought to be handled as an error, and an intermediary MUST remove the Content-Length before forwarding it.
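
The decision procedure can be sketched as a single function. Everything here is illustrative: the name `body_framing`, the return values, and the header representation (a dict mapping lower-case field names to lists of values) are assumptions. The sketch also takes the strict route of rejecting a message that carries both headers, rather than letting Transfer-Encoding override.

```python
def body_framing(headers, *, is_request, status=None, request_method=None):
    """Return (kind, length), kind being "none", "chunked", "length" or "close".

    For responses, status and request_method (of the request answered) are required.
    """
    if not is_request:
        if (request_method == "HEAD" or 100 <= status < 200
                or status in (204, 304)
                or (request_method == "CONNECT" and 200 <= status < 300)):
            return ("none", None)
    te = headers.get("transfer-encoding", [])
    cl = headers.get("content-length", [])
    if te and cl:
        # Stricter than the RFC's fallback: treat as a smuggling attempt.
        raise ValueError("both Transfer-Encoding and Content-Length present")
    if te:
        codings = [c.strip().lower() for v in te for c in v.split(",")]
        if codings[-1] == "chunked":
            return ("chunked", None)
        if is_request:
            raise ValueError("request Transfer-Encoding must end in chunked")
        return ("close", None)
    if cl:
        values = {c.strip() for v in cl for c in v.split(",")}
        if len(values) != 1:
            raise ValueError("conflicting Content-Length values")
        (value,) = values
        if not (value.isascii() and value.isdigit()):
            raise ValueError("invalid Content-Length")
        return ("length", int(value))
    # No framing headers: requests have no body, responses run until close.
    return ("none", None) if is_request else ("close", None)
```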

Connection Management

Persistent Connections

HTTP/1.1 defaults to keep-alive (persistent) connections:

Connection: keep-alive

This is actually the default in HTTP/1.1, so you don't need to send it. To close:

Connection: close

Benefits:

  • Reduces TCP handshake overhead
  • Enables request pipelining
  • Better for TLS (expensive handshake)

Requirements:

  • Server MUST send Content-Length or use chunked encoding
  • Can't rely on connection close to delimit messages

Pipelining

Clients can send multiple requests without waiting for responses:

Request 1
Request 2
Request 3
Response 1
Response 2
Response 3

Requirements:

  • Responses MUST come back in the same order
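  • Idempotent methods (GET, HEAD, PUT, DELETE) suit pipelining because they can be retried automatically if the connection fails
  • After a non-idempotent method (such as POST), a client should not pipeline further requests until it has received that request's response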

Reality check: Pipelining is rarely used in practice due to implementation bugs and head-of-line blocking issues. HTTP/2 addresses this better with multiplexing.

Connection Header

The Connection header has special meaning:

Connection: close, X-Custom-Header

This means:

  1. Close the connection after this message
  2. Remove the X-Custom-Header field before forwarding (hop-by-hop header)

Common hop-by-hop headers:

  • Connection
  • Keep-Alive
  • Proxy-Authenticate
  • Proxy-Authorization
  • TE
  • Trailer
  • Transfer-Encoding
  • Upgrade

All other headers are end-to-end and must be forwarded by proxies.
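
A proxy's forwarding step might be sketched like this. `HOP_BY_HOP` is an illustrative set based on the list above, and `strip_hop_by_hop` is a hypothetical helper that takes headers as `(name, value)` string pairs.

```python
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}


def strip_hop_by_hop(fields):
    """Return the header fields a proxy would forward."""
    nominated = set()
    for name, value in fields:
        if name.lower() == "connection":
            # Each connection option may name a header to drop.
            nominated.update(
                opt.strip().lower() for opt in value.split(",") if opt.strip()
            )
    drop = HOP_BY_HOP | nominated
    return [(n, v) for n, v in fields if n.lower() not in drop]


forwarded = strip_hop_by_hop([
    ("Host", "example.com"),
    ("Connection", "close, X-Custom-Header"),
    ("X-Custom-Header", "1"),
    ("Accept", "*/*"),
])
```

Options such as `close` end up in the drop set too, which is harmless because no header field carries that name.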

Intermediaries: Proxies, Gateways, Tunnels

HTTP allows intermediaries between client and server:

UA ===== Proxy ===== Gateway ===== Origin Server

Proxy

Forwards requests on behalf of clients. Can modify requests/responses (with restrictions).

Gateway

Translates between HTTP and other protocols (e.g., HTTP to FTP).

Tunnel

Relays messages without examining them. Established with CONNECT method:

CONNECT server.example.com:443 HTTP/1.1
Host: server.example.com:443

Via Header

Each proxy MUST add itself to the Via header of every message it forwards, and a gateway must do the same for the requests it forwards inbound:

Via: 1.1 proxy1.example.com, 1.1 proxy2.example.com

Format: protocol-version received-by [comment]

Purpose:

  • Track message routing
  • Detect loops
  • Identify protocol capabilities of intermediaries
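
Appending a Via entry and using it for loop detection could be sketched as follows. The helper names `add_via` and `seen_in_via` are hypothetical, and the sketch assumes simple entries of the form "version received-by" with no comments.

```python
def add_via(fields, received_version, received_by):
    """Append this hop to Via, folding existing Via fields into one."""
    entry = f"{received_version} {received_by}"
    existing = [v for n, v in fields if n.lower() == "via"]
    others = [(n, v) for n, v in fields if n.lower() != "via"]
    # Order matters: earlier hops stay first, this hop goes last.
    return others + [("Via", ", ".join(existing + [entry]))]


def seen_in_via(fields, received_by):
    """Return True if received_by already appears in Via (a possible loop)."""
    for name, value in fields:
        if name.lower() != "via":
            continue
        for entry in value.split(","):
            parts = entry.split()
            if len(parts) >= 2 and parts[1].lower() == received_by.lower():
                return True
    return False


fields = add_via(
    [("Host", "example.com"), ("Via", "1.1 proxy1.example.com")],
    "1.1", "proxy2.example.com",
)
```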

Request Targets

The request-target can take four forms:

1. origin-form (most common)

GET /path/to/resource?query=value HTTP/1.1
Host: www.example.com

2. absolute-form (used with proxies)

GET http://www.example.com/path HTTP/1.1

3. authority-form (CONNECT only)

CONNECT www.example.com:443 HTTP/1.1

4. asterisk-form (OPTIONS only)

OPTIONS * HTTP/1.1
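
Telling the four forms apart can be sketched with a few checks. The function name `classify_request_target` is hypothetical, and the checks are deliberately rough: they do not validate full URI syntax.

```python
def classify_request_target(method: str, target: str) -> str:
    """Return which of the four request-target forms is in use."""
    if target == "*":
        if method != "OPTIONS":
            raise ValueError("asterisk-form is only valid with OPTIONS")
        return "asterisk-form"
    if method == "CONNECT":
        # authority-form is host:port with no scheme or path.
        host, sep, port = target.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError("CONNECT requires host:port")
        return "authority-form"
    if target.startswith("/"):
        return "origin-form"
    if "://" in target:
        return "absolute-form"
    raise ValueError("unrecognized request-target")
```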

Security Considerations

1. Request Smuggling

Occurs when front-end and back-end disagree about message boundaries:

POST / HTTP/1.1
Host: example.com
Content-Length: 13
Transfer-Encoding: chunked

0

GET /admin HTTP/1.1

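If the front-end frames the message by Content-Length and the back-end by Transfer-Encoding, the back-end sees the body end at the 0 chunk and treats the bytes that follow as the start of a new, smuggled request.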

Prevention:

  • Reject messages with both Content-Length and Transfer-Encoding
  • Normalize message framing at boundaries
  • Be strict about whitespace and line terminators

2. Response Splitting

Injecting CRLF into headers to create fake responses:

HTTP/1.1 302 Found
Location: http://evil.com/\r\n\r\nHTTP/1.1 200 OK\r\n...

Prevention:

  • Validate and sanitize all header values
  • Reject embedded CRLF sequences
  • Use proper URL encoding
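
One way to apply the "reject embedded CRLF" rule is to check every value before it is written into a header. The helper name `safe_header_value` is hypothetical; rejecting NUL as well is an extra precaution assumed here.

```python
def safe_header_value(value: str) -> str:
    """Return value unchanged, or raise if it could break out of its header line."""
    if any(ch in value for ch in ("\r", "\n", "\0")):
        raise ValueError("control character in header value")
    return value


location = safe_header_value("http://example.com/next")
```

Rejecting outright is usually safer than stripping the characters, since a stripped value may still not be what the application intended to send.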

3. Header Injection

Similar to response splitting but for requests.

Prevention:

  • Strict parsing of request lines
  • Reject URLs with encoded CRLF
  • Don't auto-correct invalid requests

4. Protocol Element Length Attacks

Sending extremely long:

  • Request lines (414 URI Too Long)
  • Header fields (a 4xx response, typically 431 Request Header Fields Too Large)
  • Chunk sizes
  • Message bodies

Prevention:

  • Enforce reasonable limits
  • RFC recommends 8000 octet minimum for request-line length
  • Set practical limits on header count and size

Practical Gotchas

1. The Host Header is Mandatory

GET /index.html HTTP/1.1
Host: www.example.com

Without Host, or with more than one Host header, the server MUST return 400 Bad Request. The header is what enables virtual hosting.
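
A server-side check for this rule might look like the sketch below. The function name `host_error` is hypothetical, and it only counts Host fields; it does not validate the value itself.

```python
def host_error(fields):
    """Return 400 if an HTTP/1.1 request lacks exactly one Host field, else None."""
    hosts = [value for name, value in fields if name.lower() == "host"]
    if len(hosts) != 1:
        return 400  # missing or duplicated Host
    return None
```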

2. Whitespace Before First Header is an Attack

GET / HTTP/1.1
 Host: example.com

That leading space before "Host" is a security violation. Reject the message or ignore those lines.
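
A sketch of the "reject" option follows. The function name `check_first_header_line` is hypothetical, and `header_lines` is assumed to be the list of raw header lines (as bytes) that follow the start-line.

```python
def check_first_header_line(header_lines):
    """Raise if whitespace precedes the first header field; else return the lines."""
    if header_lines and header_lines[0][:1] in (b" ", b"\t"):
        raise ValueError("whitespace between start-line and first header field")
    return header_lines
```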

3. Reason Phrases are Meaningless

HTTP/1.1 200 This is totally an error, ignore the 200

Clients should only look at the status code (200), not the reason phrase.

4. Don't Combine Set-Cookie Headers

Multiple Set-Cookie headers such as:

Set-Cookie: session=abc
Set-Cookie: theme=dark

Cannot be combined into:

Set-Cookie: session=abc, theme=dark

This breaks cookies.

5. Transfer-Encoding Beats Content-Length

If both present:

Content-Length: 100
Transfer-Encoding: chunked

Use chunked, ignore Content-Length, or better yet, reject the message.

6. Methods are Case-Sensitive

GET works. get is a different, unrecognized method (the server should return 501 Not Implemented).

Summary: The Essentials

If you remember nothing else:

  1. Message format: start-line, headers, blank line, optional body
  2. Determine body length: Transfer-Encoding, then Content-Length, then connection close
  3. Security: Never accept both Transfer-Encoding and Content-Length
  4. Connections: HTTP/1.1 defaults to persistent (keep-alive)
  5. Parsing: Parse as bytes (octets), not strings
  6. Host header: Required in HTTP/1.1
  7. Chunked encoding: Must support it
  8. Intermediaries: Use Via header, forward unknown headers unless in Connection
  9. Status codes matter: Ignore reason phrases
  10. Whitespace: Matters a lot - reject whitespace before first header

Additional Resources

  • RFC 7230: Message Syntax and Routing (the subject of this guide)
  • RFC 7231: Semantics and Content (methods, status codes, headers)
  • RFC 7232: Conditional Requests (ETag, If-Modified-Since, etc.)
  • RFC 7233: Range Requests (partial content)
  • RFC 7234: Caching
  • RFC 7235: Authentication

These six RFCs together define HTTP/1.1. RFC 7230 is the foundation: it defines the wire format. In 2022 they were obsoleted by RFC 9110 (HTTP Semantics), RFC 9111 (HTTP Caching), and RFC 9112 (HTTP/1.1).