HTTP/1.1 Explained: A Digestible Guide to RFC 7230
RFC 7230 defines the message syntax and routing for HTTP/1.1. It's dense and formal, so this guide distills the most important concepts for practical use.
Core Philosophy
HTTP is a stateless request/response protocol. Each request is independent and self-contained. The protocol is designed to:
- Be implementation-agnostic (clients and servers can vary wildly in complexity)
- Hide service implementation details behind a uniform interface
- Work through intermediaries (proxies, caches, gateways)
- Support incremental processing and pipelining
Key insight: HTTP defines the syntax and expected behavior of messages, but not what happens behind the interface. A server receiving a DELETE request might do nothing, might delete the resource, or might do something entirely different. HTTP only defines the message format and what the response should indicate.
Message Structure
Every HTTP/1.1 message has the same basic structure:
start-line CRLF
*(header-field CRLF)
CRLF
[message-body]
That's it: start line, headers, blank line, optional body.
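As a rough illustration of that layout, here is a minimal Python sketch that splits a raw message at the blank line. The function name `split_message` is made up for this guide, and the sketch assumes the whole header section is already in memory; it does not decide how long the body is.

```python
def split_message(raw: bytes):
    """Split a raw HTTP/1.1 message into (start_line, header_lines, rest).

    `rest` is everything after the blank line; how much of it is body
    depends on the framing rules covered later in this guide.
    """
    head, sep, rest = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete message: no blank line after headers")
    start_line, *header_lines = head.split(b"\r\n")
    return start_line, header_lines, rest
```

Working on bytes rather than decoded strings keeps the parser honest about what is actually on the wire.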
Request Format
GET /hello.txt HTTP/1.1
Host: www.example.com
User-Agent: curl/7.16.3
Accept-Language: en, mi
Structure:
- Request line: method SP request-target SP HTTP-version CRLF
- Headers: Zero or more header fields
- Blank line: Indicates end of headers
- Body: Optional (common for POST, PUT, PATCH)
Important details:
- The method is case-sensitive (GET vs get)
- No whitespace allowed before the first header field (security issue!)
- Recommended minimum request-line length support: 8000 octets
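A strict request-line parser along these lines might look like the sketch below. The name `parse_request_line` and the choice to use 8000 octets as a hard cap are assumptions for illustration; a real server can accept longer lines.

```python
import re

# Characters allowed in a method token (RFC 7230 section 3.2.6).
_TOKEN = re.compile(rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
_VERSION = re.compile(rb"HTTP/\d\.\d")
MAX_REQUEST_LINE = 8000  # assumed cap; the RFC only recommends supporting at least this


def parse_request_line(line: bytes):
    """Parse a request line (without its CRLF) into (method, target, version)."""
    if len(line) > MAX_REQUEST_LINE:
        raise ValueError("request line too long")  # a server would answer 414
    parts = line.split(b" ")
    if len(parts) != 3:
        raise ValueError("expected exactly two single spaces in the request line")
    method, target, version = parts
    if not _TOKEN.fullmatch(method):
        raise ValueError("invalid method token")
    if not target:
        raise ValueError("empty request-target")
    if not _VERSION.fullmatch(version):
        raise ValueError("invalid HTTP version")
    # The method is returned as received: b"get" is a different method from b"GET".
    return method, target, version
```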
Response Format
HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 12:28:53 GMT
Server: Apache
Content-Length: 51
Content-Type: text/plain
Hello World! My payload includes a trailing CRLF.
Structure:
- Status line: HTTP-version SP status-code SP reason-phrase CRLF
- Headers: Zero or more header fields
- Blank line: Indicates end of headers
- Body: Optional
Note: The reason-phrase (e.g., "OK") is purely informational. Clients SHOULD ignore it and only use the status code.
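One way to make that habit concrete is a status-line parser that drops the reason-phrase on the floor. This is a sketch with a hypothetical name (`parse_status_line`), not a complete response parser.

```python
import re

# HTTP-version SP 3-digit status-code SP reason-phrase (possibly empty)
_STATUS_LINE = re.compile(rb"(HTTP/\d\.\d) (\d{3}) (.*)")


def parse_status_line(line: bytes):
    """Parse a status line (without its CRLF) into (version, status_code)."""
    match = _STATUS_LINE.fullmatch(line)
    if match is None:
        raise ValueError("malformed status line")
    version, code, _reason = match.groups()  # reason-phrase is deliberately unused
    return version, int(code)
```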
Header Fields
Headers follow the format:
field-name ":" OWS field-value OWS
Where OWS is optional whitespace.
Key Rules
Field names are case-insensitive:
Content-Type == content-type == CONTENT-TYPE
Field order doesn't matter (except for duplicate fields)
Multiple fields with the same name can be combined:
Accept: text/html
Accept: application/json
Is equivalent to:
Accept: text/html, application/json
Exception: Set-Cookie violates this rule and must be handled specially. Don't combine multiple Set-Cookie headers.
Proxies MUST forward unknown headers unless they're listed in the Connection header.
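The combining rule and its Set-Cookie exception can be sketched in a few lines of Python. The function name `combine_fields` and the choice to return Set-Cookie values as a separate list are assumptions made for this example.

```python
def combine_fields(fields):
    """Fold repeated header fields into one comma-separated value each.

    `fields` is a list of (name, value) string pairs in received order.
    Returns (combined, set_cookies): a dict keyed by lowercase field name,
    plus the Set-Cookie values kept separate because they must not be joined.
    """
    combined = {}
    set_cookies = []
    for name, value in fields:
        key = name.lower()  # field names are case-insensitive
        if key == "set-cookie":
            set_cookies.append(value)
        elif key in combined:
            combined[key] += ", " + value  # received order is preserved
        else:
            combined[key] = value
    return combined, set_cookies
```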
Message Body and Transfer Encoding
The tricky part of HTTP parsing is figuring out where the message body ends. There are several mechanisms:
1. Content-Length
Content-Length: 51
Simple: the body is exactly 51 bytes. The recipient reads 51 bytes and stops.
Rules:
- If present, Content-Length MUST match the actual body length
- Sending both Content-Length and Transfer-Encoding is an error (Transfer-Encoding wins)
- Multiple Content-Length headers with different values are an error (must reject)
2. Chunked Transfer Encoding
Used when you don't know the content size in advance (e.g., streaming responses).
Transfer-Encoding: chunked
5\r\n
Hello\r\n
5\r\n
World\r\n
0\r\n
\r\n
Each chunk format:
chunk-size (in hex) CRLF
chunk-data CRLF
The final chunk is always 0\r\n. With no trailers, the message therefore ends with 0\r\n\r\n.
Why it matters:
- Allows streaming without knowing total size
- Enables keep-alive connections (can send multiple requests/responses without closing)
- Can include trailer headers after the final chunk
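To see the chunk format in action, here is a minimal decoder sketch. It assumes the entire chunked body is already buffered in memory, ignores chunk extensions, and returns trailer lines unparsed; `decode_chunked` is a name invented for this guide.

```python
import re

_HEX = re.compile(rb"[0-9A-Fa-f]+")


def decode_chunked(data: bytes):
    """Decode a complete chunked body into (body, trailer_lines)."""
    body = bytearray()
    pos = 0
    while True:
        eol = data.find(b"\r\n", pos)
        if eol == -1:
            raise ValueError("truncated chunk-size line")
        size_field = data[pos:eol].split(b";", 1)[0]  # drop any chunk extension
        if not _HEX.fullmatch(size_field):
            raise ValueError("invalid chunk size")
        size = int(size_field, 16)
        pos = eol + 2
        if size == 0:
            break  # last chunk reached
        chunk = data[pos:pos + size]
        if len(chunk) != size or data[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("truncated or malformed chunk")
        body += chunk
        pos += size + 2
    # Trailer section: zero or more header lines, then an empty line.
    trailers = []
    while True:
        eol = data.find(b"\r\n", pos)
        if eol == -1:
            raise ValueError("truncated trailer section")
        line = data[pos:eol]
        pos = eol + 2
        if not line:
            break
        trailers.append(line)
    return bytes(body), trailers
```

The size check uses a strict hex pattern on purpose, since Python's `int()` alone would also accept forms such as a leading `+` or `0x`.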
3. Connection Close
If neither Content-Length nor Transfer-Encoding is present, the body of a response ends when the connection closes. This only works for responses: a client that closed the connection to end its request could never receive the reply.
4. No Body
Some messages never have a body:
- All 1xx (Informational) responses
- 204 (No Content)
- 304 (Not Modified)
- All responses to HEAD requests
- CONNECT responses with 2xx status
Determining Message Body Length
RFC 7230 Section 3.3.3 defines the precedence order:
- Responses to HEAD, and responses with a 1xx, 204, or 304 status: No body, ever
- 2xx responses to CONNECT: No body; the connection becomes a tunnel
- Transfer-Encoding present: Use chunked framing if the final encoding is "chunked"; otherwise a response is read until the connection closes and a request is rejected with 400
- Multiple Content-Length with different values, or an invalid value: Reject as invalid
- Content-Length present: Read exactly that many bytes
- Request with neither header: No message body
- Response with neither header: Read until the server closes the connection
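Critical security note: A sender MUST NOT send both Content-Length and Transfer-Encoding, because recipients that disagree about which one applies enable request smuggling attacks. If both are received, Transfer-Encoding overrides Content-Length; the RFC says such a message ought to be handled as an error, and an intermediary MUST remove the Content-Length before forwarding it.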
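The precedence rules of Section 3.3.3 translate fairly directly into code. The sketch below is one possible reading, not a reference implementation: the function name `body_framing`, the return tuples, and the `headers` shape (lowercase field name mapped to a list of values) are all assumptions for illustration.

```python
def body_framing(is_response, headers, status=None, request_method=None):
    """Decide how a message body is delimited.

    Returns ("none", 0), ("chunked", None), ("length", n) or ("until-close", None).
    For responses, `status` is required and `request_method` is the method of
    the request being answered.
    """
    if is_response:
        if request_method == "HEAD" or status in (204, 304) or 100 <= status < 200:
            return ("none", 0)
        if request_method == "CONNECT" and 200 <= status < 300:
            return ("none", 0)  # the connection becomes a tunnel instead

    transfer_encoding = headers.get("transfer-encoding")
    if transfer_encoding:
        codings = [c.strip().lower()
                   for value in transfer_encoding
                   for c in value.split(",") if c.strip()]
        if codings and codings[-1] == "chunked":
            return ("chunked", None)  # any Content-Length is overridden
        if is_response:
            return ("until-close", None)
        raise ValueError("request Transfer-Encoding must end in chunked")  # 400

    content_length = headers.get("content-length")
    if content_length:
        values = {v.strip() for raw in content_length for v in raw.split(",")}
        if len(values) != 1:
            raise ValueError("conflicting Content-Length values")
        (value,) = values
        if not (value.isascii() and value.isdigit()):
            raise ValueError("invalid Content-Length")
        return ("length", int(value))

    return ("until-close", None) if is_response else ("none", 0)
```

A stricter deployment could raise as soon as both headers appear rather than letting Transfer-Encoding win.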
Connection Management
Persistent Connections
HTTP/1.1 defaults to keep-alive (persistent) connections:
Connection: keep-alive
This is actually the default in HTTP/1.1, so you don't need to send it. To close:
Connection: close
Benefits:
- Reduces TCP handshake overhead
- Enables request pipelining
- Better for TLS (expensive handshake)
Requirements:
- Server MUST send Content-Length or use chunked encoding
- Can't rely on connection close to delimit messages
Pipelining
Clients can send multiple requests without waiting for responses:
Request 1 →
Request 2 →
Request 3 →
← Response 1
← Response 2
← Response 3
Requirements:
- Responses MUST come back in the same order
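- Idempotent methods (GET, HEAD, PUT, DELETE) suit pipelining because they can be retried automatically if the connection fails
- After a non-idempotent method (POST), clients should not pipeline further requests until its final response arrives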
Reality check: Pipelining is rarely used in practice due to implementation bugs and head-of-line blocking issues. HTTP/2 addresses this better with multiplexing.
Connection Header
The Connection header has special meaning:
Connection: close, X-Custom-Header
This means:
- Close the connection after this message
- Remove the X-Custom-Header field before forwarding (hop-by-hop header)
Common hop-by-hop headers:
- Connection
- Keep-Alive
- Proxy-Authenticate
- Proxy-Authorization
- TE
- Trailer
- Transfer-Encoding
- Upgrade
All other headers are end-to-end and must be forwarded by proxies.
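A proxy's filtering step could be sketched as follows. The helper name `strip_hop_by_hop` is hypothetical, the set uses the registered field name Trailer, and the sketch only covers removal: a real proxy would also re-frame the message and add its own Connection options where needed.

```python
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}


def strip_hop_by_hop(fields):
    """Return the (name, value) pairs a proxy may forward.

    Drops the well-known hop-by-hop fields plus every field named
    as an option in a received Connection header.
    """
    listed = set()
    for name, value in fields:
        if name.lower() == "connection":
            listed.update(opt.strip().lower() for opt in value.split(",") if opt.strip())
    drop = HOP_BY_HOP | listed
    return [(name, value) for name, value in fields if name.lower() not in drop]
```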
Intermediaries: Proxies, Gateways, Tunnels
HTTP allows intermediaries between client and server:
UA ===== Proxy ===== Gateway ===== Origin Server
Proxy
Forwards requests on behalf of clients. Can modify requests/responses (with restrictions).
Gateway
Translates between HTTP and other protocols (e.g., HTTP to FTP).
Tunnel
Relays messages without examining them. Established with CONNECT method:
CONNECT server.example.com:443 HTTP/1.1
Host: server.example.com:443
Via Header
Each proxy MUST add itself to the Via header of every message it forwards; gateways MUST do the same for requests:
Via: 1.1 proxy1.example.com, 1.1 proxy2.example.com
Format: protocol-version received-by [comment]
Purpose:
- Track message routing
- Detect loops
- Identify protocol capabilities of intermediaries
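Here is a small sketch of how an intermediary might append its own entry and use the existing entries for loop detection. Both function names are invented for this guide, and the comma split is a simplification that would mis-handle a Via comment containing a comma.

```python
def via_hosts(fields):
    """Collect the received-by part of every Via entry, lowercased."""
    hosts = []
    for name, value in fields:
        if name.lower() == "via":
            for entry in value.split(","):
                parts = entry.split()  # protocol-version, received-by, [comment]
                if len(parts) >= 2:
                    hosts.append(parts[1].lower())
    return hosts


def forward_with_via(fields, received_version, own_name):
    """Return a new field list with this hop appended to Via.

    Raises ValueError if `own_name` already appears, which suggests a loop.
    """
    if own_name.lower() in via_hosts(fields):
        raise ValueError("forwarding loop detected")
    # A new Via line after the existing ones is equivalent to
    # appending to the comma-separated list.
    return fields + [("Via", f"{received_version} {own_name}")]
```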
Request Targets
The request-target can take four forms:
1. origin-form (most common)
GET /path/to/resource?query=value HTTP/1.1
Host: www.example.com
2. absolute-form (used with proxies)
GET http://www.example.com/path HTTP/1.1
3. authority-form (CONNECT only)
CONNECT www.example.com:443 HTTP/1.1
4. asterisk-form (OPTIONS only)
OPTIONS * HTTP/1.1
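Telling the four forms apart is mostly a matter of looking at the method and the first few characters. The classifier below is a heuristic sketch (the name `classify_target` is made up, and the scheme and port checks are much looser than the real URI grammar).

```python
def classify_target(method: str, target: str) -> str:
    """Name the request-target form, or raise ValueError if none fits."""
    if method == "CONNECT":
        host, sep, port = target.rpartition(":")
        if sep and host and port.isdigit():
            return "authority-form"
        raise ValueError("CONNECT requires a host:port target")
    if target == "*":
        if method != "OPTIONS":
            raise ValueError("asterisk-form is only valid with OPTIONS")
        return "asterisk-form"
    if target.startswith("/"):
        return "origin-form"
    scheme, sep, rest = target.partition("://")
    if sep and scheme.isalpha() and rest:
        return "absolute-form"  # simplified: real schemes allow more characters
    raise ValueError("unrecognized request-target")
```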
Security Considerations
1. Request Smuggling
Occurs when front-end and back-end disagree about message boundaries:
POST / HTTP/1.1
Host: example.com
Content-Length: 13
Transfer-Encoding: chunked
0
GET /admin HTTP/1.1
If front-end uses Content-Length and back-end uses Transfer-Encoding, the second request might be smuggled.
Prevention:
- Reject messages with both Content-Length and Transfer-Encoding
- Normalize message framing at boundaries
- Be strict about whitespace and line terminators
2. Response Splitting
Injecting CRLF into headers to create fake responses:
HTTP/1.1 302 Found
Location: http://evil.com/\r\n\r\nHTTP/1.1 200 OK\r\n...
Prevention:
- Validate and sanitize all header values
- Reject embedded CRLF sequences
- Use proper URL encoding
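On the sending side, the cheapest defense is to refuse any header value containing a line terminator before it reaches the socket. A minimal sketch, with a hypothetical helper name:

```python
def checked_header_value(value: str) -> str:
    """Return `value` unchanged, or raise if it could split the header section."""
    if any(ch in value for ch in ("\r", "\n", "\0")):
        raise ValueError("CR, LF or NUL in header value")
    return value
```

Rejecting is generally safer than stripping, because silently repairing attacker-controlled input tends to hide the bug that let it through.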
3. Header Injection
Similar to response splitting but for requests.
Prevention:
- Strict parsing of request lines
- Reject URLs with encoded CRLF
- Don't auto-correct invalid requests
4. Protocol Element Length Attacks
Sending extremely long:
- Request lines (414 URI Too Long)
- Header fields (413 Payload Too Large or 431 Request Header Fields Too Large)
- Chunk sizes
- Message bodies
Prevention:
- Enforce reasonable limits
- RFC recommends 8000 octet minimum for request-line length
- Set practical limits on header count and size
Practical Gotchas
1. The Host Header is Mandatory
GET /index.html HTTP/1.1
Host: www.example.com
Without Host, the server MUST return 400 Bad Request. This enables virtual hosting.
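A server-side check might look like this sketch. The name `host_error` is an assumption, and the function only counts Host fields; it does not validate the field value, which the RFC also requires.

```python
def host_error(version: str, fields):
    """Return 400 if an HTTP/1.1 request breaks the Host rule, else None.

    `fields` is a list of (name, value) pairs. The RFC requires 400 for a
    missing Host and also for more than one Host field.
    """
    if version != "HTTP/1.1":
        return None
    hosts = [value for name, value in fields if name.lower() == "host"]
    return None if len(hosts) == 1 else 400
```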
2. Whitespace Before First Header is an Attack
GET / HTTP/1.1
 Host: example.com
That leading space before "Host" is a security violation. Reject the message or ignore those lines.
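The check itself is tiny. This sketch takes the reject option; the function name is invented, and `header_lines` is assumed to be the list of raw lines that follow the start-line.

```python
def reject_leading_whitespace(header_lines):
    """Raise if the first line after the start-line begins with SP or HTAB."""
    if header_lines and header_lines[0][:1] in (b" ", b"\t"):
        raise ValueError("whitespace between start-line and first header field")
    return header_lines
```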
3. Reason Phrases are Meaningless
HTTP/1.1 200 This is totally an error, ignore the 200
Clients must only look at the status code (200), not the reason phrase.
4. Set-Cookie is Special
Don't combine multiple Set-Cookie headers:
Set-Cookie: session=abc
Set-Cookie: theme=dark
Cannot be combined into:
Set-Cookie: session=abc, theme=dark
This breaks cookies.
5. Transfer-Encoding Beats Content-Length
If both present:
Content-Length: 100
Transfer-Encoding: chunked
Use chunked, ignore Content-Length, or better yet, reject the message.
6. Methods are Case-Sensitive
GET works. get doesn't (should return 501 Not Implemented).
Summary: The Essentials
If you remember nothing else:
- Message format: start-line, headers, blank line, optional body
- Determine body length: Transfer-Encoding, then Content-Length, then connection close
- Security: Never accept both Transfer-Encoding and Content-Length
- Connections: HTTP/1.1 defaults to persistent (keep-alive)
- Parsing: Parse as bytes (octets), not strings
- Host header: Required in HTTP/1.1
- Chunked encoding: Must support it
- Intermediaries: Use Via header, forward unknown headers unless in Connection
- Status codes matter: Ignore reason phrases
- Whitespace: Matters a lot - reject whitespace before first header
Additional Resources
- RFC 7230: Message Syntax and Routing (this document)
- RFC 7231: Semantics and Content (methods, status codes, headers)
- RFC 7232: Conditional Requests (ETag, If-Modified-Since, etc.)
- RFC 7233: Range Requests (partial content)
- RFC 7234: Caching
- RFC 7235: Authentication
These six RFCs together define HTTP/1.1. RFC 7230 (this one) is the foundation - it defines the wire format.