WebSockets vs Polling: Why Real-Time Apps Need a Persistent Connection

If you have ever built a dashboard that refreshes every few seconds, a chat box that checks for new messages on a timer, or a notification badge that polls an endpoint every minute, you have already felt the friction of HTTP polling. It works. But it carries a hidden cost that compounds quickly as your user count grows.

This post walks through the four main approaches to real-time communication, where each one falls short, and why WebSockets are the right default for most interactive apps.

Short Polling: The Simplest Thing That Could Possibly Work

Short polling is the most naive approach. The client sends an HTTP request, the server responds immediately with whatever data it has (or an empty response if there is nothing new), and the client waits a fixed interval before asking again.

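// Short polling every 3 seconds
// Assumes lastTimestamp is defined elsewhere and advanced as messages are rendered
setInterval(async () => {
  try {
    const res = await fetch('/api/messages?since=' + encodeURIComponent(lastTimestamp))
    const data = await res.json()
    if (data.messages.length) renderMessages(data.messages)
  } catch (err) {
    console.error('Poll failed:', err) // skip this tick; the next one retries
  }
}, 3000)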

The problem is structural. Every request pays a full TCP handshake cost if the connection is not reused, sends an HTTP request header block that can easily run 500-800 bytes, waits for server processing, and gets a response header block back. When nothing has changed, all of that work produces an empty payload. Multiply that by 1,000 users polling every three seconds and you are generating 20,000 requests per minute, most of them pointless.

Beyond bandwidth, there is the latency floor. If something happens 100ms after the client last polled, that event will not reach the user for another 2.9 seconds. For a notification that someone just paid you, that gap feels broken.
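The arithmetic behind those two paragraphs can be sketched in a few lines. The function name and its parameters are illustrative, and the 800-byte figure is only the rough per-request header estimate used in this post. The average delay assumes events arrive at random moments between polls.

```javascript
// Rough cost model for short polling (illustrative helper, not a benchmark).
function shortPollingCost({ users, intervalSeconds, bytesPerRequest }) {
  const requestsPerMinute = users * (60 / intervalSeconds)
  return {
    requestsPerMinute,
    bytesPerMinute: requestsPerMinute * bytesPerRequest,
    // An event that lands just after a poll waits almost a full interval
    worstCaseDelaySeconds: intervalSeconds,
    // Assumes events are spread evenly across the interval
    averageDelaySeconds: intervalSeconds / 2,
  }
}

const cost = shortPollingCost({ users: 1000, intervalSeconds: 3, bytesPerRequest: 800 })
console.log(cost.requestsPerMinute) // 20000
console.log(cost.bytesPerMinute) // 16000000
console.log(cost.averageDelaySeconds) // 1.5
```

Halving the interval halves the delay but doubles the request volume, which is the tradeoff short polling cannot escape.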

Long Polling: Holding the Connection Open

Long polling improves on short polling by having the server hold the request open until data is actually available. The client sends a request, the server parks it, and when an event occurs the server responds. The client immediately sends another request and the cycle continues.

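// Long polling - client re-requests as soon as each response arrives
async function longPoll() {
  while (true) {
    try {
      const res = await fetch('/api/events?timeout=30')
      // Assumes the server returns 204 (no body) when the timeout expires with no event
      if (res.status === 204) continue
      if (!res.ok) throw new Error('HTTP ' + res.status)
      handleEvent(await res.json())
    } catch (err) {
      // Back off briefly so a failing server is not hit in a tight retry loop
      await new Promise((resolve) => setTimeout(resolve, 1000))
    }
  }
}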

This cuts unnecessary requests and brings delivery latency close to zero. But each response still ends its request, so the client must issue a new HTTP request, full headers included, for every single message, and it pays a fresh TCP and TLS handshake whenever the connection is not reused. Under high message frequency that overhead adds up fast. There is also a server-side cost: every waiting client holds a connection open, which consumes a file descriptor and, on thread-per-request servers, a thread.
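To make the "server parks it" step concrete, here is a minimal sketch of the bookkeeping a long-polling server needs. The LongPollHub class and its method names are hypothetical, and the HTTP layer is left out: a request handler would await wait() and then write the response.

```javascript
// Hypothetical in-memory hub: parks waiting requests until an event or a timeout.
class LongPollHub {
  constructor() {
    this.waiters = new Set()
  }

  // Called once per incoming long-poll request. Resolves with the event,
  // or with null if the timeout expires first.
  wait(timeoutMs) {
    return new Promise((resolve) => {
      const waiter = { resolve, timer: null }
      waiter.timer = setTimeout(() => {
        this.waiters.delete(waiter)
        resolve(null)
      }, timeoutMs)
      this.waiters.add(waiter)
    })
  }

  // Releases every parked request with the same event and
  // returns how many requests were answered.
  publish(event) {
    const delivered = this.waiters.size
    for (const waiter of this.waiters) {
      clearTimeout(waiter.timer)
      waiter.resolve(event)
    }
    this.waiters.clear()
    return delivered
  }
}
```

Every entry in the waiters set corresponds to one open connection, which is where the file descriptor pressure described above comes from.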

Server-Sent Events: One Direction, Low Overhead

Server-Sent Events (SSE) establish a persistent HTTP connection where the server can push data to the client at any time. The connection stays open. The client does not have to re-request. The protocol is text-based and uses a simple format that browsers understand natively.

// SSE on the client - browser handles reconnection automatically
const source = new EventSource('/api/stream')
source.onmessage = (event) => {
  const data = JSON.parse(event.data)
  renderUpdate(data)
}

SSE is a good fit when the server talks to the client but the client rarely needs to talk back. Stock tickers, live score feeds, and activity streams are natural matches. The browser handles reconnection automatically and there is no third-party library required.
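The text format itself is small enough to produce by hand. The sketch below serializes one event the way a server would write it to a response sent with Content-Type: text/event-stream. The helper name formatSseEvent is an assumption for illustration; the field names (event, id, data) and the blank-line terminator come from the SSE format.

```javascript
// Hypothetical helper: serialize one event into the SSE wire format.
function formatSseEvent({ data, event, id }) {
  const lines = []
  if (event !== undefined) lines.push('event: ' + event)
  if (id !== undefined) lines.push('id: ' + id)
  // A payload containing newlines is sent as one "data:" line per line
  for (const line of String(data).split('\n')) {
    lines.push('data: ' + line)
  }
  // A blank line marks the end of the event
  return lines.join('\n') + '\n\n'
}

const frame = formatSseEvent({ data: JSON.stringify({ price: 101.5 }) })
console.log(JSON.stringify(frame)) // "data: {\"price\":101.5}\n\n"
```

An event sent without an event field arrives at the onmessage handler; a named event needs addEventListener with the matching name on the client.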

The limitation is that SSE is strictly unidirectional. If your client needs to send data back with any frequency, you are back to firing separate HTTP requests. For bidirectional workflows like chat or collaborative editing, SSE plus POST is an awkward combination.

WebSockets: Persistent, Bidirectional, Low Overhead

A WebSocket connection starts as an ordinary HTTP request carrying an Upgrade header. When the server answers with 101 Switching Protocols, the same TCP connection stops speaking HTTP and becomes a persistent WebSocket channel. After that handshake, the connection stays open and both sides can send framed messages at any time without repeating the HTTP overhead.

// WebSocket - bidirectional, persistent
const ws = new WebSocket('wss://api.example.com/ws')

ws.onopen = () => {
  ws.send(JSON.stringify({ type: 'subscribe', room: 'general' }))
}

ws.onmessage = (event) => {
  const msg = JSON.parse(event.data)
  handleMessage(msg)
}

// Send from client at any time - no new request needed
function sendMessage(text) {
  ws.send(JSON.stringify({ type: 'message', text }))
}
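The browser and your WebSocket library perform the upgrade handshake for you, but one step is easy to show. The client sends a random Sec-WebSocket-Key header, and the server proves it understood the upgrade by returning a Sec-WebSocket-Accept value derived from it. A minimal sketch, assuming Node.js and its built-in crypto module; the function name is illustrative, while the GUID is the constant fixed by RFC 6455.

```javascript
const { createHash } = require('node:crypto')

// Constant defined by the WebSocket specification (RFC 6455)
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'

// Derives the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key:
// base64(SHA-1(key + GUID))
function computeAcceptKey(clientKey) {
  return createHash('sha1')
    .update(clientKey + WEBSOCKET_GUID)
    .digest('base64')
}

// Sample key from RFC 6455
console.log(computeAcceptKey('dGhlIHNhbXBsZSBub25jZQ=='))
// s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

This happens once per connection. Every message after it skips HTTP entirely.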

The per-frame overhead drops to 2-14 bytes for the WebSocket framing header, compared to hundreds of bytes of HTTP headers on every polling request. For a chat app sending dozens of messages per second, this is meaningful. For an IoT fleet of 10,000 sensors, each sending 10 readings per second, it is the difference between a manageable bill and an infrastructure crisis.
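The 2-14 byte range follows from the frame layout in RFC 6455: two base bytes, an extended length field of 2 or 8 bytes for larger payloads, and a 4-byte masking key on frames sent from client to server. The helper below is an illustrative calculation, not part of any WebSocket API.

```javascript
// Size in bytes of a WebSocket frame header for a given payload length.
function frameHeaderSize(payloadLength, masked) {
  let size = 2 // FIN/opcode byte + mask bit/7-bit length byte
  if (payloadLength > 65535) size += 8 // 64-bit extended length
  else if (payloadLength > 125) size += 2 // 16-bit extended length
  if (masked) size += 4 // masking key, required on client-to-server frames
  return size
}

console.log(frameHeaderSize(40, false)) // 2  (server to client)
console.log(frameHeaderSize(40, true)) // 6  (client to server)
console.log(frameHeaderSize(70000, true)) // 14 (largest possible header)
```

For small messages like the ones in a chat or telemetry stream, the header is at the low end of that range.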

The Overhead Math

To make the comparison concrete, here is a rough per-message cost comparison for a minimal payload (say, a 40-byte JSON message):

Short polling (empty response):  ~800 bytes headers + 0 bytes payload = 800 bytes wasted
Short polling (with message):    ~800 bytes headers + 40 bytes payload = 840 bytes total
Long polling (with message):     ~600 bytes headers + 40 bytes payload = 640 bytes total
SSE (with message):              ~400 bytes initial + ~50 bytes per event frame
WebSocket (with message):        ~10 bytes framing  + 40 bytes payload = 50 bytes total

At low volume the difference is trivial. At high frequency or high concurrency it is the dominant cost in your system.
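To see how the per-message figures scale, the sketch below multiplies them out for the IoT fleet mentioned earlier: 10,000 sensors at 10 messages per second each. The byte counts are the rough estimates from the comparison above, not measurements, and the function name is illustrative.

```javascript
// Total bytes per second for a fleet sending fixed-size messages.
function fleetBytesPerSecond(devices, messagesPerSecond, bytesPerMessage) {
  return devices * messagesPerSecond * bytesPerMessage
}

// ~840 bytes per delivered message with polling, ~50 with WebSockets
const pollingBytes = fleetBytesPerSecond(10000, 10, 840)
const webSocketBytes = fleetBytesPerSecond(10000, 10, 50)

console.log(pollingBytes) // 84000000 (about 84 MB/s)
console.log(webSocketBytes) // 5000000 (about 5 MB/s)
console.log(pollingBytes / webSocketBytes) // 16.8
```

Under these assumptions the same traffic costs roughly 17 times more bandwidth when each message carries a full set of HTTP headers.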

When Polling Is Still the Right Call

Not every use case needs WebSockets. Polling is completely reasonable when updates are infrequent (checking for a build result every 10 seconds), when the data is naturally pull-based (user manually refreshing a report), or when your infrastructure does not support long-lived connections (some edge runtimes and serverless environments). If you are sending fewer than one message per minute per user, a simple poll with a sensible interval is easier to operate and reason about than managing WebSocket connections.

The tipping point is usually somewhere around one event every 5-10 seconds. At lower event rates, polling is fine. At higher rates, the overhead compounds fast enough that WebSockets pay for themselves in infrastructure cost alone, before you even count the latency improvement.

How NoLag Approaches This

NoLag uses WebSockets as the transport layer for all real-time message delivery. Every client opens a single persistent connection when it connects, and all subscriptions, messages, and presence events flow over that connection in both directions.

On top of WebSockets, NoLag uses MessagePack as the wire format instead of JSON. MessagePack is a binary serialization format that produces payloads 30-50% smaller than equivalent JSON. Combined with the low framing overhead of WebSockets, the per-message cost is about as low as you can get without a custom binary protocol. For a chat app with thousands of concurrent users or an IoT fleet sending high-frequency telemetry, that efficiency matters.

The connection also handles reconnection and message replay automatically. If a client drops off the network and reconnects, it receives any messages it missed during the gap, so you do not have to build catch-up logic yourself.
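For a sense of what catch-up logic involves when you do build it yourself, here is a generic sketch of one common approach: the server numbers each message and keeps a bounded history, and a reconnecting client reports the last sequence number it saw. This is not NoLag's implementation, which this post does not describe. The ReplayBuffer class and its methods are hypothetical.

```javascript
// Hypothetical bounded history of recent messages, keyed by sequence number.
class ReplayBuffer {
  constructor(capacity) {
    this.capacity = capacity
    this.entries = []
    this.nextSeq = 1
  }

  // Stores a message and returns the sequence number assigned to it.
  append(message) {
    const seq = this.nextSeq++
    this.entries.push({ seq, message })
    // Drop the oldest entry once the buffer is full
    if (this.entries.length > this.capacity) this.entries.shift()
    return seq
  }

  // Returns every retained entry newer than the client's last seen sequence number.
  since(lastSeenSeq) {
    return this.entries.filter((entry) => entry.seq > lastSeenSeq)
  }
}

const buffer = new ReplayBuffer(3)
for (const text of ['a', 'b', 'c', 'd']) buffer.append(text)
console.log(buffer.since(2).map((entry) => entry.message)) // [ 'c', 'd' ]
```

A client that has been offline longer than the buffer covers cannot be fully caught up this way, so a real system also needs a fallback such as a full state refresh.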

The short version: use polling when updates are rare and simplicity matters. Use WebSockets when you need low latency, high frequency, or bidirectional communication. The two choices optimize for different things, and for anything interactive, reaching for WebSockets by default is the right instinct.