Data Structures · Systems Programming

Ring Buffers in C

A ring buffer (a.k.a. circular buffer) is a fixed-size FIFO queue that wraps around on itself. It's the backbone of audio pipelines, serial drivers, lock-free queues — and the RX/TX descriptor rings used by most high-performance NICs. Let's build one from scratch.

▶ Play with the demo View the C code

Why this site exists. This is interview prep for a low-level C / networking role at AMD's Network Solutions Group (the former Solarflare team in Cambridge). A single-producer / single-consumer ring buffer is one of the highest-yield C coding exercises for that team — it maps directly onto the NIC descriptor-ring datapath. Learn it cold here, then drill the full C question bank.

What is a ring buffer?

Imagine a regular array, but with the two ends glued together so it forms a circle. You keep two cursors — one for writing (head) and one for reading (tail) — and when either cursor runs off the end, it wraps back to index 0. The storage never grows; old space is reused as data is consumed.

🔁

Fixed memory

Allocate once. No malloc per element, no fragmentation, predictable footprint — ideal for embedded.

⚡

O(1) operations

Enqueue and dequeue are constant-time index updates. No shifting elements like a naïve array queue.

🎧

Streaming-friendly

Perfect for producer/consumer flows: audio samples, UART bytes, network packets, log lines.

🔒

Lock-free capable

With one producer and one consumer, you can build it with atomics and zero locks.

How it works

Three numbers describe the whole structure at any moment: head (where the next write lands), tail (where the next read comes from), and count (how many items are stored). The wrap-around is just modular arithmetic.

Enqueue (write)

If the buffer is full, reject (or overwrite the oldest).
Store the new value at buf[head].
Advance: head = (head + 1) % capacity.
Increment count.

Dequeue (read)

If the buffer is empty, reject.
Read the value at buf[tail].
Advance: tail = (tail + 1) % capacity.
Decrement count.

The full-vs-empty problem. When head == tail, is the buffer empty or completely full? Both states leave the cursors in the same spot! The cleanest fix — used in the code below — is to track an explicit count. Two popular alternatives are always leaving one slot empty (so full means head + 1 == tail) or using free-running indices that never wrap, comparing head - tail.

See it in motion

Press enqueue and dequeue to watch head (blue) and tail (teal) chase each other around the ring. Notice how indices wrap from 7 back to 0, and how the buffer refuses to enqueue when full or dequeue when empty.

head (write → 0) tail (read → 0) occupied

Buffer initialized — capacity 8, empty.

A complete implementation in C

Here's a clean, portable byte ring buffer. We use an explicit count field so full and empty are never ambiguous. Drop these two files into any project.

fileringbuf.h

#ifndef RINGBUF_H
#define RINGBUF_H

#include <stddef.h>
#include <stdbool.h>

// A fixed-capacity FIFO ring buffer for bytes.
typedef struct {
    unsigned char *buf;   // backing storage, owns 'cap' bytes
    size_t cap;           // capacity in bytes
    size_t head;          // index of next write
    size_t tail;          // index of next read
    size_t count;         // number of bytes currently stored
} ringbuf_t;

bool   rb_init(ringbuf_t *rb, size_t capacity);
void   rb_free(ringbuf_t *rb);
bool   rb_is_empty(const ringbuf_t *rb);
bool   rb_is_full(const ringbuf_t *rb);
size_t rb_size(const ringbuf_t *rb);

// Returns true on success, false if the buffer is full.
bool   rb_put(ringbuf_t *rb, unsigned char byte);
// Returns true on success, false if the buffer is empty.
bool   rb_get(ringbuf_t *rb, unsigned char *out);

#endif // RINGBUF_H

fileringbuf.c

#include "ringbuf.h"
#include <stdlib.h>

bool rb_init(ringbuf_t *rb, size_t capacity) {
    if (capacity == 0) return false;        // avoids % by zero later
    rb->buf = malloc(capacity);
    if (!rb->buf) return false;
    rb->cap = capacity;
    rb->head = rb->tail = rb->count = 0;
    return true;
}

void rb_free(ringbuf_t *rb) {
    free(rb->buf);
    rb->buf = NULL;
    rb->cap = rb->head = rb->tail = rb->count = 0;
}

bool rb_is_empty(const ringbuf_t *rb) { return rb->count == 0; }
bool rb_is_full(const ringbuf_t *rb)  { return rb->count == rb->cap; }
size_t rb_size(const ringbuf_t *rb)   { return rb->count; }

bool rb_put(ringbuf_t *rb, unsigned char byte) {
    if (rb_is_full(rb)) return false;
    rb->buf[rb->head] = byte;
    rb->head = (rb->head + 1) % rb->cap;  // wrap around
    rb->count++;
    return true;
}

bool rb_get(ringbuf_t *rb, unsigned char *out) {
    if (rb_is_empty(rb)) return false;
    *out = rb->buf[rb->tail];
    rb->tail = (rb->tail + 1) % rb->cap;  // wrap around
    rb->count--;
    return true;
}

Using it

filemain.c

#include "ringbuf.h"
#include <stdio.h>

int main(void) {
    ringbuf_t rb;
    if (!rb_init(&rb, 4)) return 1;

    rb_put(&rb, 'H');
    rb_put(&rb, 'i');
    rb_put(&rb, '!');

    unsigned char c;
    while (rb_get(&rb, &c)) {
        putchar(c);          // prints: Hi!
    }
    putchar('\n');

    rb_free(&rb);
    return 0;
}

Compile & run: cc main.c ringbuf.c -o demo && ./demo

Going faster

Power-of-two capacity

The modulo operator % can be slow on microcontrollers without a hardware divider. If the capacity is a power of two, you can replace it with a bitwise AND, which is essentially free.

// When capacity is a power of two, the expensive '%' becomes a
// single bitwise AND — handy in tight loops and on small MCUs.
#define CAP 256                 // must be a power of two
unsigned char buf[CAP];
size_t head = 0, tail = 0;

void put(unsigned char b) {
    buf[head & (CAP - 1)] = b;  // head % CAP, but faster
    head++;                     // free-running counter
}

unsigned char get(void) {
    return buf[tail++ & (CAP - 1)];
}

// size = head - tail ; full when (head - tail) == CAP

Lock-free SPSC queue

The single-producer / single-consumer pattern is a workhorse for passing data between an interrupt handler and the main loop, or between two threads. Because the producer only mutates head and the consumer only mutates tail, you need no mutex — just the right memory ordering on the atomics.

filespsc.c

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

// Single-producer / single-consumer lock-free ring buffer.
// The producer only writes 'head'; the consumer only writes 'tail'.
typedef struct {
    unsigned char *buf;
    size_t cap;                 // power of two
    _Atomic size_t head;        // written by producer
    _Atomic size_t tail;        // written by consumer
} spsc_t;

bool spsc_put(spsc_t *q, unsigned char b) {
    size_t h = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t t = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (h - t == q->cap) return false;        // full
    q->buf[h & (q->cap - 1)] = b;
    atomic_store_explicit(&q->head, h + 1, memory_order_release);
    return true;
}

bool spsc_get(spsc_t *q, unsigned char *out) {
    size_t t = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t h = atomic_load_explicit(&q->head, memory_order_acquire);
    if (h == t) return false;                 // empty
    *out = q->buf[t & (q->cap - 1)];
    atomic_store_explicit(&q->tail, t + 1, memory_order_release);
    return true;
}

Memory ordering matters. The release store on the writer side pairs with the acquire load on the reader side. This guarantees that the data byte is visible beforethe index update that publishes it. Get this wrong and you'll read torn or stale data on weakly ordered CPUs (ARM, RISC-V).

Common pitfalls

Mistake	Consequence	Fix
Using only head == tail for state	Can't distinguish full from empty	Track count, leave a gap, or use free-running indices
Forgetting to wrap an index	Out-of-bounds read/write, memory corruption	Always `% cap` (or `& (cap-1)`)
Sharing without atomics across threads	Data races, torn reads, lost items	Use _Atomic + acquire/release, or a lock
Off-by-one in “leave one slot” scheme	Silently loses one byte of capacity or overflows	Pick one convention and unit-test the boundaries
Index overflow on free-running counters	Unsigned wrap is well-defined regardless of capacity; the power-of-two rule is for masking, not subtraction	Use unsigned types; keep capacity well below half the counter range so `head - tail` stays unambiguous

Where ring buffers show up

🔊

Audio & DSP

Jitter buffers and sample queues between the audio callback and your processing thread.

📟

UART / serial drivers

An ISR pushes received bytes in; the main loop drains them out at its leisure.

🌐

Networking

Packet rings (e.g. kernel NIC queues, io_uring) move buffers between kernel and userspace.

📝

Logging & tracing

Fixed-size in-memory log rings keep the most recent events without unbounded growth.

Next steps: try extending the implementation to store arbitrary structs instead of bytes, add a bulk rb_write(rb, src, n) that copies a whole chunk with memcpy across the wrap point, or benchmark % vs & on your target hardware.

Go deeper →

Ring buffers in the NIC datapath (the version they'll ask about)

Drill →

The full C question bank, with answers