Ring Buffers in C
A ring buffer (a.k.a. circular buffer) is a fixed-size FIFO queue that wraps around on itself. It's the backbone of audio pipelines, serial drivers, lock-free queues β and the RX/TX descriptor rings used by most high-performance NICs. Let's build one from scratch.
What is a ring buffer?
Imagine a regular array, but with the two ends glued together so it forms a circle. You keep two cursors β one for writing (head) and one for reading (tail) β and when either cursor runs off the end, it wraps back to indexΒ 0. The storage never grows; old space is reused as data is consumed.
Fixed memory
Allocate once. No malloc per element, no fragmentation, predictable footprint β ideal for embedded.
O(1) operations
Enqueue and dequeue are constant-time index updates. No shifting elements like a naΓ―ve array queue.
Streaming-friendly
Perfect for producer/consumer flows: audio samples, UART bytes, network packets, log lines.
Lock-free capable
With one producer and one consumer, you can build it with atomics and zero locks.
How it works
Three numbers describe the whole structure at any moment: head (where the next write lands), tail (where the next read comes from), and count (how many items are stored). The wrap-around is just modular arithmetic.
Enqueue (write)
- If the buffer is full, reject (or overwrite the oldest).
- Store the new value at
buf[head]. - Advance:
head = (head + 1) % capacity. - Increment
count.
Dequeue (read)
- If the buffer is empty, reject.
- Read the value at
buf[tail]. - Advance:
tail = (tail + 1) % capacity. - Decrement
count.
head == tail, is the buffer empty or completely full? Both states leave the cursors in the same spot! The cleanest fix β used in the code below β is to track an explicit count. Two popular alternatives are always leaving one slot empty (so full means head + 1 == tail) or using free-running indices that never wrap, comparing head - tail.See it in motion
Press enqueue and dequeue to watch head (blue) and tail (teal) chase each other around the ring. Notice how indices wrap from 7 back to 0, and how the buffer refuses to enqueue when full or dequeue when empty.
A complete implementation in C
Here's a clean, portable byte ring buffer. We use an explicit count field so full and empty are never ambiguous. Drop these two files into any project.
#ifndef RINGBUF_H
#define RINGBUF_H
#include <stddef.h>
#include <stdbool.h>
// A fixed-capacity FIFO ring buffer for bytes.
typedef struct {
unsigned char *buf; // backing storage, owns 'cap' bytes
size_t cap; // capacity in bytes
size_t head; // index of next write
size_t tail; // index of next read
size_t count; // number of bytes currently stored
} ringbuf_t;
bool rb_init(ringbuf_t *rb, size_t capacity);
void rb_free(ringbuf_t *rb);
bool rb_is_empty(const ringbuf_t *rb);
bool rb_is_full(const ringbuf_t *rb);
size_t rb_size(const ringbuf_t *rb);
// Returns true on success, false if the buffer is full.
bool rb_put(ringbuf_t *rb, unsigned char byte);
// Returns true on success, false if the buffer is empty.
bool rb_get(ringbuf_t *rb, unsigned char *out);
#endif // RINGBUF_H
#include "ringbuf.h"
#include <stdlib.h>
bool rb_init(ringbuf_t *rb, size_t capacity) {
if (capacity == 0) return false; // avoids % by zero later
rb->buf = malloc(capacity);
if (!rb->buf) return false;
rb->cap = capacity;
rb->head = rb->tail = rb->count = 0;
return true;
}
void rb_free(ringbuf_t *rb) {
free(rb->buf);
rb->buf = NULL;
rb->cap = rb->head = rb->tail = rb->count = 0;
}
bool rb_is_empty(const ringbuf_t *rb) { return rb->count == 0; }
bool rb_is_full(const ringbuf_t *rb) { return rb->count == rb->cap; }
size_t rb_size(const ringbuf_t *rb) { return rb->count; }
bool rb_put(ringbuf_t *rb, unsigned char byte) {
if (rb_is_full(rb)) return false;
rb->buf[rb->head] = byte;
rb->head = (rb->head + 1) % rb->cap; // wrap around
rb->count++;
return true;
}
bool rb_get(ringbuf_t *rb, unsigned char *out) {
if (rb_is_empty(rb)) return false;
*out = rb->buf[rb->tail];
rb->tail = (rb->tail + 1) % rb->cap; // wrap around
rb->count--;
return true;
}
Using it
#include "ringbuf.h"
#include <stdio.h>
int main(void) {
ringbuf_t rb;
if (!rb_init(&rb, 4)) return 1;
rb_put(&rb, 'H');
rb_put(&rb, 'i');
rb_put(&rb, '!');
unsigned char c;
while (rb_get(&rb, &c)) {
putchar(c); // prints: Hi!
}
putchar('\n');
rb_free(&rb);
return 0;
}
cc main.c ringbuf.c -o demo && ./demoGoing faster
Power-of-two capacity
The modulo operator % can be slow on microcontrollers without a hardware divider. If the capacity is a power of two, you can replace it with a bitwise AND, which is essentially free.
// When capacity is a power of two, the expensive '%' becomes a
// single bitwise AND β handy in tight loops and on small MCUs.
#define CAP 256 // must be a power of two
unsigned char buf[CAP];
size_t head = 0, tail = 0;
void put(unsigned char b) {
buf[head & (CAP - 1)] = b; // head % CAP, but faster
head++; // free-running counter
}
unsigned char get(void) {
return buf[tail++ & (CAP - 1)];
}
// size = head - tail ; full when (head - tail) == CAP
Lock-free SPSC queue
The single-producer / single-consumer pattern is a workhorse for passing data between an interrupt handler and the main loop, or between two threads. Because the producer only mutates head and the consumer only mutates tail, you need no mutex β just the right memory ordering on the atomics.
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
// Single-producer / single-consumer lock-free ring buffer.
// The producer only writes 'head'; the consumer only writes 'tail'.
typedef struct {
unsigned char *buf;
size_t cap; // power of two
_Atomic size_t head; // written by producer
_Atomic size_t tail; // written by consumer
} spsc_t;
bool spsc_put(spsc_t *q, unsigned char b) {
size_t h = atomic_load_explicit(&q->head, memory_order_relaxed);
size_t t = atomic_load_explicit(&q->tail, memory_order_acquire);
if (h - t == q->cap) return false; // full
q->buf[h & (q->cap - 1)] = b;
atomic_store_explicit(&q->head, h + 1, memory_order_release);
return true;
}
bool spsc_get(spsc_t *q, unsigned char *out) {
size_t t = atomic_load_explicit(&q->tail, memory_order_relaxed);
size_t h = atomic_load_explicit(&q->head, memory_order_acquire);
if (h == t) return false; // empty
*out = q->buf[t & (q->cap - 1)];
atomic_store_explicit(&q->tail, t + 1, memory_order_release);
return true;
}
release store on the writer side pairs with the acquire load on the reader side. This guarantees that the data byte is visible beforethe index update that publishes it. Get this wrong and you'll read torn or stale data on weakly ordered CPUs (ARM, RISC-V).Common pitfalls
| Mistake | Consequence | Fix |
|---|---|---|
| Using only head == tail for state | Can't distinguish full from empty | Track count, leave a gap, or use free-running indices |
| Forgetting to wrap an index | Out-of-bounds read/write, memory corruption | Always % cap (or & (cap-1)) |
| Sharing without atomics across threads | Data races, torn reads, lost items | Use _Atomic + acquire/release, or a lock |
| Off-by-one in βleave one slotβ scheme | Silently loses one byte of capacity or overflows | Pick one convention and unit-test the boundaries |
| Index overflow on free-running counters | Unsigned wrap is well-defined regardless of capacity; the power-of-two rule is for masking, not subtraction | Use unsigned types; keep capacity well below half the counter range so head - tail stays unambiguous |
Where ring buffers show up
Audio & DSP
Jitter buffers and sample queues between the audio callback and your processing thread.
UART / serial drivers
An ISR pushes received bytes in; the main loop drains them out at its leisure.
Networking
Packet rings (e.g. kernel NIC queues, io_uring) move buffers between kernel and userspace.
Logging & tracing
Fixed-size in-memory log rings keep the most recent events without unbounded growth.
rb_write(rb, src, n) that copies a whole chunk with memcpy across the wrap point, or benchmark % vs & on your target hardware.