49 terms · hover any term anywhere

The glossary

Every term below is auto-linked across the question banks: when one appears anywhere it gets a dotted underline and a popup. This page is the full list in one place.

ABA: A lock-free hazard: a value changes A→B→A, so a CAS comparing only the value wrongly succeeds.
AIMD: Additive-Increase / Multiplicative-Decrease — TCP’s congestion shape: ramp up linearly, halve on loss (the sawtooth).
all-reduce: A collective that reduces (e.g. sums) a value across all GPUs and gives every GPU the result — on the critical path of training.
cache line: The unit of cache transfer and coherence, typically 64 bytes — the smallest chunk moved between caches and memory.
CAS: Compare-And-Swap — an atomic “set to new only if it still equals expected”; the building block of lock-free code.
CTPIO: Cut-Through Programmed I/O — the NIC begins transmitting a frame before it has finished crossing PCIe.
DCQCN: Data Center Quantized Congestion Notification — the de facto standard, ECN-driven congestion-control algorithm for RoCEv2.
DDIO: Data Direct I/O — lets inbound DMA land in last-level cache instead of DRAM, cutting latency at high rates.
descriptor: A small struct in host memory describing one buffer (DMA address, length, status/ownership bit) that the driver and NIC hand back and forth.
DMA: Direct Memory Access — the device moves data to/from host memory itself, by physical (or IOVA) address, without the CPU copying each byte.
doorbell: An MMIO register write that tells the NIC “I’ve posted work up to here” — a posted, fire-and-forget write.
DPDK: Data Plane Development Kit — a userspace poll-mode-driver framework for high-rate packet processing, bypassing the kernel.
ef_vi: Solarflare’s layer-2 API for direct userspace access to the NIC’s RX/TX rings and event queue — the lowest-latency datapath.
false sharing: Two unrelated variables on the same cache line, so writes from different cores bounce the line between their caches.
GRO: Generic Receive Offload — the stack coalesces several received segments of a flow into one large buffer to cut per-packet cost.
GSO: Generic Segmentation Offload — defer splitting a large buffer into MTU-sized segments until late (or to the NIC).
hugepage: A large (e.g. 2 MB) memory page — it needs fewer TLB entries and is physically contiguous, so it is often used for DMA buffers.
incast: Many senders bursting to one receiver/switch port at once, overflowing its buffer and causing synchronized drops.
IOMMU: I/O Memory Management Unit — translates device (IOVA) addresses to physical and bounds-checks DMA, isolating devices.
IOVA: I/O Virtual Address — the address a device uses for DMA; the IOMMU maps it to a physical address.
kernel bypass: Talking to the NIC from userspace (Onload / ef_vi / DPDK), skipping syscalls, copies, and the kernel stack on the hot path.
memory barrier: An instruction (or atomic ordering) that stops the CPU/compiler reordering memory accesses across it.
MESI: A cache-coherence protocol with states Modified / Exclusive / Shared / Invalid that keep each core’s view of a cache line consistent.
MMIO: Memory-Mapped I/O — device registers mapped into the address space, so a load/store becomes a bus transaction to the device.
MOESI: MESI plus an Owned state, letting a dirty line be shared without writing back first (used on AMD).
MPMC: Multi-Producer / Multi-Consumer — multiple writers contend on the shared indices, needing CAS (reserve/commit).
MSI-X: Message-Signaled Interrupts (eXtended) — interrupts delivered as memory writes, with many vectors steerable to specific CPUs.
MSS: Maximum Segment Size — the largest TCP payload per segment, derived from the MTU by subtracting the IP and TCP headers (e.g. 1500 − 40 = 1460 bytes for IPv4 without options).
MTU: Maximum Transmission Unit — the largest L2 payload a link carries (typically 1500 bytes on Ethernet).
NAPI: The Linux NIC interrupt-mitigation API: an interrupt schedules a polled, budgeted batch drain instead of one interrupt per packet.
NUMA: Non-Uniform Memory Access — memory attached to one CPU socket is faster to reach than another socket’s memory.
Onload: Solarflare’s userspace TCP/IP stack behind a sockets shim (LD_PRELOAD), accelerating unmodified apps via kernel bypass.
PCIe: PCI Express — the serial bus between the CPU and devices like the NIC; carries MMIO, DMA, and interrupts as packetized TLPs.
PFC: Priority Flow Control — per-priority Ethernet pause that makes a fabric lossless, at the risk of head-of-line blocking / deadlock.
queue pair: An RDMA send queue + receive queue the app posts work requests to; the NIC executes them and signals a completion queue.
RCU: Read-Copy-Update — readers run lock-free; a writer unlinks, then waits a grace period (until all pre-existing readers finish) before freeing.
RDMA: Remote Direct Memory Access — the NIC reads/writes remote memory directly via queue pairs, zero-copy and kernel-bypassing.
RoCEv2: RDMA over Converged Ethernet v2 — RDMA carried over UDP/IP on Ethernet, common in AI/HPC clusters.
RSS: Receive Side Scaling — the NIC hashes each flow to one of several RX queues so traffic spreads across CPUs.
sk_buff: The Linux kernel’s packet buffer structure carrying a packet and its metadata up and down the stack.
SPSC: Single-Producer / Single-Consumer — one writer of each index, so the ring needs no locks (just release/acquire).
store buffer: A per-core queue of pending stores that lets a core run ahead before its stores are globally visible — a source of reordering.
TIME_WAIT: The TCP state the active closer sits in for ~2×MSL, so late duplicate segments can’t corrupt a new connection on the same ports.
TLB: Translation Lookaside Buffer — caches virtual→physical page translations so the MMU skips the page-table walk on a hit.
TLP: Transaction Layer Packet — the unit of PCIe traffic (memory read/write, completion, etc.).
TSO: TCP Segmentation Offload — the NIC splits one large TCP buffer into MSS-sized segments and fills their checksums.
TX_PUSH: An ef_vi optimization that writes the descriptor + packet with the doorbell in one shot, avoiding a PCIe read-back.
UEC: Ultra Ethernet Consortium — the industry group specifying an open, Ethernet-based transport stack for AI/HPC scale-out (packet spraying, flexible delivery ordering, modern congestion control).
XDP: eXpress Data Path — a BPF hook in the driver, before sk_buff allocation, for very fast drop / redirect / transmit.
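
As an illustration of the AIMD entry above, here is a minimal sketch of the sawtooth in Python. It is a toy model under stated assumptions, not how any real TCP stack is implemented: the window is counted in whole segments, one step is one round trip, losses are supplied by the caller, and slow start and fast recovery are ignored. The function name `aimd_trace` and its parameters are hypothetical, chosen only for this example.

```python
def aimd_trace(rounds, loss_rounds, cwnd=1.0, increase=1.0, decrease=0.5):
    """Return the congestion window (in segments) after each round trip.

    Toy model (assumption): one step per RTT, losses given as a set of
    round indices, no slow start, and a floor of one segment.
    """
    trace = []
    for rtt in range(rounds):
        if rtt in loss_rounds:
            # Multiplicative decrease: halve the window on loss.
            cwnd = max(1.0, cwnd * decrease)
        else:
            # Additive increase: grow by a fixed amount per round trip.
            cwnd += increase
        trace.append(cwnd)
    return trace
```

For example, `aimd_trace(6, {3})` returns `[2.0, 3.0, 4.0, 2.0, 3.0, 4.0]`: a linear ramp, a halving at the loss in round 3, then another ramp.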
