49 terms · hover any term anywhere
The glossary
Every term below is auto-linked across the question banks: when one appears anywhere it gets a dotted underline and a popup. This page is the full list in one place.
- ABA
- A lock-free hazard: a value changes A→B→A, so a CAS comparing only the value wrongly succeeds.
- AIMD
- Additive-Increase / Multiplicative-Decrease — TCP’s congestion shape: ramp up linearly, halve on loss (the sawtooth).
- all-reduce
- A collective that reduces (e.g. sums) a value across all GPUs and gives every GPU the result — on the critical path of training.
- cache line
- The unit of cache transfer and coherence, typically 64 bytes — the smallest chunk moved between caches and memory.
- CAS
- Compare-And-Swap — an atomic “set to new only if it still equals expected”; the building block of lock-free code.
- CTPIO
- Cut-Through Programmed I/O — the NIC begins transmitting a frame before it has finished crossing PCIe.
- DCQCN
- Data Center Quantized Congestion Notification — the de facto standard, ECN-driven congestion-control algorithm for RoCEv2.
- DDIO
- Data Direct I/O — lets inbound DMA land in last-level cache instead of DRAM, cutting latency at high rates.
- descriptor
- A small struct in host memory describing one buffer (DMA address, length, status/ownership bit) that the driver and NIC hand back and forth.
- DMA
- Direct Memory Access — the device moves data to/from host memory itself, by physical (or IOVA) address, without the CPU copying each byte.
- doorbell
- An MMIO register write that tells the NIC “I’ve posted work up to here” — a posted, fire-and-forget write.
- DPDK
- Data Plane Development Kit — a userspace poll-mode-driver framework for high-rate packet processing, bypassing the kernel.
- ef_vi
- Solarflare’s layer-2 API for direct userspace access to the NIC’s RX/TX rings and event queue — the lowest-latency datapath.
- false sharing
- Two unrelated variables on the same cache line, so writes from different cores bounce the line between their caches.
- GRO
- Generic Receive Offload — the stack coalesces several received segments of a flow into one large buffer to cut per-packet cost.
- GSO
- Generic Segmentation Offload — defer splitting a large buffer into MTU-sized segments until late (or to the NIC).
- hugepage
- A large (e.g. 2 MB) memory page — each TLB entry covers more memory (so fewer TLB misses) and the page is physically contiguous, which suits DMA buffers.
- incast
- Many senders bursting to one receiver/switch port at once, overflowing its buffer and causing synchronized drops.
- IOMMU
- I/O Memory Management Unit — translates device (IOVA) addresses to physical and bounds-checks DMA, isolating devices.
- IOVA
- I/O Virtual Address — the address a device uses for DMA; the IOMMU maps it to a physical address.
- kernel bypass
- Talking to the NIC from userspace (Onload / ef_vi / DPDK), skipping syscalls, copies, and the kernel stack on the hot path.
- memory barrier
- An instruction (or atomic ordering) that stops the CPU/compiler reordering memory accesses across it.
- MESI
- A cache-coherence protocol with states Modified / Exclusive / Shared / Invalid that keep each core’s view of a cache line consistent.
- MMIO
- Memory-Mapped I/O — device registers mapped into the address space, so a load/store becomes a bus transaction to the device.
- MOESI
- MESI plus an Owned state, letting a dirty line be shared without writing back first (used on AMD).
- MPMC
- Multi-Producer / Multi-Consumer — multiple writers contend on the shared indices, needing CAS (reserve/commit).
- MSI-X
- Message-Signaled Interrupts (eXtended) — interrupts delivered as memory writes, with many vectors steerable to specific CPUs.
- MSS
- Maximum Segment Size — the largest TCP payload per segment, derived from the MTU (MTU minus the IP and TCP headers, e.g. 1460 bytes for a 1500-byte MTU over IPv4 without options).
- MTU
- Maximum Transmission Unit — the largest L2 payload a link carries (typically 1500 bytes on Ethernet).
- NAPI
- The Linux NIC interrupt-mitigation API: an interrupt schedules a polled, budgeted batch drain instead of one interrupt per packet.
- NUMA
- Non-Uniform Memory Access — memory attached to a core’s own CPU socket is faster to reach than memory attached to another socket.
- Onload
- Solarflare’s userspace TCP/IP stack behind a sockets shim (LD_PRELOAD), accelerating unmodified apps via kernel bypass.
- PCIe
- PCI Express — the serial bus between the CPU and devices like the NIC; carries MMIO, DMA, and interrupts as packetized TLPs.
- PFC
- Priority Flow Control — per-priority Ethernet pause that makes a fabric lossless, at the risk of head-of-line blocking / deadlock.
- queue pair
- An RDMA send queue + receive queue the app posts work requests to; the NIC executes them and signals a completion queue.
- RCU
- Read-Copy-Update — readers run lock-free; a writer unlinks, then waits a grace period (all readers finish) before freeing.
- RDMA
- Remote Direct Memory Access — the NIC reads/writes remote memory directly via queue pairs, zero-copy and kernel-bypassing.
- RoCEv2
- RDMA over Converged Ethernet v2 — RDMA carried over UDP/IP on Ethernet, common in AI/HPC clusters.
- RSS
- Receive Side Scaling — the NIC hashes each flow to one of several RX queues so traffic spreads across CPUs.
- sk_buff
- The Linux kernel’s packet buffer structure carrying a packet and its metadata up and down the stack.
- SPSC
- Single-Producer / Single-Consumer — one writer of each index, so the ring needs no locks (just release/acquire).
- store buffer
- A per-core queue of pending stores that lets a core run ahead before its stores are globally visible — a source of reordering.
- TIME_WAIT
- The TCP state the active closer sits in for ~2×MSL, so late duplicate segments can’t corrupt a new connection on the same ports.
- TLB
- Translation Lookaside Buffer — caches virtual→physical page translations so the MMU skips the page-table walk on a hit.
- TLP
- Transaction Layer Packet — the unit of PCIe traffic (memory read/write, completion, etc.).
- TSO
- TCP Segmentation Offload — the NIC splits one large TCP buffer into MSS-sized segments and fills their checksums.
- TX_PUSH
- An ef_vi optimization that writes the descriptor + packet with the doorbell in one shot, avoiding a PCIe read-back.
- UEC
- Ultra Ethernet Consortium — the industry body defining an open Ethernet-based stack for AI/HPC scale-out (packet spray, in-order delivery, modern congestion control).
- XDP
- eXpress Data Path — a BPF hook in the driver, before sk_buff allocation, for very fast drop / redirect / transmit.
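The ABA and CAS entries are easier to see side by side in a small model. The sketch below is a single-threaded simulation, not real atomics: the `Cell` class, its `version` counter, and the `interleave` helper are hypothetical names introduced only for illustration. It replays the A→B→A interleaving by hand and shows that a value-only CAS succeeds, while a CAS that also compares a version counter (one common mitigation) fails as it should.

```python
class Cell:
    """Toy model of a shared word. Python gives no atomicity here; the
    'threads' below are just hand-ordered steps in one thread."""

    def __init__(self, value):
        self.value = value
        self.version = 0  # bumped on every successful write

    def cas(self, expected, new):
        # Value-only compare-and-swap: set to new only if value == expected.
        if self.value == expected:
            self.value = new
            self.version += 1
            return True
        return False

    def cas_versioned(self, expected, expected_version, new):
        # Also compare the version, so A -> B -> A is detected.
        if self.value == expected and self.version == expected_version:
            self.value = new
            self.version += 1
            return True
        return False


def interleave(use_version):
    cell = Cell("A")
    # "Thread 1" takes a snapshot, then is delayed.
    seen_value, seen_version = cell.value, cell.version
    # "Thread 2" changes A -> B -> A in the meantime.
    cell.cas("A", "B")
    cell.cas("B", "A")
    # "Thread 1" resumes and tries to install C.
    if use_version:
        return cell.cas_versioned(seen_value, seen_version, "C")
    return cell.cas(seen_value, "C")


print("value-only CAS succeeded:", interleave(use_version=False))
print("versioned CAS succeeded:", interleave(use_version=True))
```

In real lock-free code the version would typically be packed next to the pointer and updated with a double-width CAS, or the problem avoided with hazard pointers or RCU-style reclamation.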
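The AIMD sawtooth can be reproduced in a few lines. This is a minimal sketch under simplifying assumptions: one step per round trip, an increase of one segment per round, a halving on loss, and a hand-picked set of loss rounds. The function name `aimd` and its parameters are illustrative, not taken from any TCP implementation.

```python
def aimd(cwnd, loss_rounds, rounds, add=1.0, mult=0.5):
    """Return the congestion window after each round trip."""
    trace = []
    for r in range(rounds):
        if r in loss_rounds:
            cwnd = max(1.0, cwnd * mult)  # multiplicative decrease on loss
        else:
            cwnd += add                   # additive increase per round trip
        trace.append(cwnd)
    return trace


trace = aimd(cwnd=10.0, loss_rounds={4, 9}, rounds=12)
print(trace)
# [11.0, 12.0, 13.0, 14.0, 7.0, 8.0, 9.0, 10.0, 11.0, 5.5, 6.5, 7.5]
```

The linear climbs followed by sharp halvings are the sawtooth the AIMD entry refers to.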
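The MTU, MSS, and TSO entries are linked by simple arithmetic, sketched below. The header sizes assume no IP or TCP options (20 bytes each for IPv4 and TCP, 40 bytes for IPv6), and the helper names `mss` and `tso_segments` are hypothetical; a real NIC also rewrites sequence numbers and checksums per segment, which this sketch ignores.

```python
def mss(mtu, ip_header=20, tcp_header=20):
    """Largest TCP payload per segment for a given MTU (no options assumed)."""
    return mtu - ip_header - tcp_header


def tso_segments(buffer_len, mss_bytes):
    """Payload sizes of the segments one large buffer is split into."""
    full, rest = divmod(buffer_len, mss_bytes)
    return [mss_bytes] * full + ([rest] if rest else [])


print(mss(1500))                 # IPv4 over standard Ethernet: 1460
print(mss(1500, ip_header=40))   # IPv6 over standard Ethernet: 1440

segments = tso_segments(65536, mss(1500))
print(len(segments), segments[-1])  # 45 segments, the last carrying 1296 bytes
```

One 64 KiB send therefore becomes 45 wire segments, which is the per-packet work TSO moves from the CPU to the NIC.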
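The key property in the RSS entry is that every packet of a flow lands on the same RX queue while different flows spread out. The sketch below illustrates only that property. It substitutes CRC32 over a formatted string for the keyed Toeplitz hash and indirection table that real NICs typically use, and `rx_queue` is a hypothetical helper, so queue numbers here will not match any hardware.

```python
import zlib
from collections import Counter


def rx_queue(src_ip, dst_ip, src_port, dst_port, n_queues):
    """Map a flow's 4-tuple to an RX queue index (stand-in hash, not Toeplitz)."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % n_queues


# Every packet of one flow hashes to the same queue.
flow = ("10.0.0.1", "10.0.0.2", 40000, 443)
print(rx_queue(*flow, n_queues=8) == rx_queue(*flow, n_queues=8))

# Many flows (varying source port) spread across the queues.
spread = Counter(
    rx_queue("10.0.0.1", "10.0.0.2", port, 443, n_queues=8)
    for port in range(40000, 41000)
)
print(sorted(spread.items()))
```

Because the mapping is per flow rather than per packet, a single heavy flow still lands on one queue and one CPU.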