Understanding how struct field ordering affects Go application performance
5-20% Throughput Improvement
10-30% Memory Reduction
64 bytes CPU Cache Line Size
CPU Cache Lines and Performance
Most modern CPUs fetch memory in 64-byte cache lines. When a struct spans multiple cache lines, reading all of its fields requires multiple cache line fetches, which can reduce performance noticeably when the data is not already in cache.
Cache-Inefficient Struct
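This struct is 72 bytes, so it spans two cache lines, and its field order puts the two small fields on opposite sides of the 64-byte boundary. Reading Flag and Field2 together therefore touches both lines (offsets assume the struct starts on a cache line boundary):
type BadLayout struct {
	Field1 [60]byte // offsets 0-59: uses most of the first cache line
	Flag   bool     // offset 60: still in the first cache line, followed by 3 bytes of padding
	Field2 int64    // offset 64: starts the second cache line
}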
Field1 (60) | Flag (1) | pad (3) | Field2 (8)
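The offsets of BadLayout can be checked with the standard unsafe package. The following is a minimal sketch: cacheLineSize and the helper names cacheLine and badLayoutOffsets are assumptions made for illustration, and the cache line numbers only hold if the struct happens to start on a 64-byte boundary, which Go does not guarantee.

```go
package main

import (
	"fmt"
	"unsafe"
)

// BadLayout mirrors the struct discussed in the text.
type BadLayout struct {
	Field1 [60]byte
	Flag   bool
	Field2 int64
}

// cacheLineSize is an assumption: 64 bytes is typical, but not universal.
const cacheLineSize = 64

// cacheLine reports which cache line an offset falls in, assuming the
// struct itself begins on a cache line boundary.
func cacheLine(offset uintptr) uintptr {
	return offset / cacheLineSize
}

// badLayoutOffsets returns the offsets of Flag and Field2 and the total size.
func badLayoutOffsets() (flagOff, field2Off, size uintptr) {
	var b BadLayout
	return unsafe.Offsetof(b.Flag), unsafe.Offsetof(b.Field2), unsafe.Sizeof(b)
}

func main() {
	flagOff, field2Off, size := badLayoutOffsets()
	fmt.Printf("Flag:   offset %d, cache line %d\n", flagOff, cacheLine(flagOff))
	fmt.Printf("Field2: offset %d, cache line %d\n", field2Off, cacheLine(field2Off))
	fmt.Printf("size:   %d bytes\n", size)
}
```

Flag lands at offset 60 and Field2 at offset 64, for a total size of 72 bytes, so the two fields sit in different cache lines under the alignment assumption above.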
Optimized Layout
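Placing the small, frequently accessed fields first keeps them together in the first cache line. The struct is still 72 bytes, but reading Field2 and Flag now touches a single line:
type GoodLayout struct {
	Field2 int64    // offset 0: 8 bytes, naturally aligned
	Flag   bool     // offset 8: 1 byte, same cache line as Field2
	Field1 [60]byte // offsets 9-68: byte arrays need no alignment padding; 3 bytes of tail padding follow
}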
Key insight: Keeping hot fields together and within cache line boundaries can improve iteration performance by 10-20% for large slices of structs.
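One way to check whether a group of hot fields stays within a single cache line is to measure the byte range they occupy. The sketch below is illustrative only: HotFirst, linesTouched and hotSpan are hypothetical names, the split into "hot" and "cold" fields is assumed, and the result depends on the struct starting on a 64-byte boundary.

```go
package main

import (
	"fmt"
	"unsafe"
)

// HotFirst is a hypothetical layout: the two frequently read fields come
// first and the large, rarely read buffer comes last.
type HotFirst struct {
	Field2 int64    // hot
	Flag   bool     // hot
	Field1 [60]byte // cold
}

// cacheLineSize is an assumption: 64 bytes is typical, but not universal.
const cacheLineSize = 64

// linesTouched returns how many cache lines the byte range [start, end)
// spans, assuming offset 0 lies on a cache line boundary.
func linesTouched(start, end uintptr) uintptr {
	if end <= start {
		return 0
	}
	return (end-1)/cacheLineSize - start/cacheLineSize + 1
}

// hotSpan returns the byte range covering Field2 and Flag in HotFirst.
func hotSpan() (start, end uintptr) {
	var h HotFirst
	start = unsafe.Offsetof(h.Field2)
	end = unsafe.Offsetof(h.Flag) + unsafe.Sizeof(h.Flag)
	return start, end
}

func main() {
	start, end := hotSpan()
	fmt.Printf("hot fields occupy bytes [%d, %d) and touch %d cache line(s)\n",
		start, end, linesTouched(start, end))
}
```

For comparison, a range such as [60, 72), which is where a bool at offset 60 and an int64 at offset 64 would sit, spans two lines under the same assumption.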
Memory Alignment Rules
Go follows specific alignment rules for different architectures. Each field must start at an offset divisible by its alignment requirement, and the struct's total size is padded up to a multiple of its largest field alignment.
AMD64 / ARM64 (64-bit)
Type                      Size      Alignment
bool, int8, uint8         1 byte    1 byte
int16, uint16             2 bytes   2 bytes
int32, uint32, float32    4 bytes   4 bytes
int64, uint64, float64    8 bytes   8 bytes
string                    16 bytes  8 bytes
slice                     24 bytes  8 bytes
pointer, map, chan        8 bytes   8 bytes
interface                 16 bytes  8 bytes
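The sizes and alignments in the table can be confirmed on whatever platform the program is built for by using unsafe.Sizeof and unsafe.Alignof. This is a small sketch; the typeInfo struct and the describe function are hypothetical helpers introduced for the example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// typeInfo records the size and alignment of one representative type.
type typeInfo struct {
	Name  string
	Size  uintptr
	Align uintptr
}

// describe measures a representative value of each kind on the current
// target architecture.
func describe() []typeInfo {
	var (
		b   bool
		i16 int16
		i32 int32
		i64 int64
		s   string
		sl  []byte
		p   *int
		m   map[string]int
		ch  chan int
		e   interface{}
	)
	return []typeInfo{
		{"bool", unsafe.Sizeof(b), unsafe.Alignof(b)},
		{"int16", unsafe.Sizeof(i16), unsafe.Alignof(i16)},
		{"int32", unsafe.Sizeof(i32), unsafe.Alignof(i32)},
		{"int64", unsafe.Sizeof(i64), unsafe.Alignof(i64)},
		{"string", unsafe.Sizeof(s), unsafe.Alignof(s)},
		{"slice", unsafe.Sizeof(sl), unsafe.Alignof(sl)},
		{"pointer", unsafe.Sizeof(p), unsafe.Alignof(p)},
		{"map", unsafe.Sizeof(m), unsafe.Alignof(m)},
		{"chan", unsafe.Sizeof(ch), unsafe.Alignof(ch)},
		{"interface", unsafe.Sizeof(e), unsafe.Alignof(e)},
	}
}

func main() {
	for _, t := range describe() {
		fmt.Printf("%-10s size %2d  align %d\n", t.Name, t.Size, t.Align)
	}
}
```

Strings and interfaces are two machine words and slices are three, so their sizes scale with the pointer size of the target.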
386 (32-bit)
On 32-bit systems, pointers are 4 bytes and int64/float64 have 4-byte alignment (not 8). This means struct layouts can differ between architectures.
Use the extension: Switch between architectures in Go Memory Visualizer to see exactly how your struct layouts differ across platforms.
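The architecture difference can also be illustrated without cross-compiling by modelling the layout rule directly. The sketch below is a simplified model, not the compiler's implementation: the field type and the layout and alignUp functions are hypothetical, and the sizes and alignments passed in are taken from the rules described above.

```go
package main

import "fmt"

// field describes one struct field by its size and alignment in bytes.
type field struct {
	Name  string
	Size  uintptr
	Align uintptr
}

// alignUp rounds n up to the next multiple of align.
func alignUp(n, align uintptr) uintptr {
	return (n + align - 1) / align * align
}

// layout places each field at the next offset that satisfies its alignment
// and rounds the total size up to the largest alignment seen.
func layout(fields []field) (offsets []uintptr, size uintptr) {
	var off, maxAlign uintptr = 0, 1
	for _, f := range fields {
		off = alignUp(off, f.Align)
		offsets = append(offsets, off)
		off += f.Size
		if f.Align > maxAlign {
			maxAlign = f.Align
		}
	}
	return offsets, alignUp(off, maxAlign)
}

func main() {
	// Models struct { A bool; B int64 } on each architecture.
	amd64 := []field{{"A", 1, 1}, {"B", 8, 8}}
	i386 := []field{{"A", 1, 1}, {"B", 8, 4}}

	offs64, size64 := layout(amd64)
	offs32, size32 := layout(i386)
	fmt.Println("amd64: offsets", offs64, "size", size64)
	fmt.Println("386:   offsets", offs32, "size", size32)
}
```

Under this model the same two fields occupy 16 bytes with 8-byte int64 alignment and 12 bytes with 4-byte alignment.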
Best Practices
Order fields by alignment - Place 8-byte aligned fields first, then 4-byte, then 2-byte, finally 1-byte
Group similar types - Keep fields of the same size together to minimize padding
Consider cache line boundaries - For performance-critical structs, keep the total size at or under 64 bytes when possible, so that one value can fit in a single cache line
Use the extension's optimization - One-click automatic field reordering
Benchmark your specific use case - Performance gains vary by access patterns
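As a small illustration of the ordering advice above, the sketch below compares the same four fields declared in two orders. Unordered, Ordered and the sizes helper are hypothetical names, and the byte counts in the comments apply to 64-bit targets such as amd64 and arm64.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Unordered is a hypothetical struct with fields declared in arbitrary order.
type Unordered struct {
	A bool  // 1 byte, then 7 bytes of padding before B on 64-bit targets
	B int64 // 8 bytes
	C bool  // 1 byte, then 3 bytes of padding before D
	D int32 // 4 bytes
}

// Ordered holds the same fields sorted by decreasing alignment.
type Ordered struct {
	B int64 // 8 bytes
	D int32 // 4 bytes
	A bool  // 1 byte
	C bool  // 1 byte, then 2 bytes of tail padding
}

// sizes returns the size of each layout on the current architecture.
func sizes() (unordered, ordered uintptr) {
	return unsafe.Sizeof(Unordered{}), unsafe.Sizeof(Ordered{})
}

func main() {
	u, o := sizes()
	fmt.Printf("Unordered: %d bytes\n", u) // 24 on amd64/arm64
	fmt.Printf("Ordered:   %d bytes\n", o) // 16 on amd64/arm64
	fmt.Printf("saved:     %d bytes per value\n", u-o)
}
```

A size difference like this only indicates the memory saved per value; whether it translates into a throughput gain still depends on the access pattern, so it is worth measuring with a benchmark on the real workload.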