Performance Impact of Memory Layout

Understanding how struct field ordering affects Go application performance

5-20% Throughput Improvement
10-30% Memory Reduction
64 bytes CPU Cache Line Size

CPU Cache Lines and Performance

Most modern x86-64 and ARM64 CPUs fetch memory in 64-byte cache lines. When a struct spans multiple cache lines, reading all of its fields can require more than one fetch, which slows down hot loops over large data sets.
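The arithmetic behind this can be sketched in a few lines. The helper below, linesTouched (a hypothetical name, not part of any library), counts how many cache lines a byte range overlaps, assuming the 64-byte line size stated above.

```go
package main

import "fmt"

// cacheLineSize is an assumption: 64 bytes is common, but not universal.
const cacheLineSize = 64

// linesTouched reports how many cache lines the byte range
// [start, start+size) overlaps. size must be greater than zero.
func linesTouched(start, size uintptr) int {
	first := start / cacheLineSize
	last := (start + size - 1) / cacheLineSize
	return int(last-first) + 1
}

func main() {
	fmt.Println(linesTouched(0, 64)) // 1: a 64-byte struct at a line boundary
	fmt.Println(linesTouched(0, 72)) // 2: a 72-byte struct spills into the next line
	fmt.Println(linesTouched(60, 8)) // 2: an 8-byte read straddling the boundary
}
```

In a slice, element i starts at i times the struct size, so elements of a struct whose size is not a multiple of 64 land at varying positions relative to line boundaries.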

Cache-Inefficient Struct

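This struct is 72 bytes, so it spans two cache lines, and its two small fields land on opposite sides of the 64-byte boundary when the struct starts on a cache line boundary:

type BadLayout struct {
    Field1 [60]byte // offsets 0-59: fills most of the first cache line
    Flag   bool     // offset 60: followed by 3 bytes of padding
    Field2 int64    // offset 64: starts the second cache line
}

Layout: Field1 (60 bytes) | Flag (1 byte) | pad (3 bytes) | Field2 (8 bytes)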
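The offsets can be checked directly rather than computed by hand. The sketch below redefines BadLayout locally and reads its layout with unsafe.Offsetof and unsafe.Sizeof; the helper name badLayoutInfo is made up for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// BadLayout mirrors the struct from the text above.
type BadLayout struct {
	Field1 [60]byte
	Flag   bool
	Field2 int64
}

// badLayoutInfo returns the offsets of Flag and Field2 and the total size.
func badLayoutInfo() (flagOff, field2Off, size uintptr) {
	var v BadLayout
	return unsafe.Offsetof(v.Flag), unsafe.Offsetof(v.Field2), unsafe.Sizeof(v)
}

func main() {
	flagOff, field2Off, size := badLayoutInfo()
	fmt.Println("Flag offset:  ", flagOff)   // 60
	fmt.Println("Field2 offset:", field2Off) // 64
	fmt.Println("Total size:   ", size)      // 72
}
```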

Optimized Layout

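type GoodLayout struct {
    Field2 int64    // offset 0: 8-byte aligned, no padding needed
    Flag   bool     // offset 8: same cache line as Field2
    Field1 [60]byte // offsets 9-68: byte arrays need no alignment padding
    // 3 bytes of trailing padding bring the total size to 72
}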

Key insight: Keeping hot fields together and within cache line boundaries can improve iteration performance by 10-20% for large slices of structs.

Memory Alignment Rules

Go follows specific alignment rules for different architectures. Fields must start at addresses divisible by their alignment requirement.

AMD64 / ARM64 (64-bit)

Type                         Size       Alignment
bool, int8, uint8            1 byte     1 byte
int16, uint16                2 bytes    2 bytes
int32, uint32, float32       4 bytes    4 bytes
int64, uint64, float64       8 bytes    8 bytes
string                       16 bytes   8 bytes
slice                        24 bytes   8 bytes
pointer, map, chan           8 bytes    8 bytes
interface                    16 bytes   8 bytes
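The values in this table can be reproduced on your own machine. The following is a minimal sketch using unsafe.Sizeof and unsafe.Alignof; the row type and rows helper are illustrative names, and the printed numbers match the table only on a 64-bit platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

// row holds the measured size and alignment of one type.
type row struct {
	name        string
	size, align uintptr
}

// rows measures a representative value of each kind listed in the table.
func rows() []row {
	var (
		b   bool
		i16 int16
		i32 int32
		i64 int64
		s   string
		sl  []byte
		p   *int
		e   interface{}
	)
	return []row{
		{"bool", unsafe.Sizeof(b), unsafe.Alignof(b)},
		{"int16", unsafe.Sizeof(i16), unsafe.Alignof(i16)},
		{"int32", unsafe.Sizeof(i32), unsafe.Alignof(i32)},
		{"int64", unsafe.Sizeof(i64), unsafe.Alignof(i64)},
		{"string", unsafe.Sizeof(s), unsafe.Alignof(s)},
		{"slice", unsafe.Sizeof(sl), unsafe.Alignof(sl)},
		{"pointer", unsafe.Sizeof(p), unsafe.Alignof(p)},
		{"interface", unsafe.Sizeof(e), unsafe.Alignof(e)},
	}
}

func main() {
	for _, r := range rows() {
		fmt.Printf("%-10s size=%2d align=%d\n", r.name, r.size, r.align)
	}
}
```

A string is a pointer plus a length, a slice adds a capacity, and an interface value is two words, which is why their sizes are multiples of the pointer size.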

386 (32-bit)

On 32-bit systems, pointers are 4 bytes and int64/float64 have 4-byte alignment (not 8). This means struct layouts can differ between architectures.
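One way to see this difference without cross-compiling is the standard go/types package, whose SizesFor function models the layout rules of the gc compiler for a given GOARCH. The sketch below builds a small illustrative struct (Flag and Count are made-up field names) and asks for its size on both architectures.

```go
package main

import (
	"fmt"
	"go/token"
	"go/types"
)

// sizeOn returns the size that the gc compiler's layout rules give
// struct{ Flag bool; Count int64 } on the named GOARCH.
func sizeOn(arch string) int64 {
	fields := []*types.Var{
		types.NewField(token.NoPos, nil, "Flag", types.Typ[types.Bool], false),
		types.NewField(token.NoPos, nil, "Count", types.Typ[types.Int64], false),
	}
	st := types.NewStruct(fields, nil)
	return types.SizesFor("gc", arch).Sizeof(st)
}

func main() {
	for _, arch := range []string{"amd64", "386"} {
		// amd64: Count sits at offset 8, total 16 bytes.
		// 386:   Count sits at offset 4, total 12 bytes.
		fmt.Printf("%-5s %d bytes\n", arch, sizeOn(arch))
	}
}
```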

Use the extension: Switch between architectures in Go Memory Visualizer to see exactly how your struct layouts differ across platforms.

Best Practices

  1. Order fields by alignment - Place 8-byte aligned fields first, then 4-byte, then 2-byte, finally 1-byte
  2. Group similar types - Keep fields of the same size together to minimize padding
  3. Consider cache line boundaries - For performance-critical structs, keep total size at or under 64 bytes when possible
  4. Use the extension's optimization - One-click automatic field reordering
  5. Benchmark your specific use case - Performance gains vary by access patterns
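To illustrate the first two practices, the sketch below compares two structs with identical fields in different orders. Unordered and Ordered are hypothetical types invented for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Unordered interleaves small and large fields, which forces padding.
type Unordered struct {
	A bool
	B int64
	C bool
	D int32
	E bool
}

// Ordered holds the same fields sorted from largest to smallest alignment.
type Ordered struct {
	B int64
	D int32
	A bool
	C bool
	E bool
}

// sizes returns the sizes of Unordered and Ordered on the current platform.
func sizes() (unordered, ordered uintptr) {
	return unsafe.Sizeof(Unordered{}), unsafe.Sizeof(Ordered{})
}

func main() {
	u, o := sizes()
	// On amd64 and arm64 this prints 32 and 16.
	fmt.Println("Unordered:", u, "bytes")
	fmt.Println("Ordered:  ", o, "bytes")
}
```

The fields hold only 15 bytes of data, so the unordered version is more than half padding on 64-bit platforms.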

Optimal Field Ordering Pattern

type OptimalStruct struct {
    // 8-byte aligned fields first
    Pointer *SomeType
    Slice   []byte
    String  string
    Int64   int64
    Float64 float64

    // 4-byte aligned fields
    Int32   int32
    Float32 float32

    // 2-byte aligned fields
    Int16 int16

    // 1-byte aligned fields last
    Bool1 bool
    Bool2 bool
    Byte1 byte
}
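To check how much padding a given ordering still carries, a small reflection helper can compare a struct's total size with the sum of its field sizes. This is a rough sketch: paddingBytes and sample are hypothetical names, and the helper counts only padding at the top level, not padding inside nested structs.

```go
package main

import (
	"fmt"
	"reflect"
)

// paddingBytes returns the struct's total size minus the sum of its
// top-level field sizes. It returns 0 for non-struct values.
func paddingBytes(v interface{}) uintptr {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return 0
	}
	var sum uintptr
	for i := 0; i < t.NumField(); i++ {
		sum += t.Field(i).Type.Size()
	}
	return t.Size() - sum
}

// sample is an illustrative struct with a 1-byte field before a 4-byte field.
type sample struct {
	Flag  bool
	Count int32
}

func main() {
	fmt.Println(paddingBytes(sample{})) // 3: padding between Flag and Count
}
```

Running the same helper before and after reordering gives a quick measure of what the reordering saved.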

Real-World Performance Gains

  • API Servers: 5-15% throughput improvement from better cache locality on request/response structs
  • Data Processing: 10-20% faster iteration over large slices of optimized structs
  • Memory Usage: 10-30% reduction in heap memory used by padding-heavy structs, reducing GC pressure
  • Cloud Costs: Smaller memory footprint = lower infrastructure costs at scale

Benchmarking Your Structs

Always benchmark your specific use case to measure actual performance impact:

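package layout // assumed package name; it must also define BadLayout and GoodLayout

import "testing"

// sink receives each result so the compiler cannot discard the loads.
var sink bool

func BenchmarkBadLayout(b *testing.B) {
    items := make([]BadLayout, 10000)
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        for j := range items {
            sink = items[j].Flag
        }
    }
}

func BenchmarkGoodLayout(b *testing.B) {
    items := make([]GoodLayout, 10000)
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        for j := range items {
            sink = items[j].Flag
        }
    }
}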

Run with: go test -bench=. -benchmem

Related Tools

  • Go Memory Visualizer - Real-time visualization and one-click optimization
  • go tool compile -m - Escape analysis to see what allocates on heap
  • go build -gcflags=-m - Additional compiler optimization insights
  • pprof - Memory profiling for production applications
  • unsafe.Sizeof() - Check struct sizes programmatically