Go Struct Memory Performance Guide

CPU Cache Lines and Performance

Most modern x86-64 and ARM64 CPUs fetch memory in 64-byte cache lines. When the fields a loop touches are spread across several cache lines, each access can require more than one fetch, which slows memory-bound code.

Cache-Inefficient Struct

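This struct occupies 72 bytes on 64-bit targets, so it always spans two cache lines. Its field ordering also splits the two small fields across the 64-byte boundary (assuming the struct starts on a cache-line boundary):

type BadLayout struct {
    Field1 [60]byte  // offsets 0-59: fills most of the first cache line
    Flag   bool      // offset 60: last field in the first cache line, then 3 bytes of padding
    Field2 int64     // offset 64: lands in the second cache line
}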

Layout (72 bytes): Field1 (60 bytes) | Flag (1 byte) | pad (3 bytes) | Field2 (8 bytes)

Optimized Layout

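Placing the two small fields first keeps them together in the first 9 bytes. The struct is still 72 bytes, because 69 bytes of data cannot fit in a single cache line:

type GoodLayout struct {
    Field2 int64     // offset 0: 8-byte aligned, no padding needed
    Flag   bool      // offset 8: shares the first cache line with Field2
    Field1 [60]byte  // offsets 9-68, followed by 3 bytes of trailing padding
}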

Key insight: Keeping hot fields together and within cache line boundaries can improve iteration performance by 10-20% for large slices of structs.

Memory Alignment Rules

Go follows specific alignment rules for different architectures. Fields must start at addresses divisible by their alignment requirement.
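The rule can be applied by hand: round the running offset up to each field's alignment, then round the final size up to the largest alignment in the struct. The sketch below is an illustration of that arithmetic, not the compiler's actual implementation; `field`, `alignUp`, and `layout` are hypothetical names, and it ignores special cases such as zero-size trailing fields.

```go
package main

import "fmt"

// field describes one struct field by its size and alignment in bytes.
type field struct {
	name        string
	size, align uintptr
}

// alignUp rounds offset up to the next multiple of align.
// align must be a power of two, which holds for all Go alignments.
func alignUp(offset, align uintptr) uintptr {
	return (offset + align - 1) &^ (align - 1)
}

// layout returns the offset of each field and the padded struct size.
func layout(fields []field) (offsets []uintptr, size uintptr) {
	var off, maxAlign uintptr = 0, 1
	for _, f := range fields {
		off = alignUp(off, f.align) // insert padding before the field if needed
		offsets = append(offsets, off)
		off += f.size
		if f.align > maxAlign {
			maxAlign = f.align
		}
	}
	// Trailing padding keeps every element of a slice of structs aligned.
	return offsets, alignUp(off, maxAlign)
}

func main() {
	// Models struct { A bool; B int64; C int32 } using the 64-bit sizes.
	fields := []field{{"A", 1, 1}, {"B", 8, 8}, {"C", 4, 4}}
	offsets, size := layout(fields)
	for i, f := range fields {
		fmt.Printf("%s at offset %d\n", f.name, offsets[i])
	}
	fmt.Println("total size:", size)
}
```

For this input the offsets are 0, 8, and 16, and the total size is 24 bytes: 7 bytes of padding after A and 4 bytes of trailing padding after C.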

AMD64 / ARM64 (64-bit)

Type	Size	Alignment
bool, int8, uint8	1 byte	1 byte
int16, uint16	2 bytes	2 bytes
int32, uint32, float32	4 bytes	4 bytes
int64, uint64, float64	8 bytes	8 bytes
string	16 bytes	8 bytes
slice	24 bytes	8 bytes
pointer, map, chan	8 bytes	8 bytes
interface	16 bytes	8 bytes

386 (32-bit)

On 32-bit systems, pointers are 4 bytes and int64/float64 have 4-byte alignment (not 8). This means struct layouts can differ between architectures.

Use the extension: Switch between architectures in Go Memory Visualizer to see exactly how your struct layouts differ across platforms.

Best Practices

Order fields by alignment - Place 8-byte aligned fields first, then 4-byte, then 2-byte, finally 1-byte
Group similar types - Keep fields of the same size together to minimize padding
Consider cache line boundaries - For performance-critical structs, keep total size at or under 64 bytes (one cache line) when possible
Use the extension's optimization - One-click automatic field reordering
Benchmark your specific use case - Performance gains vary by access patterns

Optimal Field Ordering Pattern

type OptimalStruct struct {
    // 8-byte aligned fields first
    Pointer   *SomeType
    Slice     []byte
    String    string
    Int64     int64
    Float64   float64

    // 4-byte aligned fields
    Int32     int32
    Float32   float32

    // 2-byte aligned fields
    Int16     int16

    // 1-byte aligned fields last
    Bool1     bool
    Bool2     bool
    Byte1     byte
}

Real-World Performance Gains

API Servers: 5-15% throughput improvement from better cache locality on request/response structs
Data Processing: 10-20% faster iteration over large slices of optimized structs
Memory Usage: 10-30% smaller heap footprint for padding-heavy structs, reducing GC pressure
Cloud Costs: Smaller memory footprint = lower infrastructure costs at scale

Benchmarking Your Structs

Always benchmark your specific use case to measure actual performance impact:

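// Place in a _test.go file in the package that defines BadLayout and GoodLayout.
import "testing"

// sink receives each result so the compiler cannot discard the loop as dead code.
var sink int

func BenchmarkBadLayout(b *testing.B) {
    items := make([]BadLayout, 10000)
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        n := 0
        for j := range items {
            if items[j].Flag {
                n++
            }
        }
        sink = n
    }
}

func BenchmarkGoodLayout(b *testing.B) {
    items := make([]GoodLayout, 10000)
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        n := 0
        for j := range items {
            if items[j].Flag {
                n++
            }
        }
        sink = n
    }
}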

Run with: go test -bench=. -benchmem

Related Tools

Go Memory Visualizer - Real-time visualization and one-click optimization
go build -gcflags=-m - Escape analysis and inlining decisions, showing what allocates on the heap
go tool compile -m - The same diagnostics when invoking the compiler directly
pprof - Memory profiling for production applications
unsafe.Sizeof(), unsafe.Alignof(), unsafe.Offsetof() - Check struct sizes, alignments, and field offsets programmatically

Performance Impact of Memory Layout