How SOA works under the hood?

griush · 12 November 2024 11:33

I was reading https://odin-lang.org/docs/overview/#soa-struct-arrays, what is the difference between this and normal arrays? I understand is similar to Zig’s MultiArrayList, but I would like some more details.

Thanks

Barinzaya · 12 November 2024 12:37

Say we have a struct like:

Foo :: struct {
    x: f64,
    y, z: u8,
}

The normal (array of structs/AoS) representation ([N]Foo) would look like this in memory:

| x1 | y1 | z1 | padding | x2 | y2 | z2 | padding | x3 | y3 | z3 | padding | ...

This keeps all of the struct’s data together, meaning that you can take a pointer to any individual struct in it, i.e. &foo_aos[1] can give you a ^Foo and all of its data will be in one location.

In contrast, the struct of arrays/SoA representation (#soa[N]Foo) would instead something look like this:

| x1 | x2 | x3 | ... | y1 | y2 | y3 | ... | z1 | z2 | z3 | ...

Thus, it’s similar to (but more flexible than) a literal struct of arrays:

Foo_SoA :: struct ($N: int) {
    x: [N]f64,
    y, z: [N]u8,
}

Odin smooths things over so that it semantically behaves as similarly to a regular array of structs as it can, but there are a number of effects, roughly arranged from positive to negative (YMMV):

Since the different fields of a struct can be far away from each other in memory, this can have performance impacts. This can be good or bad, depending on how you’re using it. If you’re usually iterating over everything in sequence, the effect can be positive compared to a regular AoS array, particularly if you aren’t accessing all of the fields. However, if you’re accessing multiple fields of the struct, and accessing the elements of the array in no specific order, it can be negative as it will require data to be pulled from multiple places (and with random access, it will be less likely to be cached).
A #soa[]Foo requires a single length and a pointer per field. So for the above Foo, size_of(#soa[]Foo) is 4 pointers large; one for a length, one for x, one for y, and one for z. Thus, it is effectively multiple slices combined. This representation allows it to be used through soa_zip to combine multiple slices into a single #soa slice, allowing for use of for in to iterate over multiple slices in tandem.
You can get an array/slice of an individual field. For instance, with a : #soa[3]Foo, you can access a.x and get [3]f64 (and likewise with a #soa[]Foo, where you get []f64). This can be useful if you’re only interested in one (or a few) fields, and in some cases can also work well in conjunction with array programming. This can also enable the use of #simd, when that level of performance is required.
For structs with varied size fields (like the example), the data can potentially be packed better. For instance, a normal [N]Foo would require 6 bytes of padding per element for a total of 16*N bytes, whereas a #soa[N]Foo can potentially pack things better. This is, in effect, 10 * N bytes, but the array as a whole would still need to be padded to a multiple of 8 due tof64’s alignment. The difference is negligible for the amount of memory in this example, but can add up with a large number of values in an array.
Odin smooths over the behavior and allows you to access a #soa[N]Foo like a regular [N]Foo to copy a Foo out of it, but you can’t get a ^Foo directly into an #soa[N]Foo. ^Foo requires a specific memory layout, namely that all of the data is together in memory, which is no longer the case. You instead get a #soa ^#soa[N]Foo, which internally is just a pointer to the #soa[N]Foo and an index. This is necessary to be able to find where the individual field data is located. The behavior is similar with a slice, except that #soa ^#soa[]Foo is a pointer to the slice (note: not a pointer to the array data itself) and an index.

In short, it’s a somewhat niche feature that’s really nice to have when it matches what you want to do, but isn’t something you should be using by default. The soa_zip part is probably the least specialized use of it, and to some extent that’s more of a handy side effect than anything.

gingerBill · 12 November 2024 12:38

Foo :: struct {
    a: f32,
    b: i32,
    c: u64,
}

Fixed :: [3]Foo
Slice :: []Foo
Dynamic :: [dynamic]Foo

Soa_Fixed :: #soa[3]Foo
Soa_Slice :: #soa[]Foo
Soa_Dynamic :: #soa[dynamic]Foo

// internals
Soa_Fixed :: struct {
    a: [3]f32,
    b: [3]i32,
    c: [3]u64,
}

Soa_Slice :: struct {
    a: [^]f32,
    b: [^]f64,
    c: [^]u64,
    len: int,
}

Soa_Dynamic :: struct {
    a: [^]f32,
    b: [^]f64,
    c: [^]u64,
    len: int,
    cap: int,
    allocator: runtime.Allocator,
}