How to get size of generic structure, when typeid is not constant?

I’m implementing an ECS. The init function accepts a list of typeids of all possible components. All components are stored in one big array in the format Block_Header | Component_Header | T1 | Component_Header | T2 | ...

// ...

@(private)
Component_Header :: struct {
	set:       bool,
}

@(private)
Component :: struct($T: typeid) {
	header:    Component_Header,
	component: T,
}

init :: proc(w: ^World, allocator: runtime.Allocator, types: []typeid) {
	// ...

	size := size_of(Block_Header)
	for t in types {
		w.offsets[t] = size
		size += size_of(Component(type_of(t))) // HERE is the problem
	}

	w.stride = size
	return
}

The problem here is that size_of(Component(type_of(t))) basically returns a size based on typeid itself (8 bytes), because type_of(t) is typeid and not the type that t names. I understand that t is not known at compile time here, so it’s impossible to instantiate Component(T) from it. But I don’t know the correct way of getting the right size.
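To make the mismatch concrete, here is a minimal sketch. The helper component_size and the variable t are illustrative names, not part of the original code. It shows that type_of(t) is always typeid, and that when the component types happen to be known at the call site, passing them as compile-time parameters lets size_of see the real Component(T).

```odin
package main

import "core:fmt"

Component_Header :: struct {
	set: bool,
}

Component :: struct($T: typeid) {
	header:    Component_Header,
	component: T,
}

// Hypothetical helper: T is a compile-time parameter here, so the compiler
// can instantiate Component(T) and report its real size.
component_size :: proc($T: typeid) -> int {
	return size_of(Component(T))
}

main :: proc() {
	t := typeid_of([2]f64)

	// The static type of a typeid variable is typeid, whatever it holds.
	#assert(type_of(t) == typeid)

	// So this measures Component(typeid), not Component([2]f64).
	fmt.println("via type_of(t):  ", size_of(Component(type_of(t))))

	// With a compile-time type the layout is the real one.
	fmt.println("via $T parameter:", component_size([2]f64))
}
```

This only helps when the list of types is fixed in the source; a []typeid built at run time still needs a run-time calculation.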

Upd. If I do size += size_of(Component_Header) + type_info_of(t).size, I get an incorrect size because of alignment.

i hacked my way to this

	size := size_of(Block_Header)
	for t in types {
		w.offsets[t] = size
		size += mem.align_forward_int(size_of(Component_Header), type_info_of(t).align)
		size += type_info_of(t).size
	}

but i’m not sure if this is the right way to do this
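One possible way to make the run-time calculation more robust is sketched below. It is an assumption-laden sketch, not a confirmed idiom: Layout, component_layout and BLOCK_HEADER_SIZE are hypothetical names, and the block header size of 4 is made up. The idea is to reproduce the struct layout rules by hand (align the payload, then round the total up to the struct alignment) and also to align the running offset, so that a component with 8-byte alignment never lands on a 4-byte boundary.

```odin
package main

import "core:fmt"
import "core:mem"

Component_Header :: struct {
	set: bool,
}

Component :: struct($T: typeid) {
	header:    Component_Header,
	component: T,
}

// Hypothetical record of what Component(T) looks like for a run-time typeid.
Layout :: struct {
	size:           int, // should match size_of(Component(T))
	align:          int, // should match align_of(Component(T))
	payload_offset: int, // should match offset_of(Component(T), component)
}

component_layout :: proc(t: typeid) -> Layout {
	ti := type_info_of(t)
	align   := max(align_of(Component_Header), ti.align)
	payload := mem.align_forward_int(size_of(Component_Header), ti.align)
	size    := mem.align_forward_int(payload + ti.size, align)
	return Layout{size = size, align = align, payload_offset = payload}
}

main :: proc() {
	// Cross-check the hand calculation against the compiler for two known types.
	l32 := component_layout(typeid_of([2]f32))
	l64 := component_layout(typeid_of([2]f64))
	assert(l32.size == size_of(Component([2]f32)))
	assert(l64.size == size_of(Component([2]f64)))
	assert(l64.align == align_of(Component([2]f64)))

	// Lay the components out one after another behind a block header.
	BLOCK_HEADER_SIZE :: 4 // assumed value, stands in for size_of(Block_Header)
	types := []typeid{[2]f32, [2]f64}
	offset, block_align := BLOCK_HEADER_SIZE, 1
	for t in types {
		l := component_layout(t)
		offset = mem.align_forward_int(offset, l.align) // keep each slot aligned
		fmt.println(t, "at offset", offset, "size", l.size)
		offset += l.size
		block_align = max(block_align, l.align)
	}
	// Round the stride up so the next block starts aligned as well.
	stride := mem.align_forward_int(offset, block_align)
	fmt.println("stride:", stride)
}
```

The asserts in main are the useful part: they compare the run-time numbers with what size_of reports for types that are known at compile time.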

Edit: Removed due to its inaccuracy and to clean up forum space.

The problem I was facing here is that size_of(Component_Header) + type_info_of(T).size does not always equal size_of(Component(T)) because of alignment. In my case T was [2]f64, which gives the whole Component(T) an 8-byte alignment, whereas align_of(Component_Header) itself was 2 bytes, if I remember correctly.
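A small sketch of where the extra bytes go, using the one-bool Component_Header shown in the question (the alias C and the variable naive are illustrative). offset_of reports where the compiler actually placed the payload, which makes the padding visible.

```odin
package main

import "core:fmt"

Component_Header :: struct {
	set: bool,
}

Component :: struct($T: typeid) {
	header:    Component_Header,
	component: T,
}

main :: proc() {
	C :: Component([2]f64)

	// Header bytes plus payload bytes, ignoring any padding.
	naive := size_of(Component_Header) + size_of([2]f64)

	fmt.println("naive sum:     ", naive)
	fmt.println("payload offset:", offset_of(C, component)) // padding sits before this
	fmt.println("size_of:       ", size_of(C))
	fmt.println("align_of:      ", align_of(C))

	// The real size can never be smaller than the naive sum.
	assert(size_of(C) >= naive)
}
```

The gap between the naive sum and size_of is exactly the padding inserted after the header so that component starts on a multiple of its own alignment.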

Edit: The way size_of works has bugged me for a while and I have not yet put my finger on what it is I’m missing. I’m endeavoring to work this out eventually.

I seemed to remember reading that a struct’s alignment is that of its most strictly aligned field, and that its size is padded to match, so running with that I did some stuff to print size and alignment in 3 different ways to show the numbers all add up.

The below prints the values for:

  • Component(T) where T is [2]f32
    ** Then comp1 := Component([2]f32)
  • Component(T) where T is [2]f64
    ** Then comp2 := Component([2]f64)
  • total of types := [2]typeid{ [2]f64, [2]f32 }

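It appears that, for this Component_Header, all you need to get the size of Component(T) from a typeid t is

type_info_of(t).size + type_info_of(t).align

This works because the header is a single byte: the padding in front of component rounds it up to exactly one alignment unit of T. A header larger than align_of(T) would need an align-forward step instead.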

Note the print_comp_size_by_types procedure for the ultimate proof. It adds together the values from the first 2 approaches, using a typeid array of both types. Hopefully this is easy to follow. I had to print everything in a formatted manner to be able to wrap my head around it.

package main

import "core:fmt"
import "core:strings"

Component_Header :: struct {
	set:       bool,
}

Component :: struct($T: typeid) {
	header:    Component_Header,
	component: T,
}

print3columns :: proc(cols: [3]any) {
	fmt.printfln("%-20v%6v%6v", fmt.tprint(cols[0]), fmt.tprint(cols[1]), fmt.tprint(cols[2]))
}

print_comp_size_by_T :: proc($T: typeid, expr := #caller_expression) {
	p1, p2 := strings.index(expr, "("), strings.last_index(expr, ")")
	size, align := size_of(T), align_of(T)
	print3columns({expr[p1+1:p2], "Bytes", "Bits"})
	print3columns({"size",  size,  8*size})
	print3columns({"align", align, 8*align})
}

print_comp_size_by_var :: proc(comp: $T, expr := #caller_expression) {
	p1, p2 := strings.index(expr, "("), strings.last_index(expr, ")")
	size, align := size_of(T), align_of(T)
	print3columns({expr[p1+1:p2], "Bytes", "Bits"})
	print3columns({"size",  size,  8*size})
	print3columns({"align", align, 8*align})
}

print_comp_size_by_types :: proc(types: []typeid, expr := #caller_expression) {
	size, align: int
	for t in types {
		ti := type_info_of(t)
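		size  += ti.size + ti.align // equals size_of(Component(T)) while the header is 1 byte
		align += ti.align           // summed only so the totals can be compared by eye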
	}
	print3columns({fmt.tprintf("typeid[%v]", len(types)), "Bytes", "Bits"})
	print3columns({"size",  size,  8*size})
	print3columns({"align", align, 8*align})
}

main :: proc() {
	//	Component([2]f32) approach
	print_comp_size_by_T(Component([2]f32))
	fmt.println()

	//	comp1 := Component([2]f32) approach
	comp1 := Component([2]f32) {Component_Header{true}, {42, 42}}
	print_comp_size_by_var(comp1)
	fmt.println()

	//	Component([2]f64) approach
	print_comp_size_by_T(Component([2]f64))
	fmt.println()

	//	comp2 := Component([2]f64) approach
	comp2 := Component([2]f64) {Component_Header{true}, {42, 42}}
	print_comp_size_by_var(comp2)
	fmt.println()
	
	//	Ultimate goal: types := []typeid method
	types := []typeid{[2]f32, [2]f64}
	print_comp_size_by_types(types)
}

Output:

Component([2]f32)    Bytes  Bits
size                    12    96
align                    4    32

comp1                Bytes  Bits
size                    12    96
align                    4    32

Component([2]f64)    Bytes  Bits
size                    24   192
align                    8    64

comp2                Bytes  Bits
size                    24   192
align                    8    64

typeid[2]            Bytes  Bits
size                    36   288
align                   12    96

I left a few print statements using afmt instead of fmt. Habits, ya know.

Hopefully that did not confuse things. I updated the above code to use fmt. Also added other edits for my future self to reference.

This is why “data oriented design” (go watch this and then this) is so important for efficiency.

ComponentA :: struct {
    header:    bool,     // 1 byte
    // 3 bytes of padding
    component:  f32,     // 4 bytes
}   // alignment 4, bytes 8

ComponentB :: struct{
    header1:    bool,    // 1 byte
    header2:    bool,    // 1 byte
    icomponent:  i16,     // 2 bytes
    fcomponent:  f32,    // 4 bytes
}   // alignment 4, bytes 8

ComponentC :: struct{
    header1:    bool,    // 1 byte
    header2:    bool,    // 1 byte
    // 2 bytes of padding
    fcomponent:  f32,    // 4 bytes
    ucomponent:  u16,     // 2 bytes
    // 2 bytes of padding
}   // alignment 4, bytes 12

Wasted space is eventually wasted time. Write your data structures accordingly.
Use SOA when appropriate. It’s one of Odin’s strengths.
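As a rough illustration of both points, the sketch below reorders the fields of ComponentC so that no padding is needed, and then compares an ordinary array with an #soa array. Particle and the element count of 256 are made up for the example.

```odin
package main

import "core:fmt"

// Same fields as ComponentC above, ordered from largest to smallest alignment.
ComponentC_Reordered :: struct {
	fcomponent: f32,  // 4 bytes
	ucomponent: u16,  // 2 bytes
	header1:    bool, // 1 byte
	header2:    bool, // 1 byte
}   // alignment 4, bytes 8, no padding

#assert(size_of(ComponentC_Reordered) == 8)

// Hypothetical component used only to illustrate #soa.
Particle :: struct {
	alive: bool,
	pos:   [2]f32,
	mass:  f64,
}

main :: proc() {
	aos: [256]Particle     // array of structs: padding is repeated per element
	soa: #soa[256]Particle // struct of arrays: each field is stored contiguously

	// Both are indexed the same way.
	aos[3].mass = 2.5
	soa[3].mass = 2.5

	fmt.println("AoS bytes:", size_of(aos))
	fmt.println("SoA bytes:", size_of(soa))
	fmt.println("same value:", aos[3].mass == soa[3].mass)
}
```

Reordering removes padding inside one struct; #soa removes the per-element padding that remains when fields of different alignment have to live side by side.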


So would it be a fair statement to say that what we are talking about is just the memory footprint of addresses used by structures (and or variables), and not the total memory footprint of both memory used by addresses and data?

So take for example a string. If I wanted to know its total memory footprint, addresses and data, would I always have to calculate that, or is there a procedure similar to size_of that does all that? If the answer is that I always have to calculate it, then would the below be the way to do it for, in this case, a string?

mystring := "I'm a string vibrating in multiple dimensions. Do I have a local size or a non-local wave function that is collapsing through your eye balls?"
total_size := size_of(mystring) + (size_of(byte) * len(mystring))
// or, less redundantly, since the size of a byte is ... well, a byte
total_size = size_of(mystring) + len(mystring)

Yes. Slices, dynamic arrays, and maps are structures containing pointers to data. It’s why they use make / delete instead of new / free. It’s mentioned in the FAQ. Strings are too, but they have a simplified syntax for ease of use. Reminder: don’t try to delete() a string literal.

stra := "abc"       // align 8, bytes 16; length 3 (elsewhere)
strb := "ABCDEF"    // align 8, bytes 16; length 6 (elsewhere)

Note that fixed arrays are data, not structures with pointers.

arra : [1]i32       // align 4, bytes 4
arra[0] = 1         // align 4, bytes 4

arrb : [2]i32       // align 4, bytes 8
arrb[0] = 2         // align 4, bytes 4

arrc : [4]i16       // align 2, bytes 8
arrc[0] = 3         // align 2, bytes 2
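A short sketch of the same distinction in runnable form (backing and window are illustrative names). size_of only ever sees the fixed-size part: for a string or slice that is the pointer and length, while for a fixed array it is the data itself.

```odin
package main

import "core:fmt"

main :: proc() {
	stra := "abc"
	strb := "ABCDEF"

	// Same size_of for both: only the (pointer, length) pair is counted.
	fmt.println(size_of(stra), len(stra))
	fmt.println(size_of(strb), len(strb))
	assert(size_of(stra) == size_of(strb))

	backing := [4]i32{1, 2, 3, 4}
	window  := backing[1:3] // slice: a view into backing, owns no data

	// The array is its data; the slice only describes part of it.
	fmt.println(size_of(backing))
	fmt.println(size_of(window), len(window) * size_of(i32))
}
```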

Appreciate your response. I’m solid on the difference between dynamic vs. literals, make vs free, and stack vs heap. Slices are windows or views (or a subset) of other data. Of course don’t delete literal data, etc. I’m clear on what the string type actually is (a byte array with fixed capacity == length in the case of literals; length and capacity in the case of dynamic).

I suppose I’m having trouble articulating the question (because I’m in learning mode and not sure what it is I don’t know yet), so bear with me.

Is there a procedure that knows the difference between literal and dynamic variables and the nuances of each type, that can give me the total memory footprint of a variable? One that takes into account all the memory used to store and track length, capacity, type, and data. The full monty if you will.

1. Not trying to insult you.
2. Posting is a spectator sport.
3. Plus, I’m trying to fix the distinctions in my own mind. (I greatly admire Odin, but I don’t actually use it much. I am amazed at some of the work I see done here, including yours.)

As for the existence of a generic “size_of_whatever” proc, I can’t find one. (Doesn’t mean it doesn’t exist somewhere.) You’d really have to work to make it function correctly. It might end up as something like a custom allocator that had full run-time query capabilities.

It’s probably easier to just write a custom query for each of the data structures you actually use in a project. It’s not a truly generic solution, but real programs are seldom so generic that you don’t have at least some idea what types you’re going to be using.
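A per-type query along those lines might look like the sketch below. Everything in it is hypothetical (footprint and its three variants are not from any Odin package), and it only counts the descriptor plus the bytes the value refers to; allocator bookkeeping and whatever the elements themselves point at are ignored.

```odin
package main

import "core:fmt"

// Descriptor (pointer + length) plus the bytes of text.
footprint_string :: proc(s: string) -> int {
	return size_of(string) + len(s)
}

// Descriptor plus the elements the slice currently views.
footprint_slice :: proc(s: []$T) -> int {
	return size_of([]T) + len(s) * size_of(T)
}

// Descriptor plus the full reserved capacity, used or not.
footprint_dynamic :: proc(d: [dynamic]$T) -> int {
	return size_of([dynamic]T) + cap(d) * size_of(T)
}

// One name for all three; the compiler picks the variant from the argument type.
footprint :: proc{footprint_string, footprint_slice, footprint_dynamic}

main :: proc() {
	name: string = "abc"
	ids := []i32{1, 2, 3}
	queue := make([dynamic]i32, 0, 8)
	defer delete(queue)
	append(&queue, 10, 20)

	fmt.println("string: ", footprint(name))
	fmt.println("slice:  ", footprint(ids))
	fmt.println("dynamic:", footprint(queue))
}
```

Adding a type to the project then means adding one more variant to the proc group, which keeps the query explicit about what it counts.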

I’m good on point 1. I’ve come to respect (from work experience) those that ask questions, even when dumb. Shows a real interest in understanding, which is an action in itself; and actions speak volumes. I have years of “experience”, but I can also admit there are many gaps in my understanding of some specifics (mostly due to lack of necessity over the years). On point 2, so true, and I am an avid spectator as you already know. On point 3, these discussions are how we do it, am I right?

It’s possible I’m over-thinking this, which is also a tendency of mine. Both a benefit and a detriment, depending on who you ask ;) I like to think it’s the former, because what’s the alternative, under-thinking? :)

That’s why I love Assembly, where you definitely have to understand each data type and structure clearly (and why) before anything will work.

Wrong data type/size
~> wrong register / corrupt data

** but this step has been mostly hidden since C.

There is no guessing at sizes, because they have to be accurate from the moment you allocate memory.

Meanwhile, in higher-level languages like C/Zig/Odin/Rust, you may have to guess first and only then confirm the size of some chunk of memory or data you just wrote…

Example: I was confused a lot when I first used an Odin allocator and tried to size an allocated chunk correctly to fit my struct.
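For what it’s worth, a minimal sketch of sizing an allocation for a struct in Odin without guessing. Vec3 is a made-up struct; new, free and mem.alloc are used with the default context allocator.

```odin
package main

import "core:fmt"
import "core:mem"

// Hypothetical struct used only for illustration.
Vec3 :: struct {
	x, y, z: f32,
}

main :: proc() {
	// Typed allocation: new works out size and alignment from the type.
	a := new(Vec3)
	defer free(a)
	a^ = {1, 2, 3}

	// Raw allocation: ask the compiler for both numbers instead of guessing.
	raw, err := mem.alloc(size_of(Vec3), align_of(Vec3))
	if err != .None {
		fmt.eprintln("allocation failed:", err)
		return
	}
	defer free(raw)

	// Treat the raw chunk as a Vec3 and copy the first value into it.
	b := (^Vec3)(raw)
	b^ = a^
	fmt.println(size_of(Vec3), align_of(Vec3), b^)
}
```

In most code new(T) or make is enough; the raw form is mainly useful when the type is only known at run time and the size and alignment come from type_info_of instead.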