Need help creating a test case (solved)

xuul · 6 November 2025 18:32

I’ve run into an issue where, if I run my program several times in succession so that it operates on a byte array in the same way each time, the output is randomly inconsistent. But if I change the context.allocator to the example below, and manually destroy the allocator, my program is consistent all the time. Problem is, I am unable to narrow it down without running my largish program to reproduce it. I need an easy, generic way to fill a byte array with variable data that is always the same (at least most of the time), so that I can reproduce this with a much smaller example. It needs to be variable rather than constant literals, not randomly generated, and preferably human readable if printed as a string. Any ideas? Some system procedure or similar?

// Seems the context allocator is not consistently destroyed on exit.
// I was getting inconsistent results from running the same command
// in succession until I switched to this method.
tracking_allocator: mem.Tracking_Allocator
mem.tracking_allocator_init(&tracking_allocator, context.allocator)
context.allocator = mem.tracking_allocator(&tracking_allocator)
defer mem.tracking_allocator_destroy(&tracking_allocator)
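For the kind of test data asked for here, one option is a small deterministic generator. A minimal sketch, assuming a made-up helper name (fill_pattern) and an arbitrary alphabet, neither of which comes from the program in question:

```odin
package main

import "core:fmt"

// Hypothetical helper: fills buf with a repeating, printable pattern.
// The same length and offset always produce the same bytes, so runs are
// reproducible without hard-coding a literal for every test size.
fill_pattern :: proc(buf: []byte, offset := 0) {
	alphabet := "abcdefghijklmnopqrstuvwxyz0123456789"
	for i in 0..<len(buf) {
		// %% is Odin's floored modulo, so a negative offset still yields a valid index.
		buf[i] = alphabet[(i + offset) %% len(alphabet)]
	}
}

main :: proc() {
	buf := make([]byte, 48)
	defer delete(buf)

	fill_pattern(buf, 7)
	fmt.println(string(buf))
}
```

Varying the buffer length and the offset gives different but repeatable inputs, which may be enough to drive the byte-processing procedures without adb in the loop.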

DerTee · 11 November 2025 13:22

This is a shot in the dark, but I’ll try to give you something that might be usable even though I don’t understand your requirements at all:

use the system time/date and calculate some stuff based on that (yes, this is basically a bad PRNG, but I don’t understand why those aren’t allowed)
use the process list sorted by something like memory usage as input data
use system logs as input
request a website with a random site feature at runtime like Wikipedia:Random - Wikipedia (you could also go with easy mode and download a bunch of them and choose one based on system time so you don’t need to include a web request dependency)

Does any of this help or give you ideas? Did you already solve it?

While I don’t mean to offend you, I don’t understand what the bug does. I need more concrete detail to have a chance of understanding what is going on (what exactly does your program do, what is the correct output, what is the bad output). Currently I hear “something is wrong with the default heap allocator and if I slap a tracking allocator on top, still using the heap allocator as a backing allocator, then everything is working as expected. It has to do with destroying the allocator.” This seems very odd and I’d really want to dig into this, to make sure there actually is a problem and not a misunderstanding. I would be very surprised if the cause was that the default heap allocator is not destroyed by Odin’s runtime at program exit. That should be completely irrelevant for any program, because nothing relevant happens after the destruction, so you might as well leave that to the OS at program exit.

Good luck!

xuul · 12 November 2025 01:56

No offense taken. All valid points and some good ideas. I will try them. I apologize for the wall of text below. I was trying to avoid this, but since I have you concerned and confused, I felt maybe it was necessary.

My goal is this:

Do the work of debugging this myself so that I’m not submitting an erroneous bug report. Also verify this is not a problem with my approach. Which is most likely the case.
Not ask anyone else to do it for me, which is why I left some details out. Did not want to waste anyone’s time.
Generate a smaller example of code before pasting it into the forums. Thought it would be easier to evaluate for further questions and learning. Since I have not yet found a way to reduce this to a smaller snippet, and I’ve been asked for more details, excerpts from my code are below.
My requirements for generating test data come from the idea that a known byte array constant is more likely to be easily dealt with by the context allocator (and also by me) than a dynamically created byte array of potentially varying size.

Description of Problem
I have a program that parses and executes on several “switches” read from os.args. Sometimes expected data is not output to the terminal.

Each command does the following:

Executes an external tool
Receives data from a remote host
Processes the returned byte array data
Outputs the results
Program exits

More specifically, if 3 specific commands are processed together, I sometimes see output for all 3, and most of the time only see output for 1 and 2, where 3 is missing. If I “slap” a tracking allocator on top, all 3 successfully output every time. Additionally, the tracking allocator does not capture any allocations not freed, or any incorrect frees.
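For reference, leaks and incorrect frees are usually read out of the tracker's allocation_map and bad_free_array before it is destroyed. A minimal sketch; report_tracking is a hypothetical helper name, and depending on the Odin version a bad free may panic right away instead of being recorded in the array:

```odin
package main

import "core:fmt"
import "core:mem"

// Hypothetical helper: prints what the tracker has recorded so far and
// returns the number of live (unfreed) allocations and recorded bad frees.
report_tracking :: proc(track: ^mem.Tracking_Allocator) -> (leaks: int, bad_frees: int) {
	for _, entry in track.allocation_map {
		fmt.eprintf("leaked %v bytes, allocated at %v\n", entry.size, entry.location)
	}
	for entry in track.bad_free_array {
		fmt.eprintf("bad free of %v at %v\n", entry.memory, entry.location)
	}
	return len(track.allocation_map), len(track.bad_free_array)
}

main :: proc() {
	track: mem.Tracking_Allocator
	mem.tracking_allocator_init(&track, context.allocator)
	defer mem.tracking_allocator_destroy(&track)
	context.allocator = mem.tracking_allocator(&track)

	// Intentionally never deleted, so there is one leak to report.
	data := make([]byte, 16)
	data[0] = 1

	report_tracking(&track)
}
```

A clean report only says that every allocation was freed exactly once; it doesn't by itself rule out reading or writing memory after it was freed.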

So far I have tried the following:

Place a time delay between each command to rule out network latency and host processing latency. A delay of up to 5 seconds does not change the behavior. So I believe latency is not the issue. Also with the tracking allocator and no time delay in between, I can run the program with the same commands over and over in fast succession with consistent output. (i.e. up arrow enter, up arrow enter…)
Define every command with their own distinct variable name and separate out reuse of variables as a potential issue. Still no change in behavior. Not convinced I ruled variable reuse out, but I’m leaning in that direction.

Details of Execution:

Command 1

adb.exec.command = { "adb", "shell", "pm", "list", "packages", "-3" }
exec_command(&adb, pf)
packages, p_ok := bytes_remove_string(adb.stdout, "package:")
delete_process(&adb)
adb.exec.command = {"adb", "shell", "ps", "-o", "ARGS=CMD"}
exec_command(&adb, pf)
running_packages := bytes.split(adb.stdout, {'\n'})
delete_process(&adb)
printfcln(pf.title, "%s", "Running User Installed (3rd Party) Packages:")
for running in running_packages {
	if bytes.contains(packages, running) {
		printfcln(pf.data, "%s", running)
	}
}
delete(running_packages)
if p_ok do delete(packages)

Command 2

adb.exec.command = {"adb", "shell", "dumpsys", "diskstats"}
exec_command(&adb, pf)
parse_system_data_sizes(adb.stdout, pf) // outputs to terminal when done
delete_process(&adb)

Command 3

adb.exec.command = { "adb", "shell", "pm", "list", "packages", "-3" }
exec_command(&adb, pf)
user_packages, up_ok := bytes_remove_string(adb.stdout, "package:")
delete_process(&adb)
adb.exec.command = {"adb", "shell", "pm", "list", "packages", "-s"}
exec_command(&adb, pf)
system_packages, sp_ok := bytes_remove_string(adb.stdout, "package:")
delete_process(&adb)
if bytes_contains_string(user_packages, os.args[idx+1]) || bytes_contains_string(system_packages, os.args[idx+1]) {
	adb.exec.command = {"adb", "shell", "dumpsys", "diskstats"}
	exec_command(&adb, pf)
	parse_package_size(adb.stdout, os.args[idx+1], pf) // outputs to terminal when done
	delete_process(&adb)
}
if up_ok do delete(user_packages)
if sp_ok do delete(system_packages)

System, Versions, etc.
odin dev-2025-11-nightly
Kubuntu 25.10
kernel 6.17

DerTee · 12 November 2025 16:50

This helps a lot, but it’s not my wheelhouse at all.

So, again I’m guessing wildly and I’m pretty clueless! It seems likely to me that you have some kind of race condition. Do you know if the adb commands are blocking or not? Are you sure you wait for one process to finish before the next one starts? Are you sure the processes don’t read/write the same memory at the same time?

I don’t remember which allocators are thread safe, but that is one thing I would look up.
I’d also test if the error happens if you execute only one single command. My guess is that a single command always works without errors because then there can’t be a race condition.

I’d also test if using the mutex allocator instead of the tracking allocator solves your issue. I have not used it, but I did a quick search while trying to find thread safe allocators in the packages.

xuul · 13 November 2025 03:55

Your last comments gave me some ideas. Sometimes it helps to chat things out with someone, so thanks for your time.

I wrapped each command in debug code to check the state of each execution. In all cases where command 3 fails, the execution of adb is successful, exits, and returns data. I’ve also been able to verify that my program does not continue until those conditions are met. I’m using os2.process_exec to execute the command, and my program does not continue until it receives data in either stdout or stderr. I’m also checking whether the returned os2.Error contains any error data, which it does not, so no OS-level errors. I believe this rules out race conditions, at least in this case. The adb command appears to be happy.

Everything works as expected all the way into the last if statement of command 3 I posted. This leaves my parse_package_size procedure, which does some more byte array processing. There’s a good chance I mucked something up in that procedure. If that’s the case, why would it always work when using the tracking allocator (without it capturing any leaks or incorrect frees), while the default context allocator does not always work?

When I’ve got more time, I’ll dig deeper into my parse_package_size procedure and wrap debug code around everything there to narrow it down.

James_Feister · 14 November 2025 06:32

I’m not sure where the bytes_remove_string and bytes_contains_string methods are from. If they are yours, did you write test cases against them? That might help suss out issues there.

In Command 1 you are

packages, p_ok := bytes_remove_string(adb.stdout, "package:")

but not deleting the packages var later on.
While in Command 3 you are assigning user_packages and system_packages and deleting them at the end of the command.

Could be the memory management either in the commands or the methods you are using. Try breaking them up into more modular pieces and writing some tests.

Good luck.

xuul · 14 November 2025 12:30

Thanks for the feedback. At the end of command 1 is …

if p_ok do delete(packages)

… where p_ok indicates whether the procedure allocated for the remove.

Here are my remove and contains procedures. Basically a copy from the bytes package but changed to accept a string key to make things more ergonomic for me. Am I using transmute in a bad way?

// if n < 0, no limit on the number of removals
bytes_remove_string :: proc(b: []byte, key: string, n := 1) -> (output: []byte, was_allocation: bool) {
	return bytes.remove(b, transmute([]byte)key, n)
}

// return true on first success from strings slice, this is basically an OR
bytes_contains_string :: proc(b: []byte, strings: ..string) -> (ok: bool) {
	for s in strings {
		bs := transmute([]byte)s
		for index := 0; len(b) >= len(bs) && index <= len(b) - len(bs); index += 1 {
			if bytes.equal(b[index:index+len(s)], bs) do return true
		}
	}
	return false
}
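One thing worth checking with wrappers like these: as far as I can tell, bytes.remove hands back the input slice itself (with was_allocation false) when there is nothing to remove, so the result can alias adb.stdout. If delete_process frees that buffer (an assumption, its body isn't shown here), the result would dangle. A sketch of a variant that always returns caller-owned memory; remove_string_owned is a hypothetical name:

```odin
package main

import "core:bytes"
import "core:fmt"

// Hypothetical variant of bytes_remove_string: the returned slice never
// points into b, so the caller can free b first and always delete the result.
remove_string_owned :: proc(b: []byte, key: string, n := -1) -> []byte {
	out, was_allocation := bytes.remove(b, transmute([]byte)key, n)
	if !was_allocation {
		// Nothing was removed, so out may still alias b; take a private copy.
		out = bytes.clone(b)
	}
	return out
}

main :: proc() {
	source := make([]byte, 8)
	copy(source, "no match")

	result := remove_string_owned(source, "package:")
	delete(source) // fine: result does not point into source

	fmt.println(string(result))
	delete(result)
}
```

The cost is one extra copy in the no-match case, in exchange for not having to carry the was_allocation flag around.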

James_Feister · 16 November 2025 04:55

Sorry about that, I missed it in my scan.

Did you get to write any test cases?

xuul · 19 November 2025 08:00

Thanks for everyone’s feedback. I believe I’ve solved the problem. Short answer: bad memory management in the parse_package_size procedure on my part. Boy, I’ve really developed some bad habits working in C++. I’m grateful for Odin showing me the path to Valhalla.

From the test output data, it appears that the memory tracker was forcing the OS (Linux) to allocate different memory each time I ran the program. When using the default context allocator and running my program in quick succession, it appeared that portions of the previous memory allocations were reused. That, in combination with my poor memory management, seems to have caused the incorrect output.