Experimenting with variable reuse within declarations and its issues

gingerBill · 5 November 2024 14:55

I AM NOT LOOKING FOR SYNTAX SUGGESTIONS. PLEASE DO NOT GIVE ME THEM!

I’ve been experimenting on a private branch of the compiler to allow for explicit variable reuse within a variable declaration.

The idea looks a bit like this:

a, ok := foo()
...
c, *ok := bar()
...
*a, b, *c, d := baz()

Note that ok was previously declared within the procedure, so marking it with *ok in another variable declaration means it would reuse the previous declaration of ok rather than result in an redeclaration error.

I know a lot of people want this however the more and more I think about it, the more I think it’s probably a code smell in the first place. I’ve grepped a lot of my code to see where this pattern happens a lot and it’s always bodges and unit @tests, rather than anything useful.

I’d love to see examples where this “idiom” would be useful because I want my gut to be proven wrong about it being a bad idea and a “code smell”.

I AM NOT LOOKING FOR SYNTAX SUGGESTIONS. PLEASE DO NOT GIVE ME THEM!

dimsuz · 5 November 2024 16:15

In my case I often come across wanting this whenever one proc contains multiple proc calls of the same package which return the same error. Looked in my quick-scripts in Odin and here are few examples (disclaimer: I consider myself a begginner in Odin):

// here open_err, dir_err could be the same err
walk_proc :: proc(info: os.File_Info, in_err: os.Errno, user_data: rawptr) -> (err: os.Errno, skip_dir: bool) {
  handle, open_err := os.open(info.fullpath, os.O_RDONLY)
  if (open_err != 0) {
    fmt.eprintf("error opening directory: %s, error=%d", info.fullpath, open_err)
    return 0, true
  }
  files, dir_err := os.read_dir(handle, -1)
  if (dir_err != 0) {
    fmt.eprintf("error reading directory: %s, error=%d", info.fullpath, dir_err)
    return 0, true
  }
  // ...
}

Another one (with asserts, but could be proper error handling):

create_mock_users :: proc() -> []User {
  year, month, _ := time.date(time.now())
  day, err_month := dt.last_day_of_month(year, month)
  assert(err_month == .None)
  epoch_date, err_components := dt.components_to_date(2023, 11, 30)
  assert(err_components == .None)
  // ...
}

Another one I had somewhere is making multiple http calls each of which returns (response, err). Etc, etc.

gingerBill · 5 November 2024 16:43

The thing is, in those two cases, they look absolutely fine with custom err names. I can understand it might be a bit more typing but in both cases, they are a form of “early out”.

dimsuz · 5 November 2024 17:04

Yeah, the main issue is actually laziness to write more code and give “good” names to vars. But I’d rather type more than have some “magic” generated by the compiler, unless it’s clearly communicated (such as your solution above).

And in the end I can always add some kind of editor snippet to help me here.

Also, as a beginner I wasn’t sure if this is the correct idiom to give errors explicit names like this – thought that I may miss something, but then I asked in Discord (prior to this forum launch), got confirmation that this is ok.

gingerBill · 5 November 2024 17:27

The more I think about it, the more I think the idea of the reuse variable declarations is a bad idea.

Why? Because it’s solving an aesthetic problem rather than a practical problem. If you allow what I wrote above, I know for definite people will want something like the following:

*foo.field, ok := bar()

which is literally breaking loads of semantic rules and not being consistent nor easy to understand.

Why? foo.field is not a variable but a field off a value, so you are not reused a variable but just doing an assignment. All this hypothetical shorthand is doing is removing the need to declare a single line:

ok: bool
foo.field, ok = bar()

But people will want this. How do I know? I’ve just had someone propose this exactly .

I think this alone is a good enough reason to say no to the entire endeavour.

zannzen · 5 November 2024 17:38

Maybe to flip the exercise above, instead of * being used to reuse previously declared thing, the * could signify you’re explicitly shadowing a previous variable in the same scope?

It would disallow the foo.field example while allowing the reuse of err

The error case above is basically the only situation I find myself in where I have many short-lived variables. Once I handle an error I don’t actually need that variable again for the rest of the function. It’s definitely more of an aesthetic problem than anything else though.

dimsuz · 5 November 2024 18:02

It’s not always a single line. With foo.field example it sure is, but as a beginner I initially had friction with having something like

v1, err := proc_call1()
v2, err := proc_call2()

Then I realize that “err” can’t be reused in the second line, edit:

err : Error
v1, err = proc_call1()
v2, err = proc_call2()

and now it turns out that v1 and v2 both need to be explicitly declared, so this becomes

err: Error
v1: V1
v2: V2
v1, err = proc_call1()
v2, err = proc_call2()

And this did cause some feeling that this is not elegant and I’m doing something wrong.

So this lead me to think that explicitly naming errors is a better way. Maybe this could be mentioned in the “idioms” doc somewhere to make beginners aware.

Feoramund · 5 November 2024 18:25

Something feels awfully wrong about this.

:= means: “hello, I am declaring some new variables here.”
*IDENTIFIER means: “no, wait, this one’s an exception to the rule.”

Syntactic exceptions like this have always proven confusing for me. It’s easier to hold the rules in your head if there are few exceptions, both for programming languages and human tongues. This is what leads to early and skillful fluency: simple rules.

I understand it can be annoying to write this sort of code:

blah, err := allocator.procedure(...)
blah2: []bytes = ---
blah2, err = allocator.procedure(...)

or

blah, err1 := allocator.procedure(...)
blah2, err2 := allocator.procedure(...)

I do it from time to time.

But I don’t think that justifies an exception to the language just to make code easier to write. We read code far more often than we write it, understanding is paramount to maintenance, and I am willing to deal with having to write code that just feels like a little more work than getting syntactic sugar for the above code blocks at the risk of complicating the language.

Jeroen · 5 November 2024 18:37

Alternatively:

err, v1, v2 := Error.None, V1(0), V2{}
v1, err = proc_call1()
v2, err = proc_call2()

flysand7 · 5 November 2024 18:40

he thing is, in those two cases, they look absolutely fine with custom err names. I can understand it might be a bit more typing but in both cases, they are a form of “early out”.

I’ve had an experience where copy-pasting lines, or simply bad thinking has led me to making this kind of mistake (borrowing the example above):

create_mock_users :: proc() -> []User {
  year, month, _ := time.date(time.now())
  day, err_month := dt.last_day_of_month(year, month)
  assert(err_month == nil)
  epoch_date, err_components := dt.components_to_date(2023, 11, 30)
  assert(err_month == nil) // wrong value being tested!
  // ...
}

I don’t know if anyone else has experienced these kinds of issues, and these days I tend to be extra careful when unique-naming error values. You could make an argument that this is not unique to error values, but there are reasons why that’s not the case:

Unlike other variables, it’s common for you to name error values err and ok, and when writing assertion code you might accidentally assume you named your error value that instead of foo_err or foo_ok.
With error handling code copy-pasting becomes way more common than usual. In a lot of projects I’ve copy-pasted this snippet of code:
```
if foo_err != nil {
  fmt.printfln("Failed to do X: %v", foo_err)
  os.exit(1)
}
```
It’s common that after copy pasting you think through what needs to be changed, and in most cases it’s just the error messages. But despite that there are still ways to forget to do that or not pay too much attention. From my recent experience, I can say that the probability of me forgetting to change foo_err to bar_err is roughly the same as if I’d wrote println instead of printfln if I didn’t copy paste the code.

What makes this worse is that errors with error handling don’t show up, unless you test the error handling code. Of course, if you have a test, you should also test the error paths of the program, but reducing these copy-paste errors should further reduce the margins for error. Besides, some error paths are very hard to test and will be untested, but that’s rare.

Edit: pre-declaring variables would solve this issue too, but only if all err values in the scope have the same type. Otherwise pre-declaring doesn’t really work and you have to different-name them.

mweihs · 5 November 2024 19:28

How about making err and ok “special” variable names, so that they can be reused by default?

gingerBill · 5 November 2024 20:12

Oh fuck no. Please for the love of God, no.

mweihs · 5 November 2024 20:30

That was clear enough!

dogue · 5 November 2024 22:49

+1 to this

If Odin is going to have a feature like this, an explicit opt-in seems appropriate.

DragosPope · 6 November 2024 09:24

it seems like a feature that promotes an antipattern to me. You’ll be losing a lot of information on the previous values when debugging. I remember i was one of the people that kinda wanted something like this, but in retrospective it just looks like a very bad idea both for using it in your code and for the extra modifications it adds to the language.

When this is needed, the current friction of requiring to declare your types explicitly will be insignificant compared to the readability you gain from declaring them.

edit: i’m also assuming this will come with a lot of edge case rules, like disallowing *a := 1 *a, *b := 1, 2 but allowing a, *b := 1, 2 and so on

gingerBill · 8 November 2024 11:38

All of those edge cases are already taken care of, and even more edge cases such as restricting the variables that can be redeclared to variables declared within the same procedure too.

It’s effectively too many rules for it to be as useful as you think.

The main reason for it is to reduce the amount of predeclaration of variables that are needed e.g.

f, ok := foo()

b: f32
b, ok = bar()

whilst the redeclaration approach would be this:

f,  ok := foo()
b, *ok := bar()

The problem is that even though I know this is a “useful” idea, it feels like the pros are completely outweighed by the cons.

Fix · 8 November 2024 11:50

I’m not using Odin in a professional capacity, only on personal projects. But I do use Odin daily.

Would redeclaration allow changing types?

v1, ok := packageA.foo() // ok is now type packageA.Error
v2, *ok := packageB.bar() // What happens here? does it redeclare as packageB.Error? Does the compiler error out?

I’m not bothered by the way it already is. We’ll live, having to write an extra line to declare a variable, and reuse another one.

gingerBill · 8 November 2024 12:19

It would not allow for changing types. ok would have to be same type. Changing types is kind of crazy in my view, especially in a statically typed language.

hasen · 8 November 2024 21:08

Why not just have reuse be the default? I’m not sure what that means, but if you see a declaration:

a, b := something()

where a is new but b already exists, do tolerate it if b would be the exact same type.

Like, what is wrong with that?
I think that’s what Go does, isn’t it?

Not sure how pointers would work, but I’d just let it be the same variable on the same location on the stack.

GPicquot · 9 November 2024 04:03

I second the idea it is the more “natural” approach (as long as the type is not changed). Probably some edge cases with polymorphism to be searched for, and clarifying the rules around mutability, which I suppose cannot change

PS: the *variable_name syntax might also be confusing for people coming from C/ZIG/GO languages