Modelling a stack-based language

mutt · 20 July 2025 21:33

I have been utilizing Odin to prototype a new programming language.

It is a dynamic (not-compiled) concatenative language with a single typed stack. Having solidified its core design, I wish to more appropriately align its internals with Odin’s.

Most of the language is merely wrapping Odin’s base types and operations. I am attempting to determine the best way to tap into Odin’s type system to construct its own runtime type system, as well as how best to model the “cell” of its stack to play nicely with the compiler and this stack-based paradigm.

greenfork · 21 July 2025 21:32

So what kind of problems did you stumble upon when trying to model it this way?

mutt · 22 July 2025 14:57

There is no “problem” other than the fact that my stack cell’s union is unnecessarily large in size. It is convenient to be able to wrap Odin’s types and use them as a type in my language. However, continuing to toss things into this union has the potential to increase that size even more and I would prefer to resolve this before implementing more behaviour.
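To illustrate the size concern: a tagged union cell is at least as large as its biggest variant plus a tag. The sketch below uses Python's ctypes with C-style layout as a stand-in. Odin's actual union layout may differ, and the names `StringView`, `Payload`, and `Cell` are hypothetical.

```python
import ctypes

# Hypothetical stand-in for a string value: pointer plus length.
class StringView(ctypes.Structure):
    _fields_ = [("data", ctypes.c_void_p), ("length", ctypes.c_ssize_t)]

# All variants share the same storage, so the union is sized by the largest.
class Payload(ctypes.Union):
    _fields_ = [
        ("as_i32", ctypes.c_int32),
        ("as_f64", ctypes.c_double),
        ("as_string", StringView),
    ]

# A tag says which variant is live; alignment padding follows the tag.
class Cell(ctypes.Structure):
    _fields_ = [("tag", ctypes.c_uint8), ("payload", Payload)]

if __name__ == "__main__":
    print("i32 alone:", ctypes.sizeof(ctypes.c_int32))
    print("payload  :", ctypes.sizeof(Payload))
    print("cell     :", ctypes.sizeof(Cell))
```

Every cell pays for the largest variant, even when it only holds a small integer, so each larger type added to the union grows every cell on the stack.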

The question was intentionally vague and not particularly intended as a “help” question as I can learn more by seeing someone else’s fresh approach, but this may be too open-ended a question.

charles_23195 · 24 July 2025 10:49

If you want to directly use the base language’s types with a union, your cells are necessarily going to get fairly large, unless you make many of them references. If you want small cells, you have to roll your own base types. This gives more flexibility in the long run, at the expense of a little more work up front.

You can make a cell one word (8 bytes) and have a lot of functionality. “Item data” can be either 32 bit data (i32, f32, rune, etc.) or an indexed reference to a dynamic array of that module and type.

source module: 2 bytes
base type: 2 bytes (either universal or per module)
item data: 4 bytes

You can use negative indexing as a single bit flag. For example, use a negative base type to indicate a reference to that type.

If you want more data items than this allows, use a two word cell. That gives you 4 bytes for metadata. You can even hold short strings directly inside the cell. There are any number of variations on this concept. Pick one that works for you and play with it.
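A minimal sketch of the one-word layout described above, written in Python with the struct module so the byte counts are easy to check. The type ids and function names are hypothetical, and ids start at 1 so that a negated id can act as the reference flag.

```python
import struct

# source module: 2 bytes, base type: 2 bytes (signed), item data: 4 bytes.
# "<" means little-endian with no padding, so the cell is exactly 8 bytes.
CELL = struct.Struct("<HhI")

# Hypothetical type ids; 0 is avoided because it cannot be negated.
TYPE_I32, TYPE_F32, TYPE_STRING = 1, 2, 3

def pack_i32(module, value):
    """Store a 32-bit signed integer directly in the item data."""
    (bits,) = struct.unpack("<I", struct.pack("<i", value))
    return CELL.pack(module, TYPE_I32, bits)

def pack_f32(module, value):
    """Store a 32-bit float directly in the item data."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    return CELL.pack(module, TYPE_F32, bits)

def pack_ref(module, base_type, index):
    """Negative base type marks the data as an index into that type's array."""
    return CELL.pack(module, -base_type, index)

def unpack(cell):
    """Return (module, base_type, is_reference, raw_bits)."""
    module, base_type, bits = CELL.unpack(cell)
    return module, abs(base_type), base_type < 0, bits

def bits_to_i32(bits):
    """Reinterpret the raw 32 bits as a signed integer."""
    return struct.unpack("<i", struct.pack("<I", bits))[0]

if __name__ == "__main__":
    cell = pack_ref(2, TYPE_STRING, 7)
    print(len(cell), unpack(cell))
```

Values that fit in 32 bits live inside the cell itself. Anything larger lives in a per-module, per-type array, and the cell carries only its index.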

mutt · 26 July 2025 15:10

Super! This is extremely helpful info.

Before seeing your reply I did begin to simply convert the larger types to references, but can tell I will not be fully satisfied with that implementation.

Now seeing the potential for bit-trickery, there are a lot of clever things that can be done to match the design of my language. However, I also realize I need much more experience using my language first to determine what those things will be.

charles_23195 · 26 July 2025 17:59

Happy to help. I’ve been down this road myself.
Concatenative/stack-based language inspirations to draw from:
False, Joy, and Factor - You probably already know about these.
Uiua - A stack based, modernized version of APL. It’s amazing what the guy has done with it.
Listack - My very own proof of concept esolang. Polymorphic, uniform function call syntax, (round blocks are immediate).

Looking at your quick start guide, you’re heading down the road of a “write only” personal esolang. That’s cool if it’s your intent. Especially if you’re into code golfing.

This is standard postfix language:

[boolean condition] [true branch] [false branch] if

My one great contribution to programming theory: #single #word #comments.

#if [boolean condition] #then [true branch] #else [false branch] if
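As a rough illustration of how both lines can evaluate identically, here is a toy evaluator sketched in Python. It is not how Listack is implemented. It assumes that tokens starting with # are discarded, that bracketed blocks are pushed unevaluated, and that the only words are true, false, if, and integer literals.

```python
def tokenize(source):
    # Pad brackets so "[true]" splits into "[", "true", "]".
    return source.replace("[", " [ ").replace("]", " ] ").split()

def parse(tokens):
    """Build nested lists for [ ... ] blocks; drop #single #word #comments."""
    program, enclosing = [], []
    for tok in tokens:
        if tok.startswith("#"):
            continue  # a comment is one word long, so skipping the token is enough
        if tok == "[":
            enclosing.append(program)
            program = []
        elif tok == "]":
            block = program
            program = enclosing.pop()
            program.append(block)
        else:
            program.append(tok)
    return program

def run(program, stack=None):
    stack = [] if stack is None else stack
    for item in program:
        if isinstance(item, list):
            stack.append(item)  # blocks are pushed, not executed
        elif item == "if":
            false_branch = stack.pop()
            true_branch = stack.pop()
            condition = stack.pop()
            run(condition, stack)  # the condition block leaves a boolean
            run(true_branch if stack.pop() else false_branch, stack)
        elif item in ("true", "false"):
            stack.append(item == "true")
        else:
            stack.append(int(item))
    return stack

if __name__ == "__main__":
    plain = "[true] [1] [2] if"
    commented = "#if [true] #then [1] #else [2] if"
    print(run(parse(tokenize(plain))), run(parse(tokenize(commented))))
```

Because each comment is exactly one token, the parser needs no end-of-comment marker.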

mutt · 27 July 2025 15:31

I looked at every concatenative language I could find to study their design (syntactically), and I remember Listack.

While I am unsure what you mean by “write only”, Readymade is the furthest thing from “esoteric”. Its intent is to be the simplest & easiest programming language to learn, understand, and use for someone with no knowledge of programming whatsoever.

It can expose any commonly useful “higher-level” constructs with nothing more than postfix notation, ASCII symbols, and self-awareness. For example, let’s look at the ‘standard postfix conditional’ you mentioned; there is actually a lot of complexity going on here.

[boolean condition] [true branch] [false branch] if

For me to use this, I must know the stack order and that the conditional is called if.
This is fine enough, except the direction I normally read the English language is quasi-reversed in my code. So if what? Is there only a true branch with another if-else or ifElse or …?

This is all entirely arbitrary, but does it need to be?

#if [boolean condition] #then [true branch] #else [false branch] if

What is cool about this code to me are your comments: they are convenient and expressive. In fact, by design, they encourage you to be: “single word comments”.

The point I am trying to make here is that your great contribution to programming theory is the basis of Readymade. Let me explain…

In this particular example, your comments are providing prefixed information about the code that follows. This is true both for the # sigil and, not surprisingly, the content of your comments. This pattern of temporarily inverting notation turns out to be incredibly useful, just as mirrors are useful.

So I ask, can we express the above if construct in a manner that relies on nothing other than common sense and intuition, rather than external documentation? Can a language actually document itself?

The following will make more sense by reading the seemingly elementary doc/intro.m, but let’s say the following represent ‘true’ and ‘false’ respectively:

{+} {-}

This is also our construct for conditional checks of the same boolean value.

{+} {+ "this will print". }
{-} {+ "this will not". }

An if/else becomes {+- [] [] }; you can even use {-+ [] [] }, because language is about expressivity, and choosing to handle a false variant first is rather expressive in and of itself.

[boolean condition] {+- [true branch] [false branch]}

I do not play golf, such things do not interest me. The terse nature of the language is of utility and common sense. It both eases composition and exposes the underlying motion and patterns of the code.

We both know what dup and over could do, but are the words Chuck Moore happened to come up with one day really the best choice for those operations? 3-4 character (English) words for the most basic operations of the language? Why?

Just use _ to express duplication and provide convenient indexing.

_ _1
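A small Python sketch of one way such indexed duplication could work. The semantics are an assumption, not taken from Readymade's documentation: `_` is read here as copying the top item (dup) and `_1` as copying the item one below the top (over), with the index counted from the top starting at 0. The function name `copy_word` is hypothetical.

```python
def copy_word(word, stack):
    """Apply '_' or '_N': push a copy of the item N places below the top.

    Assumed semantics: a bare '_' means N = 0 (dup), '_1' means N = 1 (over).
    """
    if not word.startswith("_"):
        raise ValueError(f"not a copy word: {word!r}")
    depth = int(word[1:]) if len(word) > 1 else 0
    if depth >= len(stack):
        raise IndexError(f"{word!r} needs {depth + 1} items on the stack")
    stack.append(stack[-1 - depth])
    return stack

if __name__ == "__main__":
    print(copy_word("_", [1, 2]))   # dup
    print(copy_word("_1", [1, 2]))  # over
```

Under that reading, one sigil plus an optional index covers dup, over, and deeper picks without a separate name for each.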

Follow this concept to express all memory/core operations; that’s Readymade.

There is a stigma that using only ASCII symbols makes a language just a toy and not a serious tool to develop quality software. I would bet that someone with no prior programming experience would be able to reliably learn and retain the builtins of Readymade significantly faster than those of any language based on English, for the same reason a child learns to match shaped blocks before learning to read and write script.

There is also a similar stigma for languages with a shared stack. As someone who has developed your own concatenative language, you seem to refer to it as a “road you’ve gone down”. I am quite interested to know your viewpoint about such stigmas and how you view stack-based programming given your experience.