Algebraic Data Types are almost always one of the things I miss when I use imperative languages. I have to do Java at work, and while I've kind of come around on Java and I don't think it's quite as bad as I have accused it of being, there have been several dozen instances of "man, I wish Java had F#'s discriminated unions".
Obviously I'm aware that you can spoof it with a variety of techniques, and often enums are enough for what you need, but most of those techniques lack the flexibility and terseness of proper ADTs; if nothing else those techniques don't have the sexy pattern matching that you get with a functional language.
This C extension looks pretty sweet since it appears to have the pattern matching I want; I'll see if I can use it for my Arduino projects.
:)
To me it feels very similar to an interface (trait) implemented by a bunch of classes (structs). I have multiple times wondered which of those two approaches would be better in a given situation, often wanting some aspects of both.
Being able to exhaustively pattern match is nice. But being able to define my classes in different places is also nice. And being able to define methods on the classes is nice. And defining a function that will only accept a particular variant is nice.
From my perspective a discriminant vs a vtable pointer is a boring implementation detail the compiler should just figure out for me based on what would be more optimal in a given situation.
I do find it a little annoying that it's taken so long for Java to get a feature that, in my opinion, was so clearly useful; it feels like they were about a decade later on this than they should have been, but I'll take whatever victories I can get.
Java has https://github.com/functionaljava/functionaljava
which is unsupported but stable.
It wasn't that I thought that the JVM was incapable of doing something like an ADT, just that vanilla Java didn't support it. While it's easy to say that "companies should just use Kotlin", that's a bit of a big ordeal if you already have a 15 year old codebase that's written in Java.
I've heard of but never used the Functional Java library, though it'd be a tough sell to get my work to let me import a library that hasn't been updated in two years.
For Java, see https://www.baeldung.com/java-lts-21-new-features
Kotlin's: https://www.baeldung.com/kotlin/when
Make up your own mind.
Algebraic data types and pattern matching actually work really well in imperative languages, too. See eg Rust.
They've recently added support for compiler-enforced pattern matching over sealed classes, which I suppose does get you halfway there though.
Think Scala, Elm and Haskell have it as well.
Having that and Elixir's pattern matching would be insane.
I only sometimes use it as an "I would recommend this repo" -- how can one do that anyways, given that the repo could morph into something one would no longer recommend?
I've known C for almost 20 years, and never would I have thought the macro system was powerful enough to allow such black magic.
This is awesome!
The author is only 19 years old. I feel really dumb now.
They are kind of cursed but at their core they are actually incredibly simple and a reliable tool for reducing cognitive complexity and boilerplate in C based projects.
There is a lengthy blog post about the same stuff, except that the author doesn't seem to have come across the said wiki section yet: https://nandakumar.org/blog/2023/12/paradigms-in-disguise.ht...
Kudos to the dev of datatype99 for showing the problem with such ad-hoc methods in the readme right away.
Seems like a pretty big footgun. But otherwise, very cool.
I once wrote an immediate-mode UI out of macros, and this reminded me of that, although in my case it isn't a problem: some blocks are deliberately ones you can "break" out of. For example, you can use "break" to exit a win_form block ("goto" also works), while a win_command block does not capture "break", so using break (or goto) inside a win_command block will break out of whatever block the win_command is in (probably a win_form block; for example, this would commonly be used for a "Cancel" button).
So it's not just about being slightly better in some ways, but smoothing over so many paper cuts that it can be hard to see how they have added up over time across ecosystems, like CPython and co having so many of its own vocab types, or HPC libs.
For example, the problems this macro causes wouldn't even be problems in a well-written Rust macro. They're artifacts of smart people trying to work around C's limitations.
But then the macro wouldn't have been written anyway because this is a port of a native Rust feature (which means it gets taken advantage of in community software).
Let's say further that you already know Rust exists, and aren't going to use it for reasons that anyone writing a C program already knows.
At least consider Zig. Here's a little something I wrote in Zig two days ago:
/// Adjust a label-bearing OpCode by `l`. No-op if no label.
pub fn adjust(self: *OpCode, l: i16) void {
switch (self.*) {
inline else => |*op| {
const PayType = @TypeOf(op.*);
if (PayType != void and @hasField(PayType, "l")) {
op.*.l += l;
}
},
}
}
This uses comptime (inline else) to generate all branches of a switch statement over a tagged union, and to add an offset to members of that union which have an "l" field. You can vary the nature of the branches on any comptime-available type info, which is a lot, and all the conditions are compile-time; each branch of the switch has only the logic needed to handle that variant.

"But my program is already in C, I just need it for one file" -- right. Try Zig. You might like it.
Considered the example given. Now my eyes hurt. I think I started appreciating Lisp more.
https://gist.github.com/unclechu/eb37cc81e80afbbb5e74990b62e...
The difference here is that Nim compiles to C and you can turn the garbage collector off; Zig compiles C and there's no garbage collector. That means the entire standard library is available when generating object code. It's also trivial to opt in to the C ABI on a fine-grained basis, by defining a function or struct with the extern keyword.
I believe this is still fairly current about the difficulties of building Nim dylibs for C programs: https://peterme.net/dynamic-libraries-in-nim.html
I expect Nim will stabilize about where D has: it will have a dialect of the language which, with relatively painless accommodations, is able to produce object code which speaks C ABI. Zig is different. The language is relentlessly focused on providing a better alternative to C while occupying the same niche, and a lot of design time has been spent on making it practical to take an existing C program and start writing the new parts of it in Zig.
It's a good language, Nim, and getting better. I'd recommend it for someone who is considering Go, for example.
The only issue is you can't do a clean switch statement that matches on the specific value of a field, but nested switch statements aren't that messy.
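In C terms, a sketch of that nested-switch workaround might look like the following (the Shape type and its variants are hypothetical, just for illustration):

```c
typedef enum { SHAPE_CIRCLE, SHAPE_RECT } ShapeTag;

typedef struct {
    ShapeTag tag;
    union {
        struct { int radius; } circle;
        struct { int w, h; } rect;
    } u;
} Shape;

/* No single switch can match "a circle with radius 0", but nesting
 * a second switch on the payload field stays readable enough. */
const char *describe(const Shape *s) {
    switch (s->tag) {
    case SHAPE_CIRCLE:
        switch (s->u.circle.radius) {
        case 0:  return "point";
        default: return "circle";
        }
    case SHAPE_RECT:
        return "rectangle";
    }
    return "unknown";
}
```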
The trouble with this approach is that there's a lot of mental overhead in dotting all of your i's and crossing all of your t's. It's draining, so you start to, e.g., shoehorn additional functionality into existing classes instead of making new ones.
You eventually wind up perceiving the abstraction as costly, which lessens your use of it at the expense of producing a more elegant solution to the problem(s) you're solving.
tl;dr? The ability to just state "Darmok and Jalad at Tanagra" is transformative when the alternative is telling an entire story every time you want to reference a complex idea.
The modelling aspects can be simulated, yes, but that's barely half of the benefits of ADTs. Pattern matching is a big ergonomic benefit.
The critical thing is that the compiler (or macro system) needs to check that you've checked all the alternatives.
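Plain C gets partway there: GCC and Clang's -Wswitch (part of -Wall) warns when a switch over an enum misses a case, but only if you resist writing a default. A small sketch of that discipline:

```c
typedef enum { COLOR_RED, COLOR_GREEN, COLOR_BLUE } Color;

/* With -Wswitch, the compiler warns if a Color case is missing here.
 * Adding a `default:` would silence that check, which is why
 * exhaustiveness in C is a convention rather than a guarantee. */
int brightness(Color c) {
    switch (c) {
    case COLOR_RED:   return 30;
    case COLOR_GREEN: return 59;
    case COLOR_BLUE:  return 11;
    }
    return 0; /* unreachable for valid enum values */
}
```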
datatype(
BinaryTree,
(Leaf, int),
(Node, BinaryTree *, int, BinaryTree *)
);
No names for the struct fields, so you need to rely on the position. And then used:
int sum(const BinaryTree *tree) {
match(*tree) {
of(Leaf, x) return *x;
of(Node, lhs, x, rhs) return sum(*lhs) + *x + sum(*rhs);
}
// Invalid input (no such variant).
return -1;
}
Where lhs, x, rhs magically match the types defined above. What a nonsense design! (And in fact Algol 68 had a better implementation than most later languages, but Algol 68 completely lacked documentation suitable for newbies, like tutorials and programming examples, and was not promoted by any hardware vendor, like IBM or DEC, so it was doomed.)
https://web.eecs.umich.edu/~bchandra/courses/papers/Hoare_Hi...
I also would love a future where ADTs are more common in imperative languages.
contrived :: ([a], Char, (Int, Float), String, Bool) -> Bool
contrived ([], 'b', (1, 2.0), "hi", True) = False
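For contrast, here's a hypothetical hand-written C version of that single Haskell equation (the struct layout, field names, and the `true` fallback are my assumptions, not from the original):

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical flattening of the Haskell tuple into a struct. */
typedef struct {
    int list_len;      /* stand-in for [a]; only emptiness matters here */
    char ch;
    int pair_i;
    double pair_f;
    const char *str;
    bool flag;
} Contrived;

/* Every nested pattern becomes one more hand-written condition to
 * keep in sync -- exactly the boilerplate being complained about. */
bool contrived(const Contrived *t) {
    if (t->list_len == 0 && t->ch == 'b' &&
        t->pair_i == 1 && t->pair_f == 2.0 &&
        strcmp(t->str, "hi") == 0 && t->flag) {
        return false;
    }
    return true; /* assumed result for all other inputs */
}
```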
To achieve a result like this using Zig's switch syntax would seem to involve a huge amount of boilerplate code and nested switch statements.

Most of the common languages today have product types.
Java[1], Rust, Haskell, etc. have sum types.
I think it gets a bit more esoteric beyond that though - I don't doubt that there's probably some Haskell extension for quotient types[2] or some other category-theory hijinks.
Most languages have ADTs built in.
[1] https://blogs.oracle.com/javamagazine/post/inside-the-langua... [2] https://en.wikipedia.org/wiki/Quotient_type
https://medium.com/dartlang/dart-3-1-a-retrospective-on-func...
You can do the same thing with boolean logic and just have not-and, but thankfully we have and, or, not, xor. Similarly, you don't need greater-than-or-equal because you can just write 'x > y || x == y'.
Comprehension is linked to how closely you can express the idea of what you're doing in the object language that you have to look at. It might be convenient to compile everything down to SK combinators so that your optimizer and evaluator can be simpler, but people should never look at that level (at least not until you suspect a compiler defect).
So we get to object-oriented programming, where our data expression has an AND property (a class has an INT field AND a STRING field), an existential property (interfaces: there exists some object with these methods), and inheritance (a truly bizarre feature where we duct-tape subtyping to a method-and-field grouping mechanism with a bunch of hooks).
With interfaces and inheritance you can simulate both a universal property (generic) and an OR property. But because it's not a direct expression, we leave this giant gap for what people intended to happen to diverge from what actually happens. Especially after time passes, defects are found, and requirements change. [For example, when using interfaces to simulate an OR property, there really isn't any mechanism to let everyone know that this construct is closed. So if something erroneously gets added, you won't know to check the entire code base. And if requirements change and you need to add a new case, then you have to check the entire code base. Completeness checking of ADTs gives you this for free in your pattern matches.]
Too many non-trivial architectural messes that I've encountered in my career have been due to either someone trying to solve all of their problems with interfaces or the same with inheritance* when a simple OR data structure would have made everything simple, clear, and correct.
[*] - Inheritance being more problematic when someone tries to create a non-trivially sized category hierarchy, which ruins the day when requirements change and suddenly the tree needs to be reorganized but doing so would invalidate entire swaths of the code base already accepting types with a different assumed (and undocumented) hierarchal tree structure. Thankfully most people have gotten the memo and switched to interfaces.
They were both introduced in the same decade.
I think that's actually wrong for "Sum types". Product types, sure. The idea of storing a bunch of fields in a single thing matches the way we've been organizing information since we started writing things down.
But I genuinely don't think I've seen an attempt at a sum/union/enumerant/whatever syntax in a programming language that wasn't horrifyingly confusing.
Where by extension: class-based inheritance is actually pretty simple to understand. The classic "IS-A" relationship isn't as simple as "fields in a struct", but it's not hard to understand (c.f. all the animal analogies), and the syntax for expressing it is pretty clean in most languages.
Is it the "best" way to solve a problem? Maybe not. Neither are ADT sum types. But I think there's a major baby-in-the-bathwater problem with trying to be different. I really don't think, for the case of typical coders writing typical code, that ADTs are bringing as much to the table as the experts think.
Syntaxes like: `A | B(Int) | C(String)`. That means A, B, or C.
> Where by extension: class-based inheritance is actually pretty simple to understand. The classic "IS-A" relationship isn't as simple as "fields in a struct", but it's not hard to understand
This value is either `A`, an Int (`B(Int)`) or a String (`C(String)`). Or: this knapsack either contains an A, B, or C. Difficult?
> (c.f. all the animal analogies),
Reminds me of the fact that software isn’t typically as static as animal taxonomies.
> and the syntax for expressing it is pretty clean in most languages.
I’m most used to Java, where you spread `extends` over N files. (Sealed classes in Java are an exception.)
It’s fine. I don’t understand how it is particularly clean.
> Is it the "best" way to solve a problem? Maybe not. Neither are ADT sum types.
Is this an argument? ’Cause I don’t see it.
> But I think there's a major baby-in-the-bathwater problem
Inheritance is something that needs to have some concrete implementation affordance. Baby in the bathwater? I don’t see how you bolt this onto the struct model in a way that gets out of the way for people who don’t want to use it (zero-cost abstraction, in the “you don’t pay for what you don’t use” sense, is important to some low-level languages).[1]
Maybe the designers of hypothetical language X thinks that algebraic data types is enough. What baby are you missing then?
[1] For algebraic data types: structs are straightforward enough. While the “sum type” can be implemented by leaving enough space for the largest variant. That one-size-fits-all strategy isn’t perfect for all use-cases but it seems to have been good enough for Rust which has a lot, a lot of design discussions over minutiae.
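That one-size-fits-all layout is easy to see with sizeof; a small sketch, with hypothetical variants:

```c
#include <stdint.h>

/* Hypothetical sum type: one tag byte plus space for the largest variant. */
typedef struct {
    uint8_t tag; /* 0 = small, 1 = large */
    union {
        uint8_t small;    /* 1-byte payload */
        double  large[4]; /* 32-byte payload */
    } payload;
} Sum;
```

On a typical 64-bit ABI the payload is aligned to 8 bytes, so sizeof(Sum) comes out around 40 even when only the one-byte variant is in use -- the space is reserved for the largest case.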
> trying to be different.
With a technology from the ’70s. I also saw your “one oddball new idea is a revolution” snark. You’re clearly being very honest.
data Bool = True | False

Simple to understand, a nightmare to debug, as you'll be chasing your data and data contracts across a ton of files.
It honestly does annoy me that a lot of mainstream languages still haven't really adopted ADTs; when Java 8 added a lot of (well-needed) new syntax, it felt like that was an ideal opportunity to add ADTs and pattern matching (though I'm sure that was easier said than done).
Well at least Java does now (as of Java 21) have pattern matching (including nested record destructuring) and sealed classes, which let you have decent sum types.
The one issue is that everything is nullable, but that's a wider Java issue.
Also, algebraic data types can be seen as a hierarchy consisting of an abstract base class and several final child classes. So it is an inheritance model, just a restricted one.
However
What we do see is a bunch of mathematical disciplines that end up creating properties like: AND, OR, Universal, Existential, Implication, (and a few others). They end up in places like: set theory, type theory, category theory, various logics, lattice theory, etc.
Now, maybe they're only copying one another and this is more of a memetic phenomenon. Or maybe they've hit upon something that's important for human comprehensibility.
That would be the 'evidence' of the positive effect of ADTs (scare quotes because it might just be math memes and not fundamental). But we can also think about what I feel is legit evidence for the negative effect of lacking ADTs.
Consider what happens if instead of having the standard boolean logic operators and, or, not, xor, we only have the universal not-and operator. Now a straightforward statement like: A && B || C becomes (((A !& B) !& (A !& B)) !& ((A !& B) !& (A !& B))) !& (C !& C). It's more complicated to tell what's actually supposed to be going on AND the '&&' simulation can get intertwined with the '||' simulation. The result being that requirements changes or defect fixes end up modifying the object level expression in a way where there is no longer any mapping back to standard boolean logic. Comprehensibility approaches zero.
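A quick truth-table check in C, with a nand helper standing in for !&, confirms that the NAND-only expansion of (A && B) || C agrees with the direct form:

```c
#include <stdbool.h>

/* nand(a, b) plays the role of the !& operator. */
static bool nand(bool a, bool b) { return !(a && b); }

/* Brute-force all eight assignments of A, B, C and compare the
 * NAND-only expansion of (A && B) || C against the direct form. */
bool expansion_matches(void) {
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            for (int c = 0; c <= 1; c++) {
                bool ab = nand(nand(a, b), nand(a, b));          /* A && B */
                bool nand_only = nand(nand(ab, ab), nand(c, c)); /* ab || C */
                if (nand_only != ((a && b) || c)) return false;
            }
    return true;
}
```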
And we've seen this happen with interfaces and inheritance being used to implement what would otherwise be a relatively simple OR property (with the added benefit that pattern matching ADTs often comes with totality checking; not something you can do with interfaces which can always have another instance even up to and including objects loaded at runtime).
Swift, Kotlin and Scala all have had ADTs for a while, even Java has it now.
Niklaus Wirth is well known as a critic of Algol 68 (before the design of Algol 68 was finalized), but in the case of his variant records he has completely failed to create something competitive.
In practice you need to couple it with an enum, and your visitation mechanism is a switch statement. But C doesn't impose that on you and lets you do it as you see fit.
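Hand-rolled, the BinaryTree example from upthread comes out roughly like this (a sketch; the tag and field names are mine, not what datatype99 actually expands to):

```c
typedef struct BinaryTree BinaryTree;
typedef enum { LEAF, NODE } BinaryTreeTag;

struct BinaryTree {
    BinaryTreeTag tag;
    union {
        int leaf;
        struct { BinaryTree *lhs; int x; BinaryTree *rhs; } node;
    } as;
};

/* The discriminant is the enum; the visitation mechanism is a switch. */
int sum(const BinaryTree *tree) {
    switch (tree->tag) {
    case LEAF:
        return tree->as.leaf;
    case NODE:
        return sum(tree->as.node.lhs) + tree->as.node.x + sum(tree->as.node.rhs);
    }
    return -1; /* invalid tag (no such variant) */
}
```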
It also has all the features of Haskell, since you can implement a Haskell compiler in C.
The worst codebases to inherit as a maintenance programmer are the ones where people got clever with the C preprocessor. Impossible to debug and impossible to maintain.
This library on the other hand addresses a nasty papercut whose presence usually stops folks with modern language experience from choosing C when it might otherwise be valid. Plus you can't beat C's long-term stability.
Though I agree that 90+% who _think_ they still need C should probably move on to making Rust work for them, instead.