NumPy-style broadcasting in Futhark (opens in new tab)

(futhark-lang.org)

132 pointszfnmxt2y ago28 comments

28 comments

19 comments · 6 top-level

eigenspace2y ago· 9 in thread

I’d strongly suggest not going the Numpy route, and instead reading about broadcast in julia [1]. In many contexts, it’s very important to distinguish between the element-wise application of a function, and a function that is applied to an entire array. A classic example would be the matrix exponential[2], versus the element-wise exponential of a matrix.

In julia the former is `exp(M)`, and the latter is `exp.(M)`

[1] https://julialang.org/blog/2017/01/moredots/

[2] https://en.m.wikipedia.org/wiki/Matrix_exponential

cycomanic2y ago

I actually think Julia's behaviour (partially) copied from matlab is bad. I had to review a lot of matlab/Julia simulation code write by PhD/grad students (as well as my own) and most of the time when things somehow are not working debugging becomes a task of "find the missing dot". If they would have decided on a more obvious symbol (the dot is really just way too easy to overlook) for broadcasting it would be much better, but I still much prefer that broadcast and matrix operation would use entirely different operators/named functions. Instead we have the case that a little annotation can dramatically change the behaviour of the code (the matrix exponential is in fact a great example).

That said I mainly work with fields and calculus so I might be biased because the majority of operations are broadcast and I only rarely use matrix operations.

tomsmeding2y ago

The nice thing is that if this dot-approach (or some other syntax with a similar effect) were implemented in Futhark, then because Futhark is statically-typed, all mismatches would be compile-time errors. Those are very hard to overlook :)

1 more reply

setopt2y ago

> most of the time when things somehow are not working debugging becomes a task of "find the missing dot". If they would have decided on a more obvious symbo

I guess you could use your editor’s highlight-regexp function to make the dots more obvious during debugging :)

I personally like the dots because they’re so unintrusive, but I work in a field where most operations are matrix operations, so perhaps you’re right…

zfnmxtOP2y ago

I'd actually say what we did in Futhark is more similar to what Julia does (in the sense that the dot operator is syntactic sugar for Julia's `broadcast`) than what NumPy does---I wrote "NumPy-style" in the title just because I figured that's what most people would be familiar with. A more accurate title might be "Rank-polymorphic function applications in Futhark".

Ultimately it's quite different from both languages, though. Futhark has a strict static type system and we figure out how to broadcast completely statically without runtime information (NumPy and Julia both do it dynamically). This is really the main challenge that the system addresses and it permits you to have broadcasting in an ML/Hindley-Milner/Haskell-style type system along with full type inference and all the usual jazz and goodies.

Our system also doesn't have anything to do with vectorization/performance optimizations---in Futhark, `[1,2,3] + [4,5,6]` is just syntactic sugar for `map (+) [1,2,3] [4,5,6]` (which is how you'd write it without broadcasting), where `map` is already a parallel construct. So broadcasting in Futhark is literally just syntactic sugar for constructs that are inherently parallel, anyway---anything you can express with broadcasting can already be written in stock Futhark and there's no "magic" going on. Broadcasting in Futhark just means that some `map`s can be implicit, rather than explicit.

So it's a very disciplined and transparent (since it's static you can ask the compiler to show you exactly how the broadcasting will work) approach to broadcasting that the user has a lot of control over.

> it’s very important to distinguish between the element-wise application of a function, and a function that is applied to an entire array

Broadcasting in Futhark isn't inherently "element-wise" in the sense of descending through all the ranks of an argument until you get to the scalar level. It tries to actually use the fewest number of `map`s to get a rank-correct function application, so there's a preference for applying a function to higher-dimensional elements over scalars. This might sound non-intuitive/confusing, but it actually almost always aligns with what the programmer intended.

All it really does is use the minimal number of `map`s (and `rep`s) to massage a function and an argument with a rank mismatch into something that makes sense (or is rank correct)---it doesn't even have a notion of scalars, only rank differences!

eigenspace2y ago

> Futhark has a strict static type system and we figure out how to broadcast completely statically without runtime information (NumPy and Julia both do it dynamically).

Julia’s broadcast doesnt do anything dynamic. If you know the input types of a broadcast expression, you know the output type.

> Broadcasting in Futhark isn't inherently "element-wise" in the sense of descending through all the ranks of an argument until you get to the scalar level.

Julia’s broadcast doesnt do that either. One dot is one level deep.

Julia also doesn't really have a concept of scalars (in fact, our number types are actually zero-dimensional, one-element iterables). When I said ‘elementwise’ I just meant each element of a container, which can in turn be a container.

1 more reply

rscho2y ago

I'd strongly advise taking a page from J's book and allow rank specification for function application. Julia having ranks element and infinite is nice, but what about application row-by-row ? On on regular sub arrays?

Athas2y ago

You can just use 'map' explicitly. It doesn't have to be special syntax.

lalaithion2y ago

Futhark is a statically typed language, so it can’t have the same ambiguity.

eigenspace2y ago

What ambiguity?

stabbles2y ago· 2 in thread

Somewhat confusing to see a matrix represented as an array of arrays, it will never be fast.

tomsmeding2y ago

This is in the surface language only: Futhark's arrays must be "rectangular" (if you have an array of arrays, the subarrays are mandated to be equally-sized; this is statically checked in most cases, and when it can't, it becomes a runtime check), and thus they are represented as a single, actual multi-dimensional array.

adgjlsfhk12y ago

imo that seems like some pretty weird semantics. how do you write an operation that needs to bundle together lists of different lengths?

1 more reply

enisberk2y ago· 1 in thread

This is really cool work! Congrats on both the paper and the graduation! A long time ago, I worked on optimizing broadcast operations on GPUs [1]. Coming up with a strategy that promises high throughput across different array dimensionalities is quite challenging. I am looking forward to reading your work.

[1]https://scholar.google.com/citations?view_op=view_citation&h...

zfnmxtOP2y ago

> Congrats on both the paper and the graduation!

Thanks! Although I still have to actually graduate and the paper is in review, so maybe your congratulations are a bit premature! :)

> A long time ago, I worked on optimizing broadcast operations on GPUs [1].

Something similar happens in Futhark, actually. When something like `[1,2,3] + 4` is elaborated to `map (+) [1,2,3] (rep 4)`, the `rep` is eliminated by pushing the `4` into the `map`: `map (+4) [1,2,3]`. Futhark ultimately then compiles it to efficient CUDA/OpenCL/whatever.

mindcrime2y ago· 1 in thread

Apologies in advance for the slightly off-topic comment, but does anybody else find the name "Futhark" a bit.. erm... I dunno, "off-putting" for lack of a better term? What I mean is, I don't work with the language or anything, so it's never "top of mind" for me. And every time I see it show up on HN (or whatever) for some reason when I see that name something "pattern matches" and makes me think it's some eso-lang ala INTERCAL, Befunge, Malbolge, etc. It's only if I click through on the link and do some reading that I'm reminded that "Oh, this is something serious."

Probably it's just a me thing, but I'd be curious to know if anybody else experiences this with regards to Futhark.

digging2y ago

First I'm hearing of the programming language, but Futhark is the name given to several forms of the Old Norse runic alphabet, not made up wholesale or derived from other less-serious names.

jrevels2y ago

Cool stuff! Thanks for the write-up. Getting broadcast ergonomics right in a fully statically typed context seems both tricky and pretty satisfying.

> We thought it’d be a fun/simple feature and a good break from the heavy duty work on automatic differentiation we had just published

Perhaps it wasn't as much of a break as you might think - funnily enough, static broadcast semantics can serve as an interesting structural foothold for certain AD optimizations [1]

(disclaimer: I'm a coauthor on that paper :) but it's been quite a while since I've done work in the AD space)

[1] https://arxiv.org/pdf/1810.08297

xcdzvyn2y ago

I've seen Futhark on the front page a few times recently, has there been a resurgence of interest for some reason?

Exciting if so. I think it's a cool language

j / k navigate · click thread line to collapse

28 comments

19 comments · 6 top-level

eigenspace2y ago· 9 in thread

In julia the former is `exp(M)`, and the latter is `exp.(M)`

[1] https://julialang.org/blog/2017/01/moredots/

[2] https://en.m.wikipedia.org/wiki/Matrix_exponential

cycomanic2y ago

That said I mainly work with fields and calculus so I might be biased because the majority of operations are broadcast and I only rarely use matrix operations.

tomsmeding2y ago

1 more reply

setopt2y ago

> most of the time when things somehow are not working debugging becomes a task of "find the missing dot". If they would have decided on a more obvious symbo

I guess you could use your editor’s highlight-regexp function to make the dots more obvious during debugging :)

I personally like the dots because they’re so unintrusive, but I work in a field where most operations are matrix operations, so perhaps you’re right…

zfnmxtOP2y ago

> it’s very important to distinguish between the element-wise application of a function, and a function that is applied to an entire array

eigenspace2y ago

> Futhark has a strict static type system and we figure out how to broadcast completely statically without runtime information (NumPy and Julia both do it dynamically).

Julia’s broadcast doesnt do anything dynamic. If you know the input types of a broadcast expression, you know the output type.

> Broadcasting in Futhark isn't inherently "element-wise" in the sense of descending through all the ranks of an argument until you get to the scalar level.

Julia’s broadcast doesnt do that either. One dot is one level deep.

1 more reply

rscho2y ago

Athas2y ago

You can just use 'map' explicitly. It doesn't have to be special syntax.

lalaithion2y ago

Futhark is a statically typed language, so it can’t have the same ambiguity.

eigenspace2y ago

What ambiguity?

stabbles2y ago· 2 in thread

Somewhat confusing to see a matrix represented as an array of arrays, it will never be fast.

tomsmeding2y ago

adgjlsfhk12y ago

imo that seems like some pretty weird semantics. how do you write an operation that needs to bundle together lists of different lengths?

1 more reply

enisberk2y ago· 1 in thread

[1]https://scholar.google.com/citations?view_op=view_citation&h...

zfnmxtOP2y ago

> Congrats on both the paper and the graduation!

Thanks! Although I still have to actually graduate and the paper is in review, so maybe your congratulations are a bit premature! :)

> A long time ago, I worked on optimizing broadcast operations on GPUs [1].

mindcrime2y ago· 1 in thread

Probably it's just a me thing, but I'd be curious to know if anybody else experiences this with regards to Futhark.

digging2y ago

First I'm hearing of the programming language, but Futhark is the name given to several forms of the Old Norse runic alphabet, not made up wholesale or derived from other less-serious names.

jrevels2y ago

Cool stuff! Thanks for the write-up. Getting broadcast ergonomics right in a fully statically typed context seems both tricky and pretty satisfying.

> We thought it’d be a fun/simple feature and a good break from the heavy duty work on automatic differentiation we had just published

Perhaps it wasn't as much of a break as you might think - funnily enough, static broadcast semantics can serve as an interesting structural foothold for certain AD optimizations [1]

(disclaimer: I'm a coauthor on that paper :) but it's been quite a while since I've done work in the AD space)

[1] https://arxiv.org/pdf/1810.08297

xcdzvyn2y ago

I've seen Futhark on the front page a few times recently, has there been a resurgence of interest for some reason?

Exciting if so. I think it's a cool language

j / k navigate · click thread line to collapse