The tl;dr of Haxl: what if you could describe accessing a data store (a la SQL) and have the compiler and library work together to "figure out" the most efficient way to perform queries, including performing multiple queries in parallel? That's what Haxl does: it lets you specify the "shape" of your query, the type checker verifies its correctness, and the library executes it in parallel for you, without the developer needing to know anything about synchronizing access.
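For a flavor of the trick, here's a toy sketch (emphatically not the real Haxl API, just the kernel of the idea): an Applicative whose <*> merges the pending requests of both sides, so independent fetches written as ordinary Applicative code get issued as a single batch.

```haskell
-- Toy sketch of Haxl's core trick (NOT the real Haxl API).
-- A computation is either finished, or blocked on some requests
-- (here just user ids) plus a continuation expecting their results.
data Fetch a = Done a | Blocked [Int] ([String] -> Fetch a)

instance Functor Fetch where
  fmap f (Done a)       = Done (f a)
  fmap f (Blocked rs k) = Blocked rs (fmap f . k)

instance Applicative Fetch where
  pure = Done
  Done f       <*> x              = fmap f x
  Blocked rs k <*> Done x         = Blocked rs (\res -> k res <*> Done x)
  -- The key case: both sides are blocked, so merge their requests
  -- into one batch and split the results back out afterwards.
  Blocked rs k <*> Blocked rs' k' =
    Blocked (rs ++ rs')
            (\res -> let (a, b) = splitAt (length rs) res
                     in k a <*> k' b)

-- One primitive request: look up a user name by id.
getUser :: Int -> Fetch String
getUser uid = Blocked [uid] (\(name:_) -> Done name)

-- The "scheduler": run round by round, issuing all pending
-- requests of a round as one batch.
run :: ([Int] -> IO [String]) -> Fetch a -> IO a
run _     (Done a)       = return a
run batch (Blocked rs k) = do
  putStrLn ("one batched round: " ++ show rs)
  res <- batch rs
  run batch (k res)

-- A fake data source standing in for a real backend.
fakeDb :: [Int] -> IO [String]
fakeDb = return . map (\i -> "user" ++ show i)

main :: IO ()
main = do
  -- Two independent fetches, written as plain Applicative code:
  pair <- run fakeDb ((,) <$> getUser 1 <*> getUser 2)
  print pair  -- prints ("user1","user2") after a single round [1,2]
```

Both lookups go out in a single round even though the code never mentions batching. The real Haxl generalizes the request type per data source, adds caching and deduplication, and overlaps calls to different sources, but this is the core mechanism.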
Here's a link to their paper (PDF): http://www.haskell.org/wikiupload/c/cf/The_Haxl_Project_at_F...
* - I am not sure if he's still committing, or if he's only doing application development. His accomplishments in Haskell land, though, are many.
Edited: I removed my comment about GitHub issues, seems it's a known problem. :)
[1] http://chimera.labs.oreilly.com/books/1230000000929
[2] http://www.serpentine.com/blog/2014/03/18/book-review-parall...
I haven't used databases much, but don't most SQL implementations already "figure out" the most efficient way to perform queries? Can't most implementations already perform queries in parallel?
http://www.haskell.org/haskellwiki/ZuriHac2014#Talk_by_Simon...
As said by @nbm, we also have a blog post up: https://code.facebook.com/posts/302060973291128/open-sourcin....
(I appreciate that migrating code that already works to a new language often just introduces bugs for no gain, so please don't take my questions as trying to dig up dirt or anything. I'm genuinely just curious.)
We previously had a custom DSL, and it outgrew its DSL-ness. The DSL was really good at one thing (implicit concurrency and scheduling IO) and bad at everything else (CPU, memory, debugging, tooling). The predecessor was wildly successful and created new problems. Once all those secondary concerns became first order, we didn't want to start building all this ecosystem stuff for our homemade DSL. We needed to go from DSL to, ya know, an L. So the question is which...
If you understand the central idea of Haxl, I don't know of any other language that would let you do what Haxl in Haskell does. The built-in language support for building DSLs (hijacking the operators, including applicative/monadic operations) -really- shines in this case. I would -love- to see Haxl-like implicit concurrency in other languages that feels as natural and concise. Consider that a challenge. I thought about trying to do it in C++ for edification/pedagogical purposes, but it's an absolutely brutal mess of templates and hackery. There may be a better way, though.
It contains a lot more information about the problem it was originally created to solve, and potential other use cases.
I am really interested in seeing how you solve problems for distributed systems with Haxl and how query sharding is handled etc..
I wasted a whole day a few weeks ago looking for Haxl online, only to find out that it hadn't been released yet. The release really makes me happy :)
Query sharding is at the data source layer, which Haxl doesn't delve into. It's up to each data source integration with Haxl to do the appropriate routing/etc.
Hope you find it useful!
Is it like a query engine, where you work with the entire query up-front, apply transforms and build a query plan?
Or is it more like an event loop, where you run as far as you can until the code blocks on IO, batch up and send all the pending IO requests, and run further when the tasks you're blocked on resolve?
That said, the way it currently works is more like the first. You can think of the entire Haxl run (program) as an AST that is handed to the executor. It expands as much of the AST as possible (anything that's not IO), and anywhere it needs IO it enqueues those requests to be scheduled. Once it has explored as much as possible, it aggressively schedules the IO (deduping, batching, and overlapping the calls). Once the results come back, it unblocks the AST where it can and repeats the process.
This isn't necessarily the optimal scheduling (as you point out, unblocking each part of the tree as each result comes in might be better). It was specifically designed to make it easy to experiment with this kind of thing later: since the concurrency is entirely implicit, the implementation is fully abstracted away.
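In rough pseudocode (illustrative names, not Haxl's internal API), each round of that process looks like:

```
runRounds program =
  explore program as far as possible without doing IO
  if finished:
      return the result
  else:
      reqs    = dedupe(all pending requests)
      batches = group reqs by data source
      results = issue all batches concurrently   -- overlap the calls
      substitute results into the blocked leaves
      runRounds (the now-unblocked program)
```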
Interpreted code was no longer cutting it for perf reasons, and any time you create your own language you end up reinventing the entire toolchain (debuggers, profilers, etc.). Haskell provides so much functionality in the language itself and has mature solutions to the other issues plaguing us in FXL, so it was a natural choice.
Why don't Haskell libraries on Hackage come with even a single example: no getting-started guide, no how-to, no quick start, nothing, really, just function declarations? This scares Haskell newbies.
Most libraries list a "Home Page" that more often than not includes more useful documentation (Haxl's, for example, has the things you've mentioned).
I concur that, most of the time, the documentation on Hackage isn't really sufficient, but I've found that for the most part I just use it to find the homepage, and then go there to read the actual documentation.
I agree that it would be nice if everything was all in one place.
I actually find "distilled reference with links to source" a fantastically valuable view. I've no objection to providing some sort of combined view, but let's not lose what we have in a quest for consolidation. I've no idea if that's what you meant or not, and don't mean to put words in your mouth of course, just expressing a concern.
http://hackage.haskell.org/package/pipes-4.1.2/docs/Pipes-Tu...
http://hackage.haskell.org/package/aeson-0.7.0.6/docs/Data-A...
Like all Facebook services, they are communicated with over the Thrift RPC system; they may have PHP (or any other language) clients, and they may talk to other services using Thrift (or occasionally other protocols), some of which may use PHP.
If you're asking a general question about batching requests in PHP, http://docs.hhvm.com/manual/en/hack.async.php may be informative.