BlazingSQL is Now Open Source (opens in new tab)

(blog.blazingdb.com)

307 pointsroaramburu6y ago64 comments

64 comments

This seems pretty cool, and I'll probably play with this at some point, but sadly literally all of my GPUs are AMD or Intel at this point.

I'm sure you had a good reason, so I'm genuinely curious to why CUDA was chosen instead of something like OpenCL?

(I'll add my typical disclaimer that I'm not saying this as some passive-aggressive way to criticize; I'm genuinely curious to the reasoning behind the choice.)

felipe_aramburu6y ago

Thats a great question. The answer is two-fold.

Early on when we first started playing around with General Processing on GPU's we had Nvidia cards to begin with and I started looking at the apis that were available to me.

The CUDA ones were easier for me to get started, had tons of learning content that Nvidia provided, and were more performant on the cards that I had at the time compared to other options. So we built up lots of expertise in this specific way of coding for GPUS. We also found time and time again that it was faster than opencl for what we were trying to do and the hardware available to us on cloud providers was Nvidia GPUs.

The second answer to this question is that blazingsql is part of a greater ecosystem. rapids.ai and the largest contributor by far is Nvidia. We are really happy to be working with their developers to grow this eco system and that means that the technology will probably be CUDA only unless we somehow program "backends" like they did with thrust but that would be eons away from now.

Improvotter6y ago

> We also found time and time again that it was faster than opencl for what we were trying to do and the hardware available to us on cloud providers was Nvidia GPUs.

Were some benchmarks done perhaps or could you provide some more low-level reasons as to why CUDA was more performant? I'm not experienced with CUDA, just generally interested.

I also have to say that I am a bit skeptical of Nvidia as I have never received any proper support for Linux development on Nvidia GPUs for drivers and generally tracking bugs on their cards. It was so frustrating that I just switched to AMD GPUs that "just worked". How is this different for these kinds of use cases? Does Nvidia only care about their potential enterprise customers but they don't care about general usage of their GPUs on Linux? It seems to rub me the wrong way and I don't understand.

1 more reply

craftyguy6y ago

> blazingsql is part of a greater ecosystem

But now blazingsql is part of an ecosystem within a walled garden fully dependent upon the stability of a single company.

2 more replies

tombert6y ago

Thanks for answering my question so quickly!

That seems like a pretty good reason...I have been looking to learn some GPU programming to optimize some matrix math that I've been doing for a pet project, and while my first instinct was telling me OpenCL since it's portable, if people who actually know what they're talking about are saying that CUDA is simpler to start with, it might be worth it to me to pick up a cheap Nvidia GPU/Jetson Nano and do some processing that way.

3 more replies

en4bz6y ago

CUDA has two APIs:

1. The runtime api (libcudart.so)

2. The driver api (libcuda.so).

The driver api is very close to the opencl api and is very low level. Most people use the CUDA runtime api which is vastly more convenient. The main difficulty with OpenCl and the driver api is that you have to manually load GPU code onto the device which then returns a handle. You generally have to load the code onto every device which means multiple handles for the same function. This makes executing kernel quiet a lot of work. The runtime api does this all automatically which make programming with CUDA quiet easy since launching a kernel is basically a function call. The CUDA rutime also automatically handles context creation which is another time saver.

When I first learned OpenCL I was shocked at how difficult is was to simply write a simple vector add program since there was all this additional code loading, creating contexts, etc. The setup / boiler plate was greater than the actually code itself.

It basically boils down to convenience in my opinion. Couple this with the fact the NVIDIA generally has the most powerful and energy efficient cards and it's no surprise they took the market.

hobofan6y ago

> The driver api is very close to the opencl api and is very low level.

They are only realistically comparable from OpenCL 2.0 onwards. But no NVIDIA card supports anything beyond 1.2, and with that decision they basically killed OpenCL.

1 more reply

The_rationalist6y ago

The open conccurent to the runtime api is SYCL.

1 more reply

dang6y ago

Discussed twice last year:

https://news.ycombinator.com/item?id=18186392

https://news.ycombinator.com/item?id=19192625

Also 2017: https://news.ycombinator.com/item?id=15819489

2016: https://news.ycombinator.com/item?id=12484568

rburhum6y ago

This is great. The BlazingDB guys are awesome and now that the project is open source this is another good reason for my teams to experiment with different workloads and compare it against a SparkSQL approach

roaramburuOP6y ago

Also! We have a guy in Lima pushing out some GIS work into cuDF and BlazingSQL too.

huac6y ago

+1, this is very cool, but would love for the BlazingDB team to show benchmarks here

roaramburuOP6y ago

Tons of benchmarks at blog.blazingdb.com

Check it out, it's fast.

samstave6y ago

Can someone give me some use case examples?

I read that site, and the RAPIDS site - but would like to hear from some ppl using this in prod/test and what they are using it for...

lmeyerov6y ago

We worked with the team early on it. In turn, that means it's inside one of the powertools at gov, bank, etc. teams, even if most of the users don't quite know what a GPU DB is :) We do GPU visual graph analytics over event data (security, fraud, customer 360, ...). We use for a bunch: interactive sub-100ms timebars, histograms, etc. Any full-table compute stuff you'd do in pandas, sql, spark, etc. Any UI interaction like a filter can trigger tons of queries, and w/ GPUs, that means they can quickly compute all sorts of things.

The reason Graphistry picked BlazingSQL is it fit in as part of our approach of end-to-end GPU services that compose by sharing in-memory Apache Arrow format columnar data. When the Blazing team aligned on Nvidia RAPIDS more deeply than the other 2nd-wave GPU analytics engines, it made the most sense as an embedded compute dependency. Going forward, that means Blazing can focus on making a great SQL engine, and we know the rate of their GPU progress won't be pegged to their team but to RAPIDS. A surprise win over just cudf (python) was eliminating most of the constant overheads (10ms->1ms / call), and looking forward, seems like an easier path to multi/many-GPU vs. cudf (dask).

We should share a tech report at some point - bravo to the team!

felipe_aramburu6y ago

Thanks Leo! We love having you all as early adopters of our tech!

pradn6y ago

Looks like an good way to do analytics on the GPU. The Python API is clean and simple.

The premise is that GPUs will accelerate columnar data analytics. And, with "Dask" [1], you can run those worldloads on a cluster.

I wonder if careful indexing on initial write would outperform this system. This system looks like it's best when you have totally raw, unindexed data. Perhaps a future thing to do is to generate a side index during initial column scans to speed up future queries?

Also, GPU memory is pretty expensive. How does the total-cost-of-ownership compare to just running on RAM with powerful multi-core CPUs? There's like 512-bit vector operations these days.

[1]: https://rapids.ai/dask.html

felipe_aramburu6y ago

GPU memory is expensive but a big as #@$% computer is even more expensive. When we show comparisons to things like spark we are doing so use cost basis. So if we say something like we are x times faster than this technology on this workload what we did was launch clusters that have similar costs. Total cost of ownership is also reduced by the fact that the engine itself is totally ephemeral. You can turn it off and on within seconds.

orliesaurus6y ago

I had never heard of this before, anyone in HN has used this before? If yes, where and more specifically what was your use-case? Thank you!

bsamuels6y ago

What kind of benefits does CUDA bring to databases? I've never heard of running a database on a GPU before. Couldn't find anything on their homepage other than comparison with a few other db options

felipe_aramburu6y ago

This is a Distributed SQL engine not a database. We store no data. You store your data in HDFS, S3, posix, NFS etc. We allow you to query directly from these filesystems of the file formats you have already. You can look here to see the file formats cudf supports. https://github.com/rapidsai/cudf/tree/branch-0.9/cpp/src/io

You can try it out yourself here https://colab.research.google.com/drive/1r7S15Ie33yRw8cmET7_...

Or use dockerhub https://hub.docker.com/r/blazingdb/blazingsql/

The benefits are.

Greatly increased processing capacities. We can just perform orders of magnitudes more instructions per second than a cpu with the gpus we are using.

Decompression and parsing of formats like CSV and parquet happens in the GPU orders of magnitude faster than the best cpu alternatives.

You can take the output of your queries and provide it to machine learning jobs with zero copy ipc and get the results back the same way. We are all about interoperability with the rapidsai eco system.

chrisjc6y ago

Is there any reason why a SQL format isn't is that list? Wondering if there's a way to join SQL sources with file storage sources. An example of this would be filtering or enrichment operations.

// sorry if this is a stupid question.

2 more replies

ohnoesjmr6y ago

I've read the website, but I could't find a hint that the engine is distributed. Even the spark benchmarks compare a single instance with multiple nodes.

Is it distributed? How do I set it up in a distributed mode? Does it support nested parquet (something that even spark itself struggles to support inside SQL).

1 more reply

reilly30006y ago

Check out https://www.omnisci.com/learn/resources/gpu-database

In summary, you get snappy, interactive query speeds on large data sets. I've ran that locally and the results are pretty amazing compared to Postgres or even Tableau in-memory.

I'm personally more excited about GPUs in stream processing; its just quite a natural fit: https://github.com/rapidsai/cudf

kichik6y ago

If you're interested in stream processing, check out FASTDATA.io PlasmaENGINE. We do both stream and batch processing with Apache Spark on the GPU.

https://fastdata.io/plasma-engine/

* It's not open-source and I work there.

1 more reply

felipe_aramburu6y ago

Blazingsql is built on top of CUDF. We are contributors to rapidsai

throwaway0827296y ago

Isn't the speed bottlenecked by the storage speed. Is the data fully loaded into memory first?

1 more reply

arnon6y ago

CUDA by itself brings easy-to-run parallel algorithms. It's not of much value for databases unless you have a proper infrastructure set up to use it correctly. Same is true for columnar aspects, for example.

People have been building columnar databases to do analytics quickly. GPUs (with CUDA) can run analytics operations (think join, group by, math, sorting) on columnar data in a much more efficient manner. They're designed for operations on vectors, which columns are.

We've been doing this ourselves too with SQream DB: https://sqream.com. It's an enterprise data warehouse with GPU acceleration. We use CUDA exclusively too.

taf26y ago

See https://wiki.postgresql.org/images/6/65/Pgopencl.pdf

Also from 4 years ago

https://news.ycombinator.com/item?id=10151632

fanf26y ago

PG-Strom is a GPU accelerator extension for PostgreSQL which has been around for a few years now. I have not tried it myself... http://heterodb.github.io/pg-strom/

manojlds6y ago

Is this due to PartiQL?

manigandham6y ago

PartiQL is a query language, based in SQL and extended to be more natural with unstructured and nested data. It can be used with various database and querying engines.

BlazingDB/SQL is a querying engine, more similar to Presto or Apache Drill, and specializes in using GPUs for processing power.

roaramburuOP6y ago

Yeah, we were totally ignorant on PartiQL until your post. Although now looking at it, looks boss! Totally agrees with many of our theses, and there looks to be a lot to glean from this project as well.

felipe_aramburu6y ago

No. We hadn't heard of what PartiQL was before you wrote that message. At least I hadn't. I am the CTO of blazingsql

randyzwitch6y ago

This is an interesting comparison I wouldn't have thought of. But yes, PartiQL does have a similar feel to this announcement, though without GPU-acceleration, the processing speed might be several orders of magnitude slower

llampx6y ago

Does this only run on NVIDIA/CUDA systems?

bernaferrari6y ago

> BlazingSQL is built entirely on top of cuDF and cuIO.

Yes.

felipe_aramburu6y ago

Yes to be more specific it works on cuda 9.2 and 10.0 at the moment like the rest of the rapidsai eco system.

j / k navigate · click thread line to collapse

64 comments

tombert6y ago

This seems pretty cool, and I'll probably play with this at some point, but sadly literally all of my GPUs are AMD or Intel at this point.

I'm sure you had a good reason, so I'm genuinely curious to why CUDA was chosen instead of something like OpenCL?

(I'll add my typical disclaimer that I'm not saying this as some passive-aggressive way to criticize; I'm genuinely curious to the reasoning behind the choice.)

felipe_aramburu6y ago

Thats a great question. The answer is two-fold.

Early on when we first started playing around with General Processing on GPU's we had Nvidia cards to begin with and I started looking at the apis that were available to me.

Improvotter6y ago

> We also found time and time again that it was faster than opencl for what we were trying to do and the hardware available to us on cloud providers was Nvidia GPUs.

Were some benchmarks done perhaps or could you provide some more low-level reasons as to why CUDA was more performant? I'm not experienced with CUDA, just generally interested.

1 more reply

craftyguy6y ago

> blazingsql is part of a greater ecosystem

But now blazingsql is part of an ecosystem within a walled garden fully dependent upon the stability of a single company.

2 more replies

tombert6y ago

Thanks for answering my question so quickly!

3 more replies

en4bz6y ago

CUDA has two APIs:

1. The runtime api (libcudart.so)

2. The driver api (libcuda.so).

It basically boils down to convenience in my opinion. Couple this with the fact the NVIDIA generally has the most powerful and energy efficient cards and it's no surprise they took the market.

hobofan6y ago

> The driver api is very close to the opencl api and is very low level.

They are only realistically comparable from OpenCL 2.0 onwards. But no NVIDIA card supports anything beyond 1.2, and with that decision they basically killed OpenCL.

1 more reply

The_rationalist6y ago

The open conccurent to the runtime api is SYCL.

1 more reply

dang6y ago

Discussed twice last year:

https://news.ycombinator.com/item?id=18186392

https://news.ycombinator.com/item?id=19192625

Also 2017: https://news.ycombinator.com/item?id=15819489

2016: https://news.ycombinator.com/item?id=12484568

rburhum6y ago

roaramburuOP6y ago

Also! We have a guy in Lima pushing out some GIS work into cuDF and BlazingSQL too.

huac6y ago

+1, this is very cool, but would love for the BlazingDB team to show benchmarks here

roaramburuOP6y ago

Tons of benchmarks at blog.blazingdb.com

Check it out, it's fast.

samstave6y ago

Can someone give me some use case examples?

I read that site, and the RAPIDS site - but would like to hear from some ppl using this in prod/test and what they are using it for...

lmeyerov6y ago

We should share a tech report at some point - bravo to the team!

felipe_aramburu6y ago

Thanks Leo! We love having you all as early adopters of our tech!

pradn6y ago

Looks like an good way to do analytics on the GPU. The Python API is clean and simple.

The premise is that GPUs will accelerate columnar data analytics. And, with "Dask" [1], you can run those worldloads on a cluster.

Also, GPU memory is pretty expensive. How does the total-cost-of-ownership compare to just running on RAM with powerful multi-core CPUs? There's like 512-bit vector operations these days.

[1]: https://rapids.ai/dask.html

felipe_aramburu6y ago

orliesaurus6y ago

I had never heard of this before, anyone in HN has used this before? If yes, where and more specifically what was your use-case? Thank you!

bsamuels6y ago

What kind of benefits does CUDA bring to databases? I've never heard of running a database on a GPU before. Couldn't find anything on their homepage other than comparison with a few other db options

felipe_aramburu6y ago

You can try it out yourself here https://colab.research.google.com/drive/1r7S15Ie33yRw8cmET7_...

Or use dockerhub https://hub.docker.com/r/blazingdb/blazingsql/

The benefits are.

Greatly increased processing capacities. We can just perform orders of magnitudes more instructions per second than a cpu with the gpus we are using.

Decompression and parsing of formats like CSV and parquet happens in the GPU orders of magnitude faster than the best cpu alternatives.

chrisjc6y ago

Is there any reason why a SQL format isn't is that list? Wondering if there's a way to join SQL sources with file storage sources. An example of this would be filtering or enrichment operations.

// sorry if this is a stupid question.

2 more replies

ohnoesjmr6y ago

I've read the website, but I could't find a hint that the engine is distributed. Even the spark benchmarks compare a single instance with multiple nodes.

Is it distributed? How do I set it up in a distributed mode? Does it support nested parquet (something that even spark itself struggles to support inside SQL).

1 more reply

reilly30006y ago

Check out https://www.omnisci.com/learn/resources/gpu-database

In summary, you get snappy, interactive query speeds on large data sets. I've ran that locally and the results are pretty amazing compared to Postgres or even Tableau in-memory.

I'm personally more excited about GPUs in stream processing; its just quite a natural fit: https://github.com/rapidsai/cudf

kichik6y ago

If you're interested in stream processing, check out FASTDATA.io PlasmaENGINE. We do both stream and batch processing with Apache Spark on the GPU.

https://fastdata.io/plasma-engine/

* It's not open-source and I work there.

1 more reply

felipe_aramburu6y ago

Blazingsql is built on top of CUDF. We are contributors to rapidsai

throwaway0827296y ago

Isn't the speed bottlenecked by the storage speed. Is the data fully loaded into memory first?

1 more reply

arnon6y ago

We've been doing this ourselves too with SQream DB: https://sqream.com. It's an enterprise data warehouse with GPU acceleration. We use CUDA exclusively too.

taf26y ago

See https://wiki.postgresql.org/images/6/65/Pgopencl.pdf

Also from 4 years ago

https://news.ycombinator.com/item?id=10151632

fanf26y ago

PG-Strom is a GPU accelerator extension for PostgreSQL which has been around for a few years now. I have not tried it myself... http://heterodb.github.io/pg-strom/

manojlds6y ago

Is this due to PartiQL?

manigandham6y ago

PartiQL is a query language, based in SQL and extended to be more natural with unstructured and nested data. It can be used with various database and querying engines.

BlazingDB/SQL is a querying engine, more similar to Presto or Apache Drill, and specializes in using GPUs for processing power.

roaramburuOP6y ago

felipe_aramburu6y ago

No. We hadn't heard of what PartiQL was before you wrote that message. At least I hadn't. I am the CTO of blazingsql

randyzwitch6y ago

llampx6y ago

Does this only run on NVIDIA/CUDA systems?

bernaferrari6y ago

> BlazingSQL is built entirely on top of cuDF and cuIO.

Yes.

felipe_aramburu6y ago

Yes to be more specific it works on cuda 9.2 and 10.0 at the moment like the rest of the rapidsai eco system.

j / k navigate · click thread line to collapse