An SQL Solution for Jupyter (opens in new tab)

My notebook code typically involves a data prep stage with querying a SQL database, then downloading into Python for more complex analysis, ML modelling, integration with external data sources, etc. So the notebook has a Python kernel with SQL usually as embedded """-quoted strings.

Does anyone have a solution to treating selected code cells as SQL - with SQL highlighting and tooltips - exposed as string variables to the Python code?

Sparkmagic [1] does part of this for Python/SQL/Spark interoperability, but as far as I recall, doesn't support SQL syntax highlighting.

[1] https://github.com/jupyter-incubator/sparkmagic

galacticdessert5y ago

Agree this would be awesome. I reckon you could obtain something similar with pandas with pd.read_sql("<query>"), but a dedicated SQL cell akin to markdown cells would make it much more pleasant

pletnes5y ago

There’s a %%bash cell magic which turns one cell into a bash cell. What about a %%xeus cell to run a SELECT (or any sql)? Would that work?

krmk5y ago

Like this: https://pypi.org/project/ipython-sql/ ?

3l3ktr4OP5y ago

that's pretty amazing work! there have been some efforts in the past to do that: https://github.com/minrk/allthekernels I think it'd be hard to do something like this, but not impossible. Just a lot of work.

andreareina5y ago· 2 in thread

The only example of how to use it is an animation that is just impossible to read :(

Here are the examples in the live demo[1]:

    %LOAD sqlite3 db=chinook.db timeout=2 shared_cache=true
    %XVEGA_PLOT X_FIELD EmployeeId Y_FIELD ReportsTo BIN TRUE MARK square WIDTH 100 HEIGHT 200 <> SELECT EmployeeId, ReportsTo FROM employees
    %XVEGA_PLOT X_FIELD EmployeeId Y_FIELD ReportsTo BIN MAXBINS 3 MARK bar COLOR red WIDTH 200 HEIGHT 200 <> SELECT EmployeeId, ReportsTo FROM employees
    %XVEGA_PLOT X_FIELD EmployeeId Y_FIELD ReportsTo TYPE ordinal MARK bar COLOR green WIDTH 200 HEIGHT 200 <> SELECT EmployeeId, ReportsTo FROM employees
    %XVEGA_PLOT X_FIELD Name TYPE nominal Y_FIELD ArtistId BIN MAXBINS 1 MARK line COLOR purple WIDTH 200 HEIGHT 200 <> SELECT Name, ArtistId FROM artists LIMIT 10
    %XVEGA_PLOT X_FIELD EmployeeId GRID false Y_FIELD ReportsTo MARK area COLOR pink WIDTH 200 HEIGHT 200 <> SELECT EmployeeId, ReportsTo FROM employees

[1] https://hub.gke2.mybinder.org/user/jupyter-xeus-xeus-sql-x64...

3l3ktr4OP5y ago

hey @andreareina! Actually we have a binder on the repo that you can try + several examples :) link to binder: https://mybinder.org/v2/gh/jupyter-xeus/xeus-sql/stable?urlp... link to examples:https://github.com/jupyter-xeus/xeus-sql/tree/master/example...

andreareina5y ago

Thanks! I'm a fan of notebook-based work and just interactive programming in general and even if I don't use Jupyter myself it's exciting to see innovation in the space.

My complaint is really more that for this sort of thing I feel examples should be front and center; Jupyter "sells" a better experience, so show me how easy and ergonomic it is to use!

Hope that helps. Right now it looks like the sql parts are kind of the equivalent of running a subshell and displaying the results, with no communication to the "main" code. Is this accurate? Is there a way forward in the future to e.g. pull the data down with sql and then do some further munging in another language?

2 more replies

aargh_aargh5y ago· 2 in thread

I followed the instructions and the sql cells work, but the xvega ones return empty output and I'm not sure how to debug that.

  conda create -n xeus-sql
  conda activate xeus-sql
  conda install xeus-sql soci-mysql soci-postgresql soci-sqlite jupyterlab -c conda-forge

In conda list I see, among others:

  xeus                      1.0.0                h78d96c3_0    conda-forge
  xeus-sql                  0.0.8                h118ccdd_1    conda-forge
  xvega                     0.0.10               h4bd325d_0    conda-forge
  xvega-bindings            0.0.10               h4bd325d_0    conda-forge

Btw there's a typo in the instructions: soci-postresql -> soci-postgresql

aargh_aargh5y ago

Also, for mysql and postgres, are only socket connections supported? No TCP connections?

https://xeus-sql.readthedocs.io/en/latest/PostgreSQL.html#li...

aargh_aargh5y ago

Answering my own question - works as expected, only the documentation is lacking.

%LOAD postgresql host=127.0.0.1 port=5432 dbname=DATABASE user=USERNAME password=PASSWORD

ktpsns5y ago· 1 in thread

This is amazing. Jupyter is already a great tool for data science and being able to directly interact with SQL (without a host/intermediate language) is great, because it is very clean and SQL is a powerful language on its own – no need for some host language. Furthermore, vega (https://vega.github.io/vega-lite/) is an amazing kind-of-declarative visualization language which mixes great with the declarative language SQL.

3l3ktr4OP5y ago

thanks for the kind words! :)

realitysballs5y ago· 1 in thread

Does anyone know if can work with 32-bit ODBC drivers? My industry is behind the times and the only way to interact with our ERP is 32-bit ODBC

3l3ktr4OP5y ago

hey @realityballss, don't really know as I'm not too deep in the SOCI stuff. but we offer full support to SOCI meaning if these folks: https://github.com/SOCI/soci support it and the dependencies exist and work for 32bit, than yes. I'd say it looks a bit like a long shot, but may work.

Vaslo5y ago· 1 in thread

Names almost every major sql engine except SQL Server

Says will work with any DB with an ODBC connection

Hmmmm

3l3ktr4OP5y ago

it’s depending directly of SOCI. I offer a wrapper around everything they do :)

fulafel5y ago

The Github support for notebooks is so nice (was linked from the example pic caption: https://github.com/wangfenjin/xeus-tidb/blob/develop/example...)

geebee5y ago

This looks very promising! I will look into it.

When I want a tabular view, I currently either used the pandas read_sql_query method or the PandaSQL module. That does work, and I often do want to use pandas and sql together (often back and forth, depending on the operation).

That said, I can see a use for this tool in my work - I'm often really just interested in interacting with a database in a more visually friendly way than the command line, and it would be great to stay with Jupyter rather than having to go through a completely different UI.

So, yeah, nice looking tool, and thanks for posting!

seddonm15y ago

We have used https://almond.sh/ to create a Spark SQL interpreter using Jupyter Notebooks - plus a whole lot more which you can see here: https://arc.tripl.ai/tutorial

After seeing many companies writing ETL using code we decided it was too hard to manage at scale so provided this abstraction layer - which is heavily centered around expressing business logic in SQL - to standardise development (JupyterLab) and allow rapid deployments.

jasmcole5y ago

Those interested in SQL notebooks can also check out https://count.co

j / k navigate · click thread line to collapse

32 comments

27 comments · 11 top-level

plumsempy5y ago· 5 in thread

I know you pronounce it ES-KYU-EL because you used "an"

fulafel5y ago

It's more common than "sequel" in non english speaking places IME.

KerryJones5y ago

Yep, it through me off, I saw it somewhere years ago that the pronunciations were:

SQL = SEE-KWUHL MySQL = MY-ES-KYU-EL

And it's now engrained in my mind and this title through me off ;)

junon5y ago

"An SQL", "through me off"....

I just woke up and my head is already spinning. ;)

1 more reply

jessikat5y ago

"SEE-KWUHL" is actually the pronunciation for MS SQL Server. "ES-KYU-EL" is for all other uses of SQL.

1 more reply

slt20215y ago

sir, it is my-ass-que-el

stevesimmons5y ago· 4 in thread

Jupyter would be even better if it supported the seamless combination of Python and SQL code cells.

Does anyone have a solution to treating selected code cells as SQL - with SQL highlighting and tooltips - exposed as string variables to the Python code?

Sparkmagic [1] does part of this for Python/SQL/Spark interoperability, but as far as I recall, doesn't support SQL syntax highlighting.

[1] https://github.com/jupyter-incubator/sparkmagic

galacticdessert5y ago

Agree this would be awesome. I reckon you could obtain something similar with pandas with pd.read_sql("<query>"), but a dedicated SQL cell akin to markdown cells would make it much more pleasant

pletnes5y ago

There’s a %%bash cell magic which turns one cell into a bash cell. What about a %%xeus cell to run a SELECT (or any sql)? Would that work?

krmk5y ago

Like this: https://pypi.org/project/ipython-sql/ ?

3l3ktr4OP5y ago

andreareina5y ago· 2 in thread

The only example of how to use it is an animation that is just impossible to read :(

Here are the examples in the live demo[1]:

    %LOAD sqlite3 db=chinook.db timeout=2 shared_cache=true
    %XVEGA_PLOT X_FIELD EmployeeId Y_FIELD ReportsTo BIN TRUE MARK square WIDTH 100 HEIGHT 200 <> SELECT EmployeeId, ReportsTo FROM employees
    %XVEGA_PLOT X_FIELD EmployeeId Y_FIELD ReportsTo BIN MAXBINS 3 MARK bar COLOR red WIDTH 200 HEIGHT 200 <> SELECT EmployeeId, ReportsTo FROM employees
    %XVEGA_PLOT X_FIELD EmployeeId Y_FIELD ReportsTo TYPE ordinal MARK bar COLOR green WIDTH 200 HEIGHT 200 <> SELECT EmployeeId, ReportsTo FROM employees
    %XVEGA_PLOT X_FIELD Name TYPE nominal Y_FIELD ArtistId BIN MAXBINS 1 MARK line COLOR purple WIDTH 200 HEIGHT 200 <> SELECT Name, ArtistId FROM artists LIMIT 10
    %XVEGA_PLOT X_FIELD EmployeeId GRID false Y_FIELD ReportsTo MARK area COLOR pink WIDTH 200 HEIGHT 200 <> SELECT EmployeeId, ReportsTo FROM employees

[1] https://hub.gke2.mybinder.org/user/jupyter-xeus-xeus-sql-x64...

3l3ktr4OP5y ago

andreareina5y ago

Thanks! I'm a fan of notebook-based work and just interactive programming in general and even if I don't use Jupyter myself it's exciting to see innovation in the space.

My complaint is really more that for this sort of thing I feel examples should be front and center; Jupyter "sells" a better experience, so show me how easy and ergonomic it is to use!

2 more replies

aargh_aargh5y ago· 2 in thread

I followed the instructions and the sql cells work, but the xvega ones return empty output and I'm not sure how to debug that.

  conda create -n xeus-sql
  conda activate xeus-sql
  conda install xeus-sql soci-mysql soci-postgresql soci-sqlite jupyterlab -c conda-forge

In conda list I see, among others:

  xeus                      1.0.0                h78d96c3_0    conda-forge
  xeus-sql                  0.0.8                h118ccdd_1    conda-forge
  xvega                     0.0.10               h4bd325d_0    conda-forge
  xvega-bindings            0.0.10               h4bd325d_0    conda-forge

Btw there's a typo in the instructions: soci-postresql -> soci-postgresql

aargh_aargh5y ago

Also, for mysql and postgres, are only socket connections supported? No TCP connections?

https://xeus-sql.readthedocs.io/en/latest/PostgreSQL.html#li...

aargh_aargh5y ago

Answering my own question - works as expected, only the documentation is lacking.

%LOAD postgresql host=127.0.0.1 port=5432 dbname=DATABASE user=USERNAME password=PASSWORD

ktpsns5y ago· 1 in thread

3l3ktr4OP5y ago

thanks for the kind words! :)

realitysballs5y ago· 1 in thread

Does anyone know if can work with 32-bit ODBC drivers? My industry is behind the times and the only way to interact with our ERP is 32-bit ODBC

3l3ktr4OP5y ago

Vaslo5y ago· 1 in thread

Names almost every major sql engine except SQL Server

Says will work with any DB with an ODBC connection

Hmmmm

3l3ktr4OP5y ago

it’s depending directly of SOCI. I offer a wrapper around everything they do :)

fulafel5y ago

The Github support for notebooks is so nice (was linked from the example pic caption: https://github.com/wangfenjin/xeus-tidb/blob/develop/example...)

geebee5y ago

This looks very promising! I will look into it.

So, yeah, nice looking tool, and thanks for posting!

seddonm15y ago

We have used https://almond.sh/ to create a Spark SQL interpreter using Jupyter Notebooks - plus a whole lot more which you can see here: https://arc.tripl.ai/tutorial

jasmcole5y ago

Those interested in SQL notebooks can also check out https://count.co

j / k navigate · click thread line to collapse