At first I was suspicious but looking at the page source the links are there, so you may want to revisit that if it's an issue for others.
If anyone is curious about other tools in the same space, our data scientists use Dash[1] and plotly to build interactive exploration and visualization apps. We set up a Git repo that deploys their apps internally on every merge to master, so they're actually building and updating tools that our operations, marketing, and other teams use every day.
I've always looked on Dash as a bit of FOMO myself. If you have experience with Jupyter, could you contrast Dash vs Jupyter?
"Streamlit assigns each variable an up-to-date value given widget states."
This line is interesting because it implies distributed state in each component (widget). Alternatively this could be framed in centralized state manager terminology.
"Each widget is provided with the current state of the application, and that state is also available to your script."
If you adopt this mindset you can separate the concerns of state and presentation. At first glance it appears that you need to extract state from widgets at the same point as they are added to the page.
(Please correct me if I'm wrong.)
I might not want to have a widget added to a page until much later in the script, but I want to have access to its state at the top of the script.
The value of the top-level `props` parameter to a React component is that it gives you access to all state wherever you need it, and disentangles this state from the arrangement of the page.
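To make the centralized-state framing concrete, here is a minimal sketch in plain Python. It is not Streamlit's API: `AppState`, `render_slider`, and the `"threshold"` key are hypothetical names invented for illustration. The point is only that the script can read a value at the top while the widget that edits it is placed later.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AppState:
    """Hypothetical central store: one dict of widget values keyed by name."""
    values: Dict[str, Any] = field(default_factory=dict)

    def get(self, key: str, default: Any) -> Any:
        # Reading state does not require the widget to be on the page yet.
        return self.values.get(key, default)


def render_slider(state: AppState, key: str, default: int, page: List[str]) -> None:
    """Hypothetical widget: pure presentation, its value comes from the store."""
    page.append(f"slider[{key}]={state.get(key, default)}")


def run_script(state: AppState) -> List[str]:
    page: List[str] = []
    # State is available at the top of the script...
    threshold = state.get("threshold", 10)
    page.append(f"rows above {threshold}")
    # ...while the widget that edits it is added to the page much later.
    render_slider(state, "threshold", 10, page)
    return page


first_run = run_script(AppState())
second_run = run_script(AppState({"threshold": 42}))
```

In this sketch the page layout and the state lookup are independent, which is the separation of concerns described above.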
Ian:
Thanks for that comment. You're exactly right: Streamlit adopts a React-like model. In fact, the connection goes deeper than the post describes. For example, to make it efficient to run the same script repeatedly, Streamlit does packet-level deduplication. If you generate a lot of data and send it to the browser, only small deltas need to be sent to update the UI.
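As a rough illustration of the deduplication idea (a sketch only, not Streamlit's actual wire protocol), one could hash each UI element and resend the payload only when the hash has not been seen before. The function names and message shape below are assumptions made for the example.

```python
import hashlib
import json
from typing import Any, Dict, List, Set


def digest(element: Any) -> str:
    """Stable content hash of one UI element (assumed JSON-serializable)."""
    payload = json.dumps(element, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_delta(elements: List[Any], already_sent: Set[str]) -> List[Dict[str, Any]]:
    """One message per element: full data if new, otherwise only a hash reference."""
    messages = []
    for index, element in enumerate(elements):
        key = digest(element)
        if key in already_sent:
            # The browser already holds this content; send a small reference.
            messages.append({"index": index, "ref": key})
        else:
            already_sent.add(key)
            messages.append({"index": index, "ref": key, "data": element})
    return messages


sent: Set[str] = set()
# First run of the script: everything is new, so everything is sent in full.
run1 = build_delta([{"text": "title"}, {"table": list(range(1000))}], sent)
# Second run: only the title changed, so the large table is not resent.
run2 = build_delta([{"text": "new title"}, {"table": list(range(1000))}], sent)
```

Rerunning the whole script stays cheap here because unchanged elements cost one hash reference each instead of the full payload.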
We have a list of future blog posts we hope to write and one of them is (cheekily) called "Streamlit is React for Python." ;) (Not quite true, more of an imperfect analogy!)
So it made me really happy to see someone drawing that analogy already. Thank you. :)
https://gist.github.com/iandanforth/0ed987bfddf8205b8a23
I hope that could be a part of this framework in the future! (If it isn't already)
(Co-founder of Streamlit here)
https://news.ycombinator.com/item?id=21127528 https://news.ycombinator.com/item?id=21126477
It depends on how you would define "production scale".
If you're talking about hosting a publicly accessible Streamlit app on the internet, it's definitely possible but will require you to set up an appropriate infrastructure around it: sticky load balancer, replication, orchestration, etc.
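For readers unfamiliar with the "sticky" part: the idea is that requests from one session keep landing on the same replica. A minimal sketch, assuming hash-based affinity on a session ID (the backend addresses and `pick_backend` are hypothetical; real load balancers usually offer cookie or IP-hash affinity as a built-in setting):

```python
import hashlib
from typing import List


def pick_backend(session_id: str, backends: List[str]) -> str:
    """Map a session ID to one backend deterministically (hash-based stickiness)."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


BACKENDS = ["app-1:8501", "app-2:8501", "app-3:8501"]  # hypothetical replicas
first = pick_backend("session-abc", BACKENDS)
again = pick_backend("session-abc", BACKENDS)  # same session, same replica
```

Note that with plain modulo hashing, adding or removing a replica remaps most sessions, which is one reason production setups tend to rely on the balancer's own affinity features.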
If you're talking about hosting something for internal use by your company, very often just a simple machine serving your Streamlit app is more than enough.
That said, we're currently working on Streamlit For Teams, which is a paid offering that will make it trivial to deploy Streamlit apps for these use cases. If you're interested, you can sign up here: https://streamlit.io/forteams/
(Co-founder of Streamlit here)
Installing on my Mac to test this out was very straightforward:
cd /tmp
mkdir streamlit
cd streamlit
pipenv shell
pip install streamlit
Then I could play with the built-in demos by running: streamlit hello
So that was a slick intro - next step was I followed this tutorial: https://streamlit.io/docs/tutorial/create_a_data_explorer_ap...
And a few minutes later I had an interactive notebook-style interface for playing with Uber pickup data in New York.
This is a really interesting product.
I'm using WinPython 3.6 on Windows 7. I did "pip install streamlit" and then "streamlit hello", and had to allow it through the firewall, then got a 404 page.
The workaround is very simple, just use the provided http address and add "index.html":
http://localhost:8501/index.html
This link has more info: https://github.com/streamlit/streamlit/issues/244
I saw there is a Streamlit for teams in the future (sounds expensive) and on the forums they recommended to make a docker container and host it anywhere, which is doable, but I'd love a way to be able to just put something up on the internet for a short period of time, sort of how now.sh[1] works.
[1]: https://zeit.co/home
Also, caching is a great idea, but I would expect a lot of this logic to be managed on the server side. Or am I missing something, and is ML different here? I would expect to pipe as little data as possible back to the application, because I want the user to wait at most 3-4 seconds for the app to load at start.
Jupyter -> internal tool/API is pretty much the holy grail of bridging data scientists, business teams and engineering.
I hope this project doesn't die out. A lot of people would pay for this.
Also take a look at Streetscape.gl [2] which is designed for visualizing AV data
Disclaimer: I work on Uber's data vis team but not on AVs.
[0] https://streamlit.io/docs/api.html#streamlit.deck_gl_chart [1] https://deck.gl/#/documentation/deckgl-api-reference/layers/... [2] https://avs.auto/
I haven't checked yet but a question that comes to mind is how extensible is this framework. I can easily see how I'd want to make custom widgets.
Regarding extensibility, we totally agree: over time, many people are going to want to write their own custom widgets. Which is why we're actually in the early phases of designing a plugin system for Streamlit.
So stay tuned!
You should ask for telemetry permissions _before_ the process starts up (as you do for the email address), and keep the default as "No", instead of starting to send the data transparently unless the user takes non-user-friendly steps.
How would people compare this to Observable [1]? 1. Javascript vs. Python 2. Client-only vs. server-required?
Does the market already give advantage to Python and server-required because the data sets are too large and live on the server, and the users (data scientists) prefer Python and the existing libraries there?
I ended up converting my Python models to TensorFlow.js and creating an ad-hoc Vue.js app [0], but Streamlit could have been very beneficial here, especially if you can just put nginx in front of it and serve it to the masses.
More information: to prevent unexpected behavior, Streamlit tries to detect mutations in cached objects so it can alert the user if needed. However, something went wrong while performing this check.
Please file a bug... "
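For context on what that message is checking, here is a minimal sketch of mutation detection in a cache. It is an assumption about the general technique, not Streamlit's implementation: `checked_cache` and `CachedObjectMutatedError` are hypothetical names, and the sketch assumes cached values are picklable and arguments are hashable.

```python
import functools
import hashlib
import pickle
from typing import Any, Callable, Dict, Tuple


class CachedObjectMutatedError(Exception):
    """Hypothetical error: a cached value changed after it was stored."""


def _fingerprint(value: Any) -> str:
    # Assumes the value is picklable; the hash stands in for its contents.
    return hashlib.sha256(pickle.dumps(value)).hexdigest()


def checked_cache(func: Callable[..., Any]) -> Callable[..., Any]:
    store: Dict[Tuple[Any, ...], Tuple[Any, str]] = {}

    @functools.wraps(func)
    def wrapper(*args: Any) -> Any:
        if args in store:
            value, stored_hash = store[args]
            # On a cache hit, re-hash the stored object and compare.
            if _fingerprint(value) != stored_hash:
                raise CachedObjectMutatedError(
                    f"cached result of {func.__name__} was mutated"
                )
            return value
        value = func(*args)
        store[args] = (value, _fingerprint(value))
        return value

    return wrapper


@checked_cache
def load_rows(n: int) -> list:
    return list(range(n))


rows = load_rows(3)
same_rows = load_rows(3)  # cache hit, object unchanged
rows.append(99)           # mutate the cached object in place
try:
    load_rows(3)
    mutation_detected = False
except CachedObjectMutatedError:
    mutation_detected = True
```

A check like this can itself fail, for example on objects that cannot be hashed or pickled, which is plausibly the kind of situation the quoted message reports.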
requests_cache caches HTML requests into one SQLite database. [1] pandas-datareader can cache external data requests with requests-cache. [2]
dask.cache can do opportunistic caching (of 2GB of data). [3]
How does streamlit compare to jupyter voila dashboards (with widgets and callbacks)? They just launched a new separate github org for the project. [4] There's a gallery of voila dashboard examples. [5]
> Voila serves live Jupyter notebooks including Jupyter interactive widgets.
> Unlike the usual HTML-converted notebooks, each user connecting to the Voila tornado application gets a dedicated Jupyter kernel which can execute the callbacks to changes in Jupyter interactive widgets.
> - By default, voila disallows execute requests from the front-end, preventing execution of arbitrary code.
[1] https://github.com/reclosedev/requests-cache
[2] https://pandas-datareader.readthedocs.io/en/latest/cache.htm...
[3] https://docs.dask.org/en/latest/caching.html
[4] https://github.com/voila-dashboards/voila
[5] https://blog.jupyter.org/a-gallery-of-voil%C3%A0-examples-a2...
Access control and resource exhaustion are challenges with building any {Flask, framework_x,} app [from Jupyter notebooks]. First it's "HTTP Digest authentication should be enough for now"; then it's "let's use SSO and LDAP" (and review every release); then it's "why is it so sloww?". JupyterHub has authentication backends, spawners, and per-user-container/vm resource limits.
> Each user on your JupyterHub gets a slice of memory and CPU to use. There are two ways to specify how much users get to use: resource guarantees and resource limits. [6]
[6] https://zero-to-jupyterhub.readthedocs.io/en/latest/user-res...
Some notes re: voila and JupyterHub:
> The reason for having a single instance running voila only is to allow non JupyterHub users to have access to the dashboards. So without going through the Hub auth flow.
> What are the requirements in your case? Voila can be installed in the single user Docker image, so that each user can also use it on their own server (as a server extension for example). [7]
Can we use asyncio to update multiple charts simultaneously / at arbitrary intervals?
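Whether Streamlit supports this is exactly the open question, but the asyncio side of the pattern can be sketched in plain Python. Here `update_chart` is a hypothetical stand-in for "push a new point to a chart"; it only appends to a list.

```python
import asyncio
from typing import Dict, List


async def update_chart(name: str, interval: float, ticks: int,
                       log: Dict[str, List[int]]) -> None:
    """Hypothetical updater: appends a point to log[name] every `interval` seconds."""
    for tick in range(ticks):
        await asyncio.sleep(interval)
        log[name].append(tick)


async def main() -> Dict[str, List[int]]:
    log: Dict[str, List[int]] = {"fast": [], "slow": []}
    # Both updaters run concurrently, each on its own interval.
    await asyncio.gather(
        update_chart("fast", 0.01, 5, log),
        update_chart("slow", 0.03, 2, log),
    )
    return log


chart_log = asyncio.run(main())
```

The two coroutines interleave on a single thread, so each chart gets its own update cadence without blocking the other.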
Wouldn't it be better if Jupyter absorbed this API for its dashboards?