Our GitHub and S3/GCS Components (see them in action in my launch post) are actually just thin layers that execute code from other notebooks, and we plan to let users create custom components like these in the future.
Nextjournal is a computational notebook platform, and our goal is to make computation more accessible and automatically reproducible, so it becomes easier to collaborate and build on top of each other's work.
If you'd like to know more, check out our launch blog post at https://nextjournal.com/mk/public-beta or sign up and give it a try!
I just had a very brief look out of sheer curiosity, so please take this quick feedback with a grain of salt: the running times of your Python notebook get longer and longer with each print() statement and cell. While the reproducibility of your Python notebook is a wonderful thing to have, I think the performance decrease is a very strong downside.
Cheers -
Is that something you would be looking at adding?
We also have a proof-of-concept PR where you can implement a runtime in a notebook.
I think we'll expand the available languages as soon as we're confident the core product is really solid.
So, Jupyterhub and manual tinkering to get such polish for now.
We also definitely want to open source parts of our product but we haven't figured out what parts (or everything) and under what license.
Our priority is currently on providing a useful hosted product and becoming sustainable. It's certainly also interesting to see how e.g. Metabase is doing it the other way around, open source first without a hosted product, but I guess I'm a bit scared of not being ready to develop Nextjournal in the open at this point, in terms of bandwidth and keeping things backwards compatible.
From my past experience, there are a lot of enterprises that are rightfully scared to let their employees use such a service and open up data, regardless of what promises a SaaS company makes. There is a general assumption that if the SaaS company fucks up, all we get is a "we take our security very seriously...." blog post.
So, making your product work within a corporate network without "call home" is a great advantage and immediately expands your target audience with some potentially big pockets.
You might want to check out VizierDB (a project my group is working on). It's a self-hosted multi-language notebook with versioning, branching, and snapshots; as well as a spreadsheet-like editor and provenance-based data annotations.
How is Nextjournal different from Jupyter or Google Colaboratory?
We do support importing Jupyter notebooks and running Jupyter kernels, but we also have our own runtime protocol.
In Jupyter (and hence in Colaboratory) you normally have one runtime that's running both your server code and the user code. In Nextjournal there's a separate application called the Runner that orchestrates the runtimes, which are currently Docker images.
This allows us to use Nextjournal notebooks to do any kind of installations without the need for a full Jupyter kernel inside the image, something that gets tricky in Jupyter. Once we have a bash shell inside the image, we can do installations.
You can choose to commit the filesystem state at any time as a Docker image and reuse it in other notebooks. This is actually how our default environment images are built: our default Python environment https://nextjournal.com/nextjournal/python-environment is built on top of the minimal bash environment https://nextjournal.com/nextjournal/bash-environment, which just imports a stock Ubuntu image.
Our system takes care of referencing only the image SHAs everywhere, so everything is immutable and you can't accidentally overwrite anything.
You can also pull those Docker images and use them locally.
Any data you upload or results you save (just write to a /results folder) is put into content-addressed storage, so same thing here, you'll never accidentally overwrite a file.
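To make the content-addressing idea concrete, here is a minimal sketch in Python. It is not Nextjournal's implementation: the ContentStore class, its in-memory dict, and the choice of SHA-256 are all assumptions made for illustration. The point is only that the key is derived from the bytes, so writing different content can never replace an existing entry.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store (hypothetical, for illustration only).

    A blob's key is the SHA-256 hex digest of its bytes, so identical
    content always maps to the same key and different content never
    overwrites an existing entry.
    """

    def __init__(self):
        self._blobs = {}  # digest -> bytes

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Storing the same bytes twice is a no-op: the key already exists.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentStore()
v1 = store.put(b"accuracy=0.91\n")  # first version of a result file
v2 = store.put(b"accuracy=0.93\n")  # a later version gets a different key
# Both versions remain retrievable under their own keys.
print(store.get(v1), store.get(v2))
```

A notebook version then only needs to remember the keys it used, which is what makes old results stay reachable after new ones are written.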
Lastly, the document is stored in the database (Datomic) and you can restore any previous state.
Leveraging immutability at all layers of the stack is what enables our "remix" feature, that is, the ability to quickly and cheaply clone any published notebook and continue where another person left off.
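As a rough illustration of why immutability makes cloning cheap, here is a small Python sketch. The Snapshot and Notebook types and the edit and remix functions are hypothetical names invented for this example, not Nextjournal's data model: a remix just points a new notebook at an existing immutable snapshot, and later edits create new snapshots without touching the original.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Snapshot:
    """Immutable notebook state (hypothetical model for illustration)."""
    cells: Tuple[str, ...]
    parent: Optional["Snapshot"] = None


@dataclass
class Notebook:
    owner: str
    head: Snapshot  # the only mutable part: which snapshot is current


def edit(snapshot: Snapshot, index: int, source: str) -> Snapshot:
    """Return a new snapshot with one cell replaced; the old one is untouched."""
    cells = list(snapshot.cells)
    cells[index] = source
    return Snapshot(tuple(cells), parent=snapshot)


def remix(notebook: Notebook, new_owner: str) -> Notebook:
    """Clone by reference: nothing is copied, both notebooks share the snapshot."""
    return Notebook(owner=new_owner, head=notebook.head)


original = Notebook("alice", Snapshot(("import math", "print(math.pi)")))
fork = remix(original, "bob")
fork.head = edit(fork.head, 1, "print(math.e)")
# The original notebook still points at its own, unchanged snapshot.
print(original.head.cells[1], "|", fork.head.cells[1])
```

Restoring a previous state falls out of the same structure: following the parent links walks back through the history.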
Just saying "much more" or "fully" doesn't help much. Try removing all the adjectives from your marketing copy to see if it's actually communicating anything. (Then edit, then add some back :-)). Also, most of the features on this page are things you get with Jupyter or Colab; address what is actually different, like you do here.
In the end, the only one we found that delivered on the promises of reproducibility and managing the entire data science life cycle end to end, facilitating collaboration, and getting stuff done was Domino Datalab[1].
Can you compare and contrast Nextjournal to DD? Better yet, do you feel you're competing in the same areas, or are you really more focused just on reproducibility? Even if you're not now, all these types of products seem to converge to this state eventually, just by nature of the sales process and promising more and more features to customers.
Regardless, it looks really solid, so best of luck!
While data science is an obvious use case of literate programming, it's not the only one. I see the fundamental problem that needs to be addressed as one of dependency management. We address this today using Docker. In the future we plan to use a more functional approach, most likely based on Nix or Guix. This more principled approach should address both reproducibility and usability (by allowing images to be composed and by providing much better install times thanks to binary caching).
I haven't really used Domino Datalab, but I'm not sure if they allow for the installation of arbitrary system libraries and packages like we do. Check out some of our machine learning samples which run on GPUs: https://nextjournal.com/collection/machine-learning
In the future we also plan to allow in-browser JavaScript execution. This is currently hidden behind a feature flag, but we still have an article that uses it at https://nextjournal.com/dubroy/ohm-parsing-made-easy
The biggest thing that I've seen that is missing from almost all of them is a robust data ingestion and transformation engine. THAT'S what I'm interested in seeing.
We market towards Data Science because our feature set (automatic versioning, reproducibility, collaboration, etc.) applies very well to many pain points currently existing in the field. But Nextjournal is not limited to Data Science. We designed it as a general-purpose literate programming environment that should be able to address many different use cases: generative art, cloud APIs, spreadsheet-like applications, molecular dynamics simulations, you name it.
The way we want to achieve this is by allowing people to extend Nextjournal eventually, by bringing their own languages and by implementing their own components that can be used in a notebook and shared with other users (e.g. a spreadsheet component or a task board component). We are already building some parts of Nextjournal with Nextjournal, like our component for cloning GitHub repositories into a notebook. We think this will eventually make the platform, as a whole, much more understandable and learnable and will give our users much more agency in what they want to accomplish.
The only issue I encountered was that adding comments after the final close parens in the code sections creates EOF errors.
especially pricing per resource (it's not clear from the website)
If you sign up for the paid plan, which is $99 per researcher per month, you can provision more powerful machines – basically anything that Google Cloud offers.
We currently don't enforce any storage limits.
This is our first iteration of pricing though so I'm pretty sure this will still change over time. We've gotten a lot of feedback from people asking for a cheaper plan.
What most people don't realise, however, is that you can use most of the features (including private drafts) as it stands now for free. We've also been debating whether we should allow private drafts on the free plan or take a stance on what open science really means (working in the open from the start), but decided against this for now.
Curious to hear what others think about this. Do you expect drafts to be private and would it be a violation of those expectations if they were not?