I use Python and Scala. I use Python for mostly small tasks. When I hit large data, I normally use Spark on EMR (PySpark or Scala).
Clojure is actually one of the most actively used languages for data analysis. Many industries use it for that purpose. Heck, even the widely used Metabase is written in Clojure.
Using Scala as a functional language has been a fun journey for me. I still code imperatively when solving problems, but when I get a chance to refactor my solutions into a functional style, boy, my mind goes Boom!
I do hope I get to use Clojure at some point.
Eventually, it may be a good idea to try both Clojure and Python.
Personally I find Clojure's approach towards data very refreshing. It does require an open mind and a mindset different than usual. Eventually, this can bring joy, simplicity and power.
This article by Chris Nuernberger nicely explains what it is about: https://cljdoc.org/d/cnuernber/libpython-clj/1.2/doc/so-many...
Clojure's community is certainly smaller than Python's, but some say it is very friendly.
Below are some beginner-friendly places to chat about it. If you wish, let us chat there, dive into the details, and think how you could begin exploring.
Clojurians Zulip https://clojurians.zulipchat.com and especially the data-science stream: https://clojurians.zulipchat.com/#narrow/stream/151924-data-...
Clojureverse https://clojureverse.org
We will assume some basic knowledge of Clojure, but I guess it may be interesting to join anyway.
Congratulations on your new role. Are you joining a team, or are you the team? If you're joining a team, then you'll probably use what they're using and learn their tooling before you can try to improve it.
You're doing it in a professional context, so it will be Python. Many blog posts and articles on popular medium websites address shiny new things, but most of these posts address one or more of a few scenarios: portfolio/toy projects, a project with one individual working on it, a project with data that fits on disk and in RAM, and/or a Kaggle project where a good part of the heavy lifting has been done for you (data acquisition, cleaning, feature engineering, metric identification). That never happens in real life, because that heavy lifting is what you're hired for in the first place.
A big problem in this field is the fragmented tooling and experience, which means you have to weave tools together, unless the team you're joining has it figured out and has its internal tooling dialed in. Python dominates. I'm sure other languages are used at other ML shops (we have used Scala in some of our projects), but I think in your situation there's no need to complicate things.
Then again, that is just an opinion. It is not the right answer. The goal is to deliver value.
All the best,
In that case I would pick between Python and R. R might even win out slightly over Python for your use case. Definitely not Clojure, Scala or even Julia.
If you're working in an environment where there's a lot of collaboration, Clojure might be tough. But if you're actually going to be developing software that relies on data analysis (rather than just doing it as a one-off), I think Clojure might be worth considering.
Clojure does have notebook solutions which are worth looking into:
* https://github.com/clojupyter/clojupyter
* https://github.com/jsa-aerial/saite
* https://pink-gorilla.github.io (WIP, but is growing fast and going to be magnificent)
Some other projects are trying to connect the notebook idea with the REPL+editor experience. For example:
* https://github.com/metasoarous/oz
Clojure taught me a lot about infinite lazy sequences (kinda like Python's generators) and how to model a program as a pipeline. A good analogy comes from shell programming. There you have stand-alone programs which handle individual tasks, and you can pipe the previous program's stdout into the next program's stdin. In Clojure you'd write stand-alone functions which you "pipe" together via the "->" thread-first and "->>" thread-last macros. It also ships with several handy functions such as "frequencies", "group-by" and "partition-by". I have ported these and several others to my own Python projects thanks to their versatility and a kind of universality.
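To make the pipeline idea concrete, here is a rough Python sketch of what such ports might look like. It is not the actual code from my projects, and the names (thread_last, frequencies, group_by, partition_by) are just illustrative. Note that thread_last is a plain function that chains one-argument callables, so it only approximates what the "->>" macro does in Clojure.

```python
from collections import Counter, defaultdict
from functools import reduce
from itertools import count, groupby, islice


def thread_last(value, *steps):
    """Feed value through each one-argument step in order (loosely like ->>)."""
    return reduce(lambda acc, step: step(acc), steps, value)


def frequencies(items):
    """Map each distinct item to the number of times it occurs."""
    return dict(Counter(items))


def group_by(key, items):
    """Map each key(item) to the list of items that produced that key."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)


def partition_by(key, items):
    """Lazily split items into runs of consecutive items sharing key(item)."""
    for _, run in groupby(items, key=key):
        yield list(run)


# An infinite lazy sequence: nothing is computed until a consumer asks for it.
squares = (n * n for n in count(1))

first_evens = thread_last(
    squares,
    lambda xs: filter(lambda x: x % 2 == 0, xs),  # still lazy
    lambda xs: islice(xs, 3),                     # take only three items
    list,                                         # realize the result
)

print(first_evens)                                # [4, 16, 36]
print(frequencies("abca"))                        # {'a': 2, 'b': 1, 'c': 1}
print(group_by(len, ["a", "bb", "cc"]))           # {1: ['a'], 2: ['bb', 'cc']}
print(list(partition_by(lambda x: x % 2, [1, 3, 2, 4, 5])))  # [[1, 3], [2, 4], [5]]
```

Each step only sees the output of the previous one, just like programs in a shell pipe, and because the generator is lazy the infinite sequence is never fully materialized.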
Oh, and speaking of macros, if you want to get fancy you can design your own domain-specific language and express your problem in that, hiding all of the boilerplate under the hood. But to get the highest performance you sometimes need to think about whether to use Clojure's immutable data structures or resort to Java's mutable ones, which could have better performance (or use a library, I guess). Well, at least on the JVM you can do "real" parallel programming, unlike on the CPython interpreter due to the GIL.
Clojure is fun and very educational for all kinds of projects, but in a professional data analysis setting I'd start with Python, and if it seems like a bad fit then do a PoC with Clojure. :)
What a huge topic.
If you want to get things done: Python. You’ll have no problem getting up to speed based on your past experience, and the ecosystem is orders of magnitude larger than Clojure's.
Once you're comfortable with it, it might be worth exploring other languages that are less well known but have a (subjectively) better software design.