How is the python-ocaml interop story? To be clear, any language that does not have first-class interop with python is basically dead in the water (at least for our case).
That said, complicated Python systems can be improved a lot by adding type annotations. That's more of a solution for web servers and other "easily type-able" applications. Typing support for scientific computing isn't quite there yet. So it depends on what kinds of systems are the complicated ones.
For that reason, Julia is being closely watched, but so far we are not thinking of pulling the trigger.
Really a remarkable feat of engineering. Here's its author giving a talk: https://www.youtube.com/watch?v=vQPW16_jixs
CL could also be great language-wise (https://digikar99.github.io/py4cl2/, https://github.com/snmsts/burgled-batteries3) but I don't know how good the interop is in reality since I haven't tried it.
https://signalsandthreads.com/python-ocaml-and-machine-learn...
It's a dynamic, garbage collected language. It's easy to pick up and get going with. As a functional programming language there isn't a lot to learn in the way of language constructs, and you don't even have to do the 'wrestling with the type system' thing that you have to do in compiled functional languages like OCaml or Haskell (much as you do in Rust).
Its processing 'horsepower' is probably comparable to Python, but it's much better for building low latency things if you want to run something in a bit more of a production use case. This is also improving due to the recent addition of a JIT.
The addition of NX is making Elixir an increasingly interesting place to do ML - write Elixir, have it run on GPU etc. See https://dashbit.co/blog/nx-numerical-elixir-is-now-publicly-...
Python integration is probably best done using the Erlang 'port' system - running Python as a managed process and communicating with it using messages over stdin/stdout. I use it for C interop and it works well (and fits well with the Elixir/Erlang process model). It's not difficult to roll your own in Python e.g. https://github.com/fujimisakari/erlang-port-with-python/blob... or look at something like http://erlport.org/
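For anyone curious what the Python side of such a port can look like, here is a minimal sketch. It assumes the Erlang/Elixir side opens the port with the `{packet, 4}` option, so every message carries a 4-byte big-endian length prefix. The function names and the upper-casing "protocol" are made up for illustration and are not taken from erlport or the linked repo.

```python
import struct
from typing import BinaryIO, Optional


def read_message(stream: BinaryIO) -> Optional[bytes]:
    """Read one length-prefixed message; return None on end of input."""
    header = stream.read(4)
    if len(header) < 4:
        return None  # the other side closed the port
    (length,) = struct.unpack(">I", header)  # 4-byte big-endian length
    payload = stream.read(length)
    if len(payload) < length:
        return None  # truncated message, treat as end of input
    return payload


def write_message(stream: BinaryIO, payload: bytes) -> None:
    """Write one message with a 4-byte big-endian length prefix."""
    stream.write(struct.pack(">I", len(payload)) + payload)
    stream.flush()


def serve(inp: BinaryIO, out: BinaryIO) -> None:
    """Hypothetical request loop: reply to each message with its upper-cased bytes."""
    while True:
        request = read_message(inp)
        if request is None:
            break
        write_message(out, request.upper())
```

When the script is launched as a port, the loop would be started with `serve(sys.stdin.buffer, sys.stdout.buffer)`. Passing the streams in explicitly also makes it easy to exercise with in-memory buffers.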
The main use case for a language other than python is a more robust codebase but also performance. We need to be able to efficiently ship lots of large arrays between the languages and the Rust-Python interop supports zero copy arrays for example.
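To make "zero copy" concrete, here is a small standard-library sketch. It is not the Rust binding itself; it only shows the difference between sharing a buffer and copying it, using Python's buffer protocol via `memoryview`. The variable names are just for illustration.

```python
from array import array

# A block of 64-bit floats owned by one component.
samples = array("d", [1.0, 2.0, 3.0, 4.0])

# memoryview exposes the same memory through the buffer protocol; no copy is made.
view = memoryview(samples)
window = view[1:3]            # slicing a memoryview is also copy-free

window[0] = 20.0              # write through the view...
shared = samples[1] == 20.0   # ...and the owner sees the change

copied = samples[1:3]         # slicing the array itself makes a copy
copied[0] = -1.0              # so this write does not reach `samples`
```

With large arrays, the copying variant is the cost you pay on every language boundary crossing, which is why interop that can hand over a view instead matters so much here.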
Typing does help, agreed.
def calc_xxx(df: pd.DataFrame) -> pd.DataFrame: ...
type...Sometimes that is great. Other times, that will be very hard and error-prone.
Once you have to think of types and lifetimes, a lot of the productivity goes down the drain.
99% of the stuff you do in research ends up on the cutting room floor because it doesn't work. The 1% that ends up being useful is the only part worth productionizing.
I challenge you: A lack of understanding about the data lifetimes in a program means lack of understanding about the data.
Not saying you can't have a lot of short-lived data items that you don't want to manage one-by-one. I'm saying that for the vast majority of data items, one should be able to give a reasonably well defined lifetime upper bound. So a good solution is to make a few boxes that group items by lifetime. And from time to time, throw the outdated boxes away.
And of the few items that don't have such an upper bound at creation time, many can be created in a special box that allows migrating boxes later when required.
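A rough sketch of the "boxes" idea, which resembles what is often called arena or region allocation. In Python the garbage collector reclaims memory anyway, so this only models the bookkeeping; the `Box` class and its method names are invented for illustration.

```python
from typing import Any, List


class Box:
    """Groups objects that share one lifetime upper bound."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.items: List[Any] = []

    def add(self, item: Any) -> Any:
        """Create-time decision: this item lives at most as long as the box."""
        self.items.append(item)
        return item

    def migrate(self, item: Any, target: "Box") -> None:
        """Move an item whose lifetime turned out to be longer than expected."""
        self.items.remove(item)
        target.items.append(item)

    def discard(self) -> None:
        """Throw the whole box away at once."""
        self.items.clear()


session = Box("session")   # long-lived
request = Box("request")   # short-lived

parsed = request.add({"query": "x"})
cached = request.add([1, 2, 3])

request.migrate(cached, session)  # this one needs to outlive the request
request.discard()                 # everything else goes in one step
```

The point is that no item is managed one-by-one: the only decisions are which box an item starts in and, rarely, whether it has to move.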
But this argument can extend forever.
Is your program precisely dependently typed? If not is that a lack of understanding about the nature of the data as well and should you challenge yourself to fix that?
You have to trade-off how much you specify things with how valuable it is to get the result more quickly.
You're just not going to get this buy-in from people who want to use a tool to get their work done.
Imagine you are trying to establish whether there's a relationship between timeseries X and timeseries Y. You just want a tool that allows you to quickly calculate some summary statistics of these timeseries, clean them, convince yourself that they behave according to your expectations and then run some form of regression.
Nowhere in this process do you care about lifetimes. It's literally irrelevant. In fact, as long as all your work fits into memory, you don't even care about memory management. Your objective is to answer the primary question, everything else is a costly distraction.
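For a sense of how little ceremony that workflow needs, here is a minimal standard-library sketch of the clean-then-regress steps. The function names and the rule for what counts as a bad observation are assumptions for illustration; real work would more likely use pandas or statsmodels.

```python
import math
from statistics import fmean
from typing import List, Optional, Sequence, Tuple


def clean(
    xs: Sequence[Optional[float]], ys: Sequence[Optional[float]]
) -> Tuple[List[float], List[float]]:
    """Keep only positions where both series have a usable value."""
    pairs = [
        (x, y)
        for x, y in zip(xs, ys)
        if x is not None and y is not None
        and not (math.isnan(x) or math.isnan(y))
    ]
    return [x for x, _ in pairs], [y for _, y in pairs]


def ols(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = slope * x + intercept."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


x_raw = [1.0, 2.0, None, 4.0, float("nan")]
y_raw = [3.0, 5.0, 7.0, 9.0, 11.0]
x, y = clean(x_raw, y_raw)
slope, intercept = ols(x, y)
```

Nothing in there mentions ownership or memory, which is exactly the property being argued for in the exploratory phase.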
The 1% of ideas that ends up being worthwhile is what gets productionized and needs to be robust. But obviously rewriting everything from language A to radically different language B adds its own headaches.
It is probably one of the most underrated programming languages. The perfect marriage between state of the art functional programming and pragmatism. A great static and strong type system. Solid performance and an insanely fast compiler. Also compiles to JS if you need that.
Multicore support will make it quite perfect. The only thing holding it back more than that, and the reason I have not done many projects with it, is its weirdly fragmented ecosystem.
Having to decide which standard library to use is a pain but you can cope with that. Tooling is getting there but stuff like automatic code formatting solutions are still pretty immature (and have really weird defaults).
On the frontend there is that ReasonML/Reason/ReScript thing that Facebook is trying to do. It offers an alternative syntax, but nearly nobody uses it because they have already changed the name, and I think also the syntax, three times. So it is all a mess.
Don't let that stop you though. There are some pretty solid, mature libraries in OCaml and, if need be, the interop story with C and other languages is solid.
I wonder if that's precisely why people use it. I've been thinking about it, and I think people using OCaml value independence a lot. That's something that doesn't help building a community, since communities often thrive on consensus. As an example of that in the linked thread: Yaron Minsky's second comment about Flambda 2, which I'll copy here:
> And, I should add: Jane Street’s intent to upstream our work is not the same as upstream’s intent to accept it. None of what I’ve said is an announcement on behalf of the core OCaml team, nor am I in any position to make such an announcement!
This comment, to me, speaks volumes in terms of respect for the independence of the OCaml team. And independence seems to be something Jane Street values a lot too. They have lots of libraries that they freely share with other people. If you want to use, in a way, their "flavor of OCaml", you're free to do so. And if you don't want to, you're free to do something else.
You can see the same thing with JSOO, ReasonML/Reason/ReScript and now Melange. You're free to pick what you want. Same thing with the multicore. You want to use it? Great! You don't want to? They are working hard to make sure your code will still work and won't suffer too many performance regressions.
It may be a bit weird if you're used to other communities. I know it took me a long time to understand why things are this way, and I may still be completely wrong. But I think the angle of valuing independence explains a lot, and is also a good way to know if it's a language and ecosystem for you or not.
Another thing that may not help: the book "Le langage Caml" is a great introduction to the language and programming, but sadly it's not translated.
Maybe you are referring to the Async/Lwt dichotomy? Hopefully with multicore (and basic support for "effects" that are going to be merged in OCaml 5.0) this will become less of an issue going forward. Since the runtime is becoming considerably more capable, I expect there to be less "real" fragmentation going forward as the libraries begin to use more of the primitives provided by the runtime rather than building their own from scratch.
But then again, fragmentation is a way of life in other ecosystems too. Haskell has an ever increasing number of effect systems and preludes, Rust has many async runtimes also (async-std, tokio etc.). Fragmentation can often mean a time of competition and vitality as different approaches duke it out.
Regarding syntax -- I feel too much time has been spent in the OCaml ecosystem on surface syntax. OCaml syntax has its flaws but syntax is really a small aspect of the overall art of programming. The OCaml format is here to stay -- even if it is a bit wonky. Once you commit to it you can begin to worry about more substantial things. The ReasonML community brought in the new syntax, but with the departure of Rescript (Bucklescript) from the OCaml community I expect the usage of the new javascript-y syntax to decrease.
(If I may be heretical, I actually prefer the traditional OCaml syntax! ReasonML tries to be like JavaScript with the braces and so forth. I prefer the Haskell/OCaml syntax to the JavaScript/Rust/C/Scala brace syntax. Interestingly, Scala 3 allows a braceless style in an effort to match Python perhaps. Fashion changes. Algorithms and programming patterns endure. We shouldn't worry about the syntax so much -- as long as it is not APL ;-) ! )
I wasn't aware anything like that is happening, is it? Rescript departed from Reason, that's all, right? They want to focus solely on js target, because... that's what rescript is. New stuff they're doing looks really good.
It honestly seems like a great lang, and I hope I get to try it out for a project sometime soon.
"Nobody" uses ReasonML / Reason.
Plenty of people use ReScript.
I wasn't a big fan of the ReScript split, but I think by now it's unfair to speak of these as if they're one community with a confusing story. I think it's very fair to say it's now two separate communities: ReScript and OCaml. It was very confusing for a while, but by now it actually is much easier to understand than before the ReScript split:
ReScript is really its own language now. The compiler for that language just happens to still understand OCaml syntax, for now. The ReScript language and community is focused on the JS ecosystem, with readable JS output.
OCaml has js_of_ocaml. JSOO compiles OCaml to JS, so it's focused on the OCaml ecosystem. The JS output is not readable but you can build "any" OCaml program.
Really that's the main story -- not so hard to grasp?
There is also melange, but that's a relatively new effort in the OCaml community (attracting OCaml-y refugees from ReScript) whose status I haven't formed a view on yet. The idea is to compile OCaml programs to readable JS. Reason used to do that, but Reason now only has a tiny community (and I believe it now uses JSOO?). ReScript still does that, but using it for that purpose is no longer supported.
I think another important point is that the compiler is a fork of the OCaml compiler. That means that to contribute/maintain the compiler, you need to know OCaml. This is probably going to stay this way for a very long time, since the speed of the compiler is important.
> Really that's the main story -- not so hard to grasp?
What's not helping is that the people around Reason never really said "it's dead, move on to OCaml or Rescript". The pages for things like Reason Native, Esy, ReasonML are still up.
OCaml has an academic flavor -- maybe it's not as academic as Haskell but it moves in similar ways. There is a desire to be correct and have a theoretical framework instead of amassing a ton of language features. OCaml is the foundation for Coq and other interesting compilers, type checkers and theorem provers. Over the years, the language has grown more mainstream and you can build a decent web backend on it today, for instance.
So fine, maybe multicore won't change adoption of the language in significant ways. But I foresee that the introduction of multicore will allow some amazing software to be written in OCaml in the future. Software that is truly groundbreaking and innovative. Take the example of Coq itself -- it is an important foundational software today in Computer Science. Multicore will allow Coq to potentially speed itself up and that will bring more real world applications in the ambit of Coq.
> Having said that, Ocaml compiler is one of the greatest miracles in PL when it comes to speed vs complexity of the language. Scala/Haskell/TS are not even close.
Someone will probably come correct me, but what I've heard is that the compilation speed partially comes from the Pascal/Modula-3 influence, since Niklaus Wirth took compilation time into account when designing programming languages. From what I understand, OCaml doesn't allow circular dependencies outside of a single file, and that helps. Go doesn't allow them either, and is also known for its compilation speed.
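One way to see why forbidding cycles helps: an acyclic dependency graph always has a linear build order, so each unit can be compiled once, after everything it depends on. The sketch below uses Python's `graphlib` to show that; the module names are made up, and this is not a description of how the OCaml or Go toolchains are actually implemented.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def build_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every module comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def has_cycle(deps: Dict[str, Set[str]]) -> bool:
    """True if no such order exists because modules depend on each other."""
    try:
        build_order(deps)
    except CycleError:
        return True
    return False


# Hypothetical project: each key depends on the modules in its set.
acyclic = {"main": {"parser", "lexer"}, "parser": {"lexer"}, "lexer": set()}
cyclic = {"a": {"b"}, "b": {"a"}}

order = build_order(acyclic)
```

With a cycle there is no such order, and a compiler has to process the mutually dependent units together, which is the case that gets confined to a single file.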
You mention that you write a lot of Scala for a living -- just as a friendly (and intended to be a light hearted) riposte, some aspects of Scala strike me as "long in the tooth" too. With Scala 3 the language has done an admirable job to modernize but I find:
- The language feels heavy and (unnecessarily) "enterprise-y" -- reminiscent of the early 2000s rather than 2021
- The JVM is capable and performant, no doubt, but adds another heavy-weight and monolithic feel to the Scala platform. (Scala native likely to be essentially minuscule for years to come)
- The language veers towards a C++ style "I will have every PL feature." Sometimes less is more
- A Scala IDE (metals or JetBrains) feels clunky. sbt is over engineered and slow and given how important it is to Scala, does not give a good overall impression of the Scala platform
- Some questionable language features like implicits remind me of magic in Ruby (implicits are addressed in Scala 3 but I wonder how many years the ecosystem will have to deal with their complications -- forever??)
- The JVM seems to let down Scala in other places. Example (a) Null is rarely used in Scala but it could still pop-up in weird situations and not always because of Java interop. (Scala 3 tries to fix this via "explicit nulls" but there are compromises with that feature also). (b) A Functional style Scala (Cats and others) is popular. But true functional style has a lot of recursion. This, according to me, requires proper tail call support in the runtime which the JVM will never have. The Scala compiler tries to be smart but I wonder if it is able to deal with tail calls without blowing the stack in _all_ situations. In other words, it is difficult to do a "Haskell" on the JVM -- which we can see in a lot of places in the Scala ecosystem.
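On the tail call point: a common workaround on runtimes without tail-call elimination is trampolining, where each "recursive call" returns a thunk that a flat loop then runs. Python has no tail-call elimination either, so it works as a stand-in for a sketch. The names below are invented, and this is only an illustration of the general technique, not of any specific Scala library.

```python
from typing import Callable, Union

# A step is either a final int or a zero-argument function producing the next step.
Step = Union[int, Callable[[], "Step"]]


def sum_to(n: int, acc: int = 0) -> Step:
    """Tail-recursive sum of 1..n that returns a thunk instead of recursing."""
    if n == 0:
        return acc
    return lambda: sum_to(n - 1, acc + n)


def trampoline(step: Step) -> int:
    """Run thunks in a loop so the call stack stays flat."""
    while callable(step):
        step = step()
    return step


total = trampoline(sum_to(100_000))
```

It works, but every step allocates a closure and the code has to be written in this style throughout, which is roughly the kind of compromise being described.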
(BTW, I have pointed out some flaws of Scala but notwithstanding my criticism, Scala has got many good features that make it worthwhile. I may use it for a future project, let's see...)
> Having said that, Ocaml compiler is one of the greatest miracles in PL when it comes to speed vs complexity of the language.
I totally agree with the statement. It's a very balanced language in all important parameters: a high level of programming abstraction is possible, the LSP language server is responsive, the dune build system is great, compile times are really minuscule and run-time performance is great for a garbage collected language.
Do you still feel that way with Scala 3? From what I understood, the work on the DOT calculus helped reduce and simplify the core of the language.
But to me it seemed packaged like many languages in the days of yore, when a language shipped simply as a compiler, and nothing more. The way of the world today seems to me to be a compiler, together with a complete standard library and a consistent packaging system.
My experience with OCaml was thwarted repeatedly by a byzantine exploration process of packages depending on other packages, which required other packaging systems. Once I reached that point where it felt like I was spending more time figuring out the complex ecosystem, rather than writing code, I rapidly lost interest.
And perhaps such a point comes in exploring any new language. But it came much too early for me in OCaml. I had so much more I wanted to learn, but couldn't. I am hopeful for the new release. Thank you for your efforts, OCaml team.
I didn't have the experience you described (not yet anyway!).
"Never have I took so long, to write so little code, that does so much"
OCaml can be a big learning curve, but I urge you to push through. The syntax might not be everyone's cup of tea, but you get used to it quickly.
There's too little of it! OCaml seems to take a "you don't need syntax except when you need syntax" approach, which I found very destabilizing. One of the major online OCaml tutorials said something like "If it doesn't work the way you expect, try adding parentheses", and I thought "Oh hell no. In a Lisp I know exactly how many parentheses I need: all of them". I prefer not having to think about it, and letting the parentheses become invisible to me.
But otherwise I have a deep and irrational fondness for the language, and still wish I'd been able to make it stick.
That sounds like what I did with C++ with * and & when I didn't understand them. Do you think it's a lack of experience/comprehension on your part, or that some parts of the syntax are fundamentally flawed?
Just to call out expectation-setting here in the comments: yes, the MVP of multicore will ship in OCaml 5.0, but OCaml 5.0 will ship no sooner than March 2022 (and very likely some point later, based on how challenging it appears to be to integrate the large-scale changes for multicore).
Python lets me start programming quickly. ML lets me finish quickly.