[1] https://arxiv.org/abs/1710.09756 [2] https://www.youtube.com/watch?v=t0mhvd3-60Y
The explicit purpose of Haskell is to be a basis for research into functional language design (edit: among other purposes). By "explicit purpose" I mean exactly that... people got together in 1987 to come up with a language for research. Haskell was never supposed to ossify into some kind of "finished product", it was built exactly so people could experiment with things like linear types. If you want to just write libraries and get stuff done with a more or less fixed language, you probably want to be writing OCaml.
I mean, just look at the list of GHC extensions... there are something like a hundred of them! The list is growing longer every year. https://downloads.haskell.org/~ghc/latest/docs/html/users_gu...
I think Haskell does have a good model for bringing together practical application of theoretical research.
Parent's comment is spreading the myth that Haskell is an academic language. It's not wrong but it's not Haskell's only stated purpose or utility by far.
Compare language features and Haskell's approach: Erlang and distributed-process, goroutines and channels and Control.Concurrent(.Chan), (D)STM is a library, Control.Applicative and Control.Monad for many things hardly expressible in any other language, etc, etc.
Linear types, I am afraid, would go the way implicit parameters went: their use is cumbersome, they really do not help much with day-to-day programming, and when they are needed they can be perfectly well implemented as a library.
Please edit swipes like that out of your comments here. The rest is fine and stands on its own.
Linear types are the perfect example of a feature that belongs in the core language, or at the very least into a core intermediate language. They are expressive, in that you can encode a lot of high-level design ideas into linear types. You can compile a number of complicated front-end language features into a linearly typed intermediate language. Linear types have clear semantics and can even be used for improved code generation. If we ignore the messy question of how best to expose linear types in a high-level language then this is just an all around win-win situation...
I do not oppose inferring linear use at the core and/or intermediate representation level (GRIN allowed for that and more). I just do not see their utility at the high level, in the language that is visible to the user.
Unless, of course, you're implying it's very haskellish to implement libraries with huge usability gotchas (of which ResourceT was one until the Ghosts of Departed Proofs paper reminded us we can reuse the machinery of ST), then I totally agree.
I think it is a better avenue, one which can help many applications simultaneously, while linear types won't.
IIRC Oleg Kiselyov implemented proper delimited continuations in OCaml as a library, without touching the runtime or compiler. Something similar has been done in Haskell.
I doubt fully dependent types can be implemented in Haskell without extra help from GHC. There has been lots of work in the area, and last time I checked you could simulate DT to some degree, but it was never as powerful as the dependent types in Idris. IIRC there were some edge cases where the typing became undecidable.
You can easily simulate that using a parametrized monad: http://blog.sigfpe.com/2009/02/beyond-monads.html
E.g., `hClose` will have a type like `(Has listIn h, listOut ~ Del h listIn) => h -> ParamMonad listIn listOut ()` and `hGetLine` will have a type much like `(Has list h) => h -> ParamMonad list list String`.
It is not perfect: you may still have a reference to a handle after its use, and you may be tempted to somehow introduce it back and get a run-time error; you would also struggle juggling two handles when copying files, for example (for this you may have to use the ST parametrization trick).
But anyway, you really don't need linear types all that often (they are cumbersome; I tried to develop a language with linear types, and the type-checking rules and types get unwieldy very quickly), and when you do, you can have a library. Just like STM, parsers, web content generation/consumption, etc. Linear types do not warrant a language change.
So library approach thrives.
Therefore the RAII style wouldn't really work in Haskell. The current bracket approach is still better than RAII in Haskell.
That said, the ST-style trick of a phantom type variable is pretty well-known. Unfortunately not many people knew the same trick can be used for non-ST as well. I feel like as a community we should be encouraging this style more often.
UPDATE: I wrote the original comment with the incorrect assumption that drop functions will always be called in Rust. This is wrong. Please see child comments.
The linked post is interesting, because I didn't realise "RAII is a much better way of managing resources than destructors" was controversial. It absolutely is: RAII is fast, predictable, and flexible. Giving it up is also one of the tradeoffs some languages make to achieve more flexibility in their design, by enabling performant automatic garbage collection that doesn't require perfect escape analysis.
Which .NET is finally arriving to, thanks to Midori outcomes. And Java might eventually get there as well, depending on how Project Valhalla ends up.
As for languages like Haskell, a mix of bracket and linear types might be the way to go.
The linked article was comparing RAII with the bracket approach, not the destructor approach.
This isn't useless because memory allocation can happen during destruction/exit, e.g. to write some data to the filesystem.
Suppose you have a container with a billion objects. The container's destructor iterates over each object, doing some housekeeping that requires making a copy and then deleting the original before moving on to the next object.
That requires memory equivalent to one additional object, because an original is destroyed after each copy. Stop deallocating memory during destruction/exit and the total memory required doubles, because you keep all the copies but still have all the originals.
There are also some helpful things that happen during deallocation. For example, glibc has double free detection, which strongly implies potential UAF but it's only detected if the second free() actually gets called.
However, this is different from the bracket pattern that the article is talking about. No one in the Haskell community advocates cleaning up resources (like file descriptors, etc.) using only destructors.
It is quite possible you may need to have RAII somewhere in Haskell code and that's where things like parametrized monads are good: http://blog.sigfpe.com/2009/02/beyond-monads.html
It is a library and I keep saying that what is usually programming language feature is just a library in Haskell.
thank god they do this. how many times did I have to manually force linux to release sockets because badly coded C programs which opened sockets forgot to release them, causing them to hang around for ~5 minutes after the process ended. With proper RAII classes this does not happen.
Surely what objects are meant to do is call the shutdown(2) syscall - or the shutdown(3) C library function - on the socket in their destructor or whatever to prevent that. But I don't think the same applies for memory; once the process is destroyed the kernel should reclaim all memory in the process page tables automatically. Otherwise you'd end up with a pretty trivial way of disabling the system by exhausting all the memory...
well, the problem with non-RAII solutions is that you depend on the whims and talent of the programmer to call shutdown at some point. With a RAII solution like in C++ or Rust you know that if your socket opened successfully, a call to close will necessarily be issued.
Zeroing on exit would be more secure, but significantly slower -- you want to exit quickly, so you can potentially start a replacement program, which would be expected to, at least sometimes, take time to allocate the same amount of memory. If it does allocate the whole amount immediately, the total time isn't necessarily any different between zeroing at exit and zeroing on mapping; but if there's enough time for the pages to get zeroed in the background, that reduces the amount of time waiting for the kernel to do things.
Constructors are orthogonal. The job of a constructor is to construct your object given that the space for the object is already allocated. This could be on the stack, where allocation means bumping the stack pointer, or in-place in preallocated storage (like std::vector), or the result of calling `operator new`. Simply using the `new` syntax does both as a shorthand.
Similarly the job of a destructor is to destruct your object without deallocating it. One can in-place destruct without deallocating, or destruct and then deallocate implicitly when the stack pointer is adjusted, or not at all. The `delete` syntax does both destruction and deallocation as a convenience.
These other resources still need an in-memory representation to track and reference resources, so you can't really separate them.
There's also no guarantee that Rust/C++ destructors will be called. It's certainly less of an issue than depending on the GC being run, but if you need absolute correctness, then you shouldn't rely on destructors.
If you allocate an object on the heap with `new` then its destructor isn't called automatically unless you make it so through some other mechanism, but the GP comment clearly wasn't claiming that.
There are some situations where objects with block scope do not have their destructor called e.g. `_exit()` called, segfault, power cable pulled out. But in that sense nothing is guaranteed.
The issue is when you produce an API that contains objects with destructors. Since you are handing these entities off to unknown code, you cannot ensure that they will be dropped. This was a problem in scoped threads in Rust.
Not the parent, but it is trivial to write C++ and Rust examples in which destructors of variables with block scope are not called. The standard libraries of both languages even come with utilities to do this:
C++ structs:
struct Foo {
    Foo() { std::cout << "Foo()" << std::endl; }
    ~Foo() { std::cout << "~Foo()" << std::endl; }
};
{
    std::aligned_storage_t<sizeof(Foo), alignof(Foo)> foo;
    new (&foo) Foo;
    /* destructor never called even though a Foo
       lives in block scope and its storage is
       freed
    */
}
C++ unions:

union Bar {
    Foo f;         // Foo as above, with a non-trivial destructor
    Bar() : f() {}
    ~Bar() {}      // union members are never destroyed implicitly
};
{
    Bar bar;
    /* ~Foo() never called for the member */
}
Rust:

struct Foo;

impl Drop for Foo {
    fn drop(&mut self) {
        println!("drop!");
    }
}

{
    let _ = std::mem::ManuallyDrop::<Foo>::new(Foo);
    /* destructor never called */
}
etc.

> There are some situations where objects with block scope do not have their destructor called e.g. `_exit()` called, segfault, power cable pulled out. But in that sense nothing is guaranteed.
This is pretty much why it is impossible for a programming language to guarantee that destructors will be called.
Might seem trivial, but even when you have automatic storage, any of the things you mention can happen, such that destructors won't be reached.
In general, C++, Rust, etc. cannot guarantee that destructors will be called, because it is also trivial to make that impossible once you start using the heap (e.g. a `shared_ptr` cycle will never be freed).
> forget is not marked as unsafe, because Rust's safety guarantees do not include a guarantee that destructors will always run.
This was a problem in Rust when scoped threads relied on destructors being run.
I'm not sure it's possible to force any code to be run (e.g. a process can be terminated at any time) although a closure might offer slightly stronger guarantees in some situations.
This is a classical liveness vs. safety dualism. "Something good will eventually happen" and "nothing bad will ever happen" are promises whose solutions are often in conflict with one another.
The general problem — to make transactional state changes and transactional control flow (i.e. expectations about these state changes) match up precisely — is not easy to solve in the general case, especially once you move on to things that are less trivial than simple resource acquisition/release matching.
Your point about this being difficult to solve in the general case is true, it's just worth pointing out Rust intends to do that hard thing anyway.
You can still call drop on it manually to release it earlier, though.
No. Rust's ownership model solves it for trivial cases, at the cost of making it hard to do other things (such as sharing references past the lifetime of the owner without resorting to Rc<T> or Arc<T>, at which point you don't really have lifetime guarantees anymore).
The essential limitation of Rust is that (without resorting to Rc<T> and Arc<T>, which would put you back to square one) it is conceptually limited to the equivalent of reference counting with a maximum reference count of 1. In order to make this work, Rust needs move semantics and the ability to prove that an alias has a lifetime that is a subset of the lifetime of the original object, and it may even sometimes have to copy objects, because it can never actually increase the (purely fictitious) reference count after object creation.
This inherent limitation makes a lot of things hard (or at least hard to do without copying or explicit reference counting). Structural sharing in general, hash consing, persistent data structures, global and shared caches, cyclic data structures, and so forth.
In short, you have the problem with shared references less, because Rust makes it hard to share data in the first place, for better or worse. (Again, unless you resort to reference counting, and then you get the issue back in full force.)
This is a thing people say, but I think it's misleading. Reference counting can increase the lifetime of an object, but borrowing cannot. I've seen this really trip up beginners.
> This inherent limitation makes a lot of things hard
It can make them different, which can be hard, but these things are already hard. And some people think it can make them easier or even better; see Bodil Stokke's work on persistent data structures in Rust.
Your analysis of the trade offs is fine, but you claim that Rust only solves this problem for "trivial" cases. If that's true, then most of the Rust code I've written is trivial. To me, that pretty thoroughly weakens your dismissal here, at least in my case.
Keeping a debug reference at end of transaction (via a shared-reference type, since a non-shared RAII reference type could never get into that state) isn't a coding error, it's a design error -- the developer intentionally requested contradictory things. That is solved by using weak references if you don't want a debug tool to force an object to stay alive.
As a simple example, you may still want to access a resource after it has been released. Closing a network connection, for example, does not mean that accessing it is invalid. The connection may still have state (such as statistics collected or whether a non-blocking close was clean) that is perfectly legal to access after release (and in fact may only be consistent/observable afterwards).
The Eiffel FILE class [1], for example, allows you to call `is_closed` at any time (as well as the various `is_open` functions). This is necessary because `not is_closed` is evaluated as a precondition for many other operations.
A more complex example is a resource that is shared by many threads. Whether that resource is valid is often not a question of whether a reference is reachable, but a function of complex distributed state. Sometimes this can be solved by atomic reference counting, but even then atomic reference counting is expensive.
[1] https://archive.eiffel.com/products/base/classes/kernel/file...
Python's "with" clause, and the way it interacts with exceptions, is the only system I've seen that gets this right for the nested case.
That is unclear. Currently, `File::drop` ignores all errors and drops them on the floor ([unix], [windows]). This is a concern both long-standing and ongoing[0].
AFAIK discussion has gone no further than https://github.com/rust-lang-nursery/api-guidelines/issues/6...
[unix]: https://github.com/rust-lang/rust/blob/master/src/libstd/sys...
[windows]: https://github.com/rust-lang/rust/blob/master/src/libstd/sys...
[0] https://www.reddit.com/r/rust/comments/5o8zk7/using_stdfsfil...
Letting this slide for this long is a very bad sign. I've been a big Rust fan for my hobby projects, but the whole point of Rust is effortless correctness and safety. The more I encounter bugs and issues that have no near-term solution planned, the more confidence, I must admit, I'm losing in their bug-vs-feature prioritization scheme.
For example, it seems sometimes that Rust management would rather focus on cool new language enhancements / rewrite projects, than fix major bugs (sometimes even major borrow checker bugs, or random segfaults created in correct programs).
Rust has never been about proving correctness. Yes, correctness is a goal, but it is subservient to other goals, depending on details.
Furthermore, it's not clear that this can really be implemented in a reasonable way, see https://news.ycombinator.com/item?id=18175838
> it seems sometimes that Rust management would rather focus on cool new language enhancements
In this comment, you're complaining that we haven't implemented a "cool new language enhancement." This is at odds with your desire stated here.
I'm quite comfortable stating that non-lexical lifetimes and async I/O, for instance, are far more important. The number of users who benefit from those two features are multiple orders of magnitude greater than the number of users who care about whether the official opinion of the guidelines subteam is that close() should take &self or &mut self. The Rust team would be doing a disservice to users if it focused on small issues like that—this isn't even a bug we're talking about, it's guidance around conventions!—instead of the biggest complaints that come up constantly.
The destructor would instead be in charge of performing the rollback actions on an uncommitted transaction, if any. Rollback cannot fail, and indeed the system must preserve integrity even if it is not performed, as there is no guarantee that the process will not be killed externally.
Of course if you do not care about data integrity, swallowing errors in close is perfectly acceptable.
Edit: in general, destructors should only be used to maintain the internal integrity of the process itself (freeing memory, closing fds, maintaining coherency of internal data structures), not of external data or the whole system. It is fine to do external cleanup (removing temporary files, clearing committed transaction logs, unsubscribing from remote sources, releasing system-wide locks, etc.), but it should always be understood to be a best-effort job.
A reliable system needs to be able to continue in all circumstances (replaying or rolling back transactions on restart, cleaning up leftover data, heartbeating and timing out on connections and subscriptions, using lock-free algorithms or robust locks for memory shared between processes, etc.).
Also, I guess in Haskell there is more of an expectation that the type system should prevent you from expressing runtime errors.
I think the reason for this might be that, in Haskell, a function starting with 'with' is, by convention, using the bracket pattern, and the way you might use such a function is very similar in structure to the Python way.
Something that is often said about C++ is that you're only ever using 10% of the language, but that everyone uses a different 10%. It's true, and it's true of every language to differing degrees. Everyone has their own way of forming programs, just like everyone has their own slightly different style of playing chess, cooking, or forming sentences.
When you have a well developed style, you will quickly spot any deviations from it. At that point, it doesn't matter if your style was forced on you by the language or whether it's just a convention that you use.
It's certainly true that Haskellers expect a lot from the type system, even compared to other static languages, let alone Python.
I don't know if there's an elegant way to solve this. If Rust had exceptions you could use those, but then again, in C++ it's often explicitly discouraged to throw in destructors because you could end up in a bad situation if you throw an exception while propagating another one. How does Python's "with" handle that?
Much as in C++, this is not really allowed: drop runs during panic unwinding, and a panic during a panic will hard-abort the entire program.
> I don't know if there's an elegant way to solve this.
I don't really think there is. Maybe opt-in linear types could be added. That would come at the cost of convenience (the compiler would require explicitly closing every file and handling the result; you could not just forget about it and expect it to be closed), but it would fix the issue and would slightly reduce how long resources are held.
Furthermore, for convenience we could imagine a wrapper pointer converting a linear structure into an affine one.
> How does Python's "with" handle that?
You'll get the exception from `__exit__` chaining to whatever exception triggered it (if any). Exceptions are the normal error-handling mechanism of Python so it's not out of place.
Right, I didn't really consider that a "drawback" because I'm in the camp that considers that panic! shouldn't unwind but actually abort the process here and there anyway. But you're right that if you rely on the default unwinding behavior panic!ing in destructors is a very bad idea.
It could take a callback. Then for any given file handle, if you don't care that the write failed you can ignore it; if you care but can't sensibly respond, you can panic; if you can sensibly respond you can do it inline or schedule work to be done somewhere with a longer lifetime than the file handle.
With RAII in C++ there's no visual difference between dumb data objects and objects like locks that are created and held on to mainly to cause implicit side effects.
In Rust this also prevents the compiler from dropping objects early - everything must be held until the end of its scope for the 0.1% of cases where you're RAII managing some externally visible resource. In those cases I would like the programmer to denote "The exact lifetime of this object is important", so the reader knows where to pay attention.
Additionally, part of Rust's core ideas is that the compiler has your back with this kind of thing, so there's less need for comments that say "CAUTION HERE BE DRAGONS." Those things can still be useful for understanding details of your code, of course, but they aren't needed to ensure that things are memory safe. That's what the compiler is for!
My preferred semantics would have been early drops by default, and a must_drop annotation similar to must_use, to say that objects like RwLockReadGuard should be explicitly dropped or moved.
It is done properly in other languages as well, especially if they allow for trailing lambdas.
fn leak() {
    // Create a 1 KiB heap-allocated vector
    let b = Box::new(vec![0u8; 1024]);
    // Turn it into a raw pointer; the Box's destructor will no longer run
    let p = Box::into_raw(b);
    // `p` goes out of scope without `Box::from_raw` ever being called,
    // so the allocation leaks
    let _ = p;
}
Obviously that's kind of blatant, but there are more subtle ways to leak memory. Memory leaks aren't considered unsafe, so even though they're undesirable the compiler doesn't guarantee you won't have any. Reference cycles when using Rc<T> are a big one, but generally it's pretty hard to cause leaks by accident. I've only run into one instance of leaking memory outside of unsafe code, and that was caused by a library issue.
Granted, the ownership/borrowing semantics of rust make this a lot harder, but anything that uses Rc/Arc can easily fall prey to it — you can use those to create a reference cycle.
If you mean unintentional leaks, then that is a harder problem. Others have noted Rc and Arc leaks, but thread locals may (or may not) leak as well[0].
[0]: https://doc.rust-lang.org/std/thread/struct.LocalKey.html#pl...
It has no such static checking because it was deemed to reduce expressiveness, while not impacting memory safety.
> Rust's safety guarantees do not include a guarantee that destructors will always run. For example, a program can create a reference cycle using Rc, or call process::exit to exit without running destructors. Thus, allowing mem::forget from safe code does not fundamentally change Rust's safety guarantees.
The mechanical point of this article is pretty clear:
- it's possible to be unsafe in both Haskell and Rust when dealing with resource cleanup
- Rust does a bit of a better job in the general case, though it has its own warts (see the other comments; it's hard to deal with issues during `drop`-triggered cleanup)
I want to make a muddier meta point -- Rust is the best systems language to date (does anyone know a better one I can look at?).
- The person who wrote this article, Michael Snoyman[0], is mainly a Haskell developer; he's the lead developer behind arguably its most popular web framework, Yesod[1].
- Haskell developers generally have a higher standard for type systems, and spend a lot of time (whether they should or not) thinking about correctness due to the pro-activity of the compiler.
- These are the kind of people you want trying to use/enjoy your language, if only because they will create/disseminate patterns and insight that make programming safer and easier for everyone down the line. Research languages (Haskell is actually probably tied for the least "researchy" of the ML camp these days) are the Mercedes-Benzes of the programming world; the safety features trickle down from there.
- Rust is not a ML family language -- it's a systems language
- People who write Haskell on a daily basis are finding their way to Rust, because it has a pretty great type system
When was the last time you saw a systems language with a type system so good that people who are into type systems were working with it? When was the last time you saw a systems language that scaled comfortably and gracefully from embedded systems to web services? When have you last seen a systems language with such a helpful, vibrant, excited community (TBH I don't think this can last), backed by an organization with values like Mozilla's?
You owe it to yourself to check it out. As far as I see it rust has two main problems:
- The learning curve for one of its main features (ownership/borrowing)
- Readability/ergonomics (sigils, etc. can make Rust hard to read)
Admittedly, I never gave D[2] a proper shake, and I've heard it's good, but the safety and the emphasis on zero-cost abstractions Rust offers make it a non-starter for me. Rust is smart so I can be dumb. C++ had its chance and it just has too much cruft for not enough upside; there's so much struggle required to modernize, to make decisions that Rust has had from the beginning (because it's so new). It might be the more stable choice for an x-hundred-person corporate project today or next month, but I can't imagine a future where Rust isn't the premier backend/systems language for performance-critical (and even non-critical) programs in the next ~5 years.
I'll go even one step further and say that I think how much Rust forces you to think about ownership/borrowing and how memory is shared around your application is important. Just as Haskell might force you to think about types more closely/methodically (and you're often better for it), Rust's brand of pain seems instructive.
[1]: https://www.yesodweb.com/
[2]: https://dlang.org/
Have a look at ATS[1]; it supports many of the features available in Rust and lets you build proofs about your code's behaviour. It's quite type-annotation heavy IIRC, but it's very efficient.
[1] : http://www.ats-lang.org