On the other hand, though, this sounds like a theoretical/academic article to me. I've been using Clojure for 15 years now, 8 of those developing and maintaining a large complex SaaS app. I've also used Clojure for data science, working with large datasets. The disadvantages described in the article bothered me in the first 2 years or so, and never afterwards.
Laziness does not bother me, because I very rarely pass lazy sequences around. The key here is to use transducers: that lets you write composable and reusable transformations that do not care about the kind of sequence they work with. Using transducers also forces you to explicitly realize the entire resulting sequence (note that this does not imply that you will realize the entire source sequence!), thus limiting the scope of lazy sequences and avoiding a whole set of potential pitfalls (with dynamic binding, for example), and providing fantastic performance.
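To make the transducer point concrete, here is a minimal sketch of that style. The names are illustrative; the point is that the transformation is defined independently of any concrete sequence, and `into` realizes the whole result eagerly while `take` still stops consumption of the (unbounded) source early:

```
;; A composable transformation, defined with no reference to the
;; kind of sequence it will run over:
(def xform
  (comp (map inc)
        (filter even?)
        (take 3)))

;; `into` realizes the entire result eagerly, but `take 3` stops
;; pulling from the infinite source, so (range) is never fully realized:
(into [] xform (range))
;; => [2 4 6]
```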
I do like laziness, because when I need it, it's there. And when you need it, you are really happy that it's there.
In other words, it's something I don't think much about anymore, and it doesn't inconvenience me in any noticeable way. That's why I find the article puzzling.
Sounds like that is precisely the point the article is making: the best way to use lazy sequences is not to. Lazy sequence bugs make for a miserable experience. Clojure already has an onboarding problem where every new learner has to discover all the obscure dos and don'ts and go through the lessons of which parts of the language are more of a gimmick vs. the parts that do real work. Attempting to do tricks with lazy sequences is part of that, but it is polite to warn people before they try rather than when they get to Stack Overflow after hours of head-to-desk work.
Although I will put in a small plug for lazy sequences because they work well in high latency, high data i/o bound situations like paged HTTP calls or reading DB collections from disk. When memory gets tight it can be helpful to be processing partially realized sequences. But the (map println [1 2 3]) experience that everyone has is a big price to pay.
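A hedged sketch of that paged-API case, assuming a hypothetical `fetch-page` function that returns a map like `{:items [...] :next-cursor ...}`:

```
;; Pages are fetched only as the consumer realizes elements, so a
;; partially realized sequence means partially fetched data.
(defn all-items [fetch-page cursor]
  (lazy-seq
    (let [{:keys [items next-cursor]} (fetch-page cursor)]
      (if next-cursor
        (concat items (all-items fetch-page next-cursor))
        items))))

;; e.g. (take 100 (all-items fetch-page nil)) pulls only as many
;; pages as are needed to produce 100 items.
```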
I disagree — I do use lazy sequences, I just rarely pass them around. Very few functions in my code return lazy sequences, and those are usually the "sources": functions that can return database data, for example.
Most of the code does not return lazy sequences, and thanks to transducers can be abstracted away from the entire notion of a sequence.
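As an illustration of how that abstraction plays out, the same transducer can be applied in several different contexts, so most code never commits to a concrete sequence representation (names here are illustrative):

```
(def xform (comp (filter odd?) (map #(* % %))))

(into [] xform (range 6))        ;; eager vector:  [1 9 25]
(sequence xform (range 6))       ;; lazy sequence: (1 9 25)
(transduce xform + 0 (range 6))  ;; direct reduce: 35
```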
> In other words, it's something I don't think much about anymore, and it doesn't inconvenience me in any noticeable way.
Interesting, because it still does bother me, that is, when I actually use lazy sequences and the functions on them. Sure, if I consciously avoid them, then it doesn't bother me anymore, but that's the point of the article :D.
There's always new minutiae to learn. Plus I get a handy link that I can just paste next time the topic of laziness comes up in a code review.
I'd simply point out that there are a few sub-tribes within the Clojure world. Some are very attracted to formalism and correctness; others pride themselves in rejecting them.
I am using Clojure for my side projects & hustles. If the project is quick and dirty, who cares how it's implemented. If the project evolves into a more serious product, I should rewrite it anyway and optimize the critical code paths.
It might also be good to mention Injest
https://github.com/johnmn3/injest
It makes transducers more ergonomic to use if you are like me and use threading macros everywhere.
Would be curious to hear how others feel about it
I realize the maintainers likely would not even be interested in such a thing, of course, just daydreaming.
Clojure is the only language where it is baked in that prominently though.
Writing custom transducers, especially stateful transducers is really difficult. But that's not something you'll do often. My 10kLOC complex app has three stateful transducers that I wrote.
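For readers who haven't written one, here is a minimal sketch of a stateful transducer (a running sum), keeping its accumulator in a `volatile!` the way the stateful transducers in clojure.core do:

```
(defn running-sum []
  (fn [rf]
    (let [sum (volatile! 0)]       ;; per-process mutable state
      (fn
        ([] (rf))                  ;; init arity
        ([result] (rf result))     ;; completion arity
        ([result input]            ;; step arity: emit the running total
         (rf result (vswap! sum + input)))))))

(into [] (running-sum) [1 2 3 4])
;; => [1 3 6 10]
```

The three arities and the careful handling of completion are a big part of why writing these by hand is fiddly.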
I think transducers are an under-appreciated aspect of Clojure. They are an extremely valuable and flexible tool, and have allowed me to write reusable and composable code and tackle significant complexity, all with great performance.
Obviously in languages that can reliably perform stream fusion transparently, maybe you care less, but the abstraction isn’t just about the speedup.
You can't hide from complexity. It will lurk somewhere anyway.
I was drawn to Clojure because it looked like a lisp for getting stuff done. But a few things put me off. This article puts me off more. I want to get the semantics down before I have to think about what's going on under the hood.
There is the issue of startup time with the JVM, but you can also do AOT compilation now so that really isn't a problem. Here are some other cool projects to look at if you're interested:
Malli: https://github.com/metosin/malli
Babashka: https://github.com/babashka/babashka
Clojure is fun enough that people get to know all the edge cases. And put up with the stack traces.
If you do decide to try it, don't use deps.edn, go straight to Leiningen for build tooling. Even just for playing around in the REPL.
Clojure is a fantastic language, and probably the best lisp you could start out with due to the fact that you have the entire Java ecosystem at your fingertips.
I wonder if a Scheme dialect would be a better fit for you? They tend to be smaller and might let you focus on semantics more.
Full disclosure: I haven’t spent nearly as much time with any of the Scheme/Scheme-inspired dialects as I have with Clojure. I’m basing this off of their design philosophy and others’ observations.
Is this just a personal goal you’re setting? I’m curious because I’m in a similar position, so I’d love to hear your plan!
And Clojure also doesn't give an error or warning when lazy sequences are never realized.
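A sketch of that silent-drop behavior (the function name is made up): the result of `map` is never realized, so nothing is printed, and no warning of any kind is issued.

```
(defn notify-all [users]
  (map #(println "notifying" %) users)  ;; lazy seq, value discarded
  :done)

(notify-all ["a" "b"])
;; => :done, with no output at all
```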
GHC would be a better example, I think. It performs stream fusion. This means it can turn 'map f (map g xs)' into 'map (f . g) xs', and of course it gets more complex than that, but that's the basics. It directly optimises lists (which, this being Haskell, are lazy sequences).
Is it only for the built-in map, or would it work in a general way for, say, `myMap f (myMap g xs)`?
I'm typically using it like so:

```
(defn realize [v] (doto v pr-str))

(binding [*some* binding]
  (realize (f some-nested-lazy-seq)))
```

Actually, be very careful with side effects. Some functions like `map` and `for` take things in chunks, typically in steps of 32, since most of the underlying structures are trees with 32-wide nodes.
```
(let [printing-range (map (fn [i] (print "debug: " i) i) (range))
first-10 (take 10 printing-range)]
first-10)
debug: 0
debug: 1
debug: 2
debug: 3
debug: 4
debug: 5
debug: 6
debug: 7
debug: 8
debug: 9
debug: 10
debug: 11
debug: 12
debug: 13
debug: 14
debug: 15
debug: 16
debug: 17
debug: 18
debug: 19
debug: 20
debug: 21
debug: 22
debug: 23
debug: 24
debug: 25
debug: 26
debug: 27
debug: 28
debug: 29
debug: 30
debug: 31
(0 1 2 3 4 5 6 7 8 9)
```

Suppose we make a sequence of numbers which grows very rapidly, so that by the time we hit the 17th one, we have a bignum that is gigabytes wide.
You probably don't want this to be chunked in batches of 32.
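One hedged sketch of that situation in Clojure: `iterate` happens to produce an unchunked lazy seq, so only the elements you actually take are computed, whereas a chunked source would eagerly build up to 32 of these rapidly growing numbers.

```
;; Repeated squaring doubles the bit-length at every step, so elements
;; become enormous very quickly.
(def huge (iterate #(*' % %) 2N))

;; `iterate` is unchunked: only the first three elements are computed.
(take 3 huge)
;; => (2N 4N 16N)
```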
Another situation might be if we have some side effect: the lazy sequence is connected to some external API somehow or foreign code. You might want it so that the observable behaviors happen only to the extent that the sequence is materialized.
The advice to be careful with side effects is good in general; not sure why you're downvoted.
```
1> (len
     (with-stream (s (open-file "/usr/share/dict/words"))
       (get-lines s)))
** error reading #<file-stream /usr/share/dict/words b7ad7270>: file closed
** during evaluation of form (len (let ((s (open-file "/usr/share/dict/words")))
                                    (unwind-protect
                                      (get-lines s)
                                      (close-stream s))))
** ... an expansion of (len (with-stream
                              (s (open-file "/usr/share/dict/words"))
                              (get-lines s)))
** which is located at expr-1:1
```
The built-in solution is that when you create a lazy list which reads lines from a stream, that lazy list takes care of closing the stream when it is done. If the lazy list isn't processed to the end, then the stream semantically leaks; it has to be cleaned up by the garbage collector when the lazy list becomes unreachable.
We can see with strace that the stream is closed:
```
$ strace txr -p '(flow "/usr/share/dict/words" open-file get-lines len)'
[...]
read(3, "d\nwrapper\nwrapper's\nwrappers\nwra"..., 4096) = 4096
read(3, "zigzags\nzilch\nzilch's\nzillion\nzi"..., 4096) = 826
read(3, "", 4096) = 0
close(3) = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
write(1, "102305\n", 7102305
) = 7
exit_group(0) = ?
+++ exited with 0 +++
```
It is possible to address the error issue with reference counting. Suppose that we define a stream with a reference count, such that it has to be closed that many times before the underlying file descriptor is closed. I programmed a proof of concept of this today. (I ran into a small issue in the language run-time that I fixed; the close-stream function calls the underlying method and then caches the result, preventing the solution from working.)
```
(defstruct refcount-close stream-wrap
  stream
  (count 1)
  (:method close (me throw-on-error-p)
    (put-line `close called on @me`)
    (when (plusp me.count)
      (if (zerop (dec me.count))
        (close-stream me.stream throw-on-error-p)))))

(flow
  (with-stream (s (make-struct-delegate-stream
                    (new refcount-close
                      count 2
                      stream (open-file "/usr/share/dict/words"))))
    (get-lines s))
  len
  prinl)
```
With my small fix in stream.c (already merged, going into Version 292), the output is:

```
$ ./txr lazy2.tl
close called on #S(refcount-close stream #<file-stream /usr/share/dict/words b7aecee0> count 2)
close called on #S(refcount-close stream #<file-stream /usr/share/dict/words b7aecee0> count 1)
102305
```
One close comes from the with-stream macro, the other from the lazy list hitting EOF when its length is being calculated. Without the fix, I don't get the second call; the code works, but the descriptor isn't closed:
```
$ txr lazy2.tl
close called on #S(refcount-close stream #<file-stream /usr/share/dict/words b7b70f10> count 2)
102305
```
In the former we see the call to close in strace; in the latter we don't.
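For comparison, the same stream-lifetime pitfall exists in Clojure, and the usual workaround is the opposite of TXR's: realize the lazy seq eagerly inside the scope that owns the stream. A sketch:

```
(require '[clojure.java.io :as io])

;; The pitfall: the lazy seq escapes the scope that owns the reader,
;; so realization happens after the reader is already closed.
(defn broken-lines [path]
  (with-open [r (io/reader path)]
    (line-seq r)))

;; (count (broken-lines "/usr/share/dict/words"))
;; throws "Stream closed" once realization passes the first line.

;; The usual fix: force the whole seq before leaving the scope.
(defn all-lines [path]
  (with-open [r (io/reader path)]
    (doall (line-seq r))))
```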