> Does the Haskell version really do the same thing as the C version?
It counts bytes, words and lines and, modulo the intentional differences in Unicode space handling and any unintentional bugs, does the same thing.
Indeed, it does not count things like max line length or char count, but those can be plugged in without significant performance overhead (and that's what the second part is gonna be about).
> Does it handle all of the same error cases, providing the same quality of error messages if they occur?
There are no error messages at this point. Although I don't really see how this should affect performance.
> Does it handle localization?
If you mean counting multi-byte characters, then not yet. Although I'm pretty convinced it does not require doing something much more complicated — but maybe I'm wrong, we'll see in the next part.
Also, thanks for the feedback, those are important questions! Something to keep in mind when writing subsequent posts.
So until you put in locale handling, alternate line endings, option handling, and error handling, I don't see that your post is at all convincing.
Quite the opposite.
So I look forward to a Haskell version that supports everything the wc has so we can get a fair comparison.
I will wait for what the next article presents, as the end of this article states:
> Stay tuned for a second part, where we will investigate shaping all this into an actual wc substitute, where different statistics can be turned on or off, all while not computing what the user hasn’t requested.
This is a fair point, and I believe this whole series of 'Beating C with foo' posts could have been better named.
But I'm of the opinion that the whole series is about showcasing various languages' strengths and weaknesses, while using GNU wc as a benchmark.
From this perspective, I've learnt a bit about several languages I knew nothing about, so I rather like these articles, despite the fact that their titles might be a bit misleading.
Everyone remembers being a brash know-nothing teenager, so just by the headline "DESTROYING C" you get that vibe.
I remember having a blog about Haskell circa 2002 where I would smugly enumerate the ways in which Guido van Rossum was wrong. "GUIDO IS WRONG! PART 4" my post titles would say.
I'd be interested to see the same tests but with a non-Unicode locale set, IIRC:
LC_ALL=C
I'd wager the C version would perform much better in that test.
Anyway, destroying Haskell with 100 lines of C :-)
https://raw.githubusercontent.com/gdevic/minix1/master/comma...
https://www.ioccc.org/2019/burton/prog.c
It is the Burton entry from the International Obfuscated C Code Contest, but it seems to handle only ASCII.
https://www.ioccc.org/years-spoiler.html
Limits:
"Requires the C locale and ASCII character set. Input should be less than ten million octets to avoid this problem."
wc:{sum (({1};ceiling 0.5*sum differ " "=;{1+count x})@\:) peach read0 x}
[0]: https://old.reddit.com/r/dataisbeautiful/comments/eyevca/my_...
Like yourself, I think any reader could understand that this is meant to be a toy implementation. The author even says so in the very first paragraph and at several points in the article thereafter. How is that dishonest?
Apples smash Oranges, etc.
$ time wc Backups/Tera2/files.txt
1123699 2283439 161361844 Backups/Tera2/files.txt
real 0m2.010s
user 0m1.964s
sys 0m0.020s
$ time naive Backups/Tera2/files.txt
L: 1123699
W: 2283439
C: 161361844
real 0m0.864s
user 0m0.835s
sys 0m0.028s
Smoked!! (See my comment elsewhere[0] for a more sensible take.)
[0] https://news.ycombinator.com/item?id=22234673
--------
Update: I just compiled with -O3 and got user time of 0.24 secs. This version is 8 times faster than the system wc.
That being said, unlike the OP, I don't think it means anything as wc is likely supporting more features than my program.
$ time wc test.txt
16500000 49252094 2059004431 test.txt
real 0m5.930s
user 0m5.491s
sys 0m0.374s
$ go build wc.go && time ./wc test.txt
16500000 49252094 2059004431
real 0m2.947s
user 0m2.620s
sys 0m0.299s
https://gist.github.com/felixge/aa70fc97e893a7eb0bd4c801f8f5...
* Does this cope with different whitespace, such as tabs?
* Does this cope with different settings of locale?
* Does this include the option of the "longest line"?
* Does this perform the character counts?
I'm pretty sure wc does all these, and that stripping them out would make it faster. If this Haskell version doesn't do that, and yet still compares against a fully-featured version of wc, the comparison hardly seems fair.
* Looking at a single byte at a time, it presumably only handles the "C" locale :) They don't say what locale GNU wc was tested with (if it's not LANG=C, that benchmark should be re-run)
* --max-line-length? no. But I'm guessing GNU wc isn't benchmarked with that option on (can't find the invocation in the blog post though)
* data State { ws, bs, ls } keeps count of words, bytes (more honest than calling it characters) and lines.
> ... further down, they rip out the remains of unicode handling ...
Ah. Well, that makes it a little unfair, surely.
> Looking at a single byte at a time, it presumably only handles the "C" locale ...
Again.
> --max-line-length? no. But I'm guessing GNU wc isn't benchmarked with that option on
I wonder if wc does the work anyway, and only reports it if asked, or if it actually changes the code path if it's not needed.
So this entire post feels ... intellectually dishonest. Personally, I'm all in favour of Haskell, and I wish I had the chance to use it "in anger" rather than just doing the occasional toy thingie that I do. But this post doesn't do it or its community any favours.
Disappointing.
https://github.com/coreutils/coreutils/blob/master/src/wc.c
The word counting algorithm does seem to be much more complex.
Which actually raises a good question of whether I should have been comparing with that one — but that'd probably raise more questions and lead to more people accusing me of cheating in favour of Haskell.
Then I'll consider C destroyed.
Like the other commenter said, the second article will probably be more interesting.
In that case, I would say that it's up to the community to not take clickbait titles like this seriously if we want to encourage reasoned, detailed content rather than borderline flaming with words like "destroy" and "smash". I personally don't enjoy this new Buzzfeed style of technical post at all.
Therefore, I have some comments about the actual code here, but going to keep them to myself so I don't encourage more people to follow OP's example.
Also, I was trying to make a reference to the original post that was "beating C", with the connotation of further improving on that. I'm not a native speaker so my language model might be terribly flawed.
I don't understand why LOC was even brought up. You can put all your code on a single line in most languages. The GitHub code linked even has 26 lines of Haskell, which makes this even more nonsensical.
"...There’s also a parallel version that relies on the monoidal structure of the problem a lot, and that one actually beats C"
coreutils wc is single-threaded, just checked.
But that post left me wondering: is it possible to do better without resorting to parallel processing?
Turns out the answer is yes.
From the introduction:
> Average time for wc on the same test file: 0.8 seconds
I've done many very arrogant things in my life, because I've been a strange guy with a lack of self-esteem who doesn't have a clue how many things he doesn't know. I hope I've never been this arrogant, though.
Hey - if you make performance optimizations and compare implementations, it's probably best not to jump to quick conclusions. I would advise brushing up on C to get a feel for performance. Or, in times when it's popular to shit on C and claim it's not really a low-level language anymore, I would advise writing some assembler (which I've barely done myself). In practice it's unlikely that you'll find yourself in a situation where you can write code in a high-level language that runs considerably faster than what you could realistically write in C. The only situation I can see where that can happen is when the program does something so complicated, yet seemingly so arbitrary, that you simply can't bring yourself to invest more time than it takes to hack together a quick Python script, and what you'd be willing to write in C wouldn't even be the asymptotically best approach you can see.
For a simple program like wc, that situation does not apply, and if you find surprising results, it's best to first check for other possible reasons than "thousands of graybeards were wrong".
> wc implemented in Haskell (significantly faster than GNU coreutils version — oops I did it again
For reference, I'm referring to the "oops I did it again" part. It's really hard to take that comment as "honours go to GHC authors".
Also, I suggest you try running GNU wc with Unicode turned off, because Unicode handling is computationally expensive and you're deliberately disabling Unicode support in your own code anyway. I appreciate you said you'd add in the edge cases that GNU wc handles in your next blog post, but disabling Unicode in GNU wc for this benchmark would show good faith that you're at least trying to compare like for like. And if GHC still outperforms it, then you can at least legitimately say:
> My code outperforms GNU for non-unicode strings
Which currently you cannot because your claim is based on incorrect benchmarks.
Do you think you took the most generous interpretation possible?
The most generous possible interpretation is to say that he's not being intentionally misleading but actually believes what he writes. I fail to see how that improves matters.
He really thinks it makes sense that a "C program that was looked at by thousands of eyes of quite hardcore low-level Unix hackers over a few decades" is that much slower than something he threw together? That's like doing some back of the napkin math in a bar and declaring relativity wrong (while also not being Edward Witten). Any reasonable person would suspect that this isn't a very likely outcome and dig deeper to figure out if that's really true or not. My guess is he just implemented a very trivial and naive version of the C code and that's why it is faster.