This is not meant as an attack on or a defense of Go. The facts are what the facts are. The point here is to suggest that people use terminology that is more informative and easier to understand. There are people for whom 20us per request extra is a sufficiently nasty issue that they will not upgrade. There are also a lot of people who are literally multiple orders of magnitude away from that even remotely mattering, because their requests tend to take 120ms anyhow. Using "seconds per request overhead" makes it easier both to understand the real performance impact in real times, and to see that we're talking about just the base overhead per request rather than the speed of the entire request.
It might also discourage some of our, ah, more junior developers from being too focused on this metric. Why would I want to use a webserver that can only do 100,000 requests per second when I can use this one over here that can do 1,000,000 requests per second? If you look at it from the point of view that we're speaking about the difference between 10 microseconds and 1 microsecond, it becomes easier to see that if my requests are going to take 10 milliseconds on average, this is not a relevant stat to be worried about when choosing my webserver, and I should examine just the other differences instead, which may be a great deal more relevant to my use cases.
Edit: Literally while I was typing this up I saw at least three comments already complaining about this regression. My question to you, my honest question to you (because some of you may well be able to answer "yes", especially with some of the tasks Go gets used for), is: Are you really going to have a problem with this? Does the rest of your request really run in microseconds? It's actually pretty challenging in the web world to run in microseconds. It can be done, but a lot of the basic things you want to do, like "hit a database", generally end up involving milliseconds, i.e., "thousands of microseconds".
E.g. not implementing read/write timeouts lets you omit lots of extra code (timer management, synchronization, cancellations), which improves performance. But it might bring a whole system to a stop if there are a few non-responsive clients. Or: not implementing flow control through the whole chain and simply buffering at each stage can give a huge boost on the throughput metric. But sooner or later the system might go out of memory.
I personally now see reliability as the number one thing you should achieve in a protocol implementation. Performance is of course also important, but should only be compared once all the other parts are also comparable.
I think you read a lot into the parent posts that wasn't there.
Let me restate what I believe to be the parent's meaning:
Many junior developers care too much about the "how quick is my normal execution path" form of performance. This is a bad measure of actual performance because the rare, error-related executions can have cascading effects, effectively blocking the entire network.
Allowing applications to wait indefinitely for a response, even if asynchronous, is something like a "thread" leak: you start accumulating dead threads, eventually leading to slowdown. This would be one example.
Another would be weird broadcast storms that happen when a component fails.
Basically, consider cascading effects of errors when optimizing performance.
Projects where 'performance' is taken to be "how quick is my usual case execution path".
The following is tough love:
>I have worked in the network programing domain for the last few years and I also found that especially outsiders and newbies get too obsessed on pure performance figures.
no need for the introduction, your attitude shows it all. It's why we all wait for 35 seconds while we watch a timer animation instead of getting a response instantly (200 milliseconds) and one time out of ten thousand having to resubmit a page and you having to deal with it. But by all means, 10000 * 35 seconds is only 97 hours. I'm happy to wait 97 hours if it means I won't have a 1/10,000 chance of having to click Submit a second time - wouldn't you? Or even a one in fifty chance? I mean wouldn't you rather wait for 35 seconds, versus either getting an instant response (98% chance) or a 98% chance of an instant response the second time you try and a 98% chance of a response the third time you try? No brainer. Who wouldn't love to wait, wait, wait, wait. It's my favorite part of using a computer! Waiting! I can anticipate how great it will be when stuff works. It reminds me of downloading over a 14.4 KBps modem (which due to the lack of web apps at the time was actually much faster in many cases, but thankfully you've fixed that.) On your end you won't have to code up what happens when I do resubmit or not get your response, which takes logic and math or a hand-coded edge case, that civilization probably will never discover and could not possibly code. I mean how can a database possibly be set right if it ever gets a transaction twice or fails to get a transaction the user really did request. It doesn't make any sense! Would you ever tell a friend the same thing twice? Or would you just tell them once, and even if it takes them 3 weeks to get your invitation for Friday, at least you won't accidentally send it twice, embarrassing yourself and your friend, or, worse, having them show up twice. The real world shows that the tradeoffs you network engineers make every day to give me 35 second web page experiences are the correct trade-offs. After all, it's my time, not yours.
/s
You people make the worst trade-offs ever. Your decisions suck. Your work sucks. The web sucks, because of you.
Change everything radically. Figure it out. Don't boast about newbies/outsiders not understanding - you don't understand the correct trade-offs.
Plus the Two Generals' Problem[1] shows that you can never write correct code at the theoretical level, so on top of every single thing you do being practically broken, it's theoretically broken too. Everything you guys do is broken and sucks, theoretically as well as practically. Wake up already.
[1] https://en.wikipedia.org/wiki/Two_Generals'_Problem
----
Note: I took a very aggressive tone to counteract the complacency I quoted. My goal is to have the parent poster rethink their whole life (in the network programming domain.) Please don't flag/downvote it if you want a better web tomorrow than we have today, because the parent and others like them are the ones responsible for this. Only they can wake up and start making the correct trade-offs. It gets so bad that I manually open a new tab, slowly type in google, slowly re-authenticate, and go through the same action a second time, then close the (still loading) first tab, just because people like this person have made trade-offs that are so bad I have to work around them myself. Their decisions are wrong.
Reliability, the way network engineers have been moving toward coding for it for the past decade, is a false god. The approach is not correct. It must change if you want a better web tomorrow (or at least reply to this), or you are complicit in the thinking the parent comment very explicitly shows. I have edited this comment considerably to be really clear, and gave multiple examples. As you can see I have 2546 karma and have been using HN for 1386 days. I stand by my criticism.
The requests per second statistic is measuring throughput, and the results from such a test can be easily represented as a single value. The seconds per request statistic is a measure of latency. Latency can't be represented with a single value in a meaningful way. It is a curve of values, so you'd need to know what percentage of requests fell under a threshold.
Where those thresholds are is extremely use case specific. Some people only care about 95% of requests, others have to care about much higher levels of resolution.
So if anyone gave me a single data point about their system latency, I'd be skeptical they knew what they were talking about. Even in this case we don't know if the latencies changed across the board, only on a few outliers, or on just the middle of the latency curve.
That said, I agree that this is a bit of a tempest in a teapot. In real world usage, if this regression really matters to you, you've probably already moved off of the standard library for a variety of other reasons.
Second, though, if we're going to slice and dice that way, which is valid, I think you need to go even farther and point out that there are two cases. The first is when you are hammering requests through as quickly as possible, and the second is when you are not.
The latency numbers are highly specific to your load, because as load increases, things like scheduling algorithms start mattering more, especially the fundamental tradeoffs between latency and throughput. Knowing the distribution of these numbers under load is important... though I'd suggest that said distribution is still fairly likely to be dominated by the user code rather than the framework code. But the hello world benchmark is still a crucial one, because it serves as the limit of performance, so if you can show that some webserver can't even do what you need with that, you can eliminate it.
There is also the "request overhead in seconds" you get for a relatively uncontested system, where the system would have to be fairly pathologically broken to see high variance in results. (You'll get some from GC, but in this case I wouldn't call that variance high in the patterns you'll see from a hello-world handler.) This number is important because, while it is in a lot of ways more boring, it is also, I suspect, the relevant number for the modal web server. I suspect this is another one of those cases where some very visual image leaps to mind: the web server for Google or Facebook that is constantly getting hammered at 90% of capacity (and kept there carefully by design, since systems get increasingly pathological as you approach 100%), serving highly optimized requests where every microsecond matters... but those are actually the rare web servers in the world. Most webservers are either twiddling their thumbs for long stretches of time or waiting for user code to do what it's going to do in the milliseconds... or seconds... or minutes....
What good does a 100 microsecond average latency (calculated as the inverse of the throughput) do for you when simply loading a website issues 200 requests and your 99th percentile is closer to 500ms for whatever reason? Suddenly your per-load average looks a lot different than your per-request average.
Pure throughput is what you want for batch processing without those pesky, impatient humans in the loop.
I'll probably still be upgrading. go1.8 has some nice performance improvements overall. Specifically the codegen improvements help in HTML parsing and image resizing.
If I'm still upgrading, I have to wonder how many people out there are pushing Go's net/http harder than I am?
I think what I'm more offended by is how releases are being handled.
There was a regression and it's bothering people, there really is no getting around that. I think having a comment like yours in that ticket thread will really help calm the waters. However, personally, I feel that there should be a point release to revert the regression instead of waiting for a major release.
The reason people are bothered, though, is a synthetic benchmark that blows this out of proportion. Someone's already pointed that out and things have calmed down.
> However, personally, I feel that there should be a point release to revert the regression instead of waiting for a major release.
And if there was a significant regression I'm sure that would happen. However, Go has set forward the way they do releases: https://github.com/golang/go/wiki/Go-Release-Cycle
More specifically:
> A minor release is issued to address one or more critical problem for which there is no workaround (typically related to stability or security). The only code changes included in the release are the fixes for the specific critical problems. Important documentation-only changes may also be included as well, but nothing more.
If this regression can be properly quantified to fall into those categories, then a point release will be issued to fix it. But an at-worst half-microsecond overhead on a synthetic hello-world benchmark really doesn't fall into either of those categories.
From that issue thread:
> So from @OneOfOne's test, go tip made 5110713 requests in 30 seconds, that's 5.87us per request. Go 1.7.5 did 5631803 requests in 30 seconds, 5.33us per request. So when you compare those to each other, that's like an 11% performance decrease. But if you look at it from an absolute perspective, that's a performance hit of just a half microsecond per request. I can't even imagine an HTTP service where this would be relevant.
There are many people in the Go community that do canary deployments of their services on new Go versions throughout the whole cycle. If anything major really was related to this I'm fairly certain it would've been surfaced already.
All that aside, this kind of benchmarking should have been done during the beta phase. It's even explicitly asked of the community to do so. No changes related to this were merged during the RC-cycle either.
I can't find a single compelling reason why they should break the normal release cycle over this regression.
Honestly it blew me away that we hit that number, but now that we have, 20us could, though generally will not, affect our overall numbers (there are other components in the system that are not this fast).
While I agree with you that this is generally not an issue, in some circumstances, it will be noticeable.
IMO I'd vote for stability over raw performance numbers every single time since the raw performance numbers depend on "best circumstances" and stability accounts for "worst circumstances". You'll see the latter in reality a lot while the former doesn't exist outside of benchmarks.
There's a similar issue with fuel efficiency. Here in the U.S. we tend to measure it in miles per gallon; the problem is that this isn't (typically) what we care about: we care about cost to drive a distance, not distance per dollar. I understand that in Europe fuel consumption is measured in litres per 100 kilometres, which makes more sense. If we measured efficiency here in fluid ounces per mile, we'd see that: a 10 mpg car uses 12.8 ounces per mile; a 12 mpg car uses 10.7 ounces per mile; a 24 mpg car uses 5.33 ounces per mile; and a 36 mpg car uses 3.56 ounces per mile.
What about alternative facts?
I'd take a 20 us performance regression in one specific slice of code in exchange for a 50% performance increase overall any day. A small regression is fine if the overall speed is much better.
Sorry, what? It's not like the http server of the stdlib is here only for doing hello world code samples... You would imagine those benchmarks to be part of some CI process along with the unit tests.
Not ideal but it's better than pure X vs Y.
> Once a release candidate is issued, only documentation changes and changes to address critical bugs should be made. In general the bar for bug fixes at this point is even slightly higher than the bar for bug fixes in a minor release. We may prefer to issue a release with a known but very rare crash than to issue a release with a new but not production-tested fix.
Why could the people making an issue about the 0.5 us slowdown per request not have tested or run a benchmark sooner?
           avg       std dev   max
Latency    195.30us  470.12us  16.30ms   -- go tip
Latency    192.49us  451.74us  15.14ms   -- go 1.8rc3
Latency    210.16us  528.53us  14.78ms   -- go 1.7.5
That is a seriously fat distribution. Has anyone ever benched for percentiles?

So basically this is a communication issue with a community that does not understand what to make of its own benchmarks.
What I believe is more serious is that this wasn't caught during development. It could well be a worthwhile trade-off, but we should be aware of it...
That probably shouldn't be the response for a major performance regression in a release candidate.
Looks like I'm sticking to Go 1.7 for however long it'll take before 1.9 is released.
Yes, too many people do stupid "hello world" tests indeed.
Maybe this is a problem with running "hello world" tests and not that much of a real-world problem. Let's see.
If not, could someone please educate me?
https://github.com/golang/go/issues/18964#issuecomment-27830...
I remember reading about the release cycle here: https://github.com/golang/go/wiki/Go-Release-Cycle
> Once a release candidate is issued, only documentation changes and changes to address critical bugs should be made. In general the bar for bug fixes at this point is even slightly higher than the bar for bug fixes in a minor release. We may prefer to issue a release with a known but very rare crash than to issue a release with a new but not production-tested fix. One of the criteria for issuing a release candidate is that Google be using that version of the code for new production builds by default: if we at Google are not willing to run it for production use, we shouldn't be asking others to.
A 20% performance regression in a minimal http server (i.e. one that doesn't have any business logic) does not sound like a big problem to me; that kind of overhead would normally be dwarfed by database calls, and a 20% increase in the overhead doesn't sound like it's a large increase in what I'd expect to already be a very small number.
So a similar situation in node.js-land would be if require('http') would get a worst-case scenario of a 20% performance hit, right?
If this is the case, even I, who only run single instances of node, would think it a fairly big impact that I'd try to fix if I were the maintainer and still had the possibility to fix it.
You're absolutely right about asking if it can be fixed now rather than later (I was very surprised they wanted to wait until 1.9!), and thanks for asking that on there. If this did make the official release, 1.8 would be known for this bug in the static-site-hosting case, since that use case sees far more requests per second.
It should be noted that it was tested against a hello world benchmark and it won't matter in higher payload cases when the limiting factor isn't the extra routine but the payload itself by a long shot.
I have never used Go, so it might be a bit rude of me to ask. But when I looked at the commit, it seemed to be a fairly small set of changes, which I, maybe stupidly, assumed meant it would be quick to fix. :)