story
I could be mistaken but I believe he reported the security issue through our regular support channel which is why it took three days to see (instead of our security channel). From the time I saw it, I fixed it with the patch going live within an hour or two.
When I DID see it, tried it myself with a quick shell script that that curled and backgrounded the same request a bunch of times, I just kind of chuckled. It was a good bug. Josip is top notch.
I believe the race condition is on the rise in terms of severity and importance. Developers are aware of common OWASP bugs, but this type of race condition is often overlooked and developers are going to NEED to be just as aware of. Way to go.
That's the problem with OWASP, when developer from a big company sees race condition for the first time and is surprised
I worry that a malicious attacker could finger the service for potential victims.
1. The user submits X number of requests within a second. 2. The system puts the request in a command queue that synchronizes the commands by coupon code, for example. 3. The command is popped off the queue and an event is generated and saved saying the coupon was redeemed. 4. The next command is picked up and all events are applied before processing. At this point, the command is no longer valid so you reject and send an event saying that an attempt was made to redeem a redeemed coupon. 5. Do the same for subsequent requests.
To me, this approach is safer and easier to reason about. You have a log of the fact that someone made the attempt so you can report on this. Not sure you get that benefit from a stored procedure and a transaction unless you build it in and then increase the running time of the open transaction.
It's not necessarily different than using a normal RDBMS, right - you could do a check in SQL outside a transaction and end up writing multiple times. But with an RDBMS, you can easily solve the situation by turning on a transaction and leaving no question about things.
This is why things like VoltDB ("NewSQL") are pushing to keep SQL and ACID, and figure out a way to scale, instead of throwing it all aside and making the developer deal with consistency issues.
It's not that you can't end up with the same functionality using eventual consistency, just that it's harder. Just look at Amazon's "apology based computing" (I think that was the name) and how they structure their systems to be resilient by being able to resolve multiple conflicting commands in a proper way (deciding, without communication, which command wins, figuring out rollbacks, etc.) It's fantastic, and perhaps it's the only feasible way to operate at their scale. But it's also a hell of a lot more complicated than "UseTransaction = true".
(So my predictions/guesses: If developers that'd otherwise use a traditional ACID RDBMS switch to non-ACID (BASE?) systems, they'll end up introducing bugs due to the shifted responsibility of handling data consistency. And seeing how big servers are, and even how far sharding can take you with normal RDBMS, the scale at which people "need" to drop ACID is probably far higher than the point at which people are dropping it.)
I guess if you were using an append only log that recorded the exact timestamp of the transcation, your datastore would eventually reconcile that for example promo code 1 was applied twice. But what do you do then? Rollback the 2nd application of promo code and deduct the credit from user account?
Where would the logic for that be programmed?
In most systems the reward is zero, so you can infer if a person has taken the time to submit a bug report it is because he/she is invested in seeing it fixed.
Context: I work at a decent sized company in SV on this type of problem.
Properly designed bug bounty programs are a cornerstone to any company who remotely cares about the security of their product, period.
The idea of misaligned incentives due to poor bug reports being free to submit is ignorant - and worse toxic, because it sounds so true to an executive who has no actual understanding of the issue.
A quality bug report should take no more than 1 minute for a reviewer to look at and know if it's really a bug or not. If it can't, it should be rejected saying provide more clear details. For example a dom based xss attack could be reported with just a target URL and it is quite clear what the problem is. That would take 10 seconds to analyze.
Additionally, most bugs reported to most decent sized companies are reported by someone who has previously reported a bug to the company before. If someone is constantly reporting good bugs or the opposite, its quite easy to prioritize which of those individuals gets their emails read first.
http://blogs.msdn.com/b/oldnewthing/archive/2011/12/15/10247...
(I've never been on the receiving end of a security mailbox, so I have no personal testimony as to the reasonableness of this approach.)
The bugs expressed in this post, such as the duplicate account creation one, are not going to impact a company's bottom or top lines, so it is questionable whether it warrants a small merit award which is the idea that spawned this thread.
Now that I've sufficiently named my experience, allow me to give my side:
1. You will never receive $100,000 for selling a vulnerability in PayPal. You probably couldn't even find a buyer for it on the "black market." I have explained why repeatedly on Hacker News before, so I'm just going to link this: http://breakingbits.net/2015/04/01/understanding-vulnerabili...
2. Bug bounties are not always a net positive for an organization. They are also not a cornerstone of good security posture. A foundational focus on robust software security would start with various other things until the financials are worked out and there is someone knowledgable to read incoming reports.
Only 7% of submitted reported to companies for a responsible disclosure program are valid. This is especially true for paid programs, where the validity percentage often drops to 3% or 4%. Loads of people who know nothing about software security try to find bugs, desperate for the gold rush of bounties they see headlining places like HN. They submit spurious reports and as a result the signal to noise ratio of responsible disclosure is fantastically bad. What this means practically is that the average organization spends between 50 and 300 hours a month investigating incoming security reports.
You can quickly see how the cost adds up here. I'm not an executive or manager trying to cut costs - I've managed bug bounties for plenty of startups and Fortune 500 companies. I've also reported bugs that loads of people tell me I could have sold for "millions" of dollars - and received nothing for it.
I love bug bounties. I run them, I participate in them. But they can be a frivolous waste of time for development teams without a solid enough grasp of security to review incoming reports, and a waste of money in the worst case.
3. I'm sorry, but you lose credibility by claiming most security reports can be qualified in a minute or less. You can certainly throw out many in a similar time frame, perhaps five minutes, but real vulnerabilities? No.
If I report a server-side request forgery in your API that requires a very specific set of actions to occur in an obscure, undocumented application function, you will not qualify this quickly. Unless you are literally verifying a CSRF issue, it is completely unrealistic to assume this.
A race condition will not be qualified in a minute. A buffer overflow will not be qualified in a minute. Budget an hour per report, and be happy when you come across the reports that take you a few minutes. XSS and CSRF are comparatively simple to verify with a good report, yes, but most other classes aren't.
Let's add to this the folks who can find great vulns but write bad reports. No exploit code, but he found something real? Good luck verifying. I spoke to a fellow infosec engineer the other day and he told me he spent an entire morning out of his work day verifying a report that came in. Not patching or even triaging mind you - verifying. Most security teams do not have the olympic level efficiency and skillset diversity that Google's and Facebook's do - it is unreasonable to assume a report can or even should be verified quickly.
This is all to say that I believe your outlook is not consistent with reality, with all due respect. Bug bounties are not a simple decision to make. I've seen development teams swamped, overwhelmed and jaded from the reports they receive.
1. Code sent.
2. Check if valid.
3. Redeem code.
4. Invalid code.
Now if i send 10 requests at the same time with the same code maybe 4-6 will hit the code part after 2.
And your window of opportunity is the time it takes to go from 3 to 4. Sometimes certain tasks are put inside async queue, you have a slight delay to your database server or you need to wait for db replication to kick in.
Because normally there is no code part to recheck how often this code was used.
If you introduced some combination of a user ID and promo code, then it won't prevent a race of one user firing many queries with different promo codes and stacking them up. It would, however, fix the original problem.
The problem is concurrency. Whenever you have multiple things happening at once, you have concurrency and programming concurrent system is always really hard.
Unfortunately the software industry has never really got a grip on this problem and there are lots of developers who have never really studied multi-threading at all. That's a problem, because it's something that takes a lot of practice and you have to just incorporate it into the way you think. After a while you do get a sixth sense for race conditions, but you'll still write racy code from time to time anyway. It's just tricky to get it right 100% of the time.
spdy has already outlined what is happening here, but this problem is something that is covered in literally any introductory course to database systems or multi-threaded programming. If you have two threads (or processes) in flight simultaneously that are reading and writing to a shared data store, then you need some kind of mutual exclusion. That can mean a lock:
1. Request for /reviews/add is received.
2. Database lock on the page reviews table is acquired.
3. Check if the user has already posted a review. If so, release the lock and abort (any good framework will release locks for you automatically if you throw an exception).
4. Add review to table.
5. Release lock.
At the point where the lock is acquired if another web server is in the middle of the operation, conceptually speaking the first one will stop and wait for the table to become available.
Real implementations don't actually "stop and wait" - that would be too slow. They use database transactions instead where both web server processes/threads proceed optimistically, and at the end the database will undo one of the changes if they detect that there was a conflict .... but you can imagine it as being like stop and wait.
Of course once you have concurrency, you have all the joy that comes with it like various kinds of deadlock.
It's funny, a few days ago I was letting my mind wonder and ended up thinking about web apps for some reason. Oh yes, now I remember, I was thinking about concurrency strategies in a software library I maintain and how to explain it to people. And then I was thinking how hard multi-threading is and how many people are selling snake-oil silver bullets to it, and started to wonder how many web apps had race conditions in them. And then I just discarded the thought as one of no consequence and got on with my day, haha :) Perhaps I should have tried to earn a bit of money this way instead.
for i in `seq 1 16`;do
curl.*& #copied from chrome dev tools. & to background
doneI have considered writing a program that will let me send of a bunch of HTTP requests at once, but wait to close all the connections at the exact same time. That would probably be the most effective way to trigger race conditions.
There are no guarantees in the SQL standard that queries with subqueries should be atomic.
The only truly safe way to protect yourself is to fix the schema in a way that you can make use of unique indexes. Those are guaranteed to be unique no matter what.
Or put the whole thing in a transaction, right?
You could put the transaction in SERIALIZABLE mode, but that would mean that your database has a lot of additional locking to do which you might or might not want to pay the price for:
Your two-part query now block all other transactions from writing to the table(!) and conversely also has to wait until everybody else has finished their write operation.
Doing an opportunistic attempt with READ COMMITTED and reacting to the unique index violation (official SQLSTATE 23505) is probably the better option.
Resist the temptation of READ UNCOMMITED in this case because that might lead to false-positives as competing transactions might yet be aborted in the future.
If you're trying to build a social network that's full of graphs and edges between them ...... good luck. Google developed technologies like MegaStore and Spanner to handle this. Before it had those, it used huge sharded MySQL instances.
You can then get the exact request by using Chrome developer-tools. (Find the POST-request in the network-tab, right-click and select copy as cURL)
less cynical answer: Commonly you already have some kind of means to handle races - locking, transactions, some other variety of extra check - and the fix for newly discovered races is "oh, I didn't realise that could happen. add lock"
Adding a random time to sleep might work, but some requests would run noticeably slower.
It'd be much better to make sure you're updating the same unique key and/or use the DB's conflict resolution system.
- update
thanks for the downvotes guys. keep up the good work
One time they paid me $5000 for a bug I never could have found, but they did internally based on my low severity report. (http://josipfranjkovic.blogspot.com/2013/11/facebook-bug-bou...)
(https://www.fb.com/818902394790655)
They probably got a couple people working exclusively on bug bounty reports. I also have to say they did a great job changing communication channels from emails to tickets which show in /support/, it is way easier now. The downside is that you must have a Facebook account, not sure if it was needed before the change.
stop jumping into the hate wagon everybody
HN can be frustrating if you provoke it. The problem isn't so much what you said as how you put it: the combination of dismissive tone and superficial content puts readers here on edge, because too many comments are like that and we all find them annoying. As a result it's easy to have your good intentions misread. If you had explained the thought process behind your comment, I think it would have been received better.