That's a really brave thing to do, and it deserves serious credit.
A note for people outside the academic machine learning field: NIPS is widely believed to be on a different level from the rest. During my PhD, my advisor used to say that a NIPS paper was on par, resume-wise, with a paper published in a good journal. The difference is especially striking if you've had the chance to attend other conferences (including, alas, IEEE-sponsored events), which are, with very few exceptions, fairly terrible from a scientific point of view.
You have limited speaking slots. You have to guess what the conference attendees will find interesting this year. You are biased by your own particular interests.
Also, some of the submitters are your friends or colleagues, and even if they haven't already told you what they're submitting (unlikely, since your relationship is based on talking about this stuff), you can tell a paper is theirs in less than 250 words...
However hard a conference tries to sell the fairness and objectivity of its process, you can't anonymize or double-blind these things away.
These are really great observations with deep implications. The same pattern might apply in other aspects of life, such as interviewing candidates, choosing a mate, or buying a shirt. In all these cases, a similar distribution might be at work.
I have often wondered why it is so hard to have less mediocrity in the world. Why isn't every book, t-shirt, or smartphone just great? One obvious reason is that a lot of the time people create something out of obligation, such as a demand from their job, rather than out of an urge to create. The follow-up question, then, is: if no one had any obligation to create, could the distribution above flip? In that scenario, would we have, say, 70% great papers, 5% mediocre ones, and the rest a coin toss?
Different people have different ideas of what "great" means. Not everyone thinks the Harry Potter books are great, while many do. We see the same thing in movies, where a film does poorly at the box office while the critics praise it.
The definition of greatness also changes over time: "It's a Wonderful Life", now considered one of the most critically acclaimed films ever made, earned only mediocre box-office revenue when it came out.
Greatness is sometimes situational: "Dan Brown ... is the undisputed king of airplane books — the not-too-heavy, not-too-long potboilers perfect for a long layover." If you don't fly, then perhaps there's no occasion when Brown's works would appeal.
Travel has its own category of "good enough." Visiting Germany once, I bought a book from the limited English selection not because it was great, but because it was something to read on the long train ride.
A lot of people watch sports, but surely not every game is great, so greatness can't be the only thing that holds someone's interest.
Since it's hard to predict greatness, people will test out ideas to see if there's a response. Sometimes this can lead to feedback and improvements. Sometimes this testing is through writing clubs. Sometimes (as with smartphone apps) this is with the market itself.
My favourite peer review story is from when I submitted one of my articles to the top journal in my field at the time (Applied and Environmental Microbiology). It came back with the usual trivial peer-review changes ("cite this irrelevant paper of mine", etc.), which I made (that's nearly always easier than arguing with the reviewers). The editor then made a mistake: instead of sending the updated manuscript back to the original reviewers, they sent it out to a new set. The funny part of the whole exercise was that the second set of reviewers called the first set idiots and told me to change everything back.
I'm sure more than a few people won't have any idea what "NIPS" stands for. (It's the Conference on Neural Information Processing Systems.)
From the point of view of an author, if you get a paper rejected that you know is worthwhile, you just have to make whatever improvements you can and then submit it again.
SIGMOD made an interesting move this year by accepting every paper that met its standards. However, not every accepted paper gets a presentation slot at the conference.
Of course, this assumes an objective standard for what constitutes a "good" paper. As others have pointed out, the only really meaningful standard is "how does this paper compare to other work being done in this field?" So it's also reasonable to think of NIPS's goal as simply presenting the best papers written in any given year, not as bestowing a strictly defined stamp of objective quality.
It could be that they had a first pass at a schedule, used that to set a first cut for the reviewers, then adjusted the schedule once they figured they needed to add another 42 papers.
Also, not being accepted does not mean a paper is poor. They used a ranking system, so it only means that other papers appeared to be better.
I've heard from lots of professors that a good conference gets a lot of "very good but not great" submissions, and the job of the program committee is to pick the best among these. I wouldn't be surprised at all if minor personal preferences (which from the outside look rather random) ended up having a big say in the fate of a particular paper. Maybe some reviewers are more forgiving of poorly written but technically strong papers, maybe some reviewers consider certain fields "dead" and so are biased against them, and reviewers hold wildly different standards on how extensive an experimental analysis has to be to be acceptable, ...
It doesn't take a "week off" to notice that a paper is gibberish, at the very least.