There's a reason Google won't give you a full look at a page's spamminess rating, and a reason reddit is open source except for the spam filtering bits. With any blacklist filtering routine, transparency puts you at an inherent disadvantage. Instead of needing a million trials to see what passes and what doesn't, spammers find the answers already there.
...or they aren't really, and this is just a PR scheme by Yelp to claim more transparency with the illusion of fairness for review visibility. To me this is the only real possibility here.
If you make the list open, then honest people can see it too.
The only reason you'd keep the list of what's spam private would be to maintain (perceived) infallibility. Every filter makes mistakes; only some filters let you see them.
I can understand why they don't offer any sort of manual review -- how is a staffer who has never patronized that business capable of determining if a review is spam or legit? I wonder if it would be any better if questionable reviews for a given business were put up to a vote by "more established" users (by whatever metric they use to determine that) who have also reviewed the same business. Let people who have actually been there determine whether it's spam from someone who never walked in the door.
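Roughly what I have in mind, as a minimal sketch. Everything here is made up for illustration: the `Voter` fields, the `min_reviews` threshold standing in for "more established", and the `quorum` are my assumptions, not anything Yelp actually does or exposes.

```python
from dataclasses import dataclass


@dataclass
class Voter:
    name: str
    review_count: int            # hypothetical stand-in for "more established"
    reviewed_business_ids: set   # businesses this user has reviewed


def eligible(voter, business_id, min_reviews=20):
    """A voter counts only if established AND has reviewed this business."""
    return (voter.review_count >= min_reviews
            and business_id in voter.reviewed_business_ids)


def tally(business_id, votes, min_reviews=20, quorum=3):
    """votes is a list of (Voter, is_spam) pairs.

    Returns "spam", "legit", or "undecided" (too few eligible votes, or a tie).
    """
    counted = [is_spam for voter, is_spam in votes
               if eligible(voter, business_id, min_reviews)]
    if len(counted) < quorum:
        return "undecided"
    spam_votes = sum(counted)
    if spam_votes * 2 > len(counted):
        return "spam"
    if spam_votes * 2 < len(counted):
        return "legit"
    return "undecided"
```

Votes from people who never reviewed the place, or who have barely reviewed anything, just get dropped before the count, so a questionable review only gets decided when enough actual patrons weigh in.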
Most of the filtered reviews I have seen so far are, in fact, pretty spammy or blatant advertising.
This will definitely make people appreciate the filter more.
OTOH people, not machines (thanks to the captcha), could read tons of filtered reviews and come up with cleverer ways to beat it.
Could be interesting to see what kind of reviews are being flagged anyways. I've always thought that review stuffing is way more common than people would think...