Go to http://www.autoaccessoriesgarage.com/Seat-Covers/
Use the picker to pick a particular make and model: http://www.autoaccessoriesgarage.com/Seat-Covers/_Acura-RDX?...
So far so good, no problem. Your browser now has a cookie that says you're interested in just this make and model. Now for the problem: use the nav links to go to "Cargo trunk liners", and where do you land?
http://www.autoaccessoriesgarage.com/Cargo-Trunk-Liners
That's cloaking -- it's not showing you all of the liners, just the ones relevant to the make and model you picked earlier. Instead, the site should add _Acura-RDX?year=2008 to the URL, just like before.
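To make the suggestion concrete, here is a minimal sketch of nav links that carry the vehicle in the URL instead of relying on the cookie. The function name `category_url` and its parameters are made up for illustration, and I'm assuming the other categories follow the same `/Category/_Make-Model?year=...` pattern as the Seat-Covers URL above; I haven't seen the site's actual code.

```python
from urllib.parse import urlencode

BASE = "http://www.autoaccessoriesgarage.com"


def category_url(category, make=None, model=None, year=None):
    """Build a category link with the vehicle filter in the URL itself.

    Hypothetical helper: the path layout is assumed from the Seat-Covers
    URL pattern, not taken from the site's real routing code.
    """
    url = f"{BASE}/{category}"
    if make and model:
        # The selection is visible in the address, so a crawler and a
        # visitor with a cookie both get the same page for the same URL.
        url += f"/_{make}-{model}"
        if year is not None:
            url += "?" + urlencode({"year": year})
    return url


unfiltered = category_url("Cargo-Trunk-Liners")
filtered = category_url("Cargo-Trunk-Liners", "Acura", "RDX", 2008)
```

With something like this, the unfiltered URL always shows every liner, and the filtered view gets its own address.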
Why do search engines care about this stuff? Now imagine you type [auto accessories cargo trunk liners] into your favorite search engine, and the result is http://www.autoaccessoriesgarage.com/Cargo-Trunk-Liners ... what does the search engine think you'll see? It has no idea, really.
https://productforums.google.com/forum/#!searchin/en/cookie$... (see the response marked as best answer, by Matt Cutts, head of web spam at Google).
If I had never been to the site, I'd land on the unfiltered page, which would be a good result. If I had a cookie (which seems to be a session cookie, from a quick look), then I was probably at the site recently, so the filters are likely relevant; if not, they are easy to change.
'Cloaking' has negative connotations and is more of a concern when there is an attempt to mislead search engines. In this instance, there is a big problem with your suggested fix -- the Panda algorithm would see many very similar pages, which might actually make things worse (which I agree is silly, as your solution would otherwise have some upsides, but there is often a trade-off in these situations).
The duplicate content problem you describe is fixable (edit: and is already a problem, I'm only recommending changing links, not adding any pages to the site.)
And by the way, there are plenty of websites that force crawlers to use cookies in order to crawl the site. I don't know how GoogleBot deals with that, but I bet it involves crawling with cookies... no matter what the forum post says.
http://www.autoaccessoriesgarage.com/Seat-Covers/_Nissan-Alt... http://www.autoaccessoriesgarage.com/Seat-Covers/_Hyundai-So...
The page looks identical, and if you think you're fooling Google into thinking these pages are very different with a couple keyword-stuffed paragraphs of text, think again. Now open both in separate windows and click a product. The product page itself doesn't change except for the vehicle name inserted with a cookie. This looks like you're just mass-generating category and product pages dynamically, which is probably what you're doing.
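A rough illustration of how close two template-generated pages are. This is only a sketch using Python's `difflib`, not how Google or Panda measures similarity, and the template text and vehicle names are invented for the example.

```python
from difflib import SequenceMatcher

# Hypothetical page template: only the vehicle name changes per page.
TEMPLATE = (
    "Shop custom seat covers for your {vehicle}. "
    "Every seat cover is made to order and ships free."
)


def page_similarity(vehicle_a, vehicle_b):
    """Return a 0..1 character-level similarity ratio between two pages
    rendered from the same template with different vehicle names."""
    page_a = TEMPLATE.format(vehicle=vehicle_a)
    page_b = TEMPLATE.format(vehicle=vehicle_b)
    return SequenceMatcher(None, page_a, page_b).ratio()


similarity = page_similarity("Nissan Altima", "Hyundai Sonata")
```

Even a crude measure like this scores the two pages as mostly the same text, since swapping the vehicle name is the only difference.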
Don't get me wrong, I feel your pain, and funny enough I've solved this EXACT SAME problem on a similar car accessory site. Maybe I can offer some advice.
Your main Panda problem is that you have a page for every type of product for every make and model. That's a LOT of nearly-identical pages. You need to consolidate them somehow. Easier said than done, right? You don't sell all products for all vehicles, and you want users to have an organic landing page when they search for something like "[make] [model] [accessory]".
Instead of generating these landing pages and making up text, I'd use a filter on your car covers page that sticks the user with a URL variable that stays with them until they change their make/model. This also frees you of the need to make up pointless mass-generated paragraphs.
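Roughly what I mean by a URL variable that sticks with the user: when rendering a link, copy the vehicle filter from the current page's query string onto the target. A minimal sketch follows; the parameter names in `FILTER_KEYS`, the `carry_filter` helper, and the example.com URLs are all assumptions for illustration, not anything from the actual site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query parameter names for the vehicle filter.
FILTER_KEYS = ("make", "model", "year")


def carry_filter(current_url, target_url):
    """Copy the vehicle filter from the current page onto a link target.

    If the target already names a vehicle, it is left untouched, so the
    filter only changes when the user picks a different make/model.
    """
    current = dict(parse_qsl(urlsplit(current_url).query))
    parts = urlsplit(target_url)
    query = dict(parse_qsl(parts.query))
    if any(key in query for key in FILTER_KEYS):
        return target_url
    for key in FILTER_KEYS:
        if key in current:
            query[key] = current[key]
    return urlunsplit(parts._replace(query=urlencode(query)))


link = carry_filter(
    "http://example.com/car-covers?make=Acura&model=RDX",
    "http://example.com/floor-mats",
)
```

The one category page does the filtering, and there is no separate generated landing page per make and model to fill with text.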
This truly is frustrating, because the site is actually functioning in a way that makes sense for the user, and Google is penalizing them for it.
I get that Panda can be helpful and can help identify "low quality" content. But the true definition of "low quality" changes depending on the industry and the category of products being sold.
A good algorithm should be able to distinguish between the various sites or topics of sites, and apply said algorithm differently, right?
It's definitely hard and algorithms will get better with time as Google understands more and more of the web, but in the meantime, give us a heads up so we can fix it :)
And I'm not sure if Google took that into consideration when they launched Panda. If they had, we wouldn't be seeing the intentional bloat.
But good write-up overall, love these case studies.
Search spam detection has improved over the years, but it's fundamentally aimed at detecting sites that "look like spam". In response, search engine optimization has become more about making clickbait sites look less like spam, even to humans. It's now hard to tell a clickbait journalism site, one filled by low-paid article rewriters, from one that has actual reporters. (Business Insider is owned by the founder of DoubleClick.) Looking at the superficial properties of a site is no longer a reliable spam indicator.
The big search indicator used to be links. That's what "PageRank" was about. Links stopped working because most links to business sites now come from social media and blogs, and those are really easy to spam. Anyone who runs a blog now can watch the phony signups and posts come in. There's a whole industry selling phony Google and Facebook accounts for SEO purposes. Google has responded by disallowing many sources of links, with the result that the remaining link data is sparse for many sites.
Google isn't looking at the business behind the web site. Here, Auto Accessories Garage sells auto parts. Find the business behind their web site, and you can verify that they are in the auto parts business. Their site is full of auto parts. Therefore, not spam. Google doesn't do that. That's why they failed Auto Accessories Garage.
At SiteTruth, we look at the business behind the web site. Here's what we're able to find out for Auto Accessories Garage.[1] This is the internal details page; users rarely look at this. We give them a good rating. We didn't, unfortunately, get a proper match to corporate records because their corporate name is Overstock Garage, Inc. (We don't have a full D/B/A business name database for dealing with such problems yet.) SiteTruth picked up the Better Business Bureau seal of approval on the site, cross-checked it with the BBB for validity, and noted the "A+" rating there. Not a spam site.
The process is completely transparent. The link below lets you see all the data SiteTruth looked at for Auto Accessories Garage. Because it's checking against hard data from external sources the site can't control, there's no need to be mysterious about how it works. There's a vast amount of data available on businesses. If you tap into Dun and Bradstreet (we can do this, but can't turn it on for public viewing by free users) you get in-depth financial data on companies. That allows real supplier evaluation, far beyond what Google can do.
The SiteTruth approach does a good job on real businesses that sell real stuff. There are objective measures for such businesses - revenue, years in business, BBB ratings, even credit data. Google doesn't use those, and Google fails real-world businesses because they don't.
If you want to try looking at SiteTruth ratings, try our browser add-on from "sitetruth.com". We put those ratings on search results from Google, Bing, Yahoo, DuckDuckGo, etc. Now on Firefox for Android, too. End self-promotion.
[1] http://www.sitetruth.com/fcgi/ratingdetails.fcgi?url=www.aut...