Bam, no more wareseeker or efreedom.
This would solve a lot of people's complaints in one fell swoop.
There are greasemonkey etc. scripts to do this, but they're tied to a single browser on a single machine. A global filter (like in gmail) would be so much more useful.
Would this be particularly hard to do?
Plus, we are talking about a company whose core business demands that it can identify groups of bad-faith voters. Given time, they may find a way to incorporate this data safely into the ranking data (if anyone could, it would be Google).
And I know there are extensions to do this (mine mysteriously stopped working recently), but doing this on the client-side in a way that's bound to a single browser install just seems wrong to me, especially for Google.
As mentioned above, then introducing the shared-ranking via the social graph would be the next logical step. It could be something opt-in'd to ease adoption.
Then, ideally (and this is my personal 'white whale' problem) it would be great to imagine something where the user could whitelist through no action of their own rather than have to do any work to block, i.e. use the result set 'hit' of what's clicked in the results to act as a personal ranking upvote.
There are some interesting engineering issues around per-user indexing, but hey, you wanted to work at Google, right?
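To make the "click as a personal upvote" idea concrete, here is a minimal sketch. It assumes a per-user click counter keyed by hostname; the `ClickRanker` class and its method names are hypothetical, and this says nothing about how Google actually ranks anything.

```python
from collections import Counter
from urllib.parse import urlparse


class ClickRanker:
    """Hypothetical per-user reranker: every click counts as an implicit upvote."""

    def __init__(self):
        # hostname -> number of times this user clicked a result from it
        self.clicks = Counter()

    def record_click(self, url):
        self.clicks[urlparse(url).hostname] += 1

    def rerank(self, urls):
        # sorted() is stable, so results from hosts with equal click counts
        # keep the order the search engine originally gave them.
        return sorted(urls, key=lambda u: -self.clicks[urlparse(u).hostname])


ranker = ClickRanker()
ranker.record_click("http://stackoverflow.com/questions/1")
results = [
    "http://efreedom.com/x",
    "http://stackoverflow.com/questions/2",
    "http://example.com/",
]
reranked = ranker.rerank(results)
print(reranked)
```

The user never has to block anything here; sites they keep clicking simply float up over time.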
efreedom is monetised by Google ads. Might seem like a problem to Google.
Let's say it starts with personal blacklists. Then trusted lists that you can subscribe to (AdBlock-style). Then word spreads and enough people are using it such that AdSense revenue drops 20-30% or more?
(IME, CTR on ads is much higher on these content-light sites than it is on more reputable sites.)
It's to Google's benefit that people end up on these pages, see a ton of ads, and then click on one out of confusion or desperation.
Facebook, on the other hand, has developed a system where nearly every user activity creates a new easily processed and meaningful connection between users or out to the web itself. And those connections are probably closer to representations of some kind of trust than "I email that person a lot".
Anyway, I'm not saying the sky is falling for Google, just that search appears to be changing for the first time in a while.
Traditionally Google seems to be against human-powered editing (as this would be), but I think that, as the black-hat SEOs run rings around them, it's badly needed.
What I'm trying to get at is, with all things equal, let's say Stack Overflow's and efreedom's SEO are on par with each other, shouldn't SO's reputation/inbound link ranks automatically trump things?
SO is not editing the material for SEO, they just have whatever content the users generated.
1) - It doesn't have to be extremely painful, just painful enough, such that true loathing is needed as motivation. This way, we filter out frivolous decisions. A few seconds pause would be enough.
2) - We need to let the reduced ad revenue do the job for us through the market. Anything else will be gamed much to everyone's detriment. Just empower people to remove the annoyance, and let the money do its thing.
Re 1, painful? WTF? The whole point is to make it quick and usable. I can already blacklist sites the painful way, by adding them to a Google Custom Search page. The whole point is I'd like a quick add-to-killfile button, like email clients have had for decades.
Why?
99% of users are non-tech oriented.
Those users will not really be aware of the specific problems with the search results, they won't understand the concept of a good vs bad result and they certainly won't bother to tweak/ban/filter their results.
The 1% that do care and are currently being vocal about it will start filtering their results and they will perceive that the problem is solved. They will stop making a fuss.
So now, the complaints have gone away, but 99% of users are still using the broken system, so the good sites that create good original content are still ranking below the scrapers and spam results for 99% of the users.
The problem must be solved for all (or at least the majority) of users.
(And you can't take the 1%s filtering and apply it to all users in some kind of social search because the spammers will just join the 1% and game the system)
Perfection being the enemy of good enough, and a common and valued and traditional mechanism to delay product shipment.
And Google might well be able to utilize information from that 1% of users that have sorted that out - 1% of a Really Big Number of searches, factoring for the folks looking to game the search results (downward, in this case) - to provide feedback back into their search results.
I disagree. Let's call it 95%.
>Those users will not really be aware of the specific problems with the search results, they won't understand the concept of a good vs bad result and they certainly won't bother to tweak/ban/filter their results.
So have only people that have enabled the advanced features of Google search ban sites. All of a sudden only people that "get it" are the ones that can ban.
>So now, the complaints have gone away, but 99% of users are still using the broken system, so the good sites that create good original content are still ranking below the scrapers and spam results for 99% of the users.
So we need to use the votes to stop the spammers.
>(And you can't take the 1%s filtering and apply it to all users in some kind of social search because the spammers will just join the 1% and game the system)
Sure you can. If you couldn't then Reddit would be a wasteland of ads, but it isn't. They only have 4 or 5 engineers there and they can write code that will stop vote rings, let alone Google.
It's actually a pretty simple exercise to stop vote rings, unless the anti-vote ring code is open sourced, but even there it should be possible.
I think 99% of email users have not been adequately trained in why or how they should report spam, and even if they were I think most of them would still not care enough to actually do it with any regularity.
When pushed many may acknowledge that they know it exists, they will probably even be able to find the button when asked if given a chance. But they won't remember to do it when they see spam, they'll just ignore it and move on to the messages from people they know.
With search results, the spam is more often than not Made for AdSense sites that the average user doesn't realize are pure garbage. Then there are the mass-produced content sites like eHow that most technical people realize are worthless, but the average user loves. It isn't often you see Viagra sites popping up in searches for woodworking. It does happen occasionally though.
So no, I am pretty confident a majority of users would not effectively utilize a feature like that.
And when it comes to search results, people are browsing and clicking through AdSense-filled "landing page" sites. Most of them think that it's their fault that they couldn't find the thing they were searching for.
What I think really needs to be exploited is a ring-of-trust aspect. I'd like to have a Hacker News ring where all of us on here work together to remove the spam from our results and let Google see what we are taking out; maybe that will help them improve their algorithms.
Why not apply that reading-level algorithm to users' Gmail data and public social network profiles, estimate each user's IQ, and then give those at the top of the pile "result burying" moderator privileges?
Confirmed user accounts (cell phone verification) combined with other algorithms, such as profile age and activity, could make spamming sufficiently complex to de-incentivize all but the most illicit spammers.
Users at the bottom of the IQ pile (non-logged-in users, based on past search data and geo-location socio-economic status) don't even get the option to bury results. Which, by the way, I think is more like 20% of US internet users than 99%.
CSE wants you to list the sites that you want to search. Of course, you can't default to '*' or '*.*'. They even stated that '*.com' and '*.org' etc. won't return any results. That's unacceptable. Secondly, even if you could configure it meaningfully, it seems pretty hard to configure your browser's search bar to use this CSE instead.
And that's what I think most people use for searching. At least I do.
Facebook got it right this time: with each post, there's an option to hide that post, that person, that application, or that site which posted the post. One click that means "don't show stuff from them anymore": that's what Google needs, too.
By this I mean I added it to my browsers, but I still use regular Google search daily. If the results are laden with bogus sites, then I switch over and start again, weeding if necessary.
Initially I thought I'd use GCS all the time, but it lacks the Google menu (Images, Maps, etc) which comes in handy more often than I expected. I use GCS most for code/development related searches.
http://news.ycombinator.com/item?id=2075437
It's not so much that it's a knowledge vacuum, just that someone didn't read the whole thread before replying.
If you go to the CSE website and select 'Advanced' and then download annotations, you can export the list of sites you've excluded.
Further you can make the exclusion list ("annotation list") into a feed - so it is entirely possible to implement the kind of user-generated blacklist of sites which has been discussed here.
GCS is really easy to set up - takes only a few minutes. I spent the most time hunting down rogue sites - which was actually kinda fun and cathartic.
Big tip: keep an easy-to-get-to link to the GCS Sites Control panel, so it's easy to add new sites. I've added ~40 more in the past two months.
Everyone is ripping off someone's content.
And just to be accurate here, SO content is creative commons (created by the community). Are those just cheap words?
These add significant value to the original content IMO
In my experience though the sites that are taking the content are ad ridden messes which remove value rather than add anything.
Maybe it's like that in your field, but in Mac dev questions you're fairly likely to get answers from established OS X developers, and even Apple employees.
Do people realise that if Google is your referrer, you can scroll all the way down and see the solutions to the question?
Actually, I sort of sympathize with the predicament EE faces. They want to show up high on Google search results, because that's how they get new customers ... but they don't want to give away their content for free.
Here's an opportunity for a search engine start-up: Allow users to search (by default) for only free content -- but also allow them to search, if they so choose, for content that's behind a paywall. Pay-for-access sites would love something like this.
Also there is this form for reporting spam sites: https://www.google.com/webmasters/tools/spamreport
Integrating the above into standard search results would be difficult unless it was restricted to users with good "karma". That might be possible in our increasingly socially networked world.
Perhaps we need to frame the discussion differently, considering what the searcher wants, rather than "spam-free hits".
If Google use that information to gradually adjust their ranking overall, then fair enough -- won't affect me, I can't see them anyway.
EDIT: Even if they don't let that affect everyone else's results (because of gaming), then I still don't care, I still don't see the crap in my results ever again.
As a workaround, try searching for "[any widget] sucks" and "[any widget] good".
EDIT: tying this to other discussions on the topic, it's a symptom of Patio11's observation that natural language search doesn't work very well. If you want to find something, you need to paint a picture of what it looks like, rather than asking a question about it.
People are also a lot less tolerant of spam in their inbox than they are of irrelevant search results.
I can even do that already with Google's Custom Search, all that's missing is a little 'block this site' button. Instead I have to go and configure Custom Search manually for each URL mask.
Presumably it would do the same exact thing as '-site:foobar.com'.
Usability-wise, though, it's not nearly as much use as a 'ban' button next to each result would be. But it shows Google already have the infrastructure and code that would allow this -- they just need to make it instant to use.
EDIT: The other downside of this is you lose a load of bells & whistles, e.g. previews, "pages from the UK" (without typing), icons for images/news/etc. Time will tell if I miss those.
It includes only sites which are known to use (mostly) good English. It's designed for English learners/teachers who want to find correct usage examples without risking exposure to Yahoo-Answers-style English.
And doing this would spawn a lot of people's complaints in one fell swoop.
If you owned a site, and created enemies, they could band together and flag your site as spam.
You are totally correct. I completely missed that. Maybe I should drink a bit of coffee and wake up ;)
Although... I have a suspicion that at some point it would affect non-logged-in users. Many logged-in users banning a site is a signal that may affect the global results. Maybe in the same way as marking spam in Gmail...
If they're not looking into integrating that nicely into the existing search results page (not a separate form that the average user will never find or use), especially after all the internet chatter about it recently, then they definitely should make that a top priority in 2011. I definitely don't want them to do a rush job on it though. I don't want competitors to start reporting each other as spam in search results to try and game the system even further. I'm assuming they have anti-gaming measures in place for Gmail, so they won't be completely starting that from scratch...
At best G could use the information as a list of potential spammers and filter domains manually, but I really can't see this being automated without giving the SEOs another weapon.
If the best and brightest (arguably) on the planet can't figure out how to filter search spam with algorithms, what makes us think we can mimic true human intelligence any time soon? (I think it will happen, just not as soon as some claim.)
Basically, it's the difference between PvP and PvE.
Why not allow individual users to hide sites from their own search results and save the info in their google account? For example, provide a "hide this site from my results" link next to each result. Each person decides which site they don't want to see and SEO and global results remain unaffected.
If you clicked it, that result wouldn't appear for you again. I used it all the time.
Then, lately it's gone. Maybe I was part of a small, randomly-selected test group?
That experiment was replaced by Google "Stars" in March 2010 because, according to Google:
> In our testing, we learned that people really liked the idea of marking a website for future reference, but they didn't like changing the order of Google's organic search results.
http://googleblog.blogspot.com/2010/03/stars-make-search-mor...
I personally think there is much more going on here than Google admits.
Is there something I'm missing here?
It's not in Google's financial interest to provide this feature, but it already exists rather trivially.
Do you really think this feature doesn't exist for Firefox?
Further, it'll even sync with your google account making it global if you give it access.
http://googleblog.blogspot.com/2008/11/searchwiki-make-searc...
Would you want to type out a string of 20 or 30 excluded sites every time you search Google?
Ranking clearly isn't going to be good enough, because algorithms can be worked around and gamed.
Who decides if a site is spam?
So is free speech dead under your proposal? What if I build a site that criticizes the Governor of your state? Or a federal agency? What would prevent my site from being blacklisted under your proposal? Even if I had great content (your argument is about poor-quality content), my site could be voted into a black hole in a few hours. Let's think about this carefully. Is that the price we are willing to pay to get rid of EE?
I never said anything about it affecting other people's results...
What I found interesting: I was doing a search on something I normally have no interest in (a sewing machine manual for my wife) and I was amazed by the level of spam I was encountering.
We have no idea how bad the problem is for others whose topics we do not usually see. The web is far more full of spam than we even realize.
a) Google could warn you if it thinks the sites you have blacklisted seem to have regained credibility.
b) Google could suggest additional sites you may wish to blacklist, based on other user blacklists.
c) Google could allow outside parties to curate blacklists.
d) Google could list the most commonly black-listed sites publicly. For the webmasters that find themselves listed who want to run an actual honest business, this is a good sign they should change their tactics. For the folks that aim to spam and profit... well screw those guys.
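Point (b) is the easiest one to sketch. Assuming each user's blacklist is just a collection of domains, a naive co-occurrence count gives suggestions: look at users who already block something you block, and tally what else they block. The function name and parameters below are hypothetical, and it has no defence against sockpuppets.

```python
from collections import Counter


def suggest_blocks(my_blacklist, other_blacklists, min_overlap=1, top_n=3):
    """Suggest domains that users with blacklists similar to mine also block."""
    mine = set(my_blacklist)
    counts = Counter()
    for other in other_blacklists:
        other = set(other)
        # Only listen to users who share at least min_overlap blocked sites with me.
        if len(mine & other) >= min_overlap:
            counts.update(other - mine)
    return [site for site, _ in counts.most_common(top_n)]


mine = ["efreedom.com"]
others = [
    ["efreedom.com", "wareseeker.com"],
    ["efreedom.com", "wareseeker.com", "ezinearticles.com"],
    ["unrelated.example", "other.example"],  # no overlap with mine, ignored
]
suggestions = suggest_blocks(mine, others)
print(suggestions)
```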
So for example, I could define -site:efreedom.com as an operator to be applied silently for every search I make.
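A rough sketch of what "applied silently" could look like on the client side: keep a list of blocked domains and append a `-site:` exclusion for each one before the query goes out. The helper names are made up for illustration; the only thing assumed about Google is the `-site:` operator and the usual `q` parameter.

```python
from urllib.parse import urlencode


def build_query(terms, blocked_sites):
    """Append a -site: exclusion for every blocked domain to the user's terms."""
    exclusions = " ".join(f"-site:{domain}" for domain in blocked_sites)
    return f"{terms} {exclusions}".strip()


def search_url(terms, blocked_sites, base="https://www.google.com/search"):
    """Build the full search URL with the exclusions baked into the q parameter."""
    return f"{base}?{urlencode({'q': build_query(terms, blocked_sites)})}"


blocked = ["efreedom.com", "experts-exchange.com"]
print(build_query("python sort dict by value", blocked))
print(search_url("python sort dict by value", blocked))
```

The query string grows with every blocked site, which is why this only works as a stopgap for short lists.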
My theory is that these complaints are coming from specific interest groups, not the general public. For example, spammy-content is created and targeted at a developer/programmer audience, and that is the source of some of these complaints.
So my suggestion is Google should platformize their search; and give out dedicated search instances to specific communities. The community should have enough levers to govern/influence what is spam or not. In addition, the community can promote certain high-value resources, which are otherwise unfairly listed in search results. Invite some high-profile communities for a test-run, and let the communities make their own choices.
The public Google can still handle the general public. This can also bring in some transparency in the way spam is determined.
1. How does Google make money? Search Ads.
2. How do people click on search ads? Bad real search results.
I haven't used it because I don't want Google to remember my search history. But if you are willing to stay logged into Google (which would be required for your proposal), it would not be an issue.
If, though, we could whitelist sites, it seems that results would get cleaner faster. I don't care about how many bad sites are out there, as long as helpful sites make it to the top. Plus, I typically use just a few sites to access reliable information anyway (the number's about 7, right?), so if I can whitelist results from those sites, I'll probably find my desired content more quickly.
What about the case when there are 30 spam sites listed before 1 good site? That hasn't happened too often for me. Instead, the results I'm looking for are usually just 4 or 5 spots down the front page, and very occasionally on the second page.
White listing seems like it would still be faster and easier for now.
That's the only negative I can think of - other than that, I say bring it!
Let it learn what I think is a good result for my needs.
If you make it a little bit social, make sure you weight other people's opinions by how much they agree with my own in other areas (making it harder for sockpuppets to muddy the waters)
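Something like this, as a minimal sketch. It assumes each user's opinions are stored as a dict of site to +1 (good) or -1 (spam), and it weights another user's vote by the fraction of shared sites on which they agree with me. The function names are hypothetical and the similarity measure is deliberately crude.

```python
def agreement(mine, theirs):
    """Fraction of sites we both rated where we gave the same verdict (0.0 if no overlap)."""
    shared = set(mine) & set(theirs)
    if not shared:
        return 0.0
    same = sum(1 for site in shared if mine[site] == theirs[site])
    return same / len(shared)


def weighted_score(site, mine, others):
    """Sum other users' votes on `site`, each weighted by how much they agree with me."""
    total = 0.0
    for votes in others:
        if site in votes:
            total += agreement(mine, votes) * votes[site]
    return total


mine = {"stackoverflow.com": 1, "efreedom.com": -1}
like_minded = {"stackoverflow.com": 1, "efreedom.com": -1, "wareseeker.com": -1}
sockpuppet = {"stackoverflow.com": -1, "efreedom.com": 1, "wareseeker.com": 1}

score = weighted_score("wareseeker.com", mine, [like_minded, sockpuppet])
print(score)
```

A sockpuppet that disagrees with me everywhere else gets weight zero, so its upvote for the spam site does nothing to my results.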
Well, sort of: you could block individual responses from coming up under a specific search term.
There was a little x by each result if you were signed into google and it said "never show this result again"
Not enough people used the feature for it to stick around...
I would love this ability but google please, good UI and consumer education. I love your features but don't love when they get taken away because users don't know they exist.
It lets me sync my config across multiple machines.
Has nice hacker-ish config. Basically a text file you can share with others. This is my current config:
# Make these domains stand out in results
+en.wikipedia.org
+stackoverflow.com
+github.com
+api.rubyonrails.org
+apple.com
+ruby-doc.org
+codex.wordpress.org
+imdb.com
+alternativeto.net
# SPAM - never show these results
experts-exchange.com
ezinearticles
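For anyone who wants to reuse a list like this elsewhere, here is a small sketch of a parser for it. The rules are inferred from the sample alone and are assumptions, not the script's documented behaviour: `#` starts a comment, a leading `+` marks a domain to highlight, and any other line is a pattern to block, matched here as a substring of the hostname.

```python
def parse_config(text):
    """Split the config into (highlighted domains, blocked patterns)."""
    highlighted, blocked = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("+"):
            highlighted.append(line[1:])
        else:
            blocked.append(line)
    return highlighted, blocked


def classify(host, highlighted, blocked):
    """Return 'block', 'highlight', or 'normal' for a result hostname."""
    # Assumption: blocked entries match anywhere in the hostname.
    if any(pattern in host for pattern in blocked):
        return "block"
    # Assumption: highlighted entries match the domain and its subdomains.
    if any(host == d or host.endswith("." + d) for d in highlighted):
        return "highlight"
    return "normal"


config = """
# Make these domains stand out in results
+en.wikipedia.org
+stackoverflow.com
# SPAM - never show these results
experts-exchange.com
ezinearticles
"""
highlighted, blocked = parse_config(config)
print(classify("www.experts-exchange.com", highlighted, blocked))
print(classify("stackoverflow.com", highlighted, blocked))
```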
http://userscripts.org/scripts/show/33156
You can also sync them for Firefox across multiple machines using Dropbox, as the preferences are stored in your profile (IIRC, in a javascript file).
I do hope those working on the algorithm are taking note.
on your local machine and/or remote server... and it's free software.
blekko ? try this query, http://blekko.com/ws/?q=debian duh ?
Oh, yeah – they pulled it.
This would be an awesome feature.
In fact, let there be a sea of hands all gesticulating wildly to present it.
And no, this doesn't solve the problem.