When I was at Altavista, we were also blocked from doing dynamic abstracts by cost.
Google's main advantages were:
- managed by the founders with a total focus on search and measurable results
- Google's hiring process produced a very strong team early on
- strong focus on controlling costs from the beginning (AltaVista's use of the DEC Alpha was a huge handicap)
At Google these three groups worked hand in hand and complemented each other's work. The eggheads came up with PageRank, the coders figured out how to make PageRank scale through massive parallelism via sharding and MapReduce, and the data center folks figured out how to make sharding cheap and fast through commodity PC-based servers and massive amounts of automation for management. In the end everyone was working at the top of their game to help everyone else. The result was that Google was able to deliver better results (PageRank) faster (MapReduce) and cheaper (automated commodity-hardware datacenters) than the competition.
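The sharding-plus-map/reduce idea described above can be sketched in miniature. This is purely illustrative (my own toy example, not Google's infrastructure); the shard count and helper names are assumptions:

```python
# Toy sketch: hash-shard a document collection, then run a
# map/reduce-style word-count aggregation across the shards.
from collections import Counter

NUM_SHARDS = 4  # assumed shard count for the example

def shard_for(doc_id: str) -> int:
    """Route a document to a shard by hashing its id."""
    return hash(doc_id) % NUM_SHARDS

def map_phase(docs):
    """Map: emit (word, 1) pairs for every word in every document."""
    for doc_id, text in docs:
        for word in text.split():
            yield word, 1

def reduce_phase(pairs):
    """Reduce: sum the emitted counts per word."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return totals

docs = [("d1", "cheap fast search"), ("d2", "fast search wins")]
shards = [[] for _ in range(NUM_SHARDS)]
for doc in docs:
    shards[shard_for(doc[0])].append(doc)

# Each shard maps independently; a reduce step merges the partial results.
merged = reduce_phase(pair for shard in shards for pair in map_phase(shard))
```

The point of the pattern is that the map phase is embarrassingly parallel across shards, which is what made cheap commodity boxes viable.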
There were lots of other fine details that led to Google's success, but in the end those core factors are what allowed them to deliver a better search experience to users (better/faster) and to be more competitive in the marketplace (a lower cost per search means more profit even with lower per-search ad revenue).
No one else in search was pushing on all the right pressure points the way google was, and the rest is history.
From the article: "In short, Google had realized that a search engine wasn't about finding ten links for you to click on. It was about satisfying a need for information. For us engineers who spent our day thinking about search, this was obvious. Unfortunately, we were unable to sell this to our executives. Doug built a clutter-free UI for internal use, but our execs didn't want to build a destination search engine to compete with our customers. I still have an email in which I outlined a proposal to build a snippets and caching cluster, which was nixed because of costs."
The engineers here had more than an inkling of what needed to be done. The problem was that this understanding didn't permeate the entire company.
Infighting and begrudging compromises only happen when the leadership is blind to the details.
I was VP of Engineering at Altavista in 2000, and I started the project to move to Linux. It wasn't easy because search engineering was populated by Alpha fans who were unswayed by the 10x cost advantage.
As late as 2001, I sat in multiple focus groups where all the enterprise customers said Linux was not yet ready for the datacenter. IBM's penguin campaigns were just beginning at that time.
Google's large scale use of Linux was groundbreaking when they launched in 1998.
Altavista was started by Paul Flaherty. One of his jobs was to find some way to showcase Alphas.
If I just want to know when the next episode of Big Bang Theory is out or what the weather is today I rarely need to even click on a result. For more obscure technical searches at work, Google still finds more answers.
But remember: the barrier to change for a search engine's customers is very, very low.
What the article doesn't say is that Inktomi had a dual-sided business: one side was caching proxies, the other was licensing a search API.
Inktomi decided to focus on the caching proxy business and de-emphasized their search product, only to watch the proxy business evaporate as internet bandwidth became cheaper/better.
The focus on a shrinking market (proxies) and the lack of focus on a growing market (search) killed them. Had search been a priority from the beginning, things might have ended very differently, with Inktomi creating their own front end.
As someone who worked on search quality at Google for some time, this bit jumped out at me as a terrible mistake. The correct way to judge results for the query [yahoo] is:
(a) Where is yahoo.com? At the top?
(b) There is no (b).
It seems like a slight difference, but it leads to the wrong priorities. For the query [yahoo], it does not matter if spam or non-spam is in spot #5. The only thing that matters is where you put yahoo.com.
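The evaluation idea above can be made concrete with a toy metric. This is my own illustrative formulation (not Google's actual quality metric): score a navigational query purely by the rank of the official site, so that what occupies spot #5 contributes nothing:

```python
# Toy navigational-query metric: reciprocal rank of the official
# domain, ignoring everything else on the results page.
def navigational_score(results, official_domain):
    """Return 1/rank of the official domain in results, or 0.0 if absent."""
    for rank, url in enumerate(results, start=1):
        if official_domain in url:
            return 1.0 / rank
    return 0.0

good = ["http://yahoo.com", "http://spam1.example", "http://spam2.example"]
bad  = ["http://spam1.example", "http://spam2.example", "http://yahoo.com"]

print(navigational_score(good, "yahoo.com"))  # 1.0
print(navigational_score(bad, "yahoo.com"))   # ~0.33
```

Under this scoring, swapping spam in and out of the lower spots changes nothing; only moving yahoo.com does, which matches the priority described above.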
It's a recording of a very very good talk by Inktomi co-founder Eric Brewer called "Inktomi's Wild Ride - A Personal View of the Internet Bubble"
- search is a commodity for licensing (making them resistant to launching a "cleaner" engine that would alienate their clients)
- what worked for a smaller internet (100 million pages) could scale appropriately with the growing internet (100 billion pages) without rethinking everything
- PageRank only helped relevance (in reality it was also about fighting spam)
I think Google is stuck in a rut of their own right now. Here's some faulty assumptions I think Google is making:
- users always want faster, more direct answers (rather than controlling the filtering/categorization of their searches)
- users want Google to predict what they mean rather than clarify what they mean
- algorithms > human decisions
That's a very power-user centric attitude, don't you think? As a power user I preferred to type long, complicated Sabre queries to find exactly which airplane flight I wanted. It was much faster, and I had memorized all of the complicated mnemonics. But that's not what a casual user would want to use.
Asking users to specify categories for what they want means requiring a certain orientation in their thinking which is shared by computer scientists and trained librarians. But to an average user, that's extra work. And think about how this might work if you're talking to an actual human librarian: if you start asking about TV shows, and then mention "The Big Bang Theory", do you think the librarian will ask you, "Did you mean the scientific theory, or the TV show?" That's only something a stupid computer would do. A smart librarian would take the context of the previous queries that you've made of him or her, and provide the right answer quickly and efficiently. Wouldn't you want the same thing from a search engine?
AltaVista still exists. It's awful, and it's powered by the Yahoo search engine. Which is pretty much the same thing, I suppose.
If I had to pick one reason why Google triumphed (and you can only pick one), I think it would be their PageRank algorithm. It added that extra bit of awesome-sauce to an already tasty stew.
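For anyone who hasn't seen it, the core of PageRank is a short power iteration. This is the textbook formulation, not Google's production implementation; the damping factor 0.85 is the commonly cited value:

```python
# Minimal PageRank power-iteration sketch (textbook version).
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each node to the list of nodes it links to."""
    nodes = list(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Base probability of a random jump to any page.
        new = {node: (1.0 - damping) / n for node in nodes}
        for node, outs in links.items():
            if outs:
                share = damping * rank[node] / len(outs)
                for out in outs:
                    new[out] += share
            else:  # dangling node: spread its rank evenly
                for other in nodes:
                    new[other] += damping * rank[node] / n
        rank = new
    return rank

# "c" is linked to by both "b" and "d", so it ends up ranked highest.
graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
```

The insight was that link structure is a quality signal no amount of on-page keyword stuffing could fake cheaply at the time.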
The Inktomi managers had the point of view of a generic MBA.
Everything followed from that difference of view.
>The Google CEOs have the point of view of an engineer.
And, more generally:
>The Google CEOs have the point of view of the people doing the actual work.
The lesson to take away from this is that one shouldn't try to manage what one can't do oneself. The disconnect between the manager and the problem domain becomes too great, and they end up making ridiculous decisions because they are acting on the wrong information.
Inktomi management would probably have had to raise capital on a risky pivot while at the same time dropping all of their existing revenue streams in order to compete head to head with Google, who at that point didn't even have a way to monetise their technology.
That's a hard thing for any company to do: In this case it would have been the right choice, but it's far easier to say that with the benefit of perfect hindsight.
Edit: fixed typo
Agreed. If I had a nickel for every time my Blackberry crashed since I upgraded the OS a year ago, I'd buy some Apple stock.
In the age of napster people were using our service-- and falling in love with it-- finding music on the internet.
Nope, no money there. Better to sell to ISPs in Canada and hope they integrate our results with Inktomi's.
"Are there any lessons to be learned from this? For one, if you work at a company where everyone wants to use a competitor's product instead of its own, be very worried."
This is because companies sometimes (maybe often?) ban the use of competitor products to their detriment.
Most engineers that I knew in YST did not use Google at all. We preferred to eat our own dogfood, and filed query triages against bad results (and only used Google to compare).
Is that crazy or what?
A) it was fast, it loaded fast. B) it was not filled with ads and pop ups.
Only one of those is still true.
Easy test: are you using Bing now? (with their new clean results pages)
This is unrelated to the main point, but does anyone know if Yahoo re-used a significant amount of Inktomi tech acquired for that $250M? Or was it spoiled?
x's so it won't become a link!
hxxps://wxw.google.com/search?ix=aca&sourceid=chrome&ie=UTF-8&q=domino+pizza+phone+number#hl=en&gs_nf=1&tok=DBTJEp2_3oW1F2ietDMecQ&pq=ipad%20screen%20resolution&cp=1&gs_id=5w&xhr=t&q=new+ipad+screen+resolution&pf=p&safe=off&sclient=psy-ab&oq=nipad+screen+resolution&aq=0&aqi=g2g-b2&aql=&gs_l=&pbx=1&bav=on.2,or.r_gc.r_pw.r_cp.r_qf.,cf.osb&fp=8316e992ae23057e&ix=aca&biw=1363&bih=647
Thanks for capturing the train of thought that seems to run through my head almost daily. Lots of lessons to learn from that experience. Although it is easier to connect the dots in the rearview mirror than it was looking forward at the time, there were some clear lessons: don't forget the actual end user (who is not always your customer), don't use a single metric as a proxy for user experience, don't obsess about a competitor, and don't try to get big instead of great.
I'm not sure why I switched to Google. Not to discredit the Google UX, but I think I switched because the name 'Google' was so catchy and eased its way into my university's vernacular. "Just Google it" rolls off the tongue nicely.
Back in the day you could be on the second page of Google search results before Excite or Altavista had even loaded.
Yahoo wasn't much faster.
The google spam that actually works (or at least, worked a few years ago, before panda) requires setting up lots of sites, lots of independent IP blocks. That was much harder when PageRank appeared - hosting was damn expensive, VPSes were nowhere to be found.
PageRank was a huge thing. I used "and" queries in AltaVista at the time. It was no match.
If your users are that passionate, you're doing something right.
After I gave it a go for a while, the other reasons you list made me stay.
From what I recall, it was also the most programmer-friendly search engine at the time.
http://www.bing.com/search?q=DGGEVX
http://duckduckgo.com/?q=DGGEVX
http://www.google.com/#q=DGGEVX
This is purely subjective, but Google is the only one that links to the Netlib 'real' LAPACK, straight to the file/documentation in question. The others have mixtures of Java packages and other examples...
Now, FWIW, I'm building another search engine. Instead of 20 engineers we have just me. Instead of 4 years, we're going to do it in one. While I have no interest in going up against Google (different plans entirely), the radical change in leverage you get with open source and PaaS or IaaS, combined with Google's having taken their eye off the ball and run off to chase Facebook down a blind alley, means that something like DuckDuckGo actually could take real share away... maybe. (1% of Google's volume would be "real share", right?)
[1] Oracle did have full-text search but did not have the performance or per-machine efficiency we needed, so it cost us a lot, and it was a constant fight to get it to do the kind of queries our relevancy algos required. We had a constant stream of consultants in from Oracle HQ, and in the end dumped it and wrote our own DB from scratch in about four months.