BSD's hash table code has probably been around longer than the author has been alive.
Here is the FreeBSD version; it's very compact and works quite well: http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/db/hash/h...
"A few decades ago [...] if you'd wanted to use a hash table, if you even knew what a hash table was, you'd have to write your own."
I agree with you on the quality of the BSD code, and I'm glad such great code is readily available. But a) I had definitely been programming before 1990 (the copyright date on that BSD code); b) back then, hash tables were far less tightly integrated with programming languages than they are today, and fewer people knew about them; and c) if you want to be pedantic, hash tables have been around since 1953, long before most programming languages still in use today: http://en.wikipedia.org/wiki/Hash_table#History. They are, however, much more commonly understood, and in ubiquitous use, today!
And just for the record, Common Lisp has had hash tables since 1984 (and I guess Maclisp had them before that), but earlier Lisp dialects had things like plists and alists.
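To make the contrast concrete, here's a sketch (in Python, since the thread spans several languages) of an alist-style linear lookup next to the language's built-in hash table; the names and data are illustrative only:

```python
# A Lisp-style association list: a list of (key, value) pairs,
# searched linearly from the front (roughly what CL's ASSOC does).
def alist_get(alist, key):
    for k, v in alist:
        if k == key:
            return v
    return None

pairs = [("color", "red"), ("size", 7)]
alist_get(pairs, "size")   # O(n) scan per lookup -> 7

# The hash-table equivalent, built into Python as dict:
# O(1) average-case lookup, no scan.
table = dict(pairs)
table["size"]              # -> 7
```

The point of the comment stands: today the hash table is one keystroke away in nearly every language, where it used to be something you wrote (or ported) yourself.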
As far as I can tell, this article says 1) Shucks, hardware sure is cheap these days! and 2) There sure is a lot of software out there that you can mash together! Those things make it easier to start a company, but they don't provide the essential insights that make that company truly revolutionary.
"Without thinking" is an exaggeration for some of the items in the post, but consider the problem of storing 200GB of data. "Um... on a hard drive?" "And how will you finance that?" "Gee, maybe with the money in my wallet right now? When do these questions get hard?" Shucks, hardware sure is cheap these days! Problems simply disappear from being challenges to not requiring any thought at all. The exponential increase in the power of affordable hardware may not be surprising, but to me it seems worth thinking about even though it's been normal and predictable my whole life.
Google's innovation was three-fold: better search algorithms (PageRank), which used the implicit data in the interconnectedness of the web to judge the relevancy and rank of search results; revolutionary data center ops (commodity hardware with heavy reliance on automation); and state-of-the-art software engineering (sharding, MapReduce, etc.). The last two enabled the first to run efficiently on a rather small set of hardware and to scale up speed just by adding more hardware. The end result was better results, delivered faster, at lower cost to Google.
This led to a much better product for the end users (better/faster) and allowed them to acquire a huge portion of search marketshare quickly. But the low cost of operations meant that they could better take advantage of advertising (lower cost per search means that even lower revenue per search can be profitable).
Nutch[1].
Nutch doesn't deal with modern web spam particularly well, but I'd say it matched early Google pretty well. Specifically, it implements Page Rank, has a reliable web crawler and a web-scale data store.
1) getting the data, 2) computing the eigenvector of a large matrix, and 3) serving the results to users wasn't cheap in 1998. It's comparatively dirt cheap today.
not to diss larry and sergey's impressive achievement - they were brilliant and they pulled it off - but i think back then the game was so costly that a lot of brilliant people never made it to the starting line. it's cool to see that it's become a much more level playing field now. i'm curious what cool stuff we missed out on because of the people who didn't make it to the starting line!
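Step 2 above is the PageRank computation: the ranking vector is the principal eigenvector of the damped link matrix, usually found by power iteration. A toy sketch in Python - the three-page graph and damping factor here are made up for illustration, not Google's actual parameters:

```python
def pagerank(links, d=0.85, iters=50):
    """Power iteration on a tiny link graph.
    links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start uniform
    for _ in range(iters):
        # Everyone gets the (1 - d) "random surfer" baseline...
        new = {p: (1 - d) / n for p in pages}
        # ...plus a damped share of the rank of each page linking to them.
        for p, outs in links.items():
            share = rank[p] / len(outs)
            for q in outs:
                new[q] += d * share
        rank = new
    return rank

# Toy 3-page web: A and C both link to B; B links back to A.
ranks = pagerank({"A": ["B"], "B": ["A"], "C": ["B"]})
```

With two in-links, B ends up ranked highest; C, with none, lowest. The expensive part in 1998 wasn't this loop, it was doing it over tens of millions of pages.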
The real message is that servers are cheap - though it's delivered via a long, vague buildup, and it's hardly novel information.
Sounds like a marvelous challenge. Anyone have other similar "technological frontier then, high-school science fair project now" type challenges? OPer notes BioCurious as one. A major factor in education is walking kids thru a subject from basic principles to state-of-the-art, recreating historical milestones along the way.
Content publishing: Weekend project. Rails, memcached and CloudFront and you're done.
IM and Buddy Lists: 1.5 million simultaneous users doing n^2 pub/sub-type distributed transactions.
Mail: 4,000 emails per second with live unsend and recipient read/unread status. I think PostgreSQL tops out in the millions of rows per second nowadays.
Web caching/acceleration: pick your favorite proxy solution and configure it.
Single sign-on: Form strategic partn-- Hey, you said technical challenge, not political.
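The buddy-list item above is essentially presence fan-out: every status change publishes to every subscribed watcher, which is where the roughly n^2 traffic comes from when everyone watches everyone. A minimal in-process sketch of the data structure (nothing distributed here; class and method names are mine):

```python
from collections import defaultdict

class BuddyList:
    """Toy presence pub/sub: a status change fans out to all watchers.
    With n mutually subscribed users, one change costs O(n) deliveries,
    so a round of updates from everyone costs O(n^2)."""
    def __init__(self):
        self.watchers = defaultdict(set)   # user -> who watches that user
        self.inbox = defaultdict(list)     # watcher -> updates received

    def subscribe(self, watcher, user):
        self.watchers[user].add(watcher)

    def set_status(self, user, status):
        for w in self.watchers[user]:      # the fan-out
            self.inbox[w].append((user, status))

bl = BuddyList()
bl.subscribe("alice", "bob")
bl.subscribe("carol", "bob")
bl.set_status("bob", "away")   # delivered to both alice and carol
```

The hard part at 1.5 million simultaneous users isn't this logic; it's sharding the watcher sets and surviving the fan-out storms.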
opening a web shop.
Building robots(at today's kid's levels).
Designing really complex and fast digital circuits(using FPGA, and IP blocks).
Building a global, scalable and complex database application(using something like MS lightswitch).
I was writing this more in the sense that kids at BioCurious (and the DIY Bio Movement in general) are doing electrophoresis to transfer DNA from glowing jellyfish to bacteria. This is just a few (two?) years after someone got a Nobel prize for that.
That's progress. If stuff that used to be hard falls into kids' hands, you're going to see impressive things happen.
However, I fully agree that it takes more than that to build a company. (Also, I wouldn't try to compete with 2012 Google using 1998 technology.)
The Nobel prize you're referring to was probably the one for GFP. Interestingly, a huge challenge in using GFP now is patent issues and thus money issues, rather than technical issues.
Google's value doesn't come so much from search any more (it's good at it, though there are now grumblings from the Googluminati). It comes from its advertising network (and the concomitant connections and contracts), and from the value-added services built on top of Google's underlying search technology, to the extent that those leverage Google's base tools and/or expertise.
The chinks in Google's armor are starting to show though:
- Cheap and/or federated search is now available.
- OpenStreetMap is providing mapping data (and APIs) to rival Google Maps.
- There's a lot of grumbling going on over privacy, especially in the social and mobile spaces. Neither has quite fully coalesced, but if you look at the volatility in both spaces (consider what the largest social network and most popular smartphones were 5 years ago vs. today), things could again change quickly.
- Most tellingly, trust in Google to "not be evil" is eroding, rapidly in some quarters.
Google is valuable -- because it dominates advertising, and has the users to monetize that. Chip away at the user base and it could find its hegemony starting to fail.
The fact that it's very, very cheap to replicate Google's underlying tech helps with this. DuckDuckGo is essentially a one-man shop. Yes, it has a very small fraction of Google's traffic, but it compares favorably with everyone else who's tackling Google, including Microsoft's Bing, with ... more than one man equivalent last I checked.
Then again, according to the Wikipedia page, the original BackRub was conceived when the web was only 10 million pages; $2000 is considerably more acceptable for a Ph.D. project.
This included a $4,516,573 NSF grant (that didn't go to Larry & Sergey in full, but probably helped their project's infrastructure quite a bit).
http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=9411... http://en.wikipedia.org/wiki/Stanford_Digital_Library_Projec...
On the expense side, I've probably actually underestimated by orders of magnitude. Bandwidth wasn't cheap back then, and the storage requirements were probably significantly higher.
Basically the article boils down to this: what counted as a 'cluster' in 1998 is a single system in 2008, and what used to take hundreds of disk drives to store, you can store on one today.
Not particularly deep, but useful to think about from time to time. There is a quote, perhaps apocryphal, which says:
"There are two ways to solve a problem that would take 1000 computers 10 years to solve. One is to buy 1000 computers and start crunching the numbers; the other is to party for 9 years, use as much of the money as you need to buy the best computer you can at the end of the 9th year, and compute the answer in one day."
The idea is that computers get more powerful every year, and that in 10 years they will be more than 1000x more powerful than the ones you would have started with, so you can solve the same problem in far less time.
Of course, they haven't been getting more powerful as quickly as they once did, but the amount of data you can store per disk has continued to grow apace.
The point is that if you are designing for the long haul (say 10 yrs from now) you can probably assume a much more powerful compute base and a lot more data storage.
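The quote's trade-off is easy to model back-of-the-envelope: work that takes W machine-years today takes W / 2^(t/d) after waiting t years, if hardware doubles in speed every d years. (The quote's "one day" is looser than any real doubling period supports; the 18-month figure below is an assumption, and the function name is mine.)

```python
def total_time(wait_years, work_machine_years, doubling_years=1.5):
    """Years until you have the answer if you party for `wait_years`,
    then run the job on one machine that is 2**(wait/doubling) times
    faster than today's hardware."""
    speedup = 2 ** (wait_years / doubling_years)
    return wait_years + work_machine_years / speedup

# The quote's problem: 1000 machines x 10 years = 10,000 machine-years.
start_now = total_time(0, 10_000)   # one of today's boxes: 10,000 years
wait_9 = total_time(9, 10_000)      # party first, then compute
```

Even with conservative assumptions, waiting wins by a couple of orders of magnitude, which is the real point: design for the hardware you'll have at the end of the project, not the hardware you have now.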
What he's saying is that the existence of the cloud, and library advances such as MapReduce and public APIs, mean that the bar for writing new software is lowered to an extent that's hard even to comprehend.
Every time I get a module from CPAN I still get a shiver down my spine, remembering trying to do new and interesting things in the 80's and early 90's and every single time ending up trying to build a lathe to build a grinder to grind a chisel to hack out my reinvented wheel.
Though, given that hard drives very much do not obey Moore's Law, a well-designed 1998 solution with hundreds of disks may well have far faster IO than the 2012 one-disk solution.
PS: A traditional HDD is hard pressed to break 200 IOPS; cheap SSDs easily manage 100x that, and you can break 100,000 IOPS for well under a grand. http://en.wikipedia.org/wiki/IOPS
I agree with the gist of the blog posting though.
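The IO arithmetic behind the parent's point, using the rough figures above (200 IOPS per spindle, ~100,000 for a good SSD; the 200-disk array size is my stand-in for "hundreds of disks"):

```python
HDD_IOPS = 200        # rough ceiling for one spinning disk
SSD_IOPS = 100_000    # a decent consumer SSD, per the parent comment

disks_1998 = 200                      # assumed "hundreds of disks" array
array_iops = disks_1998 * HDD_IOPS    # aggregate random IO of the array

# The 1998-style array does ~40,000 IOPS in aggregate: 200x a single
# modern HDD, though still short of one cheap SSD.
```

So the "one disk replaces hundreds" claim holds for capacity, but for random IO it only holds if that one disk is solid state.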
Unfortunately, he had to use the 14-year-old-girl analogy and exaggerate the ease with which we could build Google circa '98 today. Now his whole point is lost to the click-clacking of a thousand pedants' keyboards. Guys, this isn't about 14-year-old girls, nor is it about Google per se, so much as it is about the fast pace of tech innovation, the ease and cost of acquiring infrastructure, and, to a lesser extent, how spoiled we are compared to what we had to work with 14 years ago.
The stuff about Google and 14-year-old girls is just a literary tool (along with some mild hyperbole) to help illustrate his point, which so far is being completely missed. Come on guys, is this Hacker News or Pedantic Literary Scholar News? Focus on the point, not little Google girls. PLSN does have a nice ring to it, but no, we're not on PLSN. At least not yet.
hehehe
Look, I don't care whether your product cures cancer, dispenses oral sexual favors, and mints pure gold dubloons-- I will not give you my email address without a damned good reason.
Every single goddamn link on your page brings me to a "Enter your email here" prompt, except for the company tab, which brings me instead to a pile of vapid marketing bullshit.
Flotype Inc. is a venture-backed company building a suite of enterprise technology for real-time messaging. Flotype takes a unique approach by building developer-friendly technologies focused on ease-of-use and simplicity, while still exceeding enterprise-grade performance expectations.
Flotype licenses enterprise-grade middleware, Bridge, to customers ranging from social web and software enterprises to financial and fleet management groups.
What does that even mean? You using carrier pigeons? Dwarves? Cyborgs? UDP? ZeroMQ? Smoke signals?
You don't even tell me how my email is going to be used.
Fix your shit.
There's a time and place for profanity and verbal hostility, and feedback to a stranger on website UX isn't it; the intensity and anger are just dialed wrong. I wish pg would implement a filter for this kind of comment.
Honestly, I think it's a reaction to "Minimum Viable Product" overkill on HN.
The first 10 times it's OK. The next 50 times it gets less interesting. Once you're into three figures it really starts to grate. So you start to skip the MVP-style posts. Which means those making MVP posts have to turn to a different strategy: the "interesting headline" blog post, to drive traffic to their site.
Oh, and I think that the older people here (and at 31 I'm probably one of them) are turned off by really blatant marketing.
More seriously, this is not a mere UX problem. This isn't a problem with colors not matching, with poor navigation, or with anything else cosmetic.
Absent any other information, this site appears to be a way of fishing for email addresses. That's the long and the short of it.
I am not just a string to send messages to. I am not just a networking opportunity. I am not just an entry in your preferred database.
I am a developer, and I don't like it when sites treat me otherwise.
I thanked the author for his (very fast) response.
I'm sorry about the tone of the post, but frankly we can't let this dehumanization and arrogance towards users (and worse in this case, fellow developers) slide.
EDIT: Note also that, had he simply posted a good article (which it was!) without the shameless plug, I would've said nothing. If the plug had linked to a page that had anything other than email scraping, I wouldn't have complained. But the linked page was so offensive that it deserved calling out. Let this be a lesson for you startupy folks: don't cheapen a good thing with a bad plug.
:)
If possible, at least have a dev write up a use case or some sample code, so we can get an idea of what Bridge does.
Thanks!
(My team's website is rather bad right now, but at least it has direct download links without asking for emails. I feel your pain on the web stuff, though, when you've got code to hack.)