Karma is a function of time since joining, participation, and quality of contributions, and it starts at 1. 'Participation' can be measured as contributions per unit of time. Quality is the average score of each submission; separating it from participation is a useful way to extend this to a more complicated model that accounts for the fact that people stop using HN. So the line of best fit should be something closer to 1 + t * q * p, or 1 + (t0 * q * p0) + ... + (tn * q * pn) to describe folks who are off-and-on contributors.
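To make the piecewise version concrete, here is a rough sketch in Python. The function name, the argument layout, and the numbers are all made up for illustration; only the formula 1 + sum(t_i * q * p_i) comes from the comment above.

```python
def estimated_karma(quality, active_periods, starting_karma=1):
    """Estimate karma as starting_karma + sum(t_i * quality * p_i).

    quality        -- average score per submission (q)
    active_periods -- list of (duration, rate) pairs, where duration is the
                      length of an active stretch (t_i) and rate is the
                      contributions per unit time during it (p_i)
    """
    return starting_karma + sum(t * quality * p for t, p in active_periods)


# Hypothetical steady contributor: one stretch of 100 days at 0.5 posts/day.
steady = estimated_karma(2.0, [(100, 0.5)])

# Hypothetical off-and-on contributor: two shorter stretches at different
# rates; the idle time in between simply contributes nothing.
on_off = estimated_karma(2.0, [(30, 0.5), (20, 1.0)])

print(steady, on_off)
```

With a single active period this reduces to the simple 1 + t * q * p form.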
My instinct is that once you filter these people, you'll see a much stronger linear relationship between time and karma, since karma isn't normalized by the number of contributions, and number of contributions is probably a Poisson process.
That's just a constant offset; it matters more whether the next karma level is 2 or 10. When fitting trend lines, a "linear fit" would normally satisfy y = mx + c, without limiting yourself to c = 0. Note that the posted linear fit has c = -1... apparently everyone starts with -1 karma. Lies, damned lies, and statistics, I say!
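For anyone who wants to see what the choice of c does, here is a small sketch comparing an ordinary least-squares fit y = mx + c with a fit forced through the origin. The data is invented (not the posted data set) and the helper names are my own; it only illustrates that forcing c = 0 changes the slope when the true intercept is nonzero.

```python
def fit_with_intercept(xs, ys):
    """Ordinary least squares for y = m*x + c; returns (m, c)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


def fit_through_origin(xs, ys):
    """Least squares for y = m*x with the intercept fixed at 0; returns m."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Made-up data: karma starts at 1 and grows by 2 per day.
days = [0, 10, 20, 30, 40]
karma = [1 + 2 * d for d in days]

m, c = fit_with_intercept(days, karma)   # recovers slope and offset
m0 = fit_through_origin(days, karma)     # slope absorbs the missing offset

print(m, c, m0)
```

On this toy data the free-intercept fit recovers the starting offset, while the through-origin fit compensates with a slightly steeper slope.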
Which is a great way to conduct research! Nice work.
This reminded me of my senior project in number theory, when I manipulated a large data set, wondering what I'd find. Eventually, I found quite a bit.
Also reminded me of this quote by Wernher von Braun:
"Basic research is what I am doing when I don't know what I'm doing."
I think that's the crux of it. Somebody could monitor all the various news sites and spend an hour a day here posting comments and stories and so forth, but I suspect most folks would rather spend that time doing something else.
That said, edw519 is a pretty cool guy.
Which program did you use to produce the plots?