Also is search considered a solved problem?
In any case, PageRank is a method for estimating the quality of a page based on the number of inbound links, not a solution to all of search.
But it's a property of the web at the time, not something universal to the search problem, e.g. it's not a statistic that exists if you want to search books.
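For anyone who hasn't seen it written out, here is a minimal sketch of the textbook power-iteration form of PageRank. It is not Google's production system, just the published idea: a page's score comes from its inbound links, weighted by the score of the pages doing the linking. The function name `pagerank`, the `toy_web` graph, the damping factor of 0.85 and the iteration count are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Returns a dict mapping page -> score; scores sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped score evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outlinks spreads its score over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: a, b and d all link to c; c links back to a.
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
```

In this toy graph `c` ends up with the highest score because it has the most inbound links, and `a` outranks `b` and `d` because its single inbound link comes from the highly ranked `c`. The input is nothing but the link graph, which is the point above: the statistic only exists where documents link to each other.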
I think question answering (given a question and a document that answers it, provide a concise answer) is where a lot of interesting work is being done, both in academia and at Google with the snippets of web pages it provides.
The next–very unsolved–problem is being able to "understand" natural language queries and "understand" source materials such that a user can ask for something and get it.
"Understand" is in quotes because it means something rather specific.
Ha, not only is search not a solved problem, I would posit that search is getting WORSE.
Computer knowledge is a particularly good example of how search is degrading with time.
Try to figure out how to do X on the Beaglebone Black (I presume the Raspberry Pi has a similar problem, but it's not something I'm that familiar with).
The problem is that the Linux implementation for the Beaglebone went from a weird distribution (Angstrom) to mainline Debian with Linux kernel 3.8 -> 4.4 -> 4.14 in a VERY short time, so the number of links to new stuff stayed flat.
Consequently, the old Angstrom stuff almost always fills the initial search positions for quite a ways even though it's completely useless.
This is occurring in other things, as well. Stack Overflow, for example, has no way to mark an answer as "This was correct 5 years ago but is now wrong."
Effectively, the web is becoming sclerotic and search engines are following it.
I REALLY miss old AltaVista's feature where it would give you a graphical representation of the clusters in your search so you could drill down into a less popular grouping. The fact that nobody has recreated this makes me wonder ...
Not counting comments? What more could you ask for?
a ton has happened! since pagerank, there's been a ton of advances around nlp that have changed the way queries are processed prior to information retrieval. for example, google's rankbrain seems to do a lot of the heavy lifting around word similarity.
I certainly wouldn't, since I still encounter things that I know are on the internet but Google can't find. It's possible that the next advance won't be actually indexing the web but rather figuring out what the user wants rather than what they requested.
Amusing anecdote regarding this issue.
- I teach an introductory online chemistry class.
- If the students are determined enough, they can/do cheat on their quizzes.
- In one of my quizzes, I give the students a formula for a pretend material and ask them to compute its molar mass.
- If you perform the calculation, the molar mass works out to something like 108 grams / mole.
- If you try to Google the answer, Google is smart enough to know that my compound is unstable.
- Instead, Google provides the molar mass for a _related_ material (86 grams / mole)
- Each semester, I find a handful of students who dutifully tell me the answer is 86 g / mole.
Do you mean in school or in university?
Because if somebody is taking linear algebra in university and still didn't understand that this is an extremely important topic, then he should maybe go back to school or take a gap year or something to improve his general education.
From a university lecture in linear algebra, I expect that it carries me through as many important topics as possible while giving me the tools to form a good formal and intuitive understanding.
The motivation part should be solved by that time.
> From a university lecture in linear algebra, I expect that it carries me through as many important topics as possible while giving me the tools to form a good formal and intuitive understanding.
Sure, and unless your goal is to do pure mathematics research, part of that is providing some motivation in terms of understanding applications. There's nothing wrong with a class on LA (or any other topic as far as that goes) spending some time motivating study of the topic by showing how it can be useful... even explaining how it might be used to make millions or billions of dollars.
I really liked it when we went over PageRank in one of my university lectures. It made me think "Wow, I can actually have an impact using the things I've learned here".
raises hand
I was that kid.
I had to take differential equations, linear algebra, and discrete systems for my CS undergrad. I loved math, and I did my best but I eventually got bogged down and lost interest.
Since then I’ve come to appreciate more why you would want that advanced math as a programmer, but my experience at the time was that I could brute force almost anything with enough for-loops.
It just didn’t occur to me that there were interesting problems that were too complex to brute force, or that brute force would lead to such tangled code in some cases that it wouldn’t be debuggable.
Frankly, I doubt more liberal arts would’ve convinced me. I needed to get knee deep in big bad code and domain specific problems before I’d realize what I missed.
But in all seriousness, Google is notoriously tight-lipped about how PageRank works, so I doubt we're going to get any more information than this. I'd love to be proven wrong, though!
https://searchengineland.com/faq-all-about-the-new-google-ra...
Although there haven't been any major announcements, from daily anecdotal evidence I can confirm it's still a major factor to get you into the front pages.
I'd say the major changes since deprecating PageRank in practice are:
1. Much higher dependence on CTR and bounce rate once you start showing up in the SERPs.
2. Much higher influence of a notion of "trust" on links (not just the quantity but mainly the quality counts now; too much quantity without quality can actually hurt).
3. PageRank is much more disconnected from the domain level. A few years ago, you could rank with pretty much anything if your DomainPop was high enough. By now, Google has gotten much smarter about different folders that don't have anything to do with what it's ranking your main site for, and it's harder to get them ranking. On the bright side, they also got smarter about the negative SEO influence of subdomains or domain changeovers, which won't cost you as dearly.
So TL;DR: PageRank still exists, but it's been reduced to an input vector.
Yes indeed. In particular how multiplication distributes over addition. It is powerful technology.