Instead of showing the shortest path, which in my opinion is "boring" and ends up connecting super-important central articles, I came up with my own method: WikiBinge selects the smaller, less represented articles on Wikipedia. In a WikiBinge path, the underdogs are the kings!
How does it work? It's pretty straightforward! Compute PageRank on the Wiki-graph and give each edge a weight equal to the PageRank value of its destination node. A WikiBinge path is then simply a shortest path under these weights, so the algorithm favors paths passing through articles with lower PageRank values.
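For the curious, here is a minimal sketch of that idea in Python. It is not the site's actual code. The toy graph, the function names (`pagerank`, `cheapest_path`), the damping factor of 0.85, and the fixed iteration count are all assumptions made for illustration. PageRank is approximated with plain power iteration, and the path search is ordinary Dijkstra where stepping onto a node costs that node's PageRank.

```python
import heapq
from typing import Callable, Dict, List, Optional

Graph = Dict[str, List[str]]  # article -> articles it links to


def pagerank(graph: Graph, damping: float = 0.85, iterations: int = 50) -> Dict[str, float]:
    """Approximate PageRank with power iteration (illustrative, not tuned)."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly over all pages.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for u, targets in graph.items():
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        rank = new_rank
    return rank


def cheapest_path(graph: Graph, source: str, target: str,
                  cost: Callable[[str], float]) -> Optional[List[str]]:
    """Dijkstra where following an edge costs cost(destination node)."""
    best = {source: 0.0}
    previous: Dict[str, str] = {}
    queue = [(0.0, source)]
    while queue:
        dist, u = heapq.heappop(queue)
        if u == target:
            path = [u]
            while path[-1] != source:
                path.append(previous[path[-1]])
            return path[::-1]
        if dist > best.get(u, float("inf")):
            continue  # stale queue entry
        for v in graph.get(u, []):
            candidate = dist + cost(v)
            if candidate < best.get(v, float("inf")):
                best[v] = candidate
                previous[v] = u
                heapq.heappush(queue, (candidate, v))
    return None  # target not reachable from source


# Hypothetical toy graph: "Hub" plays the role of a super-central article.
toy: Graph = {
    "A": ["Hub", "B"],
    "B": ["C"],
    "C": ["D"],
    "Hub": ["D"],
    "D": ["Hub"],
    "E": ["Hub"],
    "F": ["Hub"],
}
rank = pagerank(toy)

print(cheapest_path(toy, "A", "D", lambda node: 1.0))   # plain hop count
print(cheapest_path(toy, "A", "D", rank.__getitem__))   # PageRank-weighted
```

On this toy graph the hop-count search goes straight through "Hub", while the PageRank-weighted search takes the longer detour through "B" and "C", because "Hub" alone costs more than both of them together.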
More on the motives to build this here: https://www.jamez.it/project/wikibinge/
This is an older project of mine, but it never got much exposure, so I'm humbly submitting it now.
I really love the circuitous path though. It's a fantastic route to discovery, and I can see these even being a neat thing for schools.
https://www.wikibinge.com/#George_Bush_Intercontinental_Airp...
I imagine you could double this.
I built wikiscroll.blankenship.io for myself to scratch my neophile itch. You might be displacing it in my daily routine, a nice pre-built rabbit hole between two topics of interest has proven to be a lot of fun over the past 30 minutes.
Amazing work.
As a short aside, at first I didn't get it. I was surprised the paths between articles were so long. It wasn't until I tried "Adolf Hitler" -> Something (Hitler has notoriously short paths to everything) that I realized these weren't the shortest paths. Your loading text does a really great job of explaining that, but the "random" button appears to be pulling from a cache (clever!) so I didn't get to see that loading message about the "boring shortest path" until I went off the beaten path.
Since it seems like you are computing both the shortest and the "most interesting" path between the two articles, it would be cool to give me a way to see both on the final loaded page. The shortest path is interesting too, even if it is less interesting than the one you ultimately generate.
It'd also be cool to be able to "pin" one of the boxes so the random button only impacts the other. For example, if I started at the Great Molasses Flood, what path could I take to random other articles? Though I guess this can be accomplished by spinning and then retyping "Great Molasses Flood".
Edit: I deeply appreciate your narrative at https://www.jamez.it/project/wikibinge/ - this is one of my favorite projects I've come across on HN in a long while.
I agree. Sometimes it loads fairly quickly and you only get a second to look at the shortest path.
Fun project, thanks for sharing!
If I had a bit of feedback to share, it would be to keep the shortest path (which shows while the binge is loading) visible after it finishes loading, maybe at the bottom of the page?
It seems that, if you pick an uncached path, the loading screen shows you the shortest path while it computes the longer one. More info in their linked article.
So now I want to know, is there a similar tool that does shortest path? Because that would be fun too.
Yes, very many of them, like OP said.
Here’s a popular one: https://www.sixdegreesofwikipedia.com/
I think this would avoid the super common article problem, but also lead to more relation between each link.
hackNY won't come up, and if you try to add a place with a comma (Lowell, Massachusetts) you can't type it; you have to scroll to it.
I take your point about the limits of knowledge graphs written manually vs LLMs. IMHO it's not either/or. We need both curation and statistical approaches, and when they are merged they give the best results. Just ask Wolfram: https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its... Edit: fixed link to Stephen Wolfram's blog.
chicken nugget <-> constitution of canada
Now I'm wondering if it's a bug or if there is actually no connection.
Fun fun, thank you for sharing! In the interactive web interface*, I hope non-canonical names can be used, that the shortest names can be completed, that exact matches can be used, and that it at least accepts what is in page names.
*It looks like writing the URL fragment yourself allows more leniency.
Mocca is a port on the Red Sea in Yemen. The Ottoman Turks used to require all shipping entering the Red Sea to put in at Mocca so their coffee could be taxed. Mocca is a port town; they don't actually grow coffee in the town, it comes from a mountainous region to the North.
Jimma is a coffee-producing region in Ethiopia, south of the entrance to the Red Sea.
I just bought 250g of beans labelled "Mocca Djimmah"; the vendor couldn't tell me whether it came from Yemen or Ethiopia. My guess is that exports from Yemen are "challenged" just now, but I'd like to taste some coffee from the original home of coffee.
One of those towns manufactures car parts. Then I got 10ish car models from various brands.
An actor that did a commercial for one of those brands 20 years ago.
He grew up in the second town.
Cool. But loose
Apparently Humptulips, Washington was Terry Pratchett's favourite place on earth. :)
I managed ~50 intermediates on this tool from [my home town] to [Kevin Bacon].
Well done coding it up. The average path length will be fascinating.