Analyzing a larger dataset would be neat indeed, though somewhat more challenging, especially getting the layout algorithm to produce something nice in a reasonable amount of time.
There are ~11k repos with >= 100 stars, compared to the 825 I had here (fewer after filtering for the giant component).