We are friendly with Semantic Scholar, and have used their "open corpus" dumps as one of several URL seed lists for crawling in the past. Their search and discovery tech is more sophisticated than ours is likely to be any time soon (https://medium.com/ai2-blog/building-a-better-search-engine-...). We would love to get to the place where groups like AI2, which are primarily research-oriented, could build on an existing open catalog and corpus, and not need to duplicate the effort of crawling, merging catalogs, cleaning metadata, etc. As of today, Microsoft Academic (used by Semantic Scholar) might be a better option for that.
We want to be thoughtful about ranking signals, and are deeply skeptical of journal impact factor, h-index, and most bibliometrics. "Has this been cited more than a handful of times" seems like a reasonable coarse boost. We hope to include more curated signals, like "won a paper prize", "journal in DOAJ and other reviewed indices", etc.
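To make the "coarse boost" idea concrete, here is a rough Python sketch. Everything in it is hypothetical and only for illustration: the `Work` fields, the threshold of 5 citations, and the boost weights are assumptions, not values from any real ranking code. The point is that the citation signal is a yes/no flag rather than a score that grows with citation count.

```python
from dataclasses import dataclass


@dataclass
class Work:
    # Hypothetical record; field names are illustrative assumptions.
    title: str
    citation_count: int = 0
    won_prize: bool = False
    in_doaj: bool = False


def ranking_boost(work: Work, citation_threshold: int = 5) -> float:
    """Return a multiplicative boost built from coarse, mostly boolean signals."""
    boost = 1.0
    # Coarse citation signal: "more than a handful" or not, nothing finer.
    if work.citation_count > citation_threshold:
        boost *= 1.5
    # Curated signals (weights are placeholders).
    if work.won_prize:
        boost *= 1.2
    if work.in_doaj:
        boost *= 1.1
    return boost
```

With this shape, a paper cited 6 times and one cited 10,000 times get the same citation boost, which avoids turning the ranking into a popularity contest.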
We have been working on a citation graph; keep an eye out for something about that in the coming months. One cool thing we hope to do with the citation graph is find "missing works" not yet in the catalog (e.g., works that don't have a DOI, especially from the pre-1990 era).
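One simple way the "missing works" idea could be approached is sketched below in Python: count references that fail to match anything in the catalog, and surface the ones that show up repeatedly. This is only a sketch under assumptions. The reference dict shape (`title`, `year`), the `normalize_key` helper, and the exact-match lookup are all hypothetical; real reference matching is much fuzzier than this.

```python
import re
from collections import Counter


def normalize_key(title, year):
    """Crude match key: lowercase the title, collapse punctuation, pair with year."""
    slug = re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()
    return (slug, year)


def find_missing_works(references, catalog_keys, min_citations=2):
    """Return (key, count) pairs for cited works with no catalog match.

    references: iterable of dicts with "title" and optional "year" (assumed shape).
    catalog_keys: set of normalize_key() results for works already in the catalog.
    """
    counts = Counter()
    for ref in references:
        key = normalize_key(ref["title"], ref.get("year"))
        if key not in catalog_keys:
            counts[key] += 1
    # Most frequently cited unmatched works first; drop one-off noise.
    return [(key, n) for key, n in counts.most_common() if n >= min_citations]
```

Requiring at least a couple of independent citations before treating something as a missing work is a cheap guard against typos and garbled reference strings.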