For starters, you have things like Wikidata[1] and DBpedia[2], which give you access to Wikipedia data in a structured form. Then you have datasets like OpenCyc[3] and Freebase[4] that you could include.
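To make "structured form" a bit more concrete, here is a minimal sketch of asking the public Wikidata SPARQL endpoint for statements about a single item. The helper names (`build_query`, `build_request_url`, `fetch_bindings`) and the User-Agent string are made up for illustration; only the endpoint URL and the `wd:` prefix come from Wikidata itself. Q42 is the item for Douglas Adams.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(item_id: str, limit: int = 10) -> str:
    """Return a SPARQL query listing direct statements about one item."""
    return (
        "SELECT ?property ?value WHERE { "
        f"wd:{item_id} ?property ?value . "
        "} "
        f"LIMIT {limit}"
    )


def build_request_url(query: str) -> str:
    """URL-encode the query and ask for a JSON result set."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return f"{WIKIDATA_SPARQL_ENDPOINT}?{params}"


def fetch_bindings(query: str) -> list:
    """Run the query over HTTP (needs network access) and return result rows."""
    request = urllib.request.Request(
        build_request_url(query),
        # Hypothetical User-Agent; use one that identifies your own tool.
        headers={"User-Agent": "kg-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]
```

Calling `fetch_bindings(build_query("Q42"))` should give back a list of property/value rows, which you could then load into whatever store you pick for the knowledge base.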
Then, there are tools like Stanbol[5] which can, to a certain extent, extract structured data from free text. Of course, this isn't perfect, since you'd basically need to have solved AGI to do it completely. But you can get some "knowledge" from free text. Combine that with a crawling system like ManifoldCF[6] or Nutch[7], and you could imagine building a pipeline that crawls websites and adds what it finds to your knowledge base.
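As a rough sketch of that crawl-then-extract pipeline: the code below stands in for the crawler with a plain dict of already-fetched pages, and stands in for the extractor with a single regex that only catches "X is a Y" sentences. This is nowhere near what Stanbol does and none of it is the API of Stanbol, ManifoldCF, or Nutch; `extract_triples` and `run_pipeline` are hypothetical names.

```python
import re
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Naive pattern: a capitalized subject, "is a"/"is an", then a lowercase phrase.
IS_A_PATTERN = re.compile(
    r"\b([A-Z][\w-]*(?: [A-Z][\w-]*)*) is an? ([a-z][a-z ]*[a-z])"
)


def extract_triples(text: str) -> List[Triple]:
    """Turn every 'X is a Y' sentence fragment into an (X, is_a, Y) triple."""
    return [(subj, "is_a", obj) for subj, obj in IS_A_PATTERN.findall(text)]


def run_pipeline(pages: Dict[str, str]) -> Dict[Triple, Set[str]]:
    """Extract triples from each page and record which URLs support each one."""
    knowledge_base: Dict[Triple, Set[str]] = {}
    for url, text in pages.items():
        for triple in extract_triples(text):
            knowledge_base.setdefault(triple, set()).add(url)
    return knowledge_base
```

Keeping the source URLs next to each triple is a deliberate choice here: when the extractor is this unreliable, knowing how many independent pages back a fact is about the only cheap confidence signal you get.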
If you decide to use RDF as the representation for the knowledge base, there are things like Jena[8] that let you store and query your KB and do inference against it. Do all that, and probably add in a little more AI / NLP, and you can build your own knowledge graph.
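To show what "store, query, and do inference" means in the small, here is a toy in-memory triple store. It is not Jena and does not use Jena's API; `TinyTripleStore` is a hypothetical class, and the predicates `type` and `subClassOf` are plain strings standing in for `rdf:type` and `rdfs:subClassOf`. The inference step applies two RDFS-style rules until nothing new can be derived.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TinyTripleStore:
    """Toy subject-predicate-object store with wildcard pattern matching."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self.triples.add((subject, predicate, obj))

    def match(
        self,
        s: Optional[str] = None,
        p: Optional[str] = None,
        o: Optional[str] = None,
    ) -> Set[Triple]:
        """Return all triples matching the pattern; None acts as a wildcard."""
        return {
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        }

    def infer(self) -> int:
        """Forward-chain two rules to a fixpoint; return how many triples were added.

        Rule 1: A subClassOf B, B subClassOf C  =>  A subClassOf C
        Rule 2: x type B,       B subClassOf C  =>  x type C
        """
        added = 0
        while True:
            new: Set[Triple] = set()
            for subject, predicate, obj in self.triples:
                if predicate in ("subClassOf", "type"):
                    for _, _, parent in self.match(s=obj, p="subClassOf"):
                        new.add((subject, predicate, parent))
            new -= self.triples
            if not new:
                return added
            self.triples |= new
            added += len(new)
```

After adding `("Berlin", "type", "City")`, `("City", "subClassOf", "Place")` and `("Place", "subClassOf", "Thing")`, a call to `infer()` lets `match(s="Berlin", p="type")` also return `Place` and `Thing`. A real RDF store does the same kind of thing with indexes, a proper query language, and a much richer rule set.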
OK, yes, the "add a little more AI" bit is kinda hand-wavy, but that's an area of open research. Still, there are practical things that can be done today... and if you're looking for a thesis topic, well, here ya go. :-)
[1]: https://www.wikidata.org/wiki/Wikidata:Main_Page
[4]: https://developers.google.com/freebase/
[5]: http://stanbol.apache.org
Thanks for the response. Very informative.
The thing is, you are probably looking for a solution that reconciles data arriving from multiple sources and formats for you, and preferably exposes it through an easy-to-use API.
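For a feel of what reconciliation involves, here is a deliberately naive sketch that merges company records from different sources by a normalized name. This is only an illustration of the general idea, not how Unigraph does it; the record layout, the suffix list, and the function names are all assumptions made for the example.

```python
import re
from typing import Dict, List

# Assumed list of legal-form suffixes to ignore when comparing names.
SUFFIXES = {"inc", "ltd", "llc", "corp", "plc", "limited", "corporation"}


def normalize_name(name: str) -> str:
    """Lowercase, strip punctuation, and drop trailing legal-form suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def reconcile(records: List[dict]) -> Dict[str, dict]:
    """Group records by normalized name, keeping per-source attribute values."""
    entities: Dict[str, dict] = {}
    for record in records:
        key = normalize_name(record["name"])
        entity = entities.setdefault(key, {"names": set(), "attributes": {}})
        entity["names"].add(record["name"])
        for attr, value in record["attributes"].items():
            # Keep every source's value so conflicts stay visible.
            entity["attributes"].setdefault(attr, {})[record["source"]] = value
    return entities
```

Name matching like this breaks quickly in practice (two unrelated companies called "Acme", one company with several trading names), which is why shared identifiers, where the sources provide them, are a much safer join key.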
You can try http://unigraph.io. The API sandbox (GraphQL) is available at http://u01.unigraph.rocks, and extensive documentation is at https://github.com/unigraph/docs/wiki
Currently Unigraph combines data from:
- Wikidata
- GeoNames
- Freebase
- Crunchbase
- SEC EDGAR
- Companies House (UK)
A data dump is on its way, and more sources will be added soon.
Disclaimer: I am building Unigraph, precisely because of the question asked here: "An open alternative to GKG".