It occurs to me that if we want LLMs to reason better, we may need to write texts that explicitly embody the reasoning we desire, and not just train on a bunch of books or whatever scraped from the Internet.
Populating these Ontologies is very manual and time-consuming right now. LLMs without any additional training (I'm currently using a mix of the various sizes of Llama 2 models, along with GPT-3.5 and GPT-4 for this) are capable of few-shot generation of ontological classification. Extending this classification with fine-tuning is doing REALLY well.
I'm also seeing a lot of value in using LLMs to query and interpret proofs from deductive reasoners run against these knowledge graphs. I've limited the scope of my research to two domains that require a kind of eccentric mix of formal practices, explicit "correct" knowledge, and common-sense rules of thumb to be successful. Queries can be quite onerous to build, which a fine-tuned model can help with, and LLMs can both assist in interpreting those logic chains and perform knowledge maintenance: adding missing common-sense rules or removing bad or outdated ones. Even selecting among candidate solutions produced by the reasoners works really well when the prompt performing the selection includes the task, desires, and constraints of what you're trying to accomplish.
The formal process and knowledge are handled very well by knowledge graphs paired with a deductive reasoning engine, but the reasoners produce very long-winded logic chains to reach a positive or negative conclusion where a simpler chain might have sufficed (usually due to missing rules, or a lack of common-sense rules), and they are generally incapable of "leaps" in deduction.
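To make the long-chain behavior concrete, here is a minimal sketch of forward chaining over triples. The facts, rules, and predicate names are invented for illustration; real engines are far more sophisticated, but the shape of the problem is the same: each rule only advances one small step, so a missing shortcut rule forces the engine to grind through every intermediate conclusion.

```python
def forward_chain(facts, rules):
    """Apply Horn-style rules (premises -> conclusion) until no new
    facts appear, recording which premises derived each new fact."""
    facts = set(facts)
    proof = {}  # fact -> list of premises that produced it
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)
                proof[conclusion] = list(premises)
                changed = True
    return facts, proof

# Hypothetical toy rules: no direct human -> mortal "leap", so the
# engine must derive every intermediate step in between.
rules = [
    ((("socrates", "is", "human"),), ("socrates", "is", "mammal")),
    ((("socrates", "is", "mammal"),), ("socrates", "is", "animal")),
    ((("socrates", "is", "animal"),), ("socrates", "is", "mortal")),
]
facts, proof = forward_chain({("socrates", "is", "human")}, rules)
```

Adding a single rule mapping `human` directly to `mortal` would collapse the three-step proof chain to one step, which is the kind of "missing rule" maintenance an LLM can help with.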
LLMs on their own are capable of (currently largely low-level) common-sense reasoning, and some formal reasoning, but are still very prone to hallucinations. A 20% failure rate when building rules that human lives may depend on is a non-starter. This will improve, but I don't think a purely probabilistic approach will ever fully eliminate hallucinations. We can use all of these tools together, in various blends, to augment and verify knowledge fully automatically and build more capable systems.
* What are some deductive reasoning engines out there?
* How are you storing the rules in a graph? Using triples?
* What query languages are you using for this?
How I'm storing them is maybe interesting: graph databases that support tagged or typed relations get you most of the way to a basic reasoning system. I'm currently storing the graph content of the knowledge base in a database called SurrealDB with the RocksDB backend (which is new to me). Learning SurrealDB has been giving me a bit of déjà vu from building simple reasoners; there is a lot of overlap between graph theory and formal reasoning.
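To answer the triples question concretely: yes, tagged relations reduce to (subject, relation type, object) triples. A toy in-memory version makes the idea explicit (this is a hedged sketch, not how SurrealDB stores things internally; a graph DB adds persistence and a query language on top of the same shape):

```python
class TripleStore:
    """Minimal triple store: each edge is (subject, relation, object),
    where the relation slot is the 'tag' or type of the edge."""

    def __init__(self):
        self.triples = set()

    def relate(self, subj, rel, obj):
        self.triples.add((subj, rel, obj))

    def query(self, subj=None, rel=None, obj=None):
        """Pattern match: None fields act as wildcards."""
        return [t for t in self.triples
                if (subj is None or t[0] == subj)
                and (rel is None or t[1] == rel)
                and (obj is None or t[2] == obj)]

# Hypothetical geography fragment:
store = TripleStore()
store.relate("paris", "located_in", "france")
store.relate("france", "part_of", "europe")
```

In SurrealDB the equivalent edges would be created with `RELATE` statements and traversed with SurrealQL, but the triple pattern-match above is the core operation a reasoner layers its rules on.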
Rules are encoded in the program. I find it helpful to think of this as building up lots of small tools, each aware of what it's capable of and able to indicate whether it thinks it can make single-step progress toward the goal. For example, a small tool whose prerequisites for "knowing" it may be useful are that you have a location and you're looking for a different location could walk your graph to find the smallest shared domain containing both, which can then become part of the context. Building up context, and learning to disambiguate things using that context, is kind of the "hard part" if you want to optimize your solvers or ask complex questions.
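A minimal sketch of one such "small tool", under stated assumptions: the graph is simplified to a parent map (child -> containing domain), and the prerequisite check is just two distinct locations being present in the context. All names here (`shared_domain_tool`, the geography) are illustrative, not from the original system.

```python
def ancestors(parents, node):
    """Map each transitive containing domain of `node` to its distance."""
    seen, current, depth = {}, node, 0
    while current in parents:
        depth += 1
        current = parents[current]
        seen[current] = depth
    return seen

def shared_domain_tool(parents, context):
    a, b = context.get("location"), context.get("target_location")
    if not a or not b or a == b:
        return None  # prerequisites not met: the tool declines to run
    anc_a, anc_b = ancestors(parents, a), ancestors(parents, b)
    common = set(anc_a) & set(anc_b)
    if not common:
        return None  # no single-step progress possible
    # Smallest shared domain: the common container closest to both nodes.
    return min(common, key=lambda d: anc_a[d] + anc_b[d])

# Hypothetical geography fragment:
parents = {"paris": "france", "lyon": "france", "france": "europe"}
result = shared_domain_tool(parents, {"location": "paris",
                                      "target_location": "lyon"})
```

The `None` returns are the tool signaling "I can't make progress here", which is what lets a coordinator pick among many such tools.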
For the Rules to be meaningful you need an Ontology for your database: your Rules will be operating on top of fixed concepts and known relation types between them. Ontologies are a BIG subject that has also faded into the mist a bit; there are some samples on http://ontologydesignpatterns.org, but there are a lot of dead links there as well. The more complex your Ontology, the harder the individual tools are to write and/or the less useful they become, since each will probably have fewer opportunities to run.
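In the smallest possible terms, an Ontology gives the Rules a schema to rely on: each relation type has fixed domain and range concepts. A hedged sketch, with all concept and relation names invented for illustration:

```python
# Ontology: relation type -> (domain concept, range concept)
ONTOLOGY = {
    "located_in": ("Place", "Place"),
    "treats":     ("Drug", "Condition"),
}

# Concept assignments for individual entities.
CONCEPT_OF = {
    "paris": "Place", "france": "Place",
    "aspirin": "Drug", "headache": "Condition",
}

def conforms(triple):
    """Check a (subject, relation, object) triple against the ontology,
    rejecting unknown relations and domain/range mismatches."""
    subj, rel, obj = triple
    if rel not in ONTOLOGY:
        return False
    domain, rng = ONTOLOGY[rel]
    return CONCEPT_OF.get(subj) == domain and CONCEPT_OF.get(obj) == rng
```

A tool like the shared-domain walker then only needs to consider relations whose declared types match its prerequisites, which is also why a sprawling ontology leaves each small tool fewer opportunities to run.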
Hope this helps!