Ask HN: How to prompt an LLM to infer an XSD from many XML documents?
I have thousands of XML documents in a format whose XSD is not published. I'd like to produce an XSD for it, and I'm wondering whether an LLM could help. I've tried a few online LLMs, such as Claude and Copilot, and the best they (or I) could do was to generate an XSD from a handful of XML files. While that XSD was more or less valid, it was far from capturing all cases of the underlying format, and it failed on the very next XML document I tried.
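To make "capturing all cases" concrete, here is a minimal sketch of one possible pre-processing step: walk every document and record, per element, which attributes and child elements occur, and whether they are always present or repeated. A compact summary like this might be a better thing to hand to an LLM than a handful of raw files. It uses only Python's standard `xml.etree.ElementTree`; the function names `summarize` and `describe` are hypothetical, and the required/optional and minOccurs/maxOccurs labels are only guesses from the observed data, not a real schema.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def summarize(xml_strings):
    """Aggregate, per element tag, which attributes and children occur
    and in how many instances of that element they appear."""
    stats = defaultdict(lambda: {
        "count": 0,                      # instances of this element seen
        "attributes": defaultdict(int),  # attribute -> instances carrying it
        "children": defaultdict(int),    # child tag -> instances containing it
        "max_repeat": defaultdict(int),  # child tag -> max repeats in one parent
    })
    for text in xml_strings:
        root = ET.fromstring(text)
        for elem in root.iter():
            entry = stats[elem.tag]
            entry["count"] += 1
            for name in elem.attrib:
                entry["attributes"][name] += 1
            per_child = defaultdict(int)
            for child in elem:
                per_child[child.tag] += 1
            for tag, n in per_child.items():
                entry["children"][tag] += 1
                entry["max_repeat"][tag] = max(entry["max_repeat"][tag], n)
    return stats


def describe(stats):
    """Render the statistics as a short text summary (a guess at
    cardinalities based only on the documents seen so far)."""
    lines = []
    for tag in sorted(stats):
        entry = stats[tag]
        total = entry["count"]
        lines.append(f"{tag} (seen {total}x)")
        for name in sorted(entry["attributes"]):
            use = "required" if entry["attributes"][name] == total else "optional"
            lines.append(f"  @{name}: {use}")
        for child in sorted(entry["children"]):
            min_occurs = 1 if entry["children"][child] == total else 0
            max_occurs = "unbounded" if entry["max_repeat"][child] > 1 else 1
            lines.append(f"  <{child}>: minOccurs={min_occurs} maxOccurs={max_occurs}")
    return "\n".join(lines)
```

Calling `describe(summarize(docs))` on the contents of all the files (read as strings) gives one summary for the whole corpus. Namespaced tags show up in ElementTree's `{uri}local` form, and this sketch ignores text content, data types, and child ordering, all of which a real XSD would also need.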
I'm willing to run a local LLM for this task, but I have no LLM experience. Can someone with more experience describe a good process for doing so, and say which LLM might be suited to it?
Thanks!