Has anyone tried it? I mean, I would, but as far as I can tell I need 4 boxes with 4 GPUs each. Plus an interconnect. I mean, I could put in an order for my homelab, but at around $80k per box and maybe $20k for the right switches and some other gear, my wife will probably frown at me ordering a $340,000 rig to try code that I wouldn't know what to do with even if it works.
Maybe some other heavy hitter out there can explain what all this whatchamacallit newfangled synergy-producing matrix algebra does once you have it running?