Historically their performance is underwhelming. Sometimes competitive on the first iteration, sometimes just mid. But generally they can't iterate quickly (insufficient resources, insufficient product demand) so they are quickly eclipsed by pure software implementations atop COTS hardware.
This particular Valley of Disappointment is so routine as to make "let's implement this in hardware!" an evergreen tarpit idea. There are a few stunning exceptions like GPU offload—but they are unicorns.