What I'm trying to say is: the line could be blurry. The PR's code quality could be crap, and the project would really need to invest a lot of time to make it fit for merging, in which case they rightfully refuse the bounty. But it could also be that the code quality is great and the project is just misusing the process to make the submitter do more than was originally asked. And the difference between those scenarios could be hard to see for somebody external. Or even for the parties involved: the project could legitimately think that the code is not of sufficient quality, while the submitter could legitimately think that they satisfied the request.
Who is the arbiter? Will projects tend to accept the PR anyway (silently cleaning it up and spending time afterwards), not wanting to risk their reputation? Or will submitters tend to accept major changes, possibly beyond the original ask, not wanting to risk their reputation? To me it seems a bit like a problem also faced by Airbnb and similar services.