An AI that does a task correctly only 50% of the time may in fact be a better bet than your N% chance of hiring a highly capable human for that task, especially when the alternative is contracting a human for a 1-2 hour task.
But successful use of AI is still predicated on having a human who can judge the output and break the work into smaller tasks that fit within the AI's skill ceiling, which currently means tasks that take a skilled human no more than 2 hours.