Basically, the solutions that were doing well on ARC just threw thousands of ideas at the wall and kept the ones that stuck. They were literally generating thousands of Python programs, running them, and checking whether any produced the correct output when fed the inputs from the examples.
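To make that generate-and-test loop concrete, here's a rough sketch. It's not anyone's actual ARC solution: the hard-coded `CANDIDATE_SOURCES` list is a hypothetical stand-in for whatever was sampling thousands of programs, and the `transform` function name and the toy example are assumptions made up for illustration.

```python
from typing import Callable, Optional

Grid = list[list[int]]

# Hypothetical stand-in for a generator sampling thousands of candidate programs.
CANDIDATE_SOURCES = [
    "def transform(grid):\n    return grid",
    "def transform(grid):\n    return [row[::-1] for row in grid]",
    "def transform(grid):\n    return grid[::-1]",
    "def transform(grid):\n    return [list(col) for col in zip(*grid)]",
]

# Toy training pair (assumed): the hidden rule mirrors each row left to right.
TRAIN_EXAMPLES = [
    ([[1, 2], [3, 4]], [[2, 1], [4, 3]]),
    ([[5, 0, 0]], [[0, 0, 5]]),
]


def compile_candidate(source: str) -> Optional[Callable[[Grid], Grid]]:
    """Turn candidate source into a callable, or None if it is broken.

    Note: exec on untrusted code is unsafe; a real system would sandbox this.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)
    except Exception:
        return None
    fn = namespace.get("transform")
    return fn if callable(fn) else None


def fits_examples(fn: Callable[[Grid], Grid], examples) -> bool:
    """True only if the program reproduces every example output exactly."""
    for inp, expected in examples:
        try:
            # Pass a copy so a candidate cannot mutate the example input.
            if fn([row[:] for row in inp]) != expected:
                return False
        except Exception:
            return False
    return True


def search(sources, examples) -> list[str]:
    """Keep the candidate programs that 'stuck', i.e. fit all examples."""
    survivors = []
    for src in sources:
        fn = compile_candidate(src)
        if fn is not None and fits_examples(fn, examples):
            survivors.append(src)
    return survivors


survivors = search(CANDIDATE_SOURCES, TRAIN_EXAMPLES)
```

Any program left in `survivors` then gets run on the test input. All the checking happens by actually executing code, so the quality of the result depends on how many candidates you can afford to generate and run.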
o3 doesn't need to run Python. It executes the programs itself, written in tokens inside its own context window, which is wildly inefficient but gives better results and is potentially more general.