This is one thing that frustrates me about AI. I can pre-process data all day, every day; I've been writing ETLs and building data warehouses for years. But what am I supposed to pre-process it into? What is the ideal shape of the data?
A lot of courses gloss over this. They dedicate a whole section to cleaning data and then skip straight to ML with ready-made datasets. Or, slightly better, they make you pre-process the data but tell you exactly which columns you need, not why. So when you start a new project, unless it is nearly identical to the example in the course, you may not know what to do.