The first (and imo the main) hurdle is not reproduction, but just learning to hear the correct sounds. If you don't speak Hindi and are a native English speaker, this [1] is a good example. You can only work on nailing those consonants when they become as distinct to your ear as cUp and cAp are in English.
We can get by by falling back to context (it's unlikely someone would ask for a "shit of paper"!), but it's impossible to confidently reproduce the sounds unless they are already completely distinct in our heads/ears.
That's because we think we hear things as they are, but it's an illusion. Cup/cap distinction is as subtle to an Eastern European as Hindi consonants or Mandarin tones are to English speakers, because the set of meaningful sounds distinctions differs between languages. Relearning the phonetic system requires dedicated work (minimal pairs is one option) and learning enough phonetics to have the vocabulary to discuss sounds as they are. It's not enough to just give feedback.