It's interesting because I have a recording of human voices plus a background TV show that was too loud; I've looked around for something that could separate the two, but I haven't found a straightforward solution.
For example, if you Google it, FASST is one of the results that comes up, but it's a whole framework, and to use it you'd have to learn the research yourself; much of this software is not geared toward end users.
Learn how to do waveform inversions - if you have a stereo signal, anything not fully-centered will come through better while the rest is cut out. You can then take that, invert it, and play it back with the original, cutting out that noise and keeping the fully-centered things like vocals present.
This is how I play guitar to my favorite songs on my computer.
This sounds interesting. Could you elaborate a bit? It is unclear whether you are inverting once or twice. "if you have a stereo signal, anything not fully-centered will come through better while the rest is cut out" -- is this before or after an inversion?
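Given stereo tracks L and R, you invert R and add it to L, which cancels anything panned dead center. This usually removes voices. If you then subtract (invert, then add) that result from the original L and R tracks, you get the centered sounds only; note that this second step has to be done on the frequency spectrum rather than sample by sample, since a plain L - (L - R) just gives you R back. Results range from perfect to not effective at all, depending on the song you apply it to.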
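For anyone who wants to see the first step in code, here is a minimal sketch in Python with NumPy. It is not taken from any particular tool: the function name `remove_center` and the three synthetic sine "instruments" are made up for illustration, and it assumes both channels are already loaded as float arrays of equal length.

```python
import numpy as np

def remove_center(left, right):
    """Invert the right channel and add it to the left: L + (-R) = L - R."""
    return left + (-right)

# Synthetic one-second stereo clip (all signals are hypothetical test tones).
rate = 8000
t = np.arange(rate) / rate
vocal = np.sin(2 * np.pi * 440 * t)         # panned dead center (same in L and R)
guitar = 0.5 * np.sin(2 * np.pi * 220 * t)  # left channel only
keys = 0.5 * np.sin(2 * np.pi * 330 * t)    # right channel only

left = vocal + guitar
right = vocal + keys

# The centered vocal cancels; only the off-center parts remain.
side = remove_center(left, right)
```

In this toy case `side` works out to `guitar - keys`, so the vocal is gone entirely. Real recordings are rarely this clean: reverb, stereo effects, and lossy compression usually leave some of the center behind.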