That's what I was thinking, but now I'm pretty sure it doesn't even need crazy algorithms like that.
1. Align the image stack. Not trivial, but a common task.
2. Take several cross-sections in both dimensions, and have a human trace one specific layer line in each.
3. Linearly interpolate these traced lines into a surface.
4. For each pixel in each output layer, set the value to layers[l + offset][x][y], where offset is the per-(x, y) value taken from the surface in step 3.
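
Steps 3 and 4 could look roughly like this in Python with NumPy. It's only a sketch, and it makes some assumptions that aren't in the steps above: the stack is already aligned and indexed as layers[l][x][y], the hand-traced lines have been reduced to layer depths at the grid points where the cross-sections intersect, offsets are rounded to whole layers, and indices that run off the stack are clamped to the first or last layer. The names interpolate_surface and flatten_stack are made up for illustration.

```python
import numpy as np


def interpolate_surface(xs, ys, depth, width, height):
    """Step 3: bilinearly interpolate sparse traced depths into a full surface.

    xs, ys : increasing pixel coordinates where traced cross-sections intersect
             (assumed layout, not specified in the original steps).
    depth  : array of shape (len(xs), len(ys)), the traced layer index at
             each intersection.
    Returns an array of shape (width, height).
    """
    depth = np.asarray(depth, dtype=float)
    full_x = np.arange(width)
    full_y = np.arange(height)
    # Interpolate along y for each traced x position: (len(xs), height).
    along_y = np.array([np.interp(full_y, ys, row) for row in depth])
    # Interpolate along x for each y column, then restore (width, height).
    return np.array([np.interp(full_x, xs, col) for col in along_y.T]).T


def flatten_stack(layers, surface, reference=None):
    """Step 4: output[l][x][y] = layers[l + offset[x][y]][x][y]."""
    layers = np.asarray(layers)
    n_layers, width, height = layers.shape
    if reference is None:
        # Assumption: measure offsets from the shallowest point of the surface.
        reference = surface.min()
    # Round to whole layers; sub-layer blending is deliberately left out.
    offset = np.rint(surface - reference).astype(int)
    l = np.arange(n_layers)[:, None, None]
    # Clamp so l + offset never indexes outside the stack.
    src = np.clip(l + offset[None, :, :], 0, n_layers - 1)
    x = np.arange(width)[None, :, None]
    y = np.arange(height)[None, None, :]
    return layers[src, x, y]
```

If the traced layer is tilted in the input, it should end up in a single flat output layer, at index `reference` (by default the shallowest traced depth). Rounding the offsets keeps the code short; interpolating between neighbouring layers would be smoother but is a separate choice.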