0 points · Doxin · 2y ago · 0 comments

I'd assume there's no real state the network can "remember" between iterations, so shuffling will at best just waste time.

1 comment · 1 top-level

Two_hands · 2y ago

I had been thinking about the ordering, but it makes sense that it doesn’t matter. I have read that it is actually better to keep generated and real images in separate batches, running each batch through the model on its own before taking the gradient step.
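Here is a rough sketch of what I mean, not anyone's actual training code. It assumes a toy 1-D logistic "discriminator" with made-up names (`batch_grad`, `discriminator_step`) and Gaussian stand-ins for real and generated images. The real-only and generated-only batches each get their own gradient, the two are summed, and only then is one update applied. It also shuffles each batch to check that ordering makes no difference for a model with no state between samples.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def batch_grad(w, b, xs, label):
    """Mean binary cross-entropy gradient over one batch (all same label)."""
    gw = gb = 0.0
    for x in xs:
        err = sigmoid(w * x + b) - label  # dLoss/dlogit for BCE
        gw += err * x
        gb += err
    n = len(xs)
    return gw / n, gb / n


def discriminator_step(w, b, real, fake, lr=0.1):
    """One update from a real-only batch and a generated-only batch."""
    gw_r, gb_r = batch_grad(w, b, real, 1.0)  # real samples, label 1
    gw_f, gb_f = batch_grad(w, b, fake, 0.0)  # generated samples, label 0
    # Sum both gradients first, then take a single gradient step.
    return w - lr * (gw_r + gw_f), b - lr * (gb_r + gb_f)


rng = random.Random(0)
real = [rng.gauss(2.0, 0.5) for _ in range(8)]   # hypothetical "real" data
fake = [rng.gauss(-1.0, 0.5) for _ in range(8)]  # hypothetical "generated" data

w1, b1 = discriminator_step(0.0, 0.0, real, fake)

# Same data, shuffled within each batch: the update should match
# up to floating-point rounding.
shuffled_real = real[:]
shuffled_fake = fake[:]
rng.shuffle(shuffled_real)
rng.shuffle(shuffled_fake)
w2, b2 = discriminator_step(0.0, 0.0, shuffled_real, shuffled_fake)

print(math.isclose(w1, w2, rel_tol=1e-9, abs_tol=1e-12))
```

In this stateless toy the split into two batches gives the same kind of update a mixed batch would. The separate-batch advice presumably matters more once something like batch normalization is involved, since that computes statistics per batch, but that part is my assumption.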