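def get_labels(rightside):
    met = {}
    # Ratio of nonzero elements to zero elements.
    met['brain'] = (
        1. * (rightside != 0).sum() / (rightside == 0).sum())
    # Fraction of nonzero elements with a value above 2;
    # the 1e-10 guards against division by zero.
    met['tumor'] = (
        1. * (rightside > 2).sum() / ((rightside != 0).sum() + 1e-10))
    met['has_enough_brain'] = met['brain'] > 0.30
    met['has_tumor'] = met['tumor'] > 0.01
    return met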
I will say that it is very handy to know exactly how the labels were computed. What I really meant, though, is a way to search and select data based on metadata, for example has_tumor.
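To make the request concrete, here is a minimal sketch of what selecting by metadata could look like. Everything in it is an assumption for illustration: the `index` list, the sample ids, and the `select` helper are hypothetical and not part of any existing tool; only the `has_tumor` and `has_enough_brain` keys come from the function above.

```python
# Hypothetical metadata index: one small record per sample, kept
# separate from the image data itself (ids and values are made up).
index = [
    {"id": "sample_0", "has_enough_brain": True, "has_tumor": True},
    {"id": "sample_1", "has_enough_brain": True, "has_tumor": False},
    {"id": "sample_2", "has_enough_brain": False, "has_tumor": False},
]


def select(index, **criteria):
    """Return the ids of records whose metadata matches every criterion."""
    return [
        rec["id"]
        for rec in index
        if all(rec.get(key) == value for key, value in criteria.items())
    ]


# Pick samples by label without touching the underlying data.
with_tumor = select(index, has_tumor=True)
usable_no_tumor = select(index, has_enough_brain=True, has_tumor=False)
```

The point of the sketch is that the query runs over small metadata records only, so the matching ids are known before any of the actual data is fetched.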
Also note that everything is still one single blob: to get even one line of any of the files, one would need to download everything.