
Ukv · 2y ago

I think people 100% have the right to use this on their images, but:

> simply acquiring only training data you have permission to use

Currently it's generally infeasible to obtain licenses at the required scale.

When I was trying to develop a model that could describe photos for visually impaired users, I even reached out to Getty to obtain a license. They repeatedly told me that they don't license images for machine learning[0].

I think it's easy to say "well too bad, it doesn't deserve to exist" if you're just thinking about DALL-E 3, but there's a huge number of positive and far less controversial applications of machine learning that benefit from web-scale pretraining and foundation models - spam filtering, tumour segmentation, voice transcription, language translation, defect detection, etc.

[0]: https://i.imgur.com/iER0BE2.png



devmor · 2y ago

I don't believe it's a "doesn't deserve to exist" situation, because these things genuinely can be used for the public good.

However - and this is a big however - I don't believe it deserves the legal protection to be used for profit.

I am of the opinion that if you train your model on data that you do not hold the rights to, your usage should be handled similarly to most fair use laws. It's fine to use it for your personal projects, for research and education, etc., but it is not OK to use it for commercial endeavors.
