The completed index takes up 337 GB.
As for the data coming into the crawler, I'm not so sure: something on the order of 10-20 TB. I didn't really measure this, and I don't keep all the data. There is no "cache" function.
On a 1 Gbit/s connection, it takes about a week to crawl and generate the index.
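
As a rough sanity check on those numbers, here is a small back-of-the-envelope sketch. It assumes decimal units (1 TB = 10^12 bytes) and, for simplicity, that the download is spread evenly over the whole week. That second assumption is not quite right, since the week also includes index generation. The function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def max_transfer_tb(link_gbit_per_s: float, days: float) -> float:
    """Upper bound on data moved over a fully saturated link, in decimal TB."""
    bytes_per_second = link_gbit_per_s * 1e9 / 8
    return bytes_per_second * days * SECONDS_PER_DAY / 1e12


def average_rate_mbit_per_s(total_tb: float, days: float) -> float:
    """Average bitrate needed to move total_tb (decimal TB) in the given days."""
    total_bits = total_tb * 1e12 * 8
    return total_bits / (days * SECONDS_PER_DAY) / 1e6


if __name__ == "__main__":
    ceiling = max_transfer_tb(1.0, 7)
    print(f"1 Gbit/s saturated for 7 days: {ceiling:.1f} TB")
    # The 10-20 TB range is the unmeasured estimate from above.
    for estimate_tb in (10, 20):
        rate = average_rate_mbit_per_s(estimate_tb, 7)
        print(f"{estimate_tb} TB in 7 days: {rate:.0f} Mbit/s on average")
```

Under these assumptions a saturated link could move roughly 75 TB in a week, so 10-20 TB works out to an average of about 130-265 Mbit/s. In other words, the estimate is plausible and the link is probably not the bottleneck most of the time.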