It is possible to build a crawler and store its data in DynamoDB, but at Google-like scale you have to consider several questions. For example: how much of the data should stay in DynamoDB, and what should be offloaded to something like S3? What exactly are you using DynamoDB for: storing the crawled content itself, storing metadata about crawled sites, or scheduling your crawls? I would suggest looking up crawling with Python / Node.js and DynamoDB; that may better answer your questions.
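To make the split concrete, here is a minimal Python sketch of one common pattern: offload the raw HTML to S3 and keep only lightweight metadata in DynamoDB. The table name, key schema, and field names are assumptions for illustration, and the actual boto3 `put_item` call is left commented out since it requires real AWS credentials and a provisioned table.

```python
import hashlib
import time

def build_page_item(url: str, html: bytes, status: int) -> dict:
    """Shape a DynamoDB metadata item for one crawled page.

    Assumed schema: `url` as partition key, `crawled_at` as sort key.
    The page body itself would live in S3 under `s3_key`, not in the item.
    """
    content_hash = hashlib.sha256(html).hexdigest()
    return {
        "url": url,
        "crawled_at": int(time.time()),          # useful for re-crawl scheduling
        "status": status,
        "content_hash": content_hash,            # cheap check for unchanged pages
        "s3_key": f"pages/{content_hash}.html",  # where the body was offloaded
    }

item = build_page_item("https://example.com", b"<html>...</html>", 200)

# With boto3 (not executed here; needs credentials and an existing table):
# import boto3
# boto3.resource("dynamodb").Table("crawl-metadata").put_item(Item=item)
```

Keeping only metadata in DynamoDB matters because DynamoDB items are capped at 400 KB and you pay per read/write capacity, whereas S3 is far cheaper for large blobs like full page bodies.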