I'm not sure what it's like at very large companies, but I agree with the other comment that assumes they make decisions by micro-testing interactions across their millions of users.
At a mid-size company where I worked, we looked at metrics in analytics software (like Pendo and FullStory, not GA) and then A/B tested variations to see whether they moved users toward the behavior we wanted.
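
For anyone curious what the "did the variation move users" check can look like, here is a minimal sketch using a two-proportion z-test. This is only one common way to compare two variants, not necessarily what any particular team or tool does; the function name and the counts in the usage lines are hypothetical.

```python
from math import sqrt, erfc


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variant A and variant B.

    conv_* is the number of users who did the desired action,
    n_* is the number of users who saw that variant.
    Returns (z, two-sided p-value). Hypothetical helper for illustration.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Made-up counts: 1,000 users per variant
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between variants is unlikely to be noise, though in practice you would also want to fix the sample size in advance and watch for peeking at results early.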