This looks cool for v1! The only problem I see is that most devices don't have much RAM, so the local models have to be small and most requests will end up going to the servers.
Apple could use this to sell more devices: every new generation can ship with more RAM, which means more requests handled on-device and therefore more privacy. People will have a real reason to buy a new phone more often.
Apple seems to be anticipating higher RAM needs in its M4+ chips: there are rumors that its entry-level computers include more RAM than specified.