I get what you're saying, and if you're talking about facts, it is a database. But breaking down mathematics and performing logic isn't a database in the true sense; it's generating novel responses based on a set of rules.
My point here is that even if it only "gets" the semantics, it has the ability to perform logic. It's just not very efficient. And I'd say that isn't far off from what's happening in our brains.
Do we really "get" logic, or do we rely on heuristics? Why do we know that 12 * 2 equals 24? Either because we remember our multiplication tables (a lookup from a table) or because we break it into smaller steps until we're left with pieces of the problem that we inherently know. For example, 12 * 2 means 12 + 12, which means taking 12 and incrementing it by 1 twelve times; or 10 + 10 = 20 and 2 + 2 = 4, so 12 + 12 = 24.
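To make the decomposition concrete, here's a toy sketch of that "multiplication as repeated increments" idea: everything bottoms out in a single +1 step that we "inherently know." The function names are my own invention for illustration.

```python
def increment(n: int) -> int:
    # The one primitive operation everything else reduces to
    return n + 1

def add(a: int, b: int) -> int:
    # a + b = increment a, b times
    for _ in range(b):
        a = increment(a)
    return a

def multiply(a: int, b: int) -> int:
    # a * b = add a into a running total, b times
    total = 0
    for _ in range(b):
        total = add(total, a)
    return total

print(multiply(12, 2))  # → 24
```

Horribly inefficient compared to a lookup table, but it gets the right answer using only rules, which is the point.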
I don't see why that couldn't reasonably be scaled up to advanced calculus.
I think the point you're hitting on is that LLMs aren't "full brains" with all of the components that human brains have, and that's true. But LLMs appear to have the ability to simulate (or replicate, depending on how far you want to go) executive function. As others have hit on, from there you can either have it command other specialized components to do specific tasks (recall facts, perform logic, etc.) or have it break the problem down into steps, then take the output from that processing and form a coherent response.
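That "executive dispatching to specialized components" pattern can be sketched in a few lines. This is purely illustrative: the tool names, the fixed plan, and the tiny fact table are all made up, and a real system would have the LLM produce the plan itself.

```python
from typing import Callable

# Specialized components the executive can delegate to
def recall_fact(query: str) -> str:
    # Stand-in for a fact store; contents are invented for the demo
    facts = {"capital of France": "Paris"}
    return facts.get(query, "unknown")

def do_arithmetic(expr: str) -> str:
    # Stand-in for a logic/calculation component; handles "a op b"
    a, op, b = expr.split()
    ops = {"+": lambda x, y: x + y, "*": lambda x, y: x * y}
    return str(ops[op](int(a), int(b)))

TOOLS: dict[str, Callable[[str], str]] = {
    "recall": recall_fact,
    "math": do_arithmetic,
}

def executive(plan: list[tuple[str, str]]) -> str:
    # Run each (tool, argument) step, then stitch the results
    # into one coherent response
    results = [TOOLS[tool](arg) for tool, arg in plan]
    return "; ".join(results)

print(executive([("recall", "capital of France"), ("math", "12 * 2")]))
# → Paris; 24
```

The executive itself knows nothing about geography or arithmetic; it only knows how to break work into steps and route them, which is roughly the division of labor being described above.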
The structure here is flexible enough that I'm struggling to find the upper limit, and even if all research into developing LLMs stopped tomorrow and everyone focused on building with them, I still don't think that limit would be found.