However, that split isn’t automatic. You can’t expect to run a 40GB model on that unless the software was designed to divide the model across devices, the way llama.cpp can split a model between the GPU and CPU, for instance.
What you can do without trouble is keep more models loaded, do more things at the same time, and occasionally run the same model at double speed if it batches well.
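
To make the "isn't automatic" point concrete, here is a minimal sketch of the bookkeeping a runtime has to do before a big model can span several devices. It is not llama.cpp's actual algorithm. The function name `plan_layer_split`, the layer sizes, and the 22 GB usable capacity per device are all made-up assumptions for illustration.

```python
def plan_layer_split(layer_sizes_gb, device_capacities_gb):
    """Greedily assign consecutive layers to devices, in order.

    Hypothetical helper for illustration only. Returns one list of layer
    indices per device, or raises ValueError if the layers do not fit.
    """
    plan = [[] for _ in device_capacities_gb]
    device = 0
    used = 0.0
    for index, size in enumerate(layer_sizes_gb):
        # Advance to the next device while the current one cannot hold this layer.
        while (device < len(device_capacities_gb)
               and used + size > device_capacities_gb[device]):
            device += 1
            used = 0.0
        if device == len(device_capacities_gb):
            raise ValueError(f"layer {index} does not fit on any remaining device")
        plan[device].append(index)
        used += size
    return plan


# Assumed numbers: a 40 GB model as 80 equal layers of 0.5 GB each,
# and two devices with 22 GB usable each after overhead.
layers = [0.5] * 80
plan = plan_layer_split(layers, [22.0, 22.0])
print([len(p) for p in plan])  # number of layers placed on each device
```

With a single 22 GB device the same call raises `ValueError`, which is the situation you are in when the software only knows how to put the whole model on one device.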