I saw that. That function is probably completely inlined by the optimiser as well, so the move likely doesn't even happen there.
Just wanted to make sure, since I know a lot of devs who aren't super proficient in C++ just stick a shared pointer on things to get around worries about ownership and don't consider the tradeoffs.
For me, using a unique_ptr is a form of compile-time check on how I'm thinking about the code. I find shared_ptr to be a smell. It's not necessarily wrong, but it probably needs some reasoning about.
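To show what I mean by the compile-time check, here's a rough sketch. The `Texture` and `Renderer` types are made up for illustration and have nothing to do with the actual codebase:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical types, for illustration only.
struct Texture {
    int id;
};

class Renderer {
public:
    // Taking unique_ptr by value states "I take ownership" in the signature.
    void adopt(std::unique_ptr<Texture> t) { textures_.push_back(std::move(t)); }
    std::size_t count() const { return textures_.size(); }

private:
    std::vector<std::unique_ptr<Texture>> textures_;
};

// Returns true if the caller's pointer is empty after the hand-off.
bool handOffLeavesCallerEmpty() {
    Renderer r;
    auto tex = std::make_unique<Texture>(Texture{42});
    // r.adopt(tex);          // would not compile: unique_ptr is not copyable
    r.adopt(std::move(tex));  // the ownership transfer is explicit at the call site
    assert(r.count() == 1);
    return tex == nullptr;    // a moved-from unique_ptr is guaranteed to be null
}
```

With a shared_ptr the commented-out line would compile fine and quietly add a second owner, which is exactly the part I'd want someone to have reasoned about.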
Another note: I saw some comments somewhere about SIMD vectorisation that needs to be implemented. I would first check whether the compiler is already doing that, and if it isn't, I would see about changing the code to make it possible for the optimiser to generate vectorised code.
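As a rough idea of the kind of loop compilers tend to vectorise on their own, here's a sketch. The `saxpy` kernel is a stand-in I made up, not something from the codebase, and `__restrict` is a compiler extension (GCC, Clang and MSVC accept it) rather than standard C++:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel: out[i] = a * x[i] + y[i].
// Contiguous data, a simple counted loop, no branches and no aliasing
// between the arrays all make the optimiser's job easier.
void saxpy(float a,
           const float* __restrict x,
           const float* __restrict y,
           float* __restrict out,
           std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a * x[i] + y[i];
    }
}

// Small usage example, doubling as a sanity check of the result.
void saxpyDemo() {
    const std::vector<float> x{1, 2, 3, 4};
    const std::vector<float> y{10, 20, 30, 40};
    std::vector<float> out(x.size());
    saxpy(2.0f, x.data(), y.data(), out.data(), out.size());
    assert(out[0] == 12.0f && out[3] == 48.0f);
}
```

To see whether a loop like this actually got vectorised, GCC can report it with `-O3 -fopt-info-vec` and Clang with `-O2 -Rpass=loop-vectorize`, or you can just look at the generated assembly. If the real loops have pointer aliasing, early exits or non-contiguous access, those are usually the first things I'd try to remove.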
I still haven't been at a PC, so I haven't been able to properly look through the code, but it is nice to at least see modern C++ being used in a codebase.