Back in the distant past I wrote some really big 32-bit ARM assembly projects. 64-bit ARM is really very similar!
I had a look through the code. Some ENTRY/EXIT macros would probably help with the drudgery of saving and restoring registers and setting up the stack frame. Some register renaming would also help readability (e.g. if a register points to incoming data throughout a subroutine, rename it pdata).
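To make that suggestion concrete, here is a rough sketch of what I mean. It assumes GNU as syntax for AArch64 (other assemblers spell the macro and alias directives differently) and the standard AAPCS64 calling convention. The macro bodies, the 32-byte frame size, the `count` alias and the `sum_bytes` routine are all made up for illustration and are not taken from your code.

```asm
// Hypothetical ENTRY/EXIT macros, GNU as syntax, AArch64.
// Assumption: the routine needs only x19 and x20 as callee-saved registers.
.macro ENTRY
    stp     x29, x30, [sp, #-32]!   // push frame pointer and link register, reserve 32 bytes
    mov     x29, sp                 // establish the new frame pointer
    stp     x19, x20, [sp, #16]     // save the callee-saved registers used below
.endm

.macro EXIT
    ldp     x19, x20, [sp, #16]     // restore the callee-saved registers
    ldp     x29, x30, [sp], #32     // pop frame pointer and link register, release the frame
    ret
.endm

// Register aliases: readable names instead of raw register numbers.
pdata   .req x19                    // pointer to the incoming data
count   .req x20                    // number of bytes still to process

    .text
    .global sum_bytes
// Hypothetical routine: x0 = pointer to bytes, x1 = length; returns the sum in x0.
sum_bytes:
    ENTRY
    mov     pdata, x0               // keep the data pointer in its named register
    mov     count, x1
    mov     x0, #0                  // running total
    cbz     count, 2f               // nothing to do for an empty buffer
1:
    ldrb    w2, [pdata], #1         // load one byte (zero-extended), advance the pointer
    add     x0, x0, x2
    subs    count, count, #1
    b.ne    1b
2:
    EXIT

    .unreq  pdata                   // release the aliases so later routines can reuse the names
    .unreq  count
```

With the prologue and epilogue in one place, changing the frame layout means editing two macros rather than every subroutine, and a line like `ldrb w2, [pdata], #1` tells you what the register is for without scrolling back up. On platforms that prefix C symbols with an underscore the label would need adjusting.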
I salute your effort and please enjoy the core dumps :-)