By the standards of this article, assembler on most architectures suffers from many of the same problems that make C “not low level”: in particular, it offers little control over the cache hierarchy or coherency (modulo hint instructions like x86’s PREFETCH), or over instruction-level parallelism. Of course, these limitations arise entirely because the dominant lingua franca (C) has no ability to express these semantics.
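To make the "hint instruction" point concrete, here is a minimal sketch in C, assuming a GCC or Clang toolchain (`__builtin_prefetch` is a compiler builtin, not standard C). The function name `sum_with_prefetch` and the look-ahead distance are made up for illustration. On x86 the builtin is typically lowered to one of the PREFETCH instructions, and the hardware is free to ignore it: it is a request, not control.

```c
#include <stddef.h>

/* Hypothetical helper: sums an array while hinting at upcoming reads. */
long sum_with_prefetch(const long *data, size_t n)
{
    /* Assumed look-ahead distance in elements; a real value needs tuning. */
    const size_t lookahead = 16;
    long total = 0;

    for (size_t i = 0; i < n; i++) {
        if (i + lookahead < n) {
            /* GCC/Clang builtin: address, rw (0 = read), locality (0..3).
               This is only a hint; it never changes the program's result. */
            __builtin_prefetch(&data[i + lookahead], 0, 3);
        }
        total += data[i];
    }
    return total;
}
```

The result is identical with or without the hint; only timing can differ, and nothing here lets the programmer say which cache level a line lives in or when it is evicted.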
In large part, the article argues that in most cases the abstract machines that ISAs describe differ so fundamentally from how code is actually executed on the underlying hardware that a truly low-level language is impossible to achieve.