First think about row and column vectors. A row vector and a column vector can be combined via standard matrix multiplication to produce a real number. From that perspective, a row vector is a function that takes a column vector and returns a real number. Similarly, column vectors take row vectors as arguments and produce real numbers.
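Concretely, in NumPy (the shapes here are arbitrary, just for illustration):

```python
import numpy as np

row = np.array([[1.0, 2.0, 3.0]])      # shape (1, 3): a row vector
col = np.array([[4.0], [5.0], [6.0]])  # shape (3, 1): a column vector

# Standard matrix multiplication: (1x3) @ (3x1) -> (1x1), i.e. a real number.
print((row @ col).item())  # 32.0

# The same thing viewed as a function: the row vector eats a column vector
# and returns a scalar.
f = lambda c: (row @ c).item()
print(f(col))  # 32.0
```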
It turns out that row (column) vectors are the only linear functions on column (row) vectors; this is a finite-dimensional case of the Riesz representation theorem. If I give you a linear function on row vectors, you can find a column vector so that computing my function is equivalent to multiplying by your column vector.
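Here's a sketch of the constructive direction (the function `f` below is just something I made up for illustration): evaluating the mystery linear function on the standard basis row vectors recovers the entries of the column vector.

```python
import numpy as np

n = 3

# Pretend this is a black box: all we know is that it's linear in `row`.
def f(row):
    return 2.0 * row[0, 0] - row[0, 1] + 5.0 * row[0, 2]

# Evaluate f on the standard basis row vectors e_1, ..., e_n;
# the results are exactly the entries of the column vector.
col = np.array([[f(np.eye(n)[i:i+1])] for i in range(n)])  # shape (n, 1)

# Now computing f is the same as a matrix multiply with col.
test = np.array([[7.0, -3.0, 2.0]])
print(f(test))              # 27.0
print((test @ col).item())  # 27.0
```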
Now on to matrices. Matrices take one row vector and one column vector and produce a real number. I can feed a matrix a single argument - the row vector, say - so that it becomes a function that takes one more argument (the column vector) before it returns a real number. Sort of like currying in functional programming. But as we said, the only linear functions that map column vectors into real numbers are row vectors. So by feeding our matrix one row vector, we've produced another row vector. This is the "matrices transform vectors" perspective in the OP's article. But I think the "Matrices are linear functions" perspective is more general and more powerful.
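The currying analogy in NumPy terms (again, made-up numbers):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])  # (3, 2): takes a 1x3 row and a 2x1 column

row = np.array([[1.0, 0.0, -1.0]])  # (1, 3)
col = np.array([[2.0], [1.0]])      # (2, 1)

# Fully applied: a real number.
print((row @ A @ col).item())  # -12.0

# Partially applied: feeding A the row vector alone yields another
# row vector -- a function still waiting for its column-vector argument.
partial = row @ A              # shape (1, 2)
print(partial)                 # [[-4. -4.]]
print((partial @ col).item())  # -12.0, same as above
```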
This perspective of vectors, matrices, etc. as functions might seem needlessly convoluted. But I think it's the right way to think about these objects. Tricky concepts like the tensor product and vector space duality become relatively trivial once you see all of these objects as functions.
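For instance, the tensor product of a column vector and a row vector is just their outer product, and as a function it does the obvious thing: apply each factor to its own argument and multiply the resulting scalars. A quick sketch:

```python
import numpy as np

col = np.array([[1.0], [2.0]])     # (2, 1): a function of row vectors
row = np.array([[3.0, 4.0, 5.0]])  # (1, 3): a function of column vectors

# Their tensor (outer) product is a (2, 3) matrix: a function of one
# row vector and one column vector.
M = col @ row

u = np.array([[1.0, -1.0]])          # a 1x2 row vector
v = np.array([[1.0], [0.0], [2.0]])  # a 3x1 column vector

# Applying M agrees with applying each factor separately and multiplying.
print((u @ M @ v).item())                   # -13.0
print((u @ col).item() * (row @ v).item())  # -13.0
```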