Bloom filters let you prune the set of files you have to look at in the first place, which matters a lot when there is a cost associated with scanning the files.
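To make the pruning idea concrete, here is a minimal sketch of one Bloom filter per file, assuming a toy in-memory setup. The `BloomFilter` class, the `files_to_scan` helper, the file names, and the sizing (1024 bits, 3 hashes) are all hypothetical choices for illustration, not how any particular Parquet reader implements it.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter backed by a Python int used as a bitset."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, value):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def files_to_scan(filters, value):
    """Return only the files whose filter cannot rule the value out."""
    return [name for name, bf in filters.items() if bf.might_contain(value)]


# Hypothetical files and the ids they hold.
file_ids = {
    "part-0.parquet": ["a1", "b2", "c3"],
    "part-1.parquet": ["d4", "e5", "f6"],
}

filters = {}
for name, ids in file_ids.items():
    bf = BloomFilter()
    for value in ids:
        bf.add(value)
    filters[name] = bf

candidates = files_to_scan(filters, "e5")
```

A Bloom filter never gives a false negative, so `part-1.parquet` is always among the candidates for `"e5"`. It can give false positives, so an extra file may occasionally be listed and has to be scanned anyway.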
Partitioning the data can be advantageous both for pruning (on its sort keys) and for parallel query fan-out (you scan each partition independently and apply the predicate on the high-cardinality column concurrently).
In the use case that underpins the article, they want to minimize unnecessary access to Parquet file data because it lives on a high-latency storage system, and the compute infrastructure is not meant to be scaled up to match the number of partitions. So they just want an index to help find the data in the high-cardinality column.
For example, when you have a predicate like `where id = 'fdhah-4311-ddsdd-222aa'`, sorting on the `id` column will help.
However, if you have predicates on multiple different sets of columns, such as another query on `state = 'MA'`, you can't pick an ideal sort order for all of them.
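A small sketch of why one sort order cannot serve both predicates, assuming files are pruned by per-file min/max statistics. The rows, the two-rows-per-file chunking, and the helper names `chunk` and `files_matching` are made up for illustration.

```python
def chunk(rows, size):
    """Split rows into consecutive 'files' of at most `size` rows."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def files_matching(files, column, value):
    """Min/max pruning: keep files whose [min, max] range covers the value."""
    hits = []
    for index, rows in enumerate(files):
        values = [row[column] for row in rows]
        if min(values) <= value <= max(values):
            hits.append(index)
    return hits


# Hypothetical data set.
rows = [
    {"id": "a", "state": "MA"},
    {"id": "b", "state": "NY"},
    {"id": "c", "state": "CA"},
    {"id": "d", "state": "MA"},
    {"id": "e", "state": "TX"},
    {"id": "f", "state": "CA"},
]

by_id = chunk(sorted(rows, key=lambda r: r["id"]), 2)
by_state = chunk(sorted(rows, key=lambda r: r["state"]), 2)

print(files_matching(by_id, "id", "d"))         # [1]
print(files_matching(by_id, "state", "MA"))     # [0, 1, 2]
print(files_matching(by_state, "state", "MA"))  # [1]
print(files_matching(by_state, "id", "d"))      # [0, 1, 2]
```

Whichever column the data is sorted on gets tight, non-overlapping ranges and prunes well. The other column's ranges span nearly everything, so every file has to be opened.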
People often partition (sort) on the low-cardinality columns first, as that tends to improve compression significantly.
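One way to see the compression effect is to count the runs of equal values in the low-cardinality column under each sort order. Fewer runs is what run-length style encodings benefit from. The data and the `run_count` helper are hypothetical, and this is only a rough proxy for what a real columnar encoder does.

```python
from itertools import groupby


def run_count(values):
    """Number of runs of consecutive equal values."""
    return sum(1 for _ in groupby(values))


states = ["MA", "NY", "CA"]
# Hypothetical rows: (id, state), with states interleaved across ids.
rows = [("id-%03d" % i, states[i % 3]) for i in range(9)]

by_id = sorted(rows)                                 # high-cardinality column first
by_state = sorted(rows, key=lambda r: (r[1], r[0]))  # low-cardinality column first

runs_by_id = run_count([state for _, state in by_id])
runs_by_state = run_count([state for _, state in by_state])

print(runs_by_id, runs_by_state)  # 9 3
```

Sorted by `id`, the `state` column changes value on every row and yields 9 runs. Sorted by `state` first, it collapses to 3 runs, one per distinct value.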