Parquet files store row-group metadata in the file footer; Delta Lake additionally records file-level metadata in its transaction log. So Delta Lake can skip entire files before even opening any of the Parquet files to read their row-group metadata.
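A toy sketch of the idea: the transaction log keeps per-file min/max column statistics, so a reader can discard files whose value range cannot match the query predicate without touching them. The file names, the `min_ts`/`max_ts` fields, and the `files_to_scan` helper below are all invented for illustration, not Delta Lake's actual log format.

```python
# Hypothetical per-file statistics, as a planner might read them from
# the transaction log (structure invented for this sketch).
files = [
    {"path": "part-0.parquet", "min_ts": "2023-01-01", "max_ts": "2023-01-31"},
    {"path": "part-1.parquet", "min_ts": "2023-02-01", "max_ts": "2023-02-28"},
    {"path": "part-2.parquet", "min_ts": "2023-03-01", "max_ts": "2023-03-31"},
]

def files_to_scan(files, lo, hi):
    """Keep only files whose [min_ts, max_ts] range overlaps [lo, hi].

    ISO-8601 date strings compare correctly as plain strings, so no
    parsing is needed for this sketch.
    """
    return [f["path"] for f in files if f["max_ts"] >= lo and f["min_ts"] <= hi]

# A query filtered to mid-February only needs to open one file.
print(files_to_scan(files, "2023-02-10", "2023-02-20"))
```

The point is that this pruning happens purely on log metadata: the two skipped files are never opened at all, whereas plain Parquet would need at least a footer read per file.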
Delta Lake lets you rearrange your data to improve file skipping. For time-series analyses, for example, you can Z-Order by timestamp.
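To see why Z-Ordering helps, here is a toy sketch of the underlying idea: interleaving the bits of two columns produces a single sort key (a Morton code) that keeps rows close in both dimensions at once, so per-file min/max statistics stay tight for either column. This illustrates the concept only; it is not Delta Lake's actual implementation.

```python
def z_value(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y into one Morton (Z-order) key."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)      # x takes the even bit positions
        z |= ((y >> i) & 1) << (2 * i + 1)  # y takes the odd bit positions
    return z

# Sorting rows by the interleaved key clusters them in both dimensions,
# instead of perfectly in one dimension and randomly in the other.
rows = [(3, 7), (0, 0), (5, 2), (1, 6)]
rows.sort(key=lambda r: z_value(*r))
```

Files written from data sorted this way get narrow min/max ranges on both columns, so the file-level skipping described above works for filters on either one.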
Delta Lake also supports schema evolution, so a table's schema can change over time.
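Conceptually, evolving a schema means widening it to the union of the old and new columns, with old rows reading the added columns as nulls. The dict-based schemas and the `evolve` helper below are a made-up sketch of that merge rule, not Delta Lake's API.

```python
# Current table schema and the schema of newly arriving data
# (column names and types invented for this sketch).
table_schema = {"id": "long", "name": "string"}
incoming = {"id": "long", "name": "string", "email": "string"}

def evolve(current, new):
    """Widen `current` to include any columns only present in `new`.

    Matching columns must keep the same type; a type change is rejected
    rather than silently coerced in this sketch.
    """
    merged = dict(current)
    for col, typ in new.items():
        if col in merged and merged[col] != typ:
            raise TypeError(f"type conflict on column {col!r}")
        merged.setdefault(col, typ)
    return merged

print(evolve(table_schema, incoming))
```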
This company may have a cool file format, but is it closed source? It seems like enterprises don't want to be locked into closed formats anymore.
Schema evolution is something that came up in a water-cooler conversation on my team the other day.
Thanks, but I'll stick with Parquet for now.
On reading, the reverse happens.
This becomes the compute/space conundrum: space is reduced by the regularity within each column, but time is increased by the extra overhead of columnar compression and decompression.
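The space side of that trade-off is easy to demonstrate with a toy experiment: the same records compress better laid out column-by-column than row-by-row, because grouping a low-cardinality column creates long repetitive runs, while the extra encode/decode passes are what cost CPU time. The data and layout below are made up purely for illustration.

```python
import zlib

# Synthetic table: an id column and a low-cardinality category column.
n = 10_000
ids = [str(i) for i in range(n)]
categories = [["red", "green", "blue"][i % 3] for i in range(n)]

# Row-major layout interleaves the two columns record by record.
row_major = "".join(f"{i},{c}\n" for i, c in zip(ids, categories)).encode()

# Column-major layout stores each column contiguously.
col_major = (",".join(ids) + "|" + ",".join(categories)).encode()

# The category column becomes a near-perfectly repeating run when
# stored contiguously, so the columnar layout compresses smaller.
print(len(zlib.compress(row_major)), len(zlib.compress(col_major)))
```

The columnar bytes come out smaller, but note both layouts paid a compression pass to get there, which is exactly the compute half of the conundrum.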