This is a major hassle when converting from Avro (coming from Kafka, which uses a schema registry, so schemas are not shipped with the Avro data) and storing it in Parquet, which requires a schema in the file, although you can 'upgrade' it with another schema when reading it. It would be great to have a binary protocol-like format (schema-less Avro) and a schema-less columnar storage format, which I guess is what these guys are doing.
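
To make the "schemas are not shipped with the avro data" part concrete, here is a rough sketch assuming the Kafka messages use the Confluent Schema Registry wire format (one zero magic byte, a 4-byte big-endian schema ID, then the raw Avro binary body). The function name and the sample message are made up for illustration, and a real pipeline would still have to fetch the writer schema from the registry by ID before it could decode the payload or build a Parquet schema from it.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0  # assumed: Confluent wire format marker


def split_confluent_frame(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message into (schema_id, avro_payload).

    Hypothetical helper: it only parses the 5-byte header and does
    not decode the Avro body, since that needs the writer schema.
    """
    if len(message) < 5:
        raise ValueError("message too short for a 5-byte header")
    # ">bI" = big-endian, 1 signed byte + 4-byte unsigned int, no padding
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[5:]


# Made-up message: schema ID 42, body is the Avro string "foo"
# (zigzag-encoded length 3 -> 0x06, followed by the UTF-8 bytes).
frame = struct.pack(">bI", MAGIC_BYTE, 42) + b"\x06foo"
schema_id, payload = split_confluent_frame(frame)
print(schema_id, payload)
```

The payload on its own says nothing about field names or types, which is why the schema has to be looked up separately and then written into the Parquet file.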