Type: Task
Resolution: Fixed
Priority: Normal
Right now the data in HDFS is grouped by listened_at timestamps, which means that adding new data involves updating multiple Parquet files. If we grouped it by inserted_at instead, then each incremental dump would simply create one new Parquet file, with no need to update existing ones.
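A minimal PySpark sketch of the proposed layout, purely illustrative: the path, the import_incremental_dump helper, and the string-formatted inserted_at batch column are all hypothetical, not the actual ListenBrainz code. The point is that partitioning by inserted_at makes each dump an append of one fresh partition, leaving previously written files untouched.

from datetime import datetime, timezone

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("incremental-dump-sketch").getOrCreate()

# Hypothetical HDFS location; the real path will differ.
LISTENS_PATH = "hdfs:///data/listenbrainz/listens"

def import_incremental_dump(dump_df):
    """Append one incremental dump as a new partition keyed by inserted_at.

    Every row in a single dump shares the same inserted_at batch value,
    so the write creates exactly one new partition directory instead of
    rewriting the many listened_at-based files the dump would otherwise
    touch.
    """
    batch_ts = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    (dump_df
        .withColumn("inserted_at", F.lit(batch_ts))
        .write
        .partitionBy("inserted_at")
        .mode("append")  # append-only: older partitions are never rewritten
        .parquet(LISTENS_PATH))

Under listened_at partitioning, the same append would scatter rows across many existing date partitions (listens in a dump can carry arbitrarily old timestamps), which is exactly the multi-file update this change avoids.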