-
Task
-
Resolution: Fixed
-
Normal
-
None
-
None
We should update the way we import listens into Spark. Some areas for improvement:
- Use a context manager to create temporary directories locally, so that they are cleaned up automatically once the import is done or if an error occurs, and we don't have to delete them ourselves.
- We don't create the '/temp' directory in HDFS explicitly, which might be confusing for readers who are new to the code. We should create it explicitly before writing to it.
- We should use different variable names for directories in HDFS and directories on the local filesystem, to reduce confusion.
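
A rough sketch of how the three points above could fit together. This is not the existing import code: `import_dump`, `extract`, `hdfs_makedirs` and `hdfs_upload` are hypothetical names, and the HDFS operations are passed in as callables because the actual HDFS client calls are not shown in this ticket.

```python
import os
import posixpath
import tempfile

# Assumption: the staging directory inside HDFS (the ticket only mentions '/temp').
HDFS_TEMP_DIR = "/temp"


def import_dump(extract, hdfs_makedirs, hdfs_upload):
    """Stage a dump in a local temporary directory, then copy it into HDFS.

    extract(local_dir)                 -- hypothetical: writes the dump files into local_dir
    hdfs_makedirs(hdfs_path)           -- hypothetical wrapper around the HDFS client
    hdfs_upload(hdfs_path, local_path) -- hypothetical wrapper around the HDFS client
    """
    # Create the HDFS temp directory explicitly so readers can see where it comes from.
    hdfs_makedirs(HDFS_TEMP_DIR)

    uploaded = []
    # The context manager removes local_temp_dir on normal exit and on error.
    with tempfile.TemporaryDirectory() as local_temp_dir:
        extract(local_temp_dir)
        for name in sorted(os.listdir(local_temp_dir)):
            # Distinct names for local and HDFS paths, so they cannot be mixed up.
            local_path = os.path.join(local_temp_dir, name)
            hdfs_path = posixpath.join(HDFS_TEMP_DIR, name)
            hdfs_upload(hdfs_path, local_path)
            uploaded.append(hdfs_path)
    return uploaded
```

With this shape, no explicit cleanup code is needed for the local directory, and the `local_` / `hdfs_` prefixes make it clear which filesystem each path belongs to.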