Task
Resolution: Won't Fix
Normal
So far we have been able to process user listening history and produce useful output from it using collaborative filtering. We call this output candidate recordings. This task aims to release these candidate recordings so that people can experiment with the dataset and tell us what they like and what they don't like about it. We want to focus on what they don't like so that the second version is better.
Ideally, there should be two dumps of candidate recordings:
- Top-artists: recordings by the artists the user listens to most.
- Similar-artist: recordings by artists similar to those the user listens to.
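To make the difference between the two dumps concrete, here is a minimal sketch in plain Python. It is not the project's pipeline: the tuple layout, the `similarity` mapping and every function name are assumptions made up for illustration, and the toy data stands in for real listens.

```python
from collections import Counter


def top_artists(listens, user, n=2):
    """Return the user's n most-listened artists, most frequent first."""
    counts = Counter(artist for u, artist, _ in listens if u == user)
    return [artist for artist, _ in counts.most_common(n)]


def similar_artists(artists, similarity):
    """Return artists related to the given ones, excluding the given ones."""
    seen = set(artists)
    related = []
    for artist in artists:
        for other in similarity.get(artist, []):
            if other not in seen:
                seen.add(other)
                related.append(other)
    return related


def candidate_recordings(artists, catalog):
    """Collect every catalog recording credited to one of the given artists."""
    wanted = set(artists)
    return sorted(rec for artist, rec in catalog if artist in wanted)


# Hypothetical toy data: (user, artist, recording) listens.
listens = [
    ("alice", "artist_a", "rec_a1"),
    ("alice", "artist_a", "rec_a2"),
    ("alice", "artist_a", "rec_a1"),
    ("alice", "artist_b", "rec_b1"),
    ("alice", "artist_b", "rec_b1"),
    ("alice", "artist_c", "rec_c1"),
    ("bob", "artist_c", "rec_c1"),
]
# Hypothetical catalog of (artist, recording) pairs.
catalog = [
    ("artist_a", "rec_a1"), ("artist_a", "rec_a2"), ("artist_a", "rec_a3"),
    ("artist_b", "rec_b1"), ("artist_c", "rec_c1"),
    ("artist_d", "rec_d1"), ("artist_d", "rec_d2"),
]
# Hypothetical artist-to-artist similarity lists.
similarity = {"artist_a": ["artist_b", "artist_d"], "artist_b": ["artist_c"]}

alice_top = top_artists(listens, "alice")
alice_similar = similar_artists(alice_top, similarity)
top_artist_set = candidate_recordings(alice_top, catalog)
similar_artist_set = candidate_recordings(alice_similar, catalog)
```

In this sketch the similar-artist set deliberately excludes the user's own top artists, so the two dumps do not overlap. Whether the real dumps should overlap is a design decision this ticket does not settle.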
A general roadmap:
- Create dataframes (preprocess the data) of listening history from 2004-2019, that is, from the inception of LB, so that all users are included.
- Train the model on this data.
- Create candidate sets for all users. The candidate sets should also be built on the 2004-2019 data, since this dump will give insight into listening habits, and the recommendations drawn from them, across the whole period.
- Generate candidate recordings. Right now the recordings are written to an HTML file; instead, prepare data dumps of these recordings. This is the crux of this task.
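The last step, writing a dump instead of HTML, could look roughly like the sketch below. It assumes a JSON Lines file with one row per user; that format, the field names `user_name` and `recordings`, the file name and both helper functions are hypothetical choices for illustration, not the schema this ticket refers to.

```python
import json
import tempfile
from pathlib import Path


def write_candidate_dump(candidates, path):
    """Write one JSON object per line: a user and that user's candidate recordings."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as handle:
        # Sort users so repeated runs produce byte-identical dumps.
        for user_name in sorted(candidates):
            row = {"user_name": user_name, "recordings": candidates[user_name]}
            handle.write(json.dumps(row, ensure_ascii=False) + "\n")
    return path


def read_candidate_dump(path):
    """Load a dump written by write_candidate_dump back into a list of rows."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Hypothetical candidate sets keyed by user name.
top_artist_candidates = {
    "bob": ["rec_c1"],
    "alice": ["rec_a1", "rec_a2", "rec_b1"],
}

with tempfile.TemporaryDirectory() as tmp:
    dump_path = write_candidate_dump(
        top_artist_candidates, Path(tmp) / "top_artist_candidates.jsonl"
    )
    rows = read_candidate_dump(dump_path)
```

A line-oriented format keeps the dump streamable, so consumers can process one user at a time without loading the whole file. The same writer could be called a second time for the similar-artist candidate sets.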
A concrete roadmap and ideas on what the data should look like (schema) can be inferred from here.