ListenBrainz / LB-603

All time release stats overload the spark cluster


    • Type: Bug
    • Resolution: Fixed
    • Priority: Normal

      Getting this error when calculating all-time user stats:

      20/05/26 17:20:14 ERROR TaskSchedulerImpl: Lost executor 0 on 10.0.1.100: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
      20/05/26 17:20:15 ERROR TaskSchedulerImpl: Lost executor 1 on 10.0.1.101: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
      20/05/26 17:20:16 ERROR TaskSchedulerImpl: Lost executor 2 on 10.0.1.99: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
      
      

      I have a feeling we could fix this by tuning the Spark cluster, but it will eventually become a problem again anyway.

       

      We should do the following:

      • Make all-time release stats an independent command so that we can test it without also having to calculate artist stats.
      • Only fetch around 100-200 releases per user. The query will get more complicated, but that seems like a reasonable compromise.
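
One possible shape for the per-user limit is a window function that ranks each user's releases by listen count and keeps only the top N. The sketch below is not the actual ListenBrainz query: the `listens` table, its `user_name` and `release_name` columns, and the helper names are hypothetical, and Python's built-in `sqlite3` stands in for Spark so the query can be tried locally without a cluster. The `ROW_NUMBER() OVER (PARTITION BY ...)` construct is also available in Spark SQL, though how the limit is passed in would differ there.

```python
import sqlite3

# Hypothetical schema: one row per listen, with user_name and release_name.
# The :top_n placeholder is sqlite3 named-parameter syntax.
TOP_RELEASES_QUERY = """
WITH release_counts AS (
    SELECT user_name, release_name, COUNT(*) AS listen_count
      FROM listens
  GROUP BY user_name, release_name
), ranked AS (
    SELECT user_name, release_name, listen_count,
           ROW_NUMBER() OVER (
               PARTITION BY user_name
               ORDER BY listen_count DESC, release_name
           ) AS rnk
      FROM release_counts
)
SELECT user_name, release_name, listen_count
  FROM ranked
 WHERE rnk <= :top_n
 ORDER BY user_name, rnk
"""


def top_releases_per_user(conn, top_n=200):
    """Return (user_name, release_name, listen_count) rows, at most top_n per user."""
    return conn.execute(TOP_RELEASES_QUERY, {"top_n": top_n}).fetchall()


def demo_connection():
    """Build a tiny in-memory table with made-up listens for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE listens (user_name TEXT, release_name TEXT)")
    rows = (
        [("alice", "A")] * 3
        + [("alice", "B")] * 2
        + [("alice", "C")]
        + [("bob", "X")]
    )
    conn.executemany("INSERT INTO listens VALUES (?, ?)", rows)
    return conn


if __name__ == "__main__":
    for row in top_releases_per_user(demo_connection(), top_n=2):
        print(row)
```

Capping the rows per user bounds the size of the result that has to be collected for each user, at the cost of the extra ranking step, which is the trade-off described above.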

            ishaanshah Ishaan Shah
            iliekcomputers Param Singh