MusicBrainz Search Server / SEARCH-580

Indexer doesn't always return in musicbrainz-docker

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • Component/s: Indexer

      Running python -m sir reindex doesn't always return in musicbrainz-docker.

      It never happened in production using nohup.

      Steps to reproduce

      Install musicbrainz-docker@rmq-sir-solr (related PR) and run as follows:

      git clone -b rmq-sir-solr https://github.com/yvanzo/musicbrainz-docker.git
      cd musicbrainz-docker
      export COMPOSE_FILE=docker-compose.yml:docker-compose.public.yml:docker-compose.musicbrainz-development.yml
      docker-compose up -d
      docker-compose exec musicbrainz /createdb.sh -fetch -sample
      

      Log files

      SIR has a debug option but no logging option, so its stdout must be captured:

      docker-compose exec indexer python -m sir -d reindex \
      > reindex-`date -u -Iseconds`.debug.log
      

      Solr has logs that can be more useful to check for activity:

      docker-compose logs search
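
Because the symptom is a command that never returns, it can help to put a deadline on the run so that a stuck reindex is reported instead of blocking the shell. The following is only a sketch, assuming GNU coreutils `timeout` is available on the host; `run_with_deadline` is a hypothetical helper name and the one-hour limit in the usage comment is arbitrary. The `-T` flag of `docker-compose exec` disables pseudo-TTY allocation, which is usually what is wanted when redirecting output.

```shell
# Hypothetical helper: run a command with a deadline and report a timeout.
# Assumes GNU coreutils `timeout`, which exits with status 124 when the
# deadline is reached.
run_with_deadline() {
  deadline_limit="$1"
  shift
  deadline_status=0
  timeout "$deadline_limit" "$@" || deadline_status=$?
  if [ "$deadline_status" -eq 124 ]; then
    echo "command did not return within ${deadline_limit}" >&2
  fi
  return "$deadline_status"
}

# Usage (the limit is an arbitrary example value):
#   run_with_deadline 1h docker-compose exec -T indexer python -m sir -d reindex \
#     > "reindex-$(date -u -Iseconds).debug.log"
```

Note that stopping the `docker-compose exec` client may leave the process inside the container running, so the container should still be checked after a timeout.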
      

      Indexes status

      Search indexes can be rebuilt for a specific entity type (roughly corresponding to a Solr core) as follows:

      entity_type=annotation
      docker-compose exec indexer python -m sir -d reindex --entity-type ${entity_type} \
      > reindex-${entity_type}-`date -u -Iseconds`.debug.log
      

      Index completion can be checked by comparing, for each entity type (roughly corresponding to a Solr core):

      • numDocs in core index at http://localhost:8983/solr/#/~cores
      • number of entries in the corresponding table
        entity_type=annotation
        docker-compose exec db psql -U musicbrainz -d musicbrainz_db \
        -c "COPY(SELECT COUNT(*) FROM ${entity_type}) TO STDOUT"
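
The comparison above can be scripted. The sketch below is not part of SIR or musicbrainz-docker; it assumes that `curl` and `jq` are installed on the host, that Solr is reachable at `http://localhost:8983/solr`, and that the core is named after the table. The function names are hypothetical. It reads `numDocs` through the Solr CoreAdmin `STATUS` action and reuses the `psql` query shown above.

```shell
# Assumption: Solr is published on the host at this address.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Print numDocs for core $1, using the Solr CoreAdmin STATUS action.
solr_num_docs() {
  curl -s "${SOLR_URL}/admin/cores?action=STATUS&core=$1&wt=json" |
    jq -r ".status[\"$1\"].index.numDocs"
}

# Print the row count of table $1 (same query as above; -T disables the TTY).
db_row_count() {
  docker-compose exec -T db psql -U musicbrainz -d musicbrainz_db \
    -c "COPY(SELECT COUNT(*) FROM $1) TO STDOUT"
}

# Compare the two counts: $1 = entity type, $2 = numDocs, $3 = row count.
# Returns non-zero when the counts differ.
report_core() {
  if [ "$2" -eq "$3" ]; then
    echo "$1: complete ($2/$3)"
  else
    echo "$1: incomplete ($2/$3)"
    return 1
  fi
}

# Usage:
#   entity_type=annotation
#   report_core "$entity_type" "$(solr_num_docs "$entity_type")" \
#     "$(db_row_count "$entity_type")"
```

Keeping the comparison in its own function makes it possible to check it without a running Solr or database.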
        


          yvanzo added a comment -

          It seems to be some kind of thrashing that may occur when SIR is not configured appropriately with respect to system resources.


          yvanzo added a comment -

          Note: Mentioning nohup was not relevant here; using it doesn’t solve this issue. Another setup difference is that production has 12 CPU cores, whereas the test environment we used to reproduce this issue has 4.


          yvanzo added a comment -

          Updated the description with steps to reproduce and commands to check logs and index status.


          yvanzo added a comment -

          It never happened when running `sir reindex` in production, where it is run with `nohup` rather than with `docker` or `docker-compose`, so I still wonder whether this issue comes from SIR, its Docker image, the MB Solr Docker image, or Docker Compose.

           


            Assignee: Unassigned
            Reporter: yvanzo
            Votes: 0
            Watchers: 2
