Harvest items duplicate key issue #788

@maudetes

Description

We currently have an issue in the harvest items v-for loop caused by duplicate keys.

          <tr
            v-for="item in paginatedItems"
            :key="item.remote_id"
          >

Indeed, item.remote_id is not unique: "skipped" items have remote_id set to null, and some remote ids are duplicated.

I looked for a unique property, but items don't have an id, and it seems that started, created and ended are not 100% unique either.
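To illustrate why remote_id fails as a key, here is a minimal sketch in plain Python. The helper `duplicate_key_values` and the sample items are hypothetical and only mimic the shape of harvest items; they are not taken from udata or from a real job.

```python
from collections import Counter
from typing import Any, Iterable


def duplicate_key_values(items: Iterable[dict[str, Any]], field: str) -> dict[Any, int]:
    """Return each value of `field` that occurs more than once, with its count."""
    counts = Counter(item.get(field) for item in items)
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical sample data: two skipped items without a remote_id,
# plus one remote_id that appears twice.
sample_items = [
    {"remote_id": "a", "status": "done"},
    {"remote_id": None, "status": "skipped"},
    {"remote_id": None, "status": "skipped"},
    {"remote_id": "a", "status": "done"},
    {"remote_id": "b", "status": "done"},
]

duplicates = duplicate_key_values(sample_items, "remote_id")
print(duplicates)
```

Any non-empty result means the field cannot be used on its own as a v-for key. The same helper can be pointed at started, created or ended to check those candidates.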

Pipeline to check for uniqueness
from udata.harvest.models import HarvestJob
pipeline = [
    # Step 1: unwind each HarvestJob into one document per HarvestItem
    {
        "$unwind": "$items"
    },
    # Step 2: group by HarvestJob and by item timestamp (here `items.started`)
    {
        "$group": {
            "_id": {
                "job_id": "$_id",
                "created_date": "$items.started"
            },
            "count": {"$sum": 1},
            "job": {"$first": "$$ROOT"}
        }
    },
    # Step 3: keep only groups where count > 1 (several items sharing the same timestamp)
    {
        "$match": {
            "count": {"$gt": 1}
        }
    },
    # Step 4: regroup by job_id so each job appears only once
    {
        "$group": {
            "_id": "$_id.job_id",
            "job": {"$first": "$job"},
            "duplicate_dates": {"$push": "$_id.created_date"}
        }
    },
    # Step 5: reshape the document for a clear display
    {
        "$project": {
            "job": 1,
            "duplicate_dates": 1
        }
    }
]
results = list(HarvestJob.objects(created__gte="2025-11-25").aggregate(*pipeline))
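
A possible way to read the output, sketched under the assumption that each result document has the `_id`, `job` and `duplicate_dates` fields produced by the pipeline above. The helper `summarize_duplicates` and the sample documents are hypothetical.

```python
from typing import Any


def summarize_duplicates(results: list[dict[str, Any]]) -> list[tuple[Any, int]]:
    """Return (job id, number of duplicated timestamps) pairs, largest first."""
    summary = [(doc["_id"], len(doc["duplicate_dates"])) for doc in results]
    return sorted(summary, key=lambda pair: pair[1], reverse=True)


# Hypothetical documents shaped like the pipeline output, not real data.
sample_results = [
    {"_id": "job-1", "job": {}, "duplicate_dates": ["2025-11-26T10:00:00"]},
    {
        "_id": "job-2",
        "job": {},
        "duplicate_dates": ["2025-11-26T10:00:00", "2025-11-26T10:00:01"],
    },
]

summary = summarize_duplicates(sample_results)
for job_id, duplicate_count in summary:
    print(job_id, duplicate_count)
```

An empty list would mean the grouped timestamp is unique within every job in the queried range.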
