Implement server-side search endpoint for scalable file searching #2

@bryanchriswhite

Summary

Currently, file search is implemented entirely client-side using Fuse.js in the React UI. This works well for small-to-medium datasets, but it scales poorly as the dataset grows, because every file record must be fetched and filtered in the browser.

This issue proposes implementing a server-side search endpoint to handle search queries on the backend, which will:

  • Reduce client-side data transfer for large datasets
  • Enable pagination of search results
  • Support more advanced search features
  • Improve performance for users with slower devices

Current Implementation

Client-side search in pinshare-ui/src/pages/Browse.jsx:

  • Uses Fuse.js for fuzzy matching
  • Searches across: fileName, ipfsCID, fileType, fileSHA256
  • Field weights: fileName (0.4), ipfsCID (0.3), fileType (0.2), fileSHA256 (0.1)
  • Threshold: 0.3 (Fuse.js fuzzy-matching tolerance; 0.0 requires an exact match, 1.0 matches anything)
  • All files are fetched via GET /files then filtered in browser

Proposed Changes

Backend API

Add new endpoint: GET /files/search

Query Parameters:

  • q (required): Search query string
  • limit (optional): Max results to return (default: 50)
  • offset (optional): Pagination offset (default: 0)
  • fields (optional): Comma-separated fields to search (default: all)
  • threshold (optional): Fuzzy match threshold (default: 0.3)
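
A minimal sketch of how the backend might parse and validate these parameters, assuming a Go handler that receives them as `url.Values`. The `SearchParams` type, the `ParseSearchParams` function, and the validation rules (positive `limit`, non-negative `offset`, `threshold` between 0 and 1, rejecting unknown fields) are assumptions for illustration; only the parameter names and defaults come from the list above.

```go
package main

import (
	"errors"
	"net/url"
	"strconv"
	"strings"
)

// SearchParams is a hypothetical container for the proposed query parameters.
type SearchParams struct {
	Query     string
	Limit     int
	Offset    int
	Fields    []string
	Threshold float64
}

// allFields lists the fields the current client-side search covers.
var allFields = []string{"fileName", "ipfsCID", "fileType", "fileSHA256"}

func isKnownField(name string) bool {
	for _, f := range allFields {
		if f == name {
			return true
		}
	}
	return false
}

// ParseSearchParams applies the proposed defaults and rejects invalid input.
func ParseSearchParams(v url.Values) (SearchParams, error) {
	p := SearchParams{Limit: 50, Offset: 0, Fields: allFields, Threshold: 0.3}

	p.Query = strings.TrimSpace(v.Get("q"))
	if p.Query == "" {
		return p, errors.New("q is required")
	}
	if s := v.Get("limit"); s != "" {
		n, err := strconv.Atoi(s)
		if err != nil || n < 1 {
			return p, errors.New("limit must be a positive integer")
		}
		p.Limit = n
	}
	if s := v.Get("offset"); s != "" {
		n, err := strconv.Atoi(s)
		if err != nil || n < 0 {
			return p, errors.New("offset must be a non-negative integer")
		}
		p.Offset = n
	}
	if s := v.Get("fields"); s != "" {
		var fields []string
		for _, f := range strings.Split(s, ",") {
			f = strings.TrimSpace(f)
			if !isKnownField(f) {
				return p, errors.New("unknown field: " + f)
			}
			fields = append(fields, f)
		}
		p.Fields = fields
	}
	if s := v.Get("threshold"); s != "" {
		t, err := strconv.ParseFloat(s, 64)
		// The negated range check also rejects NaN.
		if err != nil || !(t >= 0 && t <= 1) {
			return p, errors.New("threshold must be between 0 and 1")
		}
		p.Threshold = t
	}
	return p, nil
}
```

In an `net/http` handler this could be called as `ParseSearchParams(r.URL.Query())`, returning a 400 response when an error comes back.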

Response:

{
  "results": [...],
  "total": 123,
  "limit": 50,
  "offset": 0,
  "query": "example"
}
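
As a rough sketch, the response above could map onto Go types like the following. `FileRecord`, `SearchResponse`, and `BuildResponse` are hypothetical names, and the record shape is assumed from the four searchable fields listed earlier; the real file metadata type may differ. `BuildResponse` slices an already-ranked match list into one page, clamping `offset` and `limit` so out-of-range values yield an empty page rather than a panic.

```go
package main

import "encoding/json"

// FileRecord is an assumed record shape based on the searchable fields.
type FileRecord struct {
	FileName   string `json:"fileName"`
	IpfsCID    string `json:"ipfsCID"`
	FileType   string `json:"fileType"`
	FileSHA256 string `json:"fileSHA256"`
}

// SearchResponse mirrors the proposed response body.
type SearchResponse struct {
	Results []FileRecord `json:"results"`
	Total   int          `json:"total"`
	Limit   int          `json:"limit"`
	Offset  int          `json:"offset"`
	Query   string       `json:"query"`
}

// BuildResponse returns the page of matches selected by limit and offset.
// Total always reports the full match count, not the page size.
func BuildResponse(matches []FileRecord, query string, limit, offset int) SearchResponse {
	total := len(matches)
	if limit < 0 {
		limit = 0
	}
	start := offset
	if start < 0 {
		start = 0
	}
	if start > total {
		start = total
	}
	end := total
	if limit < total-start {
		end = start + limit
	}
	// Allocate a non-nil slice so an empty page encodes as [] rather than null.
	page := make([]FileRecord, end-start)
	copy(page, matches[start:end])
	return SearchResponse{Results: page, Total: total, Limit: limit, Offset: offset, Query: query}
}

// Encode serializes the response as JSON.
func (r SearchResponse) Encode() ([]byte, error) {
	return json.Marshal(r)
}
```

Reporting `total` separately from the page length lets the UI compute the number of pages without fetching every result.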

Implementation Options

  1. Go native implementation with string matching libraries
  2. SQLite FTS (Full-Text Search) if/when migrating to SQLite
  3. PostgreSQL full-text search if/when migrating to PostgreSQL
  4. Dedicated search engine (Elasticsearch, Meilisearch, etc.) for large-scale deployments
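
To give a feel for option 1, here is a deliberately simple Go sketch that ranks files by weighted, case-insensitive substring matching. It is not a port of Fuse.js: it does no fuzzy matching and ignores the threshold, so results would differ from the current client-side search. The weights are copied from the current Fuse.js configuration; the `Search` function, the `Scored` type, and the representation of a file as a `map[string]string` keyed by field name are assumptions for illustration.

```go
package main

import (
	"sort"
	"strings"
)

// fieldWeights reuses the weights from the current client-side configuration.
var fieldWeights = map[string]float64{
	"fileName":   0.4,
	"ipfsCID":    0.3,
	"fileType":   0.2,
	"fileSHA256": 0.1,
}

// Scored pairs a file with its relevance score (higher is better).
type Scored struct {
	File  map[string]string
	Score float64
}

// Search returns the files in which at least one of the given fields contains
// the query, ordered by descending score. A file's score is the sum of the
// weights of its matching fields. Unknown field names carry no weight and
// therefore never produce a match.
func Search(files []map[string]string, query string, fields []string) []Scored {
	var out []Scored
	q := strings.ToLower(strings.TrimSpace(query))
	if q == "" {
		return out
	}
	for _, f := range files {
		score := 0.0
		for _, name := range fields {
			if strings.Contains(strings.ToLower(f[name]), q) {
				score += fieldWeights[name]
			}
		}
		if score > 0 {
			out = append(out, Scored{File: f, Score: score})
		}
	}
	// A stable sort keeps the input order for files with equal scores.
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}
```

A linear scan like this is likely adequate for a few thousand in-memory records; options 2 to 4 trade that simplicity for indexing and proper full-text ranking.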

Migration Strategy

The client-side search should remain as a fallback:

  1. Try server-side search first
  2. If endpoint not available (older backend), fall back to client-side Fuse.js
  3. This ensures backward compatibility during rollout

Benefits

  • Scalability: Searching thousands of files no longer requires the client to download and filter the full file list
  • Reduced bandwidth: Only matching results sent to client
  • Pagination: Large result sets can be paginated
  • Advanced features: Can add filters, facets, aggregations, etc.
  • Consistent results: Same search algorithm for all clients

Related Issues

  • #TBD: PostgreSQL migration (will enable PostgreSQL full-text search)

Implementation Priority

Medium priority: the current client-side search works well for typical use cases. Server-side search becomes important once the dataset grows beyond roughly 1,000 files.


Migrated from bryanchriswhite/PinShare#2

Labels: enhancement (New feature or request)
