Summary
Implement a comprehensive Google Drive import tool that allows users to authenticate with their Google account, select files/folders from their Drive, and import them into PinShare with full monitoring capabilities. The tool should support both manual one-time imports and optional continuous synchronization.
User Story
As a PinShare user, I want to import my files from Google Drive into the decentralized PinShare network, so that I can:
- Liberate my data from centralized cloud storage
- Share my files via P2P/IPFS without relying on Google's infrastructure
- Maintain a decentralized backup of my important files
- Optionally keep my PinShare instance in sync with my Drive
Current State
PinShare currently supports file uploads via:
- File system watcher monitoring the ./upload folder
- Manual file placement by users
Architecture Gaps for Google Drive Import:
- ❌ No user authentication system (OAuth or otherwise)
- ❌ No Google Drive API integration
- ❌ No background job queue for long-running operations
- ❌ No per-user file import tracking
- ❌ Limited upload status tracking (designed for quick local file processing)
- ❌ No continuous sync/monitoring capability
Existing Assets to Leverage:
- ✅ Robust upload pipeline (validation → hashing → security scanning → IPFS → metadata storage)
- ✅ Upload status tracking system (UploadStatusManager)
- ✅ Real-time UI updates via React Query
- ✅ Security scanning infrastructure (VirusTotal, ClamAV, P2P-Sec)
- ✅ P2P metadata distribution via PubSub
Proposed Solution
Architecture Components
1. User Authentication System
- OAuth 2.0 with PKCE for Google Drive API access
- Per-user token storage (encrypted)
- Token refresh mechanism
- Support for multiple users on the same PinShare instance
Tech Stack:
- google.golang.org/api/drive/v3 for Google Drive API
- golang.org/x/oauth2 for OAuth flow
- Secure token storage (encrypted database or keyring)
2. Google Drive Integration
- List user's Drive folders and files
- Stream file downloads directly to PinShare
- Folder hierarchy preservation (optional)
- Metadata mapping (Drive metadata → PinShare metadata)
Features:
- Folder tree browser UI
- Multi-select file/folder selection
- Filter by file type, size, date
- Preview file list before import
3. Background Job System
- Job queue for import operations
- Worker pool for concurrent processing
- Job persistence (survive restarts)
- Progress tracking per job
- Retry mechanism with exponential backoff
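As a rough illustration of the retry bullet above, the delay before each retry could double per attempt up to a cap. The name backoffDelay, the one-second base, and the 30-second cap are assumptions for this sketch, not values from the proposal. Production code would usually add random jitter as well.

```go
package main

import (
	"fmt"
	"time"
)

// backoffDelay returns base * 2^attempt, capped at max.
// attempt is zero-based: attempt 0 waits for base.
func backoffDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			// Stop early so large attempt counts cannot overflow.
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

func main() {
	for attempt := 0; attempt <= 6; attempt++ {
		fmt.Printf("attempt %d: wait %v\n", attempt, backoffDelay(attempt, time.Second, 30*time.Second))
	}
}
```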
Job States:
pending → downloading → hashing → scanning → uploading → completed
└→ failed (with retry)
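The state diagram above could be enforced with a small transition table. This sketch assumes that any in-flight stage may fail and that a retry moves a failed file back to pending. Both are interpretations of the diagram, and the type and function names are hypothetical.

```go
package main

import "fmt"

// FileState is one stage in the per-file import pipeline.
type FileState string

const (
	Pending     FileState = "pending"
	Downloading FileState = "downloading"
	Hashing     FileState = "hashing"
	Scanning    FileState = "scanning"
	Uploading   FileState = "uploading"
	Completed   FileState = "completed"
	Failed      FileState = "failed"
)

// transitions lists the states reachable from each state.
// Completed is terminal, so it has no entry.
var transitions = map[FileState][]FileState{
	Pending:     {Downloading},
	Downloading: {Hashing, Failed},
	Hashing:     {Scanning, Failed},
	Scanning:    {Uploading, Failed},
	Uploading:   {Completed, Failed},
	Failed:      {Pending}, // retry re-queues the file (assumption)
}

// canTransition reports whether moving from one state to another is allowed.
func canTransition(from, to FileState) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition(Pending, Downloading)) // allowed
	fmt.Println(canTransition(Completed, Pending))   // not allowed
}
```

Rejecting invalid transitions at this level would keep a late status update from a cancelled worker from overwriting a terminal state.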
4. Enhanced Monitoring Dashboard
Real-time metrics:
- Overall import progress (X of Y files, % complete)
- Per-file status with detailed stages
- Error tracking with specific failure reasons
- Bandwidth metrics (current speed, average speed, ETA)
- Success/failure statistics
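The progress percentage, average speed, and ETA can all be derived from three numbers the job already tracks: bytes transferred, total bytes, and elapsed time. The sketch below shows the arithmetic. The Progress type and estimate function are hypothetical, and the ETA uses the average rate since the job started, not a smoothed current rate.

```go
package main

import (
	"fmt"
	"time"
)

// Progress holds derived metrics for one import job.
type Progress struct {
	PercentComplete float64
	BytesPerSecond  float64
	ETA             time.Duration
}

// estimate derives percentage, average rate, and remaining time.
// Fields stay zero when they cannot be computed yet.
func estimate(bytesTransferred, totalBytes int64, elapsed time.Duration) Progress {
	var p Progress
	if totalBytes > 0 {
		p.PercentComplete = float64(bytesTransferred) / float64(totalBytes) * 100
	}
	if elapsed > 0 {
		p.BytesPerSecond = float64(bytesTransferred) / elapsed.Seconds()
	}
	if p.BytesPerSecond > 0 && totalBytes > bytesTransferred {
		remaining := float64(totalBytes - bytesTransferred)
		p.ETA = time.Duration(remaining / p.BytesPerSecond * float64(time.Second))
	}
	return p
}

func main() {
	p := estimate(450<<20, 1000<<20, 3*time.Minute)
	fmt.Printf("%.0f%% at %.1f MB/s, ETA %v\n",
		p.PercentComplete, p.BytesPerSecond/(1<<20), p.ETA.Round(time.Second))
}
```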
Historical tracking:
- Import job history
- Per-file import logs
- Retry attempts
- Total data imported
5. Continuous Sync Engine (Phase 3)
- Watch Google Drive for changes (polling or webhooks)
- Auto-import new/modified files
- Configurable sync interval
- Conflict resolution strategy
- Sync pause/resume capability
Technical Implementation
Backend API Endpoints
Authentication
POST /api/google-drive/authorize
→ Initiates OAuth flow, returns authorization URL
GET /api/google-drive/callback?code={authCode}
→ Exchanges auth code for tokens, stores encrypted
GET /api/google-drive/auth-status
→ Returns whether user is authenticated
DELETE /api/google-drive/revoke
→ Revokes access and deletes tokens
File Selection
GET /api/google-drive/folders?path={folderId}
→ Lists files/folders in specified folder (defaults to root)
Response: { id, name, mimeType, size, modifiedTime, parents[] }
POST /api/google-drive/preview-import
Request: { fileIds: [], folderIds: [], recursive: bool }
Response: { files: [], totalSize, totalCount }
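A preview response of this shape could be built by collapsing the expanded selection into a unique file list and totalling it. The sketch below is one possible approach. DriveFile, Preview, buildPreview, and the maxSize parameter are hypothetical names, and deduplicating by ID assumes the same file can be reached through more than one selected folder.

```go
package main

import "fmt"

// DriveFile is the subset of Drive metadata the preview needs.
type DriveFile struct {
	ID   string `json:"id"`
	Name string `json:"name"`
	Size int64  `json:"size"`
}

// Preview mirrors the preview-import response fields.
type Preview struct {
	Files      []DriveFile `json:"files"`
	TotalSize  int64       `json:"totalSize"`
	TotalCount int         `json:"totalCount"`
}

// buildPreview drops repeated IDs and files larger than maxSize, then
// totals what is left. maxSize <= 0 disables the size filter.
func buildPreview(candidates []DriveFile, maxSize int64) Preview {
	seen := make(map[string]bool, len(candidates))
	p := Preview{Files: []DriveFile{}}
	for _, f := range candidates {
		if seen[f.ID] {
			continue
		}
		seen[f.ID] = true
		if maxSize > 0 && f.Size > maxSize {
			continue
		}
		p.Files = append(p.Files, f)
		p.TotalSize += f.Size
	}
	p.TotalCount = len(p.Files)
	return p
}

func main() {
	p := buildPreview([]DriveFile{
		{ID: "a", Name: "report.pdf", Size: 10},
		{ID: "b", Name: "notes.txt", Size: 20},
		{ID: "a", Name: "report.pdf", Size: 10},
	}, 0)
	fmt.Println(p.TotalCount, p.TotalSize)
}
```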
Import Operations
POST /api/google-drive/import
Request: {
fileIds: [],
folderIds: [],
recursive: bool,
options: { preserveHierarchy, skipDuplicates }
}
Response: { jobId, status, filesQueued }
GET /api/google-drive/import/{jobId}/status
Response: {
jobId,
status: "pending|running|completed|failed|cancelled",
progress: {
totalFiles,
completedFiles,
failedFiles,
currentFile,
percentComplete,
bytesTransferred,
totalBytes,
transferRate,
estimatedTimeRemaining
},
files: [
{
driveId,
fileName,
status: "pending|downloading|hashing|scanning|uploading|completed|failed",
progress: 0-100,
error: ""
}
]
}
POST /api/google-drive/import/{jobId}/cancel
→ Cancels running import job
POST /api/google-drive/import/{jobId}/retry-failed
→ Retries all failed files in the job
GET /api/google-drive/import/history
→ Returns list of past import jobs with summary stats
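The status response above maps naturally onto Go structs with JSON tags. The field names below are taken from the response shape. The Go types and units (bytes per second for transferRate, seconds for estimatedTimeRemaining) are assumptions, since the proposal does not specify them, and encodeStatus and decodeStatus are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FileStatus is one entry in the "files" array.
type FileStatus struct {
	DriveID  string `json:"driveId"`
	FileName string `json:"fileName"`
	Status   string `json:"status"`
	Progress int    `json:"progress"` // 0-100
	Error    string `json:"error"`
}

// JobProgress is the "progress" object.
type JobProgress struct {
	TotalFiles             int     `json:"totalFiles"`
	CompletedFiles         int     `json:"completedFiles"`
	FailedFiles            int     `json:"failedFiles"`
	CurrentFile            string  `json:"currentFile"`
	PercentComplete        float64 `json:"percentComplete"`
	BytesTransferred       int64   `json:"bytesTransferred"`
	TotalBytes             int64   `json:"totalBytes"`
	TransferRate           float64 `json:"transferRate"`           // assumed: bytes per second
	EstimatedTimeRemaining int64   `json:"estimatedTimeRemaining"` // assumed: seconds
}

// JobStatus is the full status response body.
type JobStatus struct {
	JobID    string       `json:"jobId"`
	Status   string       `json:"status"`
	Progress JobProgress  `json:"progress"`
	Files    []FileStatus `json:"files"`
}

// encodeStatus renders a job status as a JSON response body.
func encodeStatus(s JobStatus) (string, error) {
	b, err := json.Marshal(s)
	return string(b), err
}

// decodeStatus parses a status response body, as a client would.
func decodeStatus(body string) (JobStatus, error) {
	var s JobStatus
	err := json.Unmarshal([]byte(body), &s)
	return s, err
}

func main() {
	body, err := encodeStatus(JobStatus{
		JobID:    "job-123",
		Status:   "running",
		Progress: JobProgress{TotalFiles: 100, CompletedFiles: 45, PercentComplete: 45},
		Files:    []FileStatus{{DriveID: "drive-1", FileName: "report.pdf", Status: "completed", Progress: 100}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```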
Continuous Sync (Phase 3)
POST /api/google-drive/sync/configure
Request: { enabled, folderId, interval, options }
GET /api/google-drive/sync/status
Response: { enabled, lastSync, nextSync, syncedFiles, errors }
POST /api/google-drive/sync/trigger
→ Manually triggers sync cycle
Database Schema
Users Table
CREATE TABLE users (
id SERIAL PRIMARY KEY,
google_id VARCHAR(255) UNIQUE NOT NULL,
email VARCHAR(255) NOT NULL,
encrypted_access_token TEXT,
encrypted_refresh_token TEXT,
token_expiry TIMESTAMP,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
Import Jobs Table
CREATE TABLE import_jobs (
id UUID PRIMARY KEY,
user_id INTEGER REFERENCES users(id),
status VARCHAR(50), -- pending, running, completed, failed, cancelled
total_files INTEGER,
completed_files INTEGER,
failed_files INTEGER,
total_bytes BIGINT,
transferred_bytes BIGINT,
started_at TIMESTAMP,
completed_at TIMESTAMP,
options JSONB,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
Import Files Table
CREATE TABLE import_files (
id SERIAL PRIMARY KEY,
job_id UUID REFERENCES import_jobs(id),
drive_file_id VARCHAR(255),
file_name VARCHAR(500),
file_size BIGINT,
status VARCHAR(50), -- pending, downloading, hashing, scanning, uploading, completed, failed
progress INTEGER, -- 0-100
sha256_hash VARCHAR(64),
ipfs_cid VARCHAR(255),
error_message TEXT,
retry_count INTEGER DEFAULT 0,
started_at TIMESTAMP,
completed_at TIMESTAMP,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
Sync Configurations Table (Phase 3)
CREATE TABLE sync_configs (
id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(id),
drive_folder_id VARCHAR(255),
enabled BOOLEAN DEFAULT TRUE,
sync_interval INTEGER, -- minutes
last_sync_at TIMESTAMP,
next_sync_at TIMESTAMP,
options JSONB,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
Internal Architecture
New Go Packages
internal/gdrive/
client.go - Google Drive API client wrapper
oauth.go - OAuth flow management
downloader.go - File download from Drive
mapper.go - Drive metadata → PinShare metadata conversion
internal/jobs/
queue.go - Job queue interface
worker.go - Worker pool implementation
import_job.go - Import job definition and state machine
persistence.go - Job state persistence
internal/users/
auth.go - User authentication
store.go - User data storage
tokens.go - Encrypted token management
internal/sync/ (Phase 3)
engine.go - Continuous sync orchestration
watcher.go - Drive change detection
scheduler.go - Sync scheduling
Integration with Existing Systems
Upload Pipeline Integration:
// In internal/jobs/import_job.go
func (j *ImportJob) processFile(driveFile *drive.File) error {
	// 1. Download from Google Drive
	j.updateFileStatus(driveFile.Id, "downloading", 0)
	localPath, err := j.driveClient.Download(driveFile)
	if err != nil {
		return j.failFile(driveFile.Id, "Download failed: "+err.Error())
	}
	// 2. Plug into existing upload pipeline
	j.updateFileStatus(driveFile.Id, "hashing", 30)
	sha256 := psfs.ComputeSHA256(localPath)
	j.updateFileStatus(driveFile.Id, "scanning", 50)
	secResult := psfs.SecurityCheck(localPath, sha256)
	if !secResult.Safe {
		return j.failFile(driveFile.Id, "Security scan failed")
	}
	j.updateFileStatus(driveFile.Id, "uploading", 70)
	cid, err := psfs.AddFileIPFS(localPath)
	if err != nil {
		return j.failFile(driveFile.Id, "IPFS upload failed: "+err.Error())
	}
	// 3. Store metadata (reported under the "uploading" stage, since
	// "storing" is not one of the defined file statuses)
	j.updateFileStatus(driveFile.Id, "uploading", 90)
	metadata := store.BaseMetadata{
		FileSHA256: sha256,
		IPFSCID:    cid,
		FileName:   driveFile.Name,
		// ... map other Drive metadata
	}
	store.GlobalStore.AddFile(metadata)
	j.updateFileStatus(driveFile.Id, "completed", 100)
	return nil
}
UI Requirements
1. Google Drive Authorization Page
Location: /import/google-drive/authorize
Components:
- "Connect to Google Drive" button
- OAuth consent explanation
- Permissions required list
- Privacy policy link
2. Folder/File Selection Interface
Location: /import/google-drive/select
Features:
- Tree view of Drive folders (collapsible)
- File list view with checkboxes
- Multi-select capability
- File type icons
- Size/date metadata display
- Search/filter bar
- "Select All" / "Deselect All" buttons
- Preview import summary (X files, Y GB)
- Import options: preserve folder hierarchy, skip duplicates
- "Start Import" button
3. Import Status Dashboard
Location: /import/google-drive/status/{jobId}
Real-time Metrics:
╔════════════════════════════════════════════════════════╗
║ Import Progress [Cancel] ║
╠════════════════════════════════════════════════════════╣
║ ████████████████░░░░░░░░░░ 45% (45/100 files) ║
║ ⬇ Downloading: document.pdf (2.5 MB/s) ║
║ ⏱ ETA: 5 minutes ║
║ 📊 Status: 40 completed, 5 failed, 55 pending ║
╚════════════════════════════════════════════════════════╝
Files:
┌─────────────────────────────────────────────────────┐
│ ✅ report.pdf │ Completed │ 2.3 MB │ 12:30 │
│ ⏳ presentation.pptx │ Scanning │ ████░░ │ │
│ ❌ large-video.mp4 │ Failed │ Error: Too large│
│ ⏸ document.docx │ Pending │ 45 KB │ │
└─────────────────────────────────────────────────────┘
[Retry Failed Files] [View Details]
Detailed Per-File View:
- File name with Drive icon
- Progress bar for current file
- Current stage (downloading/hashing/scanning/uploading)
- Transfer speed
- Success/error indicator
- Retry button for failed files
4. Import History Page
Location: /import/google-drive/history
Display:
- List of past import jobs
- Job ID, start time, duration
- Success/failure counts
- Total data imported
- "View Details" link to status page
5. Sync Configuration Page (Phase 3)
Location: /import/google-drive/sync
Settings:
- Enable/disable continuous sync
- Select Drive folder to sync
- Sync interval (hourly, daily, etc.)
- Conflict resolution strategy
- Last sync timestamp
- Manual "Sync Now" button
Monitoring & Observability
Metrics to Track
Job-Level Metrics:
gdrive_import_jobs_total{status="completed|failed|cancelled"}
gdrive_import_duration_seconds
gdrive_import_files_total{status="completed|failed"}
gdrive_import_bytes_total
File-Level Metrics:
gdrive_file_download_duration_seconds
gdrive_file_size_bytes{stage="downloaded|uploaded"}
gdrive_transfer_rate_bytes_per_second
API Metrics:
gdrive_api_requests_total{endpoint,status}
gdrive_api_errors_total{error_type}
gdrive_api_rate_limit_hits_total
Sync Metrics (Phase 3):
gdrive_sync_cycles_total{status}
gdrive_sync_new_files_detected
gdrive_sync_lag_seconds (time since last successful sync)
Logging Strategy
- Structured JSON logs
- Log levels: DEBUG, INFO, WARN, ERROR
- Include job ID, user ID, file ID in all log entries
- Detailed error logging with stack traces
Security Considerations
OAuth Security
- PKCE Flow - Use Proof Key for Code Exchange for additional security
- Token Encryption - Encrypt tokens at rest using AES-256
- Secure Storage - Store encrypted tokens in database or OS keyring
- Token Rotation - Implement automatic refresh token rotation
- Scope Minimization - Request only the drive.readonly scope
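Token encryption at rest with AES-256 could use the standard library's AES-GCM, which also authenticates the ciphertext. This is a minimal sketch under stated assumptions: encryptToken, decryptToken, and newGCM are hypothetical names, and the stored format (nonce followed by ciphertext) is a choice made here. Key management is out of scope; the demo generates a random key, whereas a real deployment would load one from a secret store or OS keyring.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM cipher and rejects keys that are not 32 bytes.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encryptToken returns nonce || ciphertext || auth tag.
func encryptToken(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Seal appends the encrypted data to its first argument (the nonce).
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decryptToken reverses encryptToken and fails if the data was modified.
func decryptToken(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("sealed token too short")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	sealed, err := encryptToken(key, []byte("example-refresh-token"))
	if err != nil {
		panic(err)
	}
	plain, err := decryptToken(key, sealed)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d sealed bytes, decrypted: %s\n", len(sealed), plain)
}
```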
API Security
- Rate Limiting - Respect Google Drive API quotas (per-user limits)
- Authentication Required - All endpoints require valid user session
- Input Validation - Validate all Drive file IDs, folder paths
- CORS - Restrict to localhost and configured domains
File Security
- Leverage Existing Scanning - All imported files go through security checks
- Size Limits - Enforce max file size limits
- Type Validation - Respect PinShare's allowed file types
- Malware Scanning - VirusTotal/ClamAV on all imports
Privacy
- User Data Isolation - Each user only sees their own imports
- Token Revocation - Support complete data deletion
- Audit Logging - Log all import operations
Testing Strategy
Unit Tests
- Google Drive client mocking
- OAuth flow state machine
- Job queue operations
- Metadata mapping accuracy
Integration Tests
- End-to-end import flow with test files
- OAuth callback handling
- Database persistence
- Worker pool concurrency
Load Tests
- Import 1,000 files concurrently
- Test with 10+ concurrent import jobs
- Measure memory usage and performance
- Test rate limit handling
Security Tests
- OAuth PKCE flow validation
- Token encryption/decryption
- Unauthorized access attempts
- Input validation edge cases
Implementation Phases
Phase 1: OAuth + Basic Import (MVP)
Goal: Import files manually from Google Drive
Deliverables:
Estimated Effort: 2-3 weeks
Phase 2: Enhanced Monitoring
Goal: Comprehensive import monitoring
Deliverables:
Estimated Effort: 1-2 weeks
Phase 3: Continuous Sync
Goal: Auto-sync Drive changes
Deliverables:
Estimated Effort: 2-3 weeks
Phase 4: Performance & Polish
Goal: Production-ready reliability
Deliverables:
Estimated Effort: 1-2 weeks
Dependencies
External Services
- Google Cloud Project - OAuth credentials, API enablement
- Google Drive API v3 - File access
- Database - PostgreSQL or SQLite for job/user persistence
Go Packages
require (
google.golang.org/api v0.XXX
golang.org/x/oauth2 v0.XXX
github.com/lib/pq v1.XXX // PostgreSQL driver
github.com/google/uuid v1.XXX // Job IDs
)
Configuration
# config.yaml additions
google_drive:
oauth:
client_id: "${GOOGLE_OAUTH_CLIENT_ID}"
client_secret: "${GOOGLE_OAUTH_CLIENT_SECRET}"
redirect_url: "http://localhost:9090/api/google-drive/callback"
scopes:
- "https://www.googleapis.com/auth/drive.readonly"
import:
max_concurrent_downloads: 5
max_file_size_mb: 1024
temp_download_dir: "./tmp/gdrive"
rate_limiting:
requests_per_second: 10
burst: 20
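The max_concurrent_downloads setting could be enforced with a buffered channel used as a semaphore. The sketch below is one way to do that with the standard library only. runBounded, errDemo, and the use of plain strings as work items are hypothetical, and the request rate limit from the same config is not covered here.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errDemo is a placeholder failure used by the demo in main.
var errDemo = errors.New("demo failure")

// runBounded calls process for every item, running at most limit calls at
// the same time, and returns how many calls failed.
func runBounded(items []string, limit int, process func(string) error) int {
	if limit < 1 {
		limit = 1
	}
	sem := make(chan struct{}, limit)
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		failed int
	)
	for _, item := range items {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit workers are busy
		go func(it string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot
			if err := process(it); err != nil {
				mu.Lock()
				failed++
				mu.Unlock()
			}
		}(item)
	}
	wg.Wait()
	return failed
}

func main() {
	files := []string{"report.pdf", "broken.bin", "notes.txt"}
	failed := runBounded(files, 5, func(name string) error {
		if name == "broken.bin" {
			return errDemo
		}
		return nil
	})
	fmt.Printf("%d of %d files failed\n", failed, len(files))
}
```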
Success Metrics
User Adoption
- Number of users connecting Google Drive
- Total files imported
- Active sync configurations
Performance
- Average import speed (files/minute, MB/s)
- P95 latency for import operations
- Error rate < 1%
Reliability
- Job success rate > 99%
- Retry success rate
- Sync lag < 5 minutes (Phase 3)
Future Enhancements
Beyond Initial Implementation
- Dropbox Integration - Apply same pattern to Dropbox
- OneDrive Support - Microsoft OneDrive import
- S3 Import - AWS S3 bucket import
- Selective Export - PinShare → Google Drive
- Smart Deduplication - Cross-user file deduplication
- Bandwidth Scheduling - Import during off-peak hours
- Multi-folder Sync - Sync multiple Drive folders simultaneously
Related Issues
References
Priority: High
Complexity: High
Impact: High - Unlocks PinShare for users with existing cloud storage