A two-service web app:
- Frontend: Next.js (TypeScript) dashboard to submit URLs and view scraped records.
- Backend: ASP.NET Core minimal API that scrapes HTML (title/description/headings) and writes to Supabase Postgres.
- User submits a URL from the Next.js UI.
- Frontend sends `POST /api/scrape` to the C# backend.
- Backend downloads the HTML, parses the relevant content, and stores the result in Supabase (`scrape_results`).
- Frontend lists the latest rows from `GET /api/scrape/latest`.
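The submit step above can be sketched as a small frontend helper. This is a sketch under assumptions: the endpoint is assumed to accept a JSON body of the form `{ url }`, which is not specified by the API description here.

```typescript
// Minimal request descriptor; avoids depending on DOM lib types.
interface ScrapeRequest {
  endpoint: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the request for submitting a URL to the backend.
// The { url } JSON body is an assumption about the API's contract.
function buildScrapeRequest(apiBase: string, url: string): ScrapeRequest {
  return {
    endpoint: `${apiBase}/api/scrape`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url }),
  };
}
```

On the page, the result can be handed straight to `fetch`: `const req = buildScrapeRequest(apiBase, url); await fetch(req.endpoint, req);`.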
- Node.js 20+
- .NET SDK 8+
- A Supabase project with the Postgres connection string
Run the following SQL in the Supabase SQL editor:

```sql
-- backend/ScrapeApi/schema.sql
create table if not exists scrape_results (
  id uuid primary key,
  url text not null,
  title text not null,
  description text not null,
  headings text[] not null default '{}',
  scraped_at timestamptz not null default timezone('utc', now())
);

create index if not exists idx_scrape_results_scraped_at
  on scrape_results (scraped_at desc);
```

Set environment variables:

```sh
export Supabase__ConnectionString='Host=...;Port=5432;Database=postgres;Username=postgres;Password=...;SSL Mode=Require;Trust Server Certificate=true'
export Frontend__Origin='http://localhost:3000'
```

Run:

```sh
cd backend/ScrapeApi
dotnet restore
dotnet run
```

The backend starts on http://localhost:5168 (or another dev port; check the terminal output).
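For reference, rows created by the schema above could be modeled on the frontend like this. The camelCase property names are an assumed JSON mapping of the SQL columns, not something the backend is confirmed to emit:

```typescript
// Shape of one scrape_results row, mirroring the SQL schema.
// Property names are an assumed camelCase mapping of the columns.
interface ScrapeResult {
  id: string;          // uuid
  url: string;
  title: string;
  description: string;
  headings: string[];  // text[] not null default '{}'
  scrapedAt: string;   // timestamptz, ISO-8601 string over JSON
}

// Minimal runtime check for a row arriving over JSON.
function isScrapeResult(value: unknown): value is ScrapeResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === "string" &&
    typeof v.url === "string" &&
    typeof v.title === "string" &&
    typeof v.description === "string" &&
    Array.isArray(v.headings) &&
    typeof v.scrapedAt === "string"
  );
}
```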
Set environment variable:

```sh
export NEXT_PUBLIC_API_URL='http://localhost:5168'
```

Run:

```sh
cd frontend
npm install
npm run dev
```

Open http://localhost:3000.
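Once the frontend has fetched rows from `GET /api/scrape/latest`, it can keep them in the same newest-first order the backend's `scraped_at desc` index implies. A sketch, assuming each row carries an ISO-8601 timestamp under an assumed `scrapedAt` field name:

```typescript
// Sort rows newest-first by scrape timestamp, mirroring the
// scraped_at desc index order used on the backend.
// `scrapedAt` as an ISO-8601 string is an assumed field name/format.
function newestFirst<T extends { scrapedAt: string }>(rows: T[]): T[] {
  return [...rows].sort(
    (a, b) => Date.parse(b.scrapedAt) - Date.parse(a.scrapedAt)
  );
}
```

Copying with `[...rows]` keeps the helper side-effect free, which matters if the array comes from React state.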
```sh
export SUPABASE_CONNECTION_STRING='Host=...'
docker compose up --build
```

- Respect target websites' terms of service and robots.txt policies before scraping.
- This app is intentionally simple and does not include job queueing, authentication, or anti-bot handling.