|
1 | 1 | --- |
2 | 2 | title: "Tutorials" |
| 3 | +subtitle: "Learn to explore 6.7 million physical samples from scientific collections worldwide using modern browser-based tools." |
| 4 | +number-sections: false |
3 | 5 | --- |
4 | 6 |
|
5 | | -Learn to explore **6.7 million physical samples** from scientific collections worldwide using modern browser-based tools. |
6 | | - |
7 | | -## Start Here |
| 7 | +## Start Here {.unnumbered} |
8 | 8 |
|
9 | 9 | | Tutorial | What You'll Learn | |
10 | 10 | |----------|-------------------| |
11 | | -| [**Interactive Explorer**](isamples_explorer.qmd) | Search and filter samples with faceted search, view on 3D globe | |
12 | | -| [**Deep-Dive Analysis**](zenodo_isamples_analysis.qmd) | Comprehensive DuckDB-WASM analysis with Observable JS | |
13 | | -| [**3D Globe Visualization**](parquet_cesium_isamples_wide.qmd) | Cesium-based visualization of all iSamples data | |
14 | | -| [**Technical: Narrow vs Wide**](narrow_vs_wide_performance.qmd) | Schema comparison and performance benchmarks | |
| 11 | +| [**Interactive Explorer**](isamples_explorer.qmd) | Search and filter samples with faceted search, view results on a 3D globe | |
| 12 | +| [**Deep-Dive Analysis**](zenodo_isamples_analysis.qmd) | Comprehensive DuckDB-WASM analysis with Observable JS — charts, maps, statistics | |
| 13 | +| [**3D Globe Visualization**](parquet_cesium_isamples_wide.qmd) | Cesium-based progressive visualization with H3 spatial clustering | |
| 14 | +| [**Technical: Narrow vs Wide**](narrow_vs_wide_performance.qmd) | Schema comparison and performance benchmarks for the PQG data formats | |
| 15 | + |
| 16 | +## What's in the Data? {.unnumbered} |
| 17 | + |
| 18 | +| Source | Samples | Focus | |
| 19 | +|--------|---------|-------| |
| 20 | +| **SESAR** | 4.6M | Earth science — rocks, minerals, sediments, soils | |
| 21 | +| **OpenContext** | 1M | Archaeology — artifacts, excavation materials | |
| 22 | +| **GEOME** | 605K | Biology — genomic and tissue specimens | |
| 23 | +| **Smithsonian** | 322K | Natural history — museum collections | |
15 | 24 |
|
16 | | -## Data Sources |
| 25 | +## Data Files {.unnumbered} |
17 | 26 |
|
18 | | -All tutorials use **geoparquet files** - no server required: |
| 27 | +All data is hosted on [`data.isamples.org`](https://data.isamples.org) with HTTP range request support, so DuckDB-WASM downloads only the bytes it needs.
19 | 28 |
|
20 | | -- **iSamples Full Dataset**: ~280 MB wide format, 6.7M samples from SESAR, OpenContext, GEOME, Smithsonian |
21 | | -- **Available via**: Cloudflare R2 with HTTP range requests |
| 29 | +| File | Size | Description | |
| 30 | +|------|------|-------------| |
| 31 | +| [Wide format](https://data.isamples.org/isamples_202601_wide.parquet) | 278 MB | One row per entity, all sources — primary file for tutorials | |
| 32 | +| [Wide + H3](https://data.isamples.org/isamples_202601_wide_h3.parquet) | 292 MB | Wide format with H3 spatial indices for globe visualizations | |
| 33 | +| [Facet summaries](https://data.isamples.org/isamples_202601_facet_summaries.parquet) | 2 KB | Pre-computed filter counts — loads instantly | |
| 34 | +| [H3 clusters (res4)](https://data.isamples.org/isamples_202601_h3_summary_res4.parquet) | 0.6 MB | Zoomed-out globe view | |
22 | 35 |
|
23 | | -## Why Browser-Based? |
| 36 | +## Why Browser-Based? {.unnumbered} |
24 | 37 |
|
25 | 38 | Our approach using **geoparquet + DuckDB-WASM** provides: |
26 | 39 |
|
27 | | -- ✅ **Universal access** - No installation, works in any browser |
28 | | -- ✅ **Fast analysis** - 5-10x faster than downloading full datasets |
29 | | -- ✅ **Memory efficient** - Analyze 300MB using <100MB browser memory |
30 | | -- ✅ **Minimal transfer** - Only download the columns/rows you need |
| 40 | +- **Universal access** — No installation, works in Chrome, Firefox, Edge, Safari, and Brave |
| 41 | +- **Fast analysis** — 5-10x faster than downloading full datasets |
| 42 | +- **Memory efficient** — Analyze 300MB datasets using <100MB browser memory |
| 43 | +- **Minimal transfer** — HTTP range requests download only the columns and rows you need (typically <1 MB to start) |
| 44 | +- **Reproducible** — All code is visible and foldable on tutorial pages |
| 45 | + |
| 46 | +## For Developers {.unnumbered} |
| 47 | + |
| 48 | +All tutorial source code is on [GitHub](https://github.com/isamplesorg/isamplesorg.github.io/tree/main/tutorials). Want to build your own analysis? Fork the repo, modify a `.qmd` file, and run `quarto preview`. |
| 49 | + |
| 50 | +- [GitHub repositories](https://github.com/isamplesorg/) — all source code and data pipelines |
| 51 | +- [Zenodo community](https://zenodo.org/communities/isamples) — archived datasets for reproducible research |
| 52 | +- [Query architecture](https://github.com/isamplesorg/isamplesorg.github.io/issues/82) — how the Explorer queries work under the hood |
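
The "Data Files" and "Minimal transfer" text in this diff relies on HTTP range requests against Parquet files. As a rough illustration of why that works, the sketch below uses the standard Parquet trailer layout (a 4-byte little-endian footer length followed by the `PAR1` magic marker) to compute which byte range holds the footer metadata. The function names `footer_byte_range` and `range_header` are hypothetical helpers written for this note. They are not part of DuckDB-WASM or of the tutorial code, and the real client may batch or size its requests differently.

```python
import struct

PARQUET_MAGIC = b"PAR1"  # marker that ends every Parquet file


def footer_byte_range(file_size: int, tail: bytes) -> tuple[int, int]:
    """Return inclusive (start, end) offsets of the footer metadata.

    `tail` is the final 8 bytes of the file: a 4-byte little-endian
    footer length followed by the 4-byte magic marker.
    """
    if len(tail) != 8 or tail[4:] != PARQUET_MAGIC:
        raise ValueError("not a Parquet trailer")
    (footer_len,) = struct.unpack("<I", tail[:4])
    end = file_size - 8 - 1       # last metadata byte, just before the trailer
    start = end - footer_len + 1  # first metadata byte
    if start < 4:                 # the file also begins with a 4-byte magic
        raise ValueError("footer length exceeds file size")
    return start, end


def range_header(start: int, end: int) -> dict[str, str]:
    """Build the HTTP header that requests only those bytes."""
    return {"Range": f"bytes={start}-{end}"}


# Hypothetical example: a 1000-byte file whose trailer reports a 100-byte footer.
example_tail = struct.pack("<I", 100) + PARQUET_MAGIC
start, end = footer_byte_range(1000, example_tail)
headers = range_header(start, end)
```

A client following this pattern would first fetch the trailer with a suffix range (`Range: bytes=-8`), then fetch the footer, and finally request only the column chunks the query touches. That is how a small initial transfer from a 278 MB file is plausible.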