Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions book/_toc.yml
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ parts:
- file: content/forcing/cmip_historic.ipynb
- file: content/forcing/cmip_future.ipynb
- file: content/forcing/manual_forcing.ipynb
- file: content/forcing/discharge.ipynb

- file: content/different_models.md
sections:
Expand Down
167 changes: 167 additions & 0 deletions book/content/forcing/discharge.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,167 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a1b2c3d4e5f6a7b8",
"metadata": {},
"source": [
"# GRDC Discharge Observations\n",
"\n",
"The [Global Runoff Data Centre (GRDC)](https://portal.grdc.bafg.de/) is the primary source for historical daily and monthly river discharge data worldwide, covering thousands of gauging stations.\n",
"\n",
"Unlike the Caravan or ERA5 data, GRDC data requires a manual download step — you register on the GRDC portal, select the stations you need, and download the data files to your own machine.\n",
"Once you have the files, eWaterCycle makes loading them straightforward.\n",
"\n",
"What we need:\n",
"1. A GRDC station ID (found via the [GRDC station catalogue](https://portal.grdc.bafg.de/))\n",
"2. A time window"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2c3d4e5f6a7b8c9",
"metadata": {},
"outputs": [],
"source": [
"# General python\n",
"import warnings\n",
"warnings.filterwarnings(\"ignore\", category=UserWarning)\n",
"\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"from pathlib import Path\n",
"\n",
"# Niceties\n",
"from rich import print\n",
"\n",
"# eWaterCycle observation module\n",
"from ewatercycle.observation.grdc import get_grdc_data"
]
},
{
"cell_type": "markdown",
"id": "c3d4e5f6a7b8c9d0",
"metadata": {},
"source": [
"## Downloading GRDC data\n",
"\n",
"**Normally** one would download the GRDC data.\n",
"We have some GRDC stations downloaded ourselves, if it is not in our database, please ask the admin to add your station.\n",
"Or you could do it yourself:\n",
"\n",
"1. Register (free) at [portal.grdc.bafg.de](https://portal.grdc.bafg.de/)\n",
"2. Search for your station by name, river, or country\n",
"3. Select the station and download the **daily** data as a `.txt` file\n",
"4. Place the downloaded file(s) in a directory on your machine — that path goes into `grdc_data_home` below\n",
"\n",
"The GRDC station ID is a 7-digit number visible in the portal and in the filename of the downloaded file (e.g. `GRDC_6335020_Q_Day.txt`)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4e5f6a7b8c9d0e1",
"metadata": {},
"outputs": [],
"source": [
"# Rhine at Lobith — one of the most monitored river cross-sections in Europe\n",
"station_id = \"6335020\"\n",
"\n",
"experiment_start_date = \"2000-01-01T00:00:00Z\"\n",
"experiment_end_date = \"2005-12-31T00:00:00Z\""
]
},
{
"cell_type": "markdown",
"id": "e5f6a7b8c9d0e1f2",
"metadata": {},
"source": [
"## Loading the observations\n",
"\n",
"`get_grdc_data` reads the downloaded GRDC file and returns an xarray dataset of daily discharge values also containing station information (name, river, coordinates, drainage area, etc.)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f6a7b8c9d0e1f2a3",
"metadata": {},
"outputs": [],
"source": [
"observations_ds = get_grdc_data(\n",
" station_id=station_id,\n",
" start_time=experiment_start_date,\n",
" end_time=experiment_end_date,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "c9d0e1f2a3b4c5d6",
"metadata": {},
"source": "The station metadata (name, river, country, coordinates) is embedded as scalar variables in the dataset.\nThe discharge timeseries is in `streamflow`, with `-999` used as a missing value — we mask those before plotting."
},
{
"cell_type": "code",
"execution_count": null,
"id": "d0e1f2a3b4c5d6e7",
"metadata": {},
"outputs": [],
"source": "print(observations_ds)"
},
{
"cell_type": "markdown",
"id": "e1f2a3b4c5d6e7f8",
"metadata": {},
"source": [
"## Plotting the discharge\n",
"\n",
"The observations are a pandas Series with a DatetimeIndex, so they can be plotted directly."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f2a3b4c5d6e7f8a9",
"metadata": {},
"outputs": [],
"source": [
"station_name = str(observations_ds[\"station_name\"].values)\n",
"river_name = str(observations_ds[\"river_name\"].values)\n",
"\n",
"# Mask the -999 missing value sentinel before plotting\n",
"streamflow = observations_ds[\"streamflow\"].where(observations_ds\n",
" [\"streamflow\"] != -999)\n",
"\n",
"plt.figure(figsize=(12, 4))\n",
"streamflow.plot(color=\"steelblue\")\n",
"plt.title(f\"GRDC station {station_id} — {station_name} ({river_name})\")\n",
"plt.xlabel(\"Date\")\n",
"plt.ylabel(\"Discharge (m³/s)\")\n",
"plt.tight_layout()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbformat_minor": 5,
"pygments_lexer": "ipython3",
"version": "3.12.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
6 changes: 6 additions & 0 deletions book/content/generate_forcing.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ Every model needs forcing data, there are several possible ways to get this forc
- CMIP6 historical data
- CMIP6 future data
- Manual data input
- GRDC discharge observations

eWaterCycle supports different types of forcings, currently it supports:

Expand Down Expand Up @@ -46,6 +47,11 @@ The forcing object in eWaterCycle has some properties:
- filenames, a dictionary containing the paths to the netCDF files where the data is stored for that variable
- dataset, which I will cover in more detail below.

## Observations

It is also possible to use GRDC discharge observations to use as ground truth in your research.
We support various GRDC stations already, if your data is not on the server but on the GRDC data storage, please ask an admin to help you.

### Technical Details

This is for advanced users that will need to use different datasets.
Expand Down