River network processing
The river network processing module provides a toolkit for building and analyzing river network topology, including Strahler ordering, reach joining, and routing parameter calculation.
Overview
River network processing is a required preprocessing step that:
- Builds network topology from a shapefile geometry
- Enforces binary tree structure (maximum 2 upstream tributaries per reach)
- Computes Strahler stream ordering
- Optionally joins single-tributary reaches to simplify the network
- Calculates hydraulic and routing parameters (width, lag time, storage coefficient)
- Determines optimal calculation order for routing algorithms
- Exports processed networks for use in simulations
Functions
River network processing
Process river network shapefile to create a complete network structure.
This function reads a river network shapefile, builds the network topology, enforces binary tree structure (max 2 upstream tributaries per reach), computes Strahler stream ordering, optionally joins single-tributary reaches, and calculates routing parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
shapefile_path
|
str | Path
|
Path to river network shapefile |
required |
join_single_tributaries
|
bool
|
If True, joins reaches with single tributaries (default: True) |
True
|
routing_params
|
dict | None
|
Dictionary with routing parameters: - wcel: Wave celerity [m/s] (default: 5.0) - Br0: Width of 1st order streams [m] (default: 1.0) - NBr: Width-order exponent (default: 1.5) - n_Man: Manning coefficient [s/m^(1/3)] (default: 0.03) |
None
|
Returns:
| Type | Description |
|---|---|
GeoDataFrame
|
GeoDataFrame with processed network including topology and routing parameters |
Raises:
| Type | Description |
|---|---|
FileNotFoundError
|
If shapefile does not exist |
ValueError
|
If network is invalid |
Examples:
>>> network = process_river_network(
... "river_network.shp",
... routing_params={"wcel": 3.0, "Br0": 1.0, "NBr": 1.5, "n_Man": 0.03}
... )
Source code in mobidic/preprocessing/river_network.py
River network I/O
Save processed river network to file.
Saves the river network to either GeoParquet (default, recommended) or Shapefile format. GeoParquet is more efficient, has no field name limitations, and preserves data types better.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
network
|
GeoDataFrame
|
Processed river network GeoDataFrame |
required |
output_path
|
str | Path
|
Path to output file |
required |
format
|
str
|
Output format, either ‘parquet’ (default) or ‘shapefile’ |
'parquet'
|
Raises:
| Type | Description |
|---|---|
ValueError
|
If format is not supported |
Examples:
>>> from mobidic import process_river_network
>>> network = process_river_network("river_network.shp")
>>> from mobidic.preprocessing.io import save_network
>>> save_network(network, "river_network.parquet") # Default: parquet
>>> save_network(network, "river_network.shp", format="shapefile") # Shapefile
Source code in mobidic/preprocessing/io.py
Load processed river network from GeoParquet file.
Loads river network data from GeoParquet format only (the recommended format). For loading shapefiles, use geopandas.read_file() directly.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
network_path
|
str | Path
|
Path to network GeoParquet file (.parquet) |
required |
Returns:
| Type | Description |
|---|---|
GeoDataFrame
|
GeoDataFrame with river network |
Raises:
| Type | Description |
|---|---|
FileNotFoundError
|
If network file does not exist |
ValueError
|
If file is not a .parquet file |
Examples:
>>> from mobidic.preprocessing.io import load_network
>>> network = load_network("river_network.parquet")
Source code in mobidic/preprocessing/io.py
Examples
Complete river network processing
from mobidic import process_river_network, save_network
# Process river network from shapefile
network = process_river_network(
shapefile_path="data/river_network.shp",
join_single_tributaries=True,
routing_params={
"wcel": 2.5, # Celerity: 2.5 m/s
"Br0": 5.0, # Base width: 5 m
"NBr": 0.5, # Width exponent
"n_Man": 0.03 # Manning coefficient
}
)
# Inspect the processed network
print(f"Total reaches: {len(network)}")
print(f"Max Strahler order: {network['strahler_order'].max()}")
print(f"Total length: {network['length_m'].sum() / 1000:.1f} km")
# Export to GeoParquet (recommended)
save_network(network, "output/network.parquet", format="parquet")
# Or export to Shapefile (for backward compatibility)
save_network(network, "output/network.shp", format="shapefile")
Loading a processed Network
from mobidic import load_network
# Load previously processed network
network = load_network("output/network.parquet")
# Access network properties
terminal_reaches = network[network['downstream'].isna()]
print(f"Number of outlets: {len(terminal_reaches)}")
headwater_reaches = network[network['strahler_order'] == 1]
print(f"Number of headwater streams: {len(headwater_reaches)}")
Analyzing river network topology
from mobidic import process_river_network
import pandas as pd
network = process_river_network("data/river_network.shp")
# Get summary statistics by Strahler order
summary = network.groupby('strahler_order').agg({
'length_m': ['count', 'sum', 'mean'],
'width_m': 'mean',
'lag_time_s': 'mean'
})
print(summary)
# Find reaches with multiple upstream tributaries
junctions = network[network['upstream_2'].notna()]
print(f"Number of junctions: {len(junctions)}")
# Trace upstream from a specific reach
def get_upstream_network(network, mobidic_id):
"""Recursively find all upstream reaches."""
upstream = []
reach = network[network['mobidic_id'] == mobidic_id].iloc[0]
for us_field in ['upstream_1', 'upstream_2']:
us_id = reach[us_field]
if pd.notna(us_id):
upstream.append(us_id)
upstream.extend(get_upstream_network(network, us_id))
return upstream
# Get all reaches upstream of reach 100
upstream_reaches = get_upstream_network(network, 100)
print(f"Reaches upstream of reach 100: {len(upstream_reaches)}")
River network schema
Processed networks contain the following fields:
| Field | Type | Description |
|---|---|---|
mobidic_id |
int | Internal reach identifier (0-indexed for topology) |
geometry |
LineString | Reach geometry |
upstream_1 |
int/NaN | First upstream reach ID (references mobidic_id) |
upstream_2 |
int/NaN | Second upstream reach ID if junction (references mobidic_id) |
downstream |
int/NaN | Downstream reach ID (references mobidic_id, NaN for outlets) |
strahler_order |
int | Strahler stream order (1 = headwater) |
calc_order |
int | Calculation order for routing (lower first) |
length_m |
float | Reach length in meters |
width_m |
float | Channel width in meters |
lag_time_s |
float | Lag time in seconds (used as K in routing) |
n_manning |
float | Manning roughness coefficient |
Note: All original shapefile fields are preserved in the output. The mobidic_id field is added for internal topology management and is separate from any original ID fields in the shapefile.
Strahler ordering
The Strahler stream order is a hierarchical classification system:
- First-order streams (order 1): Headwater streams with no upstream tributaries
- Higher-order streams: When two streams of order n join, the result is order n+1
- Unequal joining: When streams of different orders join, the downstream reach inherits the higher order
Example network:
Routing parameters
Calculated parameters for channel routing:
- Channel width:
B = Br0 × order^NBr - Uses power-law relationship with Strahler order
Br0: Base width for first-order streams-
NBr: Width scaling exponent -
Lag time:
τ = L / wcel - Time for a flow disturbance to travel the reach length
L: Reach length (m)-
wcel: Wave celerity (m/s) -
Storage coefficient:
K = 0.5 × exp(-L/10000) - Dimensionless attenuation factor
-
Decreases exponentially with reach length
-
Manning coefficient:
n_manning - Roughness parameter for friction calculations
- Can be specified or estimated from channel properties
Processing pipeline
The river network processing follows these steps in order:
- Read shapefile: Load river network geometry and attributes
- Build topology: Identify upstream/downstream connections by matching reach endpoints
- Enforce binary tree: Add fictitious short reaches for nodes with >2 tributaries
- Compute Strahler order: Assign hierarchical stream orders
- Join single tributaries (optional): Merge linear reach sequences
- Calculate routing parameters: Compute width, lag time, storage coefficient
- Determine calculation order: Set optimal processing sequence for routing
Binary tree enforcement
Networks may have junctions where more than 2 reaches join. MOBIDIC requires binary trees (maximum 2 upstream tributaries per reach) for its routing algorithms. The _enforce_binary_tree() function:
- Identifies nodes with >2 upstream reaches
- Adds fictitious short reaches (0.1m length) to split non-binary junctions
- Updates topology to maintain network connectivity
Example: A 3-way junction (reaches A, B, C joining at node N flowing to reach D):
This step typically adds a small number of fictitious reaches.
River network topology
Supported Structures
- Binary trees: Maximum 2 upstream tributaries per reach (enforced automatically)
- Multiple outlets: Supports networks with more than one terminal reach
- Disconnected networks: Warns if multiple unconnected sub-networks exist
Topology Validation
The processing pipeline validates:
- Reach connectivity (no dangling reaches)
- Binary tree structure (automatically enforced)
- Circular references (detects loops)
- Duplicate reach IDs
Reach Joining
When join_single_tributaries=True:
- Identifies reaches with exactly one upstream and one downstream reach
- Merges these into the downstream reach
- Recalculates length, routing parameters, and geometry
- Simplifies the network while preserving topology
This is useful for: - Reducing computational load - Focusing on hydrologically significant junctions - Matching observational data at specific locations
Note: Joining is applied after binary tree enforcement, so some fictitious reaches may be merged.
File Formats
GeoParquet (Recommended)
- Pros: Fast I/O, efficient compression, preserves all data types, widely supported
- Cons: Requires
pyarrow(default) orfastparquet - Use: Default format for processed networks
Shapefile (Legacy)
- Pros: Universal compatibility, GIS software support
- Cons: Slower I/O, field name limitations (10 chars), limited data types
- Use: When compatibility with legacy tools is required
Notes
- All lengths are in meters, times in seconds
- Coordinate reference system (CRS) is preserved from input shapefile
- Original shapefile fields are preserved;
mobidic_idis added for topology - Geometry must be LineString (not MultiLineString)
- Network must be topologically connected (warns if not)
- Fictitious reaches (added during binary tree enforcement) contain only MOBIDIC-specific fields