Skip to content

Hillslope-reach mapping

The hillslope-reach mapping module connects hillslope grid cells to river reaches, enabling lateral flow routing from the distributed grid to the channel network.

Overview

This module performs two key operations:

  1. Rasterization: Identifies which grid cells are occupied by river reaches
  2. Flow routing: Traces flow paths from hillslope cells to their corresponding reaches

These functions are used for distributing lateral inflow from the hillslope water balance to the appropriate river reaches in the channel routing module.

Functions

Compute hillslope cells

Compute hillslope cells for each reach in the river network.

This function rasterizes each reach geometry onto the grid to identify which grid cells are directly occupied by the channel. These cells are stored as linear indices in the ‘hillslope_cells’ column.

The densification uses iterative midpoint subdivision (matching MATLAB’s densify.m) to ensure exact compatibility with the original implementation.

Parameters:

Name Type Description Default
network GeoDataFrame

GeoDataFrame with processed river network (must contain geometry column)

required
grid_path str | Path | None

Path to reference grid raster (used to get grid parameters). Required if grid_array is not provided.

None
densify_step float

Maximum distance [m] between points when densifying reach geometries (default: 10.0)

10.0
grid_array ndarray | None

Optional numpy array with grid data. If provided, xllcorner, yllcorner, and cellsize must also be provided.

None
xllcorner float | None

X coordinate of lower-left corner [m]. Required if grid_array is provided.

None
yllcorner float | None

Y coordinate of lower-left corner [m]. Required if grid_array is provided.

None
cellsize float | None

Grid cell size [m]. Required if grid_array is provided.

None

Returns:

Type Description
GeoDataFrame

GeoDataFrame with added ‘hillslope_cells’ column containing list of linear indices

Examples:

>>> from mobidic import process_river_network
>>> network = process_river_network("river_network.shp")
>>>
>>> # Option 1: Use file path
>>> flowdir_path = "flowdir.tif"
>>> network = compute_hillslope_cells(network, grid_path=flowdir_path)
>>>
>>> # Option 2: Use array with metadata
>>> network = compute_hillslope_cells(
...     network,
...     grid_array=flowdir_array,
...     xllcorner=xll,
...     yllcorner=yll,
...     cellsize=cs
... )
Source code in mobidic/preprocessing/hillslope_reach_mapping.py
def compute_hillslope_cells(
    network: gpd.GeoDataFrame,
    grid_path: str | Path | None = None,
    densify_step: float = 10.0,
    grid_array: np.ndarray | None = None,
    xllcorner: float | None = None,
    yllcorner: float | None = None,
    cellsize: float | None = None,
) -> gpd.GeoDataFrame:
    """Compute hillslope cells for each reach in the river network.

    This function rasterizes each reach geometry onto the grid to identify
    which grid cells are directly occupied by the channel. These cells are
    stored as linear indices in the 'hillslope_cells' column.

    The densification uses iterative midpoint subdivision (matching MATLAB's
    densify.m) to ensure exact compatibility with the original implementation.

    Args:
        network: GeoDataFrame with processed river network (must contain geometry column)
        grid_path: Path to reference grid raster (used to get grid parameters). Required if grid_array is not provided.
        densify_step: Maximum distance [m] between points when densifying reach geometries (default: 10.0)
        grid_array: Optional numpy array with grid data. If provided, xllcorner, yllcorner,
                    and cellsize must also be provided.
        xllcorner: X coordinate of lower-left corner [m]. Required if grid_array is provided.
        yllcorner: Y coordinate of lower-left corner [m]. Required if grid_array is provided.
        cellsize: Grid cell size [m]. Required if grid_array is provided.

    Returns:
        GeoDataFrame with added 'hillslope_cells' column containing list of linear indices

    Examples:
        >>> from mobidic import process_river_network
        >>> network = process_river_network("river_network.shp")
        >>>
        >>> # Option 1: Use file path
        >>> flowdir_path = "flowdir.tif"
        >>> network = compute_hillslope_cells(network, grid_path=flowdir_path)
        >>>
        >>> # Option 2: Use array with metadata
        >>> network = compute_hillslope_cells(
        ...     network,
        ...     grid_array=flowdir_array,
        ...     xllcorner=xll,
        ...     yllcorner=yll,
        ...     cellsize=cs
        ... )
    """
    logger.info(f"Computing hillslope cells for {len(network)} reaches")

    # Determine input mode and get grid parameters
    if grid_array is not None:
        # Array mode: use provided array and metadata
        if xllcorner is None or yllcorner is None or cellsize is None:
            raise ValueError("When grid_array is provided, xllcorner, yllcorner, and cellsize must also be provided")
        grid = grid_array
        nrows, ncols = grid.shape
        logger.debug("Using provided grid array")
    elif grid_path is not None:
        # Path mode: read from file
        grid_result = grid_to_matrix(grid_path)
        grid = grid_result["data"]
        xllcorner = grid_result["xllcorner"]
        yllcorner = grid_result["yllcorner"]
        cellsize = grid_result["cellsize"]
        nrows, ncols = grid.shape
        logger.debug(f"Reading grid from file: {grid_path}")
    else:
        raise ValueError("Either grid_path or grid_array (with metadata) must be provided")

    logger.debug(f"Grid parameters: xllcorner={xllcorner}, yllcorner={yllcorner}, cellsize={cellsize}")

    # Initialize column for storing hillslope cells
    hillslope_cells = []

    for idx in network.index:
        geom = network.loc[idx, "geometry"]

        if geom.is_empty:
            hillslope_cells.append([])
            continue

        # Densify the LineString geometry
        coords = _densify_linestring(geom, densify_step)

        if len(coords) == 0:
            hillslope_cells.append([])
            continue

        # Convert geographic coordinates to grid indices
        xx0 = coords[:, 0]  # x coordinates
        yy0 = coords[:, 1]  # y coordinates

        # Convert to grid indices
        col_indices = np.round((xx0 - xllcorner) / cellsize).astype(int)
        row_indices = np.round((yy0 - yllcorner) / cellsize).astype(int)

        # Filter out-of-bounds indices
        valid_mask = (row_indices >= 0) & (row_indices < nrows) & (col_indices >= 0) & (col_indices < ncols)
        row_indices = row_indices[valid_mask]
        col_indices = col_indices[valid_mask]

        if len(row_indices) == 0:
            hillslope_cells.append([])
            continue

        # Convert to linear indices
        linear_indices = col_indices * nrows + row_indices

        # Get unique indices
        unique_indices = np.unique(linear_indices)

        hillslope_cells.append(unique_indices.tolist())

    # Add column to network
    network = network.copy()
    network["hillslope_cells"] = hillslope_cells

    total_cells = sum(len(cells) for cells in hillslope_cells)
    logger.success(f"Computed hillslope cells: {total_cells} total cells across {len(network)} reaches")

    return network

Map hillslope to reach

Map each hillslope cell to its downstream river reach.

For each grid cell, this function follows the flow direction path until it reaches a cell that belongs to a river reach (hillslope cell), then assigns the reach’s mobidic_id to the original cell.

This function translates MATLAB’s hill2chan.m algorithm: - Builds a lookup table of reach cells from network.hillslope_cells - For each valid cell, follows flow direction until reaching a reach cell - Assigns the reach’s mobidic_id to all cells along the path - Handles edge cases: out-of-bounds, invalid directions, loops

Parameters:

Name Type Description Default
network GeoDataFrame

GeoDataFrame with river network (must have ‘hillslope_cells’ and ‘mobidic_id’ columns)

required
flowdir_path str | Path | None

Path to flow direction raster file. Required if flowdir_array is not provided.

None
flow_dir_type str

Flow direction notation (‘Grass’ for 1-8 or ‘Arc’ for power-of-2). Only used if flowdir_path is provided.

'Grass'
flowdir_array ndarray | None

Optional numpy array with flow direction data in MOBIDIC notation (1-8). If provided, no conversion is applied.

None

Returns:

Type Description
ndarray

2D array with reach assignment for each cell (mobidic_id or -9999 for unassigned)

Raises:

Type Description
ValueError

If network doesn’t have required columns or flow_dir_type is invalid

Examples:

>>> from mobidic import process_river_network, compute_hillslope_cells
>>> network = process_river_network("river_network.shp")
>>> network = compute_hillslope_cells(network, "flow_dir.tif")
>>>
>>> # Option 1: Use file path
>>> reach_map = map_hillslope_to_reach(network, flowdir_path="flow_dir.tif")
>>>
>>> # Option 2: Use array (already in MOBIDIC notation)
>>> reach_map = map_hillslope_to_reach(network, flowdir_array=flowdir_mobidic)
Source code in mobidic/preprocessing/hillslope_reach_mapping.py
def map_hillslope_to_reach(
    network: gpd.GeoDataFrame,
    flowdir_path: str | Path | None = None,
    flow_dir_type: str = "Grass",
    flowdir_array: np.ndarray | None = None,
) -> np.ndarray:
    """Map each hillslope cell to its downstream river reach.

    For each grid cell, this function follows the flow direction path until
    it reaches a cell that belongs to a river reach (hillslope cell), then
    assigns the reach's mobidic_id to the original cell.

    This function translates MATLAB's hill2chan.m algorithm:
    - Builds a lookup table of reach cells from network.hillslope_cells
    - For each valid cell, follows flow direction until reaching a reach cell
    - Assigns the reach's mobidic_id to all cells along the path
    - Handles edge cases: out-of-bounds, invalid directions, loops

    Args:
        network: GeoDataFrame with river network (must have 'hillslope_cells' and 'mobidic_id' columns)
        flowdir_path: Path to flow direction raster file. Required if flowdir_array is not provided.
        flow_dir_type: Flow direction notation ('Grass' for 1-8 or 'Arc' for power-of-2).
                        Only used if flowdir_path is provided.
        flowdir_array: Optional numpy array with flow direction data in MOBIDIC notation (1-8).
                        If provided, no conversion is applied.

    Returns:
        2D array with reach assignment for each cell (mobidic_id or -9999 for unassigned)

    Raises:
        ValueError: If network doesn't have required columns or flow_dir_type is invalid

    Examples:
        >>> from mobidic import process_river_network, compute_hillslope_cells
        >>> network = process_river_network("river_network.shp")
        >>> network = compute_hillslope_cells(network, "flow_dir.tif")
        >>>
        >>> # Option 1: Use file path
        >>> reach_map = map_hillslope_to_reach(network, flowdir_path="flow_dir.tif")
        >>>
        >>> # Option 2: Use array (already in MOBIDIC notation)
        >>> reach_map = map_hillslope_to_reach(network, flowdir_array=flowdir_mobidic)
    """
    logger.info("Mapping hillslope cells to river reaches")

    # Validate network has required columns
    if "hillslope_cells" not in network.columns:
        logger.error("Network must have 'hillslope_cells' column. Run compute_hillslope_cells() first.")
        raise ValueError("Network must have 'hillslope_cells' column")
    if "mobidic_id" not in network.columns:
        logger.error("Network must have 'mobidic_id' column")
        raise ValueError("Network must have 'mobidic_id' column")

    # Determine input mode and get flow direction
    if flowdir_array is not None:
        # Array mode: use provided array (assumed to be in MOBIDIC notation)
        flowdir = flowdir_array
        logger.debug("Using provided flow direction array (MOBIDIC notation)")
    elif flowdir_path is not None:
        # Path mode: read from file and convert
        flowdir_result = grid_to_matrix(flowdir_path)
        flowdir = flowdir_result["data"]
        # Transform flow direction from GRASS/Arc to MOBIDIC notation
        flowdir = convert_to_mobidic_notation(flowdir, from_notation=flow_dir_type)
        logger.debug(f"Reading flow direction from file and converting from {flow_dir_type} to MOBIDIC notation")
    else:
        raise ValueError("Either flowdir_path or flowdir_array must be provided")

    nrows, ncols = flowdir.shape

    # Initialize output array with NaN
    reach_map = np.full((nrows, ncols), np.nan, dtype=float)

    # Build lookup table: linear_index -> mobidic_id
    # This replaces MATLAB's ri/rv arrays with a dictionary for O(1) lookup
    logger.debug("Building reach cell lookup table")
    cell_to_reach = {}
    for idx in network.index:
        mobidic_id = network.loc[idx, "mobidic_id"]
        hillslope_cells = network.loc[idx, "hillslope_cells"]

        if hillslope_cells is None or len(hillslope_cells) == 0:
            continue

        for cell_idx in hillslope_cells:
            # Store first occurrence (matches MATLAB h(1) behavior)
            if cell_idx not in cell_to_reach:
                cell_to_reach[cell_idx] = mobidic_id

    logger.debug(f"Lookup table built: {len(cell_to_reach)} reach cells")

    # Direction offsets for MOBIDIC notation (1-8)
    di = np.array([-1, -1, -1, 0, 1, 1, 1, 0])  # row offset (matches i8)
    dj = np.array([-1, 0, 1, 1, 1, 0, -1, -1])  # column offset (matches j8)

    # Get valid cells (non-NaN flow direction)
    valid_cells = np.where(np.isfinite(flowdir))
    n_valid = len(valid_cells[0])

    logger.info(f"Processing {n_valid} valid cells")

    # Process each valid cell
    processed_count = 0
    unassigned_count = 0

    for k in range(n_valid):
        i0, j0 = valid_cells[0][k], valid_cells[1][k]

        # Skip if already assigned
        if not np.isnan(reach_map[i0, j0]):
            continue

        # Track path for this cell
        path_i = [i0]
        path_j = [j0]
        ic, jc = i0, j0

        # Follow flow direction until reaching a reach cell or error
        while True:
            # Convert current position to linear index
            kc = jc * nrows + ic

            # Check if current cell is a reach cell
            if kc in cell_to_reach:
                # Assign reach to original cell
                reach_map[i0, j0] = cell_to_reach[kc]
                processed_count += 1
                break

            # Check for invalid flow direction
            direction = flowdir[ic, jc]
            if not (1 <= direction <= 8) or np.isnan(direction):
                reach_map[i0, j0] = -9999
                unassigned_count += 1
                break

            # Move to next cell
            direction_idx = int(direction) - 1
            ic_next = ic + di[direction_idx]
            jc_next = jc + dj[direction_idx]

            # Check bounds
            if ic_next < 0 or ic_next >= nrows or jc_next < 0 or jc_next >= ncols:
                reach_map[i0, j0] = -9999
                unassigned_count += 1
                break

            # MATLAB loop detection: if find(sub2ind([n,m],i0,j0) == sub2ind([n,m],ic,jc))
            # Check if next position (ic_next, jc_next) is already in the path
            # This detects if we're revisiting any cell in the current path
            if any(ic_next == pi and jc_next == pj for pi, pj in zip(path_i, path_j)):
                reach_map[i0, j0] = -9999
                unassigned_count += 1
                break

            # Append to path and continue (MATLAB: i0=[i0 ic]; j0=[j0 jc];)
            path_i.append(ic_next)
            path_j.append(jc_next)
            ic, jc = ic_next, jc_next

    logger.success(
        f"Hillslope-to-reach mapping complete: {processed_count} cells assigned, "
        f"{unassigned_count} cells unassigned (out-of-bounds, loops, or invalid flow)"
    )

    return reach_map

Workflow

The typical workflow for hillslope-reach mapping is:

from mobidic import (
    process_river_network,
    compute_hillslope_cells,
    map_hillslope_to_reach,
)

# Step 1: Process river network
network = process_river_network(
    shapefile_path="data/river_network.shp",
    join_single_tributaries=True,
)

# Step 2: Rasterize reaches onto grid
network = compute_hillslope_cells(
    network=network,
    grid_path="data/flow_direction.tif",
)

# Step 3: Map hillslope cells to reaches
reach_map = map_hillslope_to_reach(
    network=network,
    flowdir_path="data/flow_direction.tif",
    flow_dir_type="Grass",  # or "Arc"
)

# Result: 2D array with reach ID for each cell
print(f"Grid shape: {reach_map.shape}")
print(f"Reaches identified: {len(set(reach_map[reach_map >= 0]))}")

Algorithm details

Rasterization algorithm

compute_hillslope_cells() performs the following steps:

  1. Reads the reference grid to obtain spatial reference (transform, CRS, dimensions)
  2. For each reach in the network:
  3. Converts the reach geometry to a list of (x, y) coordinates
  4. Transforms geographic coordinates to pixel indices using the grid’s affine transform
  5. Records all cells intersected by the reach geometry
  6. Stores the list of cell indices (as linear indices) in a hillslope_cells column

Flow path

map_hillslope_to_reach() implements a flow path following algorithm:

  1. For each grid cell:
  2. If the cell belongs to a reach (has hillslope_cells), assign that reach ID
  3. Otherwise, follow the flow direction downstream until reaching a reach
  4. Assign the reached reach ID to the original cell
  5. Cells that don’t reach any reach (e.g., flowing out of basin) are assigned -9999

Flow direction support: Both Grass (1-8 notation) and Arc (power-of-2 notation) formats are supported.

Return values

compute_hillslope_cells()

Returns a modified GeoDataFrame with an additional column:

  • hillslope_cells (list of int): Linear indices of cells occupied by each reach

map_hillslope_to_reach()

Returns a 2D numpy array with:

  • Values ≥ 0: mobidic_id of the reach draining this cell
  • Value = -9999: Cell does not drain to any reach (e.g., basin outlet)
  • NaN: Nodata cells outside the valid domain

Example output

After processing, the network GeoDataFrame contains:

network[['mobidic_id', 'length_m', 'hillslope_cells']].head()
#    mobidic_id  length_m  hillslope_cells
# 0           0   5432.1   [12045, 12046, 12147, ...]
# 1           1   3210.8   [15234, 15235, ...]
# 2           2   8765.4   [18902, 18903, 19004, ...]

And the reach_map can be visualized:

import matplotlib.pyplot as plt

plt.figure(figsize=(10, 8))
plt.imshow(reach_map, cmap='tab20', interpolation='nearest')
plt.colorbar(label='Reach ID')
plt.title('Hillslope-Reach Mapping')
plt.show()