Note
This page was generated from a jupyter notebook.
Using USGS NHDPlus HR Datasets With the Landlab NetworkSedimentTransporter Component¶
This tutorial illustrates how to use USGS NHDPlus HR datasets with the NetworkSedimentTransporter. Before they can be imported into Landlab, NHDPlus HR shapefiles must be preprocessed in GIS to import hydraulic attributes, fix topological issues, and generate nodes. The “Soque_Links.shp” and “Soque_Nodes.shp” files used in this example were prepared following QGIS procedures archived here.
Part 1: Here, we demonstrate how to import the files into Landlab, convert the shapefile attributes, and remove flat areas from the topograpghy.
Parts 2 - 5: These cells are copied nearly verbatim from the Landlab notebook “Using the Landlab NetworkSedimentTransporter component starting with a shapefile river network”. The only differences are the values of timesteps (cell 16), dt (cell 16), and originating_link (cell 20). This is to demonstrate how the code in Part 1 allows the files to behave exactly the same as the MethowSubBasin shapefiles used in other notebooks.
[ ]:
import warnings
warnings.filterwarnings(
"ignore", category=UserWarning, module=".*network_sediment_transporter"
)
[ ]:
import functools
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr
from tqdm import tqdm
from landlab import ExampleData
from landlab.components import FlowDirectorSteepest, NetworkSedimentTransporter
from landlab.data_record import DataRecord
from landlab.io import read_shapefile
from landlab.plot import graph, plot_network_and_parcels
1. Load the Preprocessed NHDPlus HR Files¶
First, we load the preprocessed shapefiles.
[ ]:
datadir = ExampleData("io/shapefile", case="soque").base
link_shp = datadir / "Soque_Links.shp"
node_shp = datadir / "Soque_Nodes.shp"
Next, we initiate the grid, identify the pour point node, convert the attribute formats so Landlab can read them, and convert some units.
[ ]:
# Initiate grid
grid = read_shapefile(
file=link_shp,
points_shapefile=node_shp,
link_fields=["TotDASqKm", "Slope", "LengthKM", "MaxElevSmo"],
node_fields=["TotDASqKm", "Slope", "LengthKM", "MaxElevSmo"],
link_field_conversion={
"TotDASqKm": "drainage_area",
"Slope": "channel_slope",
"LengthKM": "reach_length",
"MaxElevSmo": "topographic__elevation",
},
node_field_conversion={
"TotDASqKm": "drainage_area",
"Slope": "channel_slope",
"LengthKM": "reach_length",
"MaxElevSmo": "topographic__elevation",
},
threshold=0.01,
store_polyline_vertices=True,
)
[ ]:
# Identify pour point node ID
watershed_pour_point = np.where(np.isnan(grid.at_node["drainage_area"]))
# Give pour point node the same attributes (plus / minus a small amount) as the node immediately upstream on the main stem
largest_neighbor = np.max(
grid.at_node["drainage_area"][grid.adjacent_nodes_at_node[watershed_pour_point]]
)
# POUR POINT NODE DRAINAGE AREA IS 0.1% LARGER THAN ITS NEAREST MAINSTEM NEIGHBOR NODE
grid.at_node["drainage_area"][watershed_pour_point] = largest_neighbor * 1.001
smallest_neighbor = np.min(
grid.at_node["drainage_area"][grid.adjacent_nodes_at_node[watershed_pour_point]]
)
# POUR POINT NODE ELEVATION IS ([NEAREST MAINSTEM NEIGHBOR NODE ELEVATION] - [MINIMUM SLOPE IN NHDPLUS DATASET])
grid.at_node["topographic__elevation"][watershed_pour_point] = (
smallest_neighbor - np.min(grid.at_link["channel_slope"])
)
# Convert units
grid.at_link["drainage_area"] *= 1000000.0 # Convert km^2 TO m^2
grid.at_node["drainage_area"] *= 1000000.0 # Convert km^2 TO m^2
grid.at_link["reach_length"] *= 1000.0 # Convert km TO m
grid.at_node["reach_length"] *= 1000.0 # Convert km TO m
grid.at_link["topographic__elevation"] /= 100.0 # Convert cm to m
grid.at_node["topographic__elevation"] /= 100.0 # Convert cm to m
Finally, we have to deal with cetain areas of the grid called “flats”. These occur where two or more adjacent nodes have the same topographic elevation, creating an effective slope of zero between them. The NHDPlus HR datasets assign these areas a slope value of 1E-5 so that there are not any zeros:
[ ]:
np.nanmin(grid.at_node["channel_slope"])
As you can see, the smallest slope value is 1E-5 (we use the numpy.nanmin function because the value of the pour point node is “nan”). However, the Landlab flow directors independantly use topographic elevation to calculate slope. This can result in zeros that confuse the flow directors.
We can solve this by performing a “carve”. We will loop through each node on the grid and calculate the downstream path from that location. Wherever we encounter a flat, we will adjust the topographic elevation by subtracting a height equal to the length of that reach multiplyed by the minimum slope assigned by the NHDPlus HR datatables (1E-5).
[ ]:
# Create array that will store terminal flat areas for each node
downstream_nodes = np.zeros(grid.number_of_nodes, dtype=int)
# For each node, identify terminal flat node
for i in grid.nodes:
adjacent_nodes = grid.adjacent_nodes_at_node[i]
adjacent_nodes = np.append(adjacent_nodes, i)
adjacent_nodes = np.delete(adjacent_nodes, np.where(adjacent_nodes == -1))
adjacent_nodes_DA = grid.at_node["drainage_area"][adjacent_nodes]
downstream_id = np.where(adjacent_nodes_DA == np.max(adjacent_nodes_DA))
downstream_id = functools.reduce(lambda sub, elem: sub * 10 + elem, downstream_id)
if np.size(downstream_id) > 1:
downstream_nodes[i] = -1 # np.nan
else:
downstream_nodes[i] = adjacent_nodes[downstream_id]
downstream_nodes[watershed_pour_point] = -1 # np.nan
# downstream_nodes = downstream_nodes.astype(np.int)
# Create array to store new elevations
carve = np.ones(grid.number_of_nodes) * grid.at_node["topographic__elevation"]
# Calculate carve
for i in grid.nodes:
current_node = i
q = 1
while q == 1:
downstream_node = downstream_nodes[current_node]
if downstream_node >= 0:
current_node_elev = carve[current_node]
downstream_node_elev = carve[downstream_node]
if downstream_node_elev < current_node_elev:
current_node = downstream_node
if downstream_node_elev >= current_node_elev:
carve[downstream_node] = carve[current_node] - (
grid.at_node["channel_slope"][current_node]
* grid.at_node["reach_length"][current_node]
)
if current_node == watershed_pour_point:
q = 0
if downstream_node < 0:
q = 0
# Record amount carved
carve_delta = grid.at_node["topographic__elevation"] - carve
# Perform carve
grid.at_node["topographic__elevation"] = carve
The variable “carve_delta” records how much height was subtrated from each node. We can use it to view some statistics on how much we altered the topography:
[ ]:
plt.bar(grid.nodes, carve_delta)
plt.xlabel("Node ID")
plt.ylabel("Vertical distance carved (m)")
print("Total number of nodes carved = ", np.count_nonzero(carve_delta))
print(
"Percent of nodes carved = ",
(np.count_nonzero(carve_delta) / grid.number_of_nodes) * 100,
"%",
)
print("Maximum distance carved = ", np.max(carve_delta), "m")
print(
"Mean distance (+/- 1-sigma) carved = ",
np.mean(carve_delta[np.nonzero(carve_delta)]),
"+/-",
np.std(carve_delta[np.nonzero(carve_delta)]),
"m",
)
As you can see, the adjustments we made to the topography were very small (no more than a few millimeters at most). This means it is very unlikely that these changes will significantly affect the model behavior.
2. Review the Grid and Add Parameters¶
Alright, let’s see what fields we read in with this shapefile:
[ ]:
grid.at_link.keys()
[ ]:
grid.at_node.keys()
Great! Looks like we have length (reach length), upstream drainage area (drainage area), x and y verticies of each link/reach (x and y of polyline), and bed elevation (topographic elevation).
Note that “reach_length” is defined by the user, rather than calculated as the minimum distance between nodes. This accounts for channel sinuosity. In this case, “reach_length” could be equivalently calculated as the cumulative distance between verticies defined by x and y of polyline.
[ ]:
graph.plot_graph(grid, at="node,link")
[ ]:
grid.number_of_links
[ ]:
grid.number_of_nodes
Our network consists of 317 links between 318 nodes. In the plot above, X and Y represent the plan-view coordinates of the node locations.
Next, we need to populate the grid with the relevant topographic and hydrologic information:
[ ]:
grid.at_node["bedrock__elevation"] = grid.at_node["topographic__elevation"].copy()
grid.at_link["channel_width"] = 1 * np.ones(grid.number_of_links) # m
grid.at_link["flow_depth"] = 0.5 * np.ones(grid.number_of_links) # m
We must distinguish between topographic elevation (the top surface of the bed sediment) and bedrock elevation (the surface of the river in the absence of modeled sediment).
3. Create sediment ‘parcels’ in a DataRecord¶
We represent sediment in the network as discrete parcels (or packages) of grains of uniform size and characteristics. Each parcel is tracked through the network grid according to sediment transport and stratigraphic constraints.
Parcels are tracked using the Landlab DataRecord.
First, let’s create arrays with all of the essential sediment parcel variables:
[ ]:
# element_id is the link on which the parcel begins.
element_id = np.repeat(np.arange(grid.number_of_links), 50)
element_id = np.expand_dims(element_id, axis=1)
volume = 1 * np.ones(np.shape(element_id)) # (m3)
active_layer = np.ones(np.shape(element_id)) # 1= active, 0 = inactive
density = 2650 * np.ones(np.size(element_id)) # (kg/m3)
abrasion_rate = 0 * np.ones(np.size(element_id)) # (mass loss /m)
# Lognormal GSD
medianD = 0.15 # m
mu = np.log(medianD)
sigma = np.log(2) # assume that D84 = sigma*D50
np.random.seed(0)
D = np.random.lognormal(
mu, sigma, np.shape(element_id)
) # (m) the diameter of grains in each parcel
In order to track sediment motion, we classify parcels as either active (representing mobile surface sediment) or inactive (immobile subsurface) during each timestep. The active parcels are the most recent parcels to arrive in the link. During a timestep, active parcels are transported downstream (increasing their location_in_link
, which is a normalized value ranging from 0 to 1) according to a sediment transport formula.
We begin by assigning each parcel an arbitrary (and small) arrival time and location in the link.
[ ]:
time_arrival_in_link = np.random.rand(np.size(element_id), 1)
location_in_link = np.random.rand(np.size(element_id), 1)
In addition to the required parcel attributes listed above, you can designate optional parcel characteristics, depending on your needs. For example:
[ ]:
lithology = ["quartzite"] * np.size(element_id)
We now collect the arrays into a dictionary of variables, some of which will be tracked through time (["item_id", "time"]
), and others of which will remain constant through time :
[ ]:
variables = {
"abrasion_rate": (["item_id"], abrasion_rate),
"density": (["item_id"], density),
"lithology": (["item_id"], lithology),
"time_arrival_in_link": (["item_id", "time"], time_arrival_in_link),
"active_layer": (["item_id", "time"], active_layer),
"location_in_link": (["item_id", "time"], location_in_link),
"D": (["item_id", "time"], D),
"volume": (["item_id", "time"], volume),
}
With all of the required attributes collected, we can create the parcels DataRecord. Often, parcels will eventually transport off of the downstream-most link. To track these parcels, we have designated a “dummy_element
” here, which has index value -2
.
[ ]:
items = {"grid_element": "link", "element_id": element_id}
parcels = DataRecord(
grid,
items=items,
time=[0.0],
data_vars=variables,
dummy_elements={"link": [NetworkSedimentTransporter.OUT_OF_NETWORK]},
)
4. Run the NetworkSedimentTransporter¶
With the parcels and grid set up, we can move on to setting up the model.
[ ]:
timesteps = 20 # total number of timesteps
dt = 60 * 60 * 24 * 10 # length of timestep (seconds)
Before running the NST, we need to determine flow direction on the grid (upstream and downstream for each link). To do so, we initalize and run a Landlab flow director component:
[ ]:
fd = FlowDirectorSteepest(grid, "topographic__elevation")
fd.run_one_step()
Then, we initialize the network sediment transporter:
[ ]:
nst = NetworkSedimentTransporter(
grid,
parcels,
fd,
bed_porosity=0.3,
g=9.81,
fluid_density=1000,
transport_method="WilcockCrowe",
)
Now we are ready to run the model forward in time:
[ ]:
for t in tqdm(range(0, (timesteps * dt), dt)):
nst.run_one_step(dt)
5. Plot the model results¶
There are landlab plotting tools specific to the NetworkSedimentTransporter
. In particular, plot_network_and_parcels
creates a plan-view map of the network and parcels (represented as dots along the network). We can color both the parcels and the links by attributes.
Here, we demonstrate one example use of plot_network_and_parcels
. For a thorough tutorial on the plotting tools, see this notebook.
We can color links by values that we calculate. For example, if we are curious about the fate of sediment that started out on link 300 (in the far northwest corner of the basin), we might want to plot the total volume of sediment that originated on link 300 during a later timestep:
[ ]:
timestep_of_interest = 6
originating_link = 300
# filter the parcels to calculate total volumes of only the parcels that originated in the chosen link
parcelfilter = np.zeros_like(parcels.dataset.element_id, dtype=bool)
parcelfilter[:, timestep_of_interest] = (
parcels.dataset.element_id[:, 0] == originating_link
)
vol_orig_link = parcels.calc_aggregate_value(
xr.Dataset.sum, "volume", at="link", filter_array=parcelfilter, fill_value=0.0
)
fig = plot_network_and_parcels(
grid,
parcels,
link_attribute=vol_orig_link,
link_attribute_title="Vol of sed originating on link x",
network_linewidth=5,
parcel_alpha=0,
)
Non-network plotting¶
The results of the NST can be visualized by directly accessing information about the grid, the parcels, and by accessing variables stored after the run of NST.
As a simple example, we can plot the total volume of parcels on the grid through time. As parcels exit the grid, the total volume decreases.
[ ]:
parcel_vol_on_grid = parcels.dataset["volume"].values
parcel_vol_on_grid[parcels.dataset["element_id"].values == -2] = 0
# plt.figure(figsize=(8,6))
plt.plot(
np.asarray(parcels.time_coordinates) / (60 * 60 * 24),
np.sum(parcel_vol_on_grid, axis=0),
"-",
linewidth=3,
alpha=0.5,
)
plt.ylabel("Total volume of parcels on grid $[m^3]$")
plt.xlabel("Time [days]")
plt.show()
We can also plot individual parcel characteristics. The plot below shows the total transport distance of each parcel through the whole model run as a function of the parcel’s grain size (during the final timestep).
[ ]:
plt.loglog(parcels.dataset.D[:, -1], nst._distance_traveled_cumulative, ".")
plt.xlabel("Parcel grain size (m)")
plt.ylabel("Cumulative parcel travel distance (m)")
# Note: some of the smallest grain travel distances can exceed the length of the
# grid by "overshooting" during a single timestep of high transport rate
The plot below is an example of accessing variables associated with the grid (grid.at_link.X
, or grid.at_node.X
), as well as a variable associated with this instance of NetworkModelGrid (nmg.X
):
[ ]:
plt.plot(grid.at_link["channel_slope"], nst.d_mean_active, ".")
plt.xlabel("Channel slope (m/m)")
plt.ylabel("Mean grain size of active layer (m)")