How to autocalibrate a SWMM model with swmmr

Introduction

Model calibration or optimization is an essential part within the modelling chain to improve the model quality. During calibration, model parameter values are systematically modified to optimize an objective function, which numerically expresses the difference between observation and simulation data. In this tutorial, we walk through the required steps to autocalibrate a SWMM model with swmmr. We calibrate the Example1.inp model which is shipped with the offical SWMM executable from US EPA. For the optimization, we use the DEoptim package, which provides a differential evolution optimization algorithm.

library(swmmr)
library(DEoptim)

Setup

First, model paths are defined and a simulation run is initiated with run_swmm to check for errors.

# set path to inp
# If your operating system is Windows, the Example1.inp model is usually 
# located at "C:\Users\your user name\Documents\EPA SWMM Projects\Examples".
# For convenience the Example1.inp model is also included in the swmmr package.
# Feel free to change this to your path of choice.
inp_file <- system.file("extdata", "Example1.inp", package = "swmmr", mustWork = TRUE)

# both rpt and out files are temporary files
tmp_rpt_file <- tempfile()
tmp_out_file <- tempfile()

# initiate the simulation
swmm_files <- run_swmm(
  inp = inp_file,
  rpt = tmp_rpt_file,
  out = tmp_out_file
)

The original output of the first simulation run is assumed to be the observation data. The parameter of interest is total_runoff at node 18. We load the observation data as an xts object with read_out.

obs <- read_out(
  file = swmm_files$out,
  iType = 1,
  object_name = "18",
  vIndex = 4
)[["18"]]$total_inflow

To keep it simple, we only calibrate the model parameter Perc_Imperv of subcatchments which are larger than 10 ha. The original parameter values are “50” for subcatchment “5” and “10” for subcatchment “6”. The model structure is loaded with read_inp.

# read model structure
inp <- read_inp(swmm_files$inp) 

# show the original parameter values
inp$subcatchments[inp$subcatchments$Area > 10, ]

Goodness-of-fit criteria

To numerically evaluate the difference between observation and simulation data, we use the Nash-Sutcliffe efficiency (nse).

# function calculates the goodness of fit value
# input x is a two column xts object, col1: obs, col2: sim
nse <- function(x) {
  1 - sum((x[, 1] - x[, 2]) ^ 2) / sum((x[, 1] - mean(x[, 1])) ^ 2)
}

Objective function

The optimization algorithm exactly needs one function to be minimized. Therefore, we define a function that

first argument takes a vector of real-valued parameters,
second argument is a loaded inp file, containing a list of SWWM section as tibbles and
takes observation data as a third argument.

The function mainly consists of three sections. First, a new parameter set generated from the optimization algorithm overrides the original values of the inp object. The updated inp object is then written to disk with write_inp. Second a simulation run is performed with the new inp file and results are read with read_out. Finally, the goodness-of-fit value is calculated.

obj_fun <- function(x, inp, obs) {

  # set new parameters and update inp object
  inp$subcatchments <- transform(
    inp$subcatchments,
    Perc_Imperv = ifelse(Area > 10, x, Perc_Imperv)
  )

  # write new inp file to disk
  tmp_inp <- tempfile()
  write_inp(inp, tmp_inp)

  # run swmm with new parameter set
  swmm_files <- suppressMessages(run_swmm(tmp_inp, stdout = NULL))

  # remove files when function exits to avoid heavy disk usage
  on.exit(file.remove(unlist(swmm_files)))

  # read sim result
  sim <- read_out(
    file = swmm_files$out, # path to out file
    iType = 1, # type: node
    object_name = "18", # name of node
    vIndex = 4 # parameter at node: total inflow
  )[["18"]]$total_inflow # directly access to xts object

  # calculate goodness-of-fit
  # note: multiply by minus one to have a real min problem (nse: +1 to -Inf)
  nse(merge(obs, sim)) * -1
}

Optimization

Finally, we need to config the optimization algorithm. It is required to provide

the function to be optimized
parameter bounds (lower, upper)
a list of control parameters (useful for parallel computing or fine tuning)
further argument passed to the function to be minimized (here: the ìnp object and the observation data)

  set.seed(84) # to get reproducible results

  calibration_res <- DEoptim(
    fn = obj_fun, 
    lower = c(0, 0), 
    upper = c(100, 100),
    control = list(
      itermax = 50, # maximum iterations
      trace = 10, # print progress every 10th iteration
      packages = c("swmmr"), # export packages to optimization environment
      parVar = c("nse"), # export function to optimization environment
      parallelType = 0 # set to 1 to use all available cores
    ),
    inp = inp, # 'inp' object
    obs = obs # xts object containing observation data
  )

  summary(calibration_res)

The calibration yields as optimized parameter values which is close to the original values.