Blog

Scaling up VIC for HPC

Oct 17, 2016. | By: Joe Hamman

Last month, we released VIC 5.0.0, a major rewrite of the VIC model’s user interface and infrastructure. We introduced this release in a recent blog post. The rewrite included a number of upgrades that allow VIC to perform much better in high-performance computing (HPC) environments. In this blog post, I will discuss some of these improvements and illustrate how they may facilitate VIC applications at scales that were previously computationally infeasible.

Background

VIC was originally intended to be used as a land surface scheme (LSS) in general circulation models (GCMs) (Liang et al., 1994). However, the original source code was written as a stand-alone column model. In this configuration, distributed simulations were run one grid cell at a time: the model would complete a simulation for a single grid cell for all timesteps before moving on to the next grid cell. We call this configuration “time-before-space”. From an infrastructure perspective, this meant VIC did not need built-in tools for parallelization or memory management for distributed simulations. Large-scale (e.g. regional or global) distributed simulations were made possible through what we refer to as “poor man’s parallelization”, where each grid cell was simulated as a separate task (even on a separate computer). For the past 15 years, this parallelization strategy has been sufficient for many VIC applications.

The development of VIC 5 was largely motivated by a number of limitations that were related to the legacy configuration of the VIC source code. First, despite this being one of VIC’s original goals, the “time-before-space” loop order precluded VIC’s direct coupling within GCMs, which typically require a “space-before-time” evaluation order. Second, VIC’s input/output (I/O) was designed to work with its “time-before-space” configuration. This meant there were individual forcing and output files for each grid cell. For large model domains, such as the Livneh et al. (2015) domain, which includes more than 330,000 grid cells, the sheer number of files became one of the most challenging aspects of working with VIC. It also precluded the use of standard tools (such as ncview, cdo, xarray and others) that are designed to visualize and analyze large data sets. Not being able to visualize model output easily hampers model application and development. Third, a number of important hydrologic problems require a “space-before-time” configuration. For example, if upstream flow needs to be taken into account for local water and energy balance calculations, then it is necessary to perform routing after each model timestep. Routing requires information about the entire domain. While there are ways to accommodate this with prior VIC versions, these implementations tend to be awkward workarounds for the “time-before-space” configuration. Implementation of a true “space-before-time” mode simplifies the integration of routing with the rest of the VIC model.
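To make the loop-order distinction concrete, here is a minimal sketch in C (with illustrative function names, not actual VIC routines) of the two configurations. The routing call in the second loop is the kind of domain-wide operation that only fits naturally in a “space-before-time” loop.

    #include <stddef.h>
    #include <stdio.h>

    /* Stand-ins for the per-cell physics call and a domain-wide routing
     * step; these names are illustrative, not actual VIC functions. */
    static void run_cell(size_t cell, size_t t) { (void)cell; (void)t; }
    static void route_runoff(size_t t) { (void)t; /* needs the whole domain */ }

    /* "time-before-space": each grid cell is run for all timesteps before
     * moving on to the next cell (the legacy configuration). */
    static void time_before_space(size_t ncells, size_t ntimes)
    {
        for (size_t cell = 0; cell < ncells; cell++)
            for (size_t t = 0; t < ntimes; t++)
                run_cell(cell, t);
    }

    /* "space-before-time": the whole domain advances one timestep at a
     * time, so domain-wide operations such as routing can run between
     * timesteps. */
    static void space_before_time(size_t ncells, size_t ntimes)
    {
        for (size_t t = 0; t < ntimes; t++) {
            for (size_t cell = 0; cell < ncells; cell++)
                run_cell(cell, t);
            route_runoff(t);
        }
    }

    int main(void)
    {
        time_before_space(4, 8);
        space_before_time(4, 8);
        printf("both loop orders completed\n");
        return 0;
    }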

VIC 5 Developments for HPC

The largest change introduced in VIC 5 is the notion of individual drivers that all call the same physics routines. Because VIC has such a large user community, and because there are many ongoing projects that will use the legacy “time-before-space” configuration, we provided a classic driver, which essentially functions the same way as previous VIC implementations. For the reasons described above, most of our work on VIC 5 was focused on the development of a “space-before-time” configuration. We’re calling this the image driver.

The image driver has two main infrastructure improvements relative to the classic driver. First, we’ve incorporated a formal HPC parallelization strategy based on the Message Passing Interface (MPI). MPI is a standardized communication protocol for large parallel applications (out-of-core and off-node). In the context of VIC, it allows for the simultaneous simulation of thousands of grid cells, distributed across a cluster or supercomputer. Second, we’ve overhauled and standardized VIC’s I/O. Whereas previous versions of VIC used custom ASCII or binary file formats, the image driver uses netCDF for both input and output files. NetCDF has been widely adopted throughout the geoscience community and offers three main advantages: first, it stores N-dimensional binary datasets; second, it provides a standardized method for storing metadata along with the data; and third, there are many existing tools for visualizing and analyzing netCDF files.
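As a rough illustration of how MPI distributes the domain, here is a highly simplified sketch in C in which the first rank fills a forcing array (dummy values stand in for the netCDF read) and scatters a block of grid cells to every process. This is not VIC source code: the domain size, the even decomposition, and the placeholder per-cell computation are all assumptions made for brevity.

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define NCELLS_GLOBAL 1024   /* illustrative domain size */

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Assume the domain divides evenly over the ranks for brevity. */
        int ncells_local = NCELLS_GLOBAL / size;
        double *global_forcing = NULL;
        double *local_forcing = malloc(ncells_local * sizeof(double));

        if (rank == 0) {
            /* In the image driver this would be a netCDF read of one
             * timestep of forcing data for the whole domain. */
            global_forcing = calloc(NCELLS_GLOBAL, sizeof(double));
        }

        /* Distribute the domain: each rank receives its own block of cells. */
        MPI_Scatter(global_forcing, ncells_local, MPI_DOUBLE,
                    local_forcing, ncells_local, MPI_DOUBLE,
                    0, MPI_COMM_WORLD);

        /* Each rank now runs the (shared) physics on its local cells. */
        for (int i = 0; i < ncells_local; i++)
            local_forcing[i] += 1.0;   /* placeholder for per-cell computation */

        printf("rank %d of %d processed %d cells\n", rank, size, ncells_local);

        free(local_forcing);
        free(global_forcing);
        MPI_Finalize();
        return 0;
    }

In the real image driver a matching gather step collects results for output; the point here is only that the domain decomposition and communication are handled by MPI rather than by launching one job per grid cell.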

Scaling VIC on HPC

In this section, we will show an example of how VIC can be applied in parallel on a large supercomputer. We’re showing results from two 1-year test simulations run using the image driver on the RASM model domain. The RASM model domain has about 26,000 land grid cells and includes the entire pan-Arctic drainage. Both of these simulations were run using 3-hourly forcing inputs, a 3-hour model timestep, frozen soils, and minimal output (only 8 variables, written once monthly). We ran these tests on the Topaz supercomputer at the U.S. Department of Defense Engineer Research and Development Center Supercomputing Resource Center (ERDC DSRC). Topaz is a 124,416-core SGI ICE X machine, made up of 3,456 nodes (36 cores/node) capable of 4.62 PFLOPS.

The first simulation we’re going to look at was run in “Water Balance” mode (FULL_ENERGY = FALSE). When VIC is run in this mode, it doesn’t iterate to find the surface temperature, resulting in much faster run times at the expense of model complexity. The figure below shows the model throughput (left axis) for 15 identical VIC simulations run using between 1 and 432 MPI processes. Using just 1 processor (no MPI), this VIC configuration has a model throughput of 27 model-years/wall-day (where a wall-day is short for “wall-clock day”, which is simply a calendar day in real time). Using 72 processors, we see the model throughput increase to 334 model-years/wall-day. Beyond 36 processors, the throughput begins to plateau and the scaling efficiency (right axis) becomes too low to push the scaling any further.

[Figure: model throughput (left axis) and scaling efficiency (right axis) as a function of the number of MPI processes for the “Water Balance” simulations]
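As a point of reference, the scaling efficiency on the right axis can be read as the standard parallel efficiency, i.e. the speedup relative to the single-process run divided by the number of processes:

    E(p) = ( X(p) / X(1) ) / p

where X(p) is the model throughput with p MPI processes. With X(1) = 27 and X(72) = 334 model-years/wall-day, for example, E(72) ≈ (334 / 27) / 72 ≈ 0.17.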

Next we’ll look at the scaling performance of VIC run in “Energy Balance” mode (FULL_ENERGY = TRUE). In this configuration, we expect the model to be much more computationally expensive. In fact, we find that at 36 cores (one node), the throughput of the “Energy Balance” simulation is about 36 times lower than that of the “Water Balance” simulation. Because this configuration is so much more expensive, the scaling efficiency is substantially better. In this case, the model throughput has not yet plateaued at 864 processors, where we are able to get a throughput of 97 model-years/wall-day.

[Figure: model throughput (left axis) and scaling efficiency (right axis) as a function of the number of MPI processes for the “Energy Balance” simulations]

Conclusions

Our initial analysis of VIC’s scaling shows that we get reasonable scaling for complex model configurations. With the scaling we’re seeing here, we can conceivably run 1000-year spinup simulations of VIC in the Arctic in 2-3 weeks. Another way to think about the potential here is in terms of applying VIC in large ensembles, where hundreds of ensemble members are run and readily analyzed using tools optimized for use with netCDF output. For simple VIC configurations (e.g. “Water Balance” only), it seems that VIC is not going to scale much beyond a single node, but it still benefits from multiple cores on the same node. The reason for this difference in scaling behavior lies in the interplay between the volume of the input forcings (read on the master MPI processor and scattered to the slave processors) and the computation performed on the slave processors. As the input volume goes up, or the computation time spent by individual slave processors goes down, the scaling performance decreases.

We are developing tools that automate these scaling tests. We also have a number of issues slated for future development that we believe will improve the scaling of VIC. The issues we think hold the most potential would add parallel netCDF I/O and on-node shared memory threading. Both of these features would reduce the overhead introduced by MPI.

Acknowledgments

Supercomputing resources were provided through the Department of Defense (DOD) High Performance Computing Modernization Program at the Army Engineer Research and Development Center. Tony Craig, a member of the RASM team, also contributed to the development of this blog post.

References

  • Liang, X., D. P. Lettenmaier, E. F. Wood, and S. J. Burges (1994), A simple hydrologically based model of land surface water and energy fluxes for general circulation models, J. Geophys. Res., 99(D7), 14415–14428, doi:10.1029/94JD00483.
  • Livneh, B., T. J. Bohn, D. W. Pierce, F. Munoz-Arriola, B. Nijssen, R. Vose, D. R. Cayan, and L. Brekke (2015), A spatially comprehensive, hydrometeorological data set for Mexico, the U.S., and Southern Canada 1950–2013, Scientific Data, 2, doi:10.1038/sdata.2015.42.


VIC 5.0.0 released

Sep 6, 2016. | By: Bart Nijssen

Today we released VIC 5.0.0, a major upgrade to the VIC model infrastructure and the basis for all future releases of the VIC model. The new release is available from the VIC GitHub repository. Documentation is provided on the VIC documentation web site.

Following this release, no further releases will be made as part of the VIC 4 development track except for occasional bug fixes to the support/VIC.4.2.d branch.

The VIC 5.0.0 release is the result of three years of concerted effort in overhauling the VIC source code to allow for future expansion and for better integration with other models. More details on individual model features will be provided on this web site in a series of posts over the next few weeks. Full details can be found on the VIC documentation web site and in the various issues that have been tracked on the VIC GitHub repository.

Note that the VIC 5.0.0 release includes many infrastructure upgrades, but that the model calculations provide the same results as can be obtained with VIC 4.2.d. This was a conscious decision to limit the number of simultaneous changes that were implemented in the source code.

Major changes in VIC 5.0.0 include:

  • a clean separation of model physics from the model driver: All model physics are now contained in a vic_run module, and the different model drivers, which manage input/output, initialization, and memory allocation, all interact with the same vic_run module.

  • multiple model drivers: For historical reasons, VIC always ran in a time before space mode, in which every model element (or grid cell) was run to completion before advancing to the next model element. While this had advantages, it made it much more difficult to interact with other models, which use a space before time mode in which the entire model domain is completed for one time step before advancing to the next. We have retained the original behavior as part of a classic driver and have implemented the space before time behavior as part of an image driver. We are also creating a CESM driver (which is currently still being tested) and have a Python driver that can be used to test individual model functions.

  • netCDF file format: While the classic driver uses (nearly) the same ASCII format for input and output as earlier versions of VIC, all input and output in the image driver uses netCDF. This includes spatial model parameters, meteorological forcing data, model output, and model state files (a short example of reading such a file from C follows this list).

  • exact restarts: When operated in image mode, the VIC model now has byte-exact restarts. That means that if you run the model in a single simulation, the results are exactly the same as when you run the model in shorter increments and restart each time from a model state file that was generated at the end of the previous simulation. Because the classic mode uses ASCII state files, restarts are not byte-exact, but the behavior has been much improved over previous versions of VIC.

  • parallelization: In classic mode (time before space), VIC could easily be parallelized by breaking the domain into separate pieces and running each piece on a separate processor and/or node. In image mode this is a less desirable solution. Although parallelization could be implemented that way, it reduces the advantages of having a space before time mode in which you may want to treat the entire domain as a single entity. The image mode therefore uses MPI to allow parallelization of the code over a large number of nodes or processors. We have run tests with VIC 5.0.0 in which we used more than three thousand processors.

  • separation of the generation of the meteorological forcings: We have removed from the VIC source code the MTCLIM routines that were used to estimate meteorological forcings at sub-daily time steps given limited inputs. This means that the user will need to generate all the meteorological forcing data outside of VIC. This will likely be the single largest impact on users who are only interested in the classic mode and who would like to upgrade to VIC 5.0.0.

  • continuous integration and model testing: As part of the VIC GitHub repository, all code changes are automatically subjected to a large number of tests. We currently use Travis CI to automatically build and run the model on a number of different architectures and with a number of different compilers. As part of this, we also test model functionality in an automated manner. In addition, we have designed a science test suite, which is run before model releases and in which model output is compared with observations and previous model simulations to assess the effect of model changes in a large range of environments.

  • improved documentation: All documentation has been updated and is available on the VIC documentation page. Note that documentation for previous versions of VIC is available in the same location.

  • extensive code cleanup and reformatting.
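To give a flavor of what working with the image driver’s netCDF files looks like from a compiled program (the same files can, of course, be opened with ncview, cdo, xarray, and similar tools), here is a short example that uses the netCDF C library to read one timestep of a single output variable. The file name, variable name, and dimension layout below are placeholders rather than guaranteed VIC conventions; adapt them to your own output files.

    #include <stdio.h>
    #include <stdlib.h>
    #include <netcdf.h>

    /* Abort with a readable message if a netCDF call fails. */
    static void check(int status, const char *msg)
    {
        if (status != NC_NOERR) {
            fprintf(stderr, "%s: %s\n", msg, nc_strerror(status));
            exit(EXIT_FAILURE);
        }
    }

    int main(void)
    {
        int ncid, varid;
        size_t nvals = 10;                /* number of cells to read */
        size_t start[3] = {0, 0, 0};      /* origin in (time, lat, lon) */
        size_t count[3] = {1, 1, nvals};  /* one timestep, one row of cells */
        double *runoff = malloc(nvals * sizeof(double));

        /* Hypothetical file and variable names; substitute your own. */
        check(nc_open("vic_image_output.nc", NC_NOWRITE, &ncid), "nc_open");
        check(nc_inq_varid(ncid, "OUT_RUNOFF", &varid), "nc_inq_varid");
        check(nc_get_vara_double(ncid, varid, start, count, runoff),
              "nc_get_vara_double");
        check(nc_close(ncid), "nc_close");

        printf("first runoff value read: %f\n", runoff[0]);
        free(runoff);
        return 0;
    }

This is only a sketch of the read pattern; in practice you would query the dimension lengths with nc_inq_dimlen before allocating buffers.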

We encourage everyone to upgrade to VIC 5.0.0 and to contribute to further model development.


You can report bugs and issues on the VIC GitHub repository.

If you have questions, please refer them to the VIC user mailing list.


We are hiring

Oct 23, 2015. | By: Bart Nijssen

I am looking for two motivated and qualified postdoctoral research associates (postdocs) to join the UW Hydro | Computational Hydrology group. One is on an NSF-funded project to study hydrological and stream temperature changes under climate change, while the other is on a NASA-funded project to develop infrastructure that supports hyper-resolution, large ensemble, continental-scale hydrologic modeling. You can find details about the positions on our join page.

Update (January 15, 2016): Both positions have been filled. Future openings will be posted on our join page.


SUMMA 1.0.0 released

Jul 17, 2015. | By: Bart Nijssen

In collaboration with Martyn Clark’s group at NCAR/RAL, we released the source code for SUMMA, a new hydrologic modeling code that advances a unified approach to process-based hydrologic modeling and enables controlled and systematic evaluation of multiple model representations. This has been some time in the making and will potentially change a lot of the model development, application and evaluation activities in the group.

SUMMA stands for the Structure for Unifying Multiple Modeling Alternatives and is described in detail in two papers that recently appeared in Water Resources Research and in an NCAR Technical Note. The source code (and details about the papers) can be found on the SUMMA GitHub page, and sample data sets can be obtained from the SUMMA page at NCAR.

In future posts, I will provide some samples of SUMMA’s capabilities.


Integrated Scenarios

May 22, 2015. | By: Bart Nijssen

One of the projects that the group has been involved with over the past few years is the Integrated Scenarios of the Future Northwest Environment project, together with groups at Oregon State University and the University of Idaho. The goal of the project was to evaluate changes in climate, hydrology and vegetation in the Pacific Northwest during the rest of this century. In our group, Matt Stumbaugh, Diana Gergel, Dennis Lettenmaier, and I have been involved with the model simulations and analysis. While the analysis is ongoing, the model runs to support this project have been completed and the model output is available for other groups to use in their research. There is an interesting article about the project in a new annual magazine from the Northwest Climate Science Center: Northwest Climate Magazine. The Integrated Scenarios article can be found here.


Recently published

Apr 29, 2015. | By: Bart Nijssen

There has been a spate of new publications (co-)authored by the UW Hydro | Computational Hydrology group. You can always get the latest on our publications page, but here is a brief overview of the papers published so far this year:

  • Mishra et al., Changes in observed climate extremes in global urban areas, Environmental Research Letters, doi:10.1088/1748-9326/10/2/024005:

    A look at changes in daily, historic extremes in temperature, precipitation and wind in urban areas around the globe over the period 1973-2012. Main takeaways: “[…] urban areas have experienced significant increases […] in the number of heat waves during the period 1973–2012, while the frequency of cold waves has declined”. At the same time, “[e]xtreme windy days declined”.

  • Roberts et al., Simulating transient ice–ocean Ekman transport in the Regional Arctic System Model and Community Earth System Model, Annals of Glaciology, doi:10.3189/2015AoG69A760:

    This is one of the first publications that stems from the Regional Arctic System Model (RASM). The study demonstrates that high-frequency coupling is necessary to represent certain transport processes in the Arctic Ocean. In particular, “[t]he result suggests that processes associated with the passage of storms over sea ice (e.g. oceanic mixing, sea-ice deformation and surface energy exchange) are underestimated in Earth System Models that do not resolve inertial frequencies in their marine coupling cycle.”

  • Vano et al., Seasonal hydrologic responses to climate change in the Pacific Northwest, Water Resources Research, doi:10.1002/2014WR015909:

    The development of a methodology that allows rapid evaluation of long-term changes in seasonal hydrographs based on global climate model output, and an application in the Pacific Northwest (PNW). Within the PNW, “[…] transitional (intermediate elevation) watersheds experience the greatest seasonal shifts in runoff in response to cool season warming.”

  • Clark et al., A unified approach for process-based hydrologic modeling: 1. Modeling concept, Water Resources Research, doi:10.1002/2015WR017198:

    The first paper describing the concepts behind SUMMA (Structure for Unifying Multiple Modeling Alternatives), which potentially has a big impact on the work done in our group. While we will continue to support VIC and DHSVM for the foreseeable future, much of our model development effort over the next few years will shift towards SUMMA. SUMMA “[…] formulates a general set of conservation equations, providing the flexibility to experiment with different spatial representations, different flux parameterizations, different model parameter values, and different time stepping schemes.” The goal of this modeling approach is for SUMMA to “help tackle major hydrologic modeling challenges, including defining the appropriate complexity of a model, selecting among competing flux parameterizations, representing spatial variability across a hierarchy of scales, identifying potential improvements in computational efficiency and numerical accuracy as part of the numerical solver, and improving understanding of the various sources of model uncertainty.”

  • Clark et al., A unified approach for process-based hydrologic modeling: 2. Model implementation and case studies, Water Resources Research, doi:10.1002/2015WR017200:

    The second paper in this two-part series discusses the SUMMA implementation and provides some initial applications to demonstrate SUMMA’s potential.

  • Mao et al., Is climate change implicated in the 2013–2014 California drought? A hydrologic perspective, Geophysical Research Letters, doi:10.1002/2015GL063456:

    California has been subject to a major drought over the last few years (with no relief in sight as of this writing). In this paper, we examined whether the 2012-2014 drought was within the range of historic droughts and to what extent warming during the past century has contributed to the below-average mountain snow pack in 2013-2014. “We find that the warming may have slightly exacerbated some extreme events (including the 2013–2014 drought and the 1976–1977 drought of record), but the effect is modest; instead, these drought events are mainly the result of variability in precipitation.”

  • Clark et al., Continental Runoff into the Oceans (1950-2008), Journal of Hydrometeorology, doi:10.1175/JHM-D-14-0183.1:

    This paper (by a different Clark than the above two papers – Liz rather than Martyn) provides a new set of estimates of freshwater discharge to the oceans, based on a merger of observations and model simulations. “We estimate that flows to the world’s oceans globally are 44,200 (± 2660) km3 yr-1 (9% from Africa, 37% from Eurasia, 30% from South America, 16% from North America, and 8% from Australia-Oceania). These estimates are generally higher than previous estimates, with the largest differences in South America, and Australia-Oceania.”


A new name and website: UW Hydro | Computational Hydrology

Mar 11, 2015. | By: Bart Nijssen

With the move of Dennis Lettenmaier and part of his group to UCLA last fall, it is time for a new name and web site for what used to be the Land Surface Hydrology Group at the University of Washington. We’ll maintain the old link for a little while, but any updates will happen on this new web site. To more accurately reflect what the group does and where we are headed, we are now the UW Hydro | Computational Hydrology group. Existing content such as information about our models, data sets, and projects has been maintained, but you may need to look around to find it. Other parts of the web site have seen more drastic change to reflect the current make-up and activities within the group. We’ll use the web site to provide periodic updates on projects, publications and new activities.

