Weather Research and Forecasting (WRF) Scaling, Performance Assessment, and Optimization
NCAR SIParCS Program 2018


This is still very much a work in progress!

This repo contains the benchmark cases and scripts used for assessing the scaling performance of WRF on NCAR's Cheyenne Supercomputer.

The results of this study can be found in docs/results.org.

Repository directories

  • cases/ contains the namelists for WPS and WRF for the various benchmark cases along with scripts for generating the wrfbdy and wrfinput files for each case.
  • scripts/ contains useful scripts for automating the compilation, running, and analysis of the WRF test cases. Pass the --help flag to see how to use each (see the example after this list).
  • WRFs/ contains job scripts for compiling the various combinations of WRF versions, compiler versions and options, and MPI versions used in these performance assessments.
  • WPSs/ contains job scripts for compiling the various WPS versions used to generate WRF input data.
  • docs/ contains this documentation along with the timing and scaling results and associated analysis and plotting code used to generate the results.
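For example, to see the usage of the run driver used in the example below (invoked here with the same path the example uses):

./WRF_benchmarks/wrf_run_pbs_jobs --help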

Example of how to run the new_conus12km benchmark case

As always, change the example paths to match your directory structure. None of the scripts in the scripts/ directory hardcode paths (except wrf_run_pbs_jobs, which has a default run directory), but the scripts in the cases/ directory do hardcode paths, as do all the job submission scripts in WRFs/ and WPSs/.

The following will produce a table of timing results for the new_conus12km benchmark, with WRF V4.0 compiled with intel 18.0.1 and mpt 2.18, scaled across 1, 2, 4, and 8 nodes on Cheyenne. (Cheyenne's 36 cores per node is the default in wrf_run_pbs_jobs.)

# Jasper libraries are needed to build WPS's ungrib with GRIB2 support
export JASPERLIB=/glade/u/home/wrfhelp/UNGRIB_LIBRARIES/lib
export JASPERINC=/glade/u/home/wrfhelp/UNGRIB_LIBRARIES/include
# Fill in your own PBS account code and notification email
alias qsub='/opt/pbs/bin/qsub -A MY_ACCOUNT_CODE -m abe -M my_email_address'

git clone https://github.com/akirakyle/WRF_benchmarks.git

# Compile WRF V4.0 with the Intel 18.0.1 compiler and MPT 2.18 MPI
qsub WRF_benchmarks/WRFs/WRFV4.0-intel18.0.1-mpt2.18.pbs
qstat # wait for job to finish...
# Compile WPS V4.0 with the Intel 18.0.1 compiler
qsub WRF_benchmarks/WPSs/WPSV4.0-intel18.0.1.pbs
qstat # wait for job to finish...
# Submit the WPS and real.exe jobs that generate the wrfbdy/wrfinput files for each case
./WRF_benchmarks/cases/make_cases_new.sh
qstat # wait for job to finish...
# Run the benchmark case on 1, 2, 4, and 8 nodes
./WRF_benchmarks/wrf_run_pbs_jobs -c ~/WRF_benchmarks/cases/new_conus12km \
                                  -w ~/work/WRFs/WRFV4.0-intel18.0.1-mpt2.18 \
                                  -n 1 2 4 8 -r ./run \
                                  -p '-A MY_ACCOUNT_CODE -m abe -M my_email_address'
qstat # wait for jobs to finish...
cd run
# Copy the rsl timing output of each completed run into the results directory
../WRF_benchmarks/wrf_cpy_rsl -a ../archive -o ../results -r *
cd ../results
# Print a table of timing statistics for each run
../WRF_benchmarks/wrf_stats *

I made heavy use of Emacs Org mode for automating the above workflow of creating runs, submitting them, and plotting their output. In particular, I used the Babel feature of Org mode and its integration with TRAMP. I eventually needed a more flexible way to manage all the PBS job submission scripts than the wrf_run_pbs_jobs script provided, so I created the bash source block in the docs/runs.org file to run, manage, and document all the jobs I submitted to Cheyenne. The wrf_stats script outputs an Org-compatible table, so I use the docs/data.org file to save all the WRF timing results; I can then reference this table of data in the analysis code blocks in docs/results.org.
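For instance, a Babel source block can run shell commands on Cheyenne over TRAMP through the :dir header argument. A minimal sketch, assuming the hostname and remote path shown here (adjust both to your setup):

#+BEGIN_SRC bash :dir /ssh:cheyenne.ucar.edu:~/run :results output
# Executed remotely on Cheyenne via TRAMP: check on my queued jobs
qstat -u $USER
#+END_SRC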

Of course, this document itself is also written in Org mode and exported to HTML. Be aware that GitHub renders Org mode documents as if they were Markdown, which often leaves out valuable information present in the raw Org mode text.

Data Sources for the various benchmark cases

katrina (1km, 3km, 30km)

Data for running the Hurricane Katrina cases can be downloaded from the WRF tutorial page

curl http://www2.mmm.ucar.edu/wrf/TUTORIAL_DATA/Katrina.tar.gz -o Katrina.tar.gz
tar -xzf Katrina.tar.gz

The 30km case uses the same namelist as shown in the WRF-ARW tutorial at http://www2.mmm.ucar.edu/wrf/OnLineTutorial/CASES/SingleDomain/index.html

WPS and real.exe must be run to generate wrfbdy_d01 and wrfinput_d01 for these cases.
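The scripts in cases/ automate this; for reference, here is a minimal sketch of the standard sequence, assuming the WPS and WRF build directories sit side by side (the GRIB path and Vtable choice are illustrative):

cd WPS
./geogrid.exe                            # interpolate static geographic data to the domain
./link_grib.csh /path/to/Katrina/avn*    # link the downloaded GRIB files (illustrative path)
ln -sf ungrib/Variable_Tables/Vtable.GFS Vtable
./ungrib.exe && ./metgrid.exe            # degrib, then interpolate to the model grid
cd ../WRF/run
ln -sf ../../WPS/met_em.d01* .
./real.exe                               # writes wrfinput_d01 and wrfbdy_d01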

conus12km and conus2.5km

The wrfbdy_d01 and wrfrst_d01 files for the official CONUS benchmarks at 12km and 2.5km resolution can be found at: http://www2.mmm.ucar.edu/wrf/WG2/benchv3/

maria1km, maria3km, new_conus12km, and new_conus2.5km

The data for these Hurricane Maria cases and CONUS cases comes from the NCEP GFS model runs, which can be downloaded from https://rda.ucar.edu/datasets/ds084.1/. However, this data already resides on GLADE, so the WPS scripts simply link in the appropriate GLADE directories.
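For example, the kind of link those scripts set up might look like the following; the GLADE path and date here are assumptions, so check the ds084.1 documentation for the actual layout:

# Hypothetical: link one initialization time of 0.25-degree GFS GRIB2 data
# from GLADE into the WPS directory (path and date are assumptions)
./link_grib.csh /glade/collections/rda/data/ds084.1/2017/20170918/gfs.0p25.2017091800.f*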

WPS and real.exe must be run to generate wrfbdy_d01 and wrfinput_d01 for these cases, as in the sketch for the Katrina cases above.

Author: Akira Kyle

Email: akyle@cmu.edu

Created: 2018-07-31 Tue 18:05
