Custom Diagnostics Module Setup

Part of the power of WRF is the ability to tweak and tune such that you can answer very specific questions that you are curious about. Of course, modifying WRF is not always the most intuitive of processes. Even with some of the existing documentation, it took me a fair amount of time sifting through the WRF source code to really begin to see what was possible and how to achieve it. This example attempts to explain the steps to insert a custom diagnostic calculation via the addition of a module.

First, why use a module instead of just inserting your code into existing WRF modules? The short answer is that modular code is a good thing in most instances. A module allows you to compartmentalize your additions to WRF and makes debugging a whole lot easier. Yes, you will still have to add small bits of code in some of the WRF routines, but keeping your subroutines separate will save you a great deal of headache in the long run. With that being said, adding your code as a module is more involved/difficult than simply dumping it in another WRF module file.

Second, why go through the painstaking process of coding your own diagnostics? Aside from the obvious ability to get exactly what you want, the biggest gain is cutting down on post-processing time. Having run on several supercomputers under research group allocations, I find it highly desirable to save service units (SUs) wherever possible. WRF is highly parallelized, and a module that is added correctly can take advantage of that. Many post-processing tools are not parallelized, so doing your specific calculations with them on a large scale will not be very efficient.

Now, to explain how this works. There are several steps that are involved. I will go through them in some detail below:

  1. Create your module.

    This is the logical first step. I went through several iterations of testing my module before I began to try and incorporate it into WRF. What you can do to help yourself is to familiarize yourself with the other diagnostic modules. One of the easiest diagnostics modules to model your own on is the AFWA diagnostics module. You’ll find this module in the ./phys/module_afwa_diagnostics.F file. The .F extension is important for your module as well, because the make recipes work on .F files first.

    Take a look through the AFWA code and see if there is anything similar to what you want to do. For my case, there was some code to handle resetting variables after a history write so that you could calculate a sum/maximum over a given time period. Overall, I cut out most of the AFWA code and left the USE statements as well as the timing CALL statements. Below you will find a pseudocode example of my WRF diagnostic module:

    MODULE my_custom_diagnostic
      ! USE statements: generally keep the same ones the AFWA module has
      IMPLICIT NONE
    CONTAINS

      SUBROUTINE my_diagnostic_driver()   ! argument list omitted
        ! Global variable declarations
        !   Variables here come from the grid and are dimensioned
        !   using ims, ime, jms, jme, kms, kme
        ! Local variable declarations
        !   Any local arrays you create that will NOT be passed
        !   back to the grid/solver can be dimensioned using
        !   its, ite, jts, jte, kts, kte
        ! Timing subroutine calls...

        ! Set array values to zero after a history write.
        ! This can be used for sums/maxima calculations.
        ! Do this before the calculations so that it does
        ! not wipe out your results prematurely.
        IF ( is_after_history_dump ) THEN
          DO j = jms, jme
            DO i = ims, ime
              ! This variable is one you set up in the registry
              grid%myVar(i,j) = 0.
            END DO
          END DO
        END IF

        ! Your diagnostic calculations may have more than one routine to call
        CALL my_diagnostic_calc()
      END SUBROUTINE my_diagnostic_driver

      SUBROUTINE my_diagnostic_calc()     ! argument list omitted
        ! Variables will be passed in using the ims, ime, jms, jme, kms, kme dimensions
        ! Do the calculations
        ! Loops in your subroutines should use the its, ite, jts, jte, kts, kte indices
        ! only if you plan on using OMP directives; otherwise, for MPI only, I used
        ! the ims, ime, jms, jme, kms, kme
        ! Make sure to test whether you are at the domain's edge, i.e., DO i=its,MIN(ite,ide-1)
      END SUBROUTINE my_diagnostic_calc

    END MODULE my_custom_diagnostic

    I want to highlight a couple of important things from the code above. First, you will see that I make reference to various WRF dimensions. By my count, WRF uses four different sets of dimensions to define arrays: domain, memory, patch, and tile (you can read more about them in the WRF Tutorial presentations that I reference at the end of these notes). All of these dimensions are stored in the grid derived type variable within WRF. Domain dimensions are defined via the namelist and are used to make sure one does not loop off the edge of the domain. Patch dimensions cover the part of the domain assigned to one MPI process, and tiles subdivide a patch for shared-memory threads. The memory dimensions include ghost zones for periodic boundaries and for differencing schemes along the patch edges. The memory dimensions are also used to pass the grid arrays to various subroutines.

    Per a wrfhelp email conversation, the memory dimensions are not supposed to be used in loops in things like my diagnostic routines. However, when I tried to loop only through the tile dimensions in my subroutines, WRF either SEGFAULTed without a useful error message or produced bad data (e.g., oddly striped fields, missing data along boundaries). I had no trouble using the memory dimensions for everything.

    The wrfhelp folks gave me the example of the advance_uv subroutine in the ./dyn_em/module_small_step_em.F file. From the looks of that subroutine, among others, the tile dimensions are often used in conjunction with a subroutine call within an OMP parallel do loop. Based on that, I think the tile dimensions only apply when OMP directives are being used. The performance hit from my added calculations was small even when using the memory dimensions. If I get more information from WRF Help or become more confident in the answer myself, I will update this with more clarification.
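
    To make the dimension bookkeeping concrete, here is a minimal, self-contained sketch. It is not WRF code: the index values are made up, and fill_tile is a hypothetical stand-in for a diagnostic routine. It declares the array with memory dimensions and loops over tile dimensions clamped with MIN(ite, ide-1), on the assumption that ide and jde are the staggered upper bounds, so a mass-point field stops one point short of them.

    ```fortran
    program tile_vs_memory_demo
      implicit none
      ! Made-up index ranges for a single patch that holds a single tile
      integer, parameter :: ids = 1,  ide = 10, jds = 1,  jde = 8    ! domain
      integer, parameter :: ims = -4, ime = 15, jms = -4, jme = 13   ! memory (patch plus halo)
      integer, parameter :: its = 1,  ite = 10, jts = 1,  jte = 8    ! tile
      real :: field(ims:ime, jms:jme)

      field = -999.0   ! marker meaning "never computed"
      call fill_tile(field)

      print *, 'interior point (its,jts):', field(its, jts)
      print *, 'halo point     (ims,jms):', field(ims, jms)
      print *, 'last column    (ide,jts):', field(ide, jts)

    contains

      ! Hypothetical calculation routine: the array is declared with memory
      ! dimensions, but the loops run over tile dimensions clamped to ide-1/jde-1
      subroutine fill_tile(a)
        real, intent(inout) :: a(ims:ime, jms:jme)
        integer :: i, j
        do j = jts, min(jte, jde - 1)
          do i = its, min(ite, ide - 1)
            a(i, j) = real(i + 100 * j)
          end do
        end do
      end subroutine fill_tile

    end program tile_vs_memory_demo
    ```

    Only the interior point is filled in this sketch; the halo point and the last column keep the marker value. Looping over the memory dimensions instead would also write into the halo, which is worth keeping in mind if a later calculation reads those points.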

    Secondly, the AFWA diagnostics module contains some code that “empties the bucket”, so to speak, of a few variables. I used that framework to set up a variable that could calculate an hourly max value and output it with the history write. Just pay attention to when you reset your values as you do not want to overwrite before you output to history. I use this sort of code to create hourly max fields of an assortment of variables.
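
    The ordering of the reset matters, so here is a small stand-alone sketch of the pattern. It is not taken from the AFWA module: the sample values, the dump interval, and the way the is_after_history_dump flag gets set are all invented for illustration. The point is only that the reset happens at the start of the step after a history write, never on the step of the write itself.

    ```fortran
    program hourly_max_demo
      implicit none
      integer, parameter :: nsteps = 6, steps_per_dump = 3
      ! Invented per-step values of the quantity being tracked
      real, parameter :: sample(nsteps) = [1.0, 4.0, 2.0, 3.0, 0.5, 2.5]
      real    :: my_max
      logical :: is_after_history_dump
      integer :: n

      my_max = 0.0
      is_after_history_dump = .false.

      do n = 1, nsteps
        ! Reset first, so the value written at the previous dump is not lost
        if (is_after_history_dump) my_max = 0.0

        ! Running maximum over the current output window
        my_max = max(my_max, sample(n))

        ! Stand-in for the history write at the end of each window
        if (mod(n, steps_per_dump) == 0) then
          print *, 'history write at step', n, ' window max =', my_max
          is_after_history_dump = .true.
        else
          is_after_history_dump = .false.
        end if
      end do
    end program hourly_max_demo
    ```

    With these sample values the two windows report maxima of 4.0 and 3.0. Moving the reset to after the MAX update, on the same step as the write, would zero the field before it reached the history file.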

    The last important thing I want to mention is in regard to how to pass arrays around your subroutines. I had significant trouble trying to use mostly local arrays within my diagnostic subroutine that I would use to calculate and update the grid variables I defined in the WRF registry (I’ll cover this later). Stranger still, I only ran into trouble (i.e., WRF SEGFAULTS) with certain variables and not others being defined locally. I am not sure if there were issues with dimensioning that I could not diagnose or if trying to mix the grid derived type with normal arrays caused issues. The bottom line is that I ended up defining my necessary arrays in the registry and using those to do calculations with. Until I figure out why that might be, that would be my recommendation.

  2. Modify WRF source.

    Since I am adding a new set of diagnostics to WRF, I need to make sure that my new output variables are properly initialized. To add variables to WRF, modifying the registry is necessary. In the ./Registry directory, there is a file called Registry.EM. In this file you will see several existing include statements. During the compile step, these statements bring in other registry files to be used in pre-processing the Fortran code. What I chose to do was make my own registry file called ./Registry/Registry.EM_mymodule. From there I added an include statement to Registry.EM for my newly created registry definitions. Again, this allowed me to leave as much of the original WRF code unmodified as possible. What you need to do is add a state registry entry so that your variable gets included. The easiest thing to do is look through the registry for another variable that is similar to the one you want to add and copy/modify that entry to your liking. For a description of the registry entries and some examples of how to add your own entries, look at the WRF tutorials and registry description that I link at the end of the post.
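
    For reference, the include statement is a single line. A minimal sketch, assuming the file name used above and that the line sits next to the include lines already present in Registry.EM:

    ```text
    # Added to ./Registry/Registry.EM, alongside the existing include lines
    include Registry.EM_mymodule
    ```

    Keeping all of the new definitions in the included file means that only this one line of the stock registry changes.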

    As I mentioned earlier, I added several variables to the WRF registry that I used for my calculations. What I did not do, though, is have them all output. You can define any number of arrays and simply define the output as “-” to indicate that it will not be included in any output stream. This ultimately let me easily and correctly do my calculations without dealing with the local array issue or taking any more of an I/O hit during integration. Below you will find an example state registry entry. I have defined a couple entries here. The first entry is what I plan on outputting during model integration. The second entry is an array defined to store some intermediate calculations and is not output at all (see IO column).

    #ENTRY  TYPE    SYM     DIMS        USE         TLEV        STAG    IO  DNAME           DESCRIP                         UNITS
    state   real    myVar   ij          misc        1           -       h   "MYVAR"         "Maximum estimated hail size"   "mm"
    state   integer calc1   ikj         misc        1           -       -   "CALC1"         "My temp calc array"            "J/kg"

    Another registry entry that I found useful was the rconfig entry. The rconfig entry allows you to add namelist options that will do things like turn on your diagnostic calculations or tune some parameters in your equations. As an example, I added an on/off switch in the namelist for my module:

    #ENTRY          TYPE    SYM             HOW SET                 NENTRIES        DEFAULT 
    rconfig         integer myVar_opt       namelist,myModule       1               0

    What I can then do is add my own section to the default WRF namelist. The group name matches the one given in the HOW SET column of the rconfig entry:

    &myModule
     myVar_opt    =    1,
    /

    Now that the variables have been created, how do you use them? Access to the namelist variables is given through the grid and config_flags variables in WRF. These are derived types defined in the WRF code, and they are passed to a great many of the subroutines in the base code. Modeling your diagnostic module after AFWA, as I did, will already place the appropriate derived type declarations in your module. Just recall that accessing a component of a Fortran derived type requires the “%” operator. Here is an example:

    IF (config_flags%myVar_opt == 1) THEN
      ! Do some stuff
    END IF

    ! Same result as above
    IF (grid%myVar_opt == 1) THEN
      ! Do the same stuff
    END IF

    Once the module and registry entries have been created, you will also need to place calls to the appropriate subroutines. In my case, I used a driver subroutine that receives all the necessary variables and calls the calculation routines. The most likely place you will want to call your own diagnostics is in the ./dyn_em/solve_em.F file. The file contains all the calls to subroutines that advance the model by one time step. Much like the AFWA diagnostics, I placed my call near the end of the solve subroutine to be sure that the diagnostics are not called until the model has been advanced forward in time and the necessary variables are available. While the CALL statement is important, it is equally important to include your module via a USE statement at the top of solve_em.F. The main thing to remember is to make sure that the results of your calculations are placed into the grid variable that will be output at the history interval.
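
    The USE/CALL wiring can be sketched outside of WRF as follows. Everything here is a mock-up: grid_t and config_t are hypothetical stand-ins for WRF's real derived types, the driver body is a placeholder calculation, and the program plays the role of the solve routine.

    ```fortran
    module my_custom_diagnostic
      implicit none
      private
      public :: grid_t, config_t, my_diagnostic_driver

      ! Hypothetical stand-in for the namelist options (rconfig entries)
      type :: config_t
        integer :: myVar_opt = 0
      end type config_t

      ! Hypothetical stand-in for the grid, holding one registry state variable
      type :: grid_t
        real, allocatable :: myVar(:,:)
      end type grid_t

    contains

      subroutine my_diagnostic_driver(grid, config_flags)
        type(grid_t),   intent(inout) :: grid
        type(config_t), intent(in)    :: config_flags

        ! Honor the on/off switch from the namelist
        if (config_flags%myVar_opt /= 1) return

        ! Placeholder calculation; the result is stored in the grid variable
        grid%myVar = grid%myVar + 1.0
      end subroutine my_diagnostic_driver

    end module my_custom_diagnostic

    program solve_sketch
      ! The USE statement goes at the top of the calling routine
      use my_custom_diagnostic, only: grid_t, config_t, my_diagnostic_driver
      implicit none
      type(grid_t)   :: grid
      type(config_t) :: config_flags

      allocate(grid%myVar(3, 2))
      grid%myVar = 0.0
      config_flags%myVar_opt = 1

      ! ... the model would advance one time step here ...

      ! Diagnostics are called last, once the updated fields are available
      call my_diagnostic_driver(grid, config_flags)
      print *, 'myVar(1,1) after driver call:', grid%myVar(1, 1)
    end program solve_sketch
    ```

    In WRF itself the two types already exist, so only the USE line and the CALL near the end of the solve routine are new in solve_em.F.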

  3. Modify makefile recipes.

    Since you are adding a module, it is not enough to just add the file and include the proper USE statements. The makefiles that WRF uses to build the executables need to know where these new files exist to compile and link them properly. For my example, this required modifying the Makefile in the ./phys and ./dyn_em directories. These directories correspond to the locations of the new diagnostic module and the updated solve_em.F file, respectively.

    First, change the Makefile in the ./phys directory. Within that file you will see a make list variable called MODULES defined. As you have already guessed, this tells the makefile what modules will be built in this directory. Add your module here in the same way as the others that exist there. Make derives the object name from the .F file name, so keep the file name and the name in the MODULE statement the same, as the existing WRF modules do. Make sure that you place the .o (object) extension at the end of the name, and make sure every line but the last has the “\” character after the name. The next thing to do in this file is to add a recipe for your module file. I used the AFWA recipe as a guide. There is a recipe for each entry in the list that you just modified near the top of the file. The important thing in this step is to make sure the recipe includes all the modules (using their relative locations) that appear in the USE section of your newly created module. My recipe entry looked something like this:

    module_mesh.o: \
                    ../frame/module_comm_dm.o \
                    ../frame/module_state_description.o \
                    ../frame/module_configure.o \

    After modifying the Makefile in ./phys, you need to modify the Makefile in ./dyn_em. You need to do this because you added your module to the USE section of solve_em.F. This time it will be a little easier since the recipe for solve_em.o is already there. You just need to add the relative location of your new module to the recipe. After doing this, you should be set to compile. GNU make is pretty good about telling you when and why something has gone wrong. Pay attention to error messages and make any necessary corrections.
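
    The two remaining Makefile edits can be sketched as below. This is only an outline of the shape: module_existing_entry.o is a placeholder for whatever already sits at the end of the list in your WRF version, the real MODULES list is far longer than shown, and module_mesh.o is the object name from the recipe above.

    ```makefile
    # ./phys/Makefile: tail of the MODULES list with the new object appended.
    # The former last entry gains a trailing backslash; the new last entry has none.
    MODULES = \
            module_existing_entry.o \
            module_mesh.o

    # ./dyn_em/Makefile: the new object joins the prerequisites of solve_em.o.
    # GNU make merges prerequisites from several rules for the same target,
    # so this can also be written as its own line.
    solve_em.o: ../phys/module_mesh.o
    ```

    If the build later fails with a missing .mod file while compiling solve_em.F, a missing prerequisite of this kind is a likely cause, since make may then compile solve_em.F before the module.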

Other considerations. I want to touch on a few other things I learned along the way but did not have to implement in my own module. An important one is knowing when a halo registry entry is necessary. Halo entries essentially define the stencil for a particular equation; that is, the points in space or time that it needs. WRF’s domain gets decomposed into smaller pieces in the horizontal but not the vertical. What this means is that halo entries are only necessary when a calculation needs horizontal points other than the one being solved for. Specifically, the halo entry ensures that each processor has access to points that live in other processors’ memory during parallel processing. Not having had to define a halo entry, I cannot say much more about it. Just know when you do and do not need one.
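
For orientation only, a halo entry in the registry has roughly the shape below. This is an untested sketch: the name HALO_EM_MYDIAG is hypothetical, myVar is the state variable defined earlier, and the 8 before the colon is assumed to select the 8 points surrounding each grid point (a 3x3 stencil), following the pattern of the existing halo entries in the stock registry files.

```text
# Hypothetical halo entry: exchange an 8-point (3x3) stencil of myVar
#      NAME             CORE     STENCIL:VARIABLES
halo   HALO_EM_MYDIAG   dyn_em   8:myVar
```

Existing WRF code pulls in the generated exchange with an include of the matching .inc file (here it would be HALO_EM_MYDIAG.inc) guarded by #ifdef DM_PARALLEL, placed before the calculation that reads the neighboring points. Check an existing halo in solve_em.F before relying on this.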

Do you want your diagnostic calculations to be output with the initial history write? That is something I asked myself. I ultimately decided not to output mine at the initial time, but while I was deciding I did implement versions of my code that included my calculations in the initial history file. To get your calculation into the initial history file, you need to call it from within the initialization routines. For my case this was real_em.F. You would again have to add your module via a USE statement in real_em.F and then modify the Makefile in ./main to include your module object in the real_em.o recipe. The other thing you have to modify is the registry entry for your variable so that it also gets included in the initialization files (i.e., wrfinput_*). That just requires adding the “i” flag where you already added the “h” flag for your variable. If you only include your calculation in solve_em.F, the initial history write will still contain your variable, but it will most likely be all zeros.
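
As a sketch of that registry change, here is the myVar entry from earlier with only the IO column altered, so that the variable goes to both the input and history streams:

```text
#ENTRY  TYPE    SYM     DIMS        USE         TLEV        STAG    IO  DNAME           DESCRIP                         UNITS
state   real    myVar   ij          misc        1           -       ih  "MYVAR"         "Maximum estimated hail size"   "mm"
```

Because the registry is processed at build time, a change like this only takes effect after WRF is rebuilt.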

Finally, here are some useful resources. Some of them are older, but they still apply to how WRF operates today and give you a better idea of how WRF works.

WRF Tutorials: A page with several years' worth of WRF tutorial presentations. Of particular interest are those dealing with the model architecture, parallelism, and the registry.
WRF Registry Description: A more detailed look at the WRF registry entries.
WRF Call Tree: An older (for WRF v2), but useful, description of the flow of WRF.
WRF Module Descriptions: Older (v2). Describes the modules of WRF. Useful for understanding the call tree.
WRF Main Program Description: Older (v2). Describes in more detail the workings and flow of the WRF main program.
Adding Microphysics Variable: Another how-to on adding a variable. From Utah AtmoSci graduate student John McMillen.