Understanding & creating IRMA modules

Modules provide IRMA with all the data and configuration needed to assemble a particular virus genome. Reference libraries are never passed to IRMA on the command line; rather, they are invoked implicitly. So long as the module is well constructed, its reference library should rarely, if ever, need to change. This page explains the directory structure of IRMA, the layout of IRMA modules, and how to create a new IRMA module.



IRMA directory structure

install-path/
├── IRMA_RES/
│   ├── modules/
│   │   ├── EBOLA/
│   │   │   ├── config/
│   │   │   │   ├── EBOLA-fast.sh
│   │   │   │   └── EBOLA.sh
│   │   │   ├── profiles/
│   │   │   │   ├── CIEBOV_hmm.mod
│   │   │   │   ├── EBOV_BDBV_hmm.mod
│   │   │   │   ├── LLOV_hmm.mod
│   │   │   │   ├── MARV_hmm.mod
│   │   │   │   ├── REBOV_hmm.mod
│   │   │   │   ├── SEBOV_hmm.mod
│   │   │   │   └── ZEBOV_hmm.mod
│   │   │   ├── reference/
│   │   │   │   └── consensus.fasta
│   │   │   └── init.sh
│   │   └── FLU/
│   │       ├── config/
│   │       │   ├── FLU-alt.sh
│   │       │   ├── FLU-avian.sh
│   │       │   ├── FLU-debug.sh
│   │       │   ├── FLU-fast.sh
│   │       │   ├── FLU-lowQC.sh
│   │       │   ├── FLU-pacbio.sh
│   │       │   ├── FLU-pgm.sh
│   │       │   ├── FLU-ref.sh
│   │       │   ├── FLU-roche.sh
│   │       │   ├── FLU.sh
│   │       │   └── FLU-utr.sh
│   │       ├── profiles/
│   │       │   ├── A_PB2_hmm.mod
│   │       │   ├── A_PB1_hmm.mod
│   │       │   └── ...
│   │       ├── reference/
│   │       │   ├── consensus.fasta
│   │       │   └── FLU-debug.fasta
│   │       └── init.sh
│   ├── ppath/
│   ├── scripts/
│   │   ├── packaged-citations-licenses/
│   │   └── (R, Perl, pre-packaged binaries, etc.)
│   └── defaults.sh
├── LABEL_RES/scripts/(scripts used by IRMA)
├── IRMA
└── LABEL
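
Given the layout above, a short bash sketch can list which modules an installation currently holds. It assumes that every module folder under IRMA_RES/modules/ contains an init.sh, as FLU and EBOLA do in the tree; the function name list_modules and the variable IRMA_HOME (standing in for install-path) are hypothetical and not part of IRMA.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of IRMA: print the name of every folder
# under IRMA_RES/modules/ that contains an init.sh.
list_modules() {
    local root="$1/IRMA_RES/modules" dir
    for dir in "$root"/*/; do
        if [ -f "${dir}init.sh" ]; then
            basename "$dir"
        fi
    done
}

# IRMA_HOME is an assumed variable standing in for "install-path".
IRMA_HOME="${IRMA_HOME:-/install-path}"
list_modules "$IRMA_HOME"
```

Folders without an init.sh are skipped, so stray directories are not reported as modules.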



Modules are stored in the IRMA_RES/modules/ folder, which contains our default modules FLU and EBOLA at the time of this writing. To create a new module:

  1. Download our module template
  2. Unzip the template, and put the ORG folder in the IRMA_RES/modules/ folder. The full path would then be: /install-path/IRMA_RES/modules/ORG
  3. Rename ORG to the name of your organism. We recommend something short and easy to remember. Do not use whitespace.
  4. You must provide a reference library to act as a seed for the iterative assembly (IRMA currently does not invoke de novo assembly). We recommend taking a plurality consensus sequence for each gene segment, genome, and/or lineage/subtype. Place these sequences in the "consensus.fasta" file within the reference/ sub-folder.
  5. Open your init.sh and edit your module-specific parameters. If you use a FASTA file other than the consensus.fasta above, you must either give an absolute path that IRMA can access (REF_SET="/path/to/file") or place the file in the reference/ folder and give its name (REF_SET="myfile").
  6. *If you have competing lineages or subtypes, you may specify a group list (comma-delimited, no spaces, e.g., SORT_GROUPS="PB1,PB2,PA") to enable sorting into primary and secondary data for competing lineages of the same gene. Your reference library headers must also contain unique substrings associated with each group. For example, for a group called "PB2", sequence headers "A_PB2" and "B_PB2" would be grouped. If no groupings are specified, only data below the minimum read count will be sorted into secondary data. Finally, if you set SORT_GROUPS="__ALL__", then only the maximum hit for all reference sequences will be grouped into primary data.
  7. *The config/ subfolder contains run-specific configuration files. You can create new ones by following the pattern "ORG-configname.sh" or by naming the file configname.sh. Sample configuration files are already provided in the template.
  8. *If you are using SAM to perform rough alignment, you must create profile HMMs corresponding to each gene or genome in your reference library and place them in your profiles/ folder. The modelfromalign program for SAM is packaged with LABEL in its LABEL_RES/scripts/ folder. It converts a multiple sequence alignment file into an HMM profile. You must name your files using the pattern NAME_hmm.mod, so "A_PB2" would have a profile in the profiles/ folder called A_PB2_hmm.mod.
  9. *If you would like to use LABEL to sort reads, please visit the LABEL homepage and contact us for advice.
  10. You are now ready to test your module on data!
*Optional
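
Steps 2 and 3 above can be sketched in bash as a single copy-and-rename. This is an illustration only: install_module is a hypothetical helper, not something shipped with IRMA, and it assumes the template has already been unzipped so that the ORG folder exists on disk.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of IRMA: copy an unzipped template folder
# (e.g. ORG) into the modules directory under a new module name.
install_module() {
    local template_dir="$1" modules_dir="$2" name="$3"
    # Step 3: the module name must be non-empty and contain no whitespace.
    case "$name" in
        ""|*[[:space:]]*)
            echo "invalid module name: '$name'" >&2
            return 1
            ;;
    esac
    # Refuse to overwrite an existing module such as FLU or EBOLA.
    if [ -e "$modules_dir/$name" ]; then
        echo "module already exists: $modules_dir/$name" >&2
        return 1
    fi
    # Steps 2-3: place the template under modules/ with its new name.
    cp -R "$template_dir" "$modules_dir/$name"
}
```

A call might look like: install_module ./ORG /install-path/IRMA_RES/modules MYVIRUS, where MYVIRUS is a placeholder name. The reference library and init.sh (steps 4 and 5) still have to be filled in by hand afterwards.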

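The naming rules in steps 6 and 8 can be checked before a first test run. The bash sketch below is not part of IRMA; its function names are hypothetical, and it assumes that the NAME in NAME_hmm.mod is the first whitespace-delimited token of each header in consensus.fasta, which may not hold for every module (a single profile can cover several references).

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking a module; none of these are part of IRMA.

# Print the first whitespace-delimited token of every FASTA header line.
fasta_names() {
    sed -n 's/^>\([^[:space:]]*\).*/\1/p' "$1"
}

# Step 6: print each group from a comma-delimited list (as in SORT_GROUPS)
# that does not occur as a substring of any reference header.
unmatched_groups() {
    local fasta="$1" groups="$2" g
    [ "$groups" = "__ALL__" ] && return 0   # special value, not a substring
    local IFS=','
    for g in $groups; do
        fasta_names "$fasta" | grep -F -- "$g" >/dev/null || echo "$g"
    done
}

# Step 8: print each reference name that has no profiles/NAME_hmm.mod file.
# Assumes one profile per header name, which is a simplification.
missing_profiles() {
    local module="$1" name
    fasta_names "$module/reference/consensus.fasta" | while read -r name; do
        [ -f "$module/profiles/${name}_hmm.mod" ] || echo "$name"
    done
}
```

Both checks print nothing when the module is consistent. Any name that is printed points to a group with no matching header, or to a reference without a profile, that is worth a second look.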
This page last reviewed: Tuesday, November 19, 2019