❓Frequently Asked Questions (FAQ)
General
Q: Which Illumina Infectious Disease and Microbiology target-capture enrichment panel kits are compatible with the DRAGEN Microbial Enrichment Plus app?
A: RPIP, UPIP, RVOP/RVEK, VSP, VSP V2, and Custom infectious disease and microbiology enrichment panels. To analyze the Pan-Coronavirus (Pan-CoV) panel, a custom coronavirus reference sequence database may be specified. The DME+ app is not intended for use with non-infectious disease enrichment panels (such as human exome).
Q: Can I analyze the Pan-Coronavirus (Pan-CoV) panel here?
A: The only infectious disease and microbiology enrichment panel without a pre-set DME+ database is the Pan-CoV panel. To analyze Pan-CoV enriched data with the DME+ app, select "Custom Panel" under the "Enrichment Panel" drop-down list and specify a custom coronavirus reference sequence database. Alternatively, we recommend using the DRAGEN Targeted Microbial app.
Q: What does it cost to analyze samples using the DRAGEN Microbial Enrichment Plus app?
A: A Basic BaseSpace Sequence Hub (BSSH) user account is required to access the DME+ app. There is no subscription cost for a Basic BSSH account and no compute cost to run the DME+ app. A Basic BSSH account provides 1 TB of free storage. Additional storage may require iCredits.
Q: Where do I upload my custom reference FASTA and/or BED file?
A: Upload these files to a BSSH project before launching the DME+ app. The files can then be selected in the "Select Dataset File(s)" browser in the app. See the general guidelines for how to upload data to BaseSpace, and contact techsupport@illumina.com with any unresolved upload issues.
Panel Content & Design
Q: Is my viral subtype of interest captured by the VSP V2 panel?
A: See the "Virus Types Captured" column of the "Microorganisms" table in the VSP V2 Panel Summary.
Q: Was VSP V2 designed using contemporary viral genomes or against traditional reference strains only?
A: The VSP V2 viral genome sourcing approach aimed to be as inclusive and comprehensive as possible for the 200 targeted human viruses. All viral genomes that passed quality filters and were available as of June 2023 were included in the design, including recombinant and vaccine strains.
Q: How much of the genome is targeted by the RPIP, UPIP, RVOP/RVEK, VSP, and VSP V2 panels?
A: The full viral genome is targeted for all RVOP/RVEK, VSP, and VSP V2 viruses. For RPIP viruses, see the "Percent Genome Targeted" column of the "Microorganisms" table in the RPIP Panel Summary. No more than ~1% of bacterial, fungal, and parasitic genomes are targeted by RPIP or UPIP.
Analysis Options & Settings
Q: I am using the "Custom panel specification" option and my custom analysis aborted or shows an error, why?
A: While there are many possible reasons, one of the most common causes is that the custom database was not formatted correctly. Below are requirements for the custom reference FASTA and (optional) BED file:
Do not exceed the file size limitation: 10 million bases
Do not include duplicate entries
Do not use spaces in the file name; instead use an underscore "_"
File extension must be .fasta or .fa for custom reference FASTA file and .bed for custom reference BED file
If providing a custom reference BED file, the names in the first column of the BED file (chrom) must match the names that appear in the FASTA file (text after > and before the first whitespace character).
See Custom reference FASTA and BED files for further details.
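The requirements above can be checked locally before upload. The following is a minimal sketch, not part of the DME+ app: the function names are hypothetical, "duplicate entries" is interpreted as repeated sequence names (an assumption), and the extension check is case-sensitive (also an assumption).

```python
from collections import Counter
from pathlib import Path

MAX_BASES = 10_000_000  # file size limitation listed above: 10 million bases


def fasta_names_and_bases(fasta_text):
    """Return (sequence names, total base count) for FASTA text."""
    names, total = [], 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Name = text after '>' and before the first whitespace character.
            parts = line[1:].split()
            names.append(parts[0] if parts else "")
        else:
            total += len(line)
    return names, total


def check_custom_reference(fasta_name, fasta_text, bed_name=None, bed_text=None):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if " " in fasta_name:
        problems.append("FASTA file name contains spaces")
    if Path(fasta_name).suffix not in {".fasta", ".fa"}:
        problems.append("FASTA extension must be .fasta or .fa")

    names, total = fasta_names_and_bases(fasta_text)
    if total > MAX_BASES:
        problems.append(f"FASTA has {total} bases (limit {MAX_BASES})")
    # Assumption: "duplicate entries" means repeated sequence names.
    duplicates = sorted(n for n, count in Counter(names).items() if count > 1)
    if duplicates:
        problems.append(f"duplicate FASTA entries: {duplicates}")

    if bed_text is not None:
        if bed_name is not None:
            if " " in bed_name:
                problems.append("BED file name contains spaces")
            if Path(bed_name).suffix != ".bed":
                problems.append("BED extension must be .bed")
        known = set(names)
        # First BED column (chrom) must match a FASTA sequence name.
        chroms = {line.split()[0] for line in bed_text.splitlines() if line.strip()}
        missing = sorted(chroms - known)
        if missing:
            problems.append(f"BED chrom names not in FASTA: {missing}")
    return problems
```

For example, `check_custom_reference("my_ref.fasta", fasta_text, "my_ref.bed", bed_text)` returns an empty list when none of the listed problems are found. It does not guarantee that the analysis will succeed.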
Q: I am using the "User-defined specification" option. I am not seeing the microorganisms I expect to be there AND/OR I am seeing microorganisms that I do not want to see.
A: Ensure that the correct microorganism reporting file was uploaded and used. We recommend saving the updated microorganism reporting file with a new name. Rows with microorganism names that are not of interest can be deleted, but do not add any new columns or delete any columns from the provided template. Similarly, do not change or remove any text from the header row. Also, please note that the "kmer_read_count" metric is only valid with the UPIP panel. See Microorganism Reporting File format for further details.
Q: What read QC (Quality Control) is performed by the DRAGEN Microbial Enrichment Plus app?
A: If enabled, low-quality bases are trimmed from the ends of each read. After trimming, the read is discarded if fewer than 50% of its bases have a quality score greater than or equal to Q20, the read is shorter than 32 bp, or the read has 5 or more ambiguous bases. It is assumed that appropriate adapter trimming has already been performed.
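The post-trimming discard rule can be illustrated with a short sketch. This is not the app's implementation: the function name is hypothetical, quality scores are assumed to be already decoded Phred integers, and "ambiguous bases" is assumed to mean "N" characters.

```python
def passes_read_qc(sequence, qualities, min_q=20, min_fraction=0.5,
                   min_length=32, max_ambiguous=4):
    """Return True if a trimmed read would be kept under the rule described above."""
    if len(sequence) < min_length:
        return False  # shorter than 32 bp
    if sequence.upper().count("N") > max_ambiguous:
        return False  # 5 or more ambiguous bases (assumed to be 'N')
    high_quality = sum(1 for q in qualities if q >= min_q)
    # Keep only if at least 50% of bases have quality >= Q20.
    return high_quality >= min_fraction * len(qualities)
```

A 40 bp read with exactly half of its bases at Q20 or above is kept under this reading of the rule, because only reads with fewer than 50% such bases are discarded.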
Q: What does "Read classification sensitivity" mean in the settings for RVOP/RVEK, VSP, and VSP V2?
A: This setting is used as a pre-alignment filtering step for all viral whole-genome sequencing (WGS) panels. The default setting of 5 means that if fewer than 5 reads classify to the set of reference sequences belonging to a given virus, that virus will not be reported. If 5 or more reads classify to that set of reference sequences, read alignment proceeds and alignment-based thresholds determine whether the virus is reported. The read classification sensitivity can be set as low as 1 or as high as 1000. Lowering the threshold below 5 may significantly increase computational run time and is not recommended for most use cases.
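As a rough illustration of this pre-alignment filter, the sketch below keeps only viruses whose classified read count meets the threshold. The function name and the input dictionary are hypothetical, and the later alignment-based thresholds are not modeled.

```python
def viruses_passing_prefilter(classified_read_counts, sensitivity=5):
    """Return the set of viruses that would proceed to read alignment.

    classified_read_counts: hypothetical mapping of virus name to the number
    of reads classified to that virus's reference sequences.
    """
    if not 1 <= sensitivity <= 1000:
        raise ValueError("sensitivity must be between 1 and 1000")
    return {virus for virus, count in classified_read_counts.items()
            if count >= sensitivity}
```

With the default of 5, a virus with 4 classified reads is dropped before alignment, while one with exactly 5 proceeds.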
Q: When is a Pangolin analysis run?
A: Pangolin is currently enabled for all enrichment panels except UPIP. For Custom Panel analyses, Pangolin is enabled and will run on custom reference sequences with at least 3% coverage that meet these naming conventions:
If only a FASTA file is provided, Pangolin will run on sequences that have a header containing either SARS-CoV-2 or NC_045512
If both a FASTA and BED file are provided, Pangolin will run on sequences where the first column (chrom) contains NC_045512 or the fourth column (genomeName) contains SARS-CoV-2
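The naming conventions above can be expressed as a small check. This is a sketch under stated assumptions, not the app's code: the function and parameter names are hypothetical, matching is assumed to be case-sensitive substring matching, and coverage is passed in as a percentage.

```python
def pangolin_eligible(fasta_header, coverage_percent,
                      bed_chrom=None, bed_genome_name=None):
    """Return True if a custom reference sequence matches the conventions above.

    bed_chrom / bed_genome_name: first and fourth BED columns for this
    sequence, or None when no BED file is provided.
    """
    if coverage_percent < 3.0:
        return False  # requires at least 3% coverage
    if bed_chrom is None:
        # FASTA only: the header must contain one of the two identifiers.
        return "SARS-CoV-2" in fasta_header or "NC_045512" in fasta_header
    # FASTA and BED: check the chrom and genomeName columns instead.
    return "NC_045512" in bed_chrom or "SARS-CoV-2" in (bed_genome_name or "")
```

For example, a FASTA-only reference with the header `>NC_045512.2` and 3% coverage is eligible under this reading, whereas the same header at 2.9% coverage is not.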
Q: When is a Nextclade analysis run?
A: When enabled, a Nextclade analysis using the specified dataset(s) is run for the following microorganisms, as applicable:
Q: What Internal Control (IC) options are supported and what additional information does using an IC provide?
A: The RPIP, UPIP, and VSP V2 enrichment panels contain probes targeting commercially available Internal Controls. See the table below for Internal Control options compatible with RPIP, UPIP, and VSP V2. It is recommended to spike each sample prior to extraction with Enterobacteria phage T7 at 1.21 x 10^7 copies/mL of sample.
*Quantitative Internal Control concentration must be provided
Q: What are the DRAGEN Microbial Enrichment Plus app settings related to consensus sequence generation and variant calling?
A: See the table below. Consensus sequence bases without aligned read support are indicated by "N" bases.
Reporting
Q: I don't see the microbe I'm interested in listed in the reported microorganism summary. Does that mean my microbe of interest is not present?
A: Not necessarily. The microbe of interest may be present in the sample, but the DME+ app may not have reported it because the detection metrics fell below the default reporting thresholds. If it is suspected that this may be the case, select the "Report microorganisms and/or AMR markers that are below threshold" option. A user-defined microorganism reporting file can also be specified on a microorganism-by-microorganism basis using multiple parameters should more sensitive reporting be required for a given use case. See Microorganism Reporting File format for further details.
Q: What is the default reporting threshold for a microorganism to be "predicted present" and make it into reports?
A: Multiple parameters are used to determine whether the sequencing data for a given microorganism is sufficient for a positive call. These may include the horizontal coverage, median read depth, normalized read count, average nucleotide identity, and other metrics for the microorganism and/or other genetically related microorganisms. The default reporting thresholds differ between microorganisms, as microorganisms with close genetic neighbors generally require more stringent reporting thresholds than genetically distinct microorganisms. As with most tests and prediction algorithms, the default reporting thresholds are intended to balance the trade-off between analytical sensitivity and specificity. Should a given use case require more sensitive or specific reporting, a user-defined microorganism reporting file can be specified on a microorganism-by-microorganism basis using multiple parameters. See Microorganism Reporting File format for further details. Additionally, the "Report microorganisms and/or AMR markers that are below threshold" option can be enabled.
Q: Are low-coverage, median depth 0 microorganisms actually in the sample, or are they artifacts?
A: Mathematically, any result with a horizontal coverage of <50% will have a median depth of 0 (50% or more of the nucleotide positions have a depth of 0). Low coverage results could represent true low positives (the most likely reason) or non-specific results, contamination, etc. If maximum confidence is required for a given use case, stricter microorganism reporting thresholds can be specified on a microorganism-by-microorganism basis using multiple parameters. See Microorganism Reporting File format for further details.
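The relationship between horizontal coverage and median depth can be seen with a small worked example. The helper name and the per-position depth list are hypothetical, used only for illustration.

```python
from statistics import median


def horizontal_coverage_and_median(depths):
    """Return (percent of positions with depth > 0, median depth)."""
    covered = sum(1 for d in depths if d > 0)
    return 100.0 * covered / len(depths), median(depths)


# 100 positions: 40 covered at depth 12, 60 with no reads.
low = horizontal_coverage_and_median([12] * 40 + [0] * 60)

# 100 positions: 70 covered at depth 5, 30 with no reads.
high = horizontal_coverage_and_median([5] * 70 + [0] * 30)
```

In the first case more than half of the positions have depth 0, so the median is 0 even though the covered positions have a depth of 12. In the second case the median equals the depth of the covered positions.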
Q: What is tiered reporting logic, which viruses are reported as part of a tiered reporting group, and why should I care?
A: See the "Has Tiered Reporting" and "Reporting Tier" columns of the "Microorganisms" table in the Panel Summary for RPIP, RVOP/RVEK, VSP, and VSP V2 to see which viruses are reported as part of a tiered reporting group. Membership in a tiered reporting group means that a hierarchical relationship is pre-built into the database and the most granular tier level passing reporting thresholds is reported. For example, if Influenza B virus (B/Victoria/2/87-like) or Influenza B virus (B/Yamagata/16/88-like) is reported in a sample, then the less granular Influenza B virus reporting name will NOT be reported. Tiered reporting group membership is especially relevant when specifying a user-defined microorganism reporting file, as including the entire tiered reporting group is necessary to preserve tiered reporting logic.
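The "most granular tier wins" behavior can be sketched as follows. This is an illustration only: the function name and the `parent_of` mapping are hypothetical and do not reflect how the DME+ database stores tier relationships.

```python
def apply_tiered_reporting(passing, parent_of):
    """Drop less granular names when a more granular group member passes.

    passing: reporting names that passed reporting thresholds.
    parent_of: hypothetical mapping of a reporting name to its less
    granular parent within the same tiered reporting group (acyclic).
    """
    suppressed = set()
    for name in passing:
        parent = parent_of.get(name)
        while parent is not None:
            suppressed.add(parent)  # a more granular tier passed
            parent = parent_of.get(parent)
    return [name for name in passing if name not in suppressed]


parent_of = {
    "Influenza B virus (B/Victoria/2/87-like)": "Influenza B virus",
    "Influenza B virus (B/Yamagata/16/88-like)": "Influenza B virus",
}
reported = apply_tiered_reporting(
    ["Influenza B virus", "Influenza B virus (B/Victoria/2/87-like)"],
    parent_of,
)
```

Here `reported` contains only the B/Victoria lineage name. If the lineage-level rows were missing from the mapping, the less granular name would be reported instead, which illustrates why keeping the entire tiered reporting group together preserves the logic.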
Q: How can I evaluate DRAGEN Microbial Enrichment Plus microorganism absolute quantification results?
A: To evaluate microorganism absolute quantification results, it is recommended to perform experiments using the relevant sample type and full sequencing workflow (including extraction) and to compare results obtained from the DME+ app with those from droplet digital PCR (ddPCR) and/or quantitative PCR (qPCR) assays. A per-microorganism absolute quantification correction factor can be applied to DME+ results as needed.
Q: I noticed some antimicrobials listed that do not usually get used in clinical environments - is this expected?
A: Yes. Not all antimicrobials and drug classes that are listed may be relevant. Detected AMR markers may also confer resistance to antimicrobials and drug classes that are not listed. Linkage between bacterial AMR marker, antimicrobial, and drug class is based on the Comprehensive Antibiotic Resistance Database (CARD, version 3.2.8) from McMaster University, ResFinder (version 2.2.1), NCBI Reference Gene Catalog (version 2023-09-26.1), EUCAST expert rules on indicator agents (2019-2023), and CLSI Performance Standards for Antimicrobial Susceptibility Testing (M100 34th Edition). Linkage between viral AMR marker, antimicrobial, and drug class is based on the publications provided in the JSON report - see the PubMed IDs (pmids) field.
Results & Output Files
Q: Most of my reads are untargeted reads. Is enrichment working?
A: Enrichment is expected to yield 100-1000X more targeted reads, and correspondingly higher sensitivity, than a shotgun (pre-enrichment) library. However, for complex samples or samples in which most of the nucleic acid is host or otherwise untargeted, targeted reads will typically still represent only a minority of the overall sequencing reads. Notably, RPIP, UPIP, and VSP V2 support various Internal Control options that can be spiked into samples prior to extraction to enable automated calculation of an enrichment factor sample QC metric.
Q: Is any typing information included for my virus of interest?
A: See the "Has Tiered Reporting" and "Lineage/Clade Prediction" columns of the "Microorganisms" table in the Panel Summary for RPIP, RVOP/RVEK, VSP, and VSP V2. A consensus sequence and best match reference accession are also provided for RPIP, RVOP/RVEK, VSP, and VSP V2 viruses. It may be possible to infer subtype information from the consensus sequence (e.g. by BLAST) or from the best match reference accession (if annotated in NCBI). The consensus sequence can also be used as input to downstream viral typing tools.
Q: The % Targeted Microbial Reads is not exactly equal to the sum of microorganism Aligned Read Count values, why?
A: The % Targeted Microbial Reads is calculated using a kmer-based classification approach that is intended to give a quick, high-level overview of sample composition. The Aligned Read Count values for microorganisms are calculated in a separate pipeline step using microorganism-specific reference sequence alignment as opposed to broad, categorical, kmer-based classification. Reads that were unclassified or that were classified as low-complexity or ambiguous may actually align to reference sequences. It is also possible for a read to align to a reference sequence of more than one microorganism, for example in a conserved region.
Q: How can I verify or compare results of the DRAGEN Microbial Enrichment Plus app to previously used apps (such as DRAGEN Targeted Microbial)?
A: FASTQ files previously run through other apps can be re-analyzed using the DME+ app. Results from other apps may not be identical to results from the DME+ app, most notably because of the expanded databases used in DME+.
Q: The Reference Coverage section of the HTML report only shows coverage plots for viral genomes. Why doesn't it show the plots for bacterial genomes and/or for viral targeted regions?
A: Viral genomes are orders of magnitude smaller and thus computationally much "cheaper" to align to than bacterial, fungal, and parasitic genomes. In the case of RVOP/RVEK, VSP, and VSP V2, the full viral genome is targeted for all viruses. For RPIP viruses, see the "Percent Genome Targeted" column of the "Microorganisms" table in the RPIP Panel Summary. While not visualized in the HTML report at this time, the DME+ Report JSON does contain coverage depth vector information for all microorganism targeted regions (viruses, bacteria, fungi, and parasites). See: .targetReport.microorganisms[].condensedDepthVector[], which is the read depth across the targeted microorganism reference sequences, condensed (if needed) into 256 bins.
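The depth vectors can be pulled from the report JSON with a few lines of Python. This is a minimal sketch: only the path `.targetReport.microorganisms[].condensedDepthVector[]` comes from the text above, the function name is hypothetical, and the vector elements are assumed to be numeric depths.

```python
import json


def binned_coverage(report):
    """Yield (index, fraction of bins with depth > 0) per microorganism.

    report: the DME+ Report JSON already parsed into a dict.
    Entries without a condensedDepthVector are skipped.
    """
    organisms = report.get("targetReport", {}).get("microorganisms", [])
    for index, organism in enumerate(organisms):
        vector = organism.get("condensedDepthVector")
        if not vector:
            continue
        # Assumption: each element is a numeric depth for one bin.
        covered = sum(1 for depth in vector if depth > 0)
        yield index, covered / len(vector)


# Tiny in-memory stand-in for a real report file.
example = json.loads(
    '{"targetReport": {"microorganisms": ['
    '{"condensedDepthVector": [0, 3, 5, 0]}, {}]}}'
)
summary = list(binned_coverage(example))
```

For a real report, the dict would come from `json.load` on the downloaded report file. The binned fraction is a coarse view and is not the same as the horizontal coverage reported by the app.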