
This function generates a set of visualizations to analyze AMR gene combinations, MIC values, and gene prevalence from a given binary matrix. It creates several plots, including MIC distributions, a bar plot for the percentage of strains per combination, a dot plot for gene combinations in strains, and a plot for gene prevalence. It also outputs a table summarising the MIC distribution (median, lower, upper) and number resistant, for each marker combination.

Usage

amr_upset(
  binary_matrix,
  min_set_size = 2,
  order = "",
  plot_set_size = FALSE,
  plot_category = TRUE,
  print_category_counts = FALSE,
  print_set_size = FALSE,
  boxplot_colour = "grey",
  assay = "mic"
)

Arguments

binary_matrix

A data frame containing the original binary matrix output from the get_binary_matrix function. Expected columns are an identifier (column 1, any name); 'pheno' (class sir, with S/I/R categories used to colour points); 'mic' (class mic, with MIC values to plot); and other columns representing gene presence/absence (binary coded, i.e. 1 = present, 0 = absent).
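As a rough illustration of this layout, the sketch below builds a small input by hand. The strain identifiers, marker names, and values are all made up, and the class conversion assumes the AMR package is installed; in practice the matrix would normally come from get_binary_matrix.

```r
# Toy input following the column layout described above (all values invented).
toy_matrix <- data.frame(
  id        = c("s1", "s2", "s3", "s4"),   # identifier column (any name)
  pheno     = c("S", "R", "R", "I"),       # S/I/R phenotype calls
  mic       = c("0.25", "4", "8", "1"),    # MIC values
  gyrA_S83L = c(0, 1, 1, 1),               # illustrative marker, 1 = present
  parC_S80I = c(0, 0, 1, 0),               # illustrative marker, 0 = absent
  stringsAsFactors = FALSE
)

# Convert to the 'sir' and 'mic' classes, assuming the AMR package is available.
if (requireNamespace("AMR", quietly = TRUE)) {
  toy_matrix$pheno <- AMR::as.sir(toy_matrix$pheno)
  toy_matrix$mic   <- AMR::as.mic(toy_matrix$mic)
}

str(toy_matrix)
```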

min_set_size

An integer specifying the minimum size for a gene set to be included in the analysis and plots. Default is 2. Only genes with at least this number of occurrences are included in the plots.
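The filtering described here can be pictured with a short base R sketch. It only illustrates the idea of dropping markers observed fewer than min_set_size times; the marker names and data are hypothetical, and this is not necessarily how amr_upset implements the filter internally.

```r
min_set_size <- 2  # same meaning as the argument described above

# Hypothetical presence/absence columns for three markers across five strains.
markers <- data.frame(
  geneA = c(1, 1, 0, 1, 0),
  geneB = c(0, 1, 0, 0, 0),
  geneC = c(1, 0, 1, 1, 1)
)

# Count how many strains carry each marker, then keep those meeting the threshold.
marker_counts <- colSums(markers)
kept <- markers[, marker_counts >= min_set_size, drop = FALSE]

names(kept)
```

In this toy data geneB is seen in only one strain, so it falls below the threshold and is dropped.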

order

A character string indicating the order of the combinations on the x-axis. Options are:
- "": Default (decreasing frequency of combinations).
- "genes": Order by the number of genes in each combination.
- "value": Order by the median assay value (MIC or disk zone) for each combination.
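To make the three orderings concrete, here is a minimal base R sketch on made-up data. The column names, the combination labels, and the helper order_names are assumptions for illustration only; the sort direction used for "genes" and "value" is ascending here, which the documentation above does not specify.

```r
# Hypothetical data: two markers and a numeric assay value per strain.
dat <- data.frame(
  geneA = c(1, 1, 1, 0, 0, 1),
  geneB = c(0, 0, 0, 0, 0, 1),
  value = c(2, 4, 4, 0.25, 0.5, 32)
)
marker_cols <- c("geneA", "geneB")

# Label each strain by its combination of markers, e.g. "1-0".
dat$combo <- apply(dat[, marker_cols], 1, paste, collapse = "-")

# Per-combination statistics used for ordering.
combo_freq   <- tapply(dat$value, dat$combo, length)
combo_ngenes <- tapply(rowSums(dat[, marker_cols]), dat$combo, max)
combo_median <- tapply(dat$value, dat$combo, median)

# Hypothetical helper: return combination labels sorted by a statistic.
order_names <- function(x, decreasing = FALSE) {
  names(x)[order(as.vector(x), decreasing = decreasing)]
}

order_default <- order_names(combo_freq, decreasing = TRUE)  # like order = ""
order_genes   <- order_names(combo_ngenes)                   # like order = "genes"
order_value   <- order_names(combo_median)                   # like order = "value"
```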

plot_set_size

Logical indicating whether to include a bar plot showing the set size (i.e. number of times each combination of markers is observed). Default is FALSE.

plot_category

Logical indicating whether to include a stacked bar plot showing, for each marker combination, the proportion of samples with each phenotype classification (specified by the 'pheno' column in the input file). Default is TRUE.

print_category_counts

Logical indicating whether, if plot_category is TRUE, to print the number of strains in each resistance category for each marker combination in the plot. Default is FALSE.

print_set_size

Logical indicating whether, if plot_set_size is TRUE, to print the number of strains with each marker combination on the plot. Default is FALSE.

boxplot_colour

Colour for the lines of the box plots summarising the MIC distribution for each marker combination. Default is "grey".

assay

A character string indicating whether to plot MIC or disk diffusion data. Options are:
- "mic": Plot MIC data, stored in column 'mic' of class 'mic'.
- "disk": Plot disk diffusion data, stored in column 'disk' of class 'disk'.

Value

A list containing the following elements:

plot

A grid of plots displaying the marker combinations observed, the MIC distribution per marker combination, the frequency per marker, and (optionally) the phenotype classification and/or number of samples for each marker combination.

summary

Summary of each marker combination observed, including median MIC (and interquartile range) and positive predictive value for resistance (R).
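The statistics in this summary can be sketched in base R as follows. This is an illustration on invented data, not the package's code: the column names and the helper summarise_combo are hypothetical, and the positive predictive value is taken here as the fraction of strains carrying a combination that are resistant.

```r
# Hypothetical per-strain data: marker combination, MIC, and resistance call.
dat <- data.frame(
  combo     = c("none", "none", "gyrA", "gyrA", "gyrA", "gyrA+parC", "gyrA+parC"),
  mic       = c(0.25, 0.5, 1, 4, 8, 16, 32),
  resistant = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)
)

# Hypothetical helper: median, interquartile range, and PPV for one combination.
summarise_combo <- function(d) {
  q <- quantile(d$mic, probs = c(0.25, 0.5, 0.75), names = FALSE)
  data.frame(
    combo  = d$combo[1],
    n      = nrow(d),
    median = q[2],
    lower  = q[1],
    upper  = q[3],
    n_R    = sum(d$resistant),   # number of resistant strains
    ppv    = mean(d$resistant)   # fraction of strains that are resistant
  )
}

# Apply the helper to each combination and stack the results.
summary_tbl <- do.call(rbind, lapply(split(dat, dat$combo), summarise_combo))
rownames(summary_tbl) <- NULL
summary_tbl
```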

Details

This function processes the provided binary matrix (binary_matrix), which can be generated using get_binary_matrix and is expected to contain gene presence/absence data, MIC values, and phenotype calls (S/I/R). The function also includes an analysis of gene prevalence and an ordering option for visualizing combinations in different ways.

Examples

if (FALSE) { # \dontrun{
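# Example usage

# Import genotype data, using the "Name" column as the sample identifier
ecoli_geno <- import_amrfp(ecoli_geno_raw, "Name")

# Build the binary matrix for ciprofloxacin, keeping the MIC values
binary_matrix <- get_binary_matrix(
  geno_table = ecoli_geno,
  pheno_table = ecoli_ast,
  antibiotic = "Ciprofloxacin",
  drug_class_list = c("Quinolones"),
  sir_col = "pheno",
  keep_assay_values = TRUE,
  keep_assay_values_from = "mic"
)

# Plot marker combinations ordered by median MIC
amr_upset(binary_matrix, min_set_size = 3, order = "value", assay = "mic")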
} # }