This function performs a Positive Predictive Value (PPV) analysis for AMR markers associated with a given antibiotic and drug class. It calculates the PPV for solo markers (markers found in genomes that carry only one genetic marker relevant to the drug class) and visualizes the results. It returns a list containing summary statistics for each solo marker, together with plots showing the breakdown of resistance phenotypes and the PPV (with 95% confidence interval) for each marker.
Usage
solo_ppv_analysis(
  geno_table,
  pheno_table,
  antibiotic,
  drug_class_list,
  geno_sample_col = NULL,
  pheno_sample_col = NULL,
  sir_col = NULL,
  keep_assay_values = TRUE,
  min = 1,
  axis_label_size = 9,
  pd = position_dodge(width = 0.8),
  plot_cols = c(R = "IndianRed", NWT = "navy")
)
Arguments
- geno_table
A data frame containing genotype data, including at least one column labeled drug_class for drug class information and one column for sample identifiers (specified via geno_sample_col; otherwise it is assumed the first column contains identifiers).
- pheno_table
A data frame containing phenotype data, which must include a column drug_agent (with the antibiotic information) and a column with the resistance interpretation (S/I/R, column name specified via sir_col).
- antibiotic
A character string specifying the antibiotic of interest to filter phenotype data. The value must match one of the entries in the drug_agent column of pheno_table.
- drug_class_list
A character vector of drug classes to filter genotype data for markers related to the specified antibiotic. Markers in geno_table will be filtered based on whether their drug_class matches any value in this list.
- geno_sample_col
A character string (optional) specifying the column name in geno_table containing sample identifiers. Defaults to NULL, in which case it is assumed the first column contains identifiers.
- pheno_sample_col
A character string (optional) specifying the column name in pheno_table containing sample identifiers. Defaults to NULL, in which case it is assumed the first column contains identifiers.
- sir_col
A character string specifying the column name in pheno_table that contains the resistance interpretation (SIR) data. The values should be interpretable as "R" (resistant), "I" (intermediate), or "S" (susceptible).
- keep_assay_values
A logical indicating whether to include columns with the raw phenotype assay data in the binary matrix. Assumes there are columns labelled "mic" and/or "disk"; these will be added to the output table if present. Defaults to TRUE.
- min
Minimum number of genomes with the solo marker, to include the marker in the plot (default 1).
- pd
Position dodge, i.e. spacing for the R/NWT values to be positioned above/below the line in the PPV plot. Default: position_dodge(width = 0.8).
- axis_label_size
Font size for axis labels in the PPV plot (default 9).
- plot_cols
A named vector of colors for the plot. The names should be the phenotype categories (e.g., "R", "I", "S", "NWT"), and the values should be valid color names or hexadecimal color codes. The default, c(R = "IndianRed", NWT = "navy"), provides colors for resistant ("R") and non-wild-type ("NWT").
Value
A list containing the following elements:
- solo_stats
A data frame summarizing, for each solo marker, the PPV for resistance (R vs S/I) and for non-wild-type (NWT; R/I vs S), including the number of positive hits, sample size, PPV, and 95% confidence interval.
- combined_plot
A combined ggplot object showing the PPV plot for the solo markers, and a bar plot for the phenotype distribution.
- solo_binary
A data frame with binary values indicating the presence or absence of the solo markers.
- amr_binary
A data frame with binary values for the AMR markers, based on the input genotype and phenotype data.
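The kind of statistics summarized in solo_stats can be illustrated with a minimal base R sketch. It is not the function's internal code: the toy data, the column names (marker, x, n, ppv, ci_lower, ci_upper) and the helper ppv_for() are hypothetical, and the exact binomial (Clopper-Pearson) interval from binom.test() is an assumption, since this page does not state which interval method is used.

```r
# Hypothetical toy data: one row per genome, marker presence (0/1) and S/I/R call.
toy <- data.frame(
  sample    = paste0("s", 1:9),
  gyrA_S83L = c(1, 1, 1, 1, 0, 0, 0, 0, 1),
  qnrS1     = c(0, 0, 0, 0, 1, 1, 0, 0, 1),
  pheno     = c("R", "R", "I", "S", "R", "S", "S", "S", "R")
)
markers <- c("gyrA_S83L", "qnrS1")

# Keep only genomes carrying exactly one marker ("solo" genomes).
# s9 carries both markers and s7/s8 carry none, so they are dropped.
solo <- toy[rowSums(toy[, markers]) == 1, ]

# PPV = proportion of solo genomes with the marker whose phenotype is "positive".
# The 95% CI here is the exact binomial interval (an assumption, see above).
ppv_for <- function(marker, positive) {
  hits <- solo[solo[[marker]] == 1, ]
  x <- sum(hits$pheno %in% positive)  # number of positive phenotypes
  n <- nrow(hits)                     # number of solo genomes with the marker
  ci <- binom.test(x, n)$conf.int
  data.frame(marker = marker, x = x, n = n, ppv = x / n,
             ci_lower = ci[1], ci_upper = ci[2])
}

# Resistance: R vs S/I. Non-wild-type: R/I vs S.
ppv_R   <- do.call(rbind, lapply(markers, ppv_for, positive = "R"))
ppv_NWT <- do.call(rbind, lapply(markers, ppv_for, positive = c("R", "I")))

ppv_R
ppv_NWT
```

In this toy data, gyrA_S83L is solo in four genomes (R, R, I, S), giving a PPV of 2/4 for R and 3/4 for NWT. With so few genomes the intervals are wide, which is the situation the min argument is meant to filter out of the plot.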
Examples
if (FALSE) { # \dontrun{
  # import genotype data
  geno_table <- import_amrfp(ecoli_geno_raw, "Name")

  # example phenotype data
  head(ecoli_ast)

  soloPPV_cipro <- solo_ppv_analysis(
    geno_table = geno_table,
    pheno_table = ecoli_ast,
    antibiotic = "Ciprofloxacin",
    drug_class_list = c("Quinolones"),
    sir_col = "Resistance phenotype"
  )

  # View the results
  soloPPV_cipro$solo_stats
  soloPPV_cipro$combined_plot
} # }