This function imports an antibiotic susceptibility testing (AST) dataset, processes the data, and optionally interprets the results based on MIC or disk diffusion data. It assumes that the input file is a tab-delimited text file (e.g., TSV) and parses relevant columns (antibiotic names, species names, MIC or disk data) into suitable classes using the AMR package. It can optionally use the AMR package to determine the susceptibility phenotype (SIR) based on EUCAST or CLSI guidelines (human breakpoints and/or ECOFF). If expected columns are not found, warnings will be given and interpretation may not be possible.
Usage
import_ncbi_ast(
input,
sample_col = "#BioSample",
interpret = FALSE,
ecoff = FALSE,
default_guideline = "EUCAST"
)
Arguments
- input
A data frame, or a string giving the path to a tab-delimited file, containing the AST data in NCBI antibiogram format. These files can be downloaded from the NCBI AST browser, e.g. https://www.ncbi.nlm.nih.gov/pathogens/ast#Pseudomonas%20aeruginosa
- sample_col
A string indicating the name of the column with sample identifiers. If NULL, assume this is '#BioSample'.
- interpret
A logical value (default is FALSE). If TRUE, the function will interpret the susceptibility phenotype (SIR) for each row based on the MIC or disk diffusion values, against human breakpoints from either the EUCAST or CLSI testing standard (as indicated in the Testing standard column of the input file; if blank, the value of the default_guideline parameter is used). If FALSE, no interpretation is performed.
- ecoff
A logical value (default is FALSE). If TRUE, the function will interpret the wildtype vs nonwildtype status for each row based on the MIC or disk diffusion values, against epidemiological cut-off (ECOFF) values. These will be reported in a new column 'ecoff', coded as 'NWT' (nonwildtype) or 'WT' (wildtype). If FALSE, no ECOFF interpretation is performed.
- default_guideline
A string (default is "EUCAST"). Default guideline to use for interpretation via as.sir. Allowed values are 'EUCAST' or 'CLSI'. If the input file contains a Testing standard column, or if either interpret or ecoff is set to TRUE, a new column guideline will be created for use in the interpretation step. Values are taken from Testing standard; rows with missing/NA or non-allowed values are set to the value of default_guideline. If there is no Testing standard column, all rows will be interpreted using default_guideline.
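The fallback rule described for default_guideline can be sketched in base R as follows. This is only an illustration of the documented behaviour, not the package's own code: resolve_guideline is a hypothetical helper, and exact, case-sensitive matching against 'EUCAST' and 'CLSI' is an assumption.

```r
# Hypothetical helper (not part of the package): mirrors the documented rule
# that missing or non-allowed 'Testing standard' values fall back to the default.
resolve_guideline <- function(testing_standard, default_guideline = "EUCAST") {
  allowed <- c("EUCAST", "CLSI")
  stopifnot(default_guideline %in% allowed)
  guideline <- as.character(testing_standard)
  # NA, blank, and any value outside the allowed set are replaced by the default
  # (assumption: matching is exact and case-sensitive)
  guideline[is.na(guideline) | !(guideline %in% allowed)] <- default_guideline
  guideline
}

# Made-up input values covering the documented cases
testing_standard <- c("CLSI", NA, "", "EUCAST", "BSAC")
resolved <- resolve_guideline(testing_standard, default_guideline = "EUCAST")
print(resolved)
```

Only the first element keeps its original value here; the NA, the blank, and the non-allowed "BSAC" entry all take the default.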
Value
A data frame with the processed AST data, including additional columns:
- id: The biological sample identifier (renamed from #BioSample or specified column).
- spp_pheno: The species phenotype, formatted using the as.mo function.
- drug_agent: The antibiotic used in the test, formatted using the as.ab function.
- mic: The minimum inhibitory concentration (MIC) value, formatted using the as.mic function.
- disk: The disk diffusion measurement (in mm), formatted using the as.disk function.
- guideline: The guideline used for interpretation (either EUCAST or CLSI; taken from input column, otherwise forced to parameter default_guideline).
- pheno: The phenotype interpreted against the specified breakpoint standard (as S/I/R), based on the MIC or disk diffusion data.
- ecoff: The wildtype/nonwildtype status interpreted against the ECOFF (as WT/NWT), based on the MIC or disk diffusion data.
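One way to get a quick overview of the pheno and ecoff columns is to tabulate them with base R. The sketch below uses a small made-up data frame as a stand-in for the function's output; the sample identifiers and calls are invented for illustration, and the real columns carry AMR classes rather than plain character vectors.

```r
# Toy stand-in for the returned data frame (values are invented);
# real output uses AMR classes rather than plain character vectors.
toy <- data.frame(
  id = c("s1", "s2", "s3", "s4"),
  drug_agent = c("CIP", "CIP", "AMP", "AMP"),
  pheno = c("S", "R", "R", "R"),
  ecoff = c("WT", "NWT", "NWT", "NWT"),
  stringsAsFactors = FALSE
)

# Count S/I/R calls per drug, keeping all three levels even if unobserved
sir_counts <- table(toy$drug_agent, factor(toy$pheno, levels = c("S", "I", "R")))
print(sir_counts)

# Cross-tabulate the breakpoint-based call against the ECOFF-based call
concordance <- table(pheno = toy$pheno, ecoff = toy$ecoff)
print(concordance)
```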
Examples
# Example usage
if (FALSE) { # \dontrun{
# small example E. coli AST data from NCBI
ecoli_ast_raw
# import without re-interpreting resistance
pheno <- import_ncbi_ast(ecoli_ast_raw)
head(pheno)
# import and re-interpret resistance (S/I/R) and ECOFF (WT/NWT) using AMR package
pheno <- import_ncbi_ast(ecoli_ast_raw, interpret = TRUE, ecoff = TRUE)
head(pheno)
} # }
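If you prefer to load a downloaded file yourself and pass a data frame as input, the column names need to survive the read unchanged. The base R sketch below writes a tiny made-up TSV and reads it back. The BioSample identifiers and the Antibiotic column are placeholders for illustration; only '#BioSample' and 'Testing standard' are column names taken from this page, and how the function reads files internally is not shown here.

```r
# Write a toy two-row file to a temporary path (all values are made up)
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c(
  "#BioSample\tTesting standard\tAntibiotic",
  "SAMN00000001\tCLSI\tciprofloxacin",
  "SAMN00000002\t\tampicillin"
), tsv_path)

# check.names = FALSE keeps '#BioSample' and 'Testing standard' as written;
# the default (TRUE) would rewrite them into syntactic names such as 'X.BioSample'.
# read.delim uses comment.char = "" by default, so the leading '#' is not a comment.
ast <- read.delim(tsv_path, check.names = FALSE, stringsAsFactors = FALSE)
print(names(ast))

# A blank 'Testing standard' is read as an empty string, not NA
print(ast[["Testing standard"]])
```

The resulting data frame keeps the '#BioSample' header that the default sample_col expects.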