In a cell culture lab, various cellular assays are performed. The “bioassays” package helps to analyse the results of such experiments performed in multiwell plates. This article describes the usage of the functions in the “bioassays” package.
The functions in this package can be used to summarise data from any multiwell plate, and by incorporating them in a loop several plates can be analysed automatically. Two examples are also provided in the article.
The output reading from the instrument (e.g. a spectrophotometer) should be in matrix format. An example dataset (csv format) is shown below. If the data is in .xls/.xlsx format, the read_excel function in the ‘readxl’ package can be used to read it.
data(rawdata96)
head(rawdata96)
#> X X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
#> 1 A 0.659 0.649 0.598 0.601 0.541 0.553 0.568 0.519 0.576 0.575 0.583 0.504
#> 2 B 0.442 0.455 0.586 0.563 0.525 0.548 0.511 0.503 0.533 0.559 0.529 0.535
#> 3 C 0.278 0.266 0.491 0.562 0.510 0.473 0.467 0.433 0.382 0.457 0.475 0.510
#> 4 D 0.197 0.199 0.452 0.456 0.421 0.431 0.409 0.401 0.458 0.412 0.408 0.403
#> 5 E 0.177 0.174 0.447 0.437 0.392 0.412 0.368 0.396 0.397 0.358 0.360 0.393
#> 6 F 0.141 0.137 0.277 0.337 0.294 0.279 0.257 0.263 0.262 0.292 0.280 0.300
Metadata is needed for the whole experiment. The “row” and “col” columns are mandatory in the metadata file, as they indicate the location of each well. An example is given below.
extract_filename
helps to extract information from a file name. Syntax is extract_filename(filename,split = " ",end = ".csv", remove = " ", sep="-"). filename is the file name. split is the string at which the name is split (default is a space, " "). end is the file extension to be removed (default is ".csv"). remove is the portion of the file name to be omitted after splitting (default is a space, " "). sep is the symbol placed between the separate sections (default is "-").
This function is useful for extracting specific information from file names, such as compound name or plate number, so that the appropriate analysis can be applied.
For example:
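A minimal base R sketch of the kind of splitting described above. The helper split_filename and the file name are hypothetical and only illustrate the idea; this is not the package's implementation, and the value returned by extract_filename may be shaped differently.

```r
# Hypothetical helper, NOT the bioassays implementation: strips the extension,
# splits the name into sections, drops unwanted sections and joins the rest.
split_filename <- function(filename, split = " ", end = ".csv",
                           remove = " ", sep = "-") {
  stem <- sub(end, "", filename, fixed = TRUE)           # drop the extension
  parts <- strsplit(stem, split, fixed = TRUE)[[1]]      # split into sections
  parts <- parts[!(parts %in% remove) & nzchar(parts)]   # omit unwanted sections
  list(parts = parts, label = paste(parts, collapse = sep))
}

# "L HEPG2 P3 72HRS.csv" is a made-up file name used only for illustration
info <- split_filename("L HEPG2 P3 72HRS.csv")
info$parts   # the individual sections of the name
info$label   # the sections joined with sep
```

Each section (for instance a cell line or a plate number) can then be used to decide how a plate should be analysed.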
rmodd_summary
helps to remove outliers and summarise a given set of values. Syntax is rmodd_summary(x, rm = "FALSE", strict= "FALSE", cutoff=80,n=3). x is a numeric vector. Set rm = "TRUE" to remove outliers. If strict = "FALSE", values above/below the 1.5 IQR limits are omitted as outliers. If strict = "TRUE", a more aggressive outlier removal is used to bring the %cv below cutoff. n is the minimum number of samples needed per group when the more aggressive outlier removal is used.
For example:
x<- c(1.01,0.98,0.6,0.54,0.6,0.6,0.4,3)
rmodd_summary(x, rm = "FALSE", strict= "FALSE", cutoff=80,n=3)
#> mean median n sd cv
#> 0.9662500 0.6000000 8.0000000 0.8487796 87.8426480
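The summary values and the 1.5 IQR rule can be reproduced approximately in base R. The sketch below uses a hypothetical helper, summarise_values, and assumes quartiles computed by stats::quantile with its default settings; the package may compute its limits differently, so the set of removed points is not guaranteed to match rmodd_summary.

```r
# Illustrative helper (an assumption, not the package code)
summarise_values <- function(x, remove_outliers = FALSE) {
  if (remove_outliers) {
    q <- stats::quantile(x, c(0.25, 0.75), na.rm = TRUE)
    fence <- 1.5 * (q[[2]] - q[[1]])                   # 1.5 x interquartile range
    x <- x[x >= q[[1]] - fence & x <= q[[2]] + fence]  # keep values inside the limits
  }
  m <- mean(x)
  s <- stats::sd(x)
  c(mean = m, median = stats::median(x), n = length(x), sd = s, cv = 100 * s / m)
}

x <- c(1.01, 0.98, 0.6, 0.54, 0.6, 0.6, 0.4, 3)          # same vector as above
all_values <- summarise_values(x)                         # no outlier removal
trimmed <- summarise_values(x, remove_outliers = TRUE)    # the value 3 falls outside the limits
```

With this vector the single reading of 3 drives most of the variation, which is why the cv is so high when no outliers are removed.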
data2plateformat
converts the data (e.g. readings from a 96-well plate) to the appropriate matrix format. Syntax is data2plateformat(data, platetype = 96). data is the data to be formatted. platetype is the type of plate the data comes from. It can take the values 6, 12, 24, 96 or 384 to represent the corresponding multiwell plate.
For example, to rename the columns and rows of ‘rawdata96’ to the right format:
rawdata<-data2plateformat(rawdata96,platetype = 96)
head(rawdata)
#> 1 2 3 4 5 6 7 8 9 10 11 12
#> A 0.659 0.649 0.598 0.601 0.541 0.553 0.568 0.519 0.576 0.575 0.583 0.504
#> B 0.442 0.455 0.586 0.563 0.525 0.548 0.511 0.503 0.533 0.559 0.529 0.535
#> C 0.278 0.266 0.491 0.562 0.510 0.473 0.467 0.433 0.382 0.457 0.475 0.510
#> D 0.197 0.199 0.452 0.456 0.421 0.431 0.409 0.401 0.458 0.412 0.408 0.403
#> E 0.177 0.174 0.447 0.437 0.392 0.412 0.368 0.396 0.397 0.358 0.360 0.393
#> F 0.141 0.137 0.277 0.337 0.294 0.279 0.257 0.263 0.262 0.292 0.280 0.300
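The renaming step can be pictured with a small base R sketch. The helper to_plate_format and the 6-well readings are made up for illustration, and it assumes the raw export has one label column followed by the well columns, as in rawdata96; it is not the package function.

```r
# Hypothetical raw export of a 6-well plate (2 rows x 3 columns); values are made up
raw6 <- data.frame(X  = c("A", "B"),
                   X1 = c(0.61, 0.42),
                   X2 = c(0.58, 0.40),
                   X3 = c(0.12, 0.11))

# Illustrative helper, not the package function
to_plate_format <- function(data, nrow, ncol) {
  plate <- as.matrix(data[seq_len(nrow), 1 + seq_len(ncol)])  # drop the label column
  dimnames(plate) <- list(LETTERS[seq_len(nrow)],             # rows named A, B, ...
                          as.character(seq_len(ncol)))        # columns named 1, 2, ...
  plate
}

plate6 <- to_plate_format(raw6, nrow = 2, ncol = 3)
plate6
```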
plate2df
formats matrix-type 2D data from multiwell plates as a dataframe. The function uses the column names and row names of ‘datamatrix’ (2D data of a multiwell plate) to generate a dataframe with row, col (column) and position indices. The ‘value’ column holds the corresponding value in ‘datamatrix’.
Syntax is plate2df(datamatrix). datamatrix is the data in matrix format.
For example:
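A base R sketch of the same reshaping, using a hypothetical helper plate_to_long and the first three readings of rows A and B from the plate above. The column layout follows the description of plate2df, but the code is an illustration under that assumption, not the package's implementation.

```r
# Illustrative helper (not the package code): matrix -> long dataframe
plate_to_long <- function(datamatrix) {
  rows <- rownames(datamatrix)
  cols <- as.integer(colnames(datamatrix))
  row_id <- rep(rows, each = length(cols))    # A, A, A, B, B, B, ...
  col_id <- rep(cols, times = length(rows))   # 1, 2, 3, 1, 2, 3, ...
  data.frame(
    row = row_id,
    col = col_id,
    position = sprintf("%s%02d", row_id, col_id),  # e.g. "A01"
    value = as.vector(t(datamatrix)),              # read the matrix row by row
    stringsAsFactors = FALSE
  )
}

m <- matrix(c(0.659, 0.649, 0.598,
              0.442, 0.455, 0.586),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("A", "B"), c("1", "2", "3")))
long <- plate_to_long(m)
long
```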
matrix96
helps to convert a dataframe into matrix format. Syntax is matrix96(dataframe,column,rm="FALSE"). dataframe is the dataframe to be formatted. The dataframe must have “row” and “col” columns for the function to work. column is the name of the column to be converted into a matrix. If rm = "TRUE", negative values and NA are set to 0.
For example:
matrix96(OD_df,"value")
#> 1 2 3 4 5 6 7 8 9 10 11 12
#> A 0.659 0.649 0.598 0.601 0.541 0.553 0.568 0.519 0.576 0.575 0.583 0.504
#> B 0.442 0.455 0.586 0.563 0.525 0.548 0.511 0.503 0.533 0.559 0.529 0.535
#> C 0.278 0.266 0.491 0.562 0.510 0.473 0.467 0.433 0.382 0.457 0.475 0.510
#> D 0.197 0.199 0.452 0.456 0.421 0.431 0.409 0.401 0.458 0.412 0.408 0.403
#> E 0.177 0.174 0.447 0.437 0.392 0.412 0.368 0.396 0.397 0.358 0.360 0.393
#> F 0.141 0.137 0.277 0.337 0.294 0.279 0.257 0.263 0.262 0.292 0.280 0.300
#> G 0.122 0.118 0.300 0.288 0.293 0.251 0.245 0.270 0.261 0.259 0.271 0.271
#> H 0.107 0.102 0.320 0.340 0.319 0.270 0.262 0.277 0.294 0.278 0.307 0.316
matrix96(OD_df,"position")
#> 1 2 3 4 5 6 7 8 9 10 11 12
#> A "A01" "A02" "A03" "A04" "A05" "A06" "A07" "A08" "A09" "A10" "A11" "A12"
#> B "B01" "B02" "B03" "B04" "B05" "B06" "B07" "B08" "B09" "B10" "B11" "B12"
#> C "C01" "C02" "C03" "C04" "C05" "C06" "C07" "C08" "C09" "C10" "C11" "C12"
#> D "D01" "D02" "D03" "D04" "D05" "D06" "D07" "D08" "D09" "D10" "D11" "D12"
#> E "E01" "E02" "E03" "E04" "E05" "E06" "E07" "E08" "E09" "E10" "E11" "E12"
#> F "F01" "F02" "F03" "F04" "F05" "F06" "F07" "F08" "F09" "F10" "F11" "F12"
#> G "G01" "G02" "G03" "G04" "G05" "G06" "G07" "G08" "G09" "G10" "G11" "G12"
#> H "H01" "H02" "H03" "H04" "H05" "H06" "H07" "H08" "H09" "H10" "H11" "H12"
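The reverse reshaping (long dataframe back to a plate-shaped matrix) can be sketched in base R as follows. long_to_matrix is a hypothetical helper and the four-well dataframe is a cut-down stand-in for OD_df; the sketch does not handle the rm option.

```r
# Illustrative helper (not the package code): long dataframe -> plate matrix
long_to_matrix <- function(dataframe, column) {
  rows <- sort(unique(dataframe$row))
  cols <- sort(unique(dataframe$col))
  out <- matrix(NA, nrow = length(rows), ncol = length(cols),
                dimnames = list(rows, as.character(cols)))
  # place each entry at its (row, col) well
  idx <- cbind(match(dataframe$row, rows), match(dataframe$col, cols))
  out[idx] <- dataframe[[column]]
  out
}

# Cut-down stand-in for OD_df (wells A1, A2, B1, B2 of the plate above)
df <- data.frame(row = c("A", "A", "B", "B"),
                 col = c(1, 2, 1, 2),
                 position = c("A01", "A02", "B01", "B02"),
                 value = c(0.659, 0.649, 0.442, 0.455),
                 stringsAsFactors = FALSE)

value_matrix <- long_to_matrix(df, "value")
position_matrix <- long_to_matrix(df, "position")
```

As in the output above, a numeric column gives a numeric matrix and a character column gives a character matrix.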
plate_metadata
combines the plate-specific information (such as compound used, standard concentration, dilution of samples, etc.) with the metadata to produce unique plate metadata. Syntax is plate_metadata(plate_details, metadata,mergeby="type"). plate_details is the plate-specific information that needs to be added to the metadata. metadata is the metadata for the whole experiment. mergeby is the column that is common to both metadata and plate_details (this column is used for merging the information).
For example, an incomplete metadata file:
head(metafile96)
#> row col position type id concentration dilution
#> 1 A 1 A01 STD1 STD 25 NA
#> 2 A 2 A02 STD1 STD 25 NA
#> 3 A 3 A03 S1 Sample NA NA
#> 4 A 4 A04 S1 Sample NA NA
#> 5 A 5 A05 S1 Sample NA NA
#> 6 A 6 A06 S1 Sample NA NA
The plate-specific details are:
plate_details <- list("compound" = "Taxol",
"concentration" = c(0.00,0.01,0.02,0.05,0.10,1.00,5.00,10.00),
"type" = c("S1","S2","S3","S4","S5","S6","S7","S8"),
"dilution" = 1)
Using the plate-specific information, the metadata can be filled in by calling the plate_metadata function.
plate_meta<-plate_metadata(plate_details,metafile96,mergeby="type")
head(plate_meta)
#> row col type position id dilution concentration compound
#> 1 A 1 STD1 A01 STD NA 25 <NA>
#> 2 A 2 STD1 A02 STD NA 25 <NA>
#> 3 A 3 S1 A03 Sample 1 0 Taxol
#> 4 A 4 S1 A04 Sample 1 0 Taxol
#> 5 A 5 S1 A05 Sample 1 0 Taxol
#> 6 A 6 S1 A06 Sample 1 0 Taxol
To join plate_meta and OD_df, inner_join (a dplyr function) can be used.
data_DF<- dplyr::inner_join(OD_df,plate_meta,by=c("row","col","position"))
head(data_DF)
#> row col position value type id dilution concentration compound
#> 1 A 1 A01 0.659 STD1 STD NA 25 <NA>
#> 2 A 2 A02 0.649 STD1 STD NA 25 <NA>
#> 3 A 3 A03 0.598 S1 Sample 1 0 Taxol
#> 4 A 4 A04 0.601 S1 Sample 1 0 Taxol
#> 5 A 5 A05 0.541 S1 Sample 1 0 Taxol
#> 6 A 6 A06 0.553 S1 Sample 1 0 Taxol
heatplate
helps to create a heatmap of a multiwell plate. The syntax is heatplate(datamatrix,name,size=7.5). datamatrix is the data in matrix format. An easy way to create this is by calling ‘matrix96’ as explained before. name is the name to be given to the heatmap, and size is the size of each well in the heatmap (default is 7.5).
This function gives a heatmap of normalized values if the variable is numeric. If it is a categorical (factor) variable, it simply provides a coloured categorical plot.
eg 1. Categorical plot
datamatrix<-matrix96(metafile96,"id")
datamatrix
#> 1 2 3 4 5 6 7 8
#> A "STD" "STD" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> B "STD" "STD" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> C "STD" "STD" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> D "STD" "STD" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> E "STD" "STD" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> F "STD" "STD" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> G "STD" "STD" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> H "Blank" "Blank" "Sample" "Sample" "Sample" "Sample" "Sample" "Sample"
#> 9 10 11 12
#> A "Sample" "Sample" "Sample" "Sample"
#> B "Sample" "Sample" "Sample" "Sample"
#> C "Sample" "Sample" "Sample" "Sample"
#> D "Sample" "Sample" "Sample" "Sample"
#> E "Sample" "Sample" "Sample" "Sample"
#> F "Sample" "Sample" "Sample" "Sample"
#> G "Sample" "Sample" "Sample" "Sample"
#> H "Sample" "Sample" "Sample" "Sample"
eg 2. Heatmap
rawdata<-data2plateformat(rawdata96,platetype = 96)
OD_df<- plate2df(rawdata)
data<-matrix96(OD_df,"value")
data
#> 1 2 3 4 5 6 7 8 9 10 11 12
#> A 0.659 0.649 0.598 0.601 0.541 0.553 0.568 0.519 0.576 0.575 0.583 0.504
#> B 0.442 0.455 0.586 0.563 0.525 0.548 0.511 0.503 0.533 0.559 0.529 0.535
#> C 0.278 0.266 0.491 0.562 0.510 0.473 0.467 0.433 0.382 0.457 0.475 0.510
#> D 0.197 0.199 0.452 0.456 0.421 0.431 0.409 0.401 0.458 0.412 0.408 0.403
#> E 0.177 0.174 0.447 0.437 0.392 0.412 0.368 0.396 0.397 0.358 0.360 0.393
#> F 0.141 0.137 0.277 0.337 0.294 0.279 0.257 0.263 0.262 0.292 0.280 0.300
#> G 0.122 0.118 0.300 0.288 0.293 0.251 0.245 0.270 0.261 0.259 0.271 0.271
#> H 0.107 0.102 0.320 0.340 0.319 0.270 0.262 0.277 0.294 0.278 0.307 0.316
reduceblank
helps to subtract blank values from the readings.
The syntax is reduceblank(dataframe,x_vector,blank_vector,y). dataframe is the data. x_vector lists the entries from which the blank has to be subtracted. If the blank has to be subtracted from all entries, use "All". x_vector should be in vector format, e.g. c("drug1","drug2","drug3"). blank_vector is the vector of blank names whose values are to be subtracted (it should also be in vector format, e.g. c("blank1","blank2","blank3","blank4")). The function subtracts the first blank_vector element from the first x_vector element, and so on. y is the name of the column on which the subtraction is performed; y should be numeric. The results appear in a new column named ‘blankminus’.
For example:
data_DF<-reduceblank(data_DF, x_vector =c("All"),blank_vector = c("Blank"), "value")
head(data_DF)
#> row col position value type id dilution concentration compound blankminus
#> 1 A 1 A01 0.659 STD1 STD NA 25 <NA> 0.5545
#> 2 A 2 A02 0.649 STD1 STD NA 25 <NA> 0.5445
#> 3 A 3 A03 0.598 S1 Sample 1 0 Taxol 0.4935
#> 4 A 4 A04 0.601 S1 Sample 1 0 Taxol 0.4965
#> 5 A 5 A05 0.541 S1 Sample 1 0 Taxol 0.4365
#> 6 A 6 A06 0.553 S1 Sample 1 0 Taxol 0.4485
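The blankminus values printed above are consistent with subtracting the mean of the two blank wells (H1 = 0.107 and H2 = 0.102, mean 0.1045) from every reading. A base R sketch of that arithmetic, using a four-row stand-in for data_DF, is given below; it illustrates the calculation only and is not the package's implementation.

```r
# Four wells taken from the plate above: A1 (STD), A3 (Sample), H1 and H2 (Blank)
readings <- data.frame(
  id = c("STD", "Sample", "Blank", "Blank"),
  value = c(0.659, 0.598, 0.107, 0.102),
  stringsAsFactors = FALSE
)

# Mean of the blank wells, then subtract it from every reading
blank_mean <- mean(readings$value[readings$id == "Blank"])
readings$blankminus <- readings$value - blank_mean
readings
```

The first two blankminus values (0.5545 and 0.4935) agree with wells A01 and A03 in the output above.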
estimate
helps to estimate an unknown variable (e.g. concentration) based on the standard curve. Syntax is estimate(data=dataframe,colname="blankminus",fitformula=fit, method="linear/nplr"). data is the dataframe to be evaluated. colname is the name of the column whose values are used for the estimation. fitformula is the fitted model (fitting formula) to be used. method specifies whether a linear model ("linear") or an n-parameter logistic regression ("nplr") was used for fitformula.
For example, data_DF is a dataframe for which the concentration has to be estimated based on the value of blankminus.
To filter the ‘standards’:
std<- dplyr::filter(data_DF, data_DF$id=="STD")
std<- aggregate(std$blankminus ~ std$concentration, FUN = mean )
colnames (std) <-c("con", "OD")
head(std)
#> con OD
#> 1 0.39 0.0155
#> 2 0.78 0.0345
#> 3 1.56 0.0710
#> 4 3.13 0.0935
#> 5 6.25 0.1675
#> 6 12.50 0.3440
To fit a standard curve:
fit1 is the 3-parameter logistic curve model and fit2 is the linear regression model. Use whichever is appropriate for your experiment.
fit2<-stats::lm(formula = con ~ OD,data = std)# linear model
fit1<-nplr::nplr(std$con,std$OD,npars=3,useLog = FALSE)# nplr, 3 parameter model
To estimate the concentration using the linear model:
estimated<-estimate(data_DF,colname="blankminus",fitformula=fit2,method="linear")
head(estimated)
#> row col position value type id dilution concentration compound blankminus
#> 1 A 1 A01 0.659 STD1 STD NA 25 <NA> 0.5545
#> 2 A 2 A02 0.649 STD1 STD NA 25 <NA> 0.5445
#> 3 A 3 A03 0.598 S1 Sample 1 0 Taxol 0.4935
#> 4 A 4 A04 0.601 S1 Sample 1 0 Taxol 0.4965
#> 5 A 5 A05 0.541 S1 Sample 1 0 Taxol 0.4365
#> 6 A 6 A06 0.553 S1 Sample 1 0 Taxol 0.4485
#> estimated
#> 1 23.96838
#> 2 23.51493
#> 3 21.20234
#> 4 21.33838
#> 5 18.61769
#> 6 19.16183
To estimate the concentration using the nplr method:
estimated2<-estimate(data_DF,colname="blankminus",fitformula=fit1,method="nplr")
head(estimated2)
#> row col position value type id dilution concentration compound blankminus
#> 1 A 1 A01 0.659 STD1 STD NA 25 <NA> 0.5545
#> 2 A 2 A02 0.649 STD1 STD NA 25 <NA> 0.5445
#> 3 A 3 A03 0.598 S1 Sample 1 0 Taxol 0.4935
#> 4 A 4 A04 0.601 S1 Sample 1 0 Taxol 0.4965
#> 5 A 5 A05 0.541 S1 Sample 1 0 Taxol 0.4365
#> 6 A 6 A06 0.553 S1 Sample 1 0 Taxol 0.4485
#> estimated
#> 1 26.39687
#> 2 24.01751
#> 3 18.68524
#> 4 18.88785
#> 5 15.70869
#> 6 16.23867
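For the linear case, one way such an estimate can be obtained in base R is with stats::predict on the fitted model. The sketch below refits the line on only the six standards printed by head(std) above (the full table has more rows, so the numbers will differ from the article's) and predicts concentrations for three hypothetical readings. It is not the code used inside estimate.

```r
# First six rows of the standards table printed above; the full table has more rows
std_head <- data.frame(con = c(0.39, 0.78, 1.56, 3.13, 6.25, 12.50),
                       OD  = c(0.0155, 0.0345, 0.0710, 0.0935, 0.1675, 0.3440))

# Linear standard curve: concentration as a function of blank-corrected OD
fit_lin <- stats::lm(con ~ OD, data = std_head)

# Hypothetical blank-corrected readings to be converted to concentrations
new_wells <- data.frame(OD = c(0.05, 0.10, 0.20))
predicted <- stats::predict(fit_lin, newdata = new_wells)
predicted
```

The column name in newdata has to match the predictor used in the model formula (here OD).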
dfsummary()
helps to summarise the dataframe (based on a column). It has additional controls to group samples and to omit variables that are not needed. Syntax is dfsummary(dataframe,y,grp_vector,rm_vector,nickname,rm="FALSE",param). dataframe is the data. y is the numeric variable (column name) to be summarised. grp_vector is a vector of column names by which the samples are grouped. The order of elements in grp_vector determines the order of grouping. rm_vector is the vector of items to be omitted before summarising. nickname is the name to be given to the output dataframe. Use rm="FALSE" if outliers are not to be removed and rm="TRUE" if they are. For a more stringent outlier removal, the parameters are provided in the vector param, which has to be entered in the format c(strict="TRUE",cutoff=40,n=12). For details, please refer to the rmodd_summary function.
For example, the “estimated” values are summarised, with samples grouped by “id” and then “type”. “STD” and “Blank” values are omitted, outliers are not removed (rm="FALSE"), and the nickname for the plate is “plate1”.
result<-dfsummary(estimated,"estimated",c("id","type"),
c("STD","Blank"),"plate1", rm="FALSE",
param=c(strict="FALSE",cutoff=40,n=12))
#> F1
#> F2
result
#> id type label N Mean SD CV
#> 1 Sample S1 plate1 10 19.561 1.465 7.49
#> 2 Sample S2 plate1 10 18.536 1.141 6.15
#> 3 Sample S3 plate1 10 15.670 2.194 14.00
#> 4 Sample S4 plate1 10 13.362 1.026 7.68
#> 5 Sample S5 plate1 10 12.043 1.359 11.29
#> 6 Sample S6 plate1 10 6.969 1.066 15.30
#> 7 Sample S7 plate1 10 6.370 0.819 12.85
#> 8 Sample S8 plate1 10 7.612 1.174 15.42
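The N, Mean, SD and CV columns are ordinary per-group statistics, which can be sketched in base R as below. The well values and the helper summarise_group are made up for illustration; grouping is by a single column and no outlier removal is done, so this is a simplified picture of what dfsummary reports rather than its implementation.

```r
# Made-up estimated concentrations for two sample groups and a blank
wells <- data.frame(
  type = rep(c("S1", "S2", "Blank"), each = 3),
  estimated = c(19.2, 21.3, 18.6, 18.1, 17.5, 19.9, 0.2, 0.1, 0.3),
  stringsAsFactors = FALSE
)

# Omit the groups that should not be summarised
kept <- wells[!(wells$type %in% c("Blank")), ]

# Per-group statistics: count, mean, standard deviation and % coefficient of variation
summarise_group <- function(v) {
  c(N = length(v), Mean = mean(v), SD = stats::sd(v),
    CV = 100 * stats::sd(v) / mean(v))
}
stats_by_type <- do.call(rbind,
                         lapply(split(kept$estimated, kept$type), summarise_group))
stats_by_type
```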
pvalue()
helps to calculate significance by t-test on the result dataframe. Syntax is pvalue(dataframe,control,sigval). dataframe is the result of dfsummary. control is the group that is considered as the control, and sigval is the p-value cutoff (a value below this is considered significant). For example:
pval<-pvalue(result, control="S8", sigval=0.05)
head(pval)
#> id type label N Mean SD CV pvalue significance
#> 1 Sample S8 plate1 10 7.612 1.174 15.42 control
#> 2 Sample S1 plate1 10 19.561 1.465 7.49 < 0.001 Yes
#> 3 Sample S2 plate1 10 18.536 1.141 6.15 < 0.001 Yes
#> 4 Sample S3 plate1 10 15.670 2.194 14.00 < 0.001 Yes
#> 5 Sample S4 plate1 10 13.362 1.026 7.68 < 0.001 Yes
#> 6 Sample S5 plate1 10 12.043 1.359 11.29 < 0.001 Yes
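Because the result dataframe holds only N, Mean and SD for each group, a t-test on it has to work from those summary statistics. The sketch below uses Welch's unequal-variance t-test as an assumption; the article does not say which variant pvalue applies, and welch_p is a hypothetical helper. The inputs are the S1 and S8 rows of the table above.

```r
# Two-sided Welch t-test computed from summary statistics (illustrative helper)
welch_p <- function(m1, s1, n1, m2, s2, n2) {
  v1 <- s1^2 / n1
  v2 <- s2^2 / n2
  t_stat <- (m1 - m2) / sqrt(v1 + v2)
  # Welch-Satterthwaite approximation of the degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  2 * stats::pt(-abs(t_stat), df)
}

# S1 (Mean 19.561, SD 1.465, N 10) against the control S8 (Mean 7.612, SD 1.174, N 10)
p_s1 <- welch_p(19.561, 1.465, 10, 7.612, 1.174, 10)
p_s1 < 0.001
```

Under this assumption the S1 group falls well below the 0.001 level, in line with the "< 0.001" entry reported for S1 above.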