The fundamental calculations underlying this package are based on work published in Casado et al. (2013) Sci Signal. 6(268):rs6. Please refer to this paper for details on the formulas.
This package has the following functions:
This package includes a few sample datasets to use for exercises:
Additional notes on the PX format:
The following is a detailed description of each column in PX:
The columns must appear in exactly the order listed. NA values are not allowed: any row containing one is discarded from the analysis. Although the Protein, Peptide, and p entries are optional, their column headers are mandatory.
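As a rough illustration of the header and NA rules, the following base R sketch checks a toy table before it would be passed in as PX. The six column names used here (Protein, Gene, Peptide, Residue.Both, p, FC) are an assumption, as are all the values; only Protein, Peptide, and p are named in the text above, so confirm the full list against the package documentation.

```r
# Toy PX-style table. Column names and values are illustrative assumptions.
PX.toy <- data.frame(
  Protein      = c("P1", "P2", "P3"),
  Gene         = c("AKT1", "MAPK1", NA),   # NA on purpose: this row should be dropped
  Peptide      = c("PEPa", "PEPb", "PEPc"),
  Residue.Both = c("S473", "T185", "S10"),
  p            = c(0.01, 0.20, 0.03),
  FC           = c(2.0, 0.5, 1.5),
  stringsAsFactors = FALSE
)

# Assumed header order
expected.cols <- c("Protein", "Gene", "Peptide", "Residue.Both", "p", "FC")

# Headers must all be present and in the expected order
headers.ok <- identical(colnames(PX.toy), expected.cols)

# Rows containing any NA are removed, mirroring the rule described above
PX.clean  <- PX.toy[complete.cases(PX.toy), ]
n.dropped <- nrow(PX.toy) - nrow(PX.clean)
```

Checking n.dropped before running the analysis shows how many rows would be silently lost.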
The goal of KSEAapp is to generate relative kinase activity inferences from quantitative phosphoproteomics data.
Given an experimental dataset as input, the package generates 3 different types of output:
You can achieve this result using 2 different routes:
Route A: use the KSEA.Complete() function to do everything in one go. This saves the 3 separate outputs directly into your working directory as .tiff (for the plot), .csv (for the KSEA kinase scores table), and .csv (for the K-S relationships table) files.
Route B: use the KSEA.KS_table(), KSEA.Scores(), and KSEA.Barplot() functions. This sequence suppresses the file exports and creates everything as objects within the R environment, which gives the user additional flexibility for downstream data manipulation. Alternatively, the user can employ KSEA.Heatmap() rather than KSEA.Barplot() to compile a multi-condition experiment into a single heatmap instead of separate bar plots.
The following are detailed walk-throughs on how to navigate through each route.
KSEA.Complete() compares only two groups at a time; therefore, if a dataset contains multiple conditions, the user must submit a separate input file, each with a single fold change column, for every pairwise comparison.
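One possible way to prepare such inputs is to split a multi-condition table into one single-fold-change table per comparison. This is a minimal base R sketch, not part of the package: the table `multi`, its column names (including the `p.<condition>` and `FC.<condition>` naming scheme), and the condition labels are all hypothetical.

```r
# Hypothetical multi-condition table with one p and one FC column per comparison
multi <- data.frame(
  Protein      = c("P1", "P2"),
  Gene         = c("AKT1", "MAPK1"),
  Peptide      = c("PEPa", "PEPb"),
  Residue.Both = c("S473", "T185"),
  p.TreatA     = c(0.01, 0.20),
  FC.TreatA    = c(2.0, 0.5),
  p.TreatB     = c(0.04, 0.30),
  FC.TreatB    = c(1.2, 0.8),
  stringsAsFactors = FALSE
)

conditions <- c("TreatA", "TreatB")

# Build one input table per pairwise comparison, each with a single FC column
px.list <- lapply(conditions, function(cond) {
  data.frame(
    multi[, c("Protein", "Gene", "Peptide", "Residue.Both")],
    p  = multi[[paste0("p.", cond)]],
    FC = multi[[paste0("FC.", cond)]],
    stringsAsFactors = FALSE
  )
})
names(px.list) <- conditions
```

Each element of px.list could then be supplied in place of PX in a separate call, one call per comparison.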
This exercise requires the following datasets included in the package: KSData and PX
This is an overview of all the required parameters for KSEA.Complete():
Here is an example type-up for the R Console:
KSEA.Complete(KSData, PX, NetworKIN=TRUE, NetworKIN.cutoff=5, m.cutoff=5, p.cutoff=0.01)
The function will result in 3 different outputs saved directly into the working directory:
“KSEA Bar Plot.tiff”: This is the bar plot that summarizes the KSEA results. Note that not all kinases are included. The kinase substrate count cutoff, set by m.cutoff, decides which kinases to include in this plot. The p-value cutoff, set by p.cutoff, decides which kinases to color blue/red for visual annotation of kinases that reach statistical significance. Kinases with non-significant scores will be black.
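The effect of the two cutoffs can be sketched in base R on a toy scores table. The values below are invented for illustration. Two details are assumptions rather than statements from this text: that the m.cutoff filter is inclusive (m >= m.cutoff), and that red marks significant positive scores while blue marks significant negative ones.

```r
# Toy scores table; all values are made up for illustration
scores.toy <- data.frame(
  Kinase.Gene = c("AKT1", "CDK1", "MAPK1", "PRKACA"),
  m           = c(12, 3, 8, 6),
  z.score     = c(2.9, 3.5, -2.7, 0.4),
  p.value     = c(0.002, 0.0002, 0.004, 0.35),
  stringsAsFactors = FALSE
)

m.cutoff <- 5
p.cutoff <- 0.01

# Keep only kinases with enough substrates (assumed inclusive cutoff)
plotted <- scores.toy[scores.toy$m >= m.cutoff, ]

# Non-significant kinases stay black; significant ones are colored by sign (assumed mapping)
plotted$color <- "black"
sig <- plotted$p.value < p.cutoff
plotted$color[sig & plotted$z.score > 0] <- "red"
plotted$color[sig & plotted$z.score < 0] <- "blue"
```

In this toy example CDK1 is left out of the plot despite its small p-value, because it has only 3 substrates.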
“Kinase-Substrate Links.csv”: This is a complete table listing ALL the K-S relationships identified from the experimental dataset, including relationships for kinases that are not featured in the bar plot. For each kinase, every substrate identified in the dataset was used for the KSEA calculations (in other words, the substrates were not filtered). Kinase.Gene is the gene name of each kinase. Substrate.Gene is the gene name of each substrate linked to that kinase. Substrate.Mod is the specific amino acid residue of the substrate that was modified. Source is the database from which the K-S annotation was derived. log2FC is the log2(fold change) value of that particular substrate phosphosite from the experiment. If the same site was detected across multiple peptides that map to the same protein, the average log2FC is reported.
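The averaging of repeated sites can be illustrated with a small base R sketch. It takes the description above at face value and averages log2FC directly; whether the package averages before or after the log2 transform is not stated here, so treat that ordering as an assumption. The table and its values are hypothetical.

```r
# Toy K-S links table in which GSK3B S9 was detected on two different peptides
links.toy <- data.frame(
  Kinase.Gene    = c("AKT1", "AKT1", "AKT1"),
  Substrate.Gene = c("GSK3B", "GSK3B", "FOXO3"),
  Substrate.Mod  = c("S9", "S9", "S253"),
  log2FC         = c(1.0, 2.0, -0.5),
  stringsAsFactors = FALSE
)

# Collapse repeated kinase/substrate/site combinations to their mean log2FC
links.avg <- aggregate(
  log2FC ~ Kinase.Gene + Substrate.Gene + Substrate.Mod,
  data = links.toy,
  FUN  = mean
)
```

After collapsing, each kinase-substrate-site combination appears once, so each site contributes a single value to the kinase's scores.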
“KSEA Kinase Scores.csv”: This is a complete table listing ALL the kinases that have at least one identified substrate in the input dataset, including those that are not featured in the bar plot. Please refer to the original Casado et al. publication for a detailed description of these columns and what they represent. Kinase.Gene is the gene name of each kinase. mS is the mean log2(fold change) of all the kinase's substrates. Enrichment is the background-adjusted value of the kinase's mS. m is the total number of substrates detected in the experimental dataset for each kinase. z.score is the normalized score for each kinase, weighted by the number of identified substrates. p.value is the statistical assessment of the z.score. FDR is the p-value adjusted for multiple hypothesis testing using the Benjamini & Hochberg method.
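To make these columns concrete, here is a minimal base R sketch of how such a table could be computed from toy data. It is not the package's code. It assumes a z-score of the form (mS - mP) * sqrt(m) / delta, where mP and delta are the mean and standard deviation of log2FC across all phosphosites, a one-tailed normal p-value, and Enrichment defined as mS / |mP|. These definitions and all values are assumptions for illustration; consult Casado et al. for the authoritative formulas.

```r
# Toy substrate-level data: kinase assignment and log2FC per linked phosphosite
links <- data.frame(
  Kinase.Gene = c(rep("K1", 4), rep("K2", 2)),
  log2FC      = c(1.2, 0.8, 1.5, 0.9, -0.6, -1.0),
  stringsAsFactors = FALSE
)

# Toy background: log2FC of every phosphosite in the dataset (linked or not)
background <- c(1.2, 0.8, 1.5, 0.9, -0.6, -1.0, 0.1, -0.2, 0.3, -0.4)
mP    <- mean(background)   # background mean
delta <- sd(background)     # background standard deviation

# Per-kinase mean log2FC and substrate count
mS <- tapply(links$log2FC, links$Kinase.Gene, mean)
m  <- tapply(links$log2FC, links$Kinase.Gene, length)

# Assumed formulas (see the lead-in paragraph)
z   <- (mS - mP) * sqrt(m) / delta
p   <- pnorm(-abs(z))              # one-tailed p-value
fdr <- p.adjust(p, method = "BH")  # Benjamini & Hochberg adjustment

scores <- data.frame(
  Kinase.Gene = names(mS),
  mS          = as.numeric(mS),
  Enrichment  = as.numeric(mS / abs(mP)),
  m           = as.integer(m),
  z.score     = as.numeric(z),
  p.value     = as.numeric(p),
  FDR         = as.numeric(fdr),
  stringsAsFactors = FALSE
)
```

Because the z-score scales with sqrt(m), two kinases with the same mS can receive different scores if one has more identified substrates.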
This exercise requires the following datasets included in the package:
The output of this function is identical in format and content to the Kinase-Substrate Links.csv output from KSEA.Complete().
Here is an example type-up for the R Console:
KSData.dataset <- KSEA.KS_table(KSData, PX, NetworKIN=TRUE, NetworKIN.cutoff=5)
This should result in an R object KSData.dataset, which is identical to the output “Kinase-Substrate Links.csv” generated in the earlier exercise with KSEA.Complete().
Here is an example type-up for the R Console:
Scores <- KSEA.Scores(KSData, PX, NetworKIN=TRUE, NetworKIN.cutoff=5)
This should result in an R object Scores, which is identical to the output “KSEA Kinase Scores.csv” generated in the earlier exercise with KSEA.Complete().
Here is an example type-up for the R Console:
KSEA.Barplot(KSData, PX, NetworKIN=TRUE, NetworKIN.cutoff=5, m.cutoff=5, p.cutoff=0.01, export=FALSE)
This should produce a bar plot within R that has the same content as the “KSEA Bar Plot.tiff” generated in the earlier exercise with KSEA.Complete(). Setting export=TRUE would instead save the plot as “KSEA Bar Plot.tiff”, just as KSEA.Complete() does.
Important notes:
This is the overview of all the required parameters for KSEA.Heatmap():
Here is an example type-up for the R Console:
KSEA.Heatmap(score.list=list(KSEA.Scores.1, KSEA.Scores.2, KSEA.Scores.3),
             sample.labels=c("Tumor.A", "Tumor.B", "Tumor.C"),
             stats="p.value", m.cutoff=3, p.cutoff=0.05, sample.cluster=TRUE)
This should result in a .png heatmap saved as “KSEA.Merged.Heatmap.png” within the working directory. Blue = negative kinase scores; White = zero-valued kinase scores; Red = positive kinase scores; Asterisks = scores that met the statistical cutoff, as indicated by the p.cutoff parameter.