Integrated differential expression (DE) and differential co-expression (DC) analysis of gene expression data, based on the DECODE (DifferEntial CO-expression and Differential Expression) algorithm. Given gene expression data and functional gene set data, the program returns a summary table of the selected gene sets with high differential co-expression and high differential expression (HDC-HDE).
Lui, TWH, Tsui, NBY, Chan, LWC, SP Siu, PM, Wong, C, Yung, BYM. (2015) DECODE: an integrated differential co-expression and differential expression analysis of gene expression data. BMC Bioinformatics, May 31;16:182. http://www.biomedcentral.com/1471-2105/16/182?fmt_math=yes&fmt_math_check=on
Gene expression data format:
Columns are tab separated
Column 1: Official gene symbol
Column 2: Probe ID
Starting from column 3: Expression for different samples
Row 1 (starting from column 3): Sample class (“1” indicates control group; “2” indicates case group)
Row 2: Sample id
Starting from row 3: Expression for different genes
geneName | probeID | 2 | 2 | 2 | 1 | 1 | 1 |
---|---|---|---|---|---|---|---|
- | - | Case Sample 1 | Case Sample 2 | Case Sample 3 | Control Sample 1 | Control Sample 2 | Control Sample 3 |
7A5 | ILMN_1762337 | 5.12621 | 5.19419 | 5.06645 | 5.40649 | 5.51259 | 5.38700 |
A1BG | ILMN_2055271 | 5.63504 | 5.68533 | 5.66251 | 5.37466 | 5.43955 | 5.50973 |
A1CF | ILMN_2383229 | 5.41543 | 5.58543 | 5.43239 | 5.49634 | 5.62685 | 5.36962 |
A26C3 | ILMN_1653355 | 5.56713 | 5.55470 | 5.59547 | 5.46895 | 5.49622 | 5.50094 |
A2BP1 | ILMN_1814316 | 5.23016 | 5.33808 | 5.31413 | 5.30586 | 5.40108 | 5.31855 |
A2M | ILMN_1745607 | 7.65332 | 6.56431 | 8.20163 | 9.19837 | 9.04295 | 10.1448 |
A2ML1 | ILMN_2136495 | 5.53532 | 5.93801 | 5.33728 | 5.36676 | 5.79942 | 5.13974 |
A3GALT2 | ILMN_1668111 | 5.18578 | 5.35207 | 5.30554 | 5.26107 | 5.26536 | 5.28932 |
A4GALT | ILMN_1735045 | 6.34751 | 5.56750 | 6.92335 | 7.49523 | 7.12119 | 6.54748 |
A4GNT | ILMN_1680754 | 5.26417 | 5.28596 | 5.27560 | 5.28830 | 5.08440 | 5.44869 |
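
The layout above can be checked before a run by parsing the file by hand. The following is a minimal sketch, not part of the decode package: `read_expression_file` is a hypothetical helper, and it assumes the file has no header beyond the two rows described above. It uses two rows from the example table.

```r
# Hypothetical helper (not part of the decode package): parse an expression
# file whose first two rows hold the sample classes and sample IDs.
read_expression_file <- function(path) {
  raw <- read.delim(path, header = FALSE, stringsAsFactors = FALSE,
                    colClasses = "character")
  meta_cols <- 1:2                                   # gene symbol, probe ID
  meta_rows <- 1:2                                   # sample class, sample ID
  sample_class <- as.integer(unlist(raw[1, -meta_cols]))   # 1 = control, 2 = case
  sample_id <- unlist(raw[2, -meta_cols], use.names = FALSE)
  expr <- as.matrix(raw[-meta_rows, -meta_cols])
  storage.mode(expr) <- "numeric"                    # character -> numeric
  dimnames(expr) <- list(raw[-meta_rows, 1], sample_id)
  list(class = sample_class, probe = raw[-meta_rows, 2], expr = expr)
}

# Write a small tab-separated file taken from the example table, then read it.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "geneName\tprobeID\t2\t2\t2\t1\t1\t1",
  paste0("-\t-\tCase Sample 1\tCase Sample 2\tCase Sample 3\t",
         "Control Sample 1\tControl Sample 2\tControl Sample 3"),
  "A1BG\tILMN_2055271\t5.63504\t5.68533\t5.66251\t5.37466\t5.43955\t5.50973",
  "A2M\tILMN_1745607\t7.65332\t6.56431\t8.20163\t9.19837\t9.04295\t10.1448"
), tmp)

dat <- read_expression_file(tmp)
table(dat$class)   # number of samples per class
```

Reading every column as character first avoids type guessing on the two mixed metadata rows; only the expression block is converted to numeric.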
Functional gene set data format:
Columns are tab separated
Column 1: Name of gene set
Column 2: Gene set ID (e.g. GO ID)
Starting from column 3: Genes (using official gene symbols) in the gene set
Column 1 | Column 2 | Column 3, 4, … |
---|---|---|
positive regulation of epidermal growth factor-activated receptor activity | GO 0045741 | EREG FBXW7 EPGN ADAM17 ADRA2C ADRA2A TGFA EGF |
pyrimidine-containing compound salvage | GO 0008655 | UPP1 TYMP TK1 UPP2 UCKL1 CDA TK2 UCK1 DCK |
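
Because each gene set has a different number of genes, the file is ragged and is easier to read line by line than as a rectangular table. The sketch below illustrates this with the two example rows; `read_gene_sets` is a hypothetical helper, not a function of the decode package, and it assumes that every field (including each gene symbol) is separated by a tab.

```r
# Hypothetical helper (not part of the decode package): read a ragged,
# tab-separated gene set file into names, IDs and member genes.
read_gene_sets <- function(path) {
  fields <- strsplit(readLines(path), "\t", fixed = TRUE)
  fields <- fields[lengths(fields) >= 3]            # need at least one gene
  list(
    name  = vapply(fields, `[`, character(1), 1),   # column 1: gene set name
    id    = vapply(fields, `[`, character(1), 2),   # column 2: gene set ID
    genes = lapply(fields, function(f) f[-(1:2)])   # column 3 onwards: genes
  )
}

tmp <- tempfile(fileext = ".txt")
writeLines(c(
  paste(c("positive regulation of epidermal growth factor-activated receptor activity",
          "GO 0045741", "EREG", "FBXW7", "EPGN", "ADAM17", "ADRA2C", "ADRA2A",
          "TGFA", "EGF"), collapse = "\t"),
  paste(c("pyrimidine-containing compound salvage", "GO 0008655", "UPP1", "TYMP",
          "TK1", "UPP2", "UCKL1", "CDA", "TK2", "UCK1", "DCK"), collapse = "\t")
), tmp)

gs <- read_gene_sets(tmp)
lengths(gs$genes)   # number of genes in each set
```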
library(decode)
Run DECODE on a larger gene expression data set with 1400 genes. This takes about 16 minutes to complete on an Intel Core i7-4600 processor (2.69 GHz, 8 GB RAM).
path = system.file('extdata', package='decode')
geneSetInputFile = file.path(path, "geneSet.txt")
geneExpressionFile = file.path(path, "Expression_data_1400genes.txt")
runDecode(geneSetInputFile, geneExpressionFile)
Run DECODE on a smaller gene expression data set with 50 genes, included to satisfy CRAN's submission requirements. (No results will be generated.)
path = system.file('extdata', package='decode')
geneSetInputFile = file.path(path, "geneSet.txt")
geneExpressionFile = file.path(path, "Expression_data_50genes.txt")
runDecode(geneSetInputFile, geneExpressionFile)
## [1] "Reading gene expression data..."
## [1] "Calculating t-statistics..."
## [1] "Calculating pairwise correlation for normal states..."
## [1] "Calculating pairwise correlation for disease states..."
## [1] "Calculating differential co-expression measures ..."
## [1] "Reading functional gene set data"
## [1] "Identifying optimal thresholds for genes"
## [1] "Gene id: 1"
## [1] "Gene id: 2"
## [1] "Gene id: 3"
## [1] "Gene id: 4"
## [1] "Gene id: 5"
## [1] "Gene id: 6"
## [1] "Gene id: 7"
## [1] "Gene id: 8"
## [1] "Gene id: 9"
## [1] "Gene id: 10"
## [1] "Gene id: 11"
## [1] "Gene id: 12"
## [1] "Gene id: 13"
## [1] "Gene id: 14"
## [1] "Gene id: 15"
## [1] "Gene id: 16"
## [1] "Gene id: 17"
## [1] "Gene id: 18"
## [1] "Gene id: 19"
## [1] "Gene id: 20"
## [1] "Gene id: 21"
## [1] "Gene id: 22"
## [1] "Gene id: 23"
## [1] "Gene id: 24"
## [1] "Gene id: 25"
## [1] "Gene id: 26"
## [1] "Gene id: 27"
## [1] "Gene id: 28"
## [1] "Gene id: 29"
## [1] "Gene id: 30"
## [1] "Gene id: 31"
## [1] "Gene id: 32"
## [1] "Gene id: 33"
## [1] "Gene id: 34"
## [1] "Gene id: 35"
## [1] "Gene id: 36"
## [1] "Gene id: 37"
## [1] "Gene id: 38"
## [1] "Gene id: 39"
## [1] "Gene id: 40"
## [1] "Gene id: 41"
## [1] "Gene id: 42"
## [1] "Gene id: 43"
## [1] "Gene id: 44"
## [1] "Gene id: 45"
## [1] "Gene id: 46"
## [1] "Gene id: 47"
## [1] "Gene id: 48"
## [1] "Gene id: 49"
## [1] "Gene id: 50"
## [1] "Identifying best associated functional gene set for each gene..."
## [1] "Gene id: 1"
## [1] "Gene id: 2"
## [1] "Gene id: 3"
## [1] "Gene id: 4"
## [1] "Gene id: 5"
## [1] "Gene id: 6"
## [1] "Gene id: 7"
## [1] "Gene id: 8"
## [1] "Gene id: 9"
## [1] "Gene id: 10"
## [1] "Gene id: 11"
## [1] "Gene id: 12"
## [1] "Gene id: 13"
## [1] "Gene id: 14"
## [1] "Gene id: 15"
## [1] "Gene id: 16"
## [1] "Gene id: 17"
## [1] "Gene id: 18"
## [1] "Gene id: 19"
## [1] "Gene id: 20"
## [1] "Gene id: 21"
## [1] "Gene id: 22"
## [1] "Gene id: 23"
## [1] "Gene id: 24"
## [1] "Gene id: 25"
## [1] "Gene id: 26"
## [1] "Gene id: 27"
## [1] "Gene id: 28"
## [1] "Gene id: 29"
## [1] "Gene id: 30"
## [1] "Gene id: 31"
## [1] "Gene id: 32"
## [1] "Gene id: 33"
## [1] "Gene id: 34"
## [1] "Gene id: 35"
## [1] "Gene id: 36"
## [1] "Gene id: 37"
## [1] "Gene id: 38"
## [1] "Gene id: 39"
## [1] "Gene id: 40"
## [1] "Gene id: 41"
## [1] "Gene id: 42"
## [1] "Gene id: 43"
## [1] "Gene id: 44"
## [1] "Gene id: 45"
## [1] "Gene id: 46"
## [1] "Gene id: 47"
## [1] "Gene id: 48"
## [1] "Gene id: 49"
## [1] "Gene id: 50"
## [1] "Processing raw results..."
## [1] "Summarizing functional gene set results..."
## [1] "Done. Result is saved in out_summary.txt"
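
The first steps reported in the log above (t-statistics, pairwise correlations per state, and a differential co-expression measure) can be illustrated on a toy matrix. This is only a sketch under stated assumptions: it uses a Welch t-statistic from `stats::t.test` for DE and a difference of Fisher z-transformed Pearson correlations for DC. Whether `runDecode` computes exactly these quantities is an assumption here; see the cited paper for the actual definitions. The data are random and all object names are illustrative.

```r
set.seed(1)

# Toy data: 5 genes x 12 samples, 6 controls (class 1) and 6 cases (class 2).
expr <- matrix(rnorm(5 * 12), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("s", 1:12)))
cls <- rep(c(1L, 2L), each = 6)

# Differential expression: one Welch t-statistic per gene (case vs control).
t_stat <- apply(expr, 1, function(x) {
  unname(t.test(x[cls == 2], x[cls == 1])$statistic)
})

# Pairwise gene-gene Pearson correlation within each state.
r_ctrl <- cor(t(expr[, cls == 1]))
r_case <- cor(t(expr[, cls == 2]))
diag(r_ctrl) <- 0   # self-correlations are not of interest and
diag(r_case) <- 0   # atanh(1) would be infinite

# Differential co-expression (assumed measure): difference of Fisher
# z-transformed correlations divided by its approximate standard error.
n_ctrl <- sum(cls == 1)
n_case <- sum(cls == 2)
dc_z <- (atanh(r_case) - atanh(r_ctrl)) /
  sqrt(1 / (n_ctrl - 3) + 1 / (n_case - 3))

round(t_stat, 2)
round(dc_z, 2)
```

A gene with a large absolute t-statistic is a DE candidate, and a gene pair with a large absolute `dc_z` is a DC candidate.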