|
This website provides the data and software used in the paper “A Neural Network Model for Constructing Endophenotypes of Common Complex Diseases - An Application to Male Young-onset Hypertension Microarray Data”, submitted to Bioinformatics in January 2009. |
|
|
• Data files:

Original chip data (159 male subjects: 77 cases and 82 controls, each with 1 to 4 replicate chips, and each chip with 22184 genes; poor-quality chips have been removed.)

Phalanx_microarray_data.zip (106.39MB) – original microarray chip data (392 chip data files that passed QC tests; each includes probe ID, probe intensity, probe SD, and gene symbol)
probe_data_from_Phalanx.zip (1.03MB) – probe information (gene information of the 22184 probes)
subject_chip_list.xls – a list that links subject IDs to their associated microarray chip IDs

----------------------------------------------------------------------------------------------------------------------------------------
Data after
preprocessing (download the zipped file: data_preprocess.zip (18.9MB))

The data were preprocessed by the following steps.
Step 1. Multiple replicates were merged using 1/SD weighting.
Step 2. Gene values less than 1 were set to 1.
Step 3. A logarithm was then applied to all the resulting values.

male_hyp.zip (9.17MB) – male hypertension data (22184 genes, 77 cases)
male_nor.zip (9.76MB) – male normotensive data (22184 genes, 82 controls)

----------------------------------------------------------------------------------------------------------------------------------------
Supplementary data (download the zipped file: supl_data.zip (124KB))

** The following data are required for running performance_evaluation.m, figure_generation.m, and significant_gene.m **

top_103_gene_info.csv – probe information of the selected 103 genes
hyp_BP_info.csv – blood pressure information for the 77 hypertensive cases
nor_BP_info.csv – blood pressure information for the 82 normotensive cases
matched_case_control.csv – the 61 matched IDs
match_id.csv – the order of the 61 matched subjects in the dataset
ranked_by_pvalue.csv – genes ranked by their p-values
random_order_id.csv – a random vector used to rearrange the order of the dataset to prevent high correlation among neighboring records
model1.csv – the network parameters (weights) of model 1
model2.csv – the network parameters (weights) of model 2
model3.csv – the network parameters (weights) of model 3
model4.csv – the network parameters (weights) of model 4
model5.csv – the network parameters (weights) of model 5 |
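The three preprocessing steps listed above (1/SD-weighted merging of replicates, flooring values at 1, and taking a logarithm) can be illustrated for a single gene with the short Python sketch below. It is not the authors' code. It assumes that "weighted (1/SD) merged" means a weighted mean with weight 1/SD per replicate, and that the logarithm is the natural log; the page states neither. The function names merge_replicates and preprocess_gene are hypothetical.

```python
import math


def merge_replicates(intensities, sds):
    """Step 1: merge replicate intensities of one gene with 1/SD weights.

    Assumption: the merge is a weighted mean, weight = 1/SD per replicate.
    """
    if not intensities or len(intensities) != len(sds):
        raise ValueError("need one SD per replicate intensity")
    if any(sd <= 0 for sd in sds):
        raise ValueError("SD values must be positive")
    weights = [1.0 / sd for sd in sds]
    weighted_sum = sum(w * x for w, x in zip(weights, intensities))
    return weighted_sum / sum(weights)


def preprocess_gene(intensities, sds, log=math.log):
    """Apply steps 1 to 3 to the replicate values of one gene.

    Assumption: the log base is not given on the page; natural log is
    used by default and can be swapped (for example, log=math.log2).
    """
    merged = merge_replicates(intensities, sds)  # Step 1: 1/SD-weighted merge
    floored = max(merged, 1.0)                   # Step 2: values below 1 become 1
    return log(floored)                          # Step 3: logarithm


# Two replicates of one gene; the noisier chip (SD 20) gets half the weight.
value = preprocess_gene([100.0, 200.0], [10.0, 20.0])
```

Because of the floor in step 2, the result is never negative: any merged value below 1 maps to a log value of 0.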
|
|
• Software (Matlab programs; download the zipped file: test_software.zip (12KB))

performance_evaluation.m – evaluates the performance of the computed model (its output is used for Table 1)
figure_generation.m – generates the figures in the paper (including those in the supplementary data)
significant_genes.m – computes the significant genes mentioned in the paper
comp_5obj.m – called by “performance_evaluation.m” to compute the five objectives
mycorrel.m – called by “mycorrel” to compute correlations
mycov.m – called by “comp_5obj” to compute covariance

Please contact Dr. Ke-Shiuan Lynn if you have any questions about the data. |
|