This website provides the data and software used in the paper “A Neural Network Model for Constructing Endophenotypes of Common Complex Diseases - An Application to Male Young-onset Hypertension Microarray Data”, submitted to Bioinformatics in January 2009.

• Data files:

Original chip data (159 male subjects: 77 cases and 82 controls, each with 1 to 4 replicate chips; each chip covers 22184 genes. Chips of poor quality have been removed.)

Phalanx_microarray_data.zip (106.39MB) – Original microarray chip data (392 chip data files that passed QC tests; each includes probe ID, probe intensity, probe SD, and gene symbol)

probe_data_from_Phalanx.zip (1.03MB) – Probe Information (gene information of the 22184 probes)

subject_chip_list.xls – A list linking subject IDs to their associated microarray chip IDs.

----------------------------------------------------------------------------------------------------------------------------------------

Data after preprocessing (download the zipped file: data_preprocess.zip (18.9MB))

The data were preprocessed in the following steps.

Step 1. Multiple replicates were merged using weights of 1/SD.

Step 2. Gene values less than 1 were set to 1.

Step 3. A logarithm was then applied to all the resulting values.
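
The three preprocessing steps can be sketched for a single probe as follows. This is only an illustration, not the authors' program: it assumes the weighted merge is a normalized weighted average with weights 1/SD, and it assumes a natural logarithm, since the base is not stated here. The function names are hypothetical.

```python
import math


def merge_replicates(intensities, sds):
    """Step 1: merge replicate intensities for one probe using weights 1/SD.

    Assumption: the merge is a normalized weighted average.
    Every SD must be positive, otherwise 1/SD is undefined.
    """
    weights = [1.0 / sd for sd in sds]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, intensities)) / total


def preprocess_probe(intensities, sds, floor=1.0):
    """Apply steps 1 to 3 to the replicate measurements of one probe."""
    merged = merge_replicates(intensities, sds)  # Step 1: weighted merge
    clipped = max(merged, floor)                 # Step 2: values below 1 become 1
    return math.log(clipped)                     # Step 3: natural log (assumed base)
```

Because values are floored at 1 before the logarithm, every preprocessed value is non-negative under this sketch.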

male_hyp.zip (9.17MB) – male hypertension data (22184 genes, 77 cases)

male_nor.zip  (9.76MB) – male normotensive data (22184 genes, 82 controls)

-----------------------------------------------------------------------------------------------------------------------------------------

Supplementary data (download the zipped file: supl_data.zip (124KB))

** The following data are required for running performance_evaluation.m, figure_generation.m, and significant_genes.m **

top_103_gene_info.csv – probe information of the 103 selected genes

hyp_BP_info.csv – blood pressure information for the 77 hypertensive cases

nor_BP_info.csv – blood pressure information for the 82 normotensive cases

matched_case_control.csv – the 61 matched IDs

match_id.csv – the order of the 61 matched subjects in the dataset

ranked_by_pvalue.csv – genes ranked by their p-values

random_order_id.csv – a random vector to rearrange the order of the dataset to prevent high correlation among the neighboring records

model1.csv – the network parameters (weights) of model 1

model2.csv – the network parameters (weights) of model 2

model3.csv – the network parameters (weights) of model 3

model4.csv – the network parameters (weights) of model 4

model5.csv – the network parameters (weights) of model 5
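
As an illustration of how a reordering vector such as the one in random_order_id.csv might be applied, the sketch below rearranges a list of records by an index vector. It assumes the vector holds 1-based indices (the Matlab convention) forming a permutation of the records; the actual file layout is not described here, and the function name is hypothetical.

```python
def reorder_records(records, order):
    """Return the records rearranged according to a 1-based index vector.

    Assumption: `order` is a permutation of 1..len(records).
    """
    if sorted(order) != list(range(1, len(records) + 1)):
        raise ValueError("order must be a permutation of 1..len(records)")
    # Convert each 1-based index to Python's 0-based indexing.
    return [records[i - 1] for i in order]
```

For example, reorder_records(["a", "b", "c"], [3, 1, 2]) places the third record first, followed by the first and second.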

 

• Software (Matlab programs; download the zipped file: test_software.zip (12KB))

performance_evaluation.m – evaluates the performance of the computed model (its output is used for Table 1).

figure_generation.m – generates the figures in the paper (including those in the supplementary data).

significant_genes.m – computes the significant genes mentioned in the paper

comp_5obj.m – called by “performance_evaluation.m” to compute the five objectives

mycorrel.m – called by “mycorrel” to compute correlations

mycov.m – called by “comp_5obj” to compute covariance
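
For readers without Matlab, the quantities that the correlation and covariance helpers compute can be sketched as below. This is not a port of mycorrel.m or mycov.m: it assumes the sample covariance (dividing by n - 1) and the Pearson correlation, and the function names are hypothetical.

```python
import math


def covariance(x, y):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def correlation(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))
```

The correlation is undefined when either sequence is constant, because its variance is then zero.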

 

Please contact Dr. Ke-Shiuan Lynn if you have any questions about the data.