Editorial
From the Departments of Biostatistics and Epidemiology (J.S.R.) and Physiology and Biophysics (M.B.), School of Medicine, Case Western Reserve University and the Department of Molecular Cardiology (M.B.), Lerner Research Institute, The Cleveland Clinic Foundation, Cleveland, Ohio.
Correspondence to Meredith Bond, PhD, Department of Molecular Cardiology/NB50, Lerner Research Institute, The Cleveland Clinic Foundation, 9500 Euclid Ave, Cleveland, OH 44195. E-mail bondm{at}ccf.org
Key Words: microarrays • statistical analysis • cDNA • cardiac remodeling
Over the last 10 to 20 years, the search for mechanisms responsible for cardiac remodeling during cardiac hypertrophy and failure has been hampered by the experimental tools available (primarily Western blot analysis and polymerase chain reaction), because these approaches permit measurement of the expression levels of only a few preselected genes at one time. However, there is increasing evidence that at the molecular level the changes that occur during development of heart failure represent a complex series of interrelated events.1 2 3 To identify the full scope and complexity of the subcellular changes that take place, and thereby make more rapid progress in identifying causes and cures of heart disease, we must depend on emerging high-throughput gene-profiling technologies. These newer approaches permit expression screening of very large numbers of genes simultaneously and then clustering of the results into functional gene families.4 5 As stated by Weinstein et al,6 "We will have to understand our favorite biological molecule in the context of many thousands of others ... a wide net must be cast to be sure that we have, in fact, found the important ones ..." (p. 627).
Both cDNA7 and oligonucleotide arrays8 permit an unbiased assessment (ie, no preselection required) of expression levels of thousands of full-length genes, cDNAs, or expressed sequence tags. In a relatively short period of time, high-density cDNA and oligonucleotide arrays have become almost household words in gene expression studies. Both off-the-shelf and customized arrays are increasingly finding their way into the tool chest of heart researchers. Just 1 year ago in Circulation Research, an article10 and accompanying editorial9 heralded the emergence of gene expression profiling using cDNA arrays as a powerful approach to perform broad-based gene expression studies in heart disease. Several reports have already appeared identifying classes, or clusters, of genes whose expression changes during cardiac remodeling, hypertrophy, or failure.10 11 12 13 Collectively, these studies represent a major leap forward in our ability to sort out the different pathways in the heart or isolated cardiac myocytes, where changes in gene expression and, very likely, changes in protein expression have occurred.
With the draft sequence of the human genome now complete, the good news is that the number of genes or gene fragments whose expression can be assessed in a single pass by high-throughput analysis will steadily increase, and analysis and display programs that can handle the enormous amount of data should become more readily available.7 As the choice of microarrays widens and the arrays (hopefully) become cheaper, more and more investigators will be accessing this technology. However, the study by Liu et al14 in this issue of Circulation Research reminds us that in our rush to embrace these new technologies, we need to pause and consider several important issues pertinent to data analysis and interpretation (Figure).
Stage 1 of this gene exploration is relatively straightforward. It depends first on the size of an investigator's supply budget for purchase of cDNA arrays or GeneChip® and, secondly, on his or her technical skill to consistently produce high-quality labeled RNA or cDNA. Stage 2, data analysis, is more problematic. There are several issues to consider. First is simply the size of the data sets; gigabytes of computer storage and very fast computers are now routinely required for storage and manipulation of gene expression data. In theory, this problem can be overcome by use of faster computers, large disk storage arrays, fast network interconnects, and modern data backup and archiving systems (albeit at great expense). The second issue is a thornier one, and this is addressed by Liu et al.14 How can we determine which changes in gene expression are statistically significant? How do we set the sensitivity and specificity of the analysis, and, in view of the very large number of genes analyzed, how do we avoid false positives?
Gene screening involves statistical hypothesis testing and as such has built-in type I and type II errors. There are two issues at play here, one of which is addressed by the study by Liu et al.14 The other is indirectly addressed. The first issue deals with how replicates increase the accuracy of database estimates and hence statistical hypothesis testing. To investigate this question, Liu et al have chosen as their test system the changes in gene expression in isolated cardiac myocytes stimulated by insulin-like growth factor-1 (IGF-1). IGF-1 is one of several factors known to trigger changes leading to cardiac hypertrophy, resulting in increased cell size, assembly of sarcomeres, and reexpression of fetal genes. Liu et al cap a rigorous statistical analysis of their cDNA expression data with a report of identification of several novel genes. Recently, this same issue was formally addressed for microarray data.15 However, the study by Liu et al14 uses a more heuristic approach to demonstrate how increasing the number of replicates reduces false discovery rates (FDRs) of gene expression changes during cardiac remodeling.
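The effect of replication on false calls can be illustrated with a small calculation. The sketch below is not the analysis of Liu et al; it assumes, purely for illustration, that the log-ratio of an unchanged gene is Gaussian noise with a known standard deviation and that a gene is "called" when the mean of its replicates exceeds a fixed threshold in either direction. The function names and the example values (5000 genes, threshold 1.0, noise SD 0.5) are hypothetical.

```python
import math


def normal_tail(z):
    """Two-sided tail probability P(|Z| > z) for a standard normal Z."""
    return math.erfc(z / math.sqrt(2))


def null_exceedance_prob(threshold, sigma, n_replicates):
    """Chance that the mean of n replicate log-ratios of an UNCHANGED gene
    exceeds +/- threshold, assuming independent Gaussian noise with SD sigma.
    """
    # Averaging n replicates shrinks the noise by a factor of sqrt(n).
    standard_error = sigma / math.sqrt(n_replicates)
    return normal_tail(threshold / standard_error)


def expected_false_calls(n_genes, threshold, sigma, n_replicates):
    """Expected number of unchanged genes called 'changed' by chance alone."""
    return n_genes * null_exceedance_prob(threshold, sigma, n_replicates)


if __name__ == "__main__":
    # Hypothetical array of 5000 unchanged genes, threshold 1.0, noise SD 0.5.
    for n in (1, 2, 4, 8):
        print(n, expected_false_calls(5000, 1.0, 0.5, n))
```

Under these assumptions the expected number of chance calls falls steeply as replicates are added, because the standard error of the mean shrinks with the square root of the number of replicates. Real array noise is rarely this well behaved, which is one reason resampling-based estimates are attractive.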
The second issue deals with the multiplicity of statistical tests conducted. In this case, the usual error rates (P<0.05) applied to each test are no longer valid. Instead, family-wise error rates (cumulative error rates over the total number of hypotheses tested) need to be considered and procedures need to be developed to ensure that the overall error rate over all tests conducted is below some threshold. However, when thousands of tests are conducted, as in the case of gene screening, this becomes impractical. Therefore, the notion of FDRs16 17 has been developed to answer the following question: out of all of the hypothesis tests rejected (ie, significant gene differences found between IGF-1 and control), what proportion are rejected incorrectly? FDRs can be estimated from data using permutation or bootstrap methodologies (simulation techniques used when traditional assumptions, such as normality, do not hold) and have been successfully applied to gene screening for microarrays.18 A minor point is that the authors equate low FDRs with high sensitivity, whereas low FDRs actually indicate high specificity. To achieve high sensitivity, one would have to have some knowledge of false negatives. Whereas some theoretical work along these lines has been done, nothing has yet been extended to the microarray problem.
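The contrast between family-wise control and FDR control can be made concrete with a minimal sketch. The code below implements the step-up procedure of Benjamini and Hochberg16 alongside a Bonferroni correction for comparison. It is an illustration only: it is not the permutation-based estimate described above, it is not claimed to be the method used by Liu et al, and the P values are invented. The Benjamini-Hochberg guarantee also assumes independent (or positively dependent) tests, which may not hold for coregulated genes.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans (same order as p_values); True means the
    hypothesis is rejected while controlling the FDR at level q.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * q / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


def bonferroni(p_values, alpha=0.05):
    """Family-wise control: reject only if p <= alpha / (number of tests)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


if __name__ == "__main__":
    # Invented P values for eight hypothetical genes.
    pvals = [0.001, 0.008, 0.012, 0.03, 0.2, 0.5, 0.7, 0.9]
    print("BH rejections:        ", sum(benjamini_hochberg(pvals)))
    print("Bonferroni rejections:", sum(bonferroni(pvals)))
```

With these invented values the FDR procedure rejects more hypotheses than the Bonferroni correction, which illustrates why family-wise control becomes impractically conservative as the number of genes screened grows into the thousands.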
Acknowledgments
The authors wish to acknowledge Carley Gwin, Director, Gene Expression Core, and Eldon Walker, Director, Research Computing Services, at the Lerner Research Institute, The Cleveland Clinic Foundation.
Footnotes
The opinions expressed in this editorial are not necessarily those of the editors or of the American Heart Association.
References
1. Chien KR. Genomic circuits and the integrative biology of cardiac diseases. Nature. 2000;407:227-232.
2. Houser SR, Lakatta EG. Function of the cardiac myocyte in the conundrum of end-stage, dilated human heart failure. Circulation. 1999;99:600-604.
3. Mann DL. Mechanisms and models in heart failure: a combinatorial approach. Circulation. 1999;100:999-1008.
4. Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A. 1998;95:14863-14868.
5. Overbeek R, Fonstein M, D'Souza M, Pusch GD, Maltsev N. The use of gene clusters to infer functional coupling. Proc Natl Acad Sci U S A. 1999;96:2896-2901.
6. Weinstein JN. Fishing expeditions [letter]. Science. 1998;282:627.
7. Eisen MB, Brown PO. DNA arrays for analysis of gene expression. Methods Enzymol. 1999;303:179-205.
8. Lipshutz RJ, Fodor SP, Gingeras TR, Lockhart DJ. High density synthetic oligonucleotide arrays. Nat Genet. 1999;21(suppl 1):20-24.
9. Abdellatif M. Leading the way using microarray: a more comprehensive approach for discovery of gene expression patterns. Circ Res. 2000;86:919-920.
10. Stanton LW, Garrard LJ, Damm D, Garrick BL, Lam A, Kapoun AM, Zheng Q, Protter AA, Schreiner GF, White RT. Altered patterns of gene expression in response to myocardial infarction. Circ Res. 2000;86:939-945.
11. Friddle CJ, Koga T, Rubin EM, Bristow J. Expression profiling reveals distinct sets of genes altered during induction and regression of cardiac hypertrophy. Proc Natl Acad Sci U S A. 2000;97:6745-6750.
12. Taylor LA, Carthy CM, Yang D, Saad K, Wong D, Schreiner G, Stanton LW, McManus BM. Host gene regulation during coxsackievirus B3 infection in mice: assessment by microarrays. Circ Res. 2000;87:328-334.
13. Yang J, Moravec CS, Sussman MS, DiPaola NR, Fu D, Hawthorn L, Young JB, Francis GS, McCarthy PM, Bond M. Decreased expression of striated muscle LIM protein-1 (SLIM1) and increased expression of gelsolin in failing human hearts by high density oligonucleotide arrays. Circulation. 2000;102:3046-3052.
14. Liu T-j, Lai H-c, Wu W, Chinn S, Wang PH. Developing a strategy to define the effects of insulin-like growth factor-1 on gene expression profile in cardiomyocytes. Circ Res. 2001;88:1231-1238.
15. Lee M, Kuo F, Whitmore G, Sklar J. Importance of replication in microarray gene expression studies: statistical methods and evidence from a single cDNA array experiment. Proc Natl Acad Sci U S A. 2000;97:9834-9839.
16. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Royal Stat Soc B. 1995;57:289-300.
17. Benjamini Y, Yekutieli D. Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics. J Stat Plan Infer. 1999;82:171-196.
18. Tusher V, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A. 2001;98:5116-5121.
Copyright © 2001 American Heart Association, Inc. All rights reserved. Unauthorized use prohibited.