Percentile Ranking and Citation Impact of a Large Cohort of National Heart, Lung, and Blood Institute–Funded Cardiovascular R01 Grants
Rationale: Funding decisions for cardiovascular R01 grant applications at the National Heart, Lung, and Blood Institute (NHLBI) largely hinge on percentile rankings. It is not known whether this approach enables funding of the highest-impact science.
Objective: Our aim was to conduct an observational analysis of percentile rankings and bibliometric outcomes for a contemporary set of funded NHLBI cardiovascular R01 grants.
Methods and Results: We identified 1492 investigator-initiated de novo R01 grant applications that were funded between 2001 and 2008 and followed their progress for linked publications and citations to those publications. Our coprimary end points were citations received per million dollars of funding, citations received within 2 years of publication, and 2-year citations for each grant’s maximally cited paper. In 7654 grant-years of funding that generated $3004 million of total National Institutes of Health awards, the portfolio yielded 16 793 publications that appeared between 2001 and 2012 (median per grant, 8; 25th and 75th percentiles, 4 and 14; range, 0–123), which received 2 224 255 citations (median per grant, 1048; 25th and 75th percentiles, 492 and 1932; range, 0–16 295). We found no association between percentile rankings and citation metrics; the absence of association persisted even after accounting for calendar time, grant duration, number of grants acknowledged per paper, number of authors per paper, early investigator status, human versus nonhuman focus, and institutional funding. An exploratory machine learning analysis suggested that grants with the best percentile rankings did yield more maximally cited papers.
Conclusions: In a large cohort of NHLBI-funded cardiovascular R01 grants, we were unable to find a monotonic association between better percentile ranking and higher scientific impact as assessed by citation metrics.
The National Heart, Lung, and Blood Institute (NHLBI) looks to peer review to guide its funding decisions for investigator-initiated R01 grants, which make up the largest single component of our extramural portfolio.1 For the most part, successful applications are those that fall below a percentile ranking cut-off derived from peer review priority scores; this cut-off value, or payline, is determined by budgetary considerations. Despite longstanding tradition and affirmations, many question whether peer review, as it is currently practiced, can identify the research proposals most likely to have high impact on scientific thought, clinical practice, or public policy.2–5 Although previous reports have questioned the internal consistency and validity of peer review,5 there are few data regarding its association, if any, with postaward scientific achievement.4 We, therefore, conducted an observational analysis of percentile rankings and bibliometric outcomes for a contemporary set of NHLBI-funded cardiovascular R01 grants.
We considered all de novo investigator-initiated R01 grants that met the following inclusion criteria: (1) award on or after January 1, 2001, and before September 1, 2008; (2) duration of funding of ≥2 years; (3) assignment to a cardiovascular unit within NHLBI; and (4) receipt of a percentile ranking based on a priority score given by a National Institutes of Health (NIH) peer review study section.
We obtained grant-specific award and funding data from an internal NHLBI Tracking and Budget System, which includes information on investigator status (early stage or established), grantee institution, identity of peer review study section, percentile ranking, project start and end dates, involvement of human subjects, and total funding (including direct and indirect costs).
We used the NIH’s electronic scientific portfolio assistant (eSPA) to generate lists of grant-associated publications, along with publication-specific data on publication type (research or other), total citations, and citations received within 2 years of publication. Our common censor date was September 23, 2012. Our coprimary outcome measures were number of total citations received per million dollars of NIH funding, number of citations received within 2 years of publication, and number of 2-year citations received by each grant’s most highly cited paper (ie, the 1 paper per grant that received the greatest number of 2-year citations). We also calculated each grant’s h-index (and h-index for 2-year citations), where a grant is given an index of h if it includes h papers that have been cited at least h times and none of the grant’s other papers have received more than h citations.6 The eSPA system maps publications to specific grants with the Scientific Publication Information Retrieval and Evaluation System (http://era.nih.gov/nih_and_grantor_agencies/other/spires.cfm) and obtains citation data from ISI Web of Science.
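The grant-level h-index defined above can be computed directly from a grant’s per-paper citation counts. The sketch below is an illustrative implementation only (the function name and data layout are ours, not part of the eSPA system):

```python
def grant_h_index(citations_per_paper):
    """Return the h-index for a list of per-paper citation counts:
    the largest h such that h papers have at least h citations each,
    with every remaining paper having no more than h citations."""
    counts = sorted(citations_per_paper, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h
```

For example, a grant whose papers received 10, 8, 5, 4, and 3 citations has an h-index of 4: four papers were cited at least 4 times each, but not five papers at least 5 times each.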
Because many publications were supported by >1 grant, we adjusted the counts for publications and citations by dividing by the number of acknowledged grants. Thus, if a paper acknowledged 3 grants and garnered 30 citations, each individual grant would be credited with one third of a publication (0.333) and with 10 citations. We also performed a supplementary analysis focusing on papers that acknowledged only 1 grant.
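This fractional crediting rule can be sketched in a few lines; the code below is a minimal illustration under our own assumed data layout (each paper as a set of grant identifiers plus a citation count), not the study’s actual pipeline:

```python
def credit_shares(papers):
    """Fractionally credit publications and citations to grants.

    `papers` is an iterable of (grant_ids, citation_count) pairs.
    A paper acknowledging k grants contributes 1/k of a publication
    and citation_count/k citations to each acknowledged grant.
    Returns two dicts: grant -> publication credit, grant -> citation credit.
    """
    pubs, cites = {}, {}
    for grant_ids, n_citations in papers:
        k = len(grant_ids)
        for g in grant_ids:
            pubs[g] = pubs.get(g, 0.0) + 1.0 / k
            cites[g] = cites.get(g, 0.0) + n_citations / k
    return pubs, cites
```

Applied to the example in the text, a 30-citation paper acknowledging grants A, B, and C credits each grant with one third of a publication and 10 citations.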
For descriptive purposes only, we obtained aggregate data on journals in which publications appeared and on their medical subject heading (MeSH) terms using PubMed PubReminer (http://hgserver2.amc.nl/cgi-bin/miner/miner2.cgi).
For descriptive purposes, we present baseline characteristics of grants with numbers and percentages for categorical variables and mean±SE for continuous variables, stratified by 3 percentile ranking categories: ≤10.0%, 10.1% to 20.0%, and >20.0%. We also described unadjusted citation statistics by generating a Pareto plot, which shows the sum of total citations received within citation deciles; a classic Pareto plot enables one to demonstrate, for example, that 20% of inputs (eg, employees) generate 80% of outputs (eg, productivity). To describe the association of bibliometric outcomes with percentile rankings, we computed and plotted nonparametric locally weighted scatterplot smoothing estimates. We performed multivariable linear regression analyses to account for associations with study type (human subjects or not), grant duration, new investigator status, calendar year of first award, average number of grants acknowledged per paper, average number of authors per paper, average annual funding (in million dollars per year), and total institutional funding within the portfolio of all grants included in the study sample. Because both publications and citations per million dollars allocated have right-skewed distributions, we applied natural logarithmic transformations, ln(Publications/$Million+1) and ln(Citations/$Million+1), and performed goodness-of-fit tests for linear models using nonparametric analysis of deviance F-tests.7
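The ln(x+1) transformation above maps a grant with zero output to 0 while compressing the long right tail; a minimal sketch (function name ours, for illustration only):

```python
import math

def log1p_metric(value_per_million):
    """Natural-log transform ln(x + 1) for a right-skewed,
    nonnegative per-million-dollar publication or citation metric.
    Zero maps to zero; large outliers are compressed."""
    if value_per_million < 0:
        raise ValueError("metric must be nonnegative")
    return math.log(value_per_million + 1.0)
```

The +1 offset is what allows the many zero-output grant-level observations to remain in a model fit on the log scale.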
To further evaluate the independent association of percentile rankings with bibliometric outcomes, we constructed Breiman random forests, machine learning–based constructs that allow for robust assessment of complex associations. As described previously, we assessed relative variable importance with 2 measures: a variable importance value, which reflects the gain in discrimination from adding a variable, and average minimal depth, where 1 is best and higher values suggest lesser importance.8 Statistical analyses were conducted using SAS 9.2 (SAS Institute Inc), Spotfire S+, and the R statistical software packages. We used the qcc package to present the distribution of citations graphically and the randomForestSRC package to construct random forests. We will make copies of the analysis data sets available to interested investigators on request.
There were 1492 cardiovascular R01 grants that met our inclusion criteria. Table 1 summarizes the baseline characteristics of these grants stratified by the percentile ranking categories of ≤10.0%, 10.1% to 20.0%, and >20.0%. Grants with lower percentile rankings had higher funding levels and longer durations.
In 7654 grant-years of funding that generated $3004 million of total NIH awards, the portfolio of 1492 grants yielded 16 793 publications (median per grant, 8; 25th and 75th percentiles, 4 and 14; range, 0–123), which received in total 2 224 255 citations (median per grant, 1048; 25th and 75th percentiles, 492 and 1932; range, 0–16 295) and 109 305 citations within 2 years of publication (median per grant, 38; 25th and 75th percentiles, 15 and 87; range, 0–1302). The median grant h-index was 6 (25th and 75th percentiles, 3 and 11; range, 0–72), whereas the median h-index for 2-year citations was 3 (25th and 75th percentiles, 2 and 5; range, 0–22).
Table 2 presents the most common journals in which publications appeared and the publications’ most common MeSH terms. The 5 most popular journals were American Journal of Physiology (Heart and Circulatory Physiology), Journal of Biological Chemistry, Circulation, Circulation Research, and Hypertension. The 10 most common MeSH terms were animals, humans, male, female, mice, rats, middle aged, cells (cultured), signal transduction, and myocardium.
The median number of publications per million dollars allocated was 4.6 (25th and 75th percentiles, 2.4 and 7.9; range, 0–55). The median number of citations per million dollars allocated was 600 (25th and 75th percentiles, 259 and 1072; range, 0–7269). The number of citations received per million dollars allocated followed an attenuated Pareto distribution; as shown in Figure 1A, the 40% most productive grants generated 76% of the citations, whereas the 40% least productive grants generated only 5%. Similarly, the number of citations received within 2 years of publication followed a Pareto distribution; as shown in Figure 1B, the 40% most productive grants generated 83% of the 2-year citations, whereas the 40% least productive grants generated only 3%.
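The Pareto-style summary above amounts to sorting grants by output and asking what share of the total the most productive fraction accounts for. A minimal sketch of that calculation, on made-up numbers rather than the study data:

```python
def share_of_output(values, top_fraction):
    """Fraction of the total output accounted for by the most
    productive `top_fraction` of observations (eg, top 40% of
    grants by citations per million dollars)."""
    ordered = sorted(values, reverse=True)
    n_top = int(round(len(ordered) * top_fraction))
    total = sum(ordered)
    return sum(ordered[:n_top]) / total if total else 0.0
```

For a toy portfolio of 5 grants with citation outputs [100, 50, 30, 10, 10], the top 40% (2 grants) account for 150 of 200 citations, ie, a 0.75 share, illustrating the concentration pattern in Figure 1.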
Publications and Citations According to Percentile Ranking
There were no associations between percentile rankings and any of the publication and citation metrics we considered (Table 1, bottom). Figure 2A presents grant-specific data of citations per million dollars allocated according to percentile ranking and grant type (human versus nonhuman); Figure 2B to 2D shows the corresponding data for 2-year citations, 2-year citations for maximally cited papers, and grant h-index. Figure 3 shows the data specific for the 6 study sections that reviewed the greatest number of funded grants; again, even within each study section, there was no association between percentile rankings and citations received per million dollars allocated.
In a machine learning Breiman random forest model, which accounted for the same covariates listed in Table 1, the strongest predictor of citations per million dollars allocated was average number of grants acknowledged per paper, whereas percentile ranking was a much weaker predictor (Figure 4A). There was no clear monotonic association between percentile rankings and citations per million dollars allocated (Figure 4B). The association between average number of grants acknowledged per paper and citations per million dollars allocated followed an inverse-V pattern, with a peak at ≈3 to 4 grants (Figure 4C). There was a weak association between percentile rankings and 2-year citations for a grant’s most highly cited paper, with 2-year citation rates highest for grants with a percentile ranking <10 (Figure 4D).
Papers That Acknowledged Only 1 Grant
In a supplementary analysis, we identified 927 R01 grants that produced ≥1 paper that acknowledged only 1 grant. The 4122 single-grant papers (representing 25% of all papers) received 548 024 citations, of which 21 615 occurred within 2 years of publication. In unadjusted analyses, there was no association between percentile rankings and total number of citations received (Figure 5A) or citations received within 2 years (Figure 5B). After accounting for covariates, there was no association between percentile rankings and total citations (adjusted P=0.40) or citations received within 2 years (adjusted P=0.59).
We analyzed the bibliometric outcomes of 1492 cardiovascular R01 grants that received initial funding between 2001 and 2008 according to peer review percentile rankings. We found no clear association between percentile rankings and outcomes; as percentile ranking decreased (meaning a better score), we did not observe a corresponding monotonic increase in publications produced or citations received per million dollars spent. The absence of association persisted even after accounting for select confounders, for consideration of human versus nonhuman research, and for actions taken by specific high-volume study sections.
In an exploratory machine learning analysis, we found an intriguing, though admittedly weak, pattern whereby grants scoring in the 10th percentile or better generated more 2-year citations for maximally cited papers. This pattern suggests that peer review may identify grants that generate home runs, that is, individual papers that have unusually high impact. Given that scientific discovery is inherently heavy-tailed,9 our observation is worthy of further exploration in other grant cohorts. Our observation may be particularly relevant now, when the funding climate is far less generous than it once was.
Our findings are consistent with previous impressions that peer review assessments of grant applications are relatively crude predictors of subsequent productivity.2,4,5,10,11 Critics argue that the current approach to selecting proposals for funding has no evidence base.10,11 Some empirical work suggests that selection mechanisms that focus on researcher track records, instead of peer review assessments of project proposals, may better predict subsequent high-impact publications and willingness to consider innovative ideas.12
If percentile ranking does not predict scientific outcome, there is a rationale for considering other approaches to evaluating proposals and choosing which ones to fund. Kaplan2 suggested that the study section committee structure inherently discriminates against innovative projects and identified alternative peer review methods, which range from appointing prescient individuals to using highly inclusive Web-based crowdsourcing. Ioannidis10 identified 6 possible alternative options for choosing projects to fund: egalitarian (fund all but at low amounts), aleatoric (fund at random, an approach being used by some sponsors), career assessment, automated impact indices, scientific citizenship, and projects with broad goals. He and others acknowledge that it is not known which approach (if any) is best, and therefore funding agencies, including NIH, should consider conducting randomized trials.13 The National Cancer Institute recently modified its approach to funding decisions, restricting automatic funding to those grants with percentile rankings of ≤7, while staff scientists undertake initial review to decide which additional grants to fund. Our machine learning exploratory analyses (Figure 4D) might offer support for such an approach—automatically fund applications with top-notch percentile rankings while taking a more deliberative approach for all others.
There are limitations to our analyses. There is no clear gold standard for measuring research success or impact. We focused on the number of citations, and specifically citations according to funding, as our primary end point, an end point consonant with those used by others interested in measuring scientific productivity.14 Citations represent a measure of interest on the part of the scientific community; recent work is focusing on newer Web-based, arguably more sensitive, measures of impact. Work on rare diseases may attract interest from a smaller community, yet still be of substantial scientific value. We deliberately chose not to analyze outcomes according to journal impact factor, because this practice has recently been singled out for intense criticism.15 Nature, a journal with one of the highest impact factors in biomedical science, has argued against the use of impact factor for describing the impact of individual papers, noting that only a small proportion of its papers yields the vast majority of its citations.16 Recently, the Editor-in-Chief of Science, another journal with a high impact factor, critiqued the misuse of a metric that was “never intended to be used to evaluate individual scientists.”17 Our analyses focused only on funded R01 cardiovascular grant applications but still covered a diversity of projects, scientists, and scientific institutions. There were many confounders we did not consider, such as detailed preapplication metrics of principal investigators.
Finally, we did not compare the productivity of scientists who successfully secured funding as compared with others who did not; such an analysis would be beyond the scope of our study. One report from the National Bureau of Economic Research suggested a moderate impact of R01 funding on scientific productivity.18 Nonetheless, because we were only able to analyze outcomes for those grants that scored within a relatively narrow and positively received range, our findings are consistent with an argument that NIH funding levels are inadequate to support all potentially fundable high-quality work.
Despite these limitations, we noted a striking lack of association between percentile rankings and bibliometric outcomes in a large cohort of cardiovascular R01 grants. Our findings offer justification for further research and consideration into innovative approaches for evaluating research proposals and for selecting projects for funding.
We are grateful to Dr Frank Evans for his invaluable assistance in assembling the analysis data set. We also wish to thank the anonymous peer reviewers for their constructive comments and suggestions, and in particular for their queries on how to handle papers that acknowledge support from multiple grants and on the possibility of calculating grant-specific h-indices.
Sources of Funding
All authors were full-time employees of the National Heart, Lung, and Blood Institute at the time they worked on this project.
- Nonstandard Abbreviations and Acronyms
- eSPA — electronic scientific portfolio assistant
- MeSH — medical subject heading
- NHLBI — National Heart, Lung, and Blood Institute
- Received September 19, 2013.
- Revision received January 3, 2014.
- Accepted January 8, 2014.
- © 2014 American Heart Association, Inc.
References (author listings)
- Galis ZS, Hoots WK, Kiley JP, Lauer MS
- Kaplan D
- Demicheli V, Di Pietrantonj C
- Graves N, Barnett AG, Clarke P
- Hirsch JE
- Hastie T, Tibshirani R
- Press WH
- Langer JS
- Dizikes P
- Balaban RS
- Alberts B