Significant Statistics for Scientists

By Sriram
April 2007 | Science and Society

Key decisions such as faculty recruitment, granting of tenure, awarding of grants, and election into scientific academies should be taken on the basis of merit. While the criteria used for such decisions are multi-faceted and subjective, there is an evident rationale (if not for fairness and transparency, then for sheer convenience) for using simple, objective metrics to compare individual scientists. Likewise, objective criteria for comparing journals are often used as a proxy for the scientific quality and importance of the papers published in a particular journal, as well as to determine whether it is worthwhile for a library to subscribe to that journal. What are these metrics, and how reasonable and effective are they?

Journal Rankings

Most readers of Natural Selections would be aware of the impact factor, which is described by its publishers as a “quantitative tool for ranking, evaluating, categorizing, and comparing journals” [1]. The impact factor of a journal in a particular year (say 2006) is the ratio A/B, where A is the total number of citations in 2006 to articles published in that journal in 2004-05 and B is the total number of citable articles published in it during 2004-05 [1, 2].

Impact factors have a tremendous influence on both scientists and publishers of journals. Publications in high-impact journals are often a prerequisite for tenure and promotion and, in some countries, researchers may even be rewarded for such publications [9]. Impact factors also guide the choice of journal to which a paper is submitted. Journal publishers, in turn, adopt various means to improve their impact factor, such as increasing the number of review articles (which are cited, on average, more than primary literature) and “suggesting” that authors cite articles in their journals [9].

While a high impact factor indicates a high rate of citation on average, not all articles in a journal are equally cited. In fact, the distribution of the number of citations is highly skewed.
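
As a rough illustration of the ratio A/B and of why a skewed citation distribution matters, the following Python sketch computes an impact factor from invented per-article citation counts and compares it with the median. The function name `impact_factor`, the variable `citations_2006`, and all the numbers are assumptions made for this example; it also simplifies by treating every article as citable.

```python
from statistics import median


def impact_factor(citations_per_article):
    """Return A/B: total citations divided by the number of citable articles.

    Simplifying assumption: every article in the list counts as citable.
    """
    if not citations_per_article:
        raise ValueError("need at least one citable article")
    return sum(citations_per_article) / len(citations_per_article)


# Hypothetical data: citations received in 2006 by each article that an
# imaginary journal published in 2004-05 (invented numbers, not real data).
citations_2006 = [0, 0, 1, 1, 2, 2, 3, 4, 7, 80]

print(impact_factor(citations_2006))  # 10.0 (A = 100 citations, B = 10 articles)
print(median(citations_2006))         # 2.0  (the typical article)
```

With these made-up counts, a single heavily cited article lifts the impact factor to 10 even though the typical article was cited only twice.
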
According to an editorial in the journal Nature [6], a mere 25% of the papers published in 2002-03 accounted for 89% of the citations made to Nature articles in 2004, and a “great majority” of the papers had fewer than twenty citations. In other words, most of the articles had a “low impact.” While these articles might be widely read on account of being published in Nature, they were rarely cited in future work. The impact factor of a journal should, therefore, never be used as a proxy for the “citation impact” of individual papers.

Nevertheless, impact factors are a fair measure of the popularity of a journal. They do not, however, necessarily reflect the prestige of the journal within the scientific community, a fact substantiated by the preponderance of dedicated review periodicals among the highest-impact journals.

Recently, Bollen et al. suggested that a modified version of the PageRank algorithm, the algorithm used by Google to rank web pages, could be used to gauge the prestige of journals [4]. While the impact factor calculation treats all citations to a journal equally, the PageRank algorithm gives more weight to citations from prestigious journals (the algorithm is iterative, since a prestigious journal is defined as one with a high PageRank). The authors also proposed a third metric, the Y-factor, defined as the product of the PageRank and the impact factor of a journal, which was claimed to be an even better yardstick for prestige. For instance, based on data on citations in 2003 to articles published in 2001-02, the five journals with the highest Y-factor were Nature, Science, The New England Journal of Medicine, Cell, and PNAS.
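
The iterative idea behind a PageRank-style journal ranking, and the Y-factor as the product of PageRank and impact factor, can be sketched roughly as follows. This is a plain citation-weighted PageRank computed by power iteration, not the exact modification used by Bollen et al.; the damping value of 0.85, the journal names A, B, and C, the citation counts, and the impact factors are all assumptions invented for illustration.

```python
def pagerank(citations, damping=0.85, iterations=100):
    """Citation-weighted PageRank by power iteration.

    citations[a][b] is the number of citations from journal a to journal b.
    """
    journals = sorted(
        set(citations) | {b for targets in citations.values() for b in targets}
    )
    n = len(journals)
    rank = {j: 1.0 / n for j in journals}
    for _ in range(iterations):
        new_rank = {j: (1.0 - damping) / n for j in journals}
        for a in journals:
            targets = citations.get(a, {})
            total = sum(targets.values())
            if total == 0:
                # A journal that cites nothing spreads its weight evenly.
                for j in journals:
                    new_rank[j] += damping * rank[a] / n
            else:
                # Pass on weight in proportion to the citations given.
                for b, count in targets.items():
                    new_rank[b] += damping * rank[a] * count / total
        rank = new_rank
    return rank


# Hypothetical citation counts between three imaginary journals.
citations = {
    "A": {"B": 10, "C": 5},
    "B": {"A": 20},
    "C": {"A": 5, "B": 5},
}
# Hypothetical impact factors for the same journals.
impact_factors = {"A": 30.0, "B": 5.0, "C": 2.0}

rank = pagerank(citations)
y_factor = {j: rank[j] * impact_factors[j] for j in rank}
for journal in sorted(y_factor, key=y_factor.get, reverse=True):
    print(journal, round(rank[journal], 3), round(y_factor[journal], 3))
```

In this toy network, journal A ends up with the highest PageRank because it receives all of B's citations, and B in turn is well cited; a citation from a highly ranked journal counts for more than one from a poorly ranked journal.
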
The paper by Bollen et al. itself concludes with the remarkable assertion that “the intuitive and simplistic definition of the Y-factor rankings may not be scientifically convincing, still the authors were more than slightly intrigued to find that the top scoring journals according to this ranking principle rather closely matched their personal perception of importance” [4].

Measuring Individual Scientific Achievement

As mentioned at the beginning of this article, the assessment of the scientific prowess of an individual is essential in taking many key decisions in academia. There are at least two sources of information that might facilitate such an assessment: (1) the opinions of colleagues and those of other scientists working in the same field, and (2) the publication record of the individual. Of these two, only the latter is amenable to objective evaluation.

When comparing the publication records of scientists using an objective criterion, it is useful to have a metric that balances quantity (the number of publications) and quality (the impact of publications). The metrics traditionally used to evaluate scientific performance have tended to favor one of these two criteria. For example, metrics like the total number of publications or publications per year do not take the quality of publications into account. On the other hand, a metric like the number of citations per paper measures quality but can be skewed by a rare, high-impact publication. Moreover, metrics that depend on the total number of citations can be manipulated by publishing several review articles. Other metrics, like the number of “significant” papers, while more balanced, require the specification of arbitrary parameters that define significance.

The h-index, proposed by J. E. Hirsch in 2005, attempts to strike this balance with a single number: a researcher has an index of h if h of his or her papers have been cited at least h times each [5]. The h-index, however, has its drawbacks. The index ignores the actual number of citations garnered by the top h papers of a researcher.
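
A minimal sketch of this blind spot, assuming the usual definition of the h-index from Hirsch [5] (the largest h such that h papers have at least h citations each). The function name `h_index` and both citation lists are invented for illustration.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            h = position
        else:
            break
    return h


# Hypothetical citation counts per paper for two imaginary researchers.
researcher_a = [120, 90, 45, 10, 6, 5, 2]
researcher_b = [9, 8, 7, 6, 5, 5, 1]

print(h_index(researcher_a), sum(researcher_a))  # 5 278
print(h_index(researcher_b), sum(researcher_b))  # 5 41
```

Both invented records yield an h-index of 5, although the first has collected almost seven times as many citations as the second.
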
For instance, a researcher whose ten best papers have been cited 500 times each and another whose ten best have been cited only 20 times each can have the same h-index. Hirsch himself notes that while a high h-index is consistent with high achievement, a low h-index does not imply poor achievement. As the Wikipedia entry on the h-index points out, had Einstein died in 1906 (1905 is called Einstein’s Annus Mirabilis, the year in which he published four seminal papers), he would have had a very low h-index even though he would still have been regarded as a brilliant physicist [3]. More rigorous critiques of the h-index can be found in references 7 and 8.

One major limitation of existing metrics for ranking individuals (including the h-index) is that they do not apportion credit among the authors of a multi-author paper. The metrics described above give equal credit to all authors, whereas, in practice, contributions are rarely equal. While the h-index can be computed for individuals working in any field (some fields like biology permit a higher h-index than others, which is all right as long as cross-field comparisons are not made), it is not obvious whether a solution to the credit-apportioning issue can be found that works across multiple fields with differing conventions for co-authorship.

Citation-based yardsticks must be used with great caution when describing individual achievement. The story of Gregor Johann Mendel is a poignant reminder of the pitfalls of using such yardsticks. Mendel’s seminal work on the inheritance of traits in pea plants was published in 1866 in Proceedings of the Natural History Society of Brunn [10]. That work remained in obscurity (just three citations in the first 35 years after publication) until it was rediscovered in the early part of the 20th century. Using the above metrics, Mendel would not have been considered an influential researcher.
Yet his work is part of every textbook on genetics and is hailed as the harbinger of modern genetics.

While Mendel’s story might be an exception, one that is unlikely to be repeated given the availability of comprehensive online literature databases with powerful search facilities, it is worth emphasizing that citation-based metrics suffer from an even more fundamental flaw: they ignore other aspects of academic life such as mentoring, teaching, and “generosity with ideas, skills, and time” [7]. Is it really prudent to “reduce a lifetime’s work” to a single number?

References

[1] http://scientific.thomson.com/free/essays/journalcitationreports/impactfactor/
[2] http://en.wikipedia.org/wiki/Impact_factor
[3] http://en.wikipedia.org/wiki/H-index
[4] Johan Bollen et al., “Journal status,” Scientometrics 69, no. 3 (2006): 669-687.
[5] J. E. Hirsch, “An index to quantify an individual’s scientific research output,” Proc Natl Acad Sci U S A 102, no. 46 (2005): 16569-72.
[6] “Not-so-deep impact,” Nature 435, no. 7045 (2005): 1003-1004.
[7] Clint D. Kelly and Michael D. Jennions, “The h index and career assessment by numbers,” Trends Ecol Evol 21, no. 4 (2006): 167-170.
[8] Sune Lehmann et al., “Measures for measures,” Nature 444, no. 7122 (2006): 1003-1004.
[9] Richard Monastersky, “The Number That’s Devouring Science,” The Chronicle of Higher Education, October 14, 2005.