«CSE Report 643 Lee J. Cronbach Stanford University Editorial Assistance by Richard J. Shavelson Stanford University December 2004 Center for the ...»
My Current Thoughts on Coefficient Alpha and Successor Procedures
CSE Report 643
Lee J. Cronbach
Editorial Assistance by
Richard J. Shavelson
Center for the Study of Evaluation (CSE)
National Center for Research on Evaluation,
Standards, and Student Testing (CRESST)
Graduate School of Education & Information Studies
University of California, Los Angeles
Los Angeles, CA 90095-1522
Project 3.6: Study Group Activity on Cognitive Validity. Richard Shavelson, Stanford University.
Copyright © 2004 The Regents of the University of California
The work reported herein was supported under the Educational Research and Development Centers Program, PR/Award Number R305B960002, as administered by the Institute of Education Sciences (IES), U.S. Department of Education.
The findings and opinions expressed in this report do not reflect the positions or policies of the National Center for Education Research, the Institute of Education Sciences, or the U.S. Department of Education.
EDITOR’S PREFACE TO LEE J. CRONBACH’S
“MY CURRENT THOUGHTS ON COEFFICIENT ALPHA AND SUCCESSOR PROCEDURES”
Lee looked exhausted, listless, and despondent. I knew he was battling a disease that severely limited his eyesight, but I didn’t expect to see this. Perhaps his eyesight was taking a greater toll than I had imagined.
I also knew that he had just completed a major project that his close friend and colleague, Richard E. Snow, had begun and couldn’t complete because of his untimely death in December 1997. As dean of the Stanford University School of Education, I had encouraged Lee to take on the project and provided research assistant support in the person of Min Li, now an assistant professor at the University of Washington. For a good 3 years, Lee had led a team of scholars (Lyn Corno, Haggai Kupermintz, David F. Lohman, Ellen B. Mandinach, Ann W. Porteus, and Joan E. Talbert, all of The Stanford Aptitude Seminar) in completing Remaking the Concept of Aptitude: Extending the Legacy of R.E. Snow. Perhaps Lee simply reflected the letdown that often comes upon completing a major project.
Finally, I knew that to honor his commitment to Dick Snow, Lee had put a very important project on the back burner. He was quite aware at the time he took on the Snow project that the 50th anniversary of his “alpha paper” was fast approaching. Before Dick’s death he had planned on writing a major technical paper for Psychometrika on his current views at this golden anniversary. The Snow project had intervened and it looked as if this alpha paper wasn’t going to get done.
In the end, all three events probably contributed to his demeanor at our meeting at the Center in June 2001. It would be two months later that I would learn the major reason for his appearance: Lee had decided in February of that year not to take medication to retard his congestive heart failure; he died on October 3rd, 2001.
Fortunately, being in my usual state of ignorance in June and knowing how excited Lee could get at the thought of a new project, I didn’t hesitate to give unsolicited advice based on my then-current “diagnosis.” I asked Lee to reconsider doing a paper to celebrate the 50th anniversary of coefficient alpha. I tried to persuade him that the world didn’t need another esoteric technical paper when what was important were his nontechnical ideas about alpha now. I went on to say, in response to his demurral due to poor eyesight, that he could dictate his thoughts and we could have them transcribed.
I believe the idea of a new project lifted his spirits. As I drove him home, he allowed as how he’d give the new project some thought but that I shouldn’t be too optimistic. A day or two later, he called. He was excited, spirits lifted: he’d do the project... if I would agree to help edit the manuscript. We had a deal and the result is before your eyes in the article that follows this preface, “My Current Thoughts on Coefficient Alpha and Successor Procedures.”
Just before he died, we were together in his apartment. Lee, lying in his bed, looked at me, smiled that Cronbach smile, and said something like, “Turn about is fair play.” He was referring to my encouraging him to take up the book for Dick Snow. Now it was my turn to put together his monograph. He smiled again and said, “I thank you and my ghost thanks you.” He died eight hours later.
What we have in “My Current Thoughts” is vintage Cronbach. I followed his admonition not to edit the paper heavily but to make sure the ideas flowed and especially that there were no technical gaffes. He was right once again about editing.
When I read the paper now, I hear Lee speaking and I can see him speaking to me. In the end, his dictating not only preserved his “current” ideas about alpha 50 years after its publication, it also preserved the way he reasoned and talked about his ideas. I hope you enjoy reading the piece as much as I have enjoyed editing it.
The project could not have been started without the assistance of Martin Romeo Shim, who helped me not only with a reexamination of the 1951 paper but with various library activities needed to support some of the statements in these notes.
My debt is even greater to Shavelson for his willingness to check my notes for misstatements and outright errors of thinking, but it was understood that he was not to do a major editing. He supported my activity, both psychologically and concretely, and I thank him.
Editor’s (Richard Shavelson’s) Note:
The work reported herein was supported in part by the National Center on Evaluation, Standards, and Student Testing (CRESST) under the Educational Research and Development Center Program, PR/Award Number R305B60002, as administered by the Office of Educational Research and Improvement, U.S. Department of Education. The findings and opinions expressed in this report do not reflect the positions or policies of the Office of Educational Research and Improvement or the U.S. Department of Education. I am indebted to my colleague, Ed Haertel, for helping to check for accuracy. Nevertheless, I alone am responsible for errors of commission and omission.
Where the accuracy of a measurement is important, whether for scientific or practical purposes, the investigator should evaluate how much random error affects the measurement. New research may not be necessary when a procedure has been studied enough to establish how much error it involves. But, with new measures, or measures being transferred to unusual conditions, a fresh study is in order. Sciences other than psychology have typically summarized such research by describing a margin of error; a measure will be reported followed by a “plus or minus sign” and a numeral that is almost always the standard error of measurement (which will be explained later).
The alpha formula is one of several analyses that may be used to gauge the reliability (i.e., accuracy) of psychological and educational measurements. This formula was designed to be applied to a two-way table of data where rows represent persons (p) and columns represent scores assigned to the person under two or more conditions (i).
"Condition" is a general term often used where each column represents the score on a single item within a test. But it may also be used, for example, for different scorers when more than one person judges each paper and any scorer treats all persons in the sample. Because the analysis examines the consistency of scores from one condition to another, procedures like alpha are known as “internal consistency” analyses.
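As a rough illustration of the two-way layout just described, the following sketch computes alpha from a small persons-by-conditions table. The notes themselves do not write the formula out at this point; the code assumes the standard textbook form, alpha = k/(k-1) × (1 − sum of condition variances / variance of person totals), where k is the number of conditions. The function name `cronbach_alpha` and the example scores are hypothetical and serve only to show the arithmetic.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return coefficient alpha for a two-way table of scores.

    scores: list of rows, one row per person (p); each row holds one
    score per condition (i), e.g. one score per test item or per scorer.
    Assumes the standard textbook form of the alpha formula.
    """
    n_persons = len(scores)
    n_conditions = len(scores[0])
    if n_persons < 2 or n_conditions < 2:
        raise ValueError("need at least two persons and two conditions")
    if any(len(row) != n_conditions for row in scores):
        raise ValueError("every person needs a score under every condition")

    # Sample variance of each column (condition) across persons.
    columns = list(zip(*scores))
    condition_variances = [variance(col) for col in columns]

    # Sample variance of each person's total score across conditions.
    totals = [sum(row) for row in scores]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    k = n_conditions
    return (k / (k - 1)) * (1 - sum(condition_variances) / total_variance)


# Hypothetical data: four persons (rows) scored under two conditions (columns).
example_scores = [
    [2, 4],
    [4, 2],
    [3, 3],
    [5, 5],
]
alpha = cronbach_alpha(example_scores)
print(round(alpha, 3))
```

When every person receives identical scores under all conditions and persons differ from one another, this computation returns 1.0; the more the conditions disagree about how persons rank, the lower the value.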
Origin and Purpose of These Notes
My 1951 Article and Its Reception
I published in 1951 an article entitled, "Coefficient Alpha and the Internal Structure of Tests." The article was a great success. It was cited frequently [Ed.: no less than 5590 times]. Even in recent years, there have been approximately 325 social science citations per year.1
The numerous citations to my paper by no means indicate that the person who cited it had read it, and do not even demonstrate that he had looked at it. I envision the typical activity leading to the typical citation as beginning with a student laying out his research plans for a professor or submitting a draft report. It would be the professor’s routine practice to say, wherever a measuring instrument was used, that the student ought to check the reliability of the instrument. To the question, “How do I do that?” the professor would suggest using the alpha formula because the computations are well within the reach of almost all students undertaking research, and because the calculation can be performed on data the student will routinely collect. The professor might write out the formula or simply say "you can look it up." The student would find the formula in many textbooks, and the textbook would be likely to give the 1951 article as reference, so the student would copy that reference and add one to the citation count.
There would be no point for him to try to read the 1951 article, which was directed to a specialist audience. And the professor who recommended the formula may have been born well after 1951 and not only be unacquainted with the paper but uninterested in the debates about 1951 conceptions that had been given much space in my paper. (The citations are not all from non-readers; throughout the years there has been a trickle of papers discussing alpha from a theoretical point of view and sometimes suggesting interpretations substantially different from mine. These papers did little to influence my thinking.) Other signs of success: There were very few later articles by others criticizing parts of my argument. The proposals or hypotheses of others that I had criticized in my article generally dropped out of the professional literature.
A 50th Anniversary
In 1997, noting that the 50th anniversary of the publication was fast approaching, I began to plan what has now become these notes. If it had developed into a publishable article, the article would clearly have been self-congratulatory. But I intended to devote most of the space to pointing out the ways my own views had evolved; I doubt whether coefficient alpha is the best way of judging the reliability of the instrument to which it is applied.
My plan was derailed when various loyalties impelled me to become the head of the team of qualified and mostly quite experienced investigators who agreed on the desirability of producing a volume (Cronbach, 2002) to recognize the work of R. E. Snow, who had died at the end of 1997.
When the team manuscript had been sent off for publication as a book, I might have returned to alpha. Almost immediately, however, I was struck by a health problem, which removed most of my strength, and a year later, when I was just beginning to get back to normal strength, an unrelated physical disorder removed virtually all my near vision. I could no longer read professional writings, and would have been foolish to try to write an article of publishable quality. In 2001, however, Rich Shavelson urged me to try to put the thoughts that might have gone into the undeveloped article on alpha into a dictated memorandum, and this set of notes is the result. Obviously, it is not the scholarly review of uses that have been made of alpha and of discussions in the literature about its interpretation that I intended. It may nonetheless pull together some ideas that have been lost from view. I have tried to present my thoughts here in a non-technical manner, with a bare minimum of algebraic statements, and hope that the material will be useful to the kind of student who in the past has been using the alpha formula and citing my 1951 article.
My Subsequent Thinking
Only one event in the early 1950s influenced my thinking: Frederick Lord's (1955) article in which he introduced the concept of "randomly parallel" tests. The use I made of the concept is already hinted at in the preceding section.
A team started working with me on the reliability problem in the latter half of the decade, and we developed an analysis of the data far more complex than the two-way table from which alpha is formed. The summary of that thinking was published in 1963, 2 but is beyond the scope of these notes. The lasting influence on me was the appreciation we developed for the approach to reliability through variance components, which I shall discuss later.
From 1970 to 1995, I had much exposure to the increasingly prominent state-wide assessments and innovative instruments using samples of student performance. This led me to what is surely the main message to be developed here. Coefficients are a crude device that does not bring to the surface many subtleties implied by variance components. In particular, the interpretations being made in current assessments are best evaluated through use of a standard error of measurement, as I discuss later.
Conceptions of Reliability
The Correlational Stream
Emphasis on individual differences. Much early psychological research, particularly in England, was strongly influenced by the ideas on inheritance suggested by Darwin’s theory of Natural Selection. The research of psychologists focused on measures of differences between persons. Educational measurement was inspired by the early studies in this vein and it, too, has given priority to the study of individual differences—that is, this research has focused on person differences.
When differences were being measured, the accuracy of measurement was usually examined. The report has almost always been in the form of a “reliability coefficient.” The coefficient is a kind of correlation with a possible range from 0 to 1.00. Coefficient alpha was such a reliability coefficient.