Dr. Stephen Sireci is passionate about equity in educational testing. A professor in the College of Education’s Research, Educational Measurement, and Psychometrics Program (REMP) and director of the Center for Educational Assessment, Sireci researches test development and evaluation, particularly issues of validity, cross-lingual assessment, standard setting, and computer-based testing. His goal is to make testing fairer, more useful, and less prone to misuse: creating tests that better measure students’ knowledge and skills, and helping administrators use tests productively rather than for purposes beyond those for which they were designed (such as evaluating public school teachers). Sireci has built an international reputation, publishing over 130 peer-reviewed articles and book chapters, securing multiple contracts and grants with the U.S. Department of Education, the Educational Testing Service, the College Board, and Pearson Educational Measurement, attracting more than $10 million in external funding, and serving on dozens of national commissions, blue-ribbon panels, and advisory committees.

Fittingly, it was a standardized test that started Sireci on his career in educational measurement. He planned to go into industrial and organizational psychology but wasn’t accepted into the doctoral programs to which he applied, likely because his GRE scores weren’t high enough. As a master’s student he had taken a course in research methods and statistics and, finding it particularly interesting, he also applied to a psychometrics doctoral program at Fordham University, where he was accepted.

While working on his doctorate, Sireci took a job as research supervisor of testing at the Board of Education in Newark, New Jersey. “That was really enlightening,” he recalls. “My job was to analyze some of the test data and report back to the federal government as part of their Title I evaluation. I got to see inner city education up close and I got to see how people were using the data that was supposed to describe the situation I was seeing.”

Noticing his interest in educational evaluation, Sireci’s supervisor in Newark encouraged him to go into measurement, and he took a predoctoral internship at the Educational Testing Service. From there he began working as a psychometrician for the Uniform CPA Exam in New York City and later as the senior psychometrician at GED Testing Service in Washington, D.C.

While working for the CPA exam, Sireci began collaborating with Ron Hambleton, a UMass College of Education faculty member and revered leader in the field. “Ron Hambleton was a god to me just from reading his work.” Hambleton served as a consultant with the organization, and he and Sireci became close friends. 

Sireci never planned to pursue a career in academia, but when he was considering moving to the Amherst area, Hambleton encouraged him to apply for an open position at REMP. Sireci was hired and thought he’d remain in academia for five years or so. Twenty-four years later, he is a Distinguished University Professor and the recipient of numerous awards for teaching, research, and mentorship.

Sireci describes himself as both a critic of educational tests and someone who wants to improve educational testing. His key goals are validity and fairness: getting better measures of people that are as unbiased as possible and ensuring there is evidence and theory to support the use of a test for a particular purpose. “A test is not inherently valid or inherently invalid; what matters is what you are using it for,” he explains.

For the past decade, Sireci has also worked with colleagues at UMass and the University of Nebraska–Lincoln for the U.S. Department of Education, evaluating the National Assessment of Educational Progress (NAEP). Also known as “the Nation’s Report Card,” NAEP measures students’ knowledge and skills in various subjects at the national and state levels and in some urban districts. He and his colleagues are also currently researching test accommodations for students with disabilities for ETS and providing Pearson with psychometric support on the National Board for Professional Teaching Standards Exam.

Sireci’s international work has involved the challenges of test translation: examining whether scores across different language versions of an exam are comparable, and whether translated exams are valid. “The assessment of English learners or linguistic minorities in general is one of the most difficult problems we face. Anytime someone’s taken a test in a language that is not their dominant language, it’s very hard to say you’ve got a good measure that’s sufficient,” he asserts. “You probably don’t.” Sireci also sits on the Council of the International Test Commission, which holds a biennial conference on educational and psychological testing research.

As educational assessment advances, Sireci hopes to see the field take a critical look at both the positive and negative sides of testing. He asserts that testing must work in concert with instructors and curriculum, and that this alignment was lost with national policy changes over the last 20 years. “When No Child Left Behind came out, the first few years were probably one of the best periods in educational measurement, in that the tests were providing information that people were paying attention to, especially about achievement gaps,” Sireci explains. “And then when Race to the Top came out, they tied test scores to evaluating teachers, which might sound like a good idea, but it was a terrible idea. And one of the consequences of that is we lost the teaching community. We lost the buy-in of the teachers. For educational tests to contribute to education they have to be aligned with instruction and aligned with the curriculum. If the assessment folk are not in full partnership with the instructional folk, that falls apart.”

Sireci also feels it’s imperative that the field make research on educational testing more accessible to the public and to policymakers, and that all teachers get extensive training in assessment so they’re able to choose good tests and properly interpret the results.

Finally, Sireci advocates for more attention to the role that measurement plays in social justice and equity, and to preventing tests from becoming, as former American Educational Research Association president Amy Stuart Wells describes, “tools of oppression to keep marginalized people marginalized.”

As a significant voice in the field of measurement, and as president of the National Council on Measurement in Education (NCME), the largest organization of educational measurement specialists, Sireci is positioned to be a catalyst for change. In organizing the next NCME conference, he has included opportunities for extensive discussion of values in educational testing. He’s also committed to diversifying the field of educational measurement, both in terms of underrepresented and historically marginalized groups, “but also with respect to perspectives on how we should do assessment and what really matters in test development and in testing research.” Sireci affirms that the UMass Amherst College of Education and REMP create an ideal base from which he can advocate: “UMass is a perfect place for that because it fits into our social justice mission.”