Assessment and Evaluation in Educational Contexts
For at least the past three decades, assessment, evaluation, and accountability have been major strands of educational policy and practice internationally. However, the available data on how exactly assessment- and evaluation-based policies are framed and implemented, or how they shape practices within schools, are still limited. This chapter addresses these issues with a broad focus that takes into account several perspectives on school evaluation and student assessment, together with everyday practices of teacher judgment and grading. First, we address assessment and evaluation practices for the purpose of educational system monitoring. Second, school evaluation practices, as well as the use of assessment and evaluation results at the school level, are discussed. A third perspective focuses on practices of teacher evaluation. Finally, practices of student assessment within schools and classrooms are examined. The instruments described and recommended in this chapter have implications for international research, as well as national studies.
Notes
This chapter expands on a technical paper that was presented to the PISA 2015 Questionnaire Expert Group (QEG) in May 2012 (Doc. QEG 2012−05 Doc 08).
Table 19.1 List of constructs included in the PISA 2015 field trial to assess assessment and evaluation in educational contexts
Author information
Authors and Affiliations
- Department for Educational Quality and Evaluation, German Institute for International Educational Research (DIPF), Frankfurt, Germany: Sonja Bayer, Eckhard Klieme & Nina Jude