Teacher Evaluation Policy Series | Deweaponizing Teacher Evaluation: Using Teacher Evaluation for Growth (Part Two) by Sarah L. Hairston in collaboration with Dr. Thomas W. Hairston
This is the second in our series on teacher evaluation policy. Teacher evaluation has been a major feature of education policy reform in the United States during the past ten years. The triumvirate of the Race to the Top funding competition (2009), the No Child Left Behind waivers (2011), and the passage of the Every Student Succeeds Act (2015) made teacher evaluation central to that reform. This series will consider emerging areas of growth in teacher evaluation policy. In the two-part discussion “Deweaponizing teacher evaluation: Using teacher evaluation for growth,” Sarah Hairston and a co-developer of the largest teacher evaluation system in the state of Missouri will look at the snares of teacher evaluations (part one) and discuss how to construct teacher evaluations towards a growth model (part two). Seyma Dagistan, in “Student Voice in the Teacher Evaluation Process – A lost opportunity,” considers the attempt in Massachusetts to incorporate student voice in teacher evaluation. In “Responsible Teachers: Teacher Evaluation in Finland,” Hansol Woo looks into the Finnish model of teacher evaluation, where teachers evaluate their own progress. These articles will consider why we evaluate teachers and who is best positioned to do so. We invite you to join the discussion by leaving a comment or submitting an essay of your own.
Teacher evaluation systems have historically failed to produce meaningful data toward growth (Ross & Walsh, 2019). A growth model evaluation is one that provides formative data, specific feedback, and continuous assessment of teaching practices in the classroom from the beginning of the school year until the end (NEE, 2019). Last month, I sat down with my spouse, Dr. Thomas W. Hairston (Tom), who is a co-developer of the largest teacher evaluation system in the state of Missouri, the Network for Educator Effectiveness (NEE). The intention of the discussion was to focus on how teacher evaluations can move to a growth model. In our November issue we looked at the snares of teacher evaluations. Tom laid out three reasons for the lack of a growth model in teacher evaluations: 1) evaluations do not focus on building accuracy within the evaluators; 2) evaluations over-rely on student test scores; and 3) evaluations use subjective labels that create a binary between “good” teacher versus “bad” teacher (T. Hairston, personal communication, September 6, 2019). In this issue, Tom and I readdress those attributes and discuss how to move to more localized teacher evaluations focused on growth that can benefit the student, teacher, and district as a whole.
Commitment to Building Accuracy
In part one of this discussion, Tom indicated that one of the reasons teacher evaluations lack a growth model approach is inaccuracy within the evaluators (T. Hairston, personal communication, September 6, 2019). Evaluation systems vary, and we can discuss the validity of these systems, but essentially these tools are reliant on the efficacy of implementation by the evaluators themselves. Federal and local policies require evaluation to be conducted in a quantitative manner, one where complex human behavior is turned into numerical measurement. Evaluating through that psychometric approach to human behavior must be learned. Unfortunately, most evaluation systems employed by states or school districts do not require training on how to use the evaluation components. Research conducted by NEE found that principals achieve an overall level of accuracy after training, but that issues such as grade level, subject area, and teaching practice being evaluated play critical roles in that accuracy (T. Hairston, personal communication, September 6, 2019). In other words, while a principal may be accurate, that accuracy varies depending on the context of the observation (Bergin, Wind, Grajeda, & Tsai, 2017). Classroom observations do not exist in a vacuum, and when accompanied by high-stakes decisions, they are unethical if they are not accurate (AERA, APA, & NCME, 2014). School districts and evaluation systems must be committed to a training regimen for their evaluators if they want classroom observations that honor the professionalism of teachers.
At NEE, evaluators are required not only to participate in an initial training session but also to attend annual recertification trainings to both build and maintain accuracy (T. Hairston, personal communication, September 6, 2019). During those trainings, the evaluators work to understand evaluation rubrics and how those may be employed in a variety of classrooms and across a variety of teaching practices. The time commitment can be a daunting ask, but as Tom states:
If the administrator is to function as an instructional leader, then the time commitment should reflect that role. The more time and connection made with what an evaluator is observing and how they are using the rubrics, the more accurate they will be in evaluating the teachers.
(T. Hairston, personal communication, September 6, 2019)
Changing the Conversation from Accountability to Growth
As stated in our previous discussion, Tom indicated that another snare within teacher evaluations is the over-reliance on student test scores, which often capture a single snapshot in time close to the end of the year (T. Hairston, personal communication, September 6, 2019). By waiting until the end of the year to determine effectiveness, and by putting so much weight on one measure, evaluation becomes divorced from a growth model approach. Attributing student achievement to a specific teacher contradicts research showing that a student’s achievement is influenced by a range of factors outside a classroom teacher’s control. These factors may include: available resources and class structures; prior teachers and schooling; socioeconomic status; outside support and culture of a student’s home, community, and peer group; testing bias; and the student’s individual abilities and needs (Darling-Hammond, Amrein-Beardsley, & Rothstein, 2012). Research shows that a fairer and more valid measure of teacher performance includes multiple measures, both in data points and in frequency (Ross & Walsh, 2019). Some examples of data points NEE utilizes include classroom observation, student surveys, professional development plans, and the implementation of district curriculum (T. Hairston, personal communication, September 6, 2019).
Multiple data points help to alleviate an invalid over-reliance on student test scores, but Tom insists that data alone is not enough to develop a growth model; rather, feedback is the linchpin in the evaluation process (T. Hairston, personal communication, September 6, 2019). As Tom states:
Administrators collect that data, but it means nothing if they aren’t talking to teachers about that data – and aligning those conversations to the localized expectations. When evaluation is done for professional growth, data can be utilized in this way throughout the school year, and as school years bridge into the next. Additionally, teachers need to see the data. There needs to be a transparency in order to build a culture of growth.
(T. Hairston, personal communication, September 6, 2019)
Therefore, part of the time commitment in evaluating teachers must be devoted to feedback conversations. Feedback should be timely, occur at multiple times throughout the academic year, include multiple data points, and focus on the teaching practices being evaluated and how the educator is providing the best possible environment for student learning (Darling-Hammond et al., 2012).
Disrupting Binary Language
The biggest potential deficit of teacher evaluation systems, Tom indicated, is the subjective labels that arise (T. Hairston, personal communication, September 6, 2019). Most evaluation systems use labels to tell a teacher’s story. Those labels are not psychologically safe and can often be triggering. Often the expectation is to score perfectly at all times because there are strongly worded judgments and critiques if a teacher does not. Tom emphasized that a growth model needs to be separated from the binary notion of failure that often accompanies a need for punishment: “NEE was built by educators and former educators, and we knew just as well as anyone the media discourse and splashy headlines that were crippling teacher morale” (T. Hairston, personal communication, September 6, 2019). One way to disrupt this binary is to change the conversation from one focused on accountability and cut scores to one focused on locally defined essential teaching practices. Such a shift helps to reframe the purpose and language around evaluations.
According to Tom, teacher evaluation systems are often rooted in an overabundance of pre-defined indicators that are externally produced, and teachers are then forced to teach to practices that may work in other settings and environments, but not in theirs (T. Hairston, personal communication, September 6, 2019). Each district and school, however, is unique in its needs, and a one-size-fits-all approach does not honor students, teachers, and administrators in their work toward growth. Instead, NEE works with schools to identify the most essential teaching practices that they want their teachers to be using in the classroom and in their other duties (T. Hairston, personal communication, September 6, 2019). By localizing key essential teaching practices, the conversation becomes uniquely situated. For example, Tom recommends that each school respond to the following prompts before embarking on teacher evaluations:
One, for students to learn, in this school and/or district, we believe that teaching must include a, b, c, etc.; and two, we are seeking to improve in the following practices: a, b, c, etc. Doing that should tell a district pretty easily what they should be evaluating.
(T. Hairston, personal communication, September 6, 2019)
Answering those questions is a growth model approach that begins to build a common language, builds a dataset off of those practices, and aligns professional development initiatives.
Conclusion
With current federal education policies, states have some freedom in adopting policies that allow districts more localized control. With these freedoms comes greater responsibility on states and districts to think deeply about the systems they use and how they are implemented toward growth that can benefit the student, teacher, and district as a whole. Some key components laid out in this discussion include: specific and transparent scoring rubrics and data collection; multiple data points; continuous assessment throughout the year; and feedback that accompanies each piece of evaluation data (T. Hairston, personal communication, September 6, 2019).
Sarah L. Hairston is a Ph.D. student in Educational Leadership and Policy Analysis at the University of Missouri-Columbia. Previously, she taught theatre and public speaking for 16 years and holds an Educational Specialist degree in Educational Leadership and a Master’s in Curriculum and Instruction. Her current research interests include educational policy, structural violence in education, student voice, and educational activism.
References
AERA, APA, & NCME (2014). Standards for educational and psychological testing.
Bergin, C., Wind, S., Grajeda, S. & Tsai, C. 2017. “Teacher Evaluation: Are Principals’ Classroom Observations Accurate at the Conclusion of Training?” Studies in Educational Evaluation 55: 19-26. doi: 10.1016/j.stueduc.2017.05.002.
Darling-Hammond, L., Amrein-Beardsley, E. H., & Rothstein, J. 2012. “Evaluating teacher evaluation.” Kappan, 93(6): 8-15. DOI: 10.1177/003172171209300603
Network for Educator Effectiveness (NEE). (2019). https://neeadvantage.com/
Ross, E. & Walsh, K. 2019. State of the States 2019: Teacher and Principal Evaluation Policy. Washington, DC: National Council on Teacher Quality. https://www.nctq.org/pages/State-of-the-States-2019:-Teacher-and-Principal-Evaluation-Policy#Background