Category Archives: Testing

Big Lies of Education: International Test Rankings and Economic Competitiveness

“Human development is an important component of determining a nation’s productivity and is measured in employee skills identified by employers as critical for success in the modern global economy,” claims Thomas A. Hemphill, adding:

The United States is obviously not getting a sufficient return on investment in elementary and secondary education, as it has mediocre scores in mathematics literacy and declining scores for science literacy for 15-year-old students surveyed in 2022. The only significant improvement for 15-year-olds is in reading, where the United States finally entered the top 10 in 2022.

Commentary: We must improve students’ math, science skills to boost US competitiveness

Hemphill reaches an unsurprising conclusion:

If these educational trends continue, the United States will not have an adequate indigenous workforce of scientists, engineers and technologists equipped to maintain scientific and technological leadership and instead will become perpetually reliant on scientifically and technologically skilled immigrants. We must demand that elementary and secondary education systems reorient efforts to significantly improve mathematical and scientific teaching expectations in the classroom.

Commentary: We must improve students’ math, science skills to boost US competitiveness

However, for decades, evidence has shown that there is no causal link between international rankings of student test scores and national economic competitiveness.

This Big Lie is purely rhetorical and relies on throwing statistical comparisons at the public while drawing hasty and unsupported causal claims from those numbers.

If you really care about the claim, see Test Scores and Economic Growth by Gerald Bracey.

Bracey offers this from researchers on the relationship between international education rankings and economic competitiveness:

Such countries [highest achieving] “do not experience substantially greater economic growth than countries that are merely average in terms of achievement.”

The researchers then lay out an interpretation of their findings that differs from the causal interpretation one usually hears:

“We venture, here, the interpretation that much of the achievement ‘effect’ is not really causal in character. It may be, rather, that nation-states with strong prodevelopment policies, and with regimes powerful enough to enforce these, produce both more economic growth and more disciplined student-achievement levels in fields (e.g., science and mathematics) perceived to be especially development related. This idea would explain the status of the Asian Tigers whose regimes have been much focused on producing both economic growth and achievement-oriented students in math and science.”

Test Scores and Economic Growth

Bracey quotes further from that research:

“From our study, the main conclusion is that the relationship between achievement in science and mathematics in schoolchildren and national economic growth is both time and case sensitive. Moreover, the relationship largely reflects the gap between the bottom third of the nations and the rest; the middle of the pack does not much differ from the rest. . . . Much of the obsession with the achievement ‘horse race’ proceeds as if beating the Asian Tigers in mathematics and science education is necessary for the economic well-being of other developed countries. Our analysis offers little support for this obsession. . . .

“Achievement indicators do not capture the extent to which schooling promotes initiative, creativity, entrepreneurship, and other strengths not sufficiently curricularized to warrant cross-national data collection and analysis. Unfortunately, the policy discourse that often follows from international achievement races involves exaggerated causal claims frequently stressing educational ‘silver bullets’ for economic woes. Our analyses do not offer definitive answers, but they raise important questions about the validity of these claims. In an era that celebrates evidence-based policy formation, it behooves us to carefully weigh the evidence, rather than use it simply as a rhetorical weapon.”

Test Scores and Economic Growth

A key point to note here is that Bracey was writing in 2007, and the OpEd above is from March 2024. The Big Lie about international education rankings and economic competitiveness is both a lie and a lie that will not die.

I strongly recommend Tom Loveless’s work exposing a similar problem with misrepresenting and overstating the consequences of NAEP data: Literacy and NAEP Proficient.

Bracey offers a brief but better way to understand test data and economic competitiveness: “education is critical, but among the developed nations differences in test scores are trivial.”

Instead of another Big Lie, the US would be better served if we tried new and evidence-based (not ideological) ways to reform our schools and our social/economic structures.


Teaching Writing: Reconsidering Genre (Again)


My midterm exam for first-year writing invites students to interview a professor in a discipline they are considering as a major. The discussion is designed to explore those professors as researchers and writers.

On exam day, we have small and whole-class discussions designed to discover the wide variety of activities that count as research in various disciplines, and more importantly, what writing as a scholar looks like across disciplines.

The outcomes of this activity are powerful since students learn that research and writing are context-based and far more complicated than they learned in K-12 schooling.

Two points that I often emphasize are, first, that many (if not most) of the professors confess that they do not like to write, and second, that a profoundly important distinction between their K-12 teachers and professors is that professors practice the fields they teach.

This brings me to two posts on Twitter (X):

First, Luther is confronting a foundational failure of K-12 writing instruction—students being taught the “4 Types/Genres of Writing” (narration, description, exposition, persuasion).

That framing is deeply misleading and overly simplistic, but that framing is grounded in two realities: most K-12 teachers who teach writing are not writers, and the so-called “4 Types/Genres of Writing” are rooted in the rise of state-level accountability testing of writing (not any authentic or research-based approach to teaching composition).

Second, so I don’t appear to be beating up unfairly on K-12 teachers (I was one for 18 years and love K-12 teachers), Dowell is then confronting the often careless and reductive ways in which “academic writing” is both taught and even practiced (academic norms of published writing ask very little of scholars as writers and even impose reductive templates that cause lifeless and garbled writing).

The 1980s and 1990s saw a rise in state accountability testing that asked very little of students. The “4 Types/Genres of Writing” quickly supplanted the gains made with authentic writing instruction grounded in writer’s workshop and the influence of the National Writing Project in the 1970s and 1980s.

Those writing tests prompted students to write narrative or expository essays (for example) that were only a few paragraphs long (likely the 5-paragraph essay). These were scored based on state-developed rubrics that teachers taught to throughout the year.

In other words, as Gerald Bracey warned, writing instruction became almost exclusively teaching to the test. And since K-12 teachers of writing were primarily not writers themselves, this reductive and mechanical way to teach and assess writing was rarely challenged.

Let’s be blunt. K-12 teachers not resisting this dynamic is a logical response to an impossible learning and teaching environment that is dominated by accountability and high-stakes testing.

My criticism is that teachers and students were (and are) put in this situation; I am not criticizing teachers and students, who are the victims of the accountability era of education reform.

Further, while students who move from K-12 to higher ed discover that their K-12 preparation in writing is inadequate and often deeply misleading for how they are expected to write in academia, this new situation is not some idealistic wonderland of authentic writing (as Dowell confronts).

The K-12 to higher ed transition makes students feel unfairly jerked around (many are exasperated when they find out they didn’t need to “memorize” MLA and may never use it again), but navigating academic expectations for writing is equally frustrating (one first-year student this spring noted that my first-year writing seminar is unique because I teach writing while other professors simply assign and grade writing).

Students deserve better at both the K-12 and higher ed levels, so here I want to offer a few thoughts on how to move past the traps I have noted above about teaching writing.

I highly recommend Genre awareness for the novice academic student: An ongoing quest by Ann Johns.

Johns argues for fostering “genre awareness” (addressing in complex and authentic ways Dowell’s concern) and not “genre acquisition” (for example, the reductive “4 Types/Genres of Writing” approach):

The first is GENRE ACQUISITION, a goal that focuses upon the students’ ability to reproduce a text type, often from a template, that is organized, or ‘staged’ in a predictable way. The Five Paragraph Essay pedagogies, so common in North America, present a highly structured version of this genre acquisition approach.

A quite different goal is GENRE AWARENESS, which is realized in a course designed to assist students in developing the rhetorical flexibility necessary for adapting their socio-cognitive genre knowledge to ever-evolving contexts. …After my many years of teaching novice tertiary students who follow familiar text templates, usually the Five Paragraph Essay, and who then fail when they confronted different types of reading and writing challenges in their college and university classrooms, I have concluded that raising genre awareness and encouraging the abilities to research and negotiate texts in academic classrooms should be the principal goals for a novice literacy curriculum (Johns 1997).

Genre awareness for the novice academic student: An ongoing quest

Here I think is an outstanding graphic (Johns draws from Bhatia) of moving past confusing modes of writing (narration, description, exposition, persuasion) with genres of writing (OpEd, memoir, meta-analysis, literature review, etc.):

At both the K-12 and higher ed levels, then, teaching writing has been reduced to serving something other than students—either the mandates of high-stakes testing or the nebulous and shifting expectations of “academic writing,” which include very dangerous traps such as a maze of citation expectations among disciplines.

My first-year writing students and I are at midterm this spring, and we just held our conferences for Essay 2 with a scholarly cited essay looming once we return from spring break.

In those conferences, we have been discussing the huge learning curve they are facing since I ask them to choose their essay topic and thus develop their own thesis within a genre of writing.

They are making all the decisions writers do in authentic contexts.

Before my class, they have had most of their writing prompted, most of their thesis sentences assigned to them, and most of their genre experiences entirely reduced or erased.

So I explain this to them, assuring them that their struggles are reasonable and not a product of them failing or being inadequate.

These are new and complex expectations of young writers.

But this is the only fair thing to offer them: the experience of becoming a writer as an act of being human, not as a performance for a test or to fill in a template.


Recommended

Investigating Zombi(e)s to Foster Genre Awareness

Thomas, P.L. (2019). Teaching writing as journey, not destination: Essays exploring what “teaching writing” means. Charlotte, NC: Information Age Publishing.

RECOMMENDED: John Warner’s “Why They Can’t Write”

RECOMMENDED: John Warner’s “The Writer’s Practice”

Contrarian Truths about Public Education and Student Achievement

“The 2022 NAEP results show that the average reading score for fourth graders is lower than it has been in over 20 years. For eighth and twelfth graders, average scores are at about a 30-year low,” states Senator Bill Cassidy (R-LA) in his new literacy report, adding, “The 2022 NAEP Long-Term Trend assessment for nine-year-old students showed average reading scores not seen since 1999.”

Cassidy’s alert about a reading crisis fits into dozens and dozens of media articles announcing crises and failures among students, teachers, and public schools all across the US. Typical of that journalism was Nicholas Kristof in the New York Times about a year ago:

One of the most bearish statistics for the future of the United States is this: Two-thirds of fourth graders in the United States are not proficient in reading.

Reading may be the most important skill we can give children. It’s the pilot light of that fire.

Yet we fail to ignite that pilot light, so today some one in five adults in the United States struggles with basic literacy, and after more than 25 years of campaigns and fads, American children are still struggling to read. Eighth graders today are actually a hair worse at reading than their counterparts were in 1998.

One explanation gaining ground is that, with the best of intentions, we grown-ups have bungled the task of teaching kids to read. There is growing evidence from neuroscience and careful experiments that the United States has adopted reading strategies that just don’t work very well and that we haven’t relied enough on a simple starting point — helping kids learn to sound out words with phonics.

Two-Thirds of Kids Struggle to Read, and We Know How to Fix It

As I have noted, education and reading crises have simply been a fact of US narratives since A Nation at Risk. But as I have also been detailing, these claims are misleading and manufactured.

In fact, a report from the progressive Network for Public Education (NPE) and an analysis from the conservative Education Next offer contrarian truths about public education and student achievement, neither of which is grounded in crisis rhetoric or blaming students, teachers, and schools for decades of political negligence.

Based on NAEP data—similar to Cassidy’s report—Shakeel and Peterson offer a much different view of student achievement in the US, notably about reading achievement:


This analysis demonstrates that the current reading crisis is manufactured, exclusively rhetorical and ideological, generating profit for media, politicians, and commercial publishers.

In short, the manufactured crises are distractions from the other contrarian truth about education as highlighted in the analysis from NPE:

Public Schooling in America

This educational grading from NPE is unique because it doesn’t grade students, teachers, or public schools, but holds political leadership accountable for supporting universal public education and democracy. The standards for these grades include the following:

  • Privatization Laws: the guardrails and limits on charter and voucher programs to ensure that taxpayers and students are protected from discrimination, corruption, and fraud.
  • Homeschooling Laws: laws to ensure that instruction is provided safely and responsibly.
  • Financial Support for Public Schools: sufficient and equitable funding of public schools.
  • Freedom to Teach and Learn: whether state laws allow all students to feel safe and thrive at school and receive honest instruction free of political intrusion.

These two examples come from contrasting ideologies, yet they offer contrarian truths about public schools and student achievement that would better serve how we talk about schools and student achievement as well as how we seek to reform those schools to better serve those students and our democracy.


Recommended

Big Lies of Education: Reading Proficiency and NAEP

Big Lies of Education: A Nation at Risk and Education “Crisis”

Opinion: Should California schools stick to phonics-based reading ‘science’? It’s not so simple

Thomas, P.L. (2022). The Science of Reading movement: The never-ending debate and the need for a different approach to reading instruction. Boulder, CO: National Education Policy Center. http://nepc.colorado.edu/publication/science-of-reading

Accordingly, when policymakers explore new guidelines, they would be wise to do the following:

• Be wary of overstatements and oversimplifications within media and public advocacy, acknowledging concerns raised but remaining skeptical of simplistic claims about causes and solutions.

• Attend to known influences on measurable student reading achievement, including the socioeconomics of communities, schools, and homes; teacher expertise and autonomy; and teaching and learning conditions.

• Recognize student-centered as an important research-supported guiding principle but also acknowledge the reality that translating such research-based principles into classroom practice is always challenging.

• Shift new reading policies away from prescription and mandates (“one-size-fits-all” approaches) and toward support for individual student needs and ongoing teacher-informed reform.

In rethinking past efforts and undertaking new reforms, policymakers should additionally move beyond the ineffective cycles demonstrated during earlier debates and reforms, avoiding specific mandates and instead providing teachers the flexibility and support necessary to adapt their teaching strategies to specific students’ needs. Therefore, state policymakers should do the following:

• End narrowly prescriptive non-research-based policies and programs such as:

o Grade retention based on reading performance.
o High-stakes reading testing at Grade 3.
o Mandates and bans that require or prohibit specific instructional practices, such as systematic phonics and the three-cueing approach.
o A “one-size-fits-all” approach to dyslexia and struggling readers.

• Form state reading panels, consisting of classroom teachers, researchers, and other literacy experts. Panels would support teachers by serving in an advisory role for teacher education, teacher professional development, and classroom practice. They would develop and maintain resources in best practice and up-to-date reading and literacy research.

On a more local level, school- and district-level policymakers should do the following:

• Develop teacher-informed reading programs based on the population of students served and the expertise of faculty serving those students, avoiding lockstep implementation of commercial reading programs and ensuring that instructional materials support—rather than dictate—teacher practice.

• Provide students struggling to read and other at-risk students with certified, experienced teachers and low student-teacher ratios to support individualized and differentiated instruction.

Big Lies of Education: Series

Here I will collect a series dedicated to the Big Lies of Education. The initial list of topics includes:

  • A Nation at Risk and education “crisis”
  • Poverty is an excuse in educational achievement
  • 2/3 students not proficient/grade level readers; NAEP
  • Elementary teachers don’t know how to teach reading
  • NRP = settled science
  • Teacher education is not preparing teachers based on science/research
  • Education “miracles”
  • Reading program X has failed
  • Whole language/balanced literacy has failed
  • Systematic phonics necessary for all students learning to read
  • Nonsense word assessments measure reading achievement
  • Reading in US is being taught by guessing and 3 cueing
  • Balanced literacy = guessing and 3 cueing
  • K-3 students can’t comprehend
  • 40% of students are dyslexic/ universal screening for dyslexia needed
  • Grade retention
  • Grit/ growth mindset
  • Parental choice
  • Education is the great equalizer
  • Teacher quality is most important factor in student achievement (VAM)

Series:

Big Lies of Education: A Nation at Risk and Education “Crisis”

Big Lies of Education: Reading Proficiency and NAEP

Big Lies of Education: National Reading Panel (NRP)

Big Lies of Education: Poverty Is an Excuse

Big Lies of Education: International Test Rankings and Economic Competitiveness

Big Lies of Education: “Science of” Era Edition [Access PP PDF Here]

Big Lies of Education: Grade Retention

Big Lies of Education: Growth Mindset and Grit

Big Lies of Education: Word Gap


The Heightened Negative Consequences of Reductive Behaviorism on Student Learning

One of the most significant failures of media, public, political, and educational responses to the Covid/post-Covid era of traditional schooling is claiming that the Covid disruption created the problems being addressed about student mental health and student achievement.

The Covid/post-Covid era has heightened those problems in many ways, but the core issues were always the worst features of traditional schooling, notably the reductive behaviorism that drives testing/grading and classroom/school discipline.

Although I taught in K-12 education for 18 years, I have been in higher education now for 22 years, working often with first-year students in my writing seminar (and this semester, in our advising program).

My first-year writing seminar students this fall are both very predictably similar to those I have taught for a couple decades and significantly exaggerated versions of those students.

We are at midterm, and students have just submitted Essay 2, a public essay in on-line format designed to help students ease into a formal cited essay (Essay 3). Essay 2 requires students to use hyperlinking for citation (and thus practice evaluating on-line sources, etc.) and incorporate images.

My first-year writing course is grounded in both writing workshop and minimum requirements instead of grades. These minimum requirements include the following:

  • Submit all essays in MULTIPLE DRAFTS per schedule before the last day of the course; initial drafts and subsequent drafts should be submitted with great care, as if each is the final submission, but students are expected to participate in process writing throughout the entire semester as a minimum requirement of this course—including a minimum of ONE conference per major essay.
  • Each essay rewrite (required) must be submitted after the required conference and BEFORE the next original essay is due (for example, E1RW must be submitted before E2 submission can be accepted).
  • Demonstrate adequate understanding of proper documentation and citation of sources through at least a single well-cited essay or several well-cited essays. A cited essay MUST be included in your final portfolio.

I recognize that I must not only teach students how to write at the college level, but also how to fully engage in a process writing course.

That last point is where students have always struggled, but Covid/post-Covid students are struggling mightily.

I provide students a wealth of support material and models for assignments, such as the following for general support:

And these specifically for Essay 2/hyperlink cited essay:

For context, I should note that I do not grade assignments throughout the semester (I must submit a course grade, which is based on a final portfolio as the final exam), and I do not take off for late work because I require that all work must be completed.

Historically, despite no grades or late penalty, my students have submitted work fully and on time at about a 90+% rate. Students typically receive As and Bs in the course with a sporadic student or two who do not meet the minimum requirements and thus fail (which is a consequence of simply not being able to fully engage in process writing).

A couple weeks ago, my first-year students submitted Essay 2; only 4 out of 12 did so on time.

So, yes, Covid/post-Covid education is different, but the issues are not new, just heightened.

What I am noticing is that students struggle to follow guidelines (see above), and I spend a great deal of time prompting students on their essay submission to review the sample and checklist provided.

One recent example struck me because a student submitted their Essay 2 rewrite, which was not significantly different than the initial submission—although I provided comments, directed them to the sample/checklist, and conferenced with the student (conferences end with revision plans and students choosing their rewrite due date).

I did not respond to the rewrite, but returned it with the original submission, noted my concern about almost no real revision, re-prompted the student to review the sample/checklist, and recommended another conference to ensure the student and I are using our time well with another resubmission.

Two aspects of the essay were not addressed at all: the essay failed to mention the focus/thesis throughout the body of the essay (three subhead sections), and despite the checklist explicitly requiring journalistic paragraphing (restricting paragraphs to 1-3 sentences), the resubmission included (as in the original submission) opening and closing paragraphs of 5-6 sentences.

The student’s response is notable because they explained how hard they worked on the rewrite, including working with our writing lab, and then apologized.

I want to emphasize that over 40 years of teaching writing, I have had to help students let go of the fear of mistakes and the urge to produce “perfect” writing in one submission. Most students simply can’t engage in process writing because the dominant culture of their schooling has been reductive behaviorism that hyper-focuses on student mistakes, fosters a reward/punishment culture, and shifts student concern from authentic artifacts and learning to securing grades.

As I have examined before, students are apt to view all feedback as negative even as I carefully and consistently urge them to see feedback as necessary for growing as writers.

One strategy I incorporate is showing students the real-world process of submitting and publishing academic writing; for example, my own experience publishing a policy brief:

This context, I think, helps some with the anxiety students feel about feedback and their tendency to view that feedback as negative (even though I am not grading them and they are performing in a low-stakes environment).

Nonetheless, students at the college level have been so powerfully trained into the reductive behaviorism of success/failure, tests/grades, and avoiding mistakes that authentic process writing and writing outcomes (students write on topics by choice) are too foreign for them to fully engage.

What concerns me beyond why and how my students are struggling (in justifiable ways) is that I also see teachers and professors complaining about “students today” on social media.

Those complaints are quintessentially American responses—blaming the individuals while ignoring the systemic influences.

Our students are struggling in heightened ways because of the disruptions of Covid/post-Covid formal schooling. But traditional and uncritical commitments to reductive behaviorism are also at the core of their struggling as well.

Many if not most of the traditional approaches to schooling in the US are antagonistic not only to learning but also to the basic humanity of students and teachers.

Learning to write is a journey, a process, but so is all learning.

Students are the canaries in the coal mine warning us that education is too often dehumanizing and reductive. When students choose not to fully engage with that education, they may be making the most reasonable decision by choosing themselves.

Testing for Perpetual Education Crisis

“The administrations in charge,” writes Gilles Deleuze in Postscript on the Societies of Control, “never cease announcing supposedly necessary reforms: to reform schools, to reform industries, hospitals, the armed forces, prisons” (p. 4).

Deleuze’s generalization about “supposedly necessary reforms” serves as an important entry point into the perpetual education crisis in the US. Since A Nation at Risk, public education has experienced several cycles of crisis that fuel ever-new and ever-different sets of standards and high-stakes testing.

Even more disturbing is that for at least a century, “the administrations in charge” have shouted that US children cannot read—with the current reading crisis also including the gobsmacking additional crisis that teachers of reading do not know how to teach reading.

The gasoline that is routinely tossed on the perpetual fire of education crisis is test scores—state accountability tests, NAEP, SAT, ACT, etc.

While all that test data itself may or may not be valuable information for both how well students are learning and how to better serve those students through reform, ultimately all that testing has almost nothing to do with either of those goals; in fact, test data in the US are primarily fuel for that perpetual state of crisis.

Here is the most recent example—2023 ACT scores:

I have noted that reactions and overreactions to NAEP in recent years follow a similar set of problems found in reactions/overreactions to the SAT for many decades; the lessons from those reactions include:

  • Lesson: Populations being tested impact data drawn from tests.
  • Lesson: Ranking by test data must account for population differences among students tested.
  • Lesson: Conclusions drawn from test data must acknowledge the purpose of the test being used (see Gerald Bracey).

The social media and traditional media responses to 2023 ACT data expose a few more concerns about media, public, and political misunderstanding of test data as well as how “the administrations in charge” depend on manipulating test data to ensure the perpetual education crisis.

Many people have confronted the distorting ways in which the ACT data are being displayed; certainly the mainstream graph from Axios above suggests “crisis”; however, by simply modifying the X/Y axes, that same data appear at least less dramatic and possibly not even significant if the issues I list above are carefully considered.
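To make the axis point concrete, here is a minimal sketch in Python, using invented numbers rather than the actual ACT data, of how the choice of y-axis limits alone can make a nearly flat trend look like a plunge:

```python
# A minimal sketch (invented numbers, not the actual ACT data) of how
# y-axis limits alone can turn a nearly flat trend into a "crisis" graph.
import matplotlib.pyplot as plt

years = list(range(2014, 2024))
avg_composite = [21.0, 21.0, 20.9, 21.0, 20.8, 20.7, 20.6, 20.3, 19.8, 19.5]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Truncated y-axis: the small drop fills the whole frame and looks dramatic.
ax1.plot(years, avg_composite, marker="o")
ax1.set_ylim(19.4, 21.1)
ax1.set_title("Truncated axis: 'crisis'")

# Full scale (ACT composite scores run 1-36): the same data look nearly flat.
ax2.plot(years, avg_composite, marker="o")
ax2.set_ylim(1, 36)
ax2.set_title("Full 1-36 scale: nearly flat")

for ax in (ax1, ax2):
    ax.set_xlabel("Year")
    ax.set_ylabel("Average ACT composite")

plt.tight_layout()
plt.show()
```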

Many causal elements could be at work to explain the ACT decrease, including population shifts, social influences (such as the Covid impact), and the inherently problematic element of using test data for purposes not intended as well as making nuanced claims based on singular data points (averages).

For example, the ACT is exclusively designed to measure college preparedness, like the SAT, and not general educational quality of schools or general evaluations of student learning.

Students who take the ACT are a narrow subset of students skewed by region and academic selectivity (college-bound students versus general population of US students).

Also, while a careful analysis could answer these questions, the ACT score drop may or may not represent a significant event, depending on what that single point (average) represents (how many questions, and how substantively large the change is).

Likely, however, there is never any credible reason to respond to college entrance data as a crisis of general educational quality because, as noted above, that simply is not what the tests are designed to measure.

The larger issue remains: Testing in the US rarely serves well in evaluating learning and teaching, and testing has not functioned in service of achieving effective education reform, but testing does fuel perpetual education crisis.

This crisis-of-the-day about the ACT parallels the central problem with NAEP, a test that seems designed to mislead and not inform since NAEP’s “Proficient” feeds a false narrative that a majority of students are not on grade level as readers.

The ACT crisis graph being pushed by mainstream media is less a marker of declining educational quality in the US and more further proof that “the administrations in charge” want and need testing data to justify “supposedly necessary reforms,” testing as gas for the perpetual education crisis fire.

Moving Beyond the Cult of Pedagogy in Education Reform

As a teacher for forty years and a teacher educator for more than half of that career, I have always struggled with the tendency to oversell teacher quality and instructional practice.

Does teacher quality matter? Of course.

Does instructional practice matter? Again, of course.

But both teacher quality and instruction (pedagogy) are dwarfed by teaching and learning conditions within schools and more significantly by the conditions of any child’s life.

As I have noted recently, the peak era of focusing on teacher quality, the value-added movement (VAM) that occurred mostly under the Obama administration, did not identify high-quality teachers as a driver for improving student achievement but instead found out something much different than intended:

VAMs should be viewed within the context of quality improvement, which distinguishes aspects of quality that can be attributed to the system from those that can be attributed to individual teachers, teacher preparation programs, or schools. Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores, and that the majority of opportunities for quality improvement are found in the system-level conditions. Ranking teachers by their VAM scores can have unintended consequences that reduce quality.

ASA Statement on Using Value-Added Models for Educational Assessment (2014)
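As a rough illustration of what that 1% to 14% figure means, here is a toy simulation, with invented effect sizes rather than real test or VAM data, in which out-of-school factors dwarf teacher effects in total score variance:

```python
# A toy simulation (invented effect sizes, not real VAM or test data) of the
# ASA point that teachers account for a small share of score variability.
import random
from statistics import pvariance

random.seed(42)

SD_BACKGROUND = 10.0  # assumed out-of-school/background effect (SD)
SD_TEACHER = 2.0      # assumed teacher effect (SD)
SD_NOISE = 4.0        # assumed measurement noise (SD)

scores = []
for _ in range(2000):                    # 2,000 simulated teachers
    teacher_effect = random.gauss(0, SD_TEACHER)
    for _ in range(25):                  # 25 students per teacher
        scores.append(500
                      + random.gauss(0, SD_BACKGROUND)  # background
                      + teacher_effect                  # teacher
                      + random.gauss(0, SD_NOISE))      # noise

# Under these assumed effect sizes, the teacher share of total variance is
# SD_TEACHER^2 / (SD_BACKGROUND^2 + SD_TEACHER^2 + SD_NOISE^2) = 4/120.
total_var = SD_BACKGROUND**2 + SD_TEACHER**2 + SD_NOISE**2
print(f"Teacher share of variance (analytic): {SD_TEACHER**2 / total_var:.1%}")  # 3.3%
print(f"Simulated total score variance:       {pvariance(scores):.1f}")          # ~120
```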

Teacher quality necessarily includes two types of knowledge by a teacher—content knowledge and pedagogical knowledge.

Yet the VAM experiment revealed something we have known for decades—standardized tests of student learning mostly reflect the student’s relative privilege or inequity outside of school.

Despite the refrain of Secretary Duncan under Obama, schools have never in fact been “game changers.”

While neoliberal/conservative education reforms leveraged the “soft bigotry of low expectations” and unsubstantiated claims that the Left uses poverty as an excuse, people all along the ideological spectrum are over-focused on instructional practices. And that overemphasis is used to keep everyone looking at teachers, students, and instruction instead of those more impactful out-of-school (OOS) influences on student learning.

A companion to the cult of pedagogy in education reform is the “miracle” school claim, but “miracle” schools rarely (almost never) exist once the claim is interrogated, and even if a “miracle” school exists, it is by definition an outlier and essentially offers no guidance for scaling outward or upward.

The paradox of the cult of pedagogy in education reform is that until we directly address OOS factors, we will never have the context for better teasing out the importance of teacher quality and instructional practices.

The current education reform trapped in the cult of pedagogy is the “science of reading” (SOR) movement, which oversells the blame for student reading achievement as well as oversells the solutions in the form of different reading programs, reading instructional practices, and teacher preparation and professional development.

The “miracle” of the day in the SOR propaganda is Mississippi, which is very likely a mirage based on manipulating the age of students being tested at grade level and not on teacher quality and instructional practices.

Not a single education reform promise since the 1980s has succeeded, and the US remains in a constant cycle of crisis and reform promises.

Yet, the evidence is overwhelming that many OOS factors negatively impact student learning and that social reform would pay huge dividends in educational outcomes if we simply would move beyond the cult of pedagogy in education reform.

For example, see the following:

My entire career has existed within the neoliberal accountability era of education reform that oversells education as a “game changer” and oversells teacher quality and instructional practices.

As with time-share frauds, we are being duped, and teachers and students need us to move beyond the cult of pedagogy in education reform and focus on the much larger influences on students being able to learn and teachers being able to show that their quality and instruction can matter.

NAEP LTT 2023: A Different Story about Reading

“The Nation’s Report Card” has released NAEP Long-Term Trend Assessment Results: Reading and Mathematics for 2023.

The LTT is different than regularly reported NAEP testing, as explained here:

As I will highlight below, it is important to emphasize LTT is age-based and NAEP is grade-based.

LTT assesses reading for 13-year-old students, and by 2023, these students have experienced school solidly in the “science of reading” (SOR)-legislation era, which can be traced to 2013 (30+ states enacting SOR legislation and growing to almost every state within the last couple years [1]).

Being age-based (and not impacted by grade retention), the trends tell a much different story than the popular and misleading SOR movement.

Consider the following [2]:

Here is the different story:

  • There is no reading crisis.
  • Test-based gains in grades 3 and 4 are likely mirages, grounded in harmful policies and practices such as grade retention.
  • Age 13 students were improving at every percentile when media and politicians began crying “crisis,” but they have declined in the SOR era, with the lowest-performing students declining the most.
  • Reading for fun and by choice have declined significantly in the SOR era (a serious concern since reading by choice is strongly supported by research as key for literacy growth).

Here are suggested readings reinforced by the LTT data:

The US has been sold a story about reading that is false, but it drives media clicks, sells reading programs and materials, and serves the rhetorical needs of political leaders.

Students, on the other hand, pay the price for false stories.


[1] Documenting SOR/grade-three-intensive reading legislation, connected to FL as early as 2002, but commonly associated with 2013 as the rise of SOR-labeled legislation (notably in MS):

Olson, L. (2023, June). The reading revolution: How states are scaling literacy reform. FutureEd. Retrieved June 22, 2023, from https://www.future-ed.org/teaching-children-to-read-one-state-at-a-time/

Cummings, A. (2021). Making early literacy policy work in Kentucky: Three considerations for policymakers on the “Read to Succeed” act. Boulder, CO: National Education Policy Center. Retrieved May 18, 2022, from https://nepc.colorado.edu/publication/literacy

Cummings, A., Strunk, K.O., & De Voto, C. (2021). “A lot of states were doing it”: The development of Michigan’s Read by Grade Three law. Journal of Educational Change. Retrieved April 28, 2022, from https://link.springer.com/article/10.1007/s10833-021-09438-y

Collet, V.S., Penaflorida, J., French, S., Allred, J., Greiner, A., & Chen, J. (2021). Red flags, red herrings, and common ground: An expert study in response to state reading policy. Educational Considerations, 47(1). Retrieved July 26, 2022, from https://doi.org/10.4148/0146-9282.2241

Reinking, D., Hruby, G.G., & Risko, V.J. (2023). Legislating phonics: Settled science or political polemic? Teachers College Record. https://doi.org/10.1177/01614681231155688

Schwartz, S. (2022, July 20). Which states have passed “science of reading” laws? What’s in them? Education Week. Retrieved July 25, 2022, from https://www.edweek.org/teaching-learning/which-states-have-passed-science-of-reading-laws-whats-in-them/2022/07

Thomas, P.L. (2022). The Science of Reading movement: The never-ending debate and the need for a different approach to reading instruction. Boulder, CO: National Education Policy Center. http://nepc.colorado.edu/publication/science-of-reading

[2] Despite claims of a “miracle,” MS grade 8 NAEP reading scores remain at the bottom after a decade of SOR legislation:

SAT Lessons Never Learned: NAEP Edition

Yesterday, I spent an hour on the phone with the producer of a national news series.

I realized afterward that much of the conversation reminded me of dozens of similar conversations with journalists throughout my 40-year career as an educator because I had to carefully and repeatedly clarify what standardized tests do and mean.

Annually for more than the first half of my career, I had to watch as the US slipped into Education Crisis mode when SAT scores were released.

Throughout the past five decades, I have been strongly anti-testing and anti-grades, but most of my public and scholarly work challenging testing addressed the many problems with the SAT—and notably how the media, public, and politicians misunderstand and misuse SAT data.

See these for example:

Over many years of critically analyzing SAT data as well as the media/public/political responses to the college entrance exam, many key lessons emerged that include the following:

  • Lesson: Populations being tested impact data drawn from tests. The SAT originally served the needs of elite students, often those seeking Ivy League educations. However, over the twentieth century, increasingly many students began taking the SAT for a variety of reasons (scholarships and athletics, for example). The shift in population of students being tested from an elite subset (the upper end of the normal curve) to a more statistically “normal” population necessarily drove the average down (a statistical fact that has nothing to do with school or student quality; see the simulation sketch after this list). While statistically valid, dropping SAT scores because of population shifts created media problems (see below); therefore, the College Board recentered the scoring of the SAT.
  • Lesson: Ranking by test data must account for population differences among students tested. Reporting in the media of average SAT scores for the nation and by states created a misleading narrative about school quality. Part of that messaging was grounded in the SAT reporting average SAT scores by ranking states, and then, media reporting SAT average scores as a valid assessment of state educational quality. The College Board eventually issued a caution: “Educators, the media and others should…not rank or rate teachers, educational institutions, districts or states solely on the basis of aggregate scores derived from tests that are intended primarily as a measure of individual students.” However, the media continued to rank states using SAT average scores. SAT data has always been strongly correlated with parental income, parental level of education, and characteristics of students such as gender and race. But a significant driver of average SAT scores also included rates of participation among states. See for example a comparison I did among SC, NC, and MS (the latter having a higher poverty rate and higher average SAT because of a much lower participation rate, including mostly elite students):
  • Lesson: Conclusions drawn from test data must acknowledge the purpose of the test being used (see Gerald Bracey). The SAT has one very narrow purpose—predicting first-year college grades; and the SAT has primarily one use—a data point for college admission based on its sole purpose. However, historically, media/public/political responses to the SAT have used the data to evaluate state educational quality and the longitudinal progress of US students in general. In short, SAT data has been routinely misused because most people misunderstand its purpose.

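The population-shift lesson above is easy to demonstrate. Here is a minimal simulation sketch, using an invented score distribution rather than real SAT data (and idealizing the “elite” subset as the top-scoring tenth), showing the average falling purely because more students take the test:

```python
# A minimal sketch (invented score distribution, not real SAT data) of how
# expanding the tested population lowers the average with no change in quality.
import random

random.seed(7)

def mean(xs):
    return sum(xs) / len(xs)

# One fixed population of potential test-takers; nothing about it changes.
population = sorted((random.gauss(500, 100) for _ in range(100_000)), reverse=True)

elite = population[: len(population) // 10]  # era 1: only the top 10% test
broad = population[: len(population) // 2]   # era 2: half of all students test

print(f"Elite test-takers' average: {mean(elite):.0f}")  # roughly 675
print(f"Broad test-takers' average: {mean(broad):.0f}")  # roughly 580
# The average drops purely because of who takes the test,
# not because students or schools got worse.
```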
Recently, the significance of the SAT has declined, with students taking the ACT at a higher rate and more colleges going test-optional, but the nation has shifted to panicking over NAEP data instead.

The rise in significance of NAEP includes the focus on “proficiency” included in NCLB mandates (which required all states to have 100% student proficiency by 2014).

The problem now is that media/public/political responses to NAEP mimic the exact mistakes during the hyper-focus on the SAT.

NAEP, like the SAT, then, needs a moment of reckoning also.

Instead of helping public and political messaging about education and education reform, NAEP has perpetuated the very worst stories about educational crisis. That is in part because there is no standard for “proficiency” and because NAEP was designed to provide a check against state assessments that could set cut scores and levels of achievement as they wanted:

Since states have different content standards and use different tests and different methods for setting cut scores, obviously the meaning of proficient varies among the states. Under NCLB, states are free to set their own standards for proficiency, which is one reason why AYP school failure rates vary so widely across the states. It’s a lot harder for students to achieve proficiency in a state that has set that standard at a high level than it is in a state that has set it lower. Indeed, even if students in two schools in two different states have exactly the same achievement, one school could find itself on a failed-AYP list simply because it is located in the state whose standard for proficient is higher than the other state’s….

Under NCLB all states must administer NAEP every other year in reading and mathematics in grades 4 and 8, starting in 2003. The idea is to use NAEP as a “check” on states’ assessment results under NCLB or as a benchmark for judging states’ definitions of proficient. If, for example, a state reports a very high percentage of proficient students on its state math test but its performance on math NAEP reveals a low percentage of proficient students, the inference would be that this state has set a relatively easy standard for math proficiency and is trying to “game” NCLB.

What’s Proficient?: The No Child Left Behind Act and the Many Meanings of Proficiency

In other words, NAEP was designed as a federal oversight of state assessments and not an evaluation tool to standardize “proficient” or to support education reform, instruction, or learning.
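The benchmark logic described in that passage can be made concrete with a toy comparison, using invented percentages for hypothetical states, that flags large gaps between state-test and NAEP proficiency rates:

```python
# A toy illustration (invented percentages, hypothetical states) of using NAEP
# as a benchmark: a big gap suggests a state set a low bar for "proficient."
state_test_pct = {"State A": 88, "State B": 45, "State C": 72}  # % proficient, state test
naep_pct = {"State A": 31, "State B": 38, "State C": 35}        # % proficient, NAEP

for state, state_pct in state_test_pct.items():
    gap = state_pct - naep_pct[state]
    verdict = "possible low cut score" if gap > 30 else "roughly consistent"
    print(f"{state}: state test {state_pct}% vs NAEP {naep_pct[state]}% -> {verdict}")
```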

As a result, NAEP, as the SAT/ACT has done for years, feeds a constant education crisis cycle that also fuels concurrent cycles of education reform and education legislation that has become increasingly authoritarian (mandating specific practices and programs as well as banning practices and programs).

With the lessons from the SAT above, then, NAEP reform should include the following:

  • Standardizing “proficient” and shifting from grade-level to age-level metrics.
  • Ending state rankings and comparisons based on NAEP average scores.
  • Changing testing population of students by age level instead of grade level (addressing the impact of grade retention, which is a form of states’ “gaming the system” that NAEP sought to correct). NAEP testing should include children in an annual band of birth months/years regardless of grade level.
  • Providing better explanations and guidance for reporting and understanding NAEP scores in the context of longitudinal data.
  • Developing a collaborative relationship between federal and state education departments and among state education departments.

While I remain a strong skeptic of the value of standardized testing, and I recognize that we over-test students in the US, I urge NAEP reform and that we have a NAEP reckoning for the sake of students, teachers, and public education.

Recommended

Literacy and NAEP Proficient, Tom Loveless

The NAEP proficiency myth, Tom Loveless

Test Scores Reflect Media, Political Agendas, Not Student or Educational Achievement [UPDATED]

In the US, the crisis/miracle obsession with reading mostly focuses on NAEP scores. For the UK, the same crisis/miracle rhetoric around reading is grounded in PIRLS.

The media and political stories around the current reading crisis cycle have interesting and overlapping dynamics in these two English-dominant countries, specifically a hyper-focus on phonics.

Here are some recent media examples for context:

Let’s start with the “soar[ing]” NAEP reading scores in MS, LA, and AL as represented by AP:

‘Mississippi miracle’: Kids’ reading scores have soared in Deep South states

Now, let’s add the media response to PIRLS data in the UK:

Reading ability of children in England scores well in global survey

Now I will share data on NAEP and PIRLS that shows media and political responses to test scores are fodder for their predetermined messaging, not real reflections of student achievement or educational quality.

A key point is that the media coverage above represents a bait-and-switch approach to analyzing test scores. The claims in both the US and UK focus on rank among states/countries and not on trends within states/countries.

Do any of these state trend lines from FL, MS, AL, or LA appear to be “soar[ing]” data?

The fair description of the “miracle” states identified by AP is that test scores are mostly flat, and AL, for example, appears to have peaked more than a decade ago and is trending down.

The foundational “miracle” state, MS, has had two significant increases, one before their SOR commitment and one after; but there remains no research on why the increases occurred:

Scroll up and notice that in the UK, PIRLS scores have tracked flat and slightly down as well.

The problematic element in all of this is that many journalists and politicians have used flat NAEP scores to shout “crisis” and “miracle,” while in the UK, the current flat and slightly down scores are reason to shout “Success!” (although research on the phonics-centered reform in England since 2006 has not delivered as promised [1]).

Many problems exist with relying on standardized test scores to evaluate and reform education. Standardized testing remains heavily race, gender, and class biased.

But the greatest issue with test data is that inexpert and ideologically motivated journalists and politicians persistently conform the data to their desired stories—sometimes crisis, sometimes miracle.

Once again, the stories being sold—don’t buy them.


Recommended

Three Twitter threads on reading, language and a response to an article in the Sunday Times today by Nick Gibb, Michael Rosen

[1] Wyse, D., & Bradbury, A. (2022). Reading wars or reading reconciliation? A critical examination of robust research evidence, curriculum policy and teachers’ practices for teaching phonics and reading. Review of Education, 10(1), e3314. https://doi.org/10.1002/rev3.3314

UPDATE

Mainstream media continues to push a false story about MS as a model for the nation. Note that MS, TN, AL, and LA demonstrate that political manipulation of early test data is a mirage, not a miracle.

All four states remain at the bottom of NAEP reading scores for both proficient and basic a full decade into the era of SOR reading legislation: