Assessing Quantitative Literacy
Jack Bookman
Duke University
bookman@math.duke.edu
Brief History of the Calculus Reform Movement
Why discuss this?
Quantitative literacy (QL) has some historical roots in the calculus reform (CR) effort
lessons to be learned from the evaluation of the calculus reform efforts
The Tulane Conference - January 1986
Problems with instruction in calculus repeatedly mentioned by participants at the conference
(Source: Tucker, Alan and Leitzel, Jim (1995). Assessing Calculus Reform Efforts. Published by the MAA.)
• too few students were successfully completing calculus;
• students were mindlessly implementing symbolic
algorithms with no understanding and little facility at
using calculus in subsequent mathematics courses;
• faculty were frustrated at the need to work so hard to
help poorly prepared, poorly motivated students learn
material that was a shadow of the calculus they had
learned;
• calculus was being required as an unmotivated and
unnecessary filter by some disciplines which made little
use of it in their own courses; and
• mathematics was lagging [behind] other disciplines in
the use of technology.
In 1987, the NSF began its Calculus Initiative, eventually spending $40,000,000 on this project over approximately the next ten years.
What calculus reform means varies greatly from one institution to another. Reform is manifested by changes in one or more of the following:
use of technology;
cooperative learning;
group projects;
writing; and
real world problems.
Evaluating Calculus Reform Efforts at Duke University
The evaluation had two phases:
During the first years of the project, the emphasis was on formative evaluation (the cook tastes the soup, quoting Robert Stake) focussing on student and faculty reactions to the course and other problems of implementation.
During the later years, the emphasis changed to summative evaluation (the guests taste the soup), comparing traditionally and experimentally taught students on a set of outcomes. The outcome-based phase had three main components:
(1) a problem solving test given to both Project CALC (PC) and traditional (TR) students while they were enrolled in Calculus II;
(2) a "retention" study of sophomores and juniors, both PC and TR; this comparison used tests of writing, attitudes, skills, conceptual understanding and problem solving; and
(3) a follow-up study, conducted during the last year of the project, focussed on the question, "Do PC students do better in and/or take more courses that require calculus?"
Other features of the evaluation of Project CALC:
the "accounting model" of evaluation - insiders vs outsiders
(almost) random assignment
both qualitative and quantitative
What worked and what didn't in this study?
What have we learned from attempts at evaluating calculus reform efforts?
comparison groups - even random assignment is difficult
evaluation needs to be built into the design of an intervention - not an afterthought
the treatment is not well defined
political and ideological issues
issues of validity and reliability of the instruments used
difficult to get statistically significant results from a single intervention - there are often many other important factors
the opposite of the Hawthorne effect
this is messy and time consuming and doesn't make you popular
need to do it early and to make the assessment a central part of the project and not an afterthought.
But it is possible to get insight into what is going on and to use results to make improvements
If you thought that was difficult, try evaluating QL
Would any two people in this room agree on what QL is? But there seems to be a growing consensus
What can we do?
locally define what it means to be quantitatively literate
graduates from college x should be able to ...
this is hard and must be thought of as an iterative process
involve lots of faculty - in ways other than convening a meeting
well designed surveys; focused interviews
develop and pilot some instruments (both qualitative and quantitative) - including pre & post tests, attitude surveys, faculty surveys, rubrics for analyzing course content
develop baseline data
be aware of the political and cultural environment
some standards for evaluation work should be the same as in more traditional scientific research - ideology shouldn't drive conclusions (be prepared to be surprised by results, to be honest and to take the heat)
but some standards are different - try doing a double blind study in education
conduct longitudinal developmental studies like the one being conducted here at Mac
remember though that we can't do everything
QL presents a particular challenge for assessment:
Assessment items must be set in a rich context in which multiple problem solving approaches are possible, and perhaps there is not one "right" answer.
On the other hand, the assessment instruments must not take too much time to administer, must include a multitude of problems and situations, and must be designed to allow for reliable scoring.
QL assessment instruments must include problems that are not simply schema driven (i.e., students have a well-recognized routine for solving the problem), but also are not so non-routine that students cannot solve them.
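One common way to check whether an open-ended item "allows for reliable scoring" is to have two readers score the same student responses with the same rubric and measure their agreement beyond chance. The sketch below computes Cohen's kappa for that purpose. It is an illustration only, not a method used in the Project CALC evaluation; the function name and the rubric scores are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters' rubric scores."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty score lists of equal length")
    n = len(rater_a)

    # Observed agreement: fraction of responses given the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected agreement if the raters scored independently,
    # each with their own marginal distribution of scores.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical score throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical rubric scores (0 = no credit, 1 = partial, 2 = full)
# assigned by two readers to the same eight student responses.
reader_1 = [0, 1, 2, 2, 1, 0, 2, 1]
reader_2 = [0, 1, 2, 1, 1, 0, 2, 2]
kappa = cohens_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.3f}")
```

Values near 1 indicate that readers apply the rubric consistently; values near 0 indicate agreement no better than chance, which suggests the rubric or the item needs revision before it is used for comparisons.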