10. Improve Student Performance on State Accountability Tests

This kind of project requires two kinds of evaluation:

1. A formative evaluation to determine if the efforts you have made to improve the scores have been implemented.

2. A summative evaluation to determine if these efforts or programs have improved scores on the state tests.

I  Formative Evaluation

The task here is to determine how well (or, indeed, if) the new programs and efforts have been implemented. This is not something that can be taken for granted.

Thus, you will need to assemble credible, objective evidence (not simply your own judgments) on the following types of questions:

1. Do the new programs exist?
2. Are they operating as they are supposed to?
3. If not, what changes are needed to make them operational?
4. Are funds being appropriately spent?
5. Have the necessary books, materials, and equipment been purchased and made available to the teachers and students?
6. Have competent staff been hired and trained?
7. Are the programs serving the intended students?
8. Are the students receiving the intended educational information and services?

This kind of evaluation will tell you whether the new programs are operational and potentially capable of having an impact on the state test scores. It is essential to complete this type of evaluation before undertaking a summative, impact, or outcome evaluation. It makes no sense to evaluate the effectiveness of a program that does not exist, even though we regrettably see many instances of this.

II  Summative Evaluation

The two basic questions here are:

>Has there been an improvement on the state test scores?

>Can any improvement that has occurred be confidently attributed to the new programs rather than other factors?

There are two preferred evaluation designs for answering these questions:

(1) Compare the pre- and post-test scores of a group of students exposed to the new programs with those of a comparable control group of students who were not exposed to the programs. If the gains of the treatment group exceed those of the control group by a margin that is both statistically significant and educationally meaningful, you can confidently conclude that these gains were the result of the new program. Simply administering pre- and post-tests to the treatment or program group without a control group will not do. The state test scores may have increased because of other conditions or factors besides the new programs. For comparison, you need some measure of change under the conditions of non-treatment. (See the section on An Example of the Most Common Pitfall in Evaluating Education Programs in the Short Course on Evaluation Basics.)
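The comparison in design (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`gain_scores`, `compare_gains`) are hypothetical, each student is assumed to have both a pre-test and a post-test score, and Welch's t statistic is used here as one common way to gauge whether the difference in mean gains is larger than chance variation. The original text does not prescribe a particular test.

```python
from math import sqrt
from statistics import mean, variance


def gain_scores(pre, post):
    """Per-student gain: post-test score minus pre-test score."""
    return [after - before for before, after in zip(pre, post)]


def compare_gains(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Compare mean gains of the treatment and control groups.

    Returns the two mean gains, their difference, and Welch's t
    statistic (an assumed choice of test, not taken from the source).
    """
    t_gain = gain_scores(treat_pre, treat_post)
    c_gain = gain_scores(ctrl_pre, ctrl_post)
    difference = mean(t_gain) - mean(c_gain)
    # Standard error of the difference, allowing unequal variances.
    std_err = sqrt(variance(t_gain) / len(t_gain)
                   + variance(c_gain) / len(c_gain))
    return {
        "treatment_gain": mean(t_gain),
        "control_gain": mean(c_gain),
        "difference": difference,
        "welch_t": difference / std_err,
    }


# Made-up scores, purely for illustration.
result = compare_gains(
    treat_pre=[50, 55, 60, 52, 58], treat_post=[60, 63, 71, 60, 69],
    ctrl_pre=[51, 54, 59, 53, 57], ctrl_post=[54, 58, 61, 57, 60],
)
print(result)
```

The control group's mean gain stands in for the "change under the conditions of non-treatment"; only the gain beyond that baseline is a candidate for attribution to the program. A large t statistic addresses statistical significance only. Whether the difference is educationally meaningful remains a judgment about its size.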

(2) As an alternative, if a control group is not available, you can use what is known as the interrupted time-series design. If test scores are available for several years or periods prior to the introduction of the new programs, and there is a marked improvement in these scores immediately following the introduction of the new programs that is sustained in following years or periods, you can confidently conclude that the gains were the result of the new program.
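One simple way to look at an interrupted time series is to fit a straight-line trend to the periods before the programs began, project that trend forward, and see how far the observed scores sit above the projection. The sketch below assumes one average score per period and a roughly linear pre-program trend; the function names (`fit_line`, `post_program_shift`) and the data are hypothetical, and a full analysis would also need to account for period-to-period noise.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys): returns slope, intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def post_program_shift(scores, first_program_period):
    """Observed minus projected score for each period after introduction.

    scores: average score per period, in time order.
    first_program_period: index of the first period with the programs
    in place (at least two earlier periods are needed to fit a trend).
    """
    pre_periods = list(range(first_program_period))
    slope, intercept = fit_line(pre_periods, scores[:first_program_period])
    shifts = []
    for period in range(first_program_period, len(scores)):
        projected = intercept + slope * period  # where the old trend points
        shifts.append(scores[period] - projected)
    return shifts


# Made-up yearly averages: four years before, three years after.
shifts = post_program_shift([60, 61, 62, 63, 70, 71, 72], first_program_period=4)
print(shifts)
print("sustained:", all(s > 0 for s in shifts))
```

A shift that appears in the first post-program period and stays positive in later periods matches the "marked improvement ... that is sustained" pattern described above. A jump that fades back to the old trend line does not.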

For more information on these two designs, see the section on Alternative Summative Evaluation Methods in the Short Course on Evaluation Basics, and references 1 (Campbell), 4 (Cook & Campbell), and 8 (Rossi & Freeman) in the Evaluation References.
