An Analysis of Achievement Scores of  
Arthur Academy Schools, 2007 to 2013 
 
 
 
 
 
 
 
 
Technical Report 2014-2 
 
 
 
 
 
 
 
 
 
 
 
 
 
Charles Arthur – Early Child Literacy Consultant, Arthur Reading Workshop & 
Jean Stockard, Office of Research and Evaluation, National Institute for Direct Instruction 
AUGUST 7, 2014 
 
Arthur Academies are a set of six charter elementary schools in the greater Portland, Oregon 
metropolitan area. The mission of the schools is “to accelerate educational achievement 
and academic competency of all its students” and “to become an effective and innovative 
model of instruction that can influence teaching practices in other schools” (Arthur, 2009-
10, p. 1). This report examines data on the academic achievement of Arthur students from 
six consecutive school years: 2007-2008 through 2012-2013 and shows how the schools 
are accomplishing this mission. The data indicate that, at the start of kindergarten, Arthur 
students had achievement scores that were similar to or slightly lower than students in the 
nation as a whole. However, by the end of kindergarten they had achievement scores that 
were significantly higher than their peers. This high level of achievement persists, and even 
increases, through later grades. The body of the text below includes selected figures to 
illustrate this phenomenon. Tables with the supporting data and explanations of the 
statistical analyses are in the appendix.  
 
Background 
The Arthur Academies are managed by the Mastery Learning Institute, a Charter 
Management Organization. The first of the six schools opened in the fall of 2002. Two more 
schools opened in 2004, a fourth in 2005, and two more in the fall of 2007. Each of the 
schools is in a different school district. 
 
All of the schools use Direct Instruction programs for reading and mathematics: Reading 
Mastery and Connecting Math Concepts. These Direct Instruction programs use an 
incremental, mastery-learning approach to teaching basic subjects of reading, math and 
language.  This specialized approach to teaching is based on a comprehensive model of 
instruction. Providing this model is based on the belief that a powerful way of teaching exists 
that is not being utilized in most schools, and therefore our charter schools offer it as a 
choice. The basis of Mastery Learning is that a child’s rate of progress is determined by the 
extent to which he or she masters carefully sequenced lessons and activities, culminating in 
the mastery, or thorough learning, of essential foundational skills and knowledge.   
 
This model, with these programs, is the most thoroughly documented educational reform 
model in elementary and middle school grades. It emphasizes well-developed and carefully 
planned lessons, designed around small learning increments and prescribed teaching tasks. 
Learning is arranged very incrementally so that students find learning easy but challenging 
and, therefore, can be successful in mastering everything that is taught as they progress
through the programs. The curriculum materials break down all general objectives into very
small teaching progressions.  All activities are very carefully sequenced so that they can be 
easily learned, mastered and gradually accumulated towards larger objectives.  The 
activities are presented to children in very exacting, interactive ways so that children are 
motivated. 
 
The model includes sustained attention to maintaining quality of the implementation of the 
programs and high fidelity to the guidelines of the DI programs.  A great deal of emphasis is 
placed on having very capable and well-trained teachers, especially in the very early grades. 
Extensive training is provided yearly during two weeks before fall opening as well as in-class 
coaching during the year to attain and maintain quality teaching with these specialized 
programs. 
 
The Direct Instruction model of instruction is academically focused right from the beginning, 
in kindergarten.  It is based on the belief that academic learning can be highly motivating by 
itself when taught clearly, systematically and enthusiastically.  It has been observed that 
children are excited and gain self-confidence when they are learning how to read and do 
math in kindergarten. We have learned that, if at-risk children who have the highest 
likelihood of learning problems can start kindergarten in a strong academic program, many 
of their learning difficulties can be prevented. The data in the next section illustrate this 
phenomenon. 
 
Strong Achievement Growth in Kindergarten 
All Arthur Academy students are given the Stanford Achievement Test (SAT), a norm-
referenced test, in the fall and spring of their kindergarten year. Figure 1 and Figure 2 report 
the percentage of students who scored at or above the 40th percentile in reading and math 
at both testing periods from 2008-09 to 2012-13.  This benchmark was chosen to indicate a 
point of “average” achievement. By definition, 60 percent of the students in the nation have 
scores within this range, a level indicated by the solid horizontal line in the figures.  
 
In most of the comparisons, the percentage of Arthur students meeting this range at the 
start of kindergarten (the bars labeled “fall”) was substantially smaller than the 60 percent 
national norm. In other words, the Arthur students had lower scores than comparable 
students in the nation. Yet, by the spring of the year the percentage of students in this range 
was substantially larger than the 60 percent level. (Compare the bars labeled “spring” with 
the line labeled “nation.”) That is, by spring the Arthur students had much higher scores 
than comparable students in the nation.  (See Table A-1 for statistical details.) 
 
 
 
 
 
 
[Figure 1: Percentage of Kindergarten Students at or above 40th %ile, Reading SAT, Fall and Spring, by Year. Bars: Fall and Spring for each year, 2008-09 to 2012-13. Horizontal line: Nation. Vertical axis: percent, 0 to 100.]
 
 
[Figure 2: Percentage of Kindergarten Students at or above 40th %ile, Math SAT, Fall and Spring, by Year. Bars: Fall and Spring for each year, 2008-09 to 2012-13. Horizontal line: Nation. Vertical axis: percent, 0 to 100.]
 
 
Figure 3 reports results using a much higher standard, the 80th percentile, as the point of 
comparison. In the nation as a whole, 20 percent of students score at or above that level. At 
entry to kindergarten ten percent or fewer of the Arthur students (about half of what would 
be expected) were in this high achieving range. By the end of the year, however, from three 
to four times as many students were in this range.  
 
 
 
 
 
In other words, at the start of their kindergarten year, the average Arthur student had SAT 
scores that were lower than students in the nation as a whole. However, by the end of their 
kindergarten year, the situation had reversed, and the average Arthur student scored much 
higher than peers in the nation. In all cases, the changes over time, relative to the national 
norms, were statistically significant. The associated effect sizes ranged from .35 to 1.12, 
with an average value of .72. Traditionally, effect sizes of .25 or larger are seen as 
“educationally important,” and those above .60 are seen as large. Thus, these changes in 
the achievement of Arthur kindergarten students were both statistically significant and 
educationally important. Providing this kind of accelerated progress in kindergarten gives all 
children a huge advantage for success in future grades, a phenomenon that is documented 
in the next section of this report. 
 
 
[Figure 3: Percentage of Kindergarten Students at or above 80th %ile, SAT, Fall & Spring, by Year and Subject. Bars: Fall and Spring for 08-09, 09-10, 10-11, and 12-13, shown separately for Reading and Math. Horizontal line: Nation. Vertical axis: percent, 0 to 60.]
 
 
 
High Achievement Levels Persist Through Elementary School 
Two types of achievement data were available for students in grades after Kindergarten: SAT 
scores like those shown in Figures 1 to 3 were available for grades 1 to 5 from 2007-08 
to 2010-11 and grades 1-2 in 2011-12 and 2012-13. Scores from the Oregon Assessment 
of Knowledge and Skills (OAKS) were available for grades 3 to 5 for 2009-10 to 2012-13. 
The remaining figures in this report show how Arthur students maintained their high levels of 
achievement through their elementary years.  
 
 
 
 
 
 
Arthur Students’ SAT Scores Surpass National Norms in Grades 1 to 5 
Figures 4, 5, 6 and 7 summarize students’ spring scores on the SAT from 2008 to 2013. 
Figures 4 and 5 report the percentage of students scoring at or above average (defined as at 
or above the 40th percentile). Figures 6 and 7 report the percentage of students scoring 
within the high achieving range, defined as scoring at or above the 80th percentile. The 
format of the graphs is like that used in Figures 1, 2, and 3, with the bars showing Arthur 
students’ scores and the horizontal lines representing the national norm. (Data for 2008 
and 2009 were not available separately by grade, so these results are aggregated across all 
grade levels.) 

[Figure 4: Percentage of Arthur Students At or Above the 40th %ile in SAT Reading, Spring, by Grade and Year. Bars: AA (Arthur Academy), shown for K-5 combined in 08 and 09, grades 1 to 5 in 2010 and 2011, and grades 1 and 2 in 2012 and 2013. Horizontal line: Nat. Vertical axis: percent, 50 to 100.]

[Figure 5: Percentage of Arthur Students At or Above 40th Percentile in SAT Math, Spring, by Grade and Year. Bars: AA (Arthur Academy), shown for K-5 combined in 08 and 09, grades 1 to 5 in 2010 and 2011, and grades 1 and 2 in 2012 and 2013. Horizontal line: Nat. Vertical axis: percent, 50 to 100.]
 
 
[Figure 6: Percent of Arthur Students At or Above the 80th %ile, SAT Reading, Spring, by Grade and Year. Bars: AA (Arthur Academy), shown for K-5 combined in 08 and 09 and by grade in later years (axis labels: 2010, 2010-11, 2012-13). Horizontal line: Nation. Vertical axis: percent, 10 to 55.]
 
 
[Figure 7: Percentage of Arthur Students At or Above 80th %ile in SAT Math, Spring, by Grade and Year. Bars: AA (Arthur Academy), shown for K-5 combined in 08 and 09 and by grade in later years (axis labels: 2010, 2010-11, 2012-13). Horizontal line: Nation. Vertical axis: percent, 0 to 80.]
 
 
In all cases, the percentage of students in the depicted ranges was substantially higher than 
would be expected given national norms. All comparisons were statistically significant. The 
average effect size across all of the results was .59, more than twice the level traditionally 
used to denote educationally important results and very similar to that found in other 
meta-analyses of DI programs. (See Coughlin 2014 for a summary of meta-analyses and 
systematic reviews of the efficacy of DI programs.) 
 
Tables A-2 and A-3 in the Appendix provide additional details on the results shown in Figures 
4 through 7. Tables A-4 through A-7 provide details on changes from fall to spring in SAT 
scores. In almost all cases, the Arthur students’ gains in achievement over a single school 
year were significantly greater than would be expected given the national norms. The 
average effect size associated with these changes was .43, again well above the traditional 
level of educational importance.  
  
Arthur Students’ OAKS Scores Surpass District and State Figures 
Figures 8, 9 and 10 summarize students’ scores on the reading, mathematics, and science 
sections of the OAKS, given in the spring of each year. The bars represent the percentage of 
Arthur students who met the state defined benchmark on the assessments, and the lines in 
the graphs report the average percentage of students in the districts in which the schools 
are located and in the state as a whole that met the benchmark.  
 
The OAKS benchmarks are, by definition, a much easier criterion to meet than the national 
mean on the SAT. While only half of the national population is above the mean on the SAT, 
the data shown in Figures 8 to 10 indicate that over two-thirds of the students in the state 
met most of the benchmarks. Given such a high percentage of students meeting this mark, 
the “room” for surpassing this level is thus relatively small.  
 
In almost all cases, however, Arthur students were more likely to score at the proficient level 
than students in their home districts or in the state as a whole. The average effect size 
across all comparisons with the district was .48 and the average effect size for comparisons 
with the state was .32, both above the level traditionally used to denote educationally 
important effects. Most of the comparisons were also statistically significant. (See Table A-8 
for detailed statistical results and discussion in the appendix of the related calculations.) 
 
 
 
 
 
 
 
[Figure 8: Percentage of AA Students Meeting OAKS Reading Benchmark, by Year and Grade. Bars: A. Acad., grouped by Gr. 3, Gr. 4, and Gr. 5 for 2010 to 2013. Lines: Districts and State. Vertical axis: percent, 50 to 100.]
 
 
 
[Figure 9: Percentage of AA Students Meeting OAKS Benchmark, Mathematics, by Year and Grade. Bars: A. Acad., grouped by Gr. 3, Gr. 4, and Gr. 5 for 2010 to 2013. Lines: Districts and State. Vertical axis: percent, 50 to 90.]
 
 
 
 
 
 
[Figure 10: Percentage of AA 5th Graders Meeting OAKS Benchmark, Science, By Year. Bars: A. Acad. for 2011, 2012, and 2013. Lines: Districts and State. Vertical axis: percent, 40 to 100.]
 
 
 
Discussion and Summary 
This document has analyzed data on the academic achievement of Arthur Academy students 
over a six-year period. The results provide a clear picture of academic success. At the 
beginning of kindergarten the average Arthur student generally had achievement scores that 
were lower than national norms. Yet, by the end of kindergarten, the average student scored 
well above others in the nation. This high achievement continued to appear in the upper 
elementary grades. In almost all analyses, Arthur students had higher Stanford Achievement 
Test (SAT) and Oregon Assessment of Knowledge and Skills (OAKS) scores than counterparts in their districts, 
state, and nation. Almost all of the comparisons were statistically significant and the 
associated effect sizes were well beyond the levels used to denote educational importance. 
 
The replication of the results across six academic years is noteworthy. The high achievement 
patterns do not appear to be a fluke, but instead appear to be a consistent pattern. It should 
also be noted that the analysis is based on the aggregation of results combined across six 
schools, allowing more confident use of statistical manipulations. There is no indication that 
the pattern of strong achievement gains varied from one school to another. Moreover, the 
fact that the results have been replicated in six different settings is especially noteworthy 
and should provide additional confidence in the results. Finally, the fact that consistent 
positive results appeared in different subject matters (reading, mathematics, and, for fifth 
grade, science) and with two different assessments (SAT and OAKS) provides additional 
confidence in the results.  
 
 
 
 
 
Taken together, the data in this report provide substantial evidence that the Arthur Academy 
schools are meeting their stated mission. They have clearly accelerated the educational 
achievement and academic competency of their students. Moreover, they can serve as an 
effective and innovative model of instruction to influence teaching practices in other 
schools. 
 
 
  
 
 
 
 
Appendix 
Details on the analysis summarized above are given in this appendix. They describe the data 
more completely and explain the procedures used to calculate the effect sizes and tests of 
significance reported in the text. 
 
Strong Achievement Growth in Kindergarten 
Table A-1 presents the statistical results for the data that are shown in Figures 1, 2, and 3. 
The first two columns of Table A-1 give the subject matter and year of data used in the 
analysis. The first column of data (third column in the table) reports the percentage of 
students in a given range for the nation as a whole (60 percent for the 40th percentile and 
higher and 20 percent for the 80th percentile and higher). The next two columns of data 
report the percentage of Arthur students within this range at the start and end of the 
kindergarten year. The following column, labeled ES change (for Effect Size change), gives 
the effect size associated with the change, relative to the national norms, from fall to spring. 
The next two columns report the associated t-ratio and probability level; and the final column 
gives the number of Arthur students in the comparison. The first panel of data reports 
results for scores at or above the 40th percentile and the second panel reports results for 
scores at or above the 80th percentile. 
 
The following logic was used in calculating the effect sizes. Results for reading for the 40th 
percentile for kindergartners in 2010-11 (the third row in Table A-1) are used as an 
example. In the fall of 2010, 32 percent of Arthur students fell in this range and in the 
spring 87 percent of the students did so. In the nation, by definition, 60% of the students 
scored at or above this level at both time periods. This national group is the population to 
which Arthur students are compared.  
 
Using the logic and terminology associated with analyses of binomial distributions, the value 
of 60% is equivalent to $P_u = .60$, the proportion of cases in the population. By definition, the 
standard deviation of this population,  
$\sigma_{pu} = \sqrt{P_u Q_u}$, where $Q_u = 1 - P_u$ (the square root of the product of $P_u$ and $Q_u$).  (1) 

For this population $\sigma_{pu} = \sqrt{.60 \times .40} = .49$      (2) 
 
The effect size comparing Arthur students with the nation at any given time point is simply  
$(P_s - P_u)/\sigma_{pu}$,            (3) 
the difference between the sample and population value divided by the standard deviation 
of the population. In the fall, the effect size,  
$d_{fall} = (.32 - .60)/.49 = -.28/.49 = -.57$.        (4)  
In the spring, the effect size,  
$d_{spring} = (.87 - .60)/.49 = .27/.49 = +.55$.       (5) 
 
In other words, in the fall, the probability that an Arthur student would score in the average 
or above range was .57 of a standard deviation lower than the probability for the population 
as a whole. In the spring, the probability that an Arthur student would score in this range was 
.55 of a standard deviation higher than that for the nation.  
 
The effect size associated with the change from fall to spring is calculated as 
dchange  = dspring – dfall = .55 – (-.57) = 1.12.       (6) 
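
The arithmetic in equations (1) through (6) can be checked with a short script. This is a minimal sketch, not the authors' own procedure (the footnote to the OAKS calculations says a spreadsheet was used); the function names `binomial_sd` and `effect_size` are illustrative and do not come from the report.

```python
import math

def binomial_sd(p_u):
    """Population standard deviation for a proportion: sqrt(Pu * Qu), equation (1)."""
    return math.sqrt(p_u * (1.0 - p_u))

def effect_size(p_s, p_u):
    """Sample proportion minus population proportion, in population SD units, equation (3)."""
    return (p_s - p_u) / binomial_sd(p_u)

# Kindergarten reading, 2010-11 (third row of Table A-1).
p_u = 0.60                              # share of the nation at or above the 40th percentile
d_fall = effect_size(0.32, p_u)         # equation (4)
d_spring = effect_size(0.87, p_u)       # equation (5)
d_change = d_spring - d_fall            # equation (6)

print(round(binomial_sd(p_u), 2), round(d_fall, 2), round(d_spring, 2), round(d_change, 2))
```

Carrying full precision through the steps gives the same two-decimal values as the hand calculation in the text.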
 
The t-ratio and associated probabilities shown in the last two columns of Table A-1 were also 
calculated using the logic associated with the binomial distribution. This t-ratio tests the null 
hypothesis that the difference from fall to spring equals zero. The standard error is defined 
as 
$\sigma_{pu}/\sqrt{N-1}$, where $N$ is the sample size.           (7) 
For the data in the third line of Table A-1, the sample size is 133, and the standard error 
equals 
$\sigma_{pu}/\sqrt{N-1} = .49/\sqrt{132} = .49/11.49 = .043$     (8) 
 
Then, by definition, the t-ratio, 
$t = [(P_{spring,Arthur} - P_{spring,u}) - (P_{fall,Arthur} - P_{fall,u})]/[\sigma_{pu}/\sqrt{N-1}]$, and, because $P_{spring,u} = P_{fall,u}$, 
$t = [P_{spring,Arthur} - P_{fall,Arthur}]/[\sigma_{pu}/\sqrt{N-1}]$          (9) 

And, for the data in the third line of Table A-1,  
$t = (P_{spring} - P_{fall})/[\sigma_{pu}/\sqrt{N-1}] = (.87 - .32)/.043 = .55/.043 = 12.90$.    (10) 
 
This t-value would occur by chance far less than one time out of 1000. 
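
The t-ratio for the fall-to-spring change can be sketched the same way. The function name `change_t_ratio` is hypothetical, and the use of the square root of N minus 1 in the standard error is an assumption: it is the divisor that reproduces the 11.49 shown in the worked example for a sample of 133.

```python
import math

def change_t_ratio(p_spring, p_fall, p_u, n):
    """t-ratio for the fall-to-spring change in the sample proportion.

    Assumption: the standard error divides by sqrt(n - 1), which matches
    the 11.49 divisor in the worked example for n = 133.
    """
    sigma = math.sqrt(p_u * (1.0 - p_u))      # population SD, equation (1)
    std_error = sigma / math.sqrt(n - 1)      # standard error, equations (7) and (8)
    return (p_spring - p_fall) / std_error    # the national terms cancel, equation (9)

# Kindergarten reading, 2010-11 (third row of Table A-1).
t_change = change_t_ratio(p_spring=0.87, p_fall=0.32, p_u=0.60, n=133)
print(round(t_change, 2))
```

With unrounded intermediate values the result is about 12.90, the figure reported in Table A-1.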
 
High Achievement Levels Persist Through Elementary School 
Figures 4 to 10 in the text compare students’ spring scores to those in the nation, state and 
districts. The sections below describe the data in more detail and explain the associated 
calculations of effect sizes and tests of significance. 
 
 
 
 
 
 
 
Arthur Students’ SAT Scores Surpass National Norms in Grades 1 to 5 
Figures 4 to 7 and the related discussion compare the spring SAT scores of students to 
national values, focusing on the percentage at or above the 40th percentile (Figures 4 and 5) 
and the 80th percentile (Figures 6 and 7). Tables A-2 and A-3 display the data used to 
construct these figures.   
 
The logic for calculating effect sizes and t-ratios for these data is very similar to that 
explained directly above. To calculate the effect size associated with the comparison to the 
national average, the following logic was used, using data in Table A-2 as an example. The 
first panel of data in Table A-2 compares the percentage of students scoring at or above 
average (defined as at or above the 40th percentile on the SAT reading test) to the 
percentage expected given national norms. For example, among Arthur students in 2010-
11, 90% of first graders, 84% of second graders, and 79% of third graders scored in this 
range. The second column of data is 60 for all grades, reflecting the fact that, by definition, 
60% of all students would score at the 40th percentile or higher. Using equation (1) above, 
based on the binomial distribution, the standard deviation for the population is defined as 
$\sigma_{pu} = \sqrt{P_u Q_u} = \sqrt{.60 \times .40} = .49$ 
 
The effect size of the difference between the percentage of Arthur students and that for the 
nation can then be calculated as  
$d = (P_s - P_u)/\sigma_{pu}$,           (11) 
where $P_s$ is the proportion for the sample and $P_u$ is the proportion for the population. 
For the data for first graders in 2010-11, 
$d = (.90 - .60)/.49 = .30/.49 = .61$. 
In other words, in the spring of 2011, the percentage of Arthur students scoring at or above 
average was .61 of a standard deviation higher than would be expected given national data. 
 
A simple t-ratio can be used to test the null hypothesis that this effect size (or the difference 
of Arthur students from the national value) equals zero. The calculation 
of the associated t-ratio is a simple extension of this logic, using the definition of the 
standard error given in equation (7) above: 
$\sigma_{pu}/\sqrt{N-1}$ 
The sample size for first grade students in the spring of 2011 was 142. Thus, 

$\sigma_{pu}/\sqrt{N-1} = .49/\sqrt{141} = .49/11.87 = .041$. 

The t-ratio is then calculated as  
$t = (P_s - P_u)/[\sigma_{pu}/\sqrt{N-1}]$.         (12) 
For first grade Arthur students in the spring of 2011,  
$t = (P_s - P_u)/[\sigma_{pu}/\sqrt{N-1}] = (.90 - .60)/.041 = .30/.041 = 7.32$. 
 
A t-ratio of this magnitude would be expected to occur far less than one time out of 1000 if 
the null hypothesis were true. 
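
The comparison of a single spring percentage with the national value can be sketched as follows. The function name `spring_vs_nation` is illustrative, and the square root of N minus 1 is again an assumption taken from the 11.87 divisor shown for a sample of 142.

```python
import math

def spring_vs_nation(p_s, p_u, n):
    """Effect size and t-ratio for one sample proportion against the national value.

    Assumption: the t-ratio scales the effect size by sqrt(n - 1), which
    matches the 11.87 divisor in the worked example for n = 142.
    """
    sigma = math.sqrt(p_u * (1.0 - p_u))   # population SD, equation (1)
    d = (p_s - p_u) / sigma                # effect size, equation (11)
    t = d * math.sqrt(n - 1)               # same as (Ps - Pu) / (sigma / sqrt(n - 1))
    return d, t

# First graders, spring 2011, SAT reading, at or above the 40th percentile.
d, t = spring_vs_nation(p_s=0.90, p_u=0.60, n=142)
print(round(d, 2), round(t, 2))
```

Carried at full precision, the t-ratio comes to about 7.27, the value listed for this row in Table A-2; the 7.32 in the worked example reflects the rounded standard error of .041.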
 
Arthur Students’ OAKS Scores Surpass District and State Figures 
Table A-4 displays the data represented in Figures 8, 9 and 10, which compare the scores of 
Arthur Students to the average scores of students in the home districts of the schools and 
the state as a whole. The format of the table is similar to that of other tables. It shows the 
relevant percentages for Arthur students, the districts, and the state (the first three columns 
of data), the effect sizes associated with the comparisons of the Arthur students’ scores to 
those of the district and state, and the associated t-ratios. For instance, the first line of data 
indicates the percentage of third grade students in Arthur Academies, the districts and the 
state who met the OAKS reading benchmark in 2010 (95%, 74%, and 83% respectively).  
 
Again, the binomial distribution was used for calculating the effect sizes and t-ratios.  
However, a correction factor was included to adjust for the “ceiling effect” or limited amount 
of change possible for Arthur students relative to the larger population. Data in the second 
line of Table A-4 are used as an example of the calculations.  
 
The correction factor used was based on the maximum effect size that would occur if all 
students within a population were to increase their achievement levels to meet the 
benchmark; i.e., if the proportion meeting the benchmark rose to 1.00. Using the effect size formula in equation 11, 
the maximum effect size for a given population is calculated as  
$d_{max} = (1 - P_u)/\sqrt{P_u Q_u}$.          (13) 

For the data in the second row of Table A-4, the maximum effect size for the district, where 
$P_u = .77$, would be calculated as 
$d_{max,district} = (1 - .77)/\sqrt{.77 \times .23} = .23/.42 = .55$ 
The maximum effect size for the state, where $P_u = .85$, would be calculated as 
$d_{max,state} = (1 - .85)/\sqrt{.85 \times .15} = .15/.36 = .42$ 
 
Using equation 11, the raw, or uncorrected, effect size for the Arthur students relative to the 
district would be calculated as 
$d_{A,district(raw)} = (.92 - .77)/\sqrt{.77 \times .23} = .15/.42 = .36$; 
and the raw effect size relative to the state would be calculated as 
$d_{A,state(raw)} = (.92 - .85)/\sqrt{.85 \times .15} = .07/.36 = .19$. 
 
The corrected effect size is calculated as simply the ratio of the raw effect size to the 
maximum effect size 

$d_{cor.} = d_{raw}/d_{max}$.          (14) 

For the second line of data in Table A-4 
$d_{cor.,district} = .36/.55 = .65$ 

and  
$d_{cor.,state} = .19/.42 = .45$. (See footnote 1.) 
 
Note that the correction factor only impacts the value of the effect size when Pu approaches 
unity. In addition, note that the correction factor would not be appropriate in the 
comparisons involving groups defined by percentiles, as in the previous section.  
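
The ceiling correction can be sketched in a few lines. The function name `ceiling_corrected` is hypothetical; the inputs are the proportions quoted above for the second line of Table A-4 (.92 for Arthur students, .77 for the districts, .85 for the state).

```python
import math

def ceiling_corrected(p_s, p_u):
    """Return (raw, maximum, corrected) effect sizes for a benchmark comparison."""
    sigma = math.sqrt(p_u * (1.0 - p_u))   # population SD, equation (1)
    d_max = (1.0 - p_u) / sigma            # largest possible effect size, equation (13)
    d_raw = (p_s - p_u) / sigma            # uncorrected effect size, equation (11)
    d_cor = d_raw / d_max                  # corrected effect size, equation (14)
    return d_raw, d_max, d_cor

for label, p_u in (("district", 0.77), ("state", 0.85)):
    d_raw, d_max, d_cor = ceiling_corrected(0.92, p_u)
    print(label, round(d_raw, 2), round(d_max, 2), round(d_cor, 2))
```

Because the standard deviation cancels, the corrected value reduces to (Ps - Pu)/(1 - Pu): the share of the available headroom that the Arthur students closed. At full precision this gives about .65 for the district and about .47 for the state; the .45 in the worked example comes from rounding the intermediate values, as the footnote explains.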
 
The t-ratios associated with these comparisons were calculated from the effect sizes using a 
formula derived from simple manipulations of equations 11 and 12 above: 
$t = d \times \sqrt{N-1}$.          (15) 
 
 
Arthur Students’ Achievement Scores Continue to Improve in Grades 1 to 5 
Both SAT and OAKS data were available on changing scores over time for students in grades 
1 to 5. While these results are mentioned in the body of this report, associated figures were 
not given. The sections below describe the findings in more detail. 
 
Increasing SAT Scores from Fall to Spring 
Data from the Stanford Achievement Test are in Tables A-5 through A-8. The interpretation of 
the tables and the calculation of the effect sizes followed the procedure described above for 
Table A-1. Results show that students’ scores improved significantly from fall to spring in 
both reading and mathematics in all 5 grade levels.  
 
                                                 
1. The numbers in these calculations are slightly different than those in Table A-4 because of rounding. The 
calculations in Table A-4 were done using Excel and thus included many more decimal places. 
 
 
 
 
The first two panels of Table A-5 show data for the 2007-08 and 2008-09 school years, in 
which the available data were aggregated across the 6 grade levels (K to 5). In all cases the 
percentage of students scoring in either the average or above or high achieving ranges 
increased strongly and significantly during the school year. Separate data for 
first and second graders were available from 2009-10 through 2012-13 (four 
years; see the last panel of Table A-5, the first panels of Table A-6, and the entirety of Tables A-7 and 
A-9). On average, over four-fifths of the students (82%) fell into the average 
and above range (at the 40th percentile or higher) at the start of the school year. By the 
spring an additional 10 percentage points (92% in total) were in this range. The increase over time was statistically 
significant in all but three cases. The exceptions involved subjects and years in which over 
85% of the students were in the range at baseline, thus having little range for improvement 
(2nd grade reading in 2012-13, and 2nd grade math in 2011-12 and 2012-13). The 
associated effect sizes ranged from zero to .94, with an average of .21, slightly lower than 
the level traditionally termed educationally important. Similar results appeared with the 
analysis of first and second grade students scoring at the 80th percentile or higher. In all but 
two cases there were strong increases from fall to spring and the increases were statistically 
significant. The two exceptions involve second grade students in the 2012-13 school year, 
where over twice as many students as would be expected scored in this high range.   
 
Data on SAT scores were available for students in grades three, four and five for the 2009-
10 and 2010-11 school years. (See Tables A-5 and A-6). Again, comparisons involved 
changes in the percentage of students scoring at or above the 40th percentile and at or 
above the 80th percentile. In all cases the percentage of students scoring in these ranges 
increased during the school year. The average effect size across all comparisons was .29, 
and most of the changes were statistically significant. The exceptions usually involved cases 
in which the students began the school year with relatively high percentages of students 
already scoring at the designated level. 
 
OAKS Performance Increases Over Time 
The OAKS data were used to compare students’ performance over time for two cohorts: 
those who were in fourth grade in the spring of 2012 and in fifth grade in the spring of 2013 
(called Cohort 1), and students who were in third grade in the spring of 2012 and fourth 
grade in spring 2013 (Cohort 2). Data for the cohorts and comparison data for each cohort 
for students in the districts and state are in Table A-9. Effect sizes and t-ratios were 
calculated following the logic outlined in previous sections. Effect sizes that compare the 
percentage of Arthur students meeting benchmark to percentages within the district and 
state were calculated for each time point and cohort using equations 11 and 14 above. The 
effect size associated with the change over time was then calculated using equation 6.  The 
t-ratio associated with this change was calculated using equation 15. 
 
The patterns of results differed slightly for the two cohorts and subjects. However, for all 
comparisons, the percentage of Arthur students meeting benchmark increased or stayed the 
same from one grade to the next, while the percentage of students in the districts and state 
stayed the same or decreased. As a result, all of the associated effect sizes were positive 
and three of the associated effect sizes were statistically significant.  
 
Note, however, that the sample size for each cohort declined from one year to the next. 
There was no way to determine the extent to which selection bias (lower scoring students 
dropping out of the Academies from one year to the next) could account for the change over 
time. Also, because the sample size differed across the two testing periods an average of 
the sample size for the two periods was used in the calculation of the standard error. Given, 
however, the very positive and strong results for the SAT data, which involved the same 
students over time, the possibility of such differential attrition affecting the results seems 
quite small.  
 
 
 
 
 
 
 
 
  
 
 
 
 
 
Table A-1 

Percentage of Kindergarten Students At or Above 40th Percentile or 80th Percentile, Stanford 
Achievement Test, Fall and Spring, Nation and Arthur Schools, 2008-09 to 2012-13 

At or Above Average (40th Percentile or Higher) 
Subject   Year      Nation   Start of K   End of K   ES Change   t-ratio   Prob.   N
Reading   2008-09   60       24           87         1.29        15.00     ***     137
          2009-10   60       32           87         1.14        13.05     ***     133
          2010-11   60       32           87         1.12        12.90     ***     133
          2011-12   60       56           73         0.35        4.66      ***     181
          2012-13   60       53           86         0.67        8.39      ***     156
Math      2008-09   60       23           83         1.22        14.28     ***     137
          2009-10   60       30           85         1.13        12.92     ***     132
          2010-11   60       30           85         1.12        12.90     ***     133
          2011-12   60       62           86         0.49        6.57      ***     181
          2012-13   60       53           86         0.67        8.39      ***     156

High Achieving (80th Percentile or Higher) 
Subject   Year      Nation   Start of K   End of K   ES Change   t-ratio   Prob.   N
Reading   2008-09   20       7            50         1           10.24     ***     137
          2009-10   20       11           47         1           8.46      ***     133
          2010-11   20       11           47         0.90        10.34     ***     133
          2012-13   20       10           29         0.48        5.91      ***     156
Math      2008-09   20       3            35         0.65        7.62      ***     137
          2009-10   20       6            44         0.77        8.85      ***     132
          2010-11   20       6            44         0.95        10.91     ***     133
          2012-13   20       10           29         0.48        5.91      ***     156

*, p < .05; **, p < .01; ***, p < .001; all tests are one-tail, reflecting the research hypothesis that students in the Direct 
Instruction programs will have higher levels of achievement. 
  
 
 
 
 
 
Table A-2 

Percentage of Students At or Above 40th Percentile or 80th Percentile, Reading, 
Stanford Achievement Test, Spring, Grades 1-5, Nation and Arthur Schools, by Year and 
Grade 

At or Above Average (40th Percentile or Higher) 
Year      Grade       Arthur Academy   Nation   Eff. Size   t-ratio   Prob.   N
2007-08   K - Gr. 5   86               60       0.53        13.54     ***     652
2008-09   K - Gr. 5   84               60       0.49        13.99     ***     816
2009-10   Gr. 1       89               60       0.60        7.13      ***     142
          Gr. 2       84               60       0.49        5.87      ***     144
          Gr. 3       79               60       0.40        4.69      ***     141
          Gr. 4       89               60       0.58        6.88      ***     140
          Gr. 5       84               60       0.50        5.78      ***     135
2010-11   Gr. 1       90               60       0.61        7.27      ***     142
          Gr. 2       84               60       0.49        5.86      ***     144
          Gr. 3       79               60       0.39        4.59      ***     141
          Gr. 4       88               60       0.57        6.74      ***     140
          Gr. 5       85               60       0.51        5.91      ***     135
2011-12, All Students 
          Gr. 1       90               60       0.61        7.6       ***     155
          Gr. 2       83               60       0.47        5.94      ***     161
2011-12, Continuing Students 
          Gr. 1       96               60       0.73        8.54      ***     136
          Gr. 2       85               60       0.51        6.1       ***     144
2012-13   Gr. 1       97               60       0.76        9.43      ***     157
          Gr. 2       90               60       0.61        7.5       ***     151

High Achieving (80th Percentile or Higher) 
Year      Grade       Arthur Academy   Nation   Eff. Size   t-ratio   Prob.   N
2007-08   K - Gr. 5   51               20       0.78        19.77     ***     652
2008-09   K - Gr. 5   41               20       0.53        14.99     ***     816
2009-10   Gr. 1       41               20       0.52        6.19      ***     142
          Gr. 2       33               20       0.33        3.99      ***     144
          Gr. 3       35               20       0.39        4.57      ***     141
          Gr. 4       46               20       0.66        7.79      ***     140
          Gr. 5       45               20       0.63        7.29      ***     135
2010-11   Gr. 1       41               20       0.53        6.23      ***     142
          Gr. 2       33               20       0.33        3.89      ***     144
          Gr. 3       35               20       0.38        4.44      ***     141
          Gr. 4       46               20       0.65        7.66      ***     140
          Gr. 5       45               20       0.63        7.23      ***     135
2012-13   Gr. 1       47               20       0.68        8.43      ***     157
          Gr. 2       48               20       0.7         8.57      ***     151
 
  
 
 
 
Table A-3

Percentage of Students At or Above 40th Percentile or 80th Percentile, Mathematics,
Stanford Achievement Test, Spring, Grades 1-5, Nation and Arthur Schools, by Year and
Grade

At or Above Average (40th Percentile or Higher)
Year     Grade      Arthur Academy  Nation  Eff. Size  t-ratio  Prob.    N
2007-08  K - Gr. 5              88      60       0.57    14.78  ***    670
2008-09  K - Gr. 5              84      60       0.49    14.00  ***    818
2009-10  Gr. 1                  86      60       0.53     6.42  ***    145
         Gr. 2                  85      60       0.50     6.03  ***    144
         Gr. 3                  77      60       0.36     4.23  ***    142
         Gr. 4                  91      60       0.64     7.56  ***    140
         Gr. 5                  79      60       0.38     4.38  ***    135
2010-11  Gr. 1                  86      60       0.53      6.3  ***    142
         Gr. 2                  85      60       0.51      6.1  ***    144
         Gr. 3                  78      60       0.37     4.35  ***    141
         Gr. 4                  92      60       0.65     7.70  ***    140
         Gr. 5                  78      60       0.37     4.25  ***    135
2011-12  Gr. 1                  95      60       0.71      8.3  ***    136
         Gr. 2                  94      60       0.69     8.27  ***    143
2012-13  Gr. 1                 100      60       0.82     10.2  ***    157
         Gr. 2                  93      60       0.67     8.25  ***    151

High Achieving (80th Percentile or Higher)
Year     Grade      Arthur Academy  Nation  Eff. Size  t-ratio  Prob.    N
2007-08  K - Gr. 5              54      20       0.85    21.99  ***    670
2008-09  K - Gr. 5              42      20       0.55    15.72  ***    818
2009-10  Gr. 1                  66      20       1.16    13.86  ***    145
         Gr. 2                  40      20       0.49     5.85  ***    144
         Gr. 3                  39      20       0.47     5.56  ***    142
         Gr. 4                  51      20       0.77     9.05  ***    140
         Gr. 5                  39      20       0.48     5.57  ***    135
2010-11  Gr. 1                  66      20       1.15    13.66  ***    142
         Gr. 2                  40      20        0.5     5.98  ***    144
         Gr. 3                  39      20      0.475     5.62  ***    141
         Gr. 4                  51      20      0.775     9.14  ***    140
         Gr. 5                  39      20       0.48      5.5  ***    135
2012-13  Gr. 1                  70      20       1.25    15.61  ***    157
         Gr. 2                  44      20        0.6     7.35  ***    151
 
Table A-4

OAKS Scores, Arthur Academy, Districts and State, by Year, Grade, and Subject

Reading
Year  A. Acad. %  Districts %  State %  Ef Size-Dist  Ef Size-State  t-district  Prob.  t-state  Prob.
Grade 3
2010          95           74       83          0.81           0.71        9.79  ***       8.56  ***
2011          92           77       85          0.65           0.47        8.04  ***       5.75  ***
2012          80           60       72          0.50           0.29        6.40  ***       3.66  ***
2013          81           60       66          0.53           0.44        6.66  ***       5.60  ***
Grade 4
2010          93           77       85          0.70           0.53        8.23  ***       6.31  ***
2011          93           81       87          0.63           0.46        7.47  ***       5.46  ***
2012          88           64       76          0.67           0.50        8.27  ***       6.20  ***
2013          88           64       75          0.67           0.52        8.03  ***       6.26  ***
Grade 5
2010          89           71       77          0.62           0.52        6.18  ***       5.19  ***
2011          83           72       79          0.39           0.19        4.56  ***       2.21  *
2012          86           60       70          0.65           0.53        7.69  ***       6.31  ***
2013          80           60       69          0.50           0.35        5.63  ***       4.00  ***

Mathematics
Year  A. Acad. %  Districts %  State %  Ef Size-Dist  Ef Size-State  t-district  Prob.  t-state  Prob.
Grade 3
2010          87           72       79          0.54           0.38        6.50  ***       4.62  ***
2011          61           54       64          0.15          -0.08        1.88  *        -1.03
2012          64           53       65          0.23          -0.03        3.00  **       -0.37
2013          69           53       62          0.34           0.18        4.32  ***       2.34  **
Grade 4
2010          79           72       79          0.25           0.00        2.96  **        0.00
2011          65           60       66          0.13          -0.03        1.48           -0.35
2012          80           56       67          0.55           0.39        6.77  ***       4.89  ***
2013          68           56       65          0.27           0.09        3.28  **        1.03
Grade 5
2010          85           75       79          0.40           0.29        3.98  ***       2.84  **
2011          54           52       58          0.04          -0.10        0.48           -1.11
2012          74           52       60          0.46           0.35        5.42  ***       4.14  ***
2013          66           52       59          0.29           0.17        3.29  ***       1.92  *

Science (Grade 5 Only)
Year  A. Acad. %  Districts %  State %  Ef Size-Dist  Ef Size-State  t-district  Prob.  t-state  Prob.
2011          89           67       75          0.67           0.56        7.75  ***       6.51  ***
2012          84           57       70          0.63           0.47        7.43  ***       5.52  ***
2013          82           57       67          0.58           0.45        6.55  ***       5.12  ***
Note: Sample sizes for 2010 were 148 for grade 3, 141 for grade 4, and 100 for grade 5; for 2011 were 
153 for grade 3, 141 for grade 4, and 136 for grade 5; for 2012 were 165 for grade 3, 155 for grade 4, 
and 141 for grade 5; for 2013 were 162 for grade 3, 146 for grade 4, and 128 for grade 5. 
* p < .05; ** p < .01; *** p < .001. As with earlier tables, all probabilities are one-tailed, 
reflecting the research hypothesis that the DI group would have higher scores.
 
Table A-5

Changes in SAT Reading and Math Scores, Fall to Spring (2007-08, 2008-09, 2009-10), Arthur Academy
Charter Schools

Fall 2007 to Spring 2008                     Fall %  Spring %  Ef. Size  t-ratio  Prob.
Reading, Grades K-5 (n=652)
Average or Above (>=40th %ile)                   62        86      0.49    12.50  ***
High Achieving (>=80th %ile)                     20        51      0.78    19.77  ***
Mathematics, Grades K-5 (n=670)
Average or Above (>=40th %ile)                   59        88      0.73    18.75  ***
High Achieving (>=80th %ile)                     20        54      0.85    21.99  ***

Fall 2008 to Spring 2009                     Fall %  Spring %  Ef. Size  t-ratio  Prob.
Reading, Grades K-5 (n=816)
Average or Above (>=40th %ile)                   53        84      0.63    18.06  ***
High Achieving (>=80th %ile)                     20        41      0.53    14.99  ***
Mathematics, Grades K-5 (n=818)
Average or Above (>=40th %ile)                   51        84      0.83    23.58  ***
High Achieving (>=80th %ile)                     17        42      0.63    17.86  ***

Fall 2009 to Spring 2010                     Fall %  Spring %  Ef. Size  t-ratio  Prob.
Grade 1 (n=147 reading, 145 math)
Reading, Average or Above (>=40th %ile)          44        89      0.92    11.12  ***
Reading, High Achieving (>=80th %ile)            16        41      0.50     6.08  ***
Mathematics, Average or Above (>=40th %ile)      53        86      0.68     8.11  ***
Mathematics, High Achieving (>=80th %ile)        13        66      1.08    13.01  ***
Grade 2 (n=144)
Reading, Average or Above (>=40th %ile)          53        84      0.64     7.63  ***
Reading, High Achieving (>=80th %ile)            17        33      0.34     4.07  ***
Mathematics, Average or Above (>=40th %ile)      62        85      0.47     5.59  ***
Mathematics, High Achieving (>=80th %ile)        28        40      0.23     2.71  **
Grade 3 (n=141 reading, 142 math)
Reading, Average or Above (>=40th %ile)          68        79      0.23     2.74  **
Reading, High Achieving (>=80th %ile)            30        35      0.12     1.37
Mathematics, Average or Above (>=40th %ile)      64        77      0.27     3.24  ***
Mathematics, High Achieving (>=80th %ile)        18        39      0.43     5.12  ***
Grade 4 (n=140)
Reading, Average or Above (>=40th %ile)          72        89      0.34     3.95  ***
Reading, High Achieving (>=80th %ile)            26        46      0.42     4.99  ***
Mathematics, Average or Above (>=40th %ile)      69        91      0.45     5.33  ***
Mathematics, High Achieving (>=80th %ile)        25        51      0.52     6.19  ***
Grade 5 (n=135)
Reading, Average or Above (>=40th %ile)          76        84      0.18     2.10  *
Reading, High Achieving (>=80th %ile)            41        45      0.09     1.05
Mathematics, Average or Above (>=40th %ile)      73        79      0.12     1.40
Mathematics, High Achieving (>=80th %ile)        30        39      0.18     2.10  *
 
Table A-6

Changes in SAT Reading and Math Scores, 2010-11, Arthur Academy Charter Schools

Reading                         Pre-Test  Post-Test  ES Change  t-ratio  Prob.
First Grade (n=142)
Aver. or Above (>=40th %ile)          44         90       0.94    11.11  ***
Top 20 (>=80th %ile)                  16         41       0.63     7.40  ***
Second Grade (n=144)
Aver. or Above (>=40th %ile)          53         84       0.63     7.49  ***
Top 20 (>=80th %ile)                  17         33       0.40     4.78  ***
Third Grade (n=141)
Aver. or Above (>=40th %ile)          68         79       0.22     2.66  ***
Top 20 (>=80th %ile)                  30         35       0.13     1.48
Fourth Grade (n=140)
Aver. or Above (>=40th %ile)          72         88       0.33     3.86  ***
Top 20 (>=80th %ile)                  26         46       0.50     5.89  ***
Fifth Grade (n=135)
Aver. or Above (>=40th %ile)          75         85       0.20     2.42  **
Top 20 (>=80th %ile)                  41         45       0.10     1.16

Mathematics                     Pre-Test  Post-Test  ES Change  t-ratio  Prob.
First Grade
Aver. or Above (>=40th %ile)          53         86       0.67     7.97  ***
Top 20 (>=80th %ile)                  13         66       1.33    15.90  ***
Second Grade
Aver. or Above (>=40th %ile)          61         85       0.49     5.80  ***
Top 20 (>=80th %ile)                  28         40       0.30     3.59  ***
Third Grade
Aver. or Above (>=40th %ile)          65         78       0.27     3.14  ***
Top 20 (>=80th %ile)                  18         39       0.53     6.23  ***
Fourth Grade
Aver. or Above (>=40th %ile)          70         92       0.45     5.31  ***
Top 20 (>=80th %ile)                  25         51       0.65     7.66  ***
Fifth Grade
Aver. or Above (>=40th %ile)          72         78       0.12     1.45
Top 20 (>=80th %ile)                  30         39       0.23     2.60  **
 
Table A-7

Changes in SAT Reading and Math Scores, Fall 2011 to Spring 2012, Arthur Academy Charter
Schools

Reading                               Fall     Spring   Ef. Size  t-ratio  Prob.
First Grade (continuing students, n=136)
Average or Above (>=40th %ile)          87         96       0.18     2.13  *
First Grade (all students, n=155)
Average or Above (>=40th %ile)          82         90       0.16     2.03  *
Second Grade (continuing students, n=144)
Average or Above (>=40th %ile)          74         85       0.22     2.69  **
Second Grade (all students, n=161)
Average or Above (>=40th %ile)          70         83       0.27     3.36  ***

Mathematics                           Fall     Spring   Ef. Size  t-ratio  Prob.
First Grade (n=136)
Average or Above (>=40th %ile)          81         95       0.29     3.32  ***
Second Grade (n=143)
Average or Above (>=40th %ile)          87         94       0.14     1.70  *
 
Table A-8

Changes in SAT Reading and Math Scores, Fall 2012 to Spring 2013, by Grade, Arthur Academy
Charter Schools

Reading                             Fall     Spring  ES Change  t-ratio  Prob.
First Grade (n=157)
Aver. or Above (>=40th %ile)          74         97       0.47     5.86  ***
Top 20 (>=80th %ile)                  24         47       0.58     7.18  ***
Second Grade (n=151)
Aver. or Above (>=40th %ile)          89         90       0.02     0.25
Top 20 (>=80th %ile)                  44         48       0.10     1.22

Mathematics                         Fall     Spring  ES Change  t-ratio  Prob.
First Grade (n=157)
Aver. or Above (>=40th %ile)          84        100       0.33     4.08  ***
Top 20 (>=80th %ile)                  24         70       1.15    14.36  ***
Second Grade (n=151)
Aver. or Above (>=40th %ile)          92         92       0.00     0.00
Top 20 (>=80th %ile)                  56         44      -0.30    -3.67
 
Table A-9

Percentage of Students Meeting OAKS Benchmark, Arthur Students, Districts and State, 2012 and
2013, by Cohort and Subject

Reading
Cohort    Year  Arthur Acad.  District  State  Ef. Size-Dist  Ef. Size-State  t-ratio-District  Prob.  t-ratio-State  Prob.
Cohort 1  2012            80        64     76           0.06            0.19              0.65                  2.19  *
          2013            80        60     69
Cohort 2  2012            80        60     72           0.17            0.23              2.07  *               2.91  **
          2013            88        64     75

Math
Cohort    Year  Arthur Acad.  District  State  Ef. Size-Dist  Ef. Size-State  t-ratio-District  Prob.  t-ratio-State  Prob.
Cohort 1  2012            64        56     67           0.11            0.26              1.28                  3.05  **
          2013            66        52     59
Cohort 2  2012            64        53     65           0.04            0.11              0.48                  1.42
          2013            68        56     65

Note: For cohort 1 there were data for 146 students in 2012 and 128 students in 2013. For cohort 2 
there were data for 165 students in 2012 and 146 students in 2013. Effect sizes and t-ratios were 
corrected for possible ceiling effects as explained in the text of the appendix. 
* p < .05; ** p < .01; *** p < .001
 
References 
 
Arthur, C. (2009-2010). Mission and Instructional Model. Portland, Oregon: Arthur Academy 
Public Charter Schools. 
 
Coughlin, C. (2014). Outcomes of Engelmann’s Direct Instruction: Research syntheses. In 
J. Stockard (Ed.), The Science and Success of Engelmann’s Direct Instruction (pp. 25-54). 
Eugene, OR: NIFDI Press.