Saturday, November 10, 2012

CAPSQ: A Tool to Measure Classroom Assessment Practices

The CAPSQ as a Measure of Classroom Assessment Practices

The Classroom Assessment Practices Survey Questionnaire (CAPSQ; Gonzales, 2011) consisted of 60 items that were sorted according to thematic similarities using a simple Q-sort method (Stephenson, 1953). Item content was derived from interview data from teachers and the works of Birenbaum (1997), Bliem and Davinroy (1997), Brown (2002, 2004), Brown and Lake (2006), Cheng, Rogers and Hu (2004), Hill (2002), Mbelani (2008), Mertler (1998, 2003, 2009), Sanchez and Brisk (2004) and Zhang and Burry-Stock (2003). Emerging and recognized scholarship on assessment (Earl and Katz, 2006; Stiggins, 1997, 2008; Angelo & Cross, 1993; Airasian, 1997; and Black & Wiliam, 1998) was also consulted.

Items used a 5-point Likert-type response scale describing the frequency (1 = never to 5 = always) of doing an assessment activity. Three experts on scale development and classroom assessment reviewed the items in terms of format as well as content clarity and relevance. Items were subsequently categorized according to the four purposes of assessment based on a framework currently used by the Western and Northern Canadian Protocol and described by Earl (2003) and Earl and Katz (2004). These distinct but interrelated purposes are: 1) assessment of learning, 2) assessment as learning, 3) assessment for learning, and 4) assessment to inform.

Psychometric Evaluation of the CAPSQ

Initial analyses were conducted to determine whether assumptions for univariate normality were met. Item skew (-.35 to -1.33) and kurtosis (.05 to 1.63) values were within the acceptable ranges of |3| and |10|, respectively (Kline, 2010). The inter-item correlation matrix indicated that the coefficients were generally small (e.g., r = .14 for items 8 and 19) to moderate (e.g., r = .34 for items 4 and 14), suggesting that the correlation matrix was appropriate for factor analysis. Item means (and standard deviations) ranged from 4.07 (.89) for item 7 to 4.46 (.75) for item 14.
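The screening step described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: it assumes population moment estimators and excess kurtosis (statistical packages differ on both), and the function names and the sample responses are hypothetical.

```python
from statistics import fmean

def skew_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def within_limits(values, max_skew=3.0, max_kurtosis=10.0):
    """Check an item against the |3| (skew) and |10| (kurtosis) cutoffs."""
    skew, kurtosis = skew_and_excess_kurtosis(values)
    return abs(skew) < max_skew and abs(kurtosis) < max_kurtosis

# Hypothetical responses to one item on the 1-5 frequency scale (not CAPSQ data).
item_responses = [5, 4, 5, 3, 4, 5, 5, 2, 4, 5, 4, 5]
skew, kurtosis = skew_and_excess_kurtosis(item_responses)
item_ok = within_limits(item_responses)
```

Responses that cluster near "always" give a negative skew, which matches the direction of the skew values reported for the CAPSQ items.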

Initial diagnostics for the exploratory factor analysis (EFA), including Bartlett’s test of sphericity, χ²(210, N = 364) = 5200.62, p < .001, and the Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy (.94), suggested that the data were satisfactory for factor analysis (Hutcheson & Sofroniou, 1999). Results of the initial principal axis factoring (PAF) yielded a five-factor solution that accounted for approximately 72% of the total variance. However, promax rotation with Kaiser normalization indicated that only two items (16 and 24) loaded on factor 5, and their content could not be easily interpreted. Further, items 7, 10, and 18 had similar loadings (i.e., ≥ .30) on two factors. These five items were subsequently eliminated, and the EFA was conducted again with the remaining 18 items.
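The cross-loading rule mentioned above (loadings of .30 or higher on two factors) can be expressed as a simple filter over a rotated loading matrix. The sketch below is an assumption about how such a screen might be coded; the function name, the item labels, and the loadings are hypothetical and are not taken from the CAPSQ analysis.

```python
def flag_cross_loading_items(loadings, threshold=0.30):
    """Return labels of items whose absolute loading meets the threshold on 2+ factors."""
    flagged = []
    for item, row in loadings.items():
        salient = [value for value in row if abs(value) >= threshold]
        if len(salient) >= 2:
            flagged.append(item)
    return flagged

# Hypothetical rotated loadings (item -> loading on each factor); not CAPSQ data.
example_loadings = {
    "item_a": [0.62, 0.10, 0.05],   # loads cleanly on the first factor
    "item_b": [0.35, 0.41, 0.02],   # salient on two factors
    "item_c": [0.08, -0.33, 0.55],  # salient on two factors (sign is ignored)
}
flagged = flag_cross_loading_items(example_loadings)
```

Items returned by the filter are candidates for removal before the factor analysis is rerun, as was done with items 7, 10, and 18.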

Initial communalities of the final 18 items ranged from .47 to .72, with a mean coefficient of .61. Results of the PAF for this solution suggest that a four-factor solution would best describe the structure of the CAPSQ. Total variance explained by these four factors was approximately 62%: factor 1 accounted for the majority of the variance, 50.27%; factor 2 accounted for 8.02%; factor 3 accounted for 5.98%; and factor 4 accounted for 5.45%. Each of these four factors also contributed at least 3% of the sum of squared loadings. Each item loaded at least .40 on only one factor.

Internal consistency of the factor scores and total score was calculated using Cronbach’s alpha (α). The four factors demonstrated high internal consistency, with α = .92 for factor 1, α = .88 for factor 2, α = .83 for factor 3, and α = .85 for factor 4. Internal consistency for the total score was also high, α = .95. Inter-factor correlations ranged from .57 (moderate) to .72 (high). Correlations between CAPSQ factors and the total score were all very high (r = .82 to .92), suggesting that the total score may be the most accurate and valid estimate of classroom assessment practices.

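For readers unfamiliar with Cronbach’s alpha, the following minimal sketch shows the standard formula, α = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name and the response data are hypothetical; this is not the code or data behind the coefficients reported above.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Compute Cronbach's alpha.

    rows: list of respondents, each a list of item scores of equal length.
    """
    k = len(rows[0])  # number of items
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses of five teachers to three items (1-5 scale); not CAPSQ data.
responses = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(responses)
```

Alpha rises when items vary together across respondents, which is why it is read as evidence that the items in a factor measure a common construct.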
The factor structure of the CAPSQ conformed to the general purposes of classroom assessment that were used as a framework in the conceptualization phase of the scale development. All factor and total scores demonstrated high internal consistency. However, there was strong evidence of factor-total score overlap, suggesting that the total score is the most valid index when using the CAPSQ to describe classroom assessment practices. Although this is psychometrically true, item and factor information may be beneficial when determining teachers’ strengths and weaknesses in performing their roles related to classroom assessment. For example, school administrators and teachers themselves can examine the pattern of responses at the factor and item levels for professional development purposes. The CAPSQ total score may be the information to use for research and longitudinal growth modeling in developmental program evaluation. Descriptions of the four empirically derived factors of the CAPSQ are important to facilitate understanding of classroom assessment practices.

Factor 1: Assessment as learning. This factor refers to the practices of teachers in giving assessment that is aimed at developing and supporting students’ knowledge of their own thought processes (i.e., metacognition). Assessment as learning is translated into practice when teachers assess students by providing them with opportunities to show what they have learned in class (Murray, 2006), by creating an environment that is conducive to completing assigned tasks, and by helping students to develop clear criteria of good learning practices (Hill, 2002). This factor also implies that teachers assess students to guide them in acquiring personal feedback and monitoring their own learning process (Murray, 2006; Sanchez & Brisk, 2004). Assessment as learning requires more task-based activities than traditional paper-and-pencil tests. This teaching practice provides examples of good self-assessment practices for students to examine their own learning process (Kubiszyn & Borich, 2007; Mory, 1992).

Factor 2: Assessment of learning. This factor refers to assessment practices of teachers to determine the current status of student achievement against learning outcomes and, in some cases, how their achievement compares with that of their peers (Earl, 2005; Gonzales, 1999; Harlen, 2007). The main focus of assessment of learning is how teachers make use of assessment results to guide instructional and educational decisions (Bond, 1995; Musial, Nieminem, Thomas & Burle, 2009). Hence, this factor describes practices that are associated with summative assessment (Glickman, Gordon, & Ross-Gordon, 2009; Harlen, 2007; Struyf, Vandenberghe, & Lens, 2001). In summative assessment, teachers aim to improve instructional programs based on how students have learned, as reflected by various assessment measures given at the end of the instructional program (Borko et al., 1997; Harlen, 2008; Mbelani, 2008). Teachers conduct summative assessment to make final decisions about the achievement of students at the end of the lesson or subject (Stiggins, Arter, Chappuis & Chappuis, 2004).

Factor 3: Assessment to inform. This factor refers to the communicative function of assessment, which is reporting and utilizing results for various stakeholders (Jones and Tanner, 2008). Teachers perform assessment to provide information to students and their parents, other teachers, schools, and future employers regarding students’ performance in class (Guillickson, 1984; Sparks, 2005). Assessment to inform is related to assessment of learning, since the intention is to provide information to parents about the performance of their children in school at the end of an instructional program (Harlen, 2008). Teachers use assessment to rank students and use the results as a more precise basis for representing student achievement in class through grades and ratings (Manzano, 2000; Murray, 2006; Sparks, 2005).

Factor 4: Assessment for learning. This factor refers to practices of teachers to conduct assessment to determine progress in learning by giving tests and other tools to measure learning during instruction (Biggs, 1995; Docky & McDowell, 1997; Murray, 2006; Sadler, 1989; Sparks, 2005). Assessment for learning, or formative assessment, requires the use of learning tests, practice tests, quizzes, unit tests, and the like (Boston, 2002; MacLellan, 2001; Stiggins et al., 2004). Teachers prefer these formative assessment tools to cover some predetermined segment of instruction that focuses on a limited sample of learning outcomes. Assessment for learning requires careful planning so that teachers can use the assessment information to determine what students know and gain insights into how, when, and whether students apply what they know (Earl and Katz, 2006).

Interested researchers and users may contact the author (Email:

You can also find the copy of this study at
