Saturday, November 10, 2012

CAPSQ: A Tool to Measure Classroom Assessment Practices

The CAPSQ as a Measure of Classroom Assessment Practices

The Classroom Assessment Practices Survey Questionnaire (CAPSQ; Gonzales, 2011) consisted of 60 items that were sorted according to thematic similarities using a simple Q-sort method (Stephenson, 1953). Item content was derived from interview data from teachers and the works of Birenbaum (1997), Bliem and Davinroy (1997), Brown (2002, 2004), Brown and Lake (2006), Cheng, Rogers and Hu (2004), Hill (2002), Mbelani (2008), Mertler (1998, 2003, 2009), Sanchez and Brisk (2004), and Zhang and Burry-Stock (2003). Emerging and recognized scholarship on assessment (Airasian, 1997; Angelo & Cross, 1993; Black & Wiliam, 1998; Earl & Katz, 2006; Stiggins, 1997, 2008) was also consulted.

Items used a 5-point Likert-type response scale describing the frequency (1 = never to 5 = always) of doing an assessment activity. Three experts on scale development and classroom assessment reviewed the items in terms of format as well as content clarity and relevance. Items were subsequently categorized according to the four purposes of assessment, based on a framework currently used by the Western and Northern Canadian Protocol and described by Earl (2003) and Earl and Katz (2004). These distinct but interrelated purposes are: 1) assessment of learning, 2) assessment as learning, 3) assessment for learning, and 4) assessment to inform.

Psychometric Evaluation of the CAPSQ

Initial analyses were conducted to determine whether the assumptions of univariate normality were met. Item skew (-.35 to -1.33) and kurtosis (.05 to 1.63) values were within the acceptable ranges of |3| and |10|, respectively (Kline, 2010). The inter-item correlation matrix indicated that the coefficients were generally small (e.g., r = .14 for items 8 and 19) to moderate (e.g., r = .34 for items 4 and 14), suggesting that the correlation matrix was appropriate for factor analysis. Item means (and standard deviations) ranged from 4.07 (.89) for item 7 to 4.46 (.75) for item 14.
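
The normality screening described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used in the study: it assumes population (moment-based) skewness and excess kurtosis, which is what most statistics packages report, and the function names and the response data are hypothetical.

```python
from statistics import fmean


def central_moment(values, order):
    """Return the population central moment of the given order."""
    mean = fmean(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Third standardized moment (0 for a symmetric distribution)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def within_limits(values, max_skew=3.0, max_kurtosis=10.0):
    """Check an item against the |3| and |10| cutoffs cited from Kline."""
    return (abs(skewness(values)) < max_skew
            and abs(excess_kurtosis(values)) < max_kurtosis)


# Hypothetical responses to one 5-point item (not CAPSQ data).
item_responses = [5, 5, 4, 4, 4, 5, 3, 5, 4, 2]
print(round(skewness(item_responses), 2),
      round(excess_kurtosis(item_responses), 2),
      within_limits(item_responses))
```

Responses that pile up at the "always" end of the scale give a negative skew, which matches the direction of the skew values reported for the CAPSQ items.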

Preliminary checks for the exploratory factor analysis (EFA), namely Bartlett’s test of sphericity, χ²(210, N = 364) = 5200.62, p < .001, and the Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy (.94), suggested that the data were satisfactory for factor analysis (Hutcheson & Sofroniou, 1999). The initial principal axis factoring (PAF) yielded a five-factor solution that accounted for approximately 72% of the total variance. However, promax rotation with Kaiser normalization indicated that only two items (16 and 24) loaded on factor 5, and their content could not be easily interpreted. Further, items 7, 10, and 18 had similar loadings (i.e., ≥ .30) on two factors. These five items were subsequently eliminated, and the EFA was conducted again with the remaining 18 items.
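
For readers unfamiliar with Bartlett’s test of sphericity, the statistic can be computed from a correlation matrix as χ² = -(n - 1 - (2p + 5)/6) ln|R|, with p(p - 1)/2 degrees of freedom, where p is the number of items and n the number of respondents. The sketch below uses only the Python standard library; the function names and the 3 x 3 matrix are hypothetical, and the p value is left out because the standard library has no chi-square distribution.

```python
import math


def log_determinant(matrix):
    """Log-determinant of a positive-definite matrix via Gaussian elimination."""
    a = [row[:] for row in matrix]  # work on a copy
    size = len(a)
    log_det = 0.0
    for col in range(size):
        pivot = a[col][col]
        if pivot <= 0:
            raise ValueError("matrix is not positive definite")
        log_det += math.log(pivot)
        for row in range(col + 1, size):
            factor = a[row][col] / pivot
            for k in range(col, size):
                a[row][k] -= factor * a[col][k]
    return log_det


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's test."""
    p = len(corr)
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * log_determinant(corr)
    df = p * (p - 1) // 2
    return chi_square, df


# Hypothetical correlation matrix for three items (not CAPSQ data).
example_corr = [
    [1.00, 0.34, 0.14],
    [0.34, 1.00, 0.25],
    [0.14, 0.25, 1.00],
]
chi_square, df = bartlett_sphericity(example_corr, n_obs=364)
print(f"chi-square = {chi_square:.2f}, df = {df}")
```

A large statistic relative to its degrees of freedom indicates that the items are correlated enough for factor analysis to be meaningful; an identity matrix would give a statistic of zero.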

Initial communalities of the final 18 items ranged from .47 to .72, with a mean coefficient of .61. Results of the PAF for this solution suggest that a four-factor solution would best describe the structure of the CAPSQ. Total variance explained by these four factors was approximately 62%: factor 1 accounted for the majority of the variance, 50.27%; factor 2 accounted for 8.02%; factor 3 accounted for 5.98%; and factor 4 accounted for 5.45%. Each of these four factors also contributed at least 3% of the sum of squared loadings. Items had pure loadings of at least .40 on only one factor.

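
The item-retention logic described above can be illustrated with a simple filter. The sketch assumes one reading of the rule: an item is kept when exactly one loading reaches .30 in absolute value and that loading is at least .40. The helper name `screen_items`, the item labels, and the loadings are all hypothetical and are not CAPSQ results.

```python
def screen_items(loadings, primary_cutoff=0.40, cross_cutoff=0.30):
    """Split items into retained and flagged lists based on their loadings.

    loadings maps an item label to its list of loadings, one per factor.
    """
    retained, flagged = [], []
    for item, row in loadings.items():
        salient = [abs(x) for x in row if abs(x) >= cross_cutoff]
        if len(salient) == 1 and salient[0] >= primary_cutoff:
            retained.append(item)
        else:
            flagged.append(item)  # cross-loading or weak primary loading
    return retained, flagged


# Hypothetical pattern-matrix rows for three items on four factors.
example_loadings = {
    "item_a": [0.71, 0.05, 0.12, 0.03],   # clean loading on factor 1
    "item_b": [0.45, 0.38, 0.02, 0.10],   # loads on two factors
    "item_c": [0.22, 0.35, 0.08, 0.11],   # no loading reaches .40
}
retained, flagged = screen_items(example_loadings)
print("retained:", retained)
print("flagged:", flagged)
```
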
Internal consistency of the factor scores and total score was calculated using Cronbach’s alpha (α). The four factors demonstrated high internal consistency, with α = .92 for factor 1, α = .88 for factor 2, α = .83 for factor 3, and α = .85 for factor 4. Internal consistency for the total score was also high, α = .95. Inter-factor correlations ranged from .57 (moderate) to .72 (high). Correlations between the CAPSQ factors and the total score were all very high (r = .82 to .92), indicating that the total score may be the most accurate and valid estimate of classroom assessment practices.

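
Cronbach’s alpha, used above, is computed as α = k/(k - 1) × (1 - Σ item variances / variance of total scores), where k is the number of items. The following is a minimal sketch using the Python standard library; the function name and the response data are hypothetical and only show the shape of the calculation.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item-score rows."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    items = list(zip(*responses))  # transpose: one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: five teachers answering four 5-point items.
example_responses = [
    [5, 4, 5, 4],
    [4, 4, 4, 3],
    [3, 3, 4, 3],
    [5, 5, 5, 5],
    [2, 3, 3, 2],
]
print(round(cronbach_alpha(example_responses), 2))
```

Alpha approaches 1 when respondents who score high on one item also score high on the others, which is why it is read as a measure of internal consistency.
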
The factor structure of the CAPSQ conformed to the general purposes of classroom assessment that were used as a framework in the conceptualization phase of scale development. All factor and total scores demonstrated high internal consistency. However, there was strong evidence of overlap between the factor scores and the total score, suggesting that the total score is the most valid index when using the CAPSQ to describe classroom assessment practices. Although this is psychometrically true, item and factor information may still be useful for identifying teachers’ strengths and weaknesses in performing their roles related to classroom assessment. For example, school administrators and teachers themselves can examine the pattern of responses at the factor and item levels for professional development purposes. The CAPSQ total score may be the information to use for research and for longitudinal growth modeling in developmental program evaluation. Descriptions of the four empirically derived factors of the CAPSQ are given below to facilitate understanding of classroom assessment practices.

Factor 1: Assessment as learning. This factor refers to teachers’ practice of giving assessments aimed at developing and supporting students’ knowledge of their own thought processes (i.e., metacognition). Assessment as learning is translated into practice when teachers assess students by providing them with opportunities to show what they have learned in class (Murray, 2006), by creating an environment that is conducive to completing assigned tasks, and by helping students develop clear criteria for good learning practices (Hill, 2002). This factor also implies that teachers assess students to guide them in obtaining personal feedback and monitoring their own learning process (Murray, 2006; Sanchez & Brisk, 2004). Assessment as learning requires more task-based activities than traditional paper-and-pencil tests. This teaching practice provides examples of good self-assessment practices that students can use to examine their own learning process (Kubiszyn & Borich, 2007; Mory, 1992).

Factor 2: Assessment of learning. This factor refers to teachers’ assessment practices for determining the current status of student achievement against learning outcomes and, in some cases, how that achievement compares with that of their peers (Earl, 2005; Gonzales, 1999; Harlen, 2007). The main focus of assessment of learning is how teachers make use of assessment results to guide instructional and educational decisions (Bond, 1995; Musial, Nieminen, Thomas, & Burke, 2009). Hence, this factor describes practices that are associated with summative assessment (Glickman, Gordon, & Ross-Gordon, 2009; Harlen, 2007; Struyf, Vandenberghe, & Lens, 2001). In summative assessment, teachers aim to improve instructional programs based on how students have learned, as reflected by various assessment measures given at the end of the instructional program (Borko et al., 1997; Harlen, 2008; Mbelani, 2008). Teachers conduct summative assessment to make final decisions about the achievement of students at the end of the lesson or subject (Stiggins, Arter, Chappuis, & Chappuis, 2004).

Factor 3: Assessment to inform. This factor refers to the communicative function of assessment, which is reporting results to, and using them for, various stakeholders (Jones & Tanner, 2008). Teachers perform assessment to provide information about students’ performance in class to students and their parents, as well as to other teachers, schools, and future employers (Gullickson, 1984; Sparks, 2005). Assessment to inform is related to assessment of learning, since the intention is to provide information to parents about the performance of their children in school at the end of an instructional program (Harlen, 2008). Teachers use assessment to rank students and use the results as a more precise basis for representing student achievement in class through grades and ratings (Manzano, 2000; Murray, 2006; Sparks, 2005).

Factor 4: Assessment for learning. This factor refers to teachers’ practice of conducting assessment to determine progress in learning by giving tests and using other tools to measure learning during instruction (Biggs, 1995; Dochy & McDowell, 1997; Murray, 2006; Sadler, 1989; Sparks, 2005). Assessment for learning, or formative assessment, requires the use of learning tests, practice tests, quizzes, unit tests, and the like (Boston, 2002; MacLellan, 2001; Stiggins et al., 2004). Teachers prefer these formative assessment tools for covering a predetermined segment of instruction that focuses on a limited sample of learning outcomes. Assessment for learning requires careful planning so that teachers can use the assessment information to determine what students know and gain insights into how, when, and whether students apply what they know (Earl & Katz, 2006).

Interested researchers and users may contact the author (Email:

You can also find the copy of this study at

Classroom Assessment Preferences of Japanese Language Teachers in the Philippines and English Language Teachers in Japan

Very recently, I completed a study with Dr. Jonathan Aliponga of Kansai University of International Studies, Hyogo, Japan, entitled Classroom Assessment Preferences of Japanese Language Teachers in the Philippines and English Language Teachers in Japan. The study was recently published in the MEXTESOL Journal, Volume 36, Number 1, 2012.

The following is the abstract.

Classroom Assessment Preferences of Japanese Language Teachers in the Philippines and English Language Teachers in Japan

Richard DLC. Gonzales
University of Santo Tomas Graduate School, Manila, Philippines

Jonathan Aliponga
Kansai University of International Studies, Hyogo, Japan

Student assessment provides teachers with information that is important for decision-making in the classroom. Assessment information helps teachers to understand their students’ performance better as well as improve suitability and effectiveness of classroom instruction. The purpose of the study was to compare the classroom assessment preferences of Japanese language teachers in the Philippines (n=61) and English language teachers in Japan (n=55) on the purposes of assessment as measured by the Classroom Assessment Preferences Survey Questionnaire for Language Teachers (CAPSQ-LT). Results revealed that overall, language teachers from both countries most preferred assessment practices that are focused towards assessment as learning and least preferred assessment practices that refer to the communicative function of assessment (assessing to inform). Comparatively, Japanese language teachers in the Philippines preferred assessment for learning, that is, they assessed to improve learning process and effectiveness of instruction, while the English language teachers in Japan are more concerned with the assessment of learning and the communicative and administrative function of assessment. The two groups did not significantly differ in their preference for assessment of learning and assessment as learning.

The complete copy of this study can be accessed at

Roles of Testing

In today's system, testing has become a critical policy in any environment. Schools administer various kinds of tests to students, industries and companies give tests to applicants, and organizations test their members for various reasons. Whatever its specific purpose, the main objective of testing is to differentiate individuals and classify them into specific roles and functions.