Assessing Student Work

Artifact Methodology

The basis of the assessment methodology at NMC is the outcome rubric with which student achievement is measured. By means of a faculty-wide conversation, a general education philosophy was fashioned and general education outcomes were identified. (For more on the philosophy and outcomes see the NMC policy or the NMC assessment website.)

Instrument

Once the outcomes were identified, faculty members with area-specific expertise developed a list of capabilities which taken together define the outcome. The faculty experts designed the rubrics as a measurement instrument with the capabilities broken down into differing levels of skill for use in scoring student work. In this way, the rubrics are analytical in nature. An analytic scoring rubric (as opposed to a holistic rubric) lists each criterion scorers rate as they read or observe students’ work. Analytic scoring rubrics provide detailed results about patterns of student achievement. The advantage of an analytic rubric is that it promotes targeted interpretation and decision-making. For example, “results that identify students’ weakness in abilities to solve certain kinds of problems or take multiple perspectives on an issue lead to discussions about how to modify or change the curriculum. The patterns become fertile evidence upon which to base focused interpretations about students’ strengths or weaknesses.” (Maki, 2004, p.123) The use of rubrics makes measuring performance more accurate, unbiased and consistent. Using rubrics increases the likelihood of inter-reader reliability and minimizes qualitative judgments.

Artifact -- Guidelines for Faculty

The student work, or artifacts, to be measured come from course embedded assignments. In this way, the artifact method is authentic and performance based and aligns with students’ learning experiences. As a result, the artifact method enjoys high face validity and content validity.
Additionally, consequential validity is assured by scoring the artifacts, for the purposes of general education assessment, outside the classroom. The validity of a method can be increased if students practice and receive feedback on the skills being measured, and are then held accountable for demonstrating this learning multiple times. Thus, the validity of the method is affected by whether instructors use the rubrics in their classes to grade and give feedback on student work. Instructors' use of the rubrics in their classes also produces formative assessment that helps them facilitate learning. See the Artifact Guidelines if you are a faculty member who has been selected to provide student work to be assessed.

Example Assignments

Check out these links for example assignments that generate scorable artifacts (when printing, make sure to print these .pdf documents with comments):
Process

Two scorers read each piece of student work and assign a score from zero (deficient) to three (proficient) for each of the five capabilities on the rubric (or six in the case of Cultural Perspectives). If the sum of the capability scores from one reader differs by three or more points from that of the second reader, a third reader is required to score the artifact. A difference of three or more points was chosen because three points (three of the fifteen points possible on a five-capability rubric) represents a 20% difference in opinion between readers, which we define as a significant difference. It is also a more rigorous decision point than the 25% difference that current literature tends to suggest. The final score for an artifact is the average of the two readers’ scores, unless a third reader is needed. If there is a third reader (thus, three sets of scores), the final score for the artifact is the average of all the scores that are within one standard deviation of the mean. By this means, statistically, inter-reader reliability is assured.

NMC takes steps to ensure inter-reader reliability qualitatively as well. Prior to the actual scoring of artifacts, scorers independently read a set of student work that reflects a range of assignments. When the scorers come together, they review their scores to reconcile inconsistent patterns. This process is repeated, as Maki (2004) suggests, throughout the formal scoring process, and every time the measuring instrument changes in response to process improvement. Reliability is built into the method when the assessors apply the same procedure the same way, producing the same measures and the same results. (For additional resources on designing valid and reliable methods see King, Keohane, & Verba, 1994; Mentkowski & Rogers, 1988; Mentkowski & Loacker, 1985; Mentkowski & Doherty, 1984, 1980.)

Sampling

As mentioned above, the artifacts come from prompts in course assignments.
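The scoring rule described in the Process section above can be sketched in a few lines of Python. This is a minimal illustration, not NMC's actual tooling: the function names are hypothetical, the one-standard-deviation rule is assumed to apply to each reader's total score, and the population standard deviation is assumed because the text does not say which form is used.

```python
from statistics import mean, pstdev

THIRD_READER_GAP = 3  # points; the threshold stated in the Process section


def reader_total(capability_scores):
    """Sum one reader's capability scores, each from 0 (deficient) to 3 (proficient)."""
    for score in capability_scores:
        if score not in (0, 1, 2, 3):
            raise ValueError(f"capability score out of range: {score}")
    return sum(capability_scores)


def needs_third_reader(first, second):
    """True when the two readers' totals differ by three or more points."""
    return abs(reader_total(first) - reader_total(second)) >= THIRD_READER_GAP


def final_score(first, second, third=None):
    """Average two readers, or apply the one-standard-deviation rule to three."""
    totals = [reader_total(first), reader_total(second)]
    if not needs_third_reader(first, second):
        return mean(totals)
    if third is None:
        raise ValueError("readers disagree by 3+ points; a third reader is required")
    totals.append(reader_total(third))
    center = mean(totals)
    # Assumption: population standard deviation of the three reader totals.
    spread = pstdev(totals)
    # Keep totals within one standard deviation of the mean (small float tolerance).
    kept = [t for t in totals if abs(t - center) <= spread + 1e-9]
    return mean(kept)
```

Under these assumptions, reader totals of 10 and 11 average to 10.5 with no third reader. Totals of 6 and 10 trigger a third reader; if that reader's total is 11, the total of 6 falls more than one standard deviation from the mean of 9 and is excluded, so the final score is again 10.5.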
A process of clustered random sampling selects courses from which instructors submit student work. Instructors have previously identified whether their course supports a general education outcome. The population from which the sample is drawn therefore includes all courses in any given semester that support a specific outcome. The sampling method is random in that the courses that provide artifacts are randomly selected. It is clustered in that courses are randomly selected first, and then the students whose work will be scored are randomly selected from each course list. Accordingly, the results of assessment are generalizable to students taking courses that support a specific general education outcome. Increasing the sample size of artifacts to be scored, and over-representing specific groups of interest, will help increase ecological validity, or the degree to which the results are generalizable to our degreed graduates and the learners about whom we intend to make inferential statements.

Analysis

Once the artifact scores are determined, the data are analyzed and presented so as to best provide evidence of accountability and continuous improvement. Construct validity is guaranteed by ruling out alternative explanations for the measured results. Artifact score serves as the dependent variable for which we are determining the sources of variance. The two key independent variables are the number of NMC credit hours a student has earned and the number of exposures a student has had to the general education outcome being measured. Number of exposures is determined by evaluating the courses a student has completed that instructors have identified as supporting a specific general education outcome. A student’s placement test scores serve as a control variable to rule out prior skill as an explanation of artifact score. The data are then analyzed by regression analysis.

Results

The results of the analysis are presented in a variety of modes.
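The regression described under Analysis can be illustrated with a small ordinary least squares fit. The text does not specify the model form, so this sketch assumes a linear model with an intercept, the two independent variables (credit hours and exposures), and the placement score as a control. The function name `fit_ols` and all data values are made up for illustration; in practice a statistics package would also report standard errors and significance.

```python
def fit_ols(rows, targets):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0, *row] for row in rows]  # prepend the intercept column
    k = len(design[0])
    # Augmented system [X^T X | X^T y].
    system = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(design, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(system[r][col]))
        if abs(system[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; coefficients are not identified")
        system[col], system[pivot] = system[pivot], system[col]
        for r in range(k):
            if r != col:
                factor = system[r][col] / system[col][col]
                system[r] = [a - factor * b for a, b in zip(system[r], system[col])]
    return [system[i][k] / system[i][i] for i in range(k)]


# Made-up records: (credit hours earned, exposures to the outcome, placement score).
predictors = [(12, 1, 60), (30, 2, 75), (45, 4, 55), (24, 3, 80), (60, 5, 70), (15, 0, 65)]
artifact_scores = [6.0, 8.5, 10.0, 9.5, 13.0, 5.5]  # made-up final artifact scores

intercept, per_credit_hour, per_exposure, per_placement_point = fit_ols(
    predictors, artifact_scores
)
```

Including the placement score in the same model is what lets the credit-hour and exposure coefficients be read as effects net of prior skill, which is the role the text assigns to the control variable.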
Descriptive statistics answer the accountability question and are simply the percentages of students scoring in the Proficient, Sufficient, Developing, and Deficient ranges. These statistics along with the regression analysis are presented in a report from the Office of Institutional Research.
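The descriptive statistics amount to a frequency count over final artifact scores. The sketch below shows one way to compute them. The cut points are hypothetical: the four ranges are named here, but their boundaries are not given, so the example assumes bands on the per-capability average (final score divided by the number of capabilities).

```python
from collections import Counter

# Hypothetical cut points on the per-capability average (0 to 3).
# The four range names come from the text; the boundaries are assumptions.
BANDS = [(2.5, "Proficient"), (1.5, "Sufficient"), (0.5, "Developing"), (0.0, "Deficient")]


def band_for(final_score, capability_count=5):
    """Map a final artifact score to one of the four named ranges."""
    average = final_score / capability_count
    if not 0 <= average <= 3:
        raise ValueError(f"score {final_score} is outside the rubric range")
    for floor, label in BANDS:
        if average >= floor:
            return label


def band_percentages(final_scores, capability_count=5):
    """Percentage of artifacts falling in each range."""
    if not final_scores:
        raise ValueError("no artifact scores supplied")
    counts = Counter(band_for(s, capability_count) for s in final_scores)
    total = len(final_scores)
    return {label: 100 * counts[label] / total for _, label in BANDS}
```

For a Cultural Perspectives rubric, `capability_count=6` would be passed so that the average stays on the same 0 to 3 scale.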