The Accuracy of Automatic Qualitative Analyses of Constructed-Response Solutions to Algebra Word Problems. GRE Board Professional Report No. 91-03P. Bennett, Randy Elliot, Sebrechts, Marc M [microform]
Detailed Information
- Material type
- Microfiche
- Language code
- Text language - English
- Report number
- ETS-RR-94-04
- Call number
- Title/Author
- The Accuracy of Automatic Qualitative Analyses of Constructed-Response Solutions to Algebra Word Problems. GRE Board Professional Report No. 91-03P. : Bennett, Randy Elliot, Sebrechts, Marc M - [microform]
- Publication details
- Physical description
- 111; 2
- Series title
- ERIC Reports
- Notes
- 111p.
- Abstract/Annotation
- Summary: This study evaluated expert system diagnoses of examinees' solutions to complex constructed-response algebra word problems. Problems were presented to three samples (30 college students each), each of which had taken the Graduate Record Examinations General Test. One sample took the problems in paper-and-pencil form and the other two on computer. Responses were then diagnostically analyzed by an expert system, GIDE, and by four Educational Testing Service mathematics test developers. Results were highly consistent across the samples. Human judges generally agreed in describing responses as right or wrong, but concurred at lower levels in categorizing the specific bugs they detected in incorrect solutions. The expert system agreed highly with the judges' right/wrong decisions, but less closely with bug categorizations that the judges agreed on. Causes of machine-rater disagreement were identified, and suggested remedies were proposed. These results suggest that highly accurate diagnostic analysis through knowledge-based understanding of complex responses may be difficult to achieve at the fine-grained level used by GIDE. Approaches to increasing accuracy are discussed. Appendixes A, B, and C present probabilities and canonical solutions for each of the samples; and Appendixes D, E, and F contain Sample 2 judges' instructions, and the Sample 2 and Sample 3 Bug Classification Scheme and Detailed Error Descriptions with Examples. Twenty-one tables present study data. (Contains 13 references.) (Author/SLD)
- Reproduction note
- Microfiche. Springfield, VA : ERIC Document Reproduction Service. microfiches ; 11×15 cm.
- Funding information
- Graduate Record Examinations Board, Princeton, N.J.
- General subject heading
- Keywords
- Other author
- Other author
MARC
■001PCUL00371052
■002ED385550
■00520020803010316
■007heuumu---buua
■008980930s1994 us b 000 0 eng d
■040 ▼apcul
■0410 ▼aEnglish
■088 ▼aETS-RR-94-04
■090 ▼a370.78▼bE68
■24504▼aThe Accuracy of Automatic Qualitative Analyses of Constructed-Response Solutions to Algebra Word Problems. GRE Board Professional Report No. 91-03P.▼cBennett, Randy Elliot, Sebrechts, Marc M▼h[microform]
■260 ▼aU.S.; New Jersey▼bEducational Testing Service, Princeton, N.J.▼cMar 94
■300 ▼a111; 2
■440 0▼aERIC Reports
■500 ▼a111p.
■520 ▼aThis study evaluated expert system diagnoses of examinees' solutions to complex constructed-response algebra word problems. Problems were presented to three samples (30 college students each), each of which had taken the Graduate Record Examinations General Test. One sample took the problems in paper-and-pencil form and the other two on computer. Responses were then diagnostically analyzed by an expert system, GIDE, and by four Educational Testing Service mathematics test developers. Results were highly consistent across the samples. Human judges generally agreed in describing responses as right or wrong, but concurred at lower levels in categorizing the specific bugs they detected in incorrect solutions. The expert system agreed highly with the judges' right/wrong decisions, but less closely with bug categorizations that the judges agreed on. Causes of machine-rater disagreement were identified, and suggested remedies were proposed. These results suggest that highly accurate diagnostic analysis through knowledge-based understanding of complex responses may be difficult to achieve at the fine-grained level used by GIDE. Approaches to increasing accuracy are discussed. Appendixes A, B, and C present probabilities and canonical solutions for each of the samples; and Appendixes D, E, and F contain Sample 2 judges' instructions, and the Sample 2 and Sample 3 Bug Classification Scheme and Detailed Error Descriptions with Examples. Twenty-one tables present study data. (Contains 13 references.) (Author/SLD)
■533 ▼aMicrofiche.▼bSpringfield, VA▼cERIC Document Reproduction Service.▼emicrofiches ; 11×15 cm.
■536 ▼aGraduate Record Examinations Board, Princeton, N.J.
■650 4▼xEducation
■653 ▼aAlgebra▼aAutomation▼aClassification▼aCollege Entrance Examinations▼aCollege Students▼aComputer Assisted Testing▼aConstructed Response▼aEducational Diagnosis▼aExpert Systems▼aHigher Education▼aQualitative Research▼aScoring▼aTest Construction▼aWord Problems (Mathematics)▼aAccuracy▼aGIDE Computer Program▼aGraduate Record Examinations
■7001 ▼aBennett, Randy Elliot
■7001 ▼aSebrechts, Marc M.
■999 ▼a143
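The MARC lines above use this catalog's display convention: `■` precedes a three-digit field tag (plus indicators), and `▼` precedes a one-character subfield code. A minimal sketch of turning such lines back into structured data, assuming exactly that convention (standard MARC transmission format uses different delimiters, and the `parse_marc_line` helper is hypothetical, not part of any library):

```python
def parse_marc_line(line: str):
    """Parse one display-format MARC line into (tag, data).

    Assumes '■' introduces the field tag and '▼' each subfield code,
    as in the record above. Control fields (tags 00X) have no
    subfields, so their raw value is returned as a string instead.
    """
    body = line.lstrip("■")
    head, sep, rest = body.partition("▼")
    tag = head.strip()[:3]
    if not sep:
        # Control field: the value follows the tag directly.
        return tag, head.strip()[3:]
    # Data field: split the remainder on '▼'; each chunk starts with
    # its one-character subfield code.
    subfields = [(chunk[0], chunk[1:].strip()) for chunk in rest.split("▼") if chunk]
    return tag, subfields

# Examples taken from the record above:
print(parse_marc_line("■088 ▼aETS-RR-94-04"))   # → ('088', [('a', 'ETS-RR-94-04')])
print(parse_marc_line("■090 ▼a370.78▼bE68"))    # → ('090', [('a', '370.78'), ('b', 'E68')])
```

This keeps indicators out of scope for brevity; a fuller parser would also separate the two indicator positions that follow the tag in data fields such as `24504`.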


