ERIC Number: EJ1067852
Record Type: Journal
Publication Date: 2015-Aug
Abstractor: As Provided
Reference Count: 27
Validity: Applying Current Concepts and Standards to Gynecologic Surgery Performance Assessments
LeClaire, Edgar L.; Nihira, Mikio A.; Hardré, Patricia L.
Advances in Health Sciences Education, v20 n3 p817-828 Aug 2015
Validity is critical for meaningful assessment of surgical competency. According to the Standards for Educational and Psychological Testing, validation involves the integration of data from well-defined classifications of evidence. In the authoritative framework, data from all classifications support construct validity claims. The two aims of this study were to develop a categorization method for validity evidence published in support of surgery performance assessments and to summarize the results of applying this methodology to the gynecologic surgery literature. This was a critical analysis of published observations reported as validity evidence in studies with a construct validity claim. Medline and Embase databases were searched using the keywords "surgery" and "construct validity". Parameters included English-language articles published from 2000 to 2012. Gynecologic studies were analyzed for definitions of construct validity and nonstandard terminology. Categorization criteria were developed and applied by the researchers to all observations. Two independent evaluators examined reported observations for compliance with guidelines provided by the Standards. Inter-rater agreement was calculated using weighted kappa. The initial search returned 167 articles. Twenty-five articles remained for inclusion in our analysis. Eighteen (72%) articles defined construct validity as the ability to discriminate between expert and novice levels of proficiency. Within the sample, 80 discrete observations of reported validity evidence were identified and categorized according to standard classifications. Nearly 30% of all published observations were intended to demonstrate differences in performance by level of proficiency, 25% described a scoring model, and 14% demonstrated support of assessment content. Not one article contained a statistical correlation between assessment scores and objective outcomes from the authentic surgical environment. Medians for level of rigor ranged from 0 to 1 across all forms of evidence. Weighted kappa values ranged from 0.60 to 0.91. Validity claims in gynecologic surgical assessment over-rely on generalizability evidence. No test-criterion evidence was observed. Increased awareness of current standards and systematic argument development is needed for gynecologic performance assessments.
Descriptors: Surgery, Gynecology, Validity, Standards, Performance Tests, Classification, Construct Validity, Evidence, Observation, Literature Reviews
Springer. 233 Spring Street, New York, NY 10013. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-348-4505; e-mail: email@example.com; Web site: http://www.springerlink.com
Publication Type: Journal Articles; Information Analyses; Reports - Research
Education Level: N/A
Authoring Institution: N/A