Peer reviewed
ERIC Number: EJ1067852
Record Type: Journal
Publication Date: 2015-Aug
Pages: 12
Abstractor: As Provided
ISSN: 1382-4996
Validity: Applying Current Concepts and Standards to Gynecologic Surgery Performance Assessments
LeClaire, Edgar L.; Nihira, Mikio A.; Hardré, Patricia L.
Advances in Health Sciences Education, v20 n3 p817-828 Aug 2015
Validity is critical for meaningful assessment of surgical competency. According to the Standards for Educational and Psychological Testing, validation involves the integration of data from well-defined classifications of evidence. In the authoritative framework, data from all classifications support construct validity claims. The two aims of this study were to develop a categorization method for validity evidence published in support of surgery performance assessments and to summarize the results of applying this methodology to the gynecologic surgery literature. This was a critical analysis of published observations reported as validity evidence in studies with a construct validity claim. Medline and Embase databases were searched using the keywords "surgery" and "construct validity". Parameters included English-language articles published from 2000 to 2012. Gynecologic studies were analyzed for definitions of construct validity and nonstandard terminology. Categorization criteria were developed and applied by the researchers to all observations. Two independent evaluators examined reported observations for compliance with guidelines provided by the Standards. Inter-rater agreement was calculated using weighted kappa. The initial search returned 167 articles. Twenty-five articles were left for inclusion in our analysis. Eighteen (72%) articles defined construct validity as the ability to discriminate between expert and novice levels of proficiency. Within the sample, 80 discrete observations of reported validity evidence were identified and categorized according to standard classifications. Nearly 30% of all published observations were intended to demonstrate differences in performance by level of proficiency, 25% described a scoring model, and 14% demonstrated support of assessment content. Not one article contained a statistical correlation between assessment scores and objective outcomes from the authentic surgical environment. Medians for level of rigor ranged from 0 to 1 across all forms of evidence. Weighted kappa values ranged from 0.60 to 0.91. Validity claims in gynecologic surgical assessment over-rely on generalizability evidence. No test-criterion evidence was observed. Increased awareness of current standards and systematic argument development is needed for gynecologic performance assessments.
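The abstract reports inter-rater agreement as weighted kappa without giving the computation. The sketch below shows one common form, Cohen's weighted kappa for two raters on an ordered scale. It is an illustration only, not the authors' procedure: the function name `weighted_kappa`, the choice between linear and quadratic weights, the 0-2 rigor scale, and the example ratings are all assumptions made here and do not come from the article.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories, weighting="linear"):
    """Cohen's weighted kappa for two raters over ordered categories.

    Hypothetical helper for illustration; the weighting scheme used in
    the article is not stated in the abstract.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two ordered categories")
    if weighting not in ("linear", "quadratic"):
        raise ValueError("weighting must be 'linear' or 'quadratic'")

    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)
    power = 1 if weighting == "linear" else 2

    def disagreement(i, j):
        # 0 for identical ratings, 1 for ratings at opposite ends of the scale.
        return (abs(i - j) / (k - 1)) ** power

    # Joint counts of (rater A category, rater B category) and the marginals.
    joint = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    # Weighted disagreement actually observed.
    observed = sum(disagreement(i, j) * c / n for (i, j), c in joint.items())
    # Weighted disagreement expected if the raters were independent.
    expected = sum(
        disagreement(i, j) * marg_a[i] * marg_b[j] / n**2
        for i in range(k)
        for j in range(k)
    )
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1 - observed / expected


if __name__ == "__main__":
    # Made-up rigor ratings on an assumed 0-2 scale, for demonstration only.
    evaluator_1 = [0, 1, 1, 0, 2, 1, 0, 0]
    evaluator_2 = [0, 1, 0, 0, 2, 1, 1, 0]
    print(weighted_kappa(evaluator_1, evaluator_2, categories=[0, 1, 2]))
```

Weighting penalizes a disagreement between adjacent rigor levels less than one between the extremes of the scale, which is why it suits ordinal ratings. A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance.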
Springer. 233 Spring Street, New York, NY 10013. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-348-4505
Publication Type: Journal Articles; Information Analyses; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A