ERIC Number: EJ966186
Record Type: Journal
Publication Date: 2004-Dec
Abstractor: As Provided
Reference Count: 19
Sentence-Based Natural Language Plagiarism Detection
White, Daniel R.; Joy, Mike S.
Journal on Educational Resources in Computing, v4 n4 Article 2 Dec 2004
With increasing levels of access to higher education in the United Kingdom, larger class sizes make it unrealistic to expect tutors to identify instances of peer-to-peer plagiarism by eye, and so automated solutions to the problem are required. This document details a novel algorithm for comparing suspect documents at the sentence level. The algorithm has been implemented as a component of plagiarism detection software for detecting similarities in both natural language documents and comments within program source code. It is capable of detecting sophisticated obfuscation (such as paraphrasing, reordering, merging, and splitting sentences) as well as direct copying. The implemented algorithm has also been used to successfully detect plagiarism in real assignments at the university. The software has been evaluated by comparison with other plagiarism detection tools.
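Illustrative sketch (not the authors' algorithm, which this record does not specify): one simple way to compare two documents at the sentence level is to split each into sentences, reduce every sentence to its set of words, and report sentence pairs whose word overlap meets a threshold. The function names, the Jaccard overlap measure, the naive sentence splitter, and the 0.5 threshold are all assumptions made for illustration.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def word_set(sentence):
    """Lowercased set of the alphabetic words in a sentence."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def sentence_similarity(a, b):
    """Jaccard overlap of the two sentences' word sets, from 0.0 to 1.0."""
    wa, wb = word_set(a), word_set(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def matching_pairs(doc_a, doc_b, threshold=0.5):
    """Return (i, j, score) for sentence pairs whose overlap meets the threshold.

    i and j are sentence indices in doc_a and doc_b. The threshold is a
    hypothetical default, not a value taken from the article.
    """
    sents_a = split_sentences(doc_a)
    sents_b = split_sentences(doc_b)
    pairs = []
    for i, sa in enumerate(sents_a):
        for j, sb in enumerate(sents_b):
            score = sentence_similarity(sa, sb)
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs


if __name__ == "__main__":
    original = "The cat sat on the mat. Dogs bark loudly."
    suspect = "Loudly bark the dogs. On the mat the cat sat."
    for i, j, score in matching_pairs(original, suspect):
        print(f"sentence {i} ~ sentence {j}: {score:.2f}")
```

Because each sentence is compared as an unordered set of words, this sketch is insensitive to word order within a sentence and to sentence order across the document. It does not handle synonym-based paraphrasing or merged and split sentences, which the abstract lists among the obfuscations the published algorithm detects.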
Descriptors: Plagiarism, Computer Software, Foreign Countries, Natural Language Processing, Computer Uses in Education, College Faculty, College Students, Comparative Analysis, Writing Assignments
Association for Computing Machinery. 2 Penn Plaza Suite 701, New York, NY 10121. Tel: 800-342-6626; Tel: 212-626-0500; Fax: 212-944-1318; Web site: http://www.acm.org
Publication Type: Journal Articles; Reports - Descriptive
Education Level: Higher Education
Authoring Institution: N/A
Identifiers - Location: United Kingdom