ERIC Number: ED513187
Record Type: Non-Journal
Publication Date: 2009
Pages: 207
Abstractor: As Provided
Reference Count: 0
ISBN: 978-1-1092-7770-8
Qualitative Information in Annual Reports & the Detection of Corporate Fraud: A Natural Language Processing Perspective
Goel, Sunita
ProQuest LLC, Ph.D. Dissertation, State University of New York at Albany
High-profile cases of fraudulent financial reporting, such as those that occurred at Enron and WorldCom, have shaken public confidence in the U.S. financial reporting process and have raised serious concerns about the roles of auditors, regulators, and analysts in financial reporting. To address these concerns and restore public confidence, the Sarbanes-Oxley Act (SOX) of 2002 was enacted. However, SOX has not lived up to its promise: numerous cases of fraudulent financial reporting have surfaced in the post-SOX era. So far, the major thrust of research has been on examining fraud that has already been discovered. This dissertation develops a methodology for proactively detecting fraud by examining the qualitative content of annual reports using natural language processing tools. The methodology is built on Support Vector Machines, a supervised machine learning technique. In this research, we examine both the verbal content and the presentation style of the qualitative portion of annual reports and seek to identify linguistic features that distinguish fraudulent annual reports from non-fraudulent ones. To detect fraud, it is important to investigate qualitative content because the textual content of annual reports contains richer information than financial ratios, which can be easily camouflaged. This study also creates a classification metric for early prediction of fraud by examining changes in the qualitative content of annual reports across the pre-fraud, fraud, and post-fraud periods of fraud companies. What distinguishes this methodology from earlier research on fraud detection is its use of qualitative textual content in annual reports, as opposed to quantitative financial information such as ratios, which have limited ability to predict fraud, as discussed in the literature. Our results indicate that the use of linguistic features is an effective means of detecting fraud.
[The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone: 1-800-521-0600. Web page:]
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site:
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A