ERIC Number: ED524678
Record Type: Non-Journal
Publication Date: 2010
Pages: 174
Abstractor: As Provided
Reference Count: 0
ISBN: 978-1-1244-4757-5
A Probabilistic Approach to Data Integration in Biomedical Research: The IsBIG Experiments
Anand, Vibha
ProQuest LLC, Ph.D. Dissertation, Indiana University
Biomedical research has produced vast amounts of new information in the last decade, but this information has been slow to find use in clinical applications. Data from disparate sources such as genetic studies and summary data from published literature have been amassed, but there is a significant gap, primarily due to a lack of normative methods, in combining such information for inference and knowledge discovery. In this research, a probabilistic framework based on Bayesian Networks (BN) is built to address this gap. BN are a relatively new method of representing uncertain relationships among variables using probabilities and graph theory. Although inference in BN is computationally complex, they represent domain knowledge concisely. In this work, strategies using BN have been developed to incorporate a range of available information, from both raw data sources and statistical and summary measures, in a coherent framework. As an example of this framework, a prototype model (In-silico Bayesian Integration of GWAS, or IsBIG) has been developed. IsBIG integrates summary and statistical measures from the NIH catalog of genome-wide association studies (GWAS) and the database of human genome variations from the International HapMap Project. IsBIG produces a map of disease-to-disease associations as inferred by genetic linkages in the population. Quantitative evaluation of the IsBIG model shows correlation with empiric results from our Electronic Medical Record (EMR), the Regenstrief Medical Record System (RMRS). Only a small fraction of disease-to-disease associations in the population can be explained by the linking of a genetic variation to a disease association as studied in the GWAS. Nonetheless, the model appears to have found novel associations among some diseases that are not described in the literature but are confirmed in our EMR. In conclusion, our results demonstrate the potential use of a probabilistic modeling approach for combining data from disparate sources for inference and knowledge discovery purposes in biomedical research. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by Telephone 1-800-521-0600. Web page:]
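The idea of inferring a disease-to-disease association through a shared genetic variation can be illustrated with a toy Bayesian network. The sketch below is not the IsBIG model: the three-node structure (one variant `g` as the common parent of two diseases `d1` and `d2`), every probability value, and the function names `joint` and `probability` are assumptions made up for illustration. It computes conditional probabilities by brute-force enumeration over the joint distribution.

```python
from itertools import product

# Hypothetical numbers for illustration only; none come from the dissertation.
P_G = {True: 0.10, False: 0.90}            # P(variant present / absent)
P_D1_GIVEN_G = {True: 0.30, False: 0.05}   # P(disease 1 present | variant)
P_D2_GIVEN_G = {True: 0.20, False: 0.02}   # P(disease 2 present | variant)

NAMES = ("g", "d1", "d2")


def joint(g, d1, d2):
    """Joint probability under the assumed network g -> d1, g -> d2."""
    p_d1 = P_D1_GIVEN_G[g] if d1 else 1.0 - P_D1_GIVEN_G[g]
    p_d2 = P_D2_GIVEN_G[g] if d2 else 1.0 - P_D2_GIVEN_G[g]
    return P_G[g] * p_d1 * p_d2


def probability(query, evidence=None):
    """P(query | evidence) by enumerating all assignments of g, d1, d2."""
    evidence = evidence or {}
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        # Skip assignments that contradict the evidence.
        if any(world[name] != value for name, value in evidence.items()):
            continue
        p = joint(**world)
        denominator += p
        if all(world[name] == value for name, value in query.items()):
            numerator += p
    return numerator / denominator


if __name__ == "__main__":
    baseline = probability({"d2": True})
    given_d1 = probability({"d2": True}, {"d1": True})
    print(f"P(d2) = {baseline:.4f}")
    print(f"P(d2 | d1) = {given_d1:.4f}")
    print(f"lift = {given_d1 / baseline:.2f}")
```

In this toy setup the two diseases are conditionally independent given the variant, yet observing one raises the probability of the other because both depend on the same parent node. Enumeration is exponential in the number of variables, so a model of realistic size would need a more efficient inference method.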
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site:
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: High Schools
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A