NotesFAQContact Us
Search Tips
Back to results
ERIC Number: ED550442
Record Type: Non-Journal
Publication Date: 2012
Pages: 229
Abstractor: As Provided
ISBN: 978-1-2678-2744-9
Generative Topic Modeling in Image Data Mining and Bioinformatics Studies
Chen, Xin
ProQuest LLC, Ph.D. Dissertation, Drexel University
Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval and computer vision and bioinformatics domain. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model is proposed for co-existing image features and annotations, in which new latent variables are introduced to allow for more flexible sampling of word topics and visual topics. A perspective hierarchical Dirichlet process (pHDP) model is proposed to deal with user-tagged image modeling, associating image features with image tags and incorporating the user's perspectives into the image tag generation process. It's also shown that in mining large scale text corpora of natural language descriptions, the relation between semantic visual attributes and object categories can be encoded as Must-Links and Cannot-Links, which can be represented by Dirichlet-Forest prior. Novel generative topic models are also introduced to meta-genomics studies. The experimental results show that the generative topic model can be used to model the taxon abundance information obtained by the homology-based approach and study the microbial core. It also shows that latent topic modeling can be used to characterize core and distributed genes within a species and to correlate similarities between genes and their functions. A further study on the functional elements derived from the non-redundant CDs catalogue shows that the configuration of functional groups encoded in the gene-expression data of meta-genome samples can be inferred by applying probabilistic topic modeling to functional elements. Furthermore, an extended HDP model is introduced to infer functional basis from detected enterotypes. The latent topics estimated from human gut microbial samples are evidenced by the recent discoveries in fecal microbiota study, which demonstrate the effectiveness of the proposed models. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by Telephone (800) 1-800-521-0600. Web page:]
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site:
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A