Text Mining and Knowledge Discovery over Biomedical Literature
The information access paradigm offered by most contemporary text information systems is search-and-sift: users must manually glean and aggregate relevant information from the large number of documents typically returned in response to keyword queries. Expecting users to glean and aggregate information has led to several inadequacies in these systems. Owing to the size of many text databases, search-and-sift is very tedious, often requiring repeated keyword searches that refine or generalize query terms. A more serious limitation arises from the lack of automated mechanisms to aggregate content across different documents to discover new knowledge. This project focuses on processing text to assign semantic interpretations to its content (extracting semantic metadata) and on designing algorithms and heuristics that use the extracted semantic metadata to support knowledge discovery operations over text content. Our recent contributions in extracting semantic metadata cover the extraction of compound entities and of the complex relationships connecting entities. Extraction results are represented using a standard Semantic Web representation language (RDF) and are manually evaluated for accuracy. Our past work focused on developing knowledge discovery algorithms that operate on RDF data. To further improve access mechanisms to text content, we are developing applications to support semantic browsing and semantic search over text.
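As a rough illustration of the idea described above, the following minimal Python sketch shows how relationships extracted from different documents might be held as subject-predicate-object triples, written out in N-Triples syntax, and searched for a chain of relationships connecting two entities. It is not the project's implementation: the `http://example.org/` namespace, the function names `to_ntriples` and `find_path`, and the toy triples are all assumptions made for this example, and breadth-first search stands in for whatever discovery algorithm the project actually uses.

```python
from collections import defaultdict, deque

# Hypothetical namespace for illustration only; not taken from the project.
EX = "http://example.org/"

# Toy triples standing in for relationships extracted from different documents.
# They are made up for this sketch and are not actual extraction results.
triples = [
    ("fish_oil", "reduces", "blood_viscosity"),                # from document 1
    ("blood_viscosity", "is_elevated_in", "raynaud_disease"),  # from document 2
    ("fish_oil", "contains", "omega_3_fatty_acid"),            # from document 3
]


def to_ntriples(triples, base=EX):
    """Serialize (subject, predicate, object) tuples as N-Triples lines."""
    return "\n".join(
        f"<{base}{s}> <{base}{p}> <{base}{o}> ." for s, p, o in triples
    )


def find_path(triples, start, goal):
    """Breadth-first search along subject -> object edges.

    Returns the list of triples linking start to goal, or None if
    no such chain exists.
    """
    out_edges = defaultdict(list)
    for s, p, o in triples:
        out_edges[s].append((p, o))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for p, o in out_edges[node]:
            if o not in seen:
                seen.add(o)
                queue.append((o, path + [(node, p, o)]))
    return None


if __name__ == "__main__":
    print(to_ntriples(triples))
    print(find_path(triples, "fish_oil", "raynaud_disease"))
```

Here no single toy document links the two end entities; the connection appears only once the triples from separate documents are aggregated into one graph, which is the kind of cross-document aggregation that search-and-sift leaves to the user.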
Conference Papers:
- Cartic Ramakrishnan, Pablo N. Mendes, Rodrigo A.T.S da Gama, Guilherme C. N. Ferreira and Amit P. Sheth, "Joint Extraction of Compound Entities and Relationships from Biomedical Literature," WI2008 IEEE/WIC/ACM International Conference on Web Intelligence (WI-08), Sydney, Australia, Dec. 9-12, 2008.
- Cartic Ramakrishnan, Pablo N. Mendes, Shaojun Wang and Amit P. Sheth, "Unsupervised Discovery of Compound Entities for Relationship Extraction," EKAW 2008 - 16th International Conference on Knowledge Engineering and Knowledge Management Knowledge Patterns, Acitrezza, Catania, Italy, 9-29 to 10-3, 2008.
- Cartic Ramakrishnan, Krys Kochut and Amit P. Sheth, "A Framework for Schema-Driven Relationship Discovery from Unstructured Text," International Semantic Web Conference 2006, Athens, GA, November 5-9, 2006, pp. 583-596.
Journal Papers:
- Amit P. Sheth, Cartic Ramakrishnan. (2007). Relationship Web: Blazing Semantic Trails between Web Resources. IEEE Internet Computing, 11(4), 77-81.
- Cartic Ramakrishnan, William H. Milnor, Matthew Perry, Amit P. Sheth. (2005). Discovering informative connection subgraphs in multi-relational graphs. SIGKDD Explorations, 7(2), 56-63.
Tutorials:
- Meenakshi Nagarajan, Cartic Ramakrishnan and Amit Sheth, "Text Analytics for Semantic Computing - the good, the bad and the ugly," Second IEEE International Conference on Semantic Computing, Santa Clara, CA, USA, August 4-7, 2008.