Kno.e.sis hosts many foreign students and faculty, usually during the summer. Details of some of our past visitors follow.

Exchange Program Participants

Visiting Research Scholars


  • Gérald Oster

    When: Friday, July 8, 2016 at 11:00 am

    Venue: 366 Joshi Research Center

    Title: Distributed Real-Time Collaborative Editing

    Abstract: Real-time collaborative editing systems such as Google Docs or Etherpad are well known and widely used. They allow a group of people to collaborate on the same document from different places at any time. Unfortunately, these systems have scalability limitations regarding the number of users who can collaborate on a document at the same time or the number of concurrent changes they can handle. However, we now observe a change of scale in the number of people collaborating on a document: besides small groups, communities of users are now using these tools to coordinate and carry out their writing activities.

    In this talk, I will motivate the need for peer-to-peer real-time collaborative editing systems. I will give a brief overview of optimistic replication mechanisms suitable for this kind of system and focus on one conflict-free replicated data type: a sequence whose atoms have an adaptive granularity. Finally, I will present our current prototype: the MUTE collaborative editor -- a peer-to-peer real-time web collaborative editor.

    Biography: Gérald Oster has been an Associate Professor at TELECOM Nancy, University of Lorraine since 2006. He is a member of the Inria Coast team. He has expertise in distributed collaborative systems with a focus on content replication mechanisms and their applicability. He received his Ph.D. in Computer Science from Nancy University in 2005. During his Ph.D., he worked on verifying the correctness of a family of optimistic replication mechanisms (operational transformation) dedicated to collaborative editing. He proposed a framework based on an automated theorem prover and several sets of verified transformation functions for multiple data types. He worked on the design and implementation of a universal file synchronizer. He is one of the pioneers of the CRDT approach, having participated in the design of the WOOT algorithm that initiated research on these distinctive data structures. He is currently investigating the limitations of these novel replicated data structures and their applicability in diverse domains. Gérald is or has been involved in several research projects and has participated in several technology-transfer projects.

  • Claudia-Lavinia Ignat

    When: Tuesday, July 5, 2016 at 4:00 pm

    Venue: 366 Joshi Research Center

    Title: Large-Scale Trust-Based Collaboration

    Abstract: Distributed collaborative systems such as wikis, version control systems or Google Drive allow users to collaborate on a set of shared documents from different places, at any time and from different devices. Existing collaborative systems made available by large service providers such as Google are based on a central authority that stores and has control over users' personal data. Moreover, these systems do not scale well in terms of the number of users and the frequency of their modifications. The Coast Inria team investigates and designs peer-to-peer collaborative systems that offer very good scalability and where users share data directly with the other users they trust without relying on a central authority.

    In the first part of the talk I'll present a study investigating the scalability of existing real-time collaborative systems in terms of the delays experienced by users, i.e. the time that elapses between a user making a modification and that modification becoming visible to other users. By means of user studies in which we simulated delays on small collaborative tasks, we studied the effect of delay on users. We found a general effect of delay on performance, related to the ability to manage redundancy and errors across the document. We interpret this finding as a compromised ability to maintain awareness of team member activity, and a reversion to independent work.

    In the second part of the talk I'll discuss a preliminary validation of a trust-based collaboration model in which users share data with the people they trust. Trust can be computed based on the collaboration interactions between users. We studied the trust game, a money exchange game widely used in behavioral economics for analysing trust and collaboration between humans. In this game, the exchange of money is entirely attributable to the existence of trust between users. We proposed a trust metric that computes trust scores between users based on the amounts they exchanged in the past. This trust metric deals with fluctuating user behavior and can predict future user behavior. By means of user studies we investigated the influence of showing a partner's trust score on user behavior during the trust game. We showed that the availability of a trust score has the same effect as the availability of a user nickname in improving cooperation between users. We conclude that in large-scale collaboration, where users can change their nicknames and it is difficult to remember who is who from the nicknames, trust scores could be an enhancement over traditional nicknames.

    This work has been done in collaboration with Valerie Shalin from the Department of Psychology of Wright State University, Kno.e.sis, in the context of the USCOAST associated Inria team.

    Biography: Claudia-Lavinia Ignat is a permanent researcher at Inria and the vice-head of the Coast team. She obtained a Ph.D. in Computer Science from ETH Zurich, Switzerland in 2006. Her research interests are distributed collaborative systems with a focus on consistency maintenance, group awareness, security, trust issues and user studies. She is on the editorial board of the Journal of CSCW and is a program committee member of several international conferences such as CSCW, GROUP, CDVE and ICEBE. She has been involved in several research and industrial projects and is the coordinator of the USCOAST associated Inria team in collaboration with Kno.e.sis.

  • Dr. Sören Auer, University of Bonn

    When: December 3, 2013 at 1:00pm

    Venue: 399 Joshi Research Center

    Title: Linked Data for Enterprise Information Integration

    Abstract: Data integration in large enterprises is a crucial but at the same time costly, long-lasting and challenging problem. While business-critical information is often already gathered in integrated information systems such as ERP, CRM and SCM systems, the integration of these systems themselves as well as their integration with the abundance of other information sources is still a major challenge. Large companies often have hundreds or even thousands of different information systems and databases. We argue that classic SOA architectures are well-suited for transaction processing, but more efficient technologies are available and can be deployed for integrating data. A promising approach is the use of the linked data paradigm for integrating enterprise data. Just as the data web emerged to complement the document web, data intranets can complement the intranets and SOA landscapes currently found in large enterprises. We explore the challenges large enterprises are still facing with regard to data integration. These include in particular the establishment, management and interlinking of enterprise taxonomies, domain databases, collaboration portals, wikis and other enterprise information sources. We survey Linked Data approaches for their solution and present some examples of successful applications of the Linked Data principles in that context.

    Biography: Prof. Sören Auer holds the chair for Enterprise Information Systems at the University of Bonn and leads a research group at the Fraunhofer Institute for Analysis and Information Systems (IAIS). Sören has made substantial contributions to social and semantic web technologies, knowledge engineering, usability, as well as databases and information systems. Sören is the author of over 80 peer-reviewed scientific publications. Sören is leading the European Union’s FP7-ICT flagship project LOD2, comprising 15 partners from 11 countries. He is co-founder of several high-impact research and community projects such as DBpedia, LinkedGeoData and OntoWiki. He is an organiser and co-programme chair of renowned scientific conferences and an area editor of the Semantic Web Journal, and he serves as an expert for industry, the EC and W3C and as an advisory board member of the Open Knowledge Foundation.

  • Dr. Ullas Nambiar, Lead Scientist, EMC India Center of Excellence

    When: April 3, 2014

    Venue: 399 Joshi Research Center

    Title: Scalable Querying of Semantic Web Data Models: Challenges and Opportunities

    Abstract: The Center for Transformational Innovations (CeTI) at EMC India COE has been chartered with looking at some of the fundamental problems afflicting the masses in emerging economies. Healthcare and education have come up as the two fundamental pillars on which we can deliver a wholesome human-like life to all citizens. The health of populations is a distinct key issue in public policy discourse in every mature society often determining the deployment of huge society. The first criterion for a just healthcare system is universal access to an adequate level, and access without excessive burden. In most emerging economies, meeting this first criterion has itself been and continues to be a challenge. In this talk I will highlight the problems in existing healthcare systems, present the potential of IT-enabled solutions, share the data management, analytics and privacy issues that will arise, and conclude with how we at CeTI are engaged in this space.

    Biography: Ullas Nambiar is the Lead Scientist at the CTO Office of EMC India COE. He co-leads a team developing solutions for emerging regions. Ullas received his Ph.D. in Computer Science from Arizona State University in 2005 and a BE in Computer Science with Distinction from M.S. University, Vadodara, India. His experience includes stints at IBM Research, UC San Diego, UC Davis, NEC CCRL, and L&T InfoTech.

    His primary focus is on solving problems related to efficient Data Management & Analytics brought forth by the proliferation of Mobile Devices, Social Media and the Internet of Things. He has authored 50+ papers in top conferences & journals, has several patents granted by the USPTO and has organized 9 international workshops. Ullas is an ACM Senior Member and Distinguished Speaker.

  • Luc Andre

    When: October 30, 2013

    Venue: 399 Joshi Research Center

    Title: Massive Scale Collaborative Editing

    Abstract: Since the Web 2.0 era, the Internet has been a huge content-editing space in which users contribute to the content they browse. Users do not just edit the content; they collaborate on it. Such shared content can be edited by thousands of people. However, current consistency maintenance algorithms do not seem to be adapted to massive collaborative updating. Shared data is usually fragmented into smaller atomic elements that can only be added or removed. Coarse-grained data leads to the possibility of conflicting updates, while fine-grained data requires more metadata. In this discussion we offer a solution for handling an adaptable granularity for shared data that overcomes the limitations of fixed-granularity approaches. Our approach defines data at a coarse granularity when it is created and refines its granularity only when faced with possibly conflicting updates to this data.

    Biography: Luc Andre is a third-year Ph.D. student in Computer Science at Université de Lorraine, in France, and works at the Loria Lab, which also partners with the INRIA Research Center. His research focuses on algorithms for collaborative editing in distributed environments.

  • Dr. Chris Welty, IBM Research

    When: March 29, 2013

    Venue: 292 Joshi (Brandeberry)

    Title: Scalable Querying of Semantic Web Data Models: Challenges and Opportunities

    Abstract: Watson is a computer system capable of answering rich natural language questions and estimating its confidence in those answers at a level rivaling the best humans at the task. On Feb 14-16, 2011, in an historic event, Watson triumphed over the best Jeopardy! players of all time. In this talk I will discuss how Watson works at a high level with examples from the show, and concentrate on the use of semantic technology in Watson.

    Biography: Chris Welty is a Research Scientist at the IBM T.J. Watson Research Center in New York. Previously, he taught Computer Science at Vassar College, taught at and received his Ph.D. from Rensselaer Polytechnic Institute, and accumulated over 14 years of teaching experience before moving to industrial research. Chris' principal area of research is Knowledge Representation, specifically ontologies and the semantic web, and he spends most of his time applying this technology to Natural Language Question Answering as a member of the DeepQA/Watson team and, in the past, Software Engineering. Dr. Welty was a co-chair of the W3C Rules Interchange Format Working Group (RIF), serves on the steering committee of the Formal Ontology in Information Systems Conferences, is past president of KR.ORG, on the editorial boards of AI Magazine, The Journal of Applied Ontology, and The Journal of Web Semantics, and was an editor in the W3C Web Ontology Working Group. While on sabbatical in 2000, he co-developed the OntoClean methodology with Nicola Guarino. Chris Welty's work on ontologies and ontology methodology has appeared in CACM, and numerous other publications.

  • Dr. Kemafor Anyanwu, Semantic Computing Research Lab, North Carolina State University

    When: March 29, 2013

    Venue: 292 Joshi (Brandeberry)

    Title: Scalable Querying of Semantic Web Data Models: Challenges and Opportunities

    Abstract: Recent advancements in Semantic Web publishing technologies are ushering in the era of "big Semantic Web data". We now have a rapidly growing number of publicly available Semantic Web datasets spanning a variety of domains. Attempting to harness the collective knowledge represented in combinations of these datasets quickly gives rise to a "big and heterogeneous data" problem. Although there are numerous research activities now focused on "big data", many of them focus on data that is fairly homogeneous and either completely structured or completely unstructured. However, Semantic Web data models impose very few constraints on structure compared to relational models, leading to a graph-like or semi-structured nature. Further, Semantic Web data collections often contain data expressed using multiple vocabularies or schemas, requiring processing that involves reasoning about equivalences that are not explicitly stated. These requirements lead to workloads that do not match assumptions made by traditional relational query optimization techniques. Unfortunately, many existing Semantic Web data query processing techniques still rely on relational-like query optimization techniques, and the impact of this mismatch is much more palpable at large scales. The use of cloud computing platforms to support elastic scaling requirements adds yet another dimension of challenges because several key assumptions underlying traditional query optimization techniques do not transfer to popular platforms like Google's MapReduce. In this talk, I will present an overview of the efforts being undertaken by the Semantic Computing Research Lab at North Carolina State University to address some of these challenges and highlight some open research opportunities.

    Biography: Kemafor Anyanwu is an Assistant Professor of Computer Science and director of the Semantic Computing Research Lab at North Carolina State University. She received a Ph.D. in Computer Science from the University of Georgia in 2007. Her research interests include Semantic Web data management, data analytics and mining, and their applications. The two themes of her research activities revolve around developing optimization techniques for large scale Semantic Web data processing and developing query primitives and languages for supporting more sophisticated querying on the Semantic Web. She has served on program committees of different tracks of conferences (ISWC, ICDE, and ICSC) and was on the organizational committee for WWW2010 held in Raleigh. She reviews for journals such as TKDE, IJSWIS and has been a guest editor for IJSWIS. Her work with her student received the best paper award in JIST 2012. Her work is funded by grants from the NSF and industry awards like the IBM Faculty awards.

  • Dr. Jennifer Kaminski, Center for Cognitive Science & Department of Psychology, Ohio State University

    When: September 21, 2012

    Venue: Joshi Atrium

    Title: The trouble with first impressions: How contextualized instantiations hinder transfer of mathematical knowledge

    Abstract: It has been suggested by philosophers and cognitive scientists that mathematical cognition is embodied in the sense that it is grounded in human perception and action. Embodied accounts of cognition often posit that cognition is situated in real-world experiences, and these accounts are frequently extended to advocate that mathematics teaching should also be grounded in human experience. According to such a position, presenting mathematical concepts through real-world contexts can tap a learner's prior knowledge and facilitate learning. However, a primary goal of learning mathematics is the ability to apply this knowledge to novel situations. Therefore, effective teaching should promote not only initial learning but also subsequent transfer of mathematical knowledge. Real-world instantiations of mathematics are typically perceptually and conceptually rich, conveying considerably more extraneous information than their more symbolic counterparts. This extraneous information may remain associated with the learner's interpretation of the mathematical structure, consequently constraining the applicability of the mathematical knowledge. In this talk, I will present evidence that acquiring mathematical concepts through contextualized, real-world instantiations can hinder subsequent transfer of knowledge in comparison to acquiring mathematics through more symbolic instantiations. These findings have important implications for the teaching of mathematics because they suggest that while some contextualized instantiations of mathematics may be intuitively appealing and offer a leg-up in the learning process, they may hinder subsequent transfer. Symbolic representations of mathematics may sometimes be more difficult to learn initially, but once acquired they are powerful in the sense that they enable the learner to recognize mathematical structures in real-world situations.

    Biography: Jennifer Kaminski holds bachelor's and master's degrees in mathematics and a doctorate in mathematics education with a focus on psychology and cognitive science. She has worked in the field of actuarial consulting and also taught undergraduate mathematics. Currently, she is a research scientist at the Ohio State University Center for Cognitive Science and Department of Psychology. The primary focus of her research has been the acquisition and application of mathematical structures.

  • Dr. Robert Goldstone, Department of Psychological and Brain Sciences, Indiana University

    When: September 21, 2012

    Venue: Joshi Atrium

    Title: Mathematical Reasoning as a Literally Physical Symbol System

    Abstract: Much of the power of mathematics comes from its generality and ability to unify prima facie dissimilar domains. By one account, analytic thought in math and science requires developing deep construals of phenomena that run counter to untutored perceptions. This approach draws an opposition between superficial perception and principled understanding. In this talk, I advocate the converse strategy of grounding mathematical reasoning in perception and action. I will describe empirical evidence for perceptual changes that accompany learning in mathematics. In arithmetic and algebraic reasoning, we find that proficiency involves executing spatially explicit transformations to notational elements. People learn to attend to mathematical operations in the order in which they should be executed, and the extent to which students employ their perceptual attention in this manner is positively correlated with their mathematical experience. People produce mathematical notations that they are good at reading. Perceptual and attentional processes are tailored to fit mathematical requirements. Thus, for reasoning in mathematics, relatively sophisticated performance can be achieved not only by ignoring perceptual features in favor of deep conceptual features, but also by adapting perceptual processing so as to conform with and support formally sanctioned responses. These "Rigged Up Perceptual Systems" (RUPS) offer a promising strategy for achieving educational reform. Based on the theoretical foundation of RUPS, we have designed and implemented a virtual, interactive sandbox for students to explore algebra. At the end of this talk, I will describe experiments that explore the use of this system by 11- to 19-year-old students.

    Biography: Robert Goldstone is a Chancellor's Professor of Psychological and Brain Sciences. He won the 1996 Chase Memorial Award for Outstanding Young Researcher in Cognitive Science, the 2000 APA Distinguished Scientific Award for Early Career Contribution to Psychology in the area of Cognition and Human Learning, and a 2004 Troland research award from the National Academy of Sciences. He served as editor of Cognitive Science from 2001-2005, and Director of the Indiana University Cognitive Science Program from 2005-2011. His primary research interest is in building computational models of human learning, and he has conducted research on similarity, perceptual learning, concept learning, and collective behavior.

  • Dr. E. Michael Maximilien, IBM Research

    When: July 26, 2012

    Venue: 292 Joshi (Brandeberry)

    Title: The IBM PaaS: from inception in research project Altocumulus to customers as IBM PureApplication System

    Abstract: In this talk I will describe the evolution of the IBM Platform-as-a-Service or IBM PureApplication System, a member of the IBM PureSystems [1] family of PaaS and private IaaS cloud products and services [2]. Our journey will take us from a little-known cloud research project (IBM Altocumulus [3][4]) to its current manifestation as IBM PureApplication System and the many forms it took in between. I will try to draw some lessons learned in the three-year journey, highlighting how IBM Research does "research" (in my opinion) as well as what works well in this type of industry research setting and what does not.

    Biography: Dr. E. Michael Maximilien (also known as Max) is a Research Scientist at IBM Research. Max's primary research interests lie in distributed systems and software engineering for the web; in particular, web APIs and services, mashups, cloud computing, social software, and Agile methods and practices. His most recent research project heavily influenced IBM Workload Deployer and now the IBM PureSystems family of IaaS and PaaS. Max is an active participant and contributor to communities related to Ruby, Ruby on Rails, and Agile methods and practices, inside and outside of IBM.


  • Dr. Alexandre Passant, DERI, NUI Galway

    When: February 8-12, 2011

    Venue: 365 Joshi Research Center

    Title: Semantic microblogging and citizen sensing and Lightweight ontologies for citizen sensing / integrating Social networks and sensor networks

    Biography: Founder / CEO / Chief-Hacker at seevl, a music-tech start-up, building new tools and solutions in the music discovery space. seevl is a spin-out of DERI, where I was previously a Research Fellow / Unit Leader, and I’m still a part-time Associate Researcher and Adjunct Lecturer at NUI Galway. I’m also active in several W3C groups, I regularly speak and publish in international conferences, and I am a Scientific Advisor for Mesagraph, a start-up that derives meaningful insights from Twitter streams.

    I love to build, learn and share. I’m trying to make the Web a better place, and – most important – I’m having fun doing it.

  • Dr. Dan Gruhl, IBM Research-Almaden

    When: April 12, 2010

    Venue: 365 Joshi Research Center

    Title: Semantic Super Computing

    Abstract: Several years ago we started exploring what you could do with text if there was no limit on the size of the corpus or the amount of computation you could use on it. The result was a number of applications that help find interesting insights into the information hidden in these corpora - everything from what TV show 20-something women prefer to what the color of the web is.

    Biography: Daniel Gruhl, IBM Research Division, Almaden Research Center, 650 Harry Rd., San Jose, CA 95120. Dr. Gruhl is a Research Staff Member in the Computer Science Department at the Almaden Research Center. He received his B.S. ('94), M.Eng. ('95) and Ph.D. ('00) degrees from the Massachusetts Institute of Technology in Electrical Engineering and Computer Science. He subsequently joined IBM at the Almaden Research Center, where he has worked on large-scale text analytics systems. He has received Outstanding Technology Awards for both the WebFountain system and the UIMA standard for text analytics. He is an author or coauthor of at least a dozen patents and two dozen papers. Dr. Gruhl is a member of the IEEE and ACM.

  • Dr. Ramesh Jain, University of California, Irvine

    When: November 4th, 2011

    Venue: 145 Russ Engineering

    Title: Social Life Networks

    Abstract: We are living in an age of social media that provides numerous channels for digital expression and sharing almost instantaneously in any part of the world. By bringing together different media as well as modes of distribution—focused, narrowcast, and broadcast—social networks (SN) have revolutionized communication among people. I believe that by using the enormous reach of mobile phones equipped with myriad sensors, the next generation of social networks can be designed not only to connect people with other people, but also to connect people with essential life resources. I call these networks Social Life Networks (SLN) and believe that this is the right time to focus efforts to discover and develop technology and infrastructure to design and build these networks. I will discuss my approach to building SLNs and will identify challenges that must be addressed to make SLNs practical. I will also discuss their implications for the masses in emerging countries, the Middle of the Pyramid, and the technology development needed to make that practical.

    Biography: Dr. Ramesh Jain joined the University of California, Irvine as the first Bren Professor in the Bren School of Information and Computer Sciences in 2005. Dr. Jain has been an active researcher in multimedia information systems, image databases, machine vision, and intelligent systems. While he was a professor of computer science and engineering at the University of Michigan, Ann Arbor and the University of California, San Diego, he founded and directed artificial intelligence and visual computing labs. He has co-founded three companies and his current research is in experiential computing and its applications.

  • Dr. Ying Ding and colleagues, Indiana University

    When: November 6, 2009

    Venue: 292 Joshi (Brandeberry Conference Room)

    Title: chem2bio2rdf: Semantic Systems Chemical Biology
    Abstract: In this talk, we describe the use of Semantic Web technologies including RDF and OWL for the integration of chemical, biological and genomic information within the context of Systems Chemical Biology. We describe how two existing resources, Bio2RDF and Linking Open Drug Data (LODD), can be integrated with chemistry-oriented networks to create large-scale systems chemical biology networks that allow links between compounds, protein targets, genes and diseases to be established. In this work, we describe the generation of this Chem2Bio2RDF network and how it can be analyzed in a variety of ways including the use of Semantic Lenses.

    Title: Social tagging networks: Cohesiveness and Dynamics
    Abstract: This talk proposes an approach to studying the structure and dynamics of large cohesive groups of tags in online social networks. Given a tag co-occurrence graph defined over a particular time span, the cohesive subgroups of tags are modeled using the graph-theoretic concept of a k-plex, which was originally introduced in the social network analysis literature. This model can be thought of as a relaxed, more practical version of the popular clique model that is obtained by allowing a predetermined (and typically small) number k of non-neighbors for each vertex within the group. Intuitively, a maximum k-plex in the graph should be related to one of the most popular topics discussed in the network, and the size of a maximum k-plex can serve as a reasonable global measure of cohesiveness of the network. Moreover, studying the structure and dynamics of changes in the maximum k-plex of the tag co-occurrence graph of the same social network over time can be used to deduce some interesting information about the underlying social network. We illustrate the proposed method on a large set of dynamic data extracted from the Delicious social bookmarking community.

    Title: Weighted PageRank for heterogeneous scholarly networks
    Abstract: Large-scale weighted PageRank can be calculated for heterogeneous citation networks, author-citation networks and journal citation networks. The weights considered are citation time, self-citation and journal impact factors. Weighted PageRank ranks have been compared with normal citation ranks. This method can be easily extended to other heterogeneous networks. Potentially interesting issues are dangling nodes and heuristic parameter settings.

    Biography: Dr. Ying Ding is an Assistant Professor in the School of Library and Information Science, Indiana University. Previously, she worked as a senior researcher at the University of Innsbruck, Austria and as a researcher at the Division of Mathematics and Computer Science at the Free University of Amsterdam, the Netherlands. She completed her Ph.D. in the School of Applied Science, Nanyang Technological University, Singapore. She has been involved in various European Union-funded projects: research-oriented EU projects (EASAIER, OntoKnowledge, IBROW, SWWS, COG, Htechsight, Esperanto, SEKT, DIP, Triple Space Computing), thematic networks (Ontoweb, KnowledgeWeb), and Accompanied Measurements (Multiple). She is very active in many consultancy projects between the University and companies. She has published more than 70 papers in journals, conferences and workshops. She is a Program Committee Member for more than 80 international conferences and workshops. She is co-author of the book "Intelligent Information Integration in B2B Electronic Commerce" published by Kluwer Academic Publishers. She is also co-author of book chapters in the book "Spinning the Semantic Web" published by MIT Press and "Towards the Semantic Web: Ontology-driven Knowledge Management" published by Wiley. Her current interest areas include Webometrics, Semantic Web, citation analysis, information retrieval, knowledge management and application of Web Technology.

  • Dr. Bhavani Thuraisingham & Dr. Latifur Khan, University of Texas at Dallas

    When: November 13, 2009

    Venue: 292 Joshi (Brandeberry Conference Room)

    Abstract: Ontology alignment determines the semantic heterogeneity between two or more domain specifications by considering their associated concepts. Our approach considers name, structural and content matching techniques for aligning ontologies. Together with UMN, we justify the conceptual validity of our ontology alignment technique with a series of experimental results that demonstrate the efficacy and utility of our algorithms on a wide variety of authentic GIS data including multi-jurisdictions.

    The second part of the presentation deals with scalable storage and retrieval of large RDF graphs. Currently available semantic web frameworks do not work well for this retrieval task. In this talk, we describe a framework that we built using Hadoop to store and retrieve large numbers of RDF triples. We describe a scheme to store RDF data in the Hadoop Distributed File System. We also describe our algorithms to generate the best possible query plan to answer a SPARQL (SPARQL Protocol and RDF Query Language) query based on a cost model. We use Hadoop's MapReduce framework to actually answer the queries. Our results show that we can store large RDF graphs in Hadoop clusters built with cheap commodity-class hardware. We conclude that our framework is scalable and efficient and can handle large amounts of RDF data.

    Acknowledgements: Our research on the semantic web is supported by the National Science Foundation, the Intelligence Advanced Research Projects Activity, the Air Force Office of Scientific Research and the National Geospatial-Intelligence Agency.

    Biographies: Bhavani Thuraisingham has been a Professor of Computer Science and Director of the Cyber Security Research Center in the Erik Jonsson School of Engineering and Computer Science at the University of Texas at Dallas (UTD) since October 2004. Dr. Thuraisingham teaches courses in Data Security and Semantic Web, and her research is sponsored by NSF, AFOSR, IARPA, NGA, NASA and Raytheon among others. Prior to joining UTD, Dr. Thuraisingham worked for the MITRE Corporation for 16 years, which included an IPA (Intergovernmental Personnel Act) at the National Science Foundation as Program Director for Data and Applications Security. At MITRE she was a Department Head in Data and Information Management, and established research programs with AFRL, CECOM, SPAWAR, NSA and CIA. Prior to joining MITRE in January 1989, she worked in commercial industry for six years, first at the Control Data Corporation and later at Honeywell Inc. She has also worked as adjunct professor of computer science, first at the University of Minnesota and later at Boston University. She has been an instructor at AFCEA since 1998. Dr. Thuraisingham was educated in the United Kingdom both at the University of Bristol and at the University of Wales.

    Professor Thuraisingham is an elected Fellow of three professional organizations: the IEEE (Institute of Electrical and Electronics Engineers), the AAAS (American Association for the Advancement of Science) and the BCS (British Computer Society) for her work in data security. She received the IEEE Computer Society's prestigious 1997 Technical Achievement Award for outstanding and innovative contributions to secure data management. Dr. Thuraisingham received her education in the United Kingdom at the University of Bristol and the University of Wales. She was quoted by Silicon India Magazine as one of the top seven technology innovators of South Asian origin in the USA in 2002.

    Prior to joining UTD, Dr. Thuraisingham was an IPA (Intergovernmental Personnel Act) at the National Science Foundation (NSF) in Arlington, VA, from the MITRE Corporation. At NSF, she established the Data and Applications Security Program, co-founded the Cyber Trust theme and was involved in inter-agency activities in data mining for counter-terrorism. She worked at MITRE in Bedford, MA between January 1989 and September 2001, first in the Information Security Center and later as a department head in Data and Information Management as well as Chief Scientist in Data Management in the Intelligence and Air Force centers. She has served as an expert consultant in information security and data management to the Department of Defense, the Department of Treasury and the Intelligence Community for over 10 years. Thuraisingham’s industry experience includes six years of research and development at Control Data Corp. and Honeywell Inc. in Minneapolis, MN. While in industry and at MITRE, she was an adjunct professor of computer science and member of the graduate faculty, first at the University of Minnesota and later at Boston University, between 1984 and 2001. She also worked as visiting professor soon after her PhD, first at the New Mexico Institute of Technology and later at the University of Minnesota, between 1980 and 1983.

    Dr. Thuraisingham's work in Information Security and Data Mining has resulted in over 90 journal articles, over 200 refereed conference papers, over 70 keynote addresses, and three US patents. She is the author of nine books in data management, data mining and data security, including one on data mining for counter-terrorism and another on Database and Applications Security, and is completing her tenth book on Secure Service Oriented Information Systems. Dr. Thuraisingham has been invited to speak on data mining for security applications at the United Nations and at the White House Office of Science and Technology Policy and has also participated in panels at the National Academy of Sciences and the Air Force Scientific Advisory Board. She is the President of Bhavani Security Consulting, and supports the Department of Treasury on Software Research Credit. She serves (or has served) on editorial boards of leading research and industry journals including several IEEE and ACM Transactions and served as the Editor in Chief of Computer Standards and Interfaces Journal. She has also been an instructor at the AFCEA (Armed Forces Communications and Electronics Association) Professional Development Center since 1998 and has served on panels for the Air Force Scientific Advisory Board and the National Academy of Sciences.

    During her nearly five years at UTD, Dr. Thuraisingham has established and led a strong research program in Intelligence and Security Informatics which now includes 4 core professors, and the team has generated over $9m in research funding from agencies such as NSF, AFOSR, IARPA, NGA, NASA and ONR as well as corporations such as Raytheon Inc. The research projects include an NSF Career Grant, an AFOSR Young Investigator Program Award and a DoD MURI Award. Her current focus includes two activities: i) studying how terrorists and hackers function so that effective and improved solutions can be provided and ii) transferring the technologies developed at the university to commercial development efforts.

    Dr. Thuraisingham promotes Math and Science to high school students as well as to women and underrepresented minorities and has given featured addresses at conferences sponsored by WITI and SWE. Articles on her efforts as well as her vision have appeared in multiple publications including the Dallas Morning News, The D Magazine, The MITRE Matters and the DFW Metropolis Technology Magazine. She has also appeared on DFW television speaking on cyber security related topics.

    Latifur Khan is an Associate Professor of Computer Science at the University of Texas at Dallas. He joined the university in 2000 after completing his PhD at the University of Southern California on Ontology Management under Prof. Dennis McLeod. His research interests are in Data Mining for Cyber Security, Semantic Web and Geospatial information management, and his research is funded by NSF, AFOSR, IARPA, NGA, NASA, Raytheon, Nokia and Cisco. He has published papers in VLDB Journal and several IEEE Transactions as well as in conferences such as ICDM, ACM Multimedia and ECML/PKDD. He is the co-author of the book Design and Implementation of Data Mining Tools, published by CRC Press. He is a senior member of IEEE.

  • Dr. Pankaj Mehra, Chief Scientist at HP Labs Russia, Technical Lead at

    When: October 21, 2009

    Venue: 292 Joshi (Brandeberry Conference Room)

    Title: Beyond Search: 5 Steps to Insight

    Abstract: Evidence points to the reason why blind application of Web search to enterprises produces undesirable outcomes. First, enterprises are lagging behind the Web in achieving richly connected information. Second, the level of specificity of meaning and the depth of modeling expected by enterprise users are both higher than those expected by consumers. In this talk, I will show the first houses we have built on the semantic foundations for enterprise search we laid in our previous work.

    The talk will begin with an analysis of how people, processes and infrastructure are currently deployed in information-rich businesses in order to make sense of tens of petabytes of open and proprietary information. I will discuss why existing architectures (especially their implied cost and delay structures) do not scale to the demands and opportunities thrown open by new economic and business models around information. I will then argue that in order to architect for exabytes and beyond, businesses need to make a switch to architecting their information services around an economy of plenty (away from architecting around the economy of dearth that gave us search).

    In the process, we will turn on its head the very problem that motivates search technology, for instance, asking and answering the fundamental question: Is it not already too late if you have to look for something? I will present architectures for delivery and sense-making. At the heart of these systems lies an engine that shortens the path from "Got it. Got it!" to insight. We will show how to couple this engine with context-mapping technologies in order to move beyond searching for documents to having the right insights delivered into the right heads at the right time. The talk concludes with blueprints from information-heavy industries, such as financial and legal services, which I expect will become widely adopted in the near future.

    Biography: Pankaj Mehra is an HP Distinguished Technologist, Chief Scientist and founder of HP Labs Russia, and technical leader of HP's ontology generation service. He was core architect of HP's NonStop Advanced Architecture, lead architect of HP's Integrated Archive Platform, and chairman of the InfiniBand Trade Association's Management Working Group. He holds an Industry Visitor position at Stanford University.

  • Dr. Srinivasan Parthasarathy, Ohio State University

    When: September 9, 2009

    Venue: 292 Joshi (Brandeberry Conference Room)

    Title: Toward Visual Knowledge Discovery and Analytics

    Abstract: Knowledge discovery and data mining is a process whose goal is to extract interpretable and actionable information from complex (potentially large) data. Visualization can play an important role in this exploratory process. Indeed, a number of important scientific discoveries have ultimately relied on visual confirmation, from Galileo seeing the moons of Jupiter to Gerd Binnig and Heinrich Rohrer seeing atoms on a surface. Visualization can also play an important role in understanding the nature of a problem domain and subsequently the patterns governing the underlying solution space. In this talk I will present our vision of the roles visualization can play in the knowledge discovery process. Specifically, we will examine the use of visualization:

    1. As a mechanism to facilitate exploration of complex datasets.
    2. As a means to validate and confirm results obtained from the discovery process.
    3. As an approach to understand and lend transparency to the discovery process.

    In each case I will attempt to illustrate the roles in the context of specific end applications drawn from the domains of physics of materials, bioinformatics, social network analysis and clinical diagnosis of eye disease. No prior knowledge of data mining, knowledge discovery, visualization or any of these application domains will be assumed.

    Biography: Dr. Srinivasan Parthasarathy (PhD, University of Rochester) is currently an Associate Professor in the Computer Science and Engineering Department at the Ohio State University (OSU). His research interests are broadly in the areas of Data Mining, Databases, Bioinformatics and High Performance Computing. He received an Ameritech Faculty Fellowship in 2001, an NSF CAREER award in 2003, a DOE Early Career Award in 2004 and an IBM Faculty Award in 2007. His papers have received five best paper awards from leading conferences in the field, including ones at the SIAM International Conference on Data Mining (SDM), the IEEE International Conference on Data Mining (ICDM), the Very Large Databases Conference (VLDB) and most recently at ACM Knowledge Discovery and Data Mining (SIGKDD). He is a member of the ACM and the IEEE and has served on the program committees of leading conferences in the fields of data mining, databases, and high performance computing. He currently serves on the editorial boards of several journals including the Data Mining and Knowledge Discovery Journal (DMKDJ), the IEEE Transactions on Knowledge and Data Engineering, the Distributed and Parallel Databases Journal (DAPDJ), and the IEEE Intelligent Systems (IEEE-IS) journal. He served as one of the program chairs of SIAM Data Mining in 2007 and is currently serving as one of the general chairs for the 2009-2010 editions.

  • Dr. Olivier Bodenreider, MD, National Institutes of Health

    When: May 27, 2009

    Venue: 292 Joshi (Brandeberry Conference Room)

    Title: Ontologies and Data Integration in Biomedicine Seminar

    Abstract: This seminar reviews examples of successful biomedical data integration projects in which ontologies play an important role, including the integration of genomic data based on Gene Ontology annotations, the cancer Biomedical Informatics Grid (caBIG) project, and semantic mashups created by the Semantic Web for Health Care and Life Sciences community. Challenges to data integration in biomedicine will also be discussed.

    Biography: Dr. Bodenreider is a Research Scientist at the Lister Hill National Center for Biomedical Communications, US National Library of Medicine, NIH. His research interests include terminology, knowledge representation and ontology in the biomedical domain, both from a theoretical perspective and in their application to natural language understanding, reasoning, information visualization and integration. Dr. Bodenreider is a Fellow of the American College of Medical Informatics. He received an M.D. degree from the University of Strasbourg, France in 1990 and a Ph.D. in Medical Informatics from the University of Nancy, France in 1993. Before joining NLM in 1996, he was an assistant professor of Biostatistics and Medical Informatics at the University of Nancy Medical School, France.

  • Dr. Thomas C. Rindflesch, National Institutes of Health

    When: October 8, 2010

    Venue: 292 Joshi (Brandeberry Conference Room)

    Title: Extracting Semantic Predications from Biomedical Text

    Abstract: SemRep is a symbolic natural language processing system that identifies semantic predications in biomedical text. For example, "Acetylcholine STIMULATES Nitric Oxide" is extracted from the sentence "In humans, ACh evoked a dose-dependent increase of NO levels in exhaled air." The system is linguistically based and depends on domain knowledge in the Unified Medical Language System. Underspecified interpretation for a range of syntactic structures is provided, rather than detailed representation for a limited number of phenomena. Thirty core predications in clinical medicine, genetic etiology of disease, pharmacogenomics, and molecular biology are retrieved. Several evaluations report precision around 75% and recall near 65% (both lower for molecular biology). SemRep predications have been exploited for text mining applications in genetic etiology of disease, automatic summarization, literature-based discovery, and enhanced information retrieval.

    Biography: Dr. Rindflesch has a PhD in linguistics from the University of Minnesota and currently leads the Semantic Knowledge Representation project, which conducts research in natural language processing methodology and develops innovative applications for advanced access to information in MEDLINE citations.

  • Dr. Alfredo Cuzzocrea, PhD, Institute of High Performance Computing and Networking

    When: October 13, 2011, 11:00AM - 12:00PM

    Venue: Joshi Atrium

    Title: Knowledge Discovery From Sensors and Stream

    Biography: Alfredo Cuzzocrea is a Senior Researcher at the Institute of High Performance Computing and Networking of the Italian National Research Council, Italy, and an Adjunct Professor at the Department of Electronics, Computer Science and Systems of the University of Calabria, Italy. His research interests include multidimensional data modeling and querying, data stream modeling and querying, data warehousing and OLAP, OLAM, XML data management, Web information systems modeling and engineering, knowledge representation and management models and techniques, and Grid and P2P computing. He is author or co-author of more than 150 papers. He also serves as PC Chair for several international conferences and as Guest Editor for international journals such as JCSS, DKE, IS, KAIS, IJBIDM, IJDMMM and JDIM.