Context Mediation among Knowledge Discovery Components
Context Mediation is a field of research concerned with the interchange of information across different environments; it provides a vehicle for bridging semantic gaps among disparate entities. Knowledge Discovery is concerned with the extraction of actionable information from large databases. A challenge that has received relatively little attention is knowledge discovery in a highly disparate environment, that is, one with multiple heterogeneous data sources, multiple domain knowledge sources and multiple knowledge patterns. This thesis tackles the problem of semantic interoperability among data, domain knowledge and knowledge patterns in a knowledge discovery process using context mediation.
Context fundamentals are introduced, encompassing the concepts of context identity, semantic values, contextual equivalence, contextual orders, contextual distances, inheritance of contexts and inter-ontology relationships. Based on this foundation, the principles of context mediators for data, domain knowledge and knowledge patterns are outlined, a context mediator prototype is developed and performance tests are carried out. Building on these basic elements, context mediation is then introduced in turn for data, domain knowledge and knowledge patterns.
Contextual data mediation is concerned with semantic conflicts among the heterogeneous data sources that are used as input for knowledge discovery. In order to treat contexts as first-class citizens and to allow inheritance as well as overloading and overriding operations, an object data model has been chosen, namely ODMG. In addition to the ODMG meta model, its object definition language and object query language have been extended appropriately.
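The idea of a semantic value that is mediated between contexts can be illustrated with a minimal Python sketch. It is not the ODMG extension developed in the thesis; the class names, the `mediate` and `contextually_equivalent` functions, the choice of currency and scale as context attributes, and the exchange rates are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical context: the currency and scale a number is reported in."""
    currency: str
    scale: int  # 1 = units, 1000 = thousands


@dataclass(frozen=True)
class SemanticValue:
    """A plain value paired with the context needed to interpret it."""
    value: float
    context: Context


# Made-up conversion rates to a common pivot currency (illustrative only).
RATES_TO_PIVOT = {"USD": 1.0, "GBP": 1.25}


def mediate(sv: SemanticValue, target: Context) -> SemanticValue:
    """Re-express a semantic value in the target context."""
    in_pivot = sv.value * sv.context.scale * RATES_TO_PIVOT[sv.context.currency]
    converted = in_pivot / RATES_TO_PIVOT[target.currency] / target.scale
    return SemanticValue(converted, target)


def contextually_equivalent(a: SemanticValue, b: SemanticValue,
                            target: Context, tol: float = 1e-9) -> bool:
    """Two values are equivalent if they agree once mediated to one context."""
    return abs(mediate(a, target).value - mediate(b, target).value) <= tol


# Two sources report the same amount under different conventions.
source_a = SemanticValue(2.0, Context("GBP", 1000))  # thousands of GBP
source_b = SemanticValue(2500.0, Context("USD", 1))  # plain USD
usd_units = Context("USD", 1)
same_amount = contextually_equivalent(source_a, source_b, usd_units)
```

In this sketch the raw numbers 2.0 and 2500.0 differ, yet the two values compare as equivalent once both are mediated into a shared target context, which is the kind of semantic conflict contextual data mediation is meant to resolve.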
Contextual domain knowledge mediation deals with the integration of pre-existing knowledge about data, preferences and biases, which are ubiquitous in multiple contexts and are incorporated in the knowledge discovery process. Different types of contextual domain knowledge are formulated, namely taxonomies, constraints, user preferences and previously discovered knowledge. To support both subjective and objective domain knowledge, context mediation among different domain knowledge entities is proposed in a form compatible with the context mediation formulated earlier.
Contextual knowledge pattern mediation is concerned with the interpretation of the outputs of data mining algorithms from different perspectives. An object-oriented framework is presented that models the output of virtually any knowledge discovery exercise. Based on the proposed framework, two operations are developed that allow data mining output to be viewed or interpreted within different contexts. Contextual ranking allows information to be ordered on the basis of qualitative and quantitative information. Three manipulative operations are introduced that provide a further vehicle for tailoring knowledge sets, namely balancing, boosting and inversion. Comparison of discovered knowledge provides a powerful mechanism for evaluating the equivalence between two or more knowledge patterns or pattern objects. A summary value is introduced that is used to calculate pattern equivalence, and example summary values are given for segments and associations.
All techniques, methods and models presented are applied in real-world scenarios covering a wide range of industries and disciplines, namely web mining and marketing, manufacturing, meteorology and internationalisation. Where feasible, industry standards were utilised, for instance ODMG, PMML and KQML.
The research carried out has resulted in almost fifty international publications, including the co-authorship of a book, a journal editorship and one conference best paper award.