SafetyWorks Ontology Construction
Author: G. H. Merrill
Abstract: The SafetyWorks project is oriented towards developing methodologies for the use of large observational data sources in drug safety signal screening and evaluation. The SafetyWorks application, as described here, is a prototype software application (or a set of prototypical components) implementing those methodologies. A feature of the overall SafetyWorks methodologies is the use of large formal biomedical ontologies for the purposes of data normalization and the exploration and inferencing of class effects pertaining to medical conditions and drugs. This paper describes the methods used in the creation and annotation of those ontologies within GSK’s prototype application of the SafetyWorks methodologies.
Note (Added Sept. 15, 2009):
This paper describes the technology employed in the GlaxoSmithKline SafetyWorks project to create the drug and medical conditions ontologies employed in SafetyWorks. It was delivered to ProSanos Corporation as part of a licensing agreement and technology transfer covering their use of this and other intellectual property developed by GlaxoSmithKline, and it is hereby placed in the public domain.
Towards Automating an Inference Model on Unstructured Terminologies: OXMIS Case Study
Author: J. L. Painter
Abstract: Most modern biomedical vocabularies employ some hierarchical representation that provides a “broader/narrower” meaning relationship among the “codes” or “concepts” found within them. Often, however, we may find within the clinical setting the creation and curation of unstructured custom vocabularies used in the everyday practice of classifying and categorizing clinical data and findings.
A significant and widely used example of this lies in the General Practice Research Database which makes use of the Oxford Medical Information Systems (OXMIS) coding scheme to represent drugs and medical conditions. This scheme is intrinsically unstructured, is generally regarded as disorganized, and is not amenable to comparison with other hierarchically structured medical coding schemes such as ICD-9, MedDRA, or SNOMED CT. In order to improve processes of data analysis and extraction, we define a semantically meaningful representation of the OXMIS codes by way of the UMLS Metathesaurus. A structure-imposing ontology mapping is created, and this process provides a complete illustration of a general semantic mapping technique applicable to unstructured biomedical terminologies.
Concepts and Synonymy in the UMLS Metathesaurus
Author: G. H. Merrill
Abstract: This paper advances a detailed exploration of the complex relationships among terms, concepts, and synonymy in the UMLS Metathesaurus, and proposes the study and understanding of the Metathesaurus from a model-theoretic perspective. Initial sections provide the background and motivation for such an approach, and a careful informal treatment of these notions is offered as a context and basis for the formal analysis. What emerges from this is a set of puzzles and confusions in the Metathesaurus and its literature pertaining to synonymy and its relation to terms and concepts. A model theory for a segment of the Metathesaurus is then constructed, and its adequacy relative to the informal treatment is demonstrated. Finally, it is shown how this approach clarifies and addresses the puzzles educed from the informal discussion, and how the model-theoretic perspective may be employed to evaluate some fundamental criticisms of the Metathesaurus.
The MedDRA Paradox
Author: Gary H. Merrill, Ph.D.
Abstract: MedDRA (the Medical Dictionary for Regulatory Activities Terminology) is a controlled vocabulary widely used as a medical coding scheme. However, MedDRA’s characterization of its structural hierarchy exhibits some confusing and paradoxical features. The goal of this paper is to examine these features, determine whether there is a coherent view of the MedDRA hierarchy that emerges, and explore what lessons are to be learned from this for using MedDRA and similar terminologies in a broad medical informatics context that includes relations among multiple disparate terminologies, thesauri, and ontologies.
Construction and Annotation of a UMLS/SNOMED-based Drug Ontology for Observational Pharmacovigilance
Authors: Gary H. Merrill, Patrick B. Ryan, Jeffery L. Painter
Abstract: The primary goal of the SafetyWorks project has been the development of an integrated set of methodologies enabling the use of large observational data sources in monitoring and assessing drug safety concerns. To support its analytical and exploratory capabilities, SafetyWorks makes use of two large hierarchically structured ontologies – one for medical conditions, and one for drugs. In this paper we focus on the drug ontology employed in SafetyWorks and on its construction and annotation based on the SNOMED CT and RxNorm subsets of the Unified Medical Language System Metathesaurus. The result is a case study illustrating the value of SNOMED and its integration with UMLS and RxNorm in a critical and innovative drug safety application. We expose sufficient details of our methods to enable others to make use of these methods and to encourage the expanded use of both SNOMED and the UMLS in data exploration and analysis applications, particularly in the area of improving approaches to drug safety.
Inter-translation of Biomedical Coding Schemes Using UMLS
Authors: Jeffery L. Painter, Kristopher M. Kleiner, Gary H. Merrill
Abstract: We report the results of our work in using the Unified Medical Language System (UMLS) to apply biomedical ontologies to practical problems faced by epidemiologists in extracting study cohorts from large disparate observational databases.
A Practical Multi-Ontology Approach to Knowledge Exploration
Author: Gary H. Merrill
Abstract: Babylon Knowledge Explorer (BKE) is an open-source-based platform for developing knowledge exploration and data mining applications. As a strongly ontology-driven information system, it rests on an approach and solution to the challenges of using pre-existing large ontologies in scientific domains. The aim of this paper is to describe the model of ontology representation employed by BKE and how this model supports a strategy and methodology facilitating application of these pre-existing ontologies to real-world problems of knowledge discovery in large document corpora and databases. Sufficient detail of design and architecture is provided, together with some sketch of the open source implementation, to allow others to make use of this approach.