Appendix EMMO Introduction by CoBRAIN
Welcome to the EMMO wiki!
Table | Title |
---|---|
Table 1 | EMMO Reference Level Modules |
Table 2 | EMMO Domain Modules |
Abbreviation | Definition |
---|---|
ACS* | Access e. V. |
ANX* | Aeonx AI |
BAL* | Balance Technology Consulting GMBH |
CA* | Consortium Agreement |
CHADA | Characterisation Data |
DB* | Database |
DMP | Data Management Plan |
EMCC* | The European Materials Characterisation Council |
EMMC | The European Materials Modelling Council |
EMMO | Elementary Multiperspective Material Ontology |
EXE* | Exelisis IKE |
HHM* | High Entropy Hardmetals |
HVOF | High Velocity Oxygen Fuel |
ISQ | International System of Quantities ISO 80000 |
KB*~ | Knowledge Base |
KME | Knowledge Management Environment |
MODA | Materials Modelling Data |
MBN* | MBN NANOMATERIALIA SPA |
OTE*~ | Open Translation Environment |
TRL* | Technology Readiness Level |
UB* | Universitat De Barcelona |
UMR* | Universita Degli Studi Di Modena E Reggio Emilia |
UNIBO*~ | Alma Mater Studiorum - Universita Di Bologna |
UR3*~ | Universita Degli Studi Roma Tre |
There has been considerable improvement in the fields of ontologies, terminology, classification, and data documentation in the last couple of decades.
EMMO is a multidisciplinary effort to develop a standard representational framework (the ontology) for applied sciences. It is based on physics, analytical philosophy and information and communication technologies. It has been instigated by materials science to provide a framework for knowledge capture that is consistent with scientific principles and methodologies. It is released under a Creative Commons CC BY 4.0 license.
The name Elementary Multiperspective Material Ontology should be understood as follows:
• Elementary means, amongst others, that EMMO is a discrete ontology assuming the existence of a smallest possible 4D world object in space and time. The term Elementary in EMMO refers to objects that cannot be divided further in space. Elementary also emphasizes EMMO being a fundamental, top-level ontology.
• Multiperspective highlights a very important aspect of EMMO - that it is possible to describe the world from different perspectives. This makes the ontology both flexible and expressive.
• Material (as the opposite of immaterial) emphasises that EMMO is strictly nominalistic, meaning that it assumes that abstracts do not exist. Material also refers to the historical scope of EMMO aiming at the description of materials and thus to cover the needs of physicists and applied scientists.
• Ontology means that EMMO is an ontology. It is based on fundamental philosophical concepts like semiosis, mereology, and topology.
The EMMO ontology is structured in shells, expressed by specific ontology fragments, that extend from fundamental concepts to the application domains, following the dependency flow.
The EMMO top level is the group of fundamental axioms that constitute the philosophical foundation of the EMMO. It starts from causality and mereology, from which it derives space and time. Adopting a physicalistic/nominalistic perspective, the EMMO defines real world objects as 4D objects that are always extended in space and time (i.e. real-world objects cannot be spaceless nor timeless). For this reason, abstract objects, i.e. objects that do not extend in space and time, are forbidden in the EMMO.
EMMO is strongly based on the analytical philosophy discipline semiotics. In EMMO, the role of abstract objects is fulfilled by semiotic objects, i.e. real-world objects (e.g. symbol or sign) that stand for other real-world objects that are to be interpreted by an agent. These symbols appear in actions (semiotic processes) meant to communicate meaning by establishing relationships between symbols (signs).
Another important building block from analytical philosophy is atomistic mereology applied to 4D objects. The EMMO calls it 'quantum mereology', since there is an epistemological limit to how finely we can resolve space and time due to the uncertainty principles.
The mereocausality module introduces the fundamental mereocausality concepts and their relations with the real-world objects that they represent. The EMMO uses mereocausality as the ground for all the subsequent ontology modules. The concept of causal connection is used to define the first distinction between ontology entities, namely the Item and Collection classes. Items are causally self-connected objects, while collections are causally disconnected. Quantum mereology is represented by the Quantum class. This module also introduces the fundamental mereocausality relations used to distinguish between space and time dimensions.
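The Item/Collection distinction can be illustrated with a small graph sketch. It assumes, purely for illustration, that causal connections between quanta are given as undirected pairs; the function names and the sample quanta are hypothetical and are not part of EMMO.

```python
from collections import defaultdict, deque


def is_self_connected(quanta, connections):
    """Return True if every quantum can be reached from every other one."""
    quanta = set(quanta)
    if not quanta:
        return False
    # Keep only the connections whose two ends both belong to the entity.
    neighbours = defaultdict(set)
    for a, b in connections:
        if a in quanta and b in quanta:
            neighbours[a].add(b)
            neighbours[b].add(a)
    # Breadth-first search from an arbitrary quantum.
    start = next(iter(quanta))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for other in neighbours[current] - seen:
            seen.add(other)
            queue.append(other)
    return seen == quanta


def classify(quanta, connections):
    """Label a set of quanta as 'Item' (self-connected) or 'Collection'."""
    return "Item" if is_self_connected(quanta, connections) else "Collection"


# Hypothetical causal connections between five quanta.
links = [("q1", "q2"), ("q2", "q3"), ("q4", "q5")]
print(classify({"q1", "q2", "q3"}, links))
print(classify({"q1", "q2", "q4"}, links))
```

In this toy data the first set is connected through q2 and is labelled as an Item, while the second set contains q4, which has no connection to the others, and is labelled as a Collection.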
The CausalObject is the class of all the individuals that stand for world objects that are a self-connected composition of more than one quantum object and whose temporal parts are always self-connected. It also defines the Elementary class, which restricts mereological atomism in space as causal chains of quantum objects, and the CausalSystem class, whose members are non-elementary causal objects.
In EMMO, the only univocally defined real world object is the CausalSystem individual called Universe that stands for the universe. Every other real-world object is a composition of elementaries up to the most comprehensive object, the Universe. Intermediate objects are not univocally defined, but their definition is provided according to some specific philosophical perspectives. This is an expression of reductionism (i.e. objects are made of sub-objects) and epistemological pluralism (i.e. objects are always defined according to the perspective of an interpreter, or a class of interpreters).
The middle level of EMMO embraces pluralism by providing different ways to describe the world according to different perspectives. EMMO also allows different perspectives to be combined to gain additional expressivity.
The Perspective class collects the different ways to represent the objects that populate the conceptual region between the elementary and universe levels.
The Reductionistic perspective class uses the fundamental non-transitive parthood relation, called direct parthood, to provide a powerful granularity description of multiscale real-world objects. The EMMO can in principle represent the Universe with direct parthood relations as a direct rooted tree up to its elementary constituents.
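A minimal sketch of the idea of a non-transitive direct parthood relation organised as a rooted tree is given below. The entity names and the dictionary layout are hypothetical, and the leaves of this toy tree merely stand in for the elementary constituents.

```python
# Hypothetical direct-parthood tree: each whole maps to its direct parts.
DIRECT_PARTS = {
    "Universe": ["Device", "Sample"],
    "Device": ["Torch", "Feeder"],
    "Sample": ["Coating", "Substrate"],
}


def is_direct_part(part, whole):
    """Direct parthood is non-transitive: only one level is considered."""
    return part in DIRECT_PARTS.get(whole, [])


def is_part(part, whole):
    """Transitive parthood obtained by following direct parthood links."""
    return any(
        child == part or is_part(part, child)
        for child in DIRECT_PARTS.get(whole, [])
    )


def leaves(whole):
    """Collect the entities below `whole` that have no direct parts."""
    children = DIRECT_PARTS.get(whole, [])
    if not children:
        return [whole]
    found = []
    for child in children:
        found.extend(leaves(child))
    return found


print(is_direct_part("Torch", "Universe"))
print(is_part("Torch", "Universe"))
print(leaves("Universe"))
```

The torch is a part of the universe only through the device, so the direct relation does not hold between them while the derived transitive one does. This is what allows each granularity level to be addressed separately.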
The Holistic perspective class considers the importance and role of the whole and introduces the concept of real-world objects that unfold in time in a way that has a meaning for the EMMO user, through the definition of the classes Process and Participant.
The Perceptual perspective class introduces the concept of real-world objects that can be perceived by the user as a recognisable pattern in space or time. Under this class the EMMO categorises e.g. formal languages, pictures, geometry, mathematics, and sounds. Phenomenic objects can be used in a semiotic process as signs.
The Physicalistic perspective class introduces the concept of real-world objects that have a meaning for the ontologist under an applied physics perspective.
The Semiotics perspective introduces the concept of the semiosis process, which has the semiotic entities (Sign, Object, Interpretant and Interpreter) as spatial parts. It is inspired by Peirce semiotics and forms the basis in EMMO to represent e.g. models, formal languages, theories, information, and properties.
The Persistence perspective considers 4D objects as they extend in time (process) or as they persist in time (object). It introduces a sometimes useful categorization that characterizes persistency through continuant and occurrent concepts, even though this distinction is only cognitively defined.
EMMO comes with a set of generic reference ontologies that combine perspectives with ontologisation of common concepts like materials, math, units, etc. They are intended to be shared by domain ontologies and can be imported separately using the IRIs listed in the table below, which shows the current set of reference ontologies.
Reference Domain | IRI |
---|---|
Materials | http://emmo.info/emmo/multiperspective/materials |
Math | http://emmo.info/emmo/multiperspective/math |
Models | http://emmo.info/emmo/multiperspective/models |
Properties | http://emmo.info/emmo/multiperspective/properties |
Metrology | http://emmo.info/emmo/multiperspective/metrology |
Isq | http://emmo.info/emmo/multiperspective/isq |
Siunits | http://emmo.info/emmo/domain/siunits |
Chemistry | http://emmo.info/emmo/multiperspective/chemistry |
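As an illustration of importing reference ontologies separately, the sketch below renders a Turtle ontology header with one owl:imports statement per selected module. The IRIs are copied from the table above; the domain ontology IRI http://example.org/my-domain and the function name are placeholders, and whether the listed IRIs resolve to a specific EMMO version is not checked here.

```python
# Reference ontology IRIs as listed in the table above.
REFERENCE_IRIS = {
    "materials": "http://emmo.info/emmo/multiperspective/materials",
    "math": "http://emmo.info/emmo/multiperspective/math",
    "models": "http://emmo.info/emmo/multiperspective/models",
    "properties": "http://emmo.info/emmo/multiperspective/properties",
    "metrology": "http://emmo.info/emmo/multiperspective/metrology",
    "isq": "http://emmo.info/emmo/multiperspective/isq",
    "siunits": "http://emmo.info/emmo/domain/siunits",
    "chemistry": "http://emmo.info/emmo/multiperspective/chemistry",
}


def import_header(ontology_iri, modules):
    """Render a Turtle header importing the selected reference modules."""
    if not modules:
        raise ValueError("select at least one reference module")
    imports = [f"    owl:imports <{REFERENCE_IRIS[name]}>" for name in modules]
    return "\n".join([
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "",
        f"<{ontology_iri}> a owl:Ontology ;",
        " ;\n".join(imports) + " .",
    ])


# Placeholder IRI for a hypothetical domain ontology.
print(import_header("http://example.org/my-domain", ["materials", "isq"]))
```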
Currently there are several domain ontologies in development that use EMMO as the top and middle level ontology. Typically, they import one of the versions of EMMO listed on https://emmo-repo.github.io/. The following table lists the public EMMO-based domain ontologies that we are aware of. Please create an issue if you have a public domain ontology that you think should be listed here.
The complexity of the EMMO ontology cannot be easily handled by non-experts in formal ontologies, which is the case for almost all experts in scientific and technical domains. For this reason, the conceptual framework available to the application domain experts has been reduced to the minimal number of concepts required to express the user cases while remaining manageable by ontology novices. The ontology subset is directly mapped to the overall EMMO ontology, so that it is possible to place the domain knowledge base within the larger and more logically complex EMMO framework.
In the following sections, the EMMO concepts that constitute the TBox foundations are introduced.
The EMMO is intrinsically four-dimensional, meaning that real-world entities are represented as always extending in 4D. The reasons for this choice are related to the intrinsically evolutionary nature of physical phenomena. One example is the concept of bond (which is behind every object definition), which requires persistent interactions between the bonded entities to be established over time.
Without entering into the details, a generic entity is represented graphically as in Figure 4, where the evolution in time is expressed by the horizontal extension, while the spatial extension is expressed by the vertical extension, and different entities are represented by polygons that may or may not overlap.
In the EMMO the object/process distinction is simply a matter of convenience, since in a 4D conceptualisation everything unfolds in time and stationarity depends upon the observer's time scale. However, it is still convenient to retain an object-process distinction, since it is naturally rooted in the common-sense way of discussing the world and may help domain experts understand the concepts.
More specifically, an entity is called a process if its defining class (or type) is expressed according to how it extends in time (focus on temporal evolution), or an object if its defining class is expressed according to how it persists in time (focus on spatial configuration). The same individual may then be a process or an object, or both, depending on the class to which it belongs. For example, the same 4D entity representing a human being, which is an object, can represent the process of aging, which is of course a process.
The OWL 2 DL classes Object and Process are then the fundamental classes used in constructing the initial domain or application ontology, as shown in Figure 5.
The mereocausality relationships are the backbone of the EMMO, being the union of a mereology and a causal theory. Mereology is the theory formalising the relations between a whole and its parts, through the fundamental concept of parthood. In constructing the domain or application ontology we make use of a simple subset of mereological relations: overlap, parthood, and spatial/temporal parthood, expressing intuitively their meaning through the above introduced graphical representation.
The relation of proper overlap occurs between two entities that share some of their parts, but they both still retain some parts that are not shared. Proper overlap is formalized by the symmetric object property isProperOverlapOf and provides several sub-relations as shown in Figure 6, when the object/process distinction is used to classify the related entities.
(continuous line standing for objects, dashed line for processes, and relations going from the grey to the white boxes)
The isAddedTo and isRemovedTo relations are used to represent user cases when a generic entity overlaps an object, within which its temporal evolution starts or ends. For example, they can be used to represent the injection of powders into an HVOF jet, or the ejection of molten particles from the same jet.
The isOutputOf and isInputOf relations are used to represent user cases when a generic entity overlaps a process, within which its temporal evolution starts or ends. These are the typical relations used to represent the sample coming out from an experimental procedure, or the gas feed used for a spraying process.
The affects and partakesIn relations are used to represent user cases when a generic entity overlaps an object, and it persists before and after the overlap. For example, these relations can be used to represent a component that is used in a device and then extracted and reused in another device.
The contributesTo and participatesTo relations are used to represent user cases when a generic entity overlaps a process, and it persists before and after the overlap. These are the typical relations used to represent a device, such as an HVOF torch, that participates in an experimental procedure.
Parthood occurs when two entities overlap but one is completely comprised within the other. By combining the concepts of process and object, we can introduce different types of sub-relations. The fact that the part covers (or not) the overall spatial extension of the whole leads us to the concepts of spatial and temporal parts, which are of paramount importance for the representation of actual real-world user cases. A summary of the parthood sub-relations (always antisymmetric) is shown in Figure 7.
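The graphical intuition behind proper overlap and spatial/temporal parthood can be sketched by modelling each 4D entity as a set of (time, space) grid cells. This is a deliberately simplified toy model, not the EMMO axiomatisation; the function names, the grid and the sample entities are assumptions made for illustration only.

```python
def cells(t_range, x_range):
    """Build a toy 4D extension as a set of (time, space) grid cells."""
    return {(t, x) for t in t_range for x in x_range}


def properly_overlaps(a, b):
    """Shared cells exist, and each entity keeps cells that are not shared."""
    return bool(a & b) and bool(a - b) and bool(b - a)


def is_proper_part(part, whole):
    """The part is completely comprised within the whole and is smaller."""
    return part < whole


def is_temporal_part(part, whole):
    """Proper part covering the whole's full spatial extension while it lasts."""
    if not is_proper_part(part, whole):
        return False
    times = {t for t, _ in part}
    return part == {(t, x) for t, x in whole if t in times}


def is_spatial_part(part, whole):
    """Proper part lasting as long as the whole without ever filling its space."""
    if not is_proper_part(part, whole):
        return False
    whole_times = {t for t, _ in whole}
    if {t for t, _ in part} != whole_times:
        return False
    return all(
        {x for t2, x in part if t2 == t} < {x for t2, x in whole if t2 == t}
        for t in whole_times
    )


# Hypothetical entities on a 10 x 6 grid (time on the first axis).
substrate = cells(range(0, 10), range(0, 4))
preparation = cells(range(0, 3), range(0, 4))   # early time slice of the substrate
surface = cells(range(0, 10), range(3, 4))      # top layer for the whole lifetime
powder = cells(range(2, 6), range(3, 6))        # partly inside, partly outside

print(is_temporal_part(preparation, substrate))
print(is_spatial_part(surface, substrate))
print(properly_overlaps(powder, substrate))
```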
The isConstituentOf and isSubjectOf relations are used to represent user cases when an object is a spatial or temporal part of another object respectively. For example, they can be used to represent the constituent parts of a device (e.g., the components of a characterisation system), or a particular configuration expressed by an object.
The isConstitutiveProcessOf and isBehaviourOf relations are used to represent user cases when a process is a spatial or temporal part of an object respectively. For example, they can be used to represent the constituent processes that make a device work (e.g., heat exchange), or a particular behaviour of a device (e.g., the pre-heating phase).
The isProperParticipantOf and isStatusOf relations are used to represent user cases when an object is a spatial or temporal part of a process respectively. For example, they can be used to represent an entity that participates in a process for the duration of the process itself (i.e., a role such as the experimentalist on a particular test), or a particular state of a process (e.g., a young man as temporal part of the overall human aging process).
The isSubProcessOf and isStageOf relations are used to represent user cases when a process is a spatial or temporal part of another process respectively. For example, they can be used to represent the constituent processes that make a process occur (e.g., heat exchange in an HVOF deposition), or a particular stage of a process.
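The eight parthood sub-relations described above can be summarised as a lookup keyed by the kind of the part, the kind of the whole, and whether the part is spatial or temporal. The relation names come from the text; the dictionary layout and the function name are only an illustrative convention.

```python
# (kind of part, kind of whole) -> relation name per type of parthood.
PARTHOOD_RELATIONS = {
    ("object", "object"): {
        "spatial": "isConstituentOf",
        "temporal": "isSubjectOf",
    },
    ("process", "object"): {
        "spatial": "isConstitutiveProcessOf",
        "temporal": "isBehaviourOf",
    },
    ("object", "process"): {
        "spatial": "isProperParticipantOf",
        "temporal": "isStatusOf",
    },
    ("process", "process"): {
        "spatial": "isSubProcessOf",
        "temporal": "isStageOf",
    },
}


def parthood_relation(part_kind, whole_kind, extension):
    """Return the sub-relation name for the given combination."""
    try:
        return PARTHOOD_RELATIONS[(part_kind, whole_kind)][extension]
    except KeyError:
        raise ValueError(
            f"unknown combination: {part_kind}, {whole_kind}, {extension}"
        ) from None


# The pre-heating phase of a device: a process that is a temporal part of an object.
print(parthood_relation("process", "object", "temporal"))
```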
Causality in the EMMO is an extremely powerful relation that extends from the elementary particle level, including the representation of quantum systems, up to the macroscopic level. A detailed discussion on Causality in EMMO can be found here. However, in most practical applications of EMMO it is sufficient to work with a reduced level of complexity, and in particular with a subset of relations that are more closely related to applications. The causal relations that have been included in the EMMO subset discussed below hence refer only to macroscopic entities, and are summarized in Figure 8 and Figure 9. Causality between macroscopic entities is expressed by two families of relations:
- Temporal causation, which is always asymmetric and expresses the evolution of entities, distinguishing between the causing entity and the affected entity. Direct causation, without intermediaries, is expressed by hasNext or hasNextStep relations, while causation with intermediate entities is expressed by precedes or hasSubsequentStep relations.
- Spatial causation, which is always symmetric and expresses the mutual influence between entities. The isAdjacentTo and communicatesWith relations express a direct interaction between entities, without intermediaries. The indirectlyAffects and indirectlyCommunicatesWith relations express an indirect interaction, with intermediaries. Spatial causation can be used to represent spatial configurations.
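The difference between direct and indirect temporal causation, and the symmetry of spatial causation, can be sketched as follows. The sketch assumes that indirect causation can be derived by following chains of hasNext links, which is an interpretation rather than a statement of the EMMO axioms; the stage order, the entity names and the function names are hypothetical.

```python
# Hypothetical hasNext links between preparation stages (direct causation).
HAS_NEXT = {
    "Roughening": ["Cleaning"],
    "Cleaning": ["Masking"],
    "Masking": ["ThermalSpraying"],
}

# Spatial causation is symmetric, so each pair is stored as an unordered set.
ADJACENT = {frozenset(("Torch", "Jet"))}


def reachable(cause, effect):
    """True if `effect` can be reached from `cause` by following hasNext links."""
    stack, seen = [cause], set()
    while stack:
        current = stack.pop()
        for nxt in HAS_NEXT.get(current, []):
            if nxt == effect:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def temporal_relation(cause, effect):
    """Return 'hasNext' for direct links, 'precedes' for chains, else None."""
    if effect in HAS_NEXT.get(cause, []):
        return "hasNext"
    return "precedes" if reachable(cause, effect) else None


def is_adjacent(a, b):
    """Symmetric lookup: the order of the arguments does not matter."""
    return frozenset((a, b)) in ADJACENT


print(temporal_relation("Roughening", "Cleaning"))
print(temporal_relation("Roughening", "ThermalSpraying"))
print(temporal_relation("Cleaning", "Roughening"))
print(is_adjacent("Jet", "Torch"))
```

The third call returns None because temporal causation is asymmetric: no chain leads from a later stage back to an earlier one.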
The EMMO is designed to formalise the way in which a property (e.g., physical quantity, names, pictures) is generated according to a particular procedure (e.g., observation, modelling, characterisation) by means of a semiotic based approach.
The semiotic triangle is shown in Figure 10, where the semiotic process (in the centre) hosts the semiotic object (the observed entity), the sign (the entity that stands for it) and the interpreter (the agent responsible for the generation of the sign). This general schema can represent modelling and characterisation activities, keeping track of the details of the generation process.
For example, a cold spray process (semiotic object) can be observed (semiosis) by an experimentalist (interpreter) to keep track of the running time (sign). A SEM (interpreter) can scan (semiosis) a coated substrate (semiotic object) and provide an image (sign). A microstructure (semiotic object) can be modelled (semiosis) by a simulation software (interpreter) to provide a prediction for its mechanical properties (sign).
More than one sign can be used to refer to the same entity, reflecting the fact that an entity can be the subject of several measurement or modelling investigations, which may also contradict each other. In fact, things like physical properties, names, attributes, location, time, or any data in general, are signs generated through a subjective semiotic process that holds only for a particular class of interpreters. In the EMMO a sign is always related to the agent behind the semiotic statement, and the process of generation of the sign is always documented, as shown in Figure 11.
In most domains and applications, we are interested in a particular subclass of signs, called properties, which are the ones obtained through a well-defined procedure of interaction (e.g., a characterisation procedure, a modelling workflow, using a physical model such as Fourier's law for heat conduction). Among all possible properties, we are more likely interested in the subclass of quantities, which are the properties that can be quantified and that are represented using units of measurement.
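The semiotic examples above can be written down as simple records in which every sign stays linked to its interpreter and to the process that generated it. The class and field names are illustrative choices rather than EMMO identifiers, and the last record is an invented second observation added only to show several signs referring to the same entity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Semiosis:
    """One semiotic process linking an object, an interpreter and a sign."""
    process: str
    semiotic_object: str
    interpreter: str
    sign: str


RECORDS = [
    Semiosis("observation", "cold spray process", "experimentalist", "running time"),
    Semiosis("scan", "coated substrate", "SEM", "image"),
    Semiosis("modelling", "microstructure", "simulation software",
             "predicted mechanical properties"),
    # Hypothetical extra record: a second interpreter looking at the same object.
    Semiosis("scan", "coated substrate", "second SEM", "second image"),
]


def signs_for(semiotic_object, records):
    """Return (interpreter, sign) pairs, keeping track of who produced each sign."""
    return [
        (record.interpreter, record.sign)
        for record in records
        if record.semiotic_object == semiotic_object
    ]


print(signs_for("coated substrate", RECORDS))
```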
To provide a comprehensive representational framework, that can be shared between different domains, the EMMO includes:
- the quantities formalised by the International System of Quantities ISO 80000[1], as shown in Figure 12
- a general metrological framework based on the International Vocabulary of Metrology[2], as shown in Figure 13
- a framework for the system of units based on the International System of Units (SI)[3], as shown in Figure 14
These EMMO modules, based on semiotics, enable the domain and application ontology to represent in a comprehensive way all the methodologies that can be used to generate information about a particular entity. The network of entities and information collected within the project (i.e., the mereocausally represented states of things and the semiotic processes used to observe them) is the knowledge base.
[1] https://www.iso.org/standard/76921.html
[2] https://www.bipm.org/documents/20126/54295284/VIM4_CD_210111c.pdf
[3] https://www.bipm.org/en/measurement-units
The combination of mereocausality relations and semiotic processes enables the representation of workflows that can be simulation workflows (MODA) or characterisation workflows (CHADA), or more generally any potential procedure that generates information about a system. The process of knowledge generation can be represented in the EMMO as shown in Figure 15, where the input (if present), the output and the subject of the observation are related together using mereocausal relationships.
This approach can be used to represent complex workflows, such as the ones that result from the concatenation of more than one MODA, as shown in Figure 16, and it is possible to focus on the data flow or the task flow representations. Moreover, it can go beyond the MODA and represent meta-modelling activities such as the usage of data-based models, when an AI approach is used to create surrogate models starting from experimental data or physics-based modelling data, as shown in Figure 17.
One of the most important challenges in developing a domain ontology for a specific community of users is to enable the community to express the concepts that make up their overall knowledge framework in a way that can be easily understood by them and by other potential users with similar or different backgrounds.
Gathering and formalising domain knowledge is done through the process of conceptualisation, i.e., by identifying ontological concepts in the form of classes, relations, and axiomatic constraints that cover the domain of interest. To overcome the barriers coming from the lack of expertise in ontology engineering, the OntoTrans project has developed a methodology for the interaction between the translator and the industrial stakeholders, aimed at facilitating collective contributions to the conceptualization effort.
Prior to starting with ontology implementation in semantic web technologies, domain and ontology experts are advised to collaboratively agree on the concepts and relations that need to be represented in order to achieve the desired objectives of the ontology. For further details on ontology design, including competency questions, see Reference: Poveda-Villalón, M.; Fernández-Izquierdo, A.; Fernández-López, M.; García-Castro, R. LOT: An Industrial Oriented Ontology Engineering Framework. Engineering Applications of Artificial Intelligence 2022, 111, 104755. https://doi.org/10.1016/j.engappai.2022.104755.
The conceptualisation board has been designed to be used in a collaborative online platform (such as MIRO[4]) to enable participants to contribute to the conceptualisation development and is shown in Figure 18. It guides participants from left to right to:
- introduce the class concepts needed to describe the user case
- express axioms to better define what classes represent, relating them using mereocausal relations
- introduce the properties used to determine (or characterise) the entities
- introduce the datatypes used to express the properties (e.g., double, string)
- define the serialisation for each datatype (e.g., file format)
- express how the properties are obtained by introducing the concept of knowledge generators
- formalise the workflows (e.g. materials process, manufacturing, modelling, characterisation) using the concepts introduced in the previous blocks
Besides that, it provides a place to list the references to domain literature (e.g., standards) and to propose labels and definitions. The board has been populated through a collaborative effort, involving all WP1 partners.
This deliverable focuses on the points from 1 to 3. T1.4 activities will cover the serialisation of the datatypes, and the populating of the knowledge base.
[4] https://miro.com
As an example, the first block has been populated by classes expressing relevant concepts in the field of thermal spraying technology from the CoBRAIN project, classifying them as object or process. The results are shown in Figure 19. For each class a set of labels, a definition, a comment, and a list of domain literature sources has been provided to better clarify the concept behind the class.
The second block has been populated by expressing the relations between classes using axioms in the Subject/Predicate/Quantifier/Object form (e.g., COATING isOutputOf some THERMALSPRAYING). The mereocausal relations are summarised on the left of the block, to facilitate the users, as shown in Figure 20. An example of user case in 4D diagram has also been provided and shown in detail in Figure 21. The example represents the temporal parts that constitute the sub-objects of a substrate going through the preparation stages (i.e., roughening, cleaning, tooling, and masking) and the thermal spraying deposition process. Besides that, it describes related devices (e.g., the thermal spraying system) and processes (e.g., gas feeding).
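Axioms in the Subject/Predicate/Quantifier/Object form are regular enough to be parsed mechanically. The sketch below splits such an axiom into its four parts and renders it as a subclass axiom in OWL Manchester syntax. Treating every board axiom as a SubClassOf restriction is an assumption made here, as is accepting the quantifier only in addition to the some shown in the example; the function names are hypothetical.

```python
from typing import NamedTuple

# "some" appears in the example above; "only" is added as an assumption.
QUANTIFIERS = {"some", "only"}


class Axiom(NamedTuple):
    subject: str
    predicate: str
    quantifier: str
    obj: str


def parse_axiom(text):
    """Split 'SUBJECT predicate quantifier OBJECT' into its four parts."""
    parts = text.split()
    if len(parts) != 4:
        raise ValueError(f"expected four tokens, got {len(parts)}: {text!r}")
    axiom = Axiom(*parts)
    if axiom.quantifier not in QUANTIFIERS:
        raise ValueError(f"unsupported quantifier: {axiom.quantifier!r}")
    return axiom


def to_manchester(axiom):
    """Render the axiom as a class frame with one SubClassOf restriction."""
    return (
        f"Class: {axiom.subject}\n"
        f"    SubClassOf: {axiom.predicate} {axiom.quantifier} {axiom.obj}"
    )


print(to_manchester(parse_axiom("COATING isOutputOf some THERMALSPRAYING")))
```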
It is important to understand that the representation of the user case itself is a form of knowledge without data, since it documents in a formal way the state of things occurring during a specific run. This arrangement of things can also be analysed using AI tools to find recurring patterns according to a specific KPI (e.g., to find the user case structure that provides samples with the highest hardness values).
Figure 20 - Relate block, expressing the mereocausality relations between classes as axioms (Subject/Predicate/Quantifier/Object)
The third block has been populated by listing all the properties that are relevant for the thermal spraying process, and then connecting the properties to each respective entity. For each property a reference to the ISQ, to the unit of measurement and to the literature sources has been provided, as documentation.
An example of a workflow, a simple simulation, is represented using the CoBRAIN subset in Figure 23, including the connection between the material and the model, the simulation input and output, the data structures, the individuals (i.e., the ABox) that stand for a specific simulation run, and the actual data serialisation.
Following the conceptualisation expressed in the conceptualisation board, an OWL 2 DL ontology has been created, to be used as a model in a graph database. As an example, the CoBRAIN ontology (which will be made public as soon as the final version is achieved) has the IRI https://www.cobrain-project.eu/thermalspraying.
The ontology has been developed using the Protégé tool for OWL 2 DL ontology development[5] as shown in Figure 24. The ontology provides a taxonomy of classes that can be used to represent the thermal spraying process, as shown in Figure 25, together with the subset of EMMO mereocausality and semiotic relationships, as shown in Figure 26. It also provides a taxonomy of classes for the representation of datatypes and properties, as shown in Figure 27.
The OWL 2 DL serialisation of the CoBRAIN ontology is provided as Annex I.
[5] https://protege.stanford.edu/
As an example, a dataset for nanoindentation is shown in Figure 28, listing the data and metadata foreseen by this application. The Excel table collecting all the dataset entries is shown in Figure 29, where each row represents a specific nanoindentation characterisation process. The mapping between the dataset Excel serialisation and the graph database is shown in Figure 30, where each column has been interpreted semantically according to the CoBRAIN ontology concepts and related to the other columns through the relations provided by the EMMO subset, using a Subject/Predicate/Object schema.
Using this approach, it is possible for the end users to store the data using traditional, easy to use tools (such as Excel spreadsheets), without the need for training in ontology engineering or data management. The spreadsheet will be imported into the knowledge graph thanks to the related mapping.
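The column-to-triple mapping can be sketched with the Python standard library, using CSV text in place of an Excel file. The column names, the sample values and the choice of predicates for each column pair are hypothetical and do not reproduce the actual CoBRAIN mapping.

```python
import csv
import io

# Hypothetical spreadsheet content: one row per nanoindentation run.
SHEET = """run_id,sample,hardness_GPa
NI-001,Sample-A,12.5
NI-002,Sample-B,14.1
"""

# Hypothetical mapping: (subject column, predicate, object column).
MAPPING = [
    ("sample", "isProperParticipantOf", "run_id"),
    ("hardness_GPa", "isOutputOf", "run_id"),
]


def rows_to_triples(sheet_text, mapping):
    """Turn every spreadsheet row into Subject/Predicate/Object triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(sheet_text)):
        for subject_col, predicate, object_col in mapping:
            triples.append((row[subject_col], predicate, row[object_col]))
    return triples


for triple in rows_to_triples(SHEET, MAPPING):
    print(triple)
```

Because the mapping is kept separate from the data, end users only edit the spreadsheet, while the mapping is maintained once by whoever manages the knowledge graph.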