Modern research results and methodological principles of knowledge representation
H. Abdulrab, Professor of Computer Science, LITIS laboratory, INSA (Rouen, France).
E. Babkin, Associate Professor, Faculty of Business Informatics and Applied Mathematics, State University — Higher School of Economics (Nizhny Novgorod); LITIS laboratory, INSA (Rouen, France).
This article surveys modern methods of knowledge representation that can be effectively applied in modern intelligent information systems. Major attention is given to the very promising concept of ontology and to the corresponding formal and software-engineering methods for ontology manipulation.
Introduction
Modern techniques of Knowledge Representation (KR) place special emphasis on the effective resolution of practical problems in Web information retrieval, software integration and semantic interoperability. To cope with challenging issues in these domains, many original approaches have been proposed that combine "pure" classical KR techniques with advanced methods of information modeling coming from operations research and software engineering. In our opinion this situation illustrates the wide acceptance of a multi-aspect approach to complex systems modeling, in which complex systems are considered from different points of view and for each point of view the most suitable representation technique is chosen. Undoubtedly, models with multiple aspects give us a more comprehensive understanding of the universe of discourse, but at the same time we must jointly observe a large number of research directions that are usually considered separately.
1. Ontology
Among the descendants of conceptual graphs, the technique of ontology now plays the leading role in knowledge representation. According to [SOW00], an ontology provides strong support for the detailed study of all potentially possible entities and their interrelations in some domain of discourse shared by multiple communities; an ontology also enables conceptualization and the forming of categories of entities committed to by those communities. This direct connection of the ontology technique to integration is pointed out by Y. Kalfoglou: "An ontology is an explicit representation of a shared understanding of the important concepts in some domain of interest" [KAL01]. In the world of program systems only those concepts exist which can be represented in a formal form. So, in general, an ontology declaratively defines a terminological dictionary and expresses some logical theory (assertions, axioms and inference rules); in its terms the queries and statements that various agents exchange during interaction are expressed. There are plenty of conceptual statements characterizing the vital features of ontology and its meaning for knowledge representation, knowledge engineering, knowledge management, qualitative modeling, language engineering, database design, information retrieval and extraction [MGU02]. It has become customary to refer to Gruber's pioneering definition of an ontology as "a specification of a conceptualization" [GRU95]. In [DRA06] one can find a good collection of more recent cross-references. Swartout and Tate offer an informal and metaphorical but extremely useful definition for understanding the essentials of an ontology:
Ontology is the basic structure or armature around which a knowledge base can be built. [SWA99].
One can see that the modern pragmatic employment and definition of ontology has gone far from the original, purely philosophical notion of ontology as the most general theory of being as such, originally developed by the ancient Greek philosophers Socrates and Aristotle. Although modern philosophical works continue this speculative tradition (e.g. [STR75]), practical applications of ontology require its embodiment in formal mathematical forms. As a result, [DRA06] states that "...ontologies are growing fast into a distinct scientific field with its own theories, formalisms, and approaches" and proposes the studies of [STA04] for a more comprehensive coverage of the field. Following these proposals, we need a well-formed mathematical description of ontology. One well-defined and elegant mathematical theory of ontology was developed at the Institute AIFB at the University of Karlsruhe [EHR07]. That theory defines a core ontology (the intensional aspect of the domain of discourse) as a mathematical structure S = (C, ≤C, R, σ, ≤R), where:
C is a set of concept identifiers (concepts for short);
R is a set of relation identifiers (relations for short);
≤C is a partial order on C, called concept hierarchy or taxonomy;
σ is a function σ: R → C×C called signature, such that
σ(r) = (dom(r), ran(r)) for each r ∈ R, where dom(r) is the domain and ran(r) is the range of r;
≤R is a partial order on R, called relation hierarchy, such that r1 ≤R r2 if and only if dom(r1) ≤C dom(r2) and ran(r1) ≤C ran(r2).
Domain-specific dependencies between concepts and relations in S are formulated in a certain logical language (e.g. first-order predicate calculus) which fits the following rather generic definition:
Let L be a logical language. An L-axiom system for a core ontology is a pair A = (AI, α), where AI is a set of axiom identifiers and α is a mapping α: AI → L.
The elements of A are called axioms. The extensional definition of the domain of discourse (assertions or facts about instances and relations) is given by the description of the knowledge base KB, which is the following structure:
KB = (C, R, I, ιC, ιR), where:
C is a set of concepts;
R is a set of relations;
I is a set of instance identifiers (instances for short);
ιC is a function ιC: C → P(I) called concept instantiation;
ιR is a function ιR: R → P(I×I) called relation instantiation; it has the property: ∀r ∈ R, ιR(r) ⊆ ιC(dom(r)) × ιC(ran(r)).
The theory also provides names for concepts and relations, calling them signs, and defines a lexicon for the ontology:
Lex = (GC, GR, GI, RefC, RefR, RefI), where:
GC is a set of concept signs;
GR is a set of relation signs;
GI is a set of instance signs;
RefC is a relation RefC ⊆ GC×C called lexical reference for concepts;
RefR is a relation RefR ⊆ GR×R called lexical reference for relations;
RefI is a relation RefI ⊆ GI×I called lexical reference for instances.
In summary, a complete ontology O is defined by the following structure: O = (S, A, KB, Lex), where S is a core ontology, A is the L-axiom system, KB is a knowledge base, and Lex is a lexicon.
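The structures above translate almost directly into code. The following Python fragment is a minimal sketch of our own (the toy domain, names and values are illustrative assumptions, not part of the Karlsruhe theory itself); it encodes a tiny core ontology and knowledge base and checks the relation-instantiation property ιR(r) ⊆ ιC(dom(r)) × ιC(ran(r)):

```python
# Minimal sketch of the Karlsruhe-style ontology structures (illustrative only).
concepts = {"Person", "Doctor"}
relations = {"hasChild"}
leq_C = {("Doctor", "Person")}                   # Doctor <=_C Person (taxonomy)
signature = {"hasChild": ("Person", "Person")}   # sigma(r) = (dom(r), ran(r))

# Knowledge base: the extensional part.
instances = {"alice", "bob"}
inst_C = {"Person": {"alice", "bob"}, "Doctor": {"bob"}}   # iota_C
inst_R = {"hasChild": {("alice", "bob")}}                  # iota_R

def relation_instantiation_ok(r):
    """Check iota_R(r) subset-of iota_C(dom(r)) x iota_C(ran(r))."""
    dom, ran = signature[r]
    return all(a in inst_C[dom] and b in inst_C[ran] for a, b in inst_R[r])

print(all(relation_instantiation_ok(r) for r in relations))  # prints True
```

A real implementation would of course also represent the lexicon Lex and the axiom system A; the sketch only shows how the set-theoretic definitions become executable checks.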
The presented strict mathematical theory of ontology, along with other similar approaches, developed a solid foundation for the machine-readable representation of ontologies in modern information systems and facilitated their practical application to complex classification problems in such domains as the Internet, bio-informatics, and enterprise management [DAV06, REC06, DIE06]. Over the years a broad group of specific machine languages for ontology representation was developed, such as Loom [MGR91], KIF [GEN07] and Ontolingua [GRU92]. In recent years W3C XML became one of the obvious forms of ontology representation and an interchange format between different components in the framework of distributed systems (for example, the XOL language [KAR07]).
The FIPA standard (Foundation for Intelligent Physical Agents) [FIP07] gives a bright example of XML-based ontologies supporting multi-agent communication in distributed systems. For communication between autonomous agents FIPA defines a special Agent Communication Language (ACL). Among other sections, an ACL message may contain a fragment of the agent's world description, expressed in the form of an XML-based ontology.
In order to apply ontologies to the semantic annotation of Web resources, RDF [MAN07] and RDF Schema (RDF(S)) were originally developed by the W3C as a refinement of XML. Now RDF and RDF(S) are widely used as an interoperable ontology representation language in the Semantic Web, which "...is designed to smoothly interconnect personal information management, enterprise application integration, and the global sharing of commercial, scientific, and cultural data" [BER05]. In the near future the Web Ontology Language (OWL) [SMI04] is expected to take over from RDF as the ontology representation language of the Semantic Web due to its greater expressive power and support for multiple modeling methods. For example, OWL is capable of representing the extensional part of a full ontology (facts about concrete individuals and relations between them), and of modeling hierarchical relations and certain domain-specific restrictions on attributes and relations. To support formal reasoning, OWL has strong ties to the powerful logic-based KR technique of description logics, which will be considered in the next section. Recently considerable efforts were made to extend the static ontology model of OWL with a dynamic rule language; as illustrative examples, SWRL, the Semantic Web Rule Language [HOR04], and the LP/F-logic rule language can be pointed out.
2. Description Logics
It should be clear that a proper selection of a rich logical language dramatically improves the expressive capabilities of an ontology. We already stated that first-order logic provides intuitively clear formal foundations for knowledge representation and reasoning. But traditional formal logical theories such as the first-order predicate calculus (FOPC) have a vital shortcoming preventing their broad use in practical circumstances: FOPC is not decidable. The research community spent a considerable amount of effort to cope with that problem and to develop a subset of FOPC that is computationally efficient but expressive enough for knowledge representation and practical applications.
In the history of achieving this goal one can recognize a number of outstanding results. Chronologically, the KL-ONE system [BRA85] and CLASSIC [BRA91] by R.J. Brachman were proposed first. Later, Frame Logic (F-logic) was developed by M. Kifer [KIF89, KIF93]. Further, great interest was expressed in advancing the KL-ONE and CLASSIC systems. As a result, a special class of sound and complete logics was developed that combines major concepts of advanced semantic nets (primarily Sowa's Conceptual Graphs) and frames. This class of logics was called Description Logics (DL) [DLO07] (synonyms are Concept Languages and Terminological Languages). Since the 80's R. Brachman, H. Levesque, F. Baader, I. Horrocks and other authors have published a number of fundamental works in the area of DL. Multiple theoretical and practical results enabled the development of a coherent DL-based KR technique that is capable of practical application in "...virtually all class-based representation formalisms used in Artificial Intelligence, Software Engineering, and Databases" [BAA03]. As far as the DL approach follows the general principles of the logic-based technique, we can use the same generic mathematical structure S introduced earlier to survey DL syntax and semantics:
S = (T, P, A, B, I).
DL formalizes the structure of the domain of discourse in the commonly accepted terms of concepts (classes), roles (properties, relations) and assertions about individuals. In DL the set T of basic lexical elements consists of atomic concepts and atomic roles. The former have the lexical form of unary predicates (with one free variable), and the latter have the lexical form of binary predicates (with two free variables). Usually the universal concept ⊤ and the absurd concept ⊥ are added to the set of domain-specific concepts.
The set of composed syntactic forms of DL consists of syntactically correct logical formulae containing unary predicates, binary predicates and the usual logical connectives. In comparison with FOPC, DL poses additional restrictions on the set P of permitted combination rules in order to prevent an undecidable logical theory, to avoid the need for explicit use of variables, and to provide easy expression of some useful pragmatic concepts (such as counting). In DL the set of logical axioms A (or knowledge base) specifies the terminology of the described domain and individual assertions. For historical reasons the terminological subset of A is called the T-Box and the assertion subset is called the A-Box.
Let us, for illustrative purposes, consider the application of the smallest propositionally closed DL, called ALC [SCH91], to the terminological description of a simple domain. In ALC the set of rules P consists of:
♦ logical connectives (intersection of concepts ⊓, union of concepts ⊔, negation of concepts ¬);
♦ restricted quantifiers (∀, ∃);
♦ only atomic roles.
As a result, ALC expresses the formal meaning of the sentence "Person all of whose children are either Doctors or have a child who is a Doctor" as follows:
Person ⊓ ∀hasChild.(Doctor ⊔ ∃hasChild.Doctor).
The formal semantics of DL is based on the Tarskian model-theoretic approach. In the general case of FOPC this means that an interpretation I is a triple (D, Ic, Iv) with the following properties:
D is a non-empty set, the domain of the interpretation;
Ic is a function which assigns to every predicate constant P of arity m (m = 1, 2) a function Ic(P): D^m → {TRUE, FALSE};
Iv is a function which assigns to every variable some element of D.
Inductive extension of I permits interpretation of composed syntax forms.
Considering DL specifics, F. Baader proposes a slightly different but equivalent definition [BAA03]:
An interpretation I consists of a non-empty set Δ^I (the domain of the interpretation) and an interpretation function which assigns to every atomic concept A a set A^I ⊆ Δ^I and to every atomic role R a binary relation R^I ⊆ Δ^I × Δ^I.
The inductive extension of I is formulated as follows:
⊤^I = Δ^I
⊥^I = ∅
(¬A)^I = Δ^I \ A^I
(C ⊓ D)^I = C^I ∩ D^I
(C ⊔ D)^I = C^I ∪ D^I
(∀R.C)^I = {a ∈ Δ^I | ∀b. (a, b) ∈ R^I → b ∈ C^I}
(∃R.⊤)^I = {a ∈ Δ^I | ∃b. (a, b) ∈ R^I}
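These semantic equations translate almost line by line into a small model evaluator. The Python sketch below is our own illustration (the toy domain mary/ann/eve/ivan and the tuple encoding of concept expressions are assumptions for the example, not part of any DL system); it computes the extension of ALC concept expressions over a fixed finite interpretation and evaluates the Person example discussed above:

```python
# Toy ALC interpretation: a finite domain, atomic concepts and atomic roles.
domain = {"mary", "ann", "eve", "ivan"}
concept_I = {"Person": {"mary", "ann", "eve", "ivan"}, "Doctor": {"ann"}}
role_I = {"hasChild": {("mary", "ann"), ("mary", "eve"), ("eve", "ivan")}}

def ext(c):
    """Extension c^I of a concept expression, following the inductive rules."""
    if c == "TOP": return set(domain)
    if c == "BOT": return set()
    if isinstance(c, str): return concept_I[c]     # atomic concept
    op = c[0]
    if op == "not": return domain - ext(c[1])
    if op == "and": return ext(c[1]) & ext(c[2])
    if op == "or":  return ext(c[1]) | ext(c[2])
    if op == "all":   # (all, R, C): every R-successor lies in C
        return {a for a in domain
                if all(b in ext(c[2]) for (x, b) in role_I[c[1]] if x == a)}
    if op == "some":  # (some, R, C): some R-successor lies in C
        return {a for a in domain
                if any(b in ext(c[2]) for (x, b) in role_I[c[1]] if x == a)}

# Person ⊓ ∀hasChild.(Doctor ⊔ ∃hasChild.Doctor)
expr = ("and", "Person",
        ("all", "hasChild", ("or", "Doctor", ("some", "hasChild", "Doctor"))))
print(sorted(ext(expr)))  # ['ann', 'ivan']
```

Note that mary is excluded because her child eve is neither a Doctor nor a parent of a Doctor in this interpretation; a tableau reasoner decides such questions for all interpretations, whereas this sketch merely evaluates one fixed model.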
In accordance with the content of the general logical structure S, we have to consider the specifics of the set of inference rules B for DL. As stated in [BAA03], "The basic inference on concept expressions in Description Logics is subsumption." It is typically written as C ⊑ D. The semantics of subsumption postulates that the concept denoted by D is more general than the concept denoted by C. Another typical inference on concept expressions is concept satisfiability, which is the problem of checking whether a concept expression does not necessarily denote the absurd concept ⊥. Equivalence and disjointness are also mentioned among the inference goals.
Discussing DLs, one must bear in mind that there is actually a plentiful family of Description Logics differing in their level of expressiveness. The members of the DL family extend the basic capabilities of ALC in various ways. I. Horrocks in his lectures gives the following classification and corresponding abbreviations for frequently used DLs:
♦ S — ALC with transitive roles;
♦ Sx — a sub-family with additional extensions, where the suffix x stands for:
♦ H — role hierarchy (e.g. hasDaughter ⊑ hasChild);
♦ O — nominals/singleton classes (e.g. {Italy});
♦ I — inverse roles (e.g. isChildOf ≡ hasChild⁻);
♦ N — number restrictions (e.g. ≤2 hasChild);
♦ Q — qualified number restrictions (e.g. ≤2 hasChild.Doctor);
♦ F — functional number restrictions (e.g. ≤1 hasMother).
Suffixes can follow each other describing rather expressive DL variants. For example the abbreviation SHIQ defines a basic variant extended by transitive roles, role hierarchy, inverse roles, and qualified number restrictions.
In particular, the SHIQ dialect [HOR99] is the logical foundation of the W3C OWL ontology language mentioned before. W3C defined three sublanguages of OWL with increasing complexity and different underlying DL dialects:
♦ OWL Lite presents a subset that allows for easy inferencing and corresponds to the SHIF(D) variant;
♦ OWL DL is more expressive and corresponds to the SHOIN(D) variant;
♦ finally, OWL Full uses unrestricted first-order expressiveness.
It is obvious that the limited expressiveness of DL families in comparison with FOPC gives advantages for practical applications. So-called tableau algorithms [SCH91] permit decidable DL reasoning in principle. However, in [BAA03] F. Baader argues that for certain families of DLs "...a careful analysis of the algorithms for structural subsumption shows that they are sound, but not always complete in terms of the logical semantics: whenever they return "yes" the answer is correct, but when they report "no" the answer may be incorrect". Computational efficiency issues also introduce obstacles to the straightforward application of very expressive families of DLs. For practitioners wishing to apply DLs, in [BAA03] the same author advises to "...study formally and methodically the tradeoff between the computational complexity of reasoning and the expressiveness of the language, which itself is defined in terms of the constructs that are admitted in the language". In reality this means that many attractive DLs are ExpTime-complete (e.g. SHIQ) and only a few polynomial-complexity DL dialects for very restricted domains have been developed so far (e.g. CEL [BAA06]).
Focusing on our main topic of information modeling, we are interested in DL capabilities for the semantic enrichment of modern information models (e.g. UML). To the best of our knowledge, the results are not so optimistic from the practical point of view: although one can define a formal mapping between UML and DL, reasoning in the resulting DL is ExpTime-complete [BER05a].
3. Constraints
The constraint satisfaction technique is one of the advanced methods in modern operations research and knowledge representation. This technique attracted our attention because it enables the representation of different real-life problems in a declarative and generic form. One of the distinguished researchers in this area, Eugene C. Freuder, remarked that
"Constraint programming represents one of the closest approaches computer science has yet made to the Holy Grail of programming: the user states the problem, the computer solves it."
It is also an interesting fact that constraint satisfaction has been identified by the ACM (Association for Computing Machinery) as one of the strategic directions in computing research.
In its most general form, the technique of constraint satisfaction considers problems defined by a finite set of variables and a finite set of constraints. Each variable takes values from a certain finite domain, while constraints, represented in the form of logical relations among several variables, restrict the combinations of values that the variables can take together. Constraints can also be heterogeneous, binding unknowns from different domains, for example a length (number) with a word (string). An important feature of constraints is their declarative manner, i.e., they specify what relationship must hold without specifying a computational procedure to enforce that relationship. Therefore the constraint satisfaction problem can be represented as a triplet P = (X, D, C), where:
♦ X = {x1, ..., xn} is a set of n variables;
♦ D = {d1, ..., dn} is a set of n domains;
♦ C = {c1, ..., cp} is a set of p explicitly given constraints; in the most general form each constraint is a logical predicate c(x1, ..., xk) of arbitrary arity k that maps the Cartesian product d(x1) × ... × d(xk) to {0, 1}. As usual, the value 1 means that the value combination for x1, ..., xk is allowed, and 0 otherwise.
The main task is to find a value vi of each variable xi from the permitted domain di in such a way that all constraints are satisfied:
∀i, vi ∈ di, and ∀c(x1, ..., xk) ∈ C: c(v1, ..., vk) = 1.
To find a feasible solution, constraints are used actively to reduce domains by filtering out values that cannot take part in any solution. This process is called constraint propagation, domain filtering, pruning or consistency technique. Constraint propagation can be used to solve the problem completely, but this is rarely done due to efficiency issues. It is more common to combine an efficient but incomplete consistency technique with non-deterministic search. The highly developed class of constraint satisfaction problems includes centralized binary and non-binary CSPs.
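The combination of incomplete propagation with search can be made concrete. The following minimal Python solver is our own illustration (not a production constraint library); it interleaves backtracking search with forward checking on a toy binary CSP: three variables over {0, 1, 2} with the constraints x < y and y < z:

```python
# Minimal CSP solver: backtracking search + forward checking (illustrative).
def solve(domains, constraints, assignment=None):
    """domains: {var: set(values)}; constraints: {(u, v): predicate(u_val, v_val)}."""
    assignment = assignment or {}
    if len(assignment) == len(domains):
        return dict(assignment)
    var = next(v for v in domains if v not in assignment)
    for value in sorted(domains[var]):
        assignment[var] = value
        # Forward checking: prune the domains of unassigned neighbours.
        pruned = {u: {w for w in domains[u]
                      if ((var, u) not in constraints or constraints[(var, u)](value, w))
                      and ((u, var) not in constraints or constraints[(u, var)](w, value))}
                  for u in domains if u not in assignment}
        if all(pruned.values()):               # no domain was wiped out
            result = solve({**{v: domains[v] for v in assignment}, **pruned},
                           constraints, assignment)
            if result:
                return result
        del assignment[var]                    # backtrack
    return None

doms = {"x": {0, 1, 2}, "y": {0, 1, 2}, "z": {0, 1, 2}}
cons = {("x", "y"): lambda a, b: a < b, ("y", "z"): lambda a, b: a < b}
print(solve(doms, cons))  # {'x': 0, 'y': 1, 'z': 2}
```

Here propagation alone does not solve the problem; it only shrinks the domains after each tentative assignment, and the search fills in the rest, which mirrors the common practice described above.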
The powerful notion of the constraint can be applied both to the representation of relationships inside the information model of a separate software component and to the representation of design restrictions in the context of the whole distributed software system. The former feature enables a more comprehensive semantic annotation of UML models in comparison with pure OCL, and the latter brings new principles of communication between autonomous agents onto the scene. Modern advances in distributed constraint satisfaction problems (DCSP) show their abilities. Multiple commercial and free implementations of constraint solvers with different functionality and problem domains are widely used in research and industrial applications. Among the different specializations of the generic CSP task, logic-based constraint satisfaction methods have the greatest impact for the purposes of our research.
In this area we found the Alloy system [JAC06] to be the most suitable tool for expressing complex object-oriented structural constraints and behavior in software systems. Alloy has been developed by the Software Design Group at MIT. The first Alloy prototype came out in 1997, and by now it has evolved into a mature simulation and verification system.
The foundations of Alloy's development and application are closely related to the principal view of the software development process as the design of abstractions. Alloy gives practical methods for implementing an integration methodology because it allows formal specification of software abstractions with subsequent simulation and checking. Simulation of the model returns instances of states or executions that satisfy a given constraint (a model of the logical theory), while checking returns instances of the model which violate the specified constraints.
As Alloy can be selected for practical application in our research, let us describe precisely the major principles of the Alloy language, model simulation and model checking. In order to illustrate the natural correspondence between UML models and Alloy language expressions, we will use the following simple information model (fig. 1).
Figure 1. The UML model of TMF SID High Level Concepts
Alloy consists of a structural modeling language based on first-order logic and of a Java-based constraint solver for model analysis and verification. The modeling language is rooted in the well-known formal specification language Z [JAC96], but it offers different modeling capabilities, such as inheritance and reuse of formulas, which facilitate a declarative object-oriented description of the problem.
Alloy's modeling capabilities are shaped by a sort of relational logic, which combines the first-order predicate calculus with the operators of the relational calculus and uses the same representation of functions as relations that Zermelo-Fraenkel set theory proposes. As a result, the users of Alloy can select the most appropriate formal approach to defining the structure and behavior of the system among the following alternatives:
♦ predicate calculus;
♦ relational calculus;
♦ navigation expression style.
In Alloy the whole universe of discourse is modeled in terms of atoms and relations. Atoms model indivisible, constant entities, while relations of various arities represent meaningful relationships and dynamic aspects. The expressive means of the logic include:
♦ set constants (empty set, universal set, identity relation);
♦ commonly accepted set-theoretical operators (union, intersection, subset inclusion, etc.);
♦ relational operators (product, join, transpose, transitive closure, etc.).
Non-trivial constraints over the relations can be built from the usual logical operators, quantifiers, specific multiplicity constraints restricting the basic relational operators, and cardinality constraints.
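To make the relational flavor concrete, the following Python sketch of our own implements three of the listed operators on binary relations represented as sets of pairs (the `parent` example data is an assumption for illustration; the operators correspond to Alloy's `.`, `~` and `^`):

```python
# Relations as sets of pairs; three relational operators used by Alloy-style logics.
def join(r, s):
    """Relational join (Alloy's dot): {(a, c) | (a, b) in r and (b, c) in s}."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def transpose(r):
    """Swap the columns of a binary relation (Alloy's ~r)."""
    return {(b, a) for (a, b) in r}

def closure(r):
    """Transitive closure (Alloy's ^r): iterate joins until a fixpoint."""
    result = set(r)
    while True:
        extended = result | join(result, r)
        if extended == result:
            return result
        result = extended

parent = {("ann", "bob"), ("bob", "cid")}
print(join(parent, parent))               # {('ann', 'cid')} — grandparent pairs
print(("ann", "cid") in closure(parent))  # True
```

Alloy evaluates such expressions symbolically over bounded scopes rather than over explicit sets, but the algebra is the same.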
The practice-oriented modeling language of Alloy facilitates easy declaration of logical sentences in a textual object-oriented form, and provides convenient means to organize large models into tractable components and to manage simulation or verification. In the Alloy modeling language the basic building block is called a signature. For example, the signatures listed below describe the structure of the UML classes from the model of fig. 1.
In principle each signature represents a set of atoms. Because a signature closely corresponds to entities of an ontology or classes of UML models, it may also define fields, which are in fact new relations.
Specific constraints of the modeled domain are expressed in terms of facts, predicates and functions. The facts and predicates listed below define constraints over the UML classes of fig. 1 which are induced by the relations between the classes.
fact
{
  My0Manyto0Many[physicalresource,
                 LogicalResource,
                 PhysicalResource]
}

fact
{
  My0Manyto01[physicaldevice,
              Hardware,
              PhysicalDevice]
}

pred My0Manyto01(r: univ -> univ, t: set univ, u: set univ)
{
  some x: t | some y: u | y in x.r
  lone y: u | some x: t | y in x.r
  all disj p, q: t, a: u | a in p.r => a !in q.r
}

fact physicalresource_logicalresource
{ physicalresource = ~logicalresource }

fact physicaldevice_hardware { physicaldevice = ~hardware }

pred Model() { }
Assertions denote questionable properties of the domain, which are verified during model analysis.
Finally, testing of the model is performed with the help of a pair of control commands: run and check. The run command starts the Alloy Analyzer in order to find a corresponding model within the given scope. For example, suppose that the following command is issued:
run Model for 6
In this case the Alloy Analyzer tries to find instances of the signatures, within a scope of six atoms per signature, which satisfy all the defined facts. The process finishes successfully and the Alloy Analyzer returns several solutions, which may be visualized as fig. 2 shows.
In Alloy the flexible "separation of concerns" principle is used for the implementation of simulation and checking. To verify the model against constraints, the Alloy Analyzer
abstract sig Resource { }

sig PhysicalResource extends Resource
{ logicalresource: LogicalResource }

sig LogicalResource extends Resource
{ physicalresource: PhysicalResource }

sig LogicalDevice extends LogicalResource { }

sig PhysicalDevice extends PhysicalResource
{ hardware: Hardware }

sig Hardware extends PhysicalResource
{ physicaldevice: PhysicalDevice }
Figure 2. Found solutions of run command
translates the definitions and constraints of the model into boolean constraints and feeds them to an external satisfiability (SAT) solver, which solves the resulting finite-domain constraint satisfaction problem via search in the state space.
Although pure first-order predicate logic is undecidable, Alloy permits the definition of limits for the search space, giving the user the ability to specify the scope of the problem in terms of the maximum number of objects of each type. For such a limited case, the Alloy Analyzer performs a scope-complete analysis of the model. As a result it returns either instances that make all facts true, or counterexamples for which at least one assertion fails.
It is widely accepted that for many non-trivial models even a rather small scope (number of objects) can produce a huge number of states. But, as the authors of Alloy note, the best solvers can examine spaces with more than 10^50 states. Such advances in pure constraint satisfaction permit the development of complex original models in the Alloy language for many interesting real-life problems. The authors call the approach "lightweight formal methods", because it tries to obtain the benefits of traditional formal methods at a lower cost, and without requiring a big initial investment.
4. Theory of Information Flow
Although the constraint satisfaction technique introduces general principles for resolving discrepancies between different agents in the context of an information system, methods of achieving semantic interoperability remain non-obvious. In fact, we need a more precise understanding of how constraints posed on local ontologies and information models stipulate global regularities in the distributed system and shape the information flow between different components. In such a situation, following [BAR97], we can repeat the question:
"How is it that information about some components of a system carries information about other components of the system?"
To define formal foundations for answering that question, the mathematical theory of Information Flow was proposed by J. Barwise and J. Seligman [BAR97]. From the beginning, the theory studies distributed systems of various natures consisting of separate components (or agents) that interact with each other. In the framework of the Information Flow Theory (IFT) the components are usually loosely coupled, and they have their own world models, which are represented in terms of vocabularies and classifications. At the same time, the interaction of components requires the existence of shared physical objects, symbols, signs and messages. All these means of communication are tokens in the IFT. So, in fact, each component of the distributed system expresses its own world model in the form of a certain classification. In such a classification, a subset of the shared tokens is classified in accordance with a private set of vocabulary types. In addition to the private classification, a component can define certain logical constraints representing specific structural or behavioral issues of its local world. During interaction, components exchange tokens and interpret their meaning in accordance with their classifications. The close relationship between concepts (types) and tokens is highlighted in the Second Principle of Information Flow:
"Information flow crucially involves both types and their particulars."
Let us see how such informal speculations can be reformulated precisely in terms of the IFT, in accordance with the original definitions of the authors [BAR97].
A classification is a triple A = (A, Σ_A, ⊨_A), where A is the set of objects to be classified, called the tokens of A; Σ_A is the set of types used to classify the tokens; and ⊨_A is a binary relation between A and Σ_A that determines the particular classification of the tokens. In the IFT a classification is usually drawn as a diagram with the types Σ_A above the tokens A, connected by ⊨_A.
In the IFT, domain-specific constraints on the set of types are represented in the rather general form of entailments:
Γ ⊢_A Δ,
where Γ and Δ are subsets of the types of Σ_A. Such entailments are usually expressed in the form of sequents (Γ, Δ). The whole set of defined sequents is called a theory.
In any sequent of the theory, the subset Γ is treated conjunctively and the subset Δ is treated disjunctively. This means that a token a of A satisfies (Γ, Δ) if and only if: a being of type α for every α ∈ Γ entails a being of type β for some β ∈ Δ.
We also say that Γ entails Δ in A, written Γ ⊢_A Δ, if every token a of A satisfies (Γ, Δ).
One can combine a certain classification A, a theory ⊢_L, and the subset N_L of tokens satisfying the theory (they are called normal tokens) into one structure L called a local logic:
L = (A, ⊢_L, N_L).
The local logic L is sound if every token is normal, and it is complete if every sequent that holds of all normal tokens is in the consequence relation ⊢_L.
In the IFT the notion of infomorphism gives a mathematical model of the whole-part relationship between the instances of a whole, as modeled by a classification C, and those of a part, as modeled by a classification A. If A = (A, Σ_A, ⊨_A) and C = (C, Σ_C, ⊨_C) are classifications, then an infomorphism f: A ⇄ C is a pair f = (f^, fv) of contravariant functions, where f^ is defined on the types (f^: Σ_A → Σ_C) and fv is defined on the tokens (fv: C → A).
These functions should satisfy the following biconditional:
∀α ∈ Σ_A, ∀c ∈ C: fv(c) ⊨_A α if and only if c ⊨_C f^(α).
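The biconditional is easy to check mechanically. The Python sketch below is a toy example of our own (the switch/lamp classifications and all names are assumptions made for illustration); it builds two small classifications and verifies that a candidate pair (f^, fv) is an infomorphism:

```python
# Two toy classifications: a "part" A (switch states) and a "whole" C (circuits).
types_A = {"On", "Off"}
tokens_A = {"s1", "s2"}
models_A = {("s1", "On"), ("s2", "Off")}      # the relation |=_A

types_C = {"Lit", "Dark"}
tokens_C = {"c1", "c2"}
models_C = {("c1", "Lit"), ("c2", "Dark")}    # the relation |=_C

f_up = {"On": "Lit", "Off": "Dark"}           # f^ on types: Sigma_A -> Sigma_C
f_down = {"c1": "s1", "c2": "s2"}             # fv on tokens: C -> A

def is_infomorphism():
    """Check: fv(c) |=_A alpha  iff  c |=_C f^(alpha), for every alpha and c."""
    return all(((f_down[c], a) in models_A) == ((c, f_up[a]) in models_C)
               for a in types_A for c in tokens_C)

print(is_infomorphism())  # prints True
```

The contravariance is visible in the code: types travel from the part A up to the whole C, while tokens travel from C back down to A, exactly as in the diagram of the definition.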
The notion of infomorphism for classifications can be easily extended to the complete structures of local logics. If L = (A, ⊢_L, N_L) and C = (C, ⊢_C, N_C) are local logics, then a logic infomorphism f: L ⇄ C is a pair f = (f^, fv) of contravariant functions defined on the sets of tokens and types as before.
Special restrictions are defined for these functions:
1. ∀α ∈ Σ_A, ∀c ∈ C: fv(c) ⊨_A α if and only if c ⊨_C f^(α).
2. ∀Γ, Δ ⊆ Σ_A: if Γ ⊢_L Δ, then f^[Γ] ⊢_C f^[Δ].¹⁾
In the IFT classifications and infomorphisms play a crucial role in modeling information flow in a distributed system. Separate components of the system are represented by an indexed family cla(A) = {A_i}, i ∈ I, of classifications. The fact of connecting these components in the framework of the whole system is modeled by an indexed family inf(C) = {f_i: A_i ⇄ C}, i ∈ I, of infomorphisms with a common codomain C. The mathematical structure inf(C) is called an information channel, and the classification C is called the core classification (or just the core) of the channel. The simplest case of a distributed system with two components can be represented as follows:
         C
     f1 ↗ ↖ f2
    A1       A2

1) f^[Γ] and f^[Δ] denote the sets of images of Γ and Δ along the function f^, respectively.
The core C of the channel and its associated theory provide a way to model regularities of information flow inside the distributed system.
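The channel construction suggests a direct way to relate types of different components: two local types carry the same information when the infomorphisms map them to the same core type. A minimal Python sketch, with all component and type names invented for illustration:

```python
# A two-component channel: f1: A1 -> C and f2: A2 -> C share the core C.
# Only the type maps f^ are needed to illustrate the idea.
f1_up = {"conn": "Link", "node": "Device"}   # types of A1 -> types of C
f2_up = {"wire": "Link", "box": "Device"}    # types of A2 -> types of C

def connected(t1, t2):
    """Type t1 (of A1) and type t2 (of A2) are linked through the core
    when both are mapped to the same core type."""
    c = f1_up.get(t1)
    return c is not None and c == f2_up.get(t2)

print(connected("conn", "wire"))  # True: both map to core type "Link"
print(connected("conn", "box"))   # False: "Link" differs from "Device"
```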
5. Conclusion
Over the years a highly developed and mature theory of static knowledge representation, together with its applications, has emerged. Recent research has done much for close integration of ontology-based KR techniques with practical information modeling. UML and ontology can be fused together in order to produce consistent foundations for hierarchical knowledge representation. A practical consequence of such fusion may be methods and tools for semantically rich hierarchical modeling of information flows and knowledge in terms of meta-models and meta-ontologies.
In principle, for modern distributed computer systems semantic interoperability can be supported by the use of ontologies and the family of Description Logics, but we need to pay special attention to determining proper trade-offs between computational efficiency and expressive power. These circumstances call for continued research into practically tractable logical theories, ontology reasoning techniques, as well as heuristic algorithms and specialized mathematical theories such as Barwise's theory of information flow and the relational logic of Alloy. ■
Bibliography
[SOW00] Sowa J.F. Knowledge Representation: Logical, Philosophical, and Computational Foundations. Brooks Cole, Pacific Grove, CA. 2000.
[KAL01] Kalfoglou Y. Exploring ontologies. in Handbook of Software Engineering and Knowledge Engineering, Vol. 1, Fundamentals, ed. S.K. Chang, World Scientific, Singapore. 2001. pp. 863-887.
[MGU02] McGuinness D.L. Ontologies come of age. in Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential, eds. D. Fensel, J. Hendler, H. Lieberman, & W. Wahlster, MIT Press, Boston, MA. 2002. pp. 1-18.
[GRU95] Gruber T. R. Toward Principles for the Design of Ontologies Used for Knowledge Sharing. International Journal of Human-Computer Studies, No. 43(5/6). 1995. pp. 907-928.
[DRA06] Dragan G., Dragan D., Vladan D. Model Driven Architecture and Ontology Development. Springer-Verlag. (ISBN: 978-3-54032180-4). 2006.
[SWA99] Swartout W.R., Tate, A. Guest editors' introduction: Ontologies. IEEE Intelligent Systems, vol. 14, no. 1. 1999. pp. 18-19.
[STR75] Strawson P.F., Bubner R. Semantik und Ontologie. Vandenhoeck & Ruprecht. 1975.
[STA04] Staab S., Studer, R., eds. Handbook on Ontologies. Springer, Berlin, Heidelberg. 2004.
[EHR07] Ehrig M. Ontology Alignment : Bridging the Semantic Gap. Series: Semantic Web and Beyond , Vol. 4. Springer-Verlag. (ISBN 978-0-387-32805-8). 2007.
[DAV06] Davies J., Studer R., eds. Semantic Web Technologies. John Wiley & Sons, Ltd. 2006.
[REC06] Rector A., Rogers J. Ontological and Practical Issues in Using a Description Logic to Represent Medical Concept Systems: Experience from GALEN. In Barahona et al. (eds.); Reasoning Web, LNCS 4126. 2006. pp. 197-231.
[DIE06] Dietz J. L.G. Enterprise Ontology Theory and Methodology. Springer-Verlag. (ISBN 978-3-540-29169-5). 2006.
[MGR91] MacGregor R. Inside the LOOM classifier. SIGART Bulletin, vol. 2, No. 3. 1991. pp. 70-76.
[GEN07] Genesereth M.R., Fikes R.E. Knowledge Interchange Format, Version 3.0, Reference Manual. Online: http://www-ksl.stanford.edu/knowledge-sharing/papers/kif.ps. 2007.
[GRU92] Gruber T.R. Ontolingua: A Mechanism to Support Portable Ontologies. Knowledge Systems Laboratory, Stanford University, CA. 1992.
[KAR07] Karp R., Chaudhri V., Thomere J. XOL: An XML-Based Ontology Exchange Language. Technical Report, SRI International. Online: http://www.ai.sri.com/~pkarp/xol/xol.html. 2007.
[FIP07] FIPA. Online: http://www.fipa.org. 2007.
[MAN07] Manola F., Miller E. RDF Primer, W3C Recommendation. Online: http://www.w3.org/TR/REC-rdf-syntax/. 2007.
[BER05] Berners-Lee, Tim. The Semantic Web: An interview with Tim Berners-Lee. Consortium Standards Bulletin, 4(6).2005.
[SMI04] Smith M. K., Welty C., McGuinness D. L. OWL Web Ontology Language Guide. W3C Recommendation 10 February 2004.
[HOR04] Horrocks I., Patel-Schneider P. F. A proposal for an OWL rules language. In Feldman, Stuart I., Uretsky, Mike, Najork, Marc, and Wills, Craig E., editors. Proc. of the 13th Intrl. Conf WWW-2004. New York, NY, USA. ACM Press. 2004. pp. 723-731.
[BRA85] Brachman R.J., Schmoltze J.G. An overview of the KL-ONE knowledge representation system. Cognitive Science, vol. 9, No. 2. 1985. pp. 171-216.
[BRA91] Brachman R.J., McGuinness D.L., Patel-Schneider P.F., Resnick L.A., Borgida A. Living with CLASSIC: When and how to use a KL-ONE like language. In Principles of Semantic Networks, ed. J. Sowa, Morgan Kaufmann, San Francisco, CA.1991. pp. 401-456.
[KIF89] Kifer M., Lausen G. F-Logic: A Higher-Order Language for Reasoning about Objects, Inheritance and Scheme. In Proc. of ACM SIGMOD Intl. Conference on Management of Data, Portland. 1989. pp. 134-146.
[KIF93] Kifer M., Wu J. A logic for programming with complex objects. Journal of Computer and System Sciences, No. 47(1). 1993. pp.77-120.
[DLO07] Description Logics Web Site. Online: http://dl.kr.org. 2007.
[BAA03] Baader F. et al. The Description Logic Handbook: Theory, Implementation and Applications. Cambridge University Press. (ISBN: 0-521-78176-0). 2003.
[SCH91] Schmidt-Schauß M., Smolka G. Attributive concept descriptions with complements. Artificial Intelligence. No. 48(1). 1991. pp. 1-26.
[HOR99] Horrocks I., Sattler U., Tobies S. Practical reasoning for expressive description logics. In Proc. of the 6th Intl. Conference on Logic for Programming and Automated Reasoning (LPAR'99), number 1705 in Lecture Notes in Artificial Intelligence, Springer-Verlag, 1999. pp. 161-180.
[BAA06] Baader F., Lutz C., Suntisrivaraporn B. CEL-A Polynomial-Time Reasoner for Life Science Ontologies. U. Furbach and N. Shankar (Eds.): IJCAR 2006, LNAI 4130. 2006. pp. 287-291.
[BER05a] Berardi D., Calvanese D., Giacomo G. Reasoning on UML class diagrams. Artificial Intelligence. No. 168 .2005. pp.70-118.
[JAC06] Jackson D. Software Abstractions: Logic, Language and Analysis. The MIT press, Cambridge, Massachusetts. 2006.
[JAC96] Jacky J. The Way of Z: Practical Programming with Formal Methods. Cambridge, UK, Cambridge University Press. 1996.
[BAR97] Barwise J., Seligman J. Information Flow. Cambridge University Press, 1997.
Integration of object-oriented software engineering models on the basis of the ontology
transformation approach
H. Abdulrab,
Professor of computer science, LITIS laboratory, INSA (Rouen, France). E. Babkin,
Associate Professor, Faculty of Business Informatics and Applied Mathematics, State University — Higher School of Economics (Nizhny Novgorod). LITIS laboratory, INSA (Rouen, France).
This article proposes a combination of information flow theory and relational logic for achieving semantic interoperability during integration of heterogeneous UML models. The suggested approach is applied to integration of the CIM and SID object-oriented models which are now widely used in telecommunications.
Integration of object-oriented engineering models on the basis of ontology transformation
H. Abdulrab,
Professor of computer science, LITIS laboratory, INSA (Rouen, France). E. Babkin,
Cand. Sc. (Tech.), Associate Professor, Department of Information Systems and Technologies, Faculty of Business Informatics and Applied Mathematics, Nizhny Novgorod branch of SU-HSE. LITIS laboratory, INSA (Rouen, France).