Technopark Resident VISmart: Helping Machines Think Like Humans
Semantic information processing technologies and semantic web software will soon allow search engines to think almost like humans. Among other things, they will be able to make logical connections between various bits of data and combine databases into unified information systems. Better yet, this related data will be visualized in a form that a regular user can understand. The company VISmart, a resident of ITMO University’s Technopark, is working on several projects in this field. Recently it was joined by a scientist from Austria whose research will help VISmart improve its service for searching for and visualizing information using semantic technology.
VISmart, semantic technologies and Ontodia
VISmart is a resident of ITMO’s Technopark that develops semantic web solutions, including apps that make use of semantic technologies. Ontodia is one such application. Initially, the service’s purpose was to visualize semantic data in diagram form. Nowadays Ontodia is also used to display various kinds of graphs that visualize a wide range of data.
Semantic data is interrelated information recorded in the form of a so-called triplet: “subject-predicate-object”. The subject is an entity; the object is another entity connected to the subject through a semantic relation (the predicate). For example, the phrase “Olga Pavlova studies at ITMO University” can be represented by two entities, “Olga Pavlova” (the subject) and “ITMO University” (the object), related through the semantic connection “studies at” (the predicate).
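As a rough illustration of the triplet idea, the sentence above could be stored as a three-part record. This is only a minimal sketch: the `Triple` type and its field names are assumptions made for this example and do not describe how Ontodia stores its data.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement of semantic data (hypothetical helper type)."""
    subject: str    # the entity the statement is about
    predicate: str  # the semantic relation
    obj: str        # the entity the subject is connected to


# "Olga Pavlova studies at ITMO University" as a single triplet
triple = Triple("Olga Pavlova", "studies at", "ITMO University")

print(f"{triple.subject} --[{triple.predicate}]--> {triple.obj}")
```

Keeping every statement in the same three-part shape is what lets many small facts be linked together later into one graph.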
“Semantic technologies and the semantic web are innovations that are making their way from scientific research to real-world application. While working with ITMO University, we noticed that the industry lacks tools to help developers and regular people use semantic technologies in their work. For instance, for conventional database management systems (DBMS) there are already plenty of services that allow non-programmers to easily access the contents of any particular database. Programmers can also easily transform conceptual data schemas into database table structures. This led us to the decision to create a tool that would let users visualize semantic data as interactive graphs,” – comments Dmitry Pavlov, CFO of VISmart.
What this means is that users or programmers would be able to upload semantic data to Ontodia and convert it, say, to an XML-like file. The end product of such conversion is a graph with “vertexes” representing entities and “edges” reaching out from these vertexes towards other entities. Those entities can, in turn, become vertexes themselves and form more connections.
How is this useful? At the Almazov North-West Medical Research Center, a great deal of medical data is kept in a disjointed and unstructured fashion. If a doctor needs to retrace a patient’s treatment history, they have to dig through the digital archives. This makes it hard to compare how the same illness was treated in different patients, since one has to go back to the archives each time. Semantic technologies could be used to connect medical data in the form of graphs. For example, one could set up an illness – say, acute coronary syndrome – to be a “vertex”, with edges reaching out towards patients. The patients, in turn, are connected to the medications they were given.
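The medical example can be sketched in a few lines of code. Everything here is made up for illustration: the patient labels, the medication names, the predicate wording and the helper functions `build_graph` and `treatments_by_patient` are assumptions, not real data or part of any VISmart product.

```python
from collections import defaultdict

# Made-up records for illustration only; not real patient data.
triples = [
    ("acute coronary syndrome", "diagnosed in", "patient A"),
    ("acute coronary syndrome", "diagnosed in", "patient B"),
    ("patient A", "was given", "medication X"),
    ("patient B", "was given", "medication X"),
    ("patient B", "was given", "medication Y"),
]


def build_graph(triples):
    """Map each vertex to its outgoing edges as (predicate, target) pairs."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def treatments_by_patient(graph, illness):
    """Follow illness -> patients -> medications and collect the results."""
    result = {}
    for predicate, patient in graph.get(illness, []):
        if predicate == "diagnosed in":
            result[patient] = sorted(
                target
                for edge, target in graph.get(patient, [])
                if edge == "was given"
            )
    return result


graph = build_graph(triples)
print(treatments_by_patient(graph, "acute coronary syndrome"))
```

Once the records are linked this way, comparing treatments for one illness becomes a short walk along the edges instead of a fresh search through the archives.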
Gerhard Wohlgenannt, an assistant professor at the Vienna University of Economics and Business, has come to ITMO University for two years as part of the university’s Fellowship program. He will work with VISmart and the International Laboratory “Information Science and Semantic Technologies” on services that could enhance tools like Ontodia, and will conduct new, fundamental research in the field of the semantic web. The Austrian scientist will take part in the creation of a service that would let Ontodia process text search queries with more flexibility. VISmart also plans to collaborate with Dr. Wohlgenannt on developing semantic voice control technologies for electronic devices.
Processing text queries with semantic technology
“Let’s imagine a system that holds marital records in a small town. Its forms say something like ‘Ivan is a spouse of Masha’. There are many such records, so when you type in a search request like ‘Who is Ivan married to?’ the system will recognize the word ‘married’, but will not find any results, since its database only contains records with the word ‘spouse’. Combining semantic technologies with Natural Language Processing tools helps a system recognize that, at their core, the words ‘married’ and ‘spouse’ have close meanings, so the search results will be more accurate and comprehensive,” – explains Gerhard Wohlgenannt.
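The marital records example can be mimicked with a toy search function. This is a sketch under strong assumptions: the `RELATED_TERMS` table is written by hand here, whereas a real system would obtain such relations from an ontology or from Natural Language Processing tools, and the second record and the function name `answer` are invented for the example.

```python
import re

# Made-up records in the spirit of the example: (subject, predicate, object)
RECORDS = [
    ("Ivan", "spouse", "Masha"),
    ("Pyotr", "spouse", "Anna"),
]

# Hypothetical hand-written mapping from query words to stored predicates.
RELATED_TERMS = {"married": "spouse", "husband": "spouse", "wife": "spouse"}


def answer(question, expand=True):
    """Return the partners of any person named in the question.

    With expand=False the query words are used as-is (plain keyword search);
    with expand=True they are first mapped to related stored terms.
    """
    words = re.findall(r"\w+", question)
    predicates = set()
    for word in words:
        key = word.lower()
        predicates.add(RELATED_TERMS.get(key, key) if expand else key)
    names = set(words)
    matches = []
    for subject, predicate, obj in RECORDS:
        if predicate not in predicates:
            continue
        if subject in names:
            matches.append(obj)
        elif obj in names:
            matches.append(subject)
    return matches


print(answer("Who is Ivan married to?", expand=False))  # keyword search only
print(answer("Who is Ivan married to?"))                # with related terms
```

The keyword-only call finds nothing because no record contains the word “married”, while the expanded call reaches the “spouse” record.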
Semantic connections are made through ontologies. Those are files or documents that define the relations between various concepts which serve as objects of semantic connections. In addition, the system can automatically infer additional information from the definitions. For example, if there is a statement that “Ivan is a spouse of Masha”, then from the definition of the relation “spouse” the system will deduce that Ivan and Masha must both be persons, and also that neither of them can be married to another person at the same time. Dmitry Pavlov brings up another classic example: if the system has two triplets like “Socrates is a human” and “All humans are mortal”, it can reasonably conclude that “Socrates is mortal”.
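A very small inference loop shows how such conclusions can be derived mechanically. The sketch below assumes three simple rules (subclass, domain and range) loosely in the spirit of RDFS-style reasoning; the predicate names and the `infer` function are illustrative choices, and the rule that a person cannot have two spouses at once is deliberately left out.

```python
def infer(facts):
    """Apply the rules repeatedly until no new triplets appear."""
    facts = set(facts)
    while True:
        new = set()
        for subject, predicate, obj in facts:
            for s2, p2, o2 in facts:
                # "X is a C" and "C subclass of D" give "X is a D"
                if predicate == "is a" and p2 == "subclass of" and s2 == obj:
                    new.add((subject, "is a", o2))
                # "X p Y" and "p domain C" give "X is a C"
                if p2 == "domain" and s2 == predicate:
                    new.add((subject, "is a", o2))
                # "X p Y" and "p range C" give "Y is a C"
                if p2 == "range" and s2 == predicate:
                    new.add((obj, "is a", o2))
        if new <= facts:  # nothing new was derived, so stop
            return facts
        facts |= new


facts = [
    ("Socrates", "is a", "human"),
    ("human", "subclass of", "mortal"),  # "All humans are mortal"
    ("spouse", "domain", "person"),      # whoever has a spouse is a person
    ("spouse", "range", "person"),       # whoever is a spouse is a person
    ("Ivan", "spouse", "Masha"),
]

for triplet in sorted(infer(facts) - set(facts)):
    print(triplet)
```

The printed triplets are the ones that were never stated explicitly but follow from the definitions.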
This might not seem that impressive to an observer. Indeed, what is so special about that? Such a task seems easy to humans, who are able to form semantic connections. But the thing about computers is that they don’t work like the human brain does. Modern technologies of data search and analysis are based mostly on keywords. In the case of semantic technologies, a program can search for information based not only on keywords but also on other, related terms, create logical connections between the terms, and visualize them in a way that is accessible to humans. This would serve to make search results more precise. For example, if one were to enter the words “Arena” and “Zenit” in Google, they would probably be given a variety of articles and pictures about football clubs and a range of historical documents about arenas, be they circus or gladiatorial. Meanwhile, a search employing semantic technology would realize that, in this context, “Arena” stands for “Stadium”.
The structure of a semantic net. Credit: airportal.ru
How does such a system get calibrated? ITMO.NEWS asked Dr. Wohlgenannt if the creation of semantic nets and ontologies could be fully automated – if programs could be made to generate knowledge graphs on their own and create ontologies for a given domain, like the domain of furniture, using textual data found on the internet or provided by the developer.
“Yes, you could gather a range of documents on the kinds of tables, which chairs suit them, how tables can be used, etc. Then you can apply machine learning and linguistic methods to automatically extract an ontology, i.e., a domain model, from this information. But the result would not be perfect. Because of that, results from such systems need to be refined ‘manually’. It’s the same with people – even if two persons read the same text, they will interpret and understand it a little bit differently,” – says the scientist.
Future: semantic web
Sir Tim Berners-Lee, the inventor of the “World Wide Web”, thinks that the use of semantic technology to analyze and process information will be a huge improvement to search engines, among other things. Right now, if they want to find certain information on the web, users often have to sort through a great number of websites. For instance, says Dr. Wohlgenannt, one might want to find all the right-handed tennis players born in Russia after 1980. They would probably need to check a lot of webpages to gather the data they need. Semantic web technologies will provide the data and query mechanisms that do the work for the user. In addition, semantic technologies will help to integrate, connect and analyze not only text, but pictures, videos, audio, graphs and diagrams – similar to what the human brain does when someone like, say, a journalist, needs to process various information and compile it into a coherent story.
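The tennis player question can be phrased as a structured query over triplets. The sketch below uses invented placeholder players and plain Python filtering; on the actual semantic web such data is typically queried with a dedicated language such as SPARQL, which this example does not attempt to reproduce.

```python
# Invented placeholder data: (subject, predicate, object)
FACTS = [
    ("player_1", "occupation", "tennis player"),
    ("player_1", "plays", "right-handed"),
    ("player_1", "born in", "Russia"),
    ("player_1", "birth year", 1985),
    ("player_2", "occupation", "tennis player"),
    ("player_2", "plays", "left-handed"),
    ("player_2", "born in", "Russia"),
    ("player_2", "birth year", 1990),
    ("player_3", "occupation", "tennis player"),
    ("player_3", "plays", "right-handed"),
    ("player_3", "born in", "Russia"),
    ("player_3", "birth year", 1975),
    ("player_4", "occupation", "tennis player"),
    ("player_4", "plays", "right-handed"),
    ("player_4", "born in", "Austria"),
    ("player_4", "birth year", 1988),
]


def has(subject, predicate, value):
    """True if the exact triplet is among the known facts."""
    return (subject, predicate, value) in FACTS


def birth_year(subject):
    """Return the stored birth year of a subject, or None if unknown."""
    for s, predicate, value in FACTS:
        if s == subject and predicate == "birth year":
            return value
    return None


def right_handed_russian_players_born_after(year):
    """All subjects that satisfy every condition of the question at once."""
    subjects = sorted({s for s, _, _ in FACTS})
    return [
        s
        for s in subjects
        if has(s, "occupation", "tennis player")
        and has(s, "plays", "right-handed")
        and has(s, "born in", "Russia")
        and (birth_year(s) or 0) > year
    ]


print(right_handed_russian_players_born_after(1980))
```

Because each condition is checked against structured facts rather than page text, the answer comes from one query instead of a manual tour of many webpages.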
Ontodia software. Credit: ontodia.org
Semantic technologies can be useful for regular consumers, too. Maybe you’d like to apply for a tax deduction using the form 3-TIPI. This means collecting no fewer than five types of documents and copying all the info into an application form. This might take quite a while! Semantic web-based tools, when provided with the right data, can compile it all on their own and spare the user the routine work. The same principle would be used in combining databases into a unified information system.
This, in turn, can lead to the creation of semantic web agents, although that is not likely to happen in the near future. Such agents would exist in the form of software that could run errands for its users, like setting up a medical appointment while making sure that the clinic is near the user’s workplace, the doctor has high ratings, the clinic has all the necessary equipment, and so on. Semantic technologies can make our lives significantly easier.