WHAT IS SEMANTIC SEARCH?

©Neo4j

Search is changing. If you are a long-time user of the Internet and, more specifically, of search engines like Google or Bing, you have seen how the way we look for information has changed over the past few years: queries are moving towards more natural, spontaneous language. Currently, the most accurate way to deliver best-matching results is semantic search, i.e. the ability to put typed searches into context. At Palenio we use semantic fingerprinting to organize the world’s information and make it universally accessible and useful.

Unlike traditional algorithms, which depend on people sorting through pages that are tagged with related keywords, semantic search goes further. It seeks to understand the query on a deeper level and deliver the exact information the user is looking for, based on the intentional and contextual meaning of the words the user chooses. It is therefore important to understand the difference between intent and context. Intent comes from the user, who states explicitly what he or she is looking for. Context, on the other hand, is everything that surrounds a search request and gives it meaning. By understanding and connecting intent and context, search engines are able to interpret different queries based on what motivates them and what is expected of them.

Palenio's algorithm focuses on the understanding phase of the search, which facilitates the subsequent phases by limiting the number of documents in the index and helping to show the best possible results. Reinforcing the understanding phase means that we can pay closer attention to the context in which a search is performed, and look at the way concepts appear in documents and how they relate to each other. With semantics, words are no longer just letters put together; they become concepts. What was once a single unit now carries a number of connotations that give it a concrete meaning. This is achieved through a process of disambiguation. In computing terms, word-sense disambiguation (WSD) is an open problem in natural language processing and ontology. It is the process of identifying which sense of a word is used in a sentence when the word has multiple meanings. To disambiguate a term means to clearly differentiate its meanings and identify the contexts in which each one can be used.
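
To make the idea of disambiguation concrete, here is a minimal sketch of one classic approach, a simplified Lesk-style overlap count: the sense whose definition shares the most words with the sentence wins. This is not Palenio's algorithm. The sense inventory, the stopword list and the function names are hypothetical and exist only for illustration; a real system would draw on a full lexical resource.

```python
import re

# Hypothetical toy sense inventory: each sense of a word maps to a short definition.
SENSES = {
    "bank": {
        "financial institution": "an institution that accepts deposits and lends money to customers",
        "river edge": "the sloping land beside a river or lake",
    }
}

# Hypothetical minimal stopword list, so that function words do not count as evidence.
STOPWORDS = {"a", "an", "the", "of", "to", "and", "or", "that", "in", "on", "at", "i", "my", "by", "we"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def disambiguate(word, sentence):
    """Return the sense of `word` whose definition overlaps most with the sentence."""
    context = tokenize(sentence) - {word}
    best_sense, best_overlap = None, -1
    for sense, definition in SENSES[word].items():
        overlap = len(context & tokenize(definition))
        if overlap > best_overlap:  # on a tie, the first listed sense is kept
            best_sense, best_overlap = sense, overlap
    return best_sense
```

With this toy inventory, `disambiguate("bank", "We walked along the river bank")` selects the `river edge` sense, because "river" appears in both the sentence and that definition. Exact word overlap is a crude signal ("deposit" and "deposits" do not match here), which is one reason WSD remains an open problem.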

As identifying keywords alone was not enough, the need to understand how data is related became more apparent. This marked the shift from ubiquitous keywords to the increasingly important entities and so-called semantic search. Words became concepts, and search engines evolved into genuine learning machines. Delivering the most relevant results means looking at search entities to establish individual user patterns that help identify the importance of various documents and influence how information is displayed. When establishing the context in which a query occurs, Palenio takes into account a number of factors, such as:

  • User search history

  • User location: depending on where the user is, the search engine is able to discern which results are more appropriate for him or her

  • User demographic: depending on specific characteristics of the user, the search engine is able to discern which results are more appropriate for him or her

  • Global search history: searches carried out consecutively with, or close in time to, another search

  • Relationships within a large amount of previously stored data

  • Query characteristics: spelling, variations, etc.

  • Domains linked from documents on the same topic

  • Co-occurrence of terms and distance between them
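
The last factor in the list, co-occurrence of terms and the distance between them, can be illustrated with a short sketch. It counts how often two terms appear within a fixed window of each other and records the smallest distance seen. The function name, the `window` parameter and the tokenization rule are assumptions made for this example, not a description of how Palenio computes it.

```python
import re


def cooccurrence(text, window=3):
    """Map each unordered term pair to (count, minimum distance in tokens).

    Two terms co-occur when they are at most `window` tokens apart.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    stats = {}
    for i, left in enumerate(tokens):
        # Only look ahead, so each pair of positions is counted once.
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            right = tokens[j]
            if left == right:
                continue  # a term does not co-occur with itself
            pair = tuple(sorted((left, right)))
            count, min_dist = stats.get(pair, (0, window + 1))
            stats[pair] = (count + 1, min(min_dist, j - i))
    return stats
```

Called on a query log or a document, the result shows which terms tend to appear together and how tightly: a pair that is seen often and at distance 1, such as "thai" and "food", is a stronger hint of a single concept than two terms that merely share a page.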

What started with typing a word on the keyboard of our computer has now led to making a voice request through Siri, Sherpa, Cortana or Google Now, as a wide range of devices offers us different search inputs. These advances have moved us from earlier queries like ‘restaurants in Berlin’ to more specific queries such as ‘where to eat Thai food in Berlin’ or ‘what is the best place to eat Thai food in Berlin’. This shows how search queries have evolved: they have become longer and more specific, they demand more personalised content and they call for instant results.