Today’s Internet is a unique, borderless repository of knowledge that offers an affordable answer to virtually any question. In practice, however, no more than a quarter of all Internet users are able to use search engines effectively and correctly. In many cases people get absolutely useless information in response to their one- or two-keyword queries. Even though search engine algorithms are more sophisticated today than ever, they are still based on elementary principles and cater to the most popular keywords and keyword clusters. Our experiment in the article “Search engine relevancy. 10 years of search engine researches: What do we get?” shows that some multi-keyword queries are tough nuts to crack for existing search engines.
Correct and meaningful search would be impossible without some imitation of human understanding. This is a problem not only for search engine developers but also a stumbling block for linguists working in the field of machine (automated) translation. While great minds are struggling with the hard problems of artificial intelligence, the major search engines have adopted a simpler way of estimating relevance.
In 2003 Google acquired Applied Semantics, whose CIRCA technology became the core of AdSense. The heart of CIRCA is a scalable language ontology built on a large database of text corpora, with a special focus on the meanings of words and on how each word’s meaning relates to the meanings of other words. The relations that matter most for practical on-site SEO boil down to the following:
- Similarity (“affordable” is similar to “inexpensive”)
- Hypernymy (is a kind of / has kind) (“eagle” has kind “bald eagle”)
- Causation (“work” causes “results”)
- Meronymy (whole/part relations) (“laptop” has part “keyboard”)
- Substance (“lumber” has substance “wood”)
- Synonymy / Antonymy (“cheap” is a synonym of “inexpensive”)
- Product (“Microsoft Corporation” produces “Microsoft Word”)
- Attribute (“fast” is an attribute of “speed”)
- Membership (“king” is a governor of “kingdom”)
- Lateral bonds (concepts closely related to one another, e.g. “nut” and “lock nut”)
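To make the idea of word relations more concrete, here is a minimal sketch of how such an ontology could be stored and used to widen a keyword. It is not how CIRCA or Google works internally; the class ToyOntology, the relation names and the sample data are all hypothetical and chosen only to mirror the list above.

```python
from collections import defaultdict


class ToyOntology:
    """Hypothetical toy ontology: typed relations between words."""

    def __init__(self):
        # (source word, relation name) -> set of related words
        self._edges = defaultdict(set)

    def add(self, source, relation, target):
        self._edges[(source, relation)].add(target)

    def related(self, source, relation):
        # .get() avoids creating empty entries for unknown words
        return sorted(self._edges.get((source, relation), set()))

    def expand(self, word, relations=("similar_to", "has_kind", "has_part")):
        """Return the word plus every term one hop away over the given relations."""
        terms = {word}
        for relation in relations:
            terms.update(self._edges.get((word, relation), set()))
        return sorted(terms)


# Illustrative data only, taken from the examples in the list above.
ontology = ToyOntology()
ontology.add("affordable", "similar_to", "inexpensive")
ontology.add("eagle", "has_kind", "bald eagle")
ontology.add("laptop", "has_part", "keyboard")
ontology.add("lumber", "has_substance", "wood")
ontology.add("Microsoft Corporation", "produces", "Microsoft Word")

print(ontology.expand("laptop"))  # ['keyboard', 'laptop']
```

For on-site SEO the practical takeaway is the expand step: a page about laptops that also mentions related terms such as “keyboard” gives a relation-aware engine more evidence of what the page is about.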
In every language, and particularly in English, there are many words with multiple meanings. Take “bow”, the first word that came to my mind. It can mean a weapon for shooting arrows, a rod used to play string instruments, or the front part of a ship. Semantic analysis allows search engines to automatically analyze and deduce a word’s meaning from the words that surround it. In on-site optimization it is vital to understand these semantics-related algorithms in order to make the optimization as effective as possible and ultimately improve the visibility of the website’s resources.
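As an illustration of deducing meaning from a word’s environment, here is a minimal sketch that picks a sense of “bow” by counting which sense’s cue words appear in the surrounding text. This is a simplified overlap heuristic, not the algorithm any search engine actually uses; the SENSES table and the function names are assumptions made up for this example.

```python
import re

# Hypothetical sense inventory for "bow"; cue words are invented for illustration.
SENSES = {
    "weapon": {"arrow", "archer", "shoot", "string", "hunt", "target"},
    "music": {"violin", "cello", "strings", "play", "rosin", "orchestra"},
    "ship": {"ship", "stern", "deck", "sail", "anchor", "hull"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def guess_sense(context, senses=SENSES):
    """Return the sense whose cue words overlap most with the context, or None."""
    words = tokenize(context)
    scores = {sense: len(words & cues) for sense, cues in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(guess_sense("The archer drew his bow and let the arrow fly"))   # weapon
print(guess_sense("Waves broke over the bow of the ship near the anchor"))  # ship
```

The sketch shows why surrounding copy matters: the same keyword on a page about archery and on a page about sailing ends up with different meanings purely because of the neighbouring words.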
P.S. Some webmasters say that if you put a “~” sign before the query, Google will deliver semantically relevant results. I played with this sign, and it did return a different number of relevant pages, with my website taking different positions in the results. I wonder whether the claim is true, but so far it works, and it can be useful if you want to check what Google thinks about the relevancy of your site and its pages to a certain query.