Latent Semantic Indexing: What Is It?
For many years, search engine optimization centered on keyword density, a concept that is fairly
straightforward for novice article writers and web designers.
The two main factors that affected search engine ranking
were backlinks and keyword usage.
However, things changed when Google introduced AdSense. Entrepreneurs soon realized that a lot of
money could be made by creating web pages built specifically to display AdSense ads.
You could make thousands of dollars a day simply by generating as many pages as feasible with web page
generation tools built for this purpose. Duplicated content was prevalent, and many websites offered
little or no value to the visitor, who was constantly bombarded with AdSense ads.
Latent semantic indexing (LSI) was introduced primarily to combat this problem and to ensure that users of
Google and other search engines got real value from the websites they were sent to.
Consequently, with the advent of latent semantic indexing, Google and other search engines de-listed many
websites as being of little or no value to users. Typically, the only difference between the pages on such sites
was a change of keywords. As a result, a large number of internet marketers lost their income overnight.
Initially, latent semantic indexing was used with AdSense to ensure that the adverts displayed on a page
matched the theme of that particular page. The algorithm establishes the theme of a page by analyzing the
wording on the page. Only later did Google apply the algorithm to search engine ranking, and other search
engines made use of it too.
Latent semantic indexing analyzes the natural-language wording of a page, including related words and
synonyms, to establish its subject matter. It is not a substitute for keyword analysis; instead, it complements
it. In addition to extracting the keywords contained in a document, it analyzes the document as a whole to check
whether some of its key words also appear in other documents.
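
To make the idea of checking key words against other documents more concrete, here is a minimal Python sketch. It is only an illustration, not how Google or any other search engine does it: the three page names and their text are made up, and words are simply split on whitespace.

```python
from collections import Counter

# Hypothetical toy pages; real web pages would be far longer.
pages = {
    "page_a": "door lock key",
    "page_b": "door handle key",
    "page_c": "canal lock boat",
}

def term_document_counts(texts):
    """Map each term to {page name: count}, a sparse term-document table."""
    table = {}
    for name, text in texts.items():
        for term, count in Counter(text.lower().split()).items():
            table.setdefault(term, {})[name] = count
    return table

counts = term_document_counts(pages)

# Keep only the terms that occur in more than one page.
shared = {term: sorted(cols) for term, cols in counts.items() if len(cols) > 1}
print(shared)
```

In this toy example, 'lock' turns up both on a page about doors and on a page about canals, so the word alone does not settle the theme of either page. It is the other words on each page that do.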
Because LSI is premised on a set of mathematical rules, it can give results that are mathematically
justifiable but carry little meaning in natural language.
So, what exactly is latent semantic indexing, and how can a layperson understand it better?
Latent means that something is known to be present although not apparently visible. As used in latent semantic
indexing, it means that a word like ‘lock’ may be part of a text but with hidden meaning until it is made known by
Semantic refers to meaning or interpretation of language or words, rather than what is actually written or
Indexing, as used in latent semantic indexing, is the extraction of the meaning of a document from its subject
matter and listing it in a form that search engines can use.
That’s basically it. Latent semantic indexing looks at how words are used, how they are distributed across
documents, and how they are interpreted.
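
For readers who want to see the mathematics at work, the following is a rough Python sketch of the underlying idea. It rests on several assumptions that are not part of this article: the five short documents are made up, plain word counts are used with no weighting, only two latent dimensions are kept, and a simple power iteration stands in for the singular value decomposition routine that a numerical library would normally provide.

```python
import math

# Hypothetical toy corpus: three "door" documents and two "canal" documents.
docs = [
    "door lock key",    # 0
    "door handle key",  # 1
    "handle hinge",     # 2: shares no word with document 0
    "canal boat",       # 3
    "boat water",       # 4
]

def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_eigenpairs(matrix, k, iterations=500):
    """Approximate the k largest eigenpairs of a small symmetric matrix
    using power iteration with deflation."""
    n = len(matrix)
    work = [row[:] for row in matrix]
    pairs = []
    for _ in range(k):
        vec = [1.0] * n
        for _ in range(iterations):
            nxt = [sum(work[i][j] * vec[j] for j in range(n)) for i in range(n)]
            length = math.sqrt(sum(x * x for x in nxt))
            vec = [x / length for x in nxt]
        value = sum(vec[i] * work[i][j] * vec[j] for i in range(n) for j in range(n))
        pairs.append((value, vec))
        # Deflate: subtract the component just found so the next pass finds the next one.
        for i in range(n):
            for j in range(n):
                work[i][j] -= value * vec[i] * vec[j]
    return pairs

vocab = sorted({word for text in docs for word in text.split()})
# raw[d] is document d as a vector of word counts.
raw = [[text.split().count(word) for word in vocab] for text in docs]
# Document-by-document dot products; the eigenvectors of this matrix are the
# right singular vectors of the term-document matrix.
gram = [[sum(a * b for a, b in zip(u, v)) for v in raw] for u in raw]

pairs = top_eigenpairs(gram, k=2)  # keep two latent dimensions
# Coordinates of each document in the latent space: singular value times eigenvector entry.
latent = [[math.sqrt(value) * vec[d] for value, vec in pairs] for d in range(len(docs))]

print("raw    0 vs 2:", round(cosine(raw[0], raw[2]), 3))
print("latent 0 vs 2:", round(cosine(latent[0], latent[2]), 3))
print("latent 0 vs 3:", round(cosine(latent[0], latent[3]), 3))
```

On raw word counts, documents 0 and 2 look unrelated because they share no words. In the reduced space they land almost on top of each other, since both share words with document 1, while the canal documents stay apart. This is the "latent" relationship the name refers to, although a corpus this small and tidy is far kinder than real web pages.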
Latent semantic indexing has been applied in three main areas: relevance feedback for search engines, archiving,
and automated writing assessment.