Latent semantic indexing (LSI) is just information retrieval with a new twist. The new development is not LSI itself, which was patented over 20 years ago; it is the use of LSI on the internet.
Information retrieval has been a problem for thousands of years because writing things down is not as difficult as finding them later when you need them. Shortly after writing was invented, someone probably had a difficult time finding the clay tablet with the inventory of the king's treasure, and the index had to be invented. Indexes have their problems, as anyone who has used a library's card catalog or the index at the back of a book knows. Indexes can only capture broad topics, and they are only as good as the choices made by the creator of the index. The bit of information you need to solve your problem may be in a book, but you will never find it if it didn't get linked to the right heading in the card catalog or selected by the person creating the book's index.
Very soon after computers were invented, people started using them for document management. At first, only the equivalent of a card catalog was computerized, because computers were so expensive. As the price of computers dropped and their processing power increased, using computers to do full-text indexing became possible. Early indexing software threw out all the extremely common words and built an index recording which documents contained each of the remaining words. When a user entered a word or words in the search field, the information retrieval system responded with a list of all the documents that contained those words.
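The early full-text approach described above can be sketched in a few lines of Python. This is only an illustration, not any particular product's code: the stop-word list, the sample "tablets", and the function names `build_index` and `search` are all made up for the example.

```python
from collections import defaultdict

# Hypothetical stop-word list; real systems use a much longer one.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "on", "the", "to"}


def build_index(documents):
    """Map each remaining word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            word = word.strip(".,;:!?\"'()")
            if word and word not in STOP_WORDS:
                index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every non-stop query word."""
    words = [w for w in query.lower().split() if w not in STOP_WORDS]
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Made-up sample collection.
documents = {
    "tablet1": "The inventory of the king's treasure",
    "tablet2": "An inventory of grain in the royal storehouse",
    "tablet3": "The treasure is hidden in the storehouse",
}
index = build_index(documents)
print(sorted(search(index, "inventory of treasure")))  # prints ['tablet1']
```

Note that this kind of index only matches exact words: a search for "gold" finds nothing here, even though a tablet about the king's treasure is probably what the searcher wants.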
A breakthrough came from research done for machine translation (translation software). Those researchers stopped trying to understand the text and started looking at the relationships among the words, much like the way a human comprehends language. Translation software picks what word to suggest by looking at the surrounding words, using information like "if spring has the words flowers or autumn nearby, it's a season; if spring has the words tension or adjust nearby, it's a coiled bit of metal." It's not a perfect method, but it's darned good.
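Here is a toy version of the "spring" rule, to show how little machinery the idea needs. The clue words come straight from the example above; the window size, the `SENSE_CLUES` table, and the `guess_sense` function are assumptions made for this sketch, and real translation software is far more elaborate.

```python
# Clue words taken from the "spring" example; everything else is hypothetical.
SENSE_CLUES = {
    "season": {"flowers", "autumn"},
    "coiled metal": {"tension", "adjust"},
}


def guess_sense(sentence, target="spring", window=5):
    """Pick the sense whose clue words appear most often near the target word."""
    words = [w.strip(".,;:!?\"'()") for w in sentence.lower().split()]
    if target not in words:
        return None
    pos = words.index(target)
    # Look at up to `window` words on each side of the target.
    nearby = words[max(0, pos - window):pos] + words[pos + 1:pos + 1 + window]
    scores = {
        sense: sum(1 for w in nearby if w in clues)
        for sense, clues in SENSE_CLUES.items()
    }
    best = max(scores, key=scores.get)
    # With no clue words nearby, admit that we cannot tell.
    return best if scores[best] > 0 else None


print(guess_sense("The flowers bloom every spring."))    # prints season
print(guess_sense("Adjust the tension on the spring."))  # prints coiled metal
```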
Latent semantic indexing took the idea one step further, using statistics and matrix manipulation to find the word patterns in large collections of unstructured data, such as business letters, web pages, or newspaper articles. Search engines first create semantic indexes, then use the index of word patterns to decide which documents most closely match the words a user enters for search terms.
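For the curious, the "statistics and matrix manipulation" can be sketched with NumPy. This is a minimal illustration of the textbook technique (a term-document matrix reduced with a truncated singular value decomposition), not the code any search engine actually runs. The four sample documents, the choice of two dimensions, and the helper names `to_vector` and `rank` are assumptions made for the example.

```python
import numpy as np

# Made-up sample collection: two documents about the season, two about metal.
documents = [
    "flowers bloom in spring",
    "spring flowers and autumn leaves",
    "adjust the tension of the coiled spring",
    "a coiled metal spring holds tension",
]

vocab = sorted({w for d in documents for w in d.split()})
word_row = {w: i for i, w in enumerate(vocab)}


def to_vector(text):
    """Count how often each vocabulary word occurs in the text."""
    v = np.zeros(len(vocab))
    for w in text.lower().split():
        if w in word_row:
            v[word_row[w]] += 1.0
    return v


# Term-document matrix: one row per word, one column per document.
A = np.column_stack([to_vector(d) for d in documents])

# Keep only the k strongest word patterns (k = 2 is an arbitrary choice here).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
U_k = U[:, :k]
doc_vectors = Vt[:k, :].T * s[:k]  # each row is one document in pattern space


def rank(query):
    """Return (document number, cosine similarity) pairs, best match first."""
    q = U_k.T @ to_vector(query)  # project the query into the same space
    scores = []
    for doc_id, d in enumerate(doc_vectors):
        denom = np.linalg.norm(q) * np.linalg.norm(d)
        scores.append((doc_id, float(q @ d / denom) if denom else 0.0))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


for doc_id, score in rank("autumn flowers"):
    print(round(score, 2), documents[doc_id])
```

Because documents are compared by word pattern rather than by exact word, a document can score well for a query even when it does not contain every query word. That is the practical difference from a plain word index.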
If semantic indexing is so good, why did it take so long to be used on the web? Capitalism, dear readers. The people who discovered how to do it patented the concept and the process. The licenses to use it were not cheap. Only large companies that could afford the software could index their document collections. As soon as the patent expired in 2006, the method became free for everyone to use, and the major search engines wasted no time putting the concept to work.
Published by Tsu Dho Nimh
I'm a long-time technical writer with time to spare. I'm an omnivorous reader, a superb researcher, and a very fast writer. I'm also a good photographer. I'm fascinated by medicine, and annoyed by quack...
- LSI is not a new development.
- Computers have been indexing content for over 40 years.





16 Comments
Thanks for sharing. :)
Good information..
Fascinating information. I also got to this article through Textbroker, where one of the clients recommends it.
Thanks for this and I found this through Textbroker.
Sharon - NO! It has nothing to do with anything you or AC do. LSI happens when a search engine retrieves your entire article. The engine compares the structure and wording to all the other documents in the index and decides where it fits best.
This article is very well written and has helped me to get a better understanding of how to write for the web. Thanks for it.
so, in "lay terms" this LSI happens when AC attaches a link to words in my article for advertising, links I have not attached???
If so, very interesting, depends on keywords I choose.
Thanks, Tsu. I came via Textbroker, too. I have always wondered whether to use individual words or phrases in keywording.
It isn't capitalism that causes patents to interfere with the adoption of technology. It's government regulation. Some might call it "government interference".
I found the link to this article through an assignment on Textbroker. Great information!