In the early days of the web, pages were indexed by submitting them to web directories: you built a web site and registered it under a category or subcategory. The most famous of these was Yahoo, back in Yahoo's golden age.
Deciding which pages to show, and in what order, required search engines to crawl them and rank them largely by keyword density: keyword analysis was how an engine tried to judge the quality of a page against a search term. The problem is that everyone wants the number one position in the search results, so people started stuffing pages with keywords.
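To make the idea concrete, here is a minimal sketch of keyword-density ranking. The scoring function and the toy pages are my own illustrative assumptions, not any real engine's formula:

```python
import re
from collections import Counter

def keyword_density(page_text: str, keyword: str) -> float:
    """Fraction of words on the page that match the keyword."""
    words = re.findall(r"[a-z0-9]+", page_text.lower())
    if not words:
        return 0.0
    return Counter(words)[keyword.lower()] / len(words)

def rank_by_density(pages: dict[str, str], keyword: str) -> list[tuple[str, float]]:
    """Order pages by keyword density, highest first."""
    scored = [(url, keyword_density(text, keyword)) for url, text in pages.items()]
    return sorted(scored, key=lambda p: p[1], reverse=True)

# A stuffed page trivially outranks an honest one, which is exactly the
# manipulation problem described above.
pages = {
    "honest.example/debugging": "A practical guide to debugging Java programs step by step.",
    "stuffed.example/debugging": "debugging debugging debugging debugging tips debugging",
}
print(rank_by_density(pages, "debugging"))
```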
Then came Google and the idea of PageRank. The premise was that people can manipulate their own content with keyword density and placement, but they cannot manipulate other people's web pages, certainly not to the extent they can manipulate their own. At most they can drop something into a comment, but not into the main content. So if someone links to another page, the linked page has some merit in it: the author is effectively saying, 'Leave my page and go to that one, because it deals with this particular thing much better.' Google became the dominant force in search and the rest is history. Fast forward, and the whole search engine optimization industry is geared towards link building, in both right and wrong ways. The so-called experts found numerous ways to build links and artificially boost rankings, and hence traffic. You can see some of those ways here.
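The core of the PageRank idea can be sketched in a few lines: a page inherits authority from the pages that link to it. This is a simplified illustration with a toy link graph and the commonly cited damping factor, not Google's production algorithm:

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Iteratively distribute each page's rank across its outbound links."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outbound in links.items():
            if not outbound:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outbound)
                for target in outbound:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Page C is linked to by both A and B, so it earns the highest rank even
# though it cannot manipulate those links itself.
graph = {"A": ["C"], "B": ["C"], "C": ["A"]}
print(pagerank(graph))
```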
Google has kept adjusting itself, putting a lot of algorithms in place to differentiate genuine, relevant linking from linking done purely to game rankings. It has even gone to the extent of deindexing and banning websites.
In this context, what does the future of search look like? Search engines cannot rely entirely on keywords and links, as both are prone to manipulation. Both can still act as indicators of a page's relevance, but on their own they are not enough to draw conclusions.
Let's take a step back and consider the whole purpose of search. A user comes looking for something and enters a search term. This is a genuine user, not a set-up user; set-up users are paid to click on links and build artificial traffic. Say the genuine user is a software developer who wants to find out how to debug Hibernate. Don't worry if you don't know what Hibernate is; think of it as a tool in the Java world used to write software programs. The user enters a search term like 'How to debug hibernate', 'Debugging Hibernate', 'Hibernate debugging' or 'Debug Hibernate'. There are plenty of combinations and permutations in which the user can phrase the query. Whatever the user enters, the quality of a search engine lies in surfacing the page that best outlines the various ways a Hibernate program can be debugged. The user does not care whether the page has a million inbound links or is floating in isolation on the web with no anchors to it. The user does not care whether the author avoided pronouns to increase the page's keyword density. The search result succeeds if the user lands on the page and gets all the relevant details, or is directed to a few more links that cover other aspects of Hibernate debugging.
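As a rough illustration of why those permutations should land on the same page, here is a sketch that normalises the queries to a common token set. The stop-word list and the suffix stripping are toy assumptions, not a real tokeniser or stemmer:

```python
STOP_WORDS = {"how", "to", "in", "the", "a"}

def normalize(query: str) -> frozenset[str]:
    """Lowercase, drop stop words, and crudely strip suffixes."""
    tokens = []
    for word in query.lower().split():
        if word in STOP_WORDS:
            continue
        for suffix in ("ging", "ing"):  # so "debugging" collapses to "debug"
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                word = word[: -len(suffix)]
                break
        tokens.append(word)
    return frozenset(tokens)

queries = ["How to debug hibernate", "Debugging Hibernate",
           "Hibernate debugging", "Debug Hibernate"]
print({normalize(q) for q in queries})  # one token set: {'debug', 'hibernate'}
```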
There is a challenge here for a search engine: how does it know this is the best page? Many good actors pass through this world without any exposure because they were not the child of a famous actor or a rich man. They were very good, but only they knew it, or at best a small circle did. This world works on references, and that is what Google built with PageRank; there is nothing wrong with that. If my pages are not ranking well in search results I might hate Google, but in real life, if I have to hire someone, I too look for references about that person's skill set. Nothing wrong with it.
So what can search engines do to build better relevance into their results? The whole industry is grappling with this question, because this is where the future of the search business is heading. I think the key lies in the relationship between the author and the reader. The author writes for the reader, and the reader is the best judge of content quality. If I have searched for 'Debugging in Hibernate', I can quickly figure out whether the page shown in the results is relevant or not. No amount of backlinking or keyword density can reach that conclusion for me; I reach it by reading, or sometimes just scanning, the page. We can call this an 'engagement score'. The engagement can come from humans or from bots scraping pages to build analytic insight. I would count as useful those bots that scrape the web to build more insight about certain topics, since such research influences decisions in many organizations. So they too are relevant users.
How can we define the engagement score of a human user? A few possible signals, with a rough scoring sketch after the list:
- Time spent on the page.
- Looking for similar pages in the same domain.
- Sharing the page through mail or social media.
- Bookmarking and revisiting the page.
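Here is a rough sketch of how those signals could be folded into a single score. The weights, caps, and field names are assumptions chosen for illustration, not a published metric:

```python
from dataclasses import dataclass

@dataclass
class Visit:
    seconds_on_page: float       # time spent on the page
    same_domain_follow_ups: int  # similar pages opened on the same domain
    shared: bool                 # shared via mail or social media
    bookmarked_and_returned: bool

def engagement_score(v: Visit) -> float:
    """Weighted sum of the signals, clipped to the range 0..1."""
    score = 0.0
    score += min(v.seconds_on_page / 180.0, 1.0) * 0.4   # cap at 3 minutes
    score += min(v.same_domain_follow_ups / 3.0, 1.0) * 0.2
    score += 0.2 if v.shared else 0.0
    score += 0.2 if v.bookmarked_and_returned else 0.0
    return round(score, 3)

# A reader who stayed, browsed the domain, and shared scores close to 1.0.
print(engagement_score(Visit(240, 2, True, False)))  # 0.733
```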
For bots, the engagement score can be defined as the percentage of relevant information the bot manages to extract. That percentage can be measured in the context of the individual page and in the context of the whole result set.
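A minimal sketch of those two percentages, assuming the bot declares a set of fields it wants to extract (the field names here are hypothetical):

```python
def page_extraction_rate(wanted: set[str], extracted: set[str]) -> float:
    """Share of the bot's wanted fields found on a single page."""
    return len(wanted & extracted) / len(wanted) if wanted else 0.0

def result_set_extraction_rate(wanted: set[str], pages: list[set[str]]) -> float:
    """Share of wanted fields covered anywhere in the whole result set."""
    covered = set().union(*pages) if pages else set()
    return len(wanted & covered) / len(wanted) if wanted else 0.0

wanted = {"error_message", "stack_trace", "fix", "version"}
result_pages = [{"error_message", "fix"}, {"stack_trace"}]
print(page_extraction_rate(wanted, result_pages[0]))     # 0.5
print(result_set_extraction_rate(wanted, result_pages))  # 0.75
```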
The difficulty is getting hold of this information. Once a user has clicked through to a page, if that user has not enabled analytics, there is little a search engine can learn about the human visit. For a bot, unless the bot shares it, the search engine will never know.
This is also a catch-22. Search engines want to know more and more about user behavior, while users want to keep it more and more private. People are already nervous about how much search engines know about their inclinations and habits. So it will not be easy to measure a website's engagement score unless search engines do something differently. There are possibilities, but they will require a paradigm shift.