As search evolves, more and more users discover that the simplest way to find information on a site is to use its search function. In this article I will list the important aspects to consider when deciding which search engine to use and how to customize it to deliver relevant hits on your site. Note that this is not a traditional SEO article; it focuses on internal search, where the search engine can trust all content. Ranking with a system similar to Google’s PageRank is often not possible on smaller sites, since there is not enough link data to get decent link relevancy.
The most important things for a search page are ease of use and relevance. Therefore, make the initial search page as simple and clear as possible. Avoid advanced search; it sounds scary and it usually is. Users don’t want complex alternatives to fill in if they don’t have to. If there are many hits, make it possible to add refinements to the search. This means that when using a broad search term like “apple” it should be easy to narrow the search down to, for example, “apple computers”. However, these choices should only be shown when you notice that you’re unable to give a relevant answer to the given query. Tags/categories and previously used search words are great tools for creating the refinement choices or generating links to related searches. Always make sure the most relevant hit is displayed at the top before refinement. That is, don’t group the hits by, for example, category unless the user explicitly told you to. Grouping can push the best hit to the bottom of the results page if it belongs to the “wrong” category.
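As a rough illustration of tag-based refinements, here is a minimal Python sketch. It assumes every hit carries a list of tags and that refinements are only offered once the number of hits passes a threshold. The function name `refinement_choices`, the `tags` key, and the `min_hits` and `max_choices` parameters are all made up for this example; they don't come from any particular search engine.

```python
from collections import Counter


def refinement_choices(query, hits, max_choices=3, min_hits=10):
    """Suggest narrower queries built from the most common tags among the hits.

    `hits` is assumed to be a list of dicts with a "tags" list (hypothetical
    structure), already ordered by relevance.
    """
    # Few hits: the result list is short enough, so offer no refinements.
    if len(hits) < min_hits:
        return []

    # Count each tag once per hit, keeping first-seen order for stable ties.
    tag_counts = Counter(
        tag for hit in hits for tag in dict.fromkeys(hit["tags"])
    )

    query_words = query.lower().split()
    choices = []
    for tag, count in tag_counts.most_common():
        # Skip tags already in the query and tags that would not narrow anything.
        if tag in query_words or count == len(hits):
            continue
        choices.append(f"{query} {tag}")
        if len(choices) == max_choices:
            break
    return choices


# Made-up hits for the query "apple".
HITS = [
    {"tags": ["computers"]},
    {"tags": ["computers", "hardware"]},
    {"tags": ["fruit"]},
    {"tags": ["computers"]},
    {"tags": ["fruit", "recipes"]},
]

print(refinement_choices("apple", HITS, max_choices=2, min_hits=3))
```

Note that the hits themselves are left in relevance order; the tags are only used to build the refinement links, not to group the results.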
To be able to improve the search experience you need to know what your users are searching for. Are they using synonyms that aren’t used on the site? Make sure your search engine uses an extendable thesaurus to take care of this. With an auto-complete feature à la Google Suggest, fed with popular searches, users can refine the query and get spelling help directly in the search box. This is particularly useful on large sites with lots of different vertical areas, but it is also great for discovering the content on the site. By analyzing the search terms you will probably also find out what kind of information is hard to find when navigating the site using the menu. You can also make sure you have relevant results for the most popular queries, see the section “Rate the tags” below.
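To make the auto-complete idea concrete, here is a small Python sketch that suggests the most popular earlier queries starting with what the user has typed so far. It assumes you keep a log of past searches; `suggest`, `QUERY_LOG` and the sample queries are invented for the example, and a real site would precompute the counts rather than recount the log on every keystroke.

```python
from collections import Counter


def suggest(prefix, query_log, limit=5):
    """Return the most popular logged queries that start with `prefix`."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []

    # Normalize the log so "Salmon Fishing " and "salmon fishing" count as one.
    counts = Counter(q.strip().lower() for q in query_log)

    matches = [(q, n) for q, n in counts.items() if q.startswith(prefix)]
    # Most frequent first; ties are broken alphabetically for stable output.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in matches[:limit]]


# Hypothetical search log.
QUERY_LOG = [
    "salmon fishing",
    "salmon recipes",
    "salmon fishing",
    "Salmon Fishing ",
    "sailing",
    "norway rivers",
]

print(suggest("sal", QUERY_LOG))
```

Because the suggestions come from queries that real users typed, they double as a cheap form of spelling help, as long as the popular spellings are the correct ones.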
Let’s say you’re about to make a trip to Norway and want to find the best rivers for salmon fishing. You probably enter something like “fishing salmon”. This is where the stemming algorithm kicks in. A stemming algorithm reduces the words “fishing”, “fished”, and “fisher” to the root word, or stem, “fish”. This means that you won’t miss any fish pages just because they didn’t contain the exact inflection you typed. Be aware that most stemming algorithms are language-specific, so make sure your search engine supports the languages used on the site.
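To show the principle, here is a deliberately naive Python sketch of stemming. It strips a handful of English suffixes and matches queries against an index of stems. This is not a real stemming algorithm (a production engine would use an established one, such as the Porter stemmer, per language); the suffix list, the minimum stem length and the sample pages are assumptions made for the example.

```python
# Suffixes are tried in this order; a toy list for English only.
SUFFIXES = ("ing", "ed", "er", "es", "s")
MIN_STEM_LENGTH = 4  # avoids cutting short words such as "river" down to "riv"


def naive_stem(word):
    """Strip the first matching suffix, if enough of the word remains."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            return word[: -len(suffix)]
    return word


def build_index(pages):
    """Map each stem to the set of page ids whose text contains it."""
    index = {}
    for page_id, text in pages.items():
        for word in text.split():
            index.setdefault(naive_stem(word), set()).add(page_id)
    return index


def search(index, query):
    """Return the pages that contain every stemmed query word."""
    stems = [naive_stem(word) for word in query.split()]
    if not stems:
        return set()
    return set.intersection(*(index.get(stem, set()) for stem in stems))


# Made-up pages.
PAGES = {
    "a": "We fished for salmon in Norway",
    "b": "The fisher caught trout",
    "c": "Salmon recipes",
}

index = build_index(PAGES)
print(search(index, "fishing salmon"))
```

Page “a” never contains the word “fishing”, but it still matches because both “fished” and “fishing” are indexed under the stem “fish”.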
Did you mean..?
Spelling correction is an important feature for all users. Do you, for example, know how to spell the names of all the countries of the former Soviet Union? I guess not. The good news is that you don’t have to create or buy a dictionary for this feature; just use all the words used on your site as the dictionary. Be sure not to auto-correct spelling errors. Instead, show the results for the search term as given, but provide a direct link that runs the search with the suggested term.
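Here is a minimal Python sketch of the idea, using the words on the site as the dictionary. It relies on `difflib.get_close_matches` from the standard library for the fuzzy matching, which is just one convenient choice; the function names, the 0.8 cutoff and the sample pages are assumptions made for this example.

```python
import difflib
import re


def build_vocabulary(page_texts):
    """Collect every word used on the site; this is the whole dictionary."""
    vocabulary = set()
    for text in page_texts:
        vocabulary.update(re.findall(r"\w+", text.lower()))
    return vocabulary


def did_you_mean(term, vocabulary):
    """Return a suggested spelling, or None if the term needs no suggestion.

    The caller should still run the search with the original term and only
    show the suggestion as a link.
    """
    term = term.lower()
    if term in vocabulary:
        return None
    # sorted() keeps the candidate order, and thus the result, deterministic.
    matches = difflib.get_close_matches(term, sorted(vocabulary), n=1, cutoff=0.8)
    return matches[0] if matches else None


# Made-up page texts.
PAGE_TEXTS = [
    "Travel guide to Kazakhstan and Uzbekistan",
    "Fishing in Norway",
]

vocabulary = build_vocabulary(PAGE_TEXTS)
print(did_you_mean("kazakstan", vocabulary))
```

A nice side effect of using the site’s own words is that every suggestion is guaranteed to give at least one hit.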
One thing to consider is that you can trust all metadata on your own website. This is something Google can’t do (remember, there are lots of evil sites out there), so it doesn’t rely on this information as much as you should. Editors and/or users should be able to add additional metadata to your pages. By using tags and ratings, the relevance can be increased significantly. The rating system should take into consideration how many users have rated the content, and the rating should count first and foremost for searches on the page’s tags. This should be the main ranking system on your website, and it will make your internal search beat external search engines hands down. Another way to find popular pages is to use the site statistics. Again, rank these pages high mainly on their given tags, so that popular pages don’t show up in most searches.
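One possible way to take the number of raters into account is a Bayesian-style average, where a page with few votes is pulled toward a neutral score. The Python sketch below is my own choice of formula, not something a specific search engine prescribes; `prior_mean`, `prior_weight`, the 1 to 5 rating scale and the page structure are assumptions made for the example.

```python
def weighted_rating(ratings, prior_mean=3.0, prior_weight=5):
    """Average the ratings as if `prior_weight` extra users had voted `prior_mean`.

    A single 5-star vote therefore counts for less than many 4-5 star votes.
    """
    return (prior_mean * prior_weight + sum(ratings)) / (prior_weight + len(ratings))


def rank_for_tag(tag, pages):
    """Order the pages rated on `tag` by weighted rating, best first."""
    scored = [
        (page["url"], weighted_rating(page["tag_ratings"][tag]))
        for page in pages
        if tag in page["tag_ratings"]
    ]
    scored.sort(key=lambda item: -item[1])
    return [url for url, _ in scored]


# Made-up pages; ratings are stored per tag, on a scale from 1 to 5.
PAGES = [
    {"url": "/a", "tag_ratings": {"salmon": [5]}},
    {"url": "/b", "tag_ratings": {"salmon": [4, 5, 4, 5, 4, 5, 4, 5]}},
    {"url": "/c", "tag_ratings": {"trout": [5, 5]}},
]

print(rank_for_tag("salmon", PAGES))
```

Page “/b” ends up above “/a” even though “/a” has a perfect score, because eight users agree on “/b” and only one has rated “/a”. Since the ratings are kept per tag, “/c” never shows up in a search for “salmon”, no matter how popular it is.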
Ever wondered why Google only displays 10 hits at a time? It’s about speed. By showing 10 hits instead of 30, the page loads much quicker. Instant feedback shortens the learning curve, and expert users will use the service more and get more out of it. So, make sure the result page is small and that you’ve got a short response time.
Recommendations from your friends are usually better than automatically created recommendations. This is why communities give you lots of possibilities to improve the search experience. If a page is rated highly by friends who share your interests, it’s probably relevant to you too. There are many ways to provide a better search experience if you can utilize a social graph. One simple example is to show all comments by a specific user, or to find only articles written by users who have been awarded MVP status.
If you’ve got a large search database with lots of information, for example a knowledge base, you’ll probably want to add extra features for the power users. This can be the ability to subscribe to an RSS feed for a particular search, to search only for content of a specific type, and so on. A good way of implementing power commands is to have reserved words. For example, in a forum you may want to give higher priority to new posts and only show posts with at least 5 comments. This can be done with “date comments:5”, where date and comments are reserved words and comments takes a parameter. Note that this isn’t the same as sorting by date, since sorting discards all other relevancy and is practically useless if you’ve got lots of hits.
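A reserved-word query like “date comments:5” can be split into ordinary search terms and options with very little code. The Python sketch below assumes the two reserved words from the example above; the `RESERVED` table, the `parse_query` function and the rule that a malformed reserved word is treated as a normal search term are my own choices for the illustration.

```python
# Reserved word -> does it take a numeric parameter? (hypothetical table)
RESERVED = {"date": False, "comments": True}


def parse_query(query):
    """Split a query into free-text terms and reserved-word options."""
    terms = []
    options = {}
    for token in query.split():
        name, separator, value = token.partition(":")
        name = name.lower()
        if name in RESERVED:
            takes_parameter = RESERVED[name]
            if takes_parameter and separator and value.isdigit():
                options[name] = int(value)  # e.g. comments:5
                continue
            if not takes_parameter and not separator:
                options[name] = True  # e.g. date
                continue
        # Anything else, including a malformed reserved word, is a search term.
        terms.append(token)
    return terms, options


print(parse_query("salmon date comments:5"))
```

The search engine can then use `options["comments"]` as a filter and `options["date"]` as a boost for newer posts on top of the normal relevance score, rather than as a plain sort.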
As always, make sure you’re following standards when creating the actual (X)HTML code for your pages. By using semantic elements like headings, quotations and bulleted lists you’ll add structured information to your content that is good not only for your internal search engine, but also for external ones like Google.