Comments on Testing Search for Relevancy and Precision

10 Reader Comments

  1. John,

    Shouldn’t a good search engine provide disambiguation help? In your example of “parking”, Google would suggest the top relevant related searches to automagically refine the user’s query.


  2. Good article, thanks. I wish you could expand on “[…] we used these metrics to identify weaknesses in the configuration of our search engine, and as a yardstick to track improvement as we implemented optimization, best bets, and a thesaurus.”

  3. Hi, that was a clear and helpful article! My site has very few visits, but still it’s good to know a good way to analyze site search experience.

    I would just like to point out that the first spreadsheet link appears to be broken, as it points to an invalid destination (http://d/).

  4. Thank you for this article! Those were things I had not thought about until now, and I have the impression that I learned something ;). I am sure it will be helpful when I have to maintain bigger websites.

  5. I’ve run into the issue of scoring a result set for usability evaluation before (using different interfaces for complex queries, but it is the same difference). One of the things I used is the typical hit-and-run behavior we know Google users exhibit: if the right result is not within the first 10 results, users would rather re-query than go to the next page. So results after the first 10 are typically unimportant. The good old “precision” used to tweak engines is less important for these reasons and less useful in this kind of evaluation.

    To get a user-based score for the search results of a given query, one could count the relevant results within the top 10 and use the position of each of those results to create an aggregate score for the result set. Say the 1st and 3rd results are relevant out of 100 results: you could give a score of (1/1 + 2/3)/10 ≈ 0.17. If the second relevant result were at position 2 instead, the score would have been (1/1 + 2/2)/10 = 0.2, etc.

    It would be even better to have a couple of people evaluate the results for relevancy instead of relying on your own judgment alone.
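
The position-weighted score described in comment 5 can be sketched in a few lines of Python. This is only one reading of the commenter's arithmetic, not a standard metric: the function name `top_k_position_score` and the boolean-list input are assumptions made for illustration. Each relevant result in the top 10 contributes (number of relevant results seen so far) / (its position), and the sum is divided by 10.

```python
from typing import Sequence


def top_k_position_score(relevant: Sequence[bool], k: int = 10) -> float:
    """Score a ranked result list from per-result relevance judgments.

    relevant[i] is True when the result at position i + 1 was judged
    relevant. Only the first k results count, mirroring the observation
    that users rarely look past the first page.
    """
    score = 0.0
    hits = 0  # relevant results seen so far
    for position, is_relevant in enumerate(relevant[:k], start=1):
        if is_relevant:
            hits += 1
            # A relevant result contributes more the higher it is ranked.
            score += hits / position
    return score / k


# The two cases from the comment, each with 100 results in total.
first_and_third = [True, False, True] + [False] * 97
first_and_second = [True, True] + [False] * 98

print(round(top_k_position_score(first_and_third), 2))
print(round(top_k_position_score(first_and_second), 2))
```

With several judges, one hedged option is to compute this score per judge and average the results, so that no single person's notion of relevance dominates.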

  6. Thanks for the interesting article.
    Sometimes it is also a good idea to split the search results into two sections: one that is manually predetermined and shown when a certain keyword appears in the query, and one that is generated entirely automatically by the search engine. That way, for keywords that are searched for often, you can at least present the most relevant results.
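
The two-section idea in comment 6 could look roughly like the sketch below. Everything in it is hypothetical: the `BEST_BETS` table, its URLs, and the `engine` callable stand in for whatever curated list and search backend a site actually has.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical hand-curated table: keyword -> pages an editor picked for it.
BEST_BETS: Dict[str, List[str]] = {
    "parking": ["/campus/parking-permits"],
    "football": ["/athletics/football"],
}


def search_with_best_bets(
    query: str,
    engine: Callable[[str], List[str]],
    best_bets: Dict[str, List[str]] = BEST_BETS,
) -> Tuple[List[str], List[str]]:
    """Return (curated, automatic) result sections for a query.

    The curated section lists hand-picked pages for any keyword found in
    the query; the automatic section is the engine's own ranking with
    the curated pages removed so nothing is shown twice.
    """
    curated: List[str] = []
    for word in query.lower().split():
        for url in best_bets.get(word, []):
            if url not in curated:
                curated.append(url)
    automatic = [url for url in engine(query) if url not in curated]
    return curated, automatic


def fake_engine(query: str) -> List[str]:
    # Stand-in for a real search backend; returns a fixed ranking.
    return ["/news/football-schedule", "/athletics/football", "/tickets"]


curated, automatic = search_with_best_bets("Football tickets", fake_engine)
print(curated)
print(automatic)
```

Matching on whole words keeps the sketch short; a real site would likely also want stemming or a thesaurus so that "footballs" or "soccer" can trigger the same curated entry.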

  7. I think this is an excellent step-by-step explanation of how to evaluate recall, precision, and relevance. It’s not just metrics; it explains how to use the research, which is very helpful.

    I believe Ledderman above is referring to Search Suggestions / Best Bets.  These are particularly useful in cases like your Football example, where a static link to the sports department would serve the users.
