Sunday, June 26, 2011

Semantic Search at the Globe and Mail

Recently, I had the opportunity to meet with Kevin Schlueter, enterprise architect at the Globe and Mail. The Globe and Mail is one of Canada’s largest newspapers and they run one of the largest Web sites in Canada. Kevin told me an interesting story that I thought was worth sharing.

The Globe and Mail runs one of Canada's largest news sites
Newspapers live on advertising, so they are keenly interested in attracting the largest possible audience and keeping readers on their site as long as possible. People usually come to a newspaper Web site to find some specific information. This becomes particularly relevant during significant events - such as the recent federal election in Canada. While the home page usually provides up-to-date information about the main election race, most users are also interested in their particular candidates and so they search for them.

Search optimization became critical for the Globe and Mail. Since a newspaper is in the business of selling information, the goal was stated as “show me what I want to know even if I don’t ask for it”. And this is the tricky part - exposing readers to relevant articles that they will likely be interested in. And that’s why the Globe and Mail employed semantic technology.

Semantic search is the next level of searching. A basic full-text search looks for the most statistically prominent keywords contained in the body text. It can find out about who, what, when, where, and perhaps even why. Full-text search is often augmented by a metadata search, which can reveal information such as the author, section, page, or publication date. But a semantic search can leverage automatically generated semantic metadata, which is information about topics, people, places, products, and concepts.
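To make the difference concrete, here is a minimal sketch in Python. It is purely illustrative and is not how the Globe and Mail's system works: the articles, the `Article` class, and both search functions are made up for this example, and the topic tags are written by hand, whereas a real system would generate them automatically from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    title: str
    body: str
    # Conventional metadata, typically supplied by editors.
    section: str
    # Semantic metadata. Hand-written here (an assumption for the sketch);
    # a real system would extract these topics from the text automatically.
    topics: set[str] = field(default_factory=set)


# Made-up sample articles.
ARTICLES = [
    Article("Polls close in Ontario",
            "Voters lined up early as the election race tightened.",
            "Politics", {"federal election"}),
    Article("Leaders make final pitch",
            "Party leaders toured swing ridings ahead of Monday's vote.",
            "Politics", {"federal election"}),
    Article("Playoff race heats up",
            "The race for the final playoff spot goes down to the wire.",
            "Sports", {"hockey"}),
]


def keyword_search(articles, keyword):
    """Return titles of articles whose body literally contains the keyword."""
    kw = keyword.lower()
    return [a.title for a in articles if kw in a.body.lower()]


def topic_search(articles, topic):
    """Return titles of articles tagged with the given topic."""
    return [a.title for a in articles if topic in a.topics]


print(keyword_search(ARTICLES, "election"))       # misses the second article
print(keyword_search(ARTICLES, "race"))           # also matches the hockey story
print(topic_search(ARTICLES, "federal election")) # both election articles
```

The keyword search misses the second article because the word "election" never appears in it, and a search for "race" drags in a hockey story. The topic search finds both election articles because it matches on what the articles are about rather than on the exact words they use.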

With semantic metadata, a reader can search for an article about a particular topic - say, Canada’s Governor General David Johnston. Unlike Wayne Gretzky, David Johnston is a fairly common name, and a conventional search would find a whole bunch of them. Just try to google that name. This is where a semantic search helps by correctly identifying all relevant concepts - such as the David Johnston who is Canada’s GG, the one who’s a Harvard professor, or the one who’s a known author and journalist. These concepts can either be presented to the reader as options or be used to deliver the content relevant in a given context.
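The disambiguation idea described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the product's actual algorithm: it simply counts how many context words in a piece of text overlap with a hand-picked word list for each candidate concept. The `CANDIDATES` table, its word lists, and the function names are all hypothetical.

```python
import re

# Hypothetical candidate concepts for one name, each with hand-picked
# context terms (assumptions for this sketch, not real semantic metadata).
CANDIDATES = {
    "David Johnston (Governor General)": {"governor", "general", "canada", "ottawa", "rideau"},
    "David Johnston (professor)": {"harvard", "professor", "university", "research"},
    "David Johnston (author)": {"author", "journalist", "book", "column"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def rank_concepts(text, candidates=CANDIDATES):
    """Score each candidate by the number of context terms found in the text."""
    words = tokenize(text)
    scores = {name: len(words & terms) for name, terms in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def disambiguate(text, candidates=CANDIDATES):
    """Return the best-matching concept, or None if nothing in the text helps."""
    best, score = rank_concepts(text, candidates)[0]
    return best if score > 0 else None


print(disambiguate("David Johnston was sworn in as Governor General in Ottawa."))
print(disambiguate("The Harvard professor David Johnston published new research."))
print(disambiguate("David Johnston spoke yesterday."))
```

The ranked list from `rank_concepts` corresponds to presenting the concepts to the reader as options, while `disambiguate` corresponds to picking one based on context. When the text offers no clues, the sketch returns `None` rather than guessing.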

You may think that this is what the online retailers have been doing for years - recommending similar products based on your current selection. But there is a big difference here. The retailers work with product catalogs, which contain highly structured data. When you are looking at a pair of shoes, the retailer can automatically recommend another five pairs in your size that are similar but perhaps a little more stylish (aka expensive). All of that is based on defined database fields. A semantic search has to make such associations based on information contained in the unstructured text of an article or a group of articles, which is much more difficult.

Papers are changing
And that’s exactly what the Globe and Mail is doing - using semantic search technology to generate semantic metadata that improves search results, helps with search engine optimization (SEO), and makes the site more “sticky”. And stickiness means more advertising revenue, which is what the paper lives on.

According to Kevin, it works really well, as he could see during the recent election, when traffic on the site peaked at over 12,000 hits per second after the first results were published. Kevin plans additional uses for the semantic search technology, such as faceted navigation, similarity, or automatically generated topic pages. All that to keep the Globe and Mail site competitive in the Canadian news business.


  1. Great blog dear, especially the picture of our dog!

  2. Did Kevin happen to mention anything about OTSN helping with SEO?

  3. Hi Priscilla,

    Thanks for your comment and tweet. Yes, Kevin is of course an OpenText customer and indeed, they are using the OpenText Social Navigation product for the solution I described.

    I didn't want to turn my blog into a marketing site, which is why I wasn't talking about the products. I thought it was a cool story of content analytics in action.

    Thanks for reading my blog.


    PS: I have connected Kevin with the marketing team and they are working on an official success story.