Beyond keyword search: Adding Findability to your information

(PuneTech reader Titash Neogi has been in the information architecture domain for many years, and is passionate about making information more accessible to people. In recent times, he has been studying the difficulty of finding information within large enterprises, and approaches to solving that problem. In this article, he gives us an overview of the concept of “findability” of information in an enterprise.)

One of the greatest human needs that have evolved in the 21st century is the need to know as much as possible about something before making a decision. While human beings have forever been driven towards learning and knowing more, in the last decade technology has added a new dimension to this.

There was a time when we could take the word of our neighbour, colleague, friend or the man at the grocery shop and reach a decision. Since information was only available in finite ways and in finite volumes, there was not much competitive edge to be gained from it, and little wide-reaching impact.

The Internet has changed all of that. Today, the fear that there’s knowledge out there that could be relevant to our decisions, and that we are not using it and getting a lesser deal, haunts us all.

And this need in turn has fuelled the growth of information, made it more complex to deal with and more voluminous in size. Pick up any topic and you would find thousands of pages on the internet related to that topic. There are facts, figures, opinions, comments, user reviews. Information, unlike wealth, has grown directly in proportion to its usage.

This information fire hose impacts both individuals and enterprises. While individuals crave to know as much as possible before committing to something, enterprises find their customers more demanding or their competition more informed.

Staying on top of this complex, voluminous information tidal wave has become crucial for survival. As a response to this, search companies have sprung up, with Google in the lead. For about a decade now, search engines of all sorts have been battling it out over terabytes of content on the Internet.

However, the Internet (and other networks within or outside the enterprise) is moving from being an information store to being a knowledge network. As the volume and complexity of knowledge grow, search as we know it today is becoming inadequate. Search companies are losing ground fast.

Search as a tool is fine for information stores, but poor for knowledge bases. Knowledge bases need to have findability. Search is what you use when you know exactly what you are looking for, and are trying to figure out where it is. Findability is what you need when you have only a vague idea of what you want to achieve, and you rely on the knowledge base to guide you towards more and more information that is useful and relevant to you.

Understanding Findability

Findability could mean different things to different people or even to the same person at different times. Peter Morville, who is credited with coining the term findability, defines it as:

The quality of being locatable or navigable. At the item level, we can evaluate to what degree a particular object is easy to discover or locate. At the system level, we can analyze how well a physical or digital environment supports navigation and retrieval.

While a lot of people confuse findability with search, the two are not the same thing. Search tries to solve the problem of locating information that you already know exists somewhere in a corpus.

Findability encompasses search, but also deals with the problem of how to make searchers aware of other relevant information that they didn’t know existed in the corpus. Findability exposes the knowledge within a corpus.

For example, when you are looking for the home loan rate offered by IDBI bank, it’s a search problem. You want to locate the document/URL that talks about the interest rates of IDBI. However, if you were very new to the entire banking/housing scenario and didn’t know the names of any banks in India, or what loans they offered, you would be dealing with a findability problem. Solving this problem will definitely involve search, but you can immediately see that it also involves a lot of information engineering, semantic modelling and usability engineering.

So in a sense, Findability is the big daddy of search. The example above might sound very impractical, but you can easily abstract it and see that it applies to a lot of scenarios in different ways. The future is about solving findability problems.

While findability concerns all of us in our everyday life, it poses some interesting challenges for modern enterprises. This article focuses on findability issues in the enterprise and offers some pointers on how to solve them.

Why is findability important for organizations?

Internally, as organizations grow in size and complexity while running short on budgets and time, no one can afford to waste energy and resources duplicating knowledge that already exists in the organization. Smart companies will leverage the knowledge within their workforce and beat their competition. Enterprise 2.0 is about knowledge competition: do we know what we know, and can we use that in new ways?

Externally, as product complexity increases and newer offerings come up, it is going to be a crucial challenge for a company to communicate with its customers and let them know the range of products it offers. An effective findability solution lets customers discover new products and solutions that have come up and that might be more relevant to them.

The four-headed monster

To outline a few typical findability problems in an enterprise:

Brute Force findability: This is the most elementary form of findability problem, and can be simply classified as search. It’s like performing a grep over the content base with a particular pattern.

“I remember that the document contained the word log = 2 in its text”

Today’s search engines are very good at solving these problems. In fact this form of search has evolved simply because of the inability of search engines to understand natural language, leading users to rely on grep techniques to find documents/results in the quickest possible manner.

However, as organizations grow and we move from data to knowledge, this form of search will increasingly become impossible to scale or use.
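As a rough illustration of what brute-force findability amounts to, here is a minimal grep-style sketch in Python. The document names and contents are invented for the example, and a real content base would of course live in files or a database rather than in a dictionary.

```python
import re

# Hypothetical in-memory content base (names and text are made up).
DOCUMENTS = {
    "tuning-guide.txt": "Set log = 2 to enable verbose logging.",
    "release-notes.txt": "Logging defaults changed in this release.",
    "faq.txt": "Why is my log file empty? Check that log = 2 is set.",
}

def brute_force_search(pattern, documents):
    """Return the names of documents whose text matches the regex pattern."""
    regex = re.compile(pattern)
    return sorted(name for name, text in documents.items() if regex.search(text))

# "I remember that the document contained log = 2 in its text"
matches = brute_force_search(r"log\s*=\s*2", DOCUMENTS)
print(matches)
```

The search only succeeds because the user remembers the exact string. Ask it a question in plain language and it finds nothing, which is the limitation described above.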

Knowledge findability: This is the next level of findability and comprises the most common “how do I” or “what is” kind of questions.

“How do I know if I really need to move my web application from struts 1 to struts 2?”

“What kind of data backup product should I buy if I am a SOHO with a limited budget?”

Index based search engines are trying various complex algorithms to solve these findability problems.

The last few years have seen a lot of semantic search solutions trying to tackle this problem by performing semantic analysis of content and indexing it based on meaning rather than sheer word statistics.
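To make the difference between indexing words and indexing meaning a little more concrete, here is a toy sketch. Everything in it is an assumption made for illustration: the three documents are invented, and the hand-written `CONCEPTS` table is only a crude stand-in for real semantic analysis, not a description of how any actual semantic search product works.

```python
import re
from collections import defaultdict

# Invented documents for the example.
DOCS = {
    "soho-backup": "Affordable backup software for the small office",
    "enterprise-backup": "Backup product suite for large data centres",
    "struts-migration": "Migrating a web application from Struts 1 to Struts 2",
}

# Hand-written word-to-concept map; a stand-in for semantic analysis (assumption).
CONCEPTS = {
    "affordable": "low-cost", "budget": "low-cost", "cheap": "low-cost",
    "software": "product", "product": "product",
    "soho": "small-business", "small": "small-business",
}

def tokens(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def to_concept(word):
    """Map a word to its concept, or leave it unchanged if it has none."""
    return CONCEPTS.get(word, word)

def build_index(docs, normalise=lambda word: word):
    """Build an inverted index from (normalised) word to document names."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in tokens(text):
            index[normalise(word)].add(name)
    return index

def search(index, query, normalise=lambda word: word):
    """Return documents that contain every query term (AND semantics)."""
    sets = [index.get(normalise(word), set()) for word in tokens(query)]
    return sorted(set.intersection(*sets)) if sets else []

keyword_index = build_index(DOCS)
concept_index = build_index(DOCS, to_concept)

query = "budget backup product soho"
keyword_hits = search(keyword_index, query)               # exact words only
concept_hits = search(concept_index, query, to_concept)   # words mapped to concepts
print(keyword_hits, concept_hits)
```

The plain keyword index finds nothing, because the relevant document says “affordable” and “small office” where the user said “budget” and “soho”. The concept index finds it. Building and maintaining such a mapping at enterprise scale is the hard part.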

People/expertise findability: A lot of times, we find people asking for in-house experts:

“Anyone who has worked on technology X + platform Y in the organisation”

Typically today, this is handled by word of mouth or the grapevine, which is not only impossible to scale across a geographically distributed organisation, but also inefficient and limited in scope.

A semantic analysis engine, plugged into an HRMS database or an organisation’s intranet, can solve this problem very effectively. An index-based search in a similar scenario is likely to pull up a lot of noise and irrelevant results rather than solve it.
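Once skills have been extracted into a structured form, the lookup itself is simple. The sketch below shows only that last step. The `Employee` records, names and skill sets are hypothetical, and deriving clean skill sets from free-text profiles and intranet content is precisely the part a semantic engine would have to do.

```python
from dataclasses import dataclass, field

@dataclass
class Employee:
    name: str
    location: str
    skills: set = field(default_factory=set)

# Hypothetical records, as if already extracted from an HRMS database.
EMPLOYEES = [
    Employee("Asha", "Pune", {"struts", "linux", "oracle"}),
    Employee("Ben", "Mountain View", {"struts", "windows"}),
    Employee("Chen", "Beijing", {"linux", "struts", "python"}),
]

def find_experts(employees, *required_skills):
    """Return names of employees who have every one of the required skills."""
    wanted = {skill.lower() for skill in required_skills}
    return [e.name for e in employees if wanted <= e.skills]

# "Anyone who has worked on technology X + platform Y in the organisation"
experts = find_experts(EMPLOYEES, "Struts", "Linux")
print(experts)
```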

Social findability: Findability that relates to knowledge implicit in the community.

“What do all Linux newbies read when they join the organisation?”

“What’s the best starting point for understanding deployment of Product X – the product guide or the support technote?”

No semantic or index-based search can ever completely fill this gap. A good approach to solving this problem would be to marry a social tagging system, such as del.icio.us or digg, with a semantic analysis engine. The findability solution would need to work as a facilitator that allows people to share their personal experiences and knowledge around a product and build a knowledge community.
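The social tagging half of that marriage can be sketched in a few lines. The bookmark data, user names and tags below are made up, and the ranking is a plain popularity count rather than anything a real product is known to use.

```python
from collections import Counter

# Hypothetical bookmarks: (user, document, tags the user applied).
BOOKMARKS = [
    ("asha", "linux-quickstart", {"linux", "newbie"}),
    ("ben", "linux-quickstart", {"linux", "newbie", "setup"}),
    ("chen", "kernel-internals", {"linux", "advanced"}),
    ("dina", "shell-basics", {"linux", "newbie"}),
    ("asha", "product-x-guide", {"product-x", "deployment"}),
]

def popular_for_tags(bookmarks, *tags, top=3):
    """Rank documents by how many bookmarks carry all of the given tags."""
    wanted = set(tags)
    counts = Counter(
        doc for _user, doc, doc_tags in bookmarks if wanted <= doc_tags
    )
    return counts.most_common(top)

# "What do all Linux newbies read when they join the organisation?"
ranking = popular_for_tags(BOOKMARKS, "linux", "newbie")
print(ranking)
```

The answer comes from what colleagues actually bookmarked and how they labelled it, not from the text of the documents. That is knowledge no crawler can extract on its own.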

What becomes obvious from this discussion is that findability is not a single technology or solution that can be purchased over the counter, deployed, and then expected to perform wonders after a couple of hours of crawling or indexing. That is just plain vanilla search. Search can provide results to queries but not necessarily answers to questions. A good search engine can make your content searchable, but on its own it does not solve your findability problem.

The mistake that most organizations make today is to deploy a million dollar search engine and then expect it to solve a problem it was never designed to solve – a findability problem.

Solving the Findability problem

A good findability platform needs to bring together expertise and lessons learnt from the fields of Semantic search, Usability, Information Architecture, Graphic Design and Text Engineering. And above all, people need to understand that a findability solution can never be a “one size fits all” solution. It can never be an appliance that you can deploy over your networks and forget about.

Think of a findability solution as an ERP solution. It needs to have various modules that can understand and talk to different information stores in the organisation. The first step in solving an organization’s findability problem is to analyze its findability needs and then deploy or develop all or some specific modules of that solution. Also important is the right combination of content strategy, user query analysis, and search and interface design.

Who’s working on Findability?

There are a lot of people trying to take a stab at findability. Solutions and products range from sophisticated semantic search applications to information architecture consulting firms. However, in the limited scope of this article, I would like to touch upon a few companies/individuals who stand out in their attempts to solve the findability problem.

First, Peter Morville at Semantic Studios is doing groundbreaking work in this area. The stuff he puts up on his site (www.findability.org) is pretty exciting and educational. There are also a few start-ups in this space, but I would like to mention Connectbeam (www.connectbeam.com), a Bay Area-based start-up that caught my attention. They are trying to solve the social findability problem within the enterprise, and I found their approach unique.

Finally

I am old school and I like to conclude my articles with something to think about. In a global, recession-ridden economy, findability affects your bottom line one way or another. To quote Peter Morville,

“You can’t use what you can’t find” (and neither can your customers)

About the author – Titash Neogi

Titash Neogi has been working with Symantec Corp (formerly VERITAS India) for the past six years in various roles in the customer support, knowledge management and content management divisions. At present he is the architect and lead developer for Symantec’s new semantic search based help system initiative.

Google’s Impact on Brain Morphology & Cognition (weblogs.elearning.ubc.ca)
Understanding Natural Language Processing for SEO (thecustomercollective.com)
Digg: DiggBar is Good For You, Really Good For Us (mashable.com)
Search Will Get Smarter: The Prediction Made Every Year (searchenginewatch.com)

4 thoughts on “Beyond keyword search: Adding Findability to your information”

Dhananjay Nene says:

April 13, 2009 at 12:29 pm

Great article. The links at the end should help me learn more. Here are a couple of suggestions that help improve findability in a brute-force way (these are actually good SEO tips as well). Note that these are only the low-hanging fruit; while unlikely to be as good as a more comprehensive solution, they do allow one to start moving down the path of improved findability.

For static pages, make sure the page title, page URL, and page keywords appropriately match the content (e.g. the bank loan scenario you talked about).

For dynamic pages, try to embed the information in the URL path rather than in the URL parameters, e.g. bank.com/services/loans/home_loan instead of bank.com/services.php?type=loans&subtype=home_loan. This is often achieved implicitly by using REST-based APIs, although it is unlikely ever to be as good as static pages. These can deliver even better results when the data that gets displayed as the title and meta keywords is carefully entered against each of the products.

Titash Neogi says:

April 13, 2009 at 8:27 pm

Hi Dhananjay,

Thanks for the comment.

You have pointed out some very good tips, and I wish website designers and web application architects would keep these simple points in mind during development. Another one on that list is to stick to XHTML and minimize the use of custom CSS (as that makes the search engines do extra work and often confuses them).

Since you mention SEO, I can’t resist saying that I am increasingly beginning to feel that there is a limit to how far your site or information store can be SE “optimized”. Beyond that, your results are going to be weighed equally with your competitors’ results (and displayed right next to them). So in a sense, search engines are acting as the great information leveler, with smaller orgs (with less information complexity) having an edge over bigger orgs (with multiple products, etc). That’s another reason why findability will become a critical success factor for future websites.

Pradeep Nair says:

April 14, 2009 at 12:37 am

Great Startup Article towards FINDABILITY. Will be following your upcoming work in this field.

Regards
PN

renne walker says:

June 30, 2009 at 1:33 am

VERY GOOD article

the title says it all – “… adding findability to your information”. absolutely correct. we do not make information findable by simply deploying a search engine – throwing a search engine at the business problems caused by content and users eluding each other in our information access systems. rather, we build findability into our content so we can then successfully configure content to be accurately and precisely found by, say, search. by using a variety of methods – metadata design, editorially selected presentation (based on understanding our core content and the common questions asked of our systems) etc.

search works through algorithms. parameters, sitting on top of the algorithms, can be configured to make the algorithms work “better” … BUT the key task of all of us trying to build content supply chains that work all the way through to the fact finding front end is – to make our content workable with search algorithms … and not the other way around. any organization that solves its top 1000 fact finding questions excellently (and ongoingly excellently) makes a very large number of its workers more effective (and efficient). and that happens by building requirements based on the concepts covered, and alluded to, in this article …

regards,
rw
