Tag Archives: semantic web

First PIRST: Pune Information Retrieval and Semantic Technology meeting – July 18

Click on the image to get all PuneTech articles related to the Pune Open Coffee Club

PIRST – the Pune Information Retrieval and Semantic Technology – is a special interest group within POCC (the Pune Open Coffee Club) that is focused on search technologies and the semantic web. PIRST has its kickoff meeting this Saturday, July 18th, from 9:30am to 2pm, at SICSR, Model Colony. The event is free for all to attend, but you must register here.

This meetup is geared towards learning about IR & ST, networking among professionals interested or active in this area, and brainstorming on possibilities and ideas in this area. The following information is tentative:

Speakers

  • Shashikant Kore, Co-founder, Bandhan.com
  • Abhay Shete, Founder, FortyTwo
  • Rajan Chandi, Founder, OpenWeb Labs
  • Bhasker Kode, Founder, Hover.in
  • Atul Tulshibagwale, Founder, Web2rank

If you are interested in speaking at this event, please contact Atul Tulshibagwale (atultulshi gmail)

Agenda

Each individual talk is expected to be 45 minutes, with 15 minutes for Q&A.

  • 9:30am – 10:15am – Survey of startups in IR&ST – Atul Tulshibagwale
  • 10:15am – 11:00am – Survey of various semantic technologies – Rajan Chandi
  • 11:00am – 11:30am – Tea Break
  • 11:30am – 12:15pm – Lucene primer – Shashikant Kore
  • 12:15pm – 1:30pm – Roadmap of required Math – Abhay Shete
  • 1:30pm – 2:00pm – Panel: Future of IR&ST – All Speakers

Beyond keyword search: Adding Findability to your information

(PuneTech reader Titash Neogi has been in the information architecture domain for many years, and is passionate about making information more accessible to people. In recent times, he has been studying why information is so difficult to find within large enterprises, and the approaches for solving that problem. In this article, he gives us an overview of the concept of “findability” of information in an enterprise.)

One of the greatest human needs that has evolved in the 21st century is the need to know as much as possible about something before making a decision. While human beings have forever been driven towards learning and knowing more, in the last decade technology has added a new dimension to this.

There was a time when we could take the word of our neighbour, colleague, friend or the man at the grocery shop and reach a decision. Since information was only available in finite ways and in finite volumes, knowing more offered little competitive edge or wide-reaching impact.

The Internet has changed all of that. Today, the fear that there’s knowledge out there that could be relevant to our decisions, and that we are getting a lesser deal by not using it, haunts us all.

And this need in turn has fuelled the growth of information, made it more complex to deal with and more voluminous in size. Pick any topic and you will find thousands of pages on the Internet related to that topic. There are facts, figures, opinions, comments, user reviews. Information, unlike wealth, has grown in direct proportion to its usage.

This information fire hose impacts both individuals and enterprises. While individuals crave to know as much as possible before committing to something, enterprises find their customers more demanding or their competition more informed.

Staying on top of this complex, voluminous information tidal wave has become crucial for survival. As a response to this, search companies have sprung up, with Google in the lead. For about a decade now, search engines of all sorts have been battling it out with terabytes of content on the Internet.

However, the Internet (and other networks within or without the enterprise) is moving from being an information store to being a knowledge network. As the volume and complexity of knowledge grow, search as we know it today is becoming inadequate. Search companies are losing ground fast.

Search as a tool is fine for information stores, but poor for knowledge bases. Knowledge bases need to have findability. Search is used when you know exactly what you are looking for, and are trying to figure out where it is. Findability is when you have only a vague idea of what you want to achieve, and you rely on the knowledge base to guide you towards more and more information that is useful and relevant to you.

Understanding Findability

Findability could mean different things to different people or even to the same person at different times. Peter Morville, who is credited with coining the term findability, defines it as:

The quality of being locatable or navigable. At the item level, we can evaluate to what degree a particular object is easy to discover or locate. At the system level, we can analyze how well a physical or digital environment supports navigation and retrieval.

While a lot of people confuse findability with search, the two are really not the same thing. Search tries to solve the problem of locating information that you already know exists somewhere in a corpus.

Findability encompasses search, but also deals with the problem of how to make the searcher aware of other relevant information that they didn’t know existed in the corpus. Findability exposes the knowledge within a corpus.

For example, when you are looking for a home loan rate from IDBI bank, it’s a search problem. You want to locate the document/URL that talks about the interest rates of IDBI. However, if you were someone very new to the entire banking/housing scenario and you didn’t know the names of any banks in India, or what loans they offered, you are dealing with a findability problem. Solving this problem will definitely involve search, but you can immediately see that it also involves a lot of information engineering, semantic modelling and usability engineering.

So in a sense, Findability is the big daddy of search. The example above might sound very impractical, but you can easily abstract it and see that it applies to a lot of scenarios in different ways. The future is about solving findability problems.

While findability concerns all of us in our everyday life, it poses some interesting challenges for modern enterprises. This article focuses on findability issues in the enterprise and offers some pointers on how to solve them.

Why is findability important for organizations?

Internally, as organizations grow bigger in size and complexity and run short on budgets and time, no one can afford to waste energy and resources in duplicating the knowledge that already exists in the organization. Smart companies will leverage the knowledge within their workforce and beat their competition. Enterprise 2.0 is about knowledge competition – do we know what we know, and can we use that in new ways?

Externally, as product complexity increases and newer offerings come up, it is going to be a crucial challenge for a company to communicate with its customers and let them know the range of product offerings that they have. An effective findability solution allows customers to automatically explore new products and solutions that might have come up, and might be more relevant for them.

The four-headed monster

To outline a few typical findability problems in an enterprise:

Brute Force findability: This is the most elementary form of findability problem, and can be simply classified as search. It’s like performing a grep over the content base with a particular pattern.

“I remember that the document contained the word log = 2 in its text”

Today’s search engines are very good at solving these problems. In fact this form of search has evolved simply because of the inability of search engines to understand natural language, leading users to rely on grep techniques to find documents/results in the quickest possible manner.

However, as organizations grow and we move from data to knowledge, this form of search will increasingly become impossible to scale or use.
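To make the grep analogy concrete, here is a minimal sketch of brute force findability in Python. The corpus, the document names and the helper `brute_force_find` are all hypothetical and exist only for illustration; a real content base would live on disk or in a content management system.

```python
import re

# Hypothetical in-memory corpus: document name -> text.
CORPUS = {
    "tuning-guide.txt": "Set log = 2 to enable verbose tracing.",
    "release-notes.txt": "Logging defaults changed in this release.",
    "faq.txt": "If log = 2 is too noisy, lower the level.",
}

def brute_force_find(pattern, corpus):
    """Return the sorted names of documents whose text matches the regex."""
    regex = re.compile(pattern)
    return sorted(name for name, text in corpus.items() if regex.search(text))

# The user only remembers that the document contained "log = 2".
print(brute_force_find(r"log\s*=\s*2", CORPUS))
# prints ['faq.txt', 'tuning-guide.txt']
```

Every query scans every document, and the user has to remember the exact wording, which is why this approach stops being practical as the content base grows.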

Knowledge findability: This is the next level of findability and comprises the most common “how do I” or “what is” kind of questions.

“How do I know if I really need to move my web application from struts 1 to struts 2?”

“What kind of data backup product should I buy if I am a SOHO with a limited budget?”

Index-based search engines are trying various complex algorithms to solve these findability problems.

The last few years have seen a lot of semantic search solutions trying to tackle this problem, by performing semantic analysis of content and indexing it based on meaning rather than sheer word statistics.
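A rough sketch may help show why an index built purely on words struggles with such questions. The toy inverted index below is an assumption-laden illustration, not how any particular search engine works; the documents and the helpers `build_index` and `search` are hypothetical.

```python
from collections import defaultdict

# Hypothetical documents: id -> text.
DOCS = {
    1: "backup product for small office",
    2: "enterprise backup and restore guide",
    3: "struts migration guide",
}

def build_index(docs):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return sorted ids of documents that contain every query word."""
    words = query.lower().split()
    if not words:
        return []
    return sorted(set.intersection(*(index.get(w, set()) for w in words)))

index = build_index(DOCS)
print(search(index, "backup guide"))  # prints [2]
print(search(index, "backup soho"))   # prints []
```

The second query finds nothing even though document 1 is about backup for a small office, because the index only knows which words occur, not that "SOHO" and "small office" mean the same thing. That gap is what semantic indexing tries to close.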

People/expertise findability: A lot of times, we find people asking for in-house experts:

“Anyone who has worked on technology X + platform Y in the organisation”

Typically today, this is handled by word of mouth or the grapevine, which not only becomes impossible to scale in a geographically distributed organisation, but is also inefficient and limited in scope.

A semantic analysis engine, plugged into an HRMS database or an organisation’s intranet, can very effectively solve this problem. An index-based search in a similar scenario is likely to pull up a lot of noise and irrelevant results, rather than solving this problem.

Social findability: Findability that relates to knowledge implicit in the community.

“What do all Linux newbies read when they join the organisation?”

“What’s the best starting point for understanding deployment of Product X – the product guide or the support technote?”

No semantic or index-based search can ever completely fill this gap. A good approach to solving this problem would be to marry a social-tagging system such as del.icio.us or digg with a semantic analysis engine. The findability solution would need to work as a facilitator that allows people to share their personal experiences and knowledge around a product and build a knowledge community.
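As a small illustration of the social-tagging half of that idea only (the semantic analysis half is not shown), the sketch below ranks documents by how many distinct colleagues bookmarked them under a given tag. The bookmark data, the tag names and the helper `popular_for_tag` are hypothetical.

```python
from collections import Counter

# Hypothetical bookmarks collected from an internal tagging tool:
# (user, document, tag).
BOOKMARKS = [
    ("asha", "linux-basics-wiki", "linux-newbie"),
    ("ravi", "linux-basics-wiki", "linux-newbie"),
    ("meera", "shell-cheatsheet", "linux-newbie"),
    ("ravi", "product-x-guide", "deployment"),
    ("asha", "linux-basics-wiki", "linux-newbie"),  # repeated bookmark
]

def popular_for_tag(bookmarks, tag):
    """Rank documents by the number of distinct users who gave them this tag."""
    # Deduplicate so that each user counts once per document.
    unique = {(user, doc) for user, doc, t in bookmarks if t == tag}
    counts = Counter(doc for _, doc in unique)
    # Highest count first; ties broken by document name.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

print(popular_for_tag(BOOKMARKS, "linux-newbie"))
# prints [('linux-basics-wiki', 2), ('shell-cheatsheet', 1)]
```

The ranking comes entirely from what people in the community chose to bookmark, which is knowledge that neither a word index nor a semantic engine can derive from the documents themselves.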

What becomes obvious from this discussion is that findability is not a single technology or solution that can be purchased over the counter, deployed and then expected to perform wonders after a couple of hours of crawling or indexing. That is just plain vanilla search. Search can provide results to queries but not necessarily answers to questions. A good search engine can make your content searchable, but it does nothing to solve your findability problem.

The mistake that most organizations make today is to deploy a million-dollar search engine and then expect it to solve a problem it was never designed to solve – a findability problem.

Solving the Findability problem

A good findability platform needs to bring together expertise and lessons learnt from the fields of Semantic search, Usability, Information Architecture, Graphic Design and Text Engineering. And above all, people need to understand that a findability solution can never be a “one size fits all” solution. It can never be an appliance that you can deploy over your networks and forget about.

Think of a findability solution as an ERP solution. It needs to have various modules that can understand and talk to different information stores in the organisation. The first step in solving an organization’s findability problem is to analyze its findability needs and then deploy or develop all or some specific modules of that solution. Also important is the right combination of content strategy, user query analysis, and search and interface design.

Who’s working on Findability?

There are a lot of people trying to take a stab at findability. Solutions and products range from sophisticated semantic search applications to information architecture consulting firms. However, in the limited scope of this article, I would like to touch upon a few companies/individuals who stand out in their attempts to solve the findability problem.

First, Peter Morville at Semantic Studios is doing groundbreaking work in this area. The stuff he puts up on his site (www.findability.org) is pretty exciting and educational. There are also a few start-ups in this space, but I would like to mention connectbeam (www.connectbeam.com), a Bay Area-based start-up that caught my notice. They are trying to solve the social findability problem within the enterprise and I found their approach unique.

Finally

I am old school and I like to conclude my articles with something to think about. In a global, recession-ridden economy, findability affects your bottom line one way or another. To quote Peter Morville,

“You can’t use what you can’t find” (and neither can your customers)

About the author – Titash Neogi

Titash Neogi has been working with Symantec Corp (formerly VERITAS India) for the past six years in various roles in the customer support, knowledge management and content management divisions. At present he is the architect and lead developer for Symantec’s new semantic search based help system initiative.


Lecture series on Knowledge Representation

What: Overview of Knowledge Representation (this is the first in a series), by Prof. V.N. Jha

When: Thursday, 4th Sept, 6:30pm

Where: India International Multiversity, Sakal Nagar, Baner Road,

Registration and Fees: This event is free for everyone. There is no need to register.

Details:

This is a lecture series organized by Dr. Jha on “Knowledge Representation” using the Nyaya and Navya Nyaya techniques. Navya-Nyāya developed a sophisticated language and conceptual scheme that allowed it to raise, analyse, and solve problems in logic and epistemology. The lecture series will be an introductory course which will cover the basics and also look at the design principles of Sanskrit, inference schemas used in Nyaya etc.

The idea here is to get a fresh perspective on Knowledge Representation and to look at how these techniques could be used in today’s IT problems, ranging from better modeling in databases to better common sense representation systems.

About the Speaker – Prof. V.N. Jha

Prof. V. N. Jha is a specialist in various branches of Sanskrit learning and Navya Nyaya. All along he has been trying to promote Sanskrit studies through multi-disciplinary approaches in order to make such studies relevant to contemporary knowledge domains. He has visited several countries as a visiting professor and has delivered lectures. He has contributed over 40 books and over 100 articles. Over 25 students have received PhD degrees under his supervision.
