Category Archives: Money Matters

TiE Pune Seminar: To sell or not to sell – 24th Feb

What: TiE Pune Seminar on whether or not to sell your company, by Girish Godbole, CEO and Founder, CEO ally, Inc.
When: 24th Feb, 6pm
Where: Renu Electronics, Baner Road (Near Lexus Furniture)
Registration and Fees: Free for TiE Members, Rs. 250 for others, mail namita.shibad [at] gmail [dot] com to register.

To most of us, our companies are our own, forever. We never think of selling them. Why should we sell our own creations? What good can come of it? But in the West, companies are sold on a regular basis – there must be some benefit to it. Given that we are in the midst of a challenging global recession, one that most of us will witness only once in our lifetime, does selling out make sense? The events that will unfold over the next few quarters will result in widespread destruction of value – for companies, investors, management and employees – and will span industry verticals, geographies and sectors.

So would this be the right time to sell your company? How can you best position your company for a potential acquisition? Should you get a strategic investment into your company? Why is mid-market M&A still active? How can you best prepare your company for a strategic exit? The seminar will address these questions and walk you through all steps of a typical M&A deal from the point of view of a small to mid-sized technology company.

This workshop will be conducted by Girish Godbole, Founder and CEO of CEO ally, Inc., a boutique M&A and business advisory firm based in the U.S. Girish brings hands-on experience as a serial entrepreneur and executive who has built, operated and sold businesses successfully.

The workshop will address the following questions in detail:

  1. Should you sell your company? Why? When?
  2. Why do buyers buy?
  3. How can you best position your company to sell?
  4. What do buyers read into financials?
  5. What is a fair valuation?
  6. How does this all work? Can you do it alone?
  7. How will an acquisition affect your work life?

This seminar is free for TiE members. Non-members will be charged Rs. 250, payable by cash or cheque at the venue. The event will be held on Tuesday, Feb 24, 2009, 6.00pm, at Renu Electronics, Survey no. 2/6, Baner Road (lane next to Lexus Furniture, 250 metres after Hotel Mahabaleshwar going towards Mumbai), Pune 411045. Kindly email Namita at namita.shibad[at]gmail[dot]com to confirm your attendance.


FREEconomics: The economics of free stuff

Last week, SICSR (Symbiosis Institute of Computer Studies and Research) hosted GNUnify, a conference on open source technologies, which attracted hundreds of students and other open source enthusiasts. Shirish has written a couple of posts on his blog about the talks he attended – read those to get a flavor of GNUnify (day 1, day 2). Usually, the presentations and discussions revolve mostly around the technology, but I decided to talk not about the technology, but about the economics of open source in particular, and free stuff in general.

If so much stuff is being given away for free, how is it sustainable? Programmers need to eat, even if they are immersed in the ideology of the free software movement. Businesses who give away free services exist for making money. So it is instructive to follow the money trail and look at who is paying for the free stuff, who is making money and how. As more of the business world is pushed towards free (whether they want to or not), it is important to understand the various fine points of the economics and sustainability of this situation.

I’ve embedded my presentation below. If you are not able to see it, you can download the PDF.



Cloud Computing and High Availability

This article, discussing strategies for achieving high availability of applications built on cloud computing services, is reprinted with permission from the blog of Mukul Kumar of Pune-based ad optimization startup PubMatic.

Cloud computing has become very widespread, with startups as well as divisions of banks, pharmaceutical companies and other large corporations using it for computing and storage. Amazon Web Services has led the pack with its innovation and execution, with services such as the S3 storage service, the EC2 compute cloud, and the SimpleDB online database.

Many options exist today for cloud services – for hosting, for storage, and for hosted applications. Some examples:

  • Hosting: Amazon EC2, MOSSO, GoGrid, AppNexus, Google AppEngine
  • Storage: Amazon S3, Nirvanix, Microsoft Mesh, EMC Mozy, MOSSO CloudFS
  • Applications: opSource, Google Apps


The high availability of these cloud services becomes more important as companies come to rely on them for critical infrastructure. Recent outages of Amazon S3 have raised some important questions – see, for example, “S3 Outage Highlights Fragility of Web Services”.


There has been some discussion on the high availability of cloud services and some possible solutions. For example the following posts – “Strategy: Front S3 with a Caching Proxy” and “Responding to Amazon’s S3 outage“.

Here are some thoughts on how these cloud services can be made highly available, by following the traditional path of redundancy.

[Image: Basic cloud computing architectures config #1 to #3]

The traditional way of using AWS S3 is to use it with AWS EC2 (config #0). The configurations on the left decouple your computing and storage so that they do not depend on the same service provider: config #1, config #2 and config #3 mix and match some of the more flexible computing services with different storage services. In principle, the compute and the storage could each be replaced by a colo service.
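The decoupling these configurations rely on can be captured in code as a provider-neutral storage interface, so the compute layer never binds to one vendor. A minimal sketch – the class and method names are illustrative, not any provider's real API:

```python
class StorageBackend:
    """Provider-neutral interface; concrete backends would wrap
    S3, Nirvanix, a colo NAS, etc."""
    def put(self, key, data):
        raise NotImplementedError
    def get(self, key):
        raise NotImplementedError

class InMemoryBackend(StorageBackend):
    """Stand-in for a real provider, useful for tests and illustration."""
    def __init__(self):
        self.objects = {}
    def put(self, key, data):
        self.objects[key] = data
    def get(self, key):
        return self.objects[key]

class Application:
    """Compute layer: depends only on the interface, not on any one provider."""
    def __init__(self, storage):
        self.storage = storage
    def save_report(self, name, body):
        self.storage.put("reports/" + name, body)

app = Application(InMemoryBackend())
app.save_report("q1", "revenue up")
```

A real backend for S3 or Nirvanix would implement the same two methods over the provider's HTTP API; swapping providers then means swapping one constructor argument.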

[Image: Cloud computing HA configuration #4]

The configurations on the right are examples of providing high availability through a “hot-standby”. Config #4 makes the storage service a hot-standby, and config #5 separates the web-service layer from the application layer and makes the whole application+storage layer a hot-standby.

A hot-standby requires three things to be configured – rsync, monitoring and switchover. rsync needs to be configured between the online server and the hot-standby, to make sure that the application and data components on the standby stay up to date with the online server. So, for example, in config #4 one has to sync ‘Amazon S3’ to ‘Nirvanix’ – that’s pretty easy to set up. In fact, with more automation, we can “turn off” the standby server after making sure that the data source is synced up – though that assumes that the server provisioning time is an acceptable downtime, i.e. that the RTO (recovery time objective) is within acceptable limits.
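The sync step can be illustrated with a toy one-way sync that copies to the standby only the objects that are missing or changed, and deletes what no longer exists on the primary – the same idea rsync implements with checksums. The stores are modeled as plain dicts; this is a sketch of the logic, not a replacement for rsync:

```python
import hashlib

def checksum(data: bytes) -> str:
    """Content fingerprint, used to skip objects that are already in sync."""
    return hashlib.md5(data).hexdigest()

def sync(primary: dict, standby: dict) -> list:
    """One-way sync: copy to the standby only the objects that are
    missing or whose content differs. Returns the keys copied."""
    copied = []
    for key, data in primary.items():
        if key not in standby or checksum(standby[key]) != checksum(data):
            standby[key] = data
            copied.append(key)
    # remove objects that no longer exist on the primary (like rsync --delete)
    for key in [k for k in standby if k not in primary]:
        del standby[key]
    return copied
```

In a real deployment the dicts would be object stores behind different providers, and the checksum comparison would run against stored metadata rather than full downloads.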

[Image: Cloud computing Hot Standby Config #5]
This also requires that you monitor each of the web services. One might have to do service heartbeating, and this has to be designed for the application – monitoring Tomcat, MySQL, Apache or their sub-components each needs a different design. In theory it would be nice if a cloud computing service exported a status API. However, most of the time the provider’s status page is updated long after the service goes down, so that wouldn’t help much.
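The heartbeat logic itself is simple to sketch. The point below is the miss-counting: one failed probe should not trigger a switchover; only several consecutive misses should. The probe callable is an assumption – in practice it would be an HTTP ping, a MySQL `SELECT 1`, or a Tomcat status check, one per monitored service:

```python
class HeartbeatMonitor:
    """Declare a service down only after several consecutive missed
    heartbeats, to avoid flapping on a single transient failure."""
    def __init__(self, probe, max_misses=3):
        self.probe = probe            # callable returning True if healthy
        self.max_misses = max_misses
        self.misses = 0

    def check(self) -> str:
        if self.probe():
            self.misses = 0           # any success resets the counter
        else:
            self.misses += 1
        return "DOWN" if self.misses >= self.max_misses else "UP"
```

A monitoring loop would run `check()` on a timer for each tier and raise an alert (or trigger switchover) on the UP-to-DOWN transition.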

Switchover from the online server to the hot-standby would probably have to be done by hand. This requires a handshake with the upper layer, so that requests stop going to the old service and start going to the new one when you trigger the switchover. This gets interesting with stateful services, and where you cannot drop any packets the requests may have to be quiesced before the switchover takes place.
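A rough sketch of that quiesce-and-switch sequence – stop admitting new requests, wait for in-flight ones to drain, then flip the active endpoint. This is illustrative only; a real switchover would poll with a timeout rather than assert:

```python
class Switchover:
    """Quiesce, drain, then repoint the upper layer at the standby."""
    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby
        self.accepting = True
        self.in_flight = 0

    def begin_request(self):
        if not self.accepting:
            raise RuntimeError("quiescing: request refused")
        self.in_flight += 1

    def end_request(self):
        self.in_flight -= 1

    def switch(self):
        self.accepting = False           # 1. stop admitting new requests
        assert self.in_flight == 0       # 2. wait until drained (simplified)
        self.active, self.standby = self.standby, self.active  # 3. flip
        self.accepting = True            # 4. resume on the new endpoint
```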

[Image: Cloud computing multi-tier config #6]
Above are two configurations of multi-tiered web services, where each tier is built on a different cloud service. This is a theoretical configuration – today there are only a few good cloud services to choose from – but it may represent a possible future in which the space fragments across many service providers.

[Image: Multi-tier cloud computing with HA]
Config #7 is config #6 with hot-standby for each of the service layers. Again this is a theoretical configuration.

Cost Impact
Any of the hot-standby configurations has a cost impact – adding an extra layer of high availability immediately adds to the cost, at least doubling the cost of the infrastructure. This cost increase can be reduced by making only those parts of your infrastructure highly available that affect your business the most. It depends on how much business impact a downtime causes, and therefore how much money can be spent on the infrastructure.

One way to make these configurations more cost-effective is to run them active-active, also called a load-balanced configuration – such configurations use all the allocated resources and send traffic to both servers. This is much more difficult to design: for example, if you put the hot-standby storage in an active-active configuration, then every “write” (DB insert) must go to both storage servers, and a write must not be reported complete until it has completed on all replicas (also called mirrored write consistency).
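The mirrored-write rule can be sketched as: apply the write to every replica, and if any replica fails, undo the partial writes so no replica diverges. Replicas are modeled as dict-like stores purely for illustration:

```python
def mirrored_write(replicas, key, value):
    """Active-active write: report success only if every replica accepts
    the write; on any failure, roll back so no replica diverges."""
    done = []
    try:
        for r in replicas:
            r[key] = value        # each replica is a dict-like store
            done.append(r)
    except Exception:
        for r in done:            # undo the partial writes
            del r[key]
        raise
    return True
```

Real systems refine this with two-phase commit or quorum writes, but the invariant is the same: a write is complete only when all (or a quorum of) replicas have it.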

Cloud Computing becoming mainstream
As cloud computing becomes more mainstream, larger web companies may start using these services, putting part of their infrastructure on a compute cloud. For example, I can imagine a cloud dedicated to “data mining” being used by several companies; it might have servers with large HDDs and lots of memory, and specialize in cluster software such as Hadoop.

Lastly, I would like to cover my favorite topic – why would I still use services that cost more for my core services, instead of using cloud computing?

  1. The most important reason would be 24×7 support. Hosting providers such as ServePath and Rackspace provide support. When I call support at 2PM India time, there is a support guy picking up my call – that’s a great thing. Believe me, 24×7 support is a very difficult thing to do.
  2. These hosting providers give me more configurability for RAM/disk/CPU
  3. I can have more control over the network and storage topology of my infrastructure
  4. Point #2 above can give me consistent throughput and latency for I/O access, and network access
  5. These services give me better SLAs
  6. Security

About the Author

Mukul Kumar is a founding engineer and VP of Engineering at PubMatic. He is based in Pune and is responsible for PubMatic’s engineering team. Mukul was previously Director of Engineering at PANTA Systems, a high-performance computing startup. Before that, he joined Veritas India as its 13th employee and was Director of Engineering for the NetBackup group, one of Veritas’ main products. He has filed 14 patents in systems software, storage software, and application software, and proudly proclaims his love of π, which he can recite to 60 digits. Mukul is a graduate of IIT Kharagpur with a degree in electrical engineering.

This article is cross-posted from Mukul’s blog.


Archival, e-Discovery and Compliance

Archival of e-mails and other electronic documents, and the use of such archives in legal discovery is an emerging and exciting new field in enterprise data management. There are a number of players in this area in Pune, and it is, in general, a very interesting and challenging area. This article gives a basic background. Hopefully, this is just the first in a series of articles and future posts will delve into more details.


In the US, and many other countries, when a company files a lawsuit against another, before the actual trial takes place there is a pre-trial phase called “discovery”. In this phase, each side asks the other side to produce documents and other evidence relating to specific aspects of the case. The other side is, of course, not allowed to refuse, and must produce the evidence as requested. This discovery also applies to electronic documents – most notably, e-mail. Now unlike other documents, electronic documents are very easy to delete. And when companies were asked to produce certain e-mails in court as part of discovery, more and more of them began claiming that the relevant e-mails had already been deleted, or that they were unable to find the e-mails in their backups. The courts are not really stupid, and quickly decided that the companies were lying in order to avoid producing incriminating evidence.

This gave rise to a number of recent laws which specify, essentially, that relevant electronic documents cannot be deleted for a certain number of years, that they should be stored in an easily searchable archive, and that failure to produce these documents in a reasonable amount of time is a punishable offense. This is terrible news for most companies, because now all “important” emails (for a very loose definition of “important”) must be stored for many years. Existing backup systems are not good enough, because those are not really searchable by content. And the archives cannot be stored on cheap tapes either, because those are not searchable. Hence, they have to be on disk. This is a huge expense, and a management nightmare. But failure to comply is even worse. There have been actual instances of huge fines (millions of dollars, and in one case, a billion dollars) imposed by courts on companies that were unable to produce relevant emails in court. In some cases, the company executives were slapped with personal fines (in addition to fines on the company).
On the other hand, this is excellent news for companies that sell archival software that helps you store electronic documents for the legally sufficient number of years in a searchable repository. The demand for this software, driven by draconian legal requirements, is HUGE, and an entire industry has burgeoned to service this market. E-mail archival alone will soon be a billion-dollar market. (Update: Actually it appears that in 2008, archival software alone is expected to touch 2 billion dollars with a growth rate of 47% per year, and the e-discovery and litigation support software market will be 4 billion, growing at 27%. And this doesn’t count the e-discovery services market, which is much, much larger.) There are three major chunks to this market:

  • Archival – Ability to store (older) documents for a long time on cheaper disks in a searchable repository
  • Compliance – Ensuring that the archival store complies with all the relevant laws. For example, the archive must be tamperproof.
  • e-Discovery – The archive should have the required search and analysis tools to ensure that it is easy to find all the relevant documents required in discovery


Archival software started its life before the advent of these compliance laws. Basic email archival is simply a way to move all your older emails out of your expensive MS Exchange database onto cheaper, slower disks. Shortcuts are left in the main Exchange database so that if the user ever wants to access one of these older emails, it is fetched on demand from the slower archival disks. This is very much like a page fault in virtual memory. The net effect is that for most practical purposes, you’ve increased the sizes of peoples’ mailboxes without a major increase in price, with no decrease in performance for recent emails, and some decrease in performance for older emails. Unfortunately, these guys had only middling success. Think about it – if your IT department is given a choice between spending money on an archival software that will allow them to increase your mailbox size, or simply telling all your users to learn to live with small mailbox sizes, what would they choose? Right. So the archival software companies saw only moderate growth.

All of this changed when the e-discovery laws came into effect. Suddenly, archival became a legal requirement instead of a good-to-have bonus feature. Sales exploded. Startups started. And it also added a bunch of new requirements, described in the next two sections.
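The stub-and-fetch mechanism described above can be sketched in a few lines: old message bodies move to a cheap store, a stub marker stays behind in the primary store, and reading a stub “faults” the body back in from the archive. All names here are illustrative:

```python
class ArchivingMailbox:
    """Sketch of stub-based e-mail archival: old messages move to a
    cheap store, a stub stays behind, and access faults the body back
    in -- much like a page fault in virtual memory."""
    STUB = object()   # sentinel left in place of an archived body

    def __init__(self):
        self.primary = {}   # fast, expensive store (e.g. the Exchange DB)
        self.archive = {}   # slow, cheap store

    def deliver(self, msg_id, body):
        self.primary[msg_id] = body

    def archive_old(self, msg_id):
        """Move a message to the cheap store, leaving a shortcut behind."""
        self.archive[msg_id] = self.primary[msg_id]
        self.primary[msg_id] = self.STUB

    def read(self, msg_id):
        body = self.primary[msg_id]
        if body is self.STUB:            # "page fault": fetch on demand
            body = self.archive[msg_id]
        return body
```

From the user's point of view nothing changes: `read()` always returns the message, just more slowly if it has been archived.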


Before I start, I should note that “IT compliance” in general is a huge area, covering all kinds of software and services required by IT to comply with any number of laws (like Sarbanes-Oxley for accounting, HIPAA for medical records, etc.). That is not the compliance I am referring to in this article; here we deal only with compliance as it pertains to archival software.

The major compliance requirement is that for different types of e-mails and electronic documents, the laws prescribe the minimum number of years for which they must be retained by the company. And no company wants to keep any of these documents a day longer than the law minimally requires. Hence, for each document, the archival software must maintain the date until which the document must be retained, and on that day it must automatically delete the document – except if the document is “relevant” to a case that is currently running, in which case it cannot be deleted until the case is over. This gives rise to the concept of a legal hold (or deletion hold) that is placed on a document, or a set of documents, as soon as it is determined to be relevant to a current case. The archival software ensures that documents with a deletion hold are not deleted even if their retention period expires; the hold is removed only after the case is over.

The archival software also needs to ensure that the archive is tamperproof. Even if the CEO of the company walks up to the system in the middle of the night, he should not be able to delete or modify anything. Another major compliance requirement is that the archival software must make it possible to find “relevant” documents in a “reasonable” amount of time. The courts have their own definitions of what “relevant” and “reasonable” mean in this context, but we’ll not get into that. What it really means for the developers is that there should be a fairly sophisticated search facility that allows searches by keywords, by regular expressions, and by various fields of the metadata (e.g., find me all documents authored by Basant Rajan from March to September 2008).
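The retention-plus-legal-hold rules translate naturally into code: each document carries a retain-until date, a purge pass deletes expired documents, and a hold from any open case pins a document past its expiry. A toy model, not any vendor's actual schema:

```python
from datetime import date

class ComplianceArchive:
    """Sketch of retention with legal holds: a document is purged once
    its retention expires, unless a hold for an open case pins it."""
    def __init__(self):
        self.docs = {}    # doc_id -> (body, retain_until date)
        self.holds = {}   # doc_id -> set of case ids holding it

    def add(self, doc_id, body, retain_until):
        self.docs[doc_id] = (body, retain_until)

    def place_hold(self, doc_id, case_id):
        self.holds.setdefault(doc_id, set()).add(case_id)

    def release_hold(self, doc_id, case_id):
        self.holds[doc_id].discard(case_id)

    def purge_expired(self, today):
        """Delete expired documents -- but never one under a legal hold."""
        for doc_id in list(self.docs):
            _, retain_until = self.docs[doc_id]
            if today > retain_until and not self.holds.get(doc_id):
                del self.docs[doc_id]
```

The tamperproofing requirement is the one piece this sketch cannot show: in a real archive, no API (and no administrator) would be able to delete or modify a document outside of this purge path.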


Sadly, just having a compliant archive is no longer good enough. Consider a typical e-discovery scenario. A company is required to produce all emails authored by Basant Rajan pertaining to the volume manager technology in the period March to September 2008. Now just producing all the documents by Basant for that period which contain the words “volume manager” is not good enough. Because he might have referred to it as “VM“. Or he might have just talked about space optimized snapshots without mentioning the words volume manager. So, what happens is that all emails written by Basant in that period are retrieved, and a human has to go through each email to determine whether it is relevant to the volume manager or not. And this human must be a lawyer. Who charges $4 per email because he has to pay off his law school debt. For a largish company, a typical lawsuit might involve millions of documents. Literally. Now you know why there is so much money in this market. Just producing all documents by Basant and dumping them on the opposing lawyers is not an option, because the company does not want to disclose to the opposing side anything more than is absolutely necessary. Who knows what other smoking guns are present in Basant’s email?

Thus, a way for different archival software vendors to differentiate themselves is the sophistication they can bring to this process. The ability to search for concepts like “volume management” as opposed to the actual keywords. The ability to group/cluster a set of emails by concepts. The ability to allow teams of people to collaboratively work on this job. The ability to search for “all emails which contain a paragraph similar to this paragraph“. If you know how to do this last part, I know a few companies that would be desperate to hire you.
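As a taste of what “similar to this paragraph” search involves, here is the crudest possible version – bag-of-words cosine similarity, which ranks e-mails by shared vocabulary rather than exact keyword match. Real products use far more sophisticated concept extraction; this is only to make the idea concrete:

```python
import math
import re
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two texts: 1.0 for
    identical word distributions, 0.0 for no words in common."""
    va, vb = (Counter(re.findall(r"[a-z]+", t.lower())) for t in (a, b))
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_emails(query: str, emails: dict) -> list:
    """Return e-mail ids sorted by similarity to the query paragraph."""
    return sorted(emails,
                  key=lambda i: cosine_similarity(query, emails[i]),
                  reverse=True)
```

Even this toy version shows why concept search beats keyword search for review: an e-mail about “space optimized snapshots” ranks above an unrelated one without ever containing the exact phrase the reviewer typed.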

What next?

In Pune, there are at least two companies, Symantec and Mimosa Systems, working in this area. (Mimosa’s President and CEO, T.M. Ravi, is currently in town and will give the keynote at CSI-Pune’s ILM Seminar this Thursday. It might be worth attending if you are interested in this area.) I also believe that CT Summation’s CaseVault system has a development team here, but I am unable to find any information about that – if you have a contact there, please let me know. For some (possibly outdated) details of the other (worldwide) players in this market, see this report from last year. If you are from one of these companies and can write an article on what exactly your software does in this field, and how it is better than the others, please let me know. I also had a very interesting discussion with Paul C. Easton of True Legal Partners, an e-discovery outsourcing firm, where we talked about how they use archiving and e-discovery software, and more generally about legal outsourcing to India, the suitability of Pune for the same, competition from China, etc. I will write an article on that sometime soon – stay tuned, by subscribing to PuneTech by email or via RSS.