LetsVenture CEO Manish Singhal and Indian Angel Network’s Sharad Sharma acted as lead investors for the deal.
About the company:
The company was founded in 2012 by Kunal Lagwankar, a former Patni Computers employee. AdSparx uses server-side technology to serve device-independent, high-quality, targeted pre- and mid-roll video ads for linear TV, live events and videos on demand. It counts NexGTV, Vodafone, Airtel and Sony Liv among its notable clients. The startup has done mid-roll ads for online streaming of the Indian Premier League, the French Open and the New Zealand Cricket Series in the past.
About the product:
AdSparx’s unique real-time targeted Ad Serving technology works for all devices, irrespective of OS or form-factor, while retaining smooth switching between video and Ad content for a seamless experience
Linear TV, Live Events & On-demand Video
AdSparx works with your existing content delivery set up and requires no modification. AdSparx easily integrates with Wowza, Adobe FMS and Apache servers and content delivery networks
‘Lowest’ time to Market: A single point of integration on your servers for all your Ad serving needs. AdSparx does not require any changes in your client apps and cuts down your time to market drastically
About the team:
The founders all seem to be people who worked in Patni, founded Novix Media in 2005, then joined Patni in 2008, and finally started AdSparx in 2012. Details:
A few days back, we reported that Startup Saturday this month features Ganesh Natarajan and the Indian Angels Network, and will be on Saturday, 11th, 3pm to 6pm. Note, however, that there has been a last minute shift in venue for this event from the usual Startup Saturday location to Yashada on Baner Road. The event will now be held in MDC Conference hall No V. This is on the 1st floor of the auditorium building (first building after you enter the gate, next to parking).
The event is free for all to attend. See the original announcement for all other details, including registration information.
What: “Financing Your Startup” Startup Saturday Pune event with Indian Angels Network and Ganesh Natarajan
When: Saturday, Sept 11, 3pm-6pm
Where: MDC Hall V, 1st floor, Auditorium Building, YASHADA, Baner Road.
Registration & Fees: This event is free for all. Register here
Financing your venture – with Ganesh Natarajan / IAN
Financing your venture is one of the most challenging tasks for a start-up. It’s easy to get customers, employees, technologies but finance is tricky. 10 years ago, you could have gone to a VC. Today that is not an option. So how do you finance your start-up?
Thankfully there are lots of other options. Funds are available from friends and family, angel investors, government bodies like MSME, SIBRI, NMITLI, incubators and angel investors’ networks.
To throw light on this subject, we are getting veterans who have been there, done that for Startup Saturday Pune 10. Mr. Ganesh Natarajan, Chairman of NASSCOM and Global CEO, Zensar, will give the keynote address. He wears various hats. Here he will represent the Pune chapter of “Indian Angel Network”.
3:00 – 3:15 Introduction
3:15 to 3:45 Keynote address by Mr Ganesh Natarajan, Chairman of Nasscom and Global CEO of Zensar Technologies.
3:45 to 4:00 Funding schemes from Government of India, by Kaushik Gala, NCL Venture Center
4:00 to 4:15 Crowd funding as an option by Satish Kataria, Grow VC
4:15 – 4:30 Lightning pitches from three promising startups
4:45 – 5:00 Closing Remarks
5:00 – 6:00 Networking and Snacks (On the House)
About IAN – Indian Angel Network
The Indian Angel Network (IAN) is India’s first angel investment network and looks to invest up to US$ 1 mn, though their sweet spot is between US$ 200K and 400K. Apart from funding, the Network also seeks to provide mentoring, strategic thought leadership and leverage the Network’s network for the investee companies. The Network has met with early successes and has already invested in 22 companies across multiple sectors.
Indian Angel Network (IAN) currently has over 125 members drawn from across the country and some from overseas, including leading lights from diverse sectors. Members include people such as Jerry Rao, Saurabh Srivastava, Pramod Bhasin, Raman Roy, Rajiv Luthra, Pradeep Gupta, Sunil Munjal, Arvind Singhal and institutions such as IBM, SIDBI, Spice Televentures, Intel, etc.
About Startup Saturday
Startup Saturday, Pune, is a forum aimed at deepening the skills of the startup community in Pune, so that more successful startups come out of the city through the creation of a vibrant innovation ecosystem. As with other cities, SS Pune will also be held on the second Saturday of the month.
An SS session is about rich discussions on topics of interest to startups in the city. A typical session would have only about 25% of the time devoted to a talk/presentation, with the rest of the time dedicated to freewheeling discussion, as that is where, in our experience, the audience makes the best use of the available expert.
For example, an investment banker working on a deal will use several applications, such as MS Excel to do financial analysis and modelling of companies, and MS PowerPoint and various in-house databases to obtain information and do analysis.
Sapience will be customized to register these applications as work applications, and will calculate how much time the banker spent on them at the end of the day.
This would help his managers know how many hours the investment banker actually spent working, out of the time he was in office. They can also find out if the banker was spending too much time on some aspects of the work.
The article further points out that:
The software can be installed at company data centres. Smaller firms without a data centre can operate it from a so-called cloud server managed by InnovizeTech.
Its target consumers are software firms, banks, insurance firms and other firms whose employees use computers to deliver their output.
The key USP of Sapience is that it is a highly automated method of accounting for time spent by employees on different software packages (and hence different activities). While information can be manually fed, Sapience has an API that encourages programmatic sourcing of this information. Further, it uses learning and rules based intelligence to increasingly automate this activity. It can also handle various difficult cases, like different employees sharing the same PC, or the same employee using different machines, or an employee logging in remotely to a server. They have applied for a global patent on their technology.
It then aggregates the per-employee information at team, project, and other company levels and locations. The product’s analytics and trend engine then provides insightful information that helps senior management to enhance overall business efficiency, and individual and teams to improve their own productivity.
Sapience is priced on a per-user basis. The per-user permanent license fee is equivalent to a few hours of average per-employee cost to company. They point out on their website that they demonstrate savings of several hours of productivity within the first 30 days of deployment. Therefore, the Return on Investment (ROI) period is typically one month.
Innovize Tech was started last year by Swati Deodhar, Shirish Deodhar, Hemant Joshi and Madhukar Bhatia. The Pune startup community will remember that Shirish, Hemant and Madhukar were also the people behind nFactorial software, the Startup Mentoring company. nFactorial has not been accepting any new mentorship engagements for a while now, and the founders are now primarily focusing on Innovize Tech. More details on the executive team of Innovize Tech are on their About Us page.
And this is not confined to national boundaries. It is one of only two (as far as I know) Pune-based companies to be featured in TechCrunch (actually TechCrunchIT), one of the most influential tech blogs in the world (the other Pune company featured in TechCrunch is Pubmatic).
Why all this attention for Druvaa? Other than the fact that it has a very strong team that is executing quite well, I think two things stand out:
It is one of the few Indian product startups that are targeting the enterprise market. This is a very difficult market to break into, both because of the risk-averse nature of the customers and because of the very long sales cycles.
Unlike many other startups (especially consumer oriented web-2.0 startups), Druvaa’s products require some seriously difficult technology.
The rest of this article talks about their technology.
Druvaa has two main products. Druvaa inSync allows enterprise desktop and laptop PCs to be backed up to a central server with over 90% savings in bandwidth and disk storage utilization. Druvaa Replicator allows replication of data from a production server to a secondary server near-synchronously and non-disruptively.
We now dig deeper into each of these products to give you a feel for the complex technology that goes into them. If you are not really interested in the technology, skip to the end of the article and come back tomorrow when we’ll be back to talking about google keyword searches and web-2.0 and other such things.
This is Druvaa’s first product, and is a good example of how something that seems simple to you and me can become insanely complicated when the customer is an enterprise. The problem seems rather simple: imagine an enterprise server that needs to be on, serving customer requests, all the time. If this server crashes for some reason, there needs to be a standby server that can immediately take over. This is the easy part. The problem is that the standby server needs to have a copy of all the latest data, so that no data is lost (or at least very little data is lost). To do this, the replication software continuously copies all the latest updates of the data from the disks on the primary server side to the disks on the standby server side.
This is much harder than it seems. A simple implementation would simply ensure that every write of data that is done on the primary is also done on the standby storage at the same time (synchronously). This is unacceptable because each write would take unacceptably long and this would slow down the primary server too much.
If you are not doing synchronous updates, you need to start worrying about write order fidelity.
Write-order fidelity and file-system consistency
If a database writes a number of pages to the disk on your primary server, and if you have software that is replicating all these writes to a disk on a stand-by server, it is very important that the writes should be done on the stand-by in the same order in which they were done at the primary servers. This section explains why this is important, and also why doing this is difficult. If you know about this stuff already (database and file-system guys) or if you just don’t care about the technical details, skip to the next section.
Imagine a bank database. Account balances are stored as records in the database, which are ultimately stored on the disk. Imagine that I transfer Rs. 50,000 from Basant’s account to Navin’s account. Suppose Basant’s account had Rs. 3,00,000 before the transaction and Navin’s account had Rs. 1,00,000. So, during this transaction, the database software will end up doing two different writes to the disk:
write #1: Update Basant’s bank balance to 2,50,000
write #2: Update Navin’s bank balance to 1,50,000
Let us assume that Basant and Navin’s bank balances are stored at different locations on the disk (i.e. on different pages). This means that the above will be two different writes. If there is a power failure after write #1 but before write #2, then the bank will have reduced Basant’s balance without increasing Navin’s balance. This is unacceptable. When the database server restarts after power is restored, Rs. 50,000 will have been lost.
After write #1, the database (and the file-system) is said to be in an inconsistent state. After write #2, consistency is restored.
It is always possible that at the time of a power failure, a database might be inconsistent. This cannot be prevented, but it can be cured. For this, databases typically do something called write-ahead-logging. In this, the database first writes a “log entry” indicating what updates it is going to do as part of the current transaction. And only after the log entry is written does it do the actual updates. Now the sequence of updates is this:
write #0: Write this log entry “Update Basant’s balance to Rs. 2,50,000; update Navin’s balance to Rs. 1,50,000” to the logging section of the disk
write #1: Update Basant’s bank balance to 2,50,000
write #2: Update Navin’s bank balance to 1,50,000
Now if the power failure occurs between writes #0 and #1 or between #1 and #2, then the database has enough information to fix things later. When it restarts, before the database becomes active, it first reads the logging section of the disk and goes and checks whether all the updates that were claimed in the logs have actually happened. In this case, after reading the log entry, it needs to check whether Basant’s balance is actually 2,50,000 and Navin’s balance is actually 1,50,000. If they are not, the database is inconsistent, but it has enough information to restore consistency. The recovery procedure consists of simply going ahead and making those updates. After these updates, the database can continue with regular operations.
(Note: This is a huge simplification of what really happens, and has some inaccuracies – the intention here is to give you a feel for what is going on, not a course lecture on database theory. Database people, please don’t write to me about the errors in the above – I already know; I have a Ph.D. in this area.)
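To make the log-then-update idea concrete, here is a minimal sketch in Python. It is a toy, not how any real database does it: the `TinyBank` class, its in-memory "disk" and the `crash_after` parameter are all made up for illustration, and the log is never trimmed.

```python
class TinyBank:
    """Toy redo log. 'log' and 'pages' stand in for two areas of the disk."""

    def __init__(self, pages):
        self.log = []             # logging section of the disk
        self.pages = dict(pages)  # data pages: account name -> balance

    def transfer(self, updates, crash_after=None):
        # write #0: record every update of the transaction in the log first
        self.log.append(dict(updates))
        # writes #1, #2, ...: only now touch the actual balances
        for done, (account, balance) in enumerate(updates.items()):
            if crash_after is not None and done == crash_after:
                return False  # simulated power failure
            self.pages[account] = balance
        return True

    def recover(self):
        # Re-apply every logged update. Repeating an update that already
        # happened is harmless, because each entry stores the final balance.
        for entry in self.log:
            for account, balance in entry.items():
                self.pages[account] = balance


bank = TinyBank({"Basant": 300000, "Navin": 100000})
# Power fails after write #1 (Basant) but before write #2 (Navin).
bank.transfer({"Basant": 250000, "Navin": 150000}, crash_after=1)
after_crash = dict(bank.pages)  # Rs. 50,000 is missing at this point
bank.recover()
after_recovery = dict(bank.pages)
```

After the simulated crash the two balances add up to Rs. 3,50,000 instead of Rs. 4,00,000; after `recover()` the total is back to Rs. 4,00,000.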
Note that in the above scheme the order in which writes happen is very important. Specifically, write #0 must happen before #1 and #2. If for some reason write #1 happens before write #0 we can lose money again. Just imagine a power failure after write #1 but before write #0. On the other hand, it doesn’t really matter whether write #1 happens before write #2 or the other way around. The mathematically inclined will notice that this is a partial order.
Now if there is replication software that is replicating all the writes from the primary to the secondary, it needs to ensure that the writes happen in the same order. Otherwise the database on the stand-by server will be inconsistent, and can result in problems if suddenly the stand-by needs to take over as the main database. (Strictly speaking, we just need to ensure that the partial order is respected. So we can do the writes in this order: #0, #2, #1 and things will be fine. But #2, #0, #1 could lead to an inconsistent database.)
Replication software that ensures this is said to maintain write order fidelity. A large enterprise that runs mission critical databases (and other similar software) will not accept any replication software that does not maintain write order fidelity.
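The partial-order requirement can be stated as a small check. The sketch below is only an illustration of the rule, not of how any replication product discovers the order; the function name and the way constraints are written down are assumptions made for this example.

```python
def respects_partial_order(replay_order, must_precede):
    """Return True if every (earlier, later) pair appears in that order."""
    position = {write: index for index, write in enumerate(replay_order)}
    return all(position[earlier] < position[later]
               for earlier, later in must_precede)


# Write #0 (the log entry) must land before writes #1 and #2.
# Nothing is said about the relative order of #1 and #2.
constraints = [(0, 1), (0, 2)]

same_order = respects_partial_order([0, 1, 2], constraints)
swapped_updates = respects_partial_order([0, 2, 1], constraints)
log_written_late = respects_partial_order([2, 0, 1], constraints)
```

Here `same_order` and `swapped_updates` are both `True`, while `log_written_late` is `False`, which matches the three orderings discussed above.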
Why is write-order fidelity difficult?
I can hear you muttering, “Ok, fine! Do the writes in the same order. Got it. What’s the big deal?” Turns out that maintaining write-order fidelity is easier said than done. Imagine that your database server has multiple CPUs. The different writes are being done by different CPUs. And the different CPUs have different clocks, so that the timestamps used by them are not necessarily in sync. Multiple CPUs are now the default in server class machines. Further imagine that the “logging section” of the database is actually stored on a different disk. For reasons beyond the scope of this article, this is the recommended practice. So, the situation is that different CPUs are writing to different disks, and the poor replication software has to figure out what order this was done in. It gets even worse when you realize that the disks are not simple disks, but complex disk arrays that have a whole lot of intelligence of their own (and hence might not write in the order you specified), and that there is a volume manager layer on the disk (which can be doing striping and RAID and other fancy tricks) and a file-system layer on top of the volume manager layer that is doing buffering of the writes, and you begin to get an idea of why this is not easy.
Naive solutions to this problem, like using locks to serialize the writes, result in unacceptable degradation of performance.
Druvaa Replicator has patent-pending technology in this area, where they are able to automatically figure out the partial order of the writes made at the primary, without significantly increasing the overheads. In this article, I’ve just focused on one aspect of Druvaa Replicator, just to give an idea of why this is so difficult to build. To get a more complete picture of the technology in it, see this white paper.
Druvaa inSync is a solution that allows desktops/laptops in an enterprise to be backed up to a central server. (The central server is also in the enterprise; imagine the central server being in the head office, and the desktops/laptops spread out over a number of satellite offices across the country.) The key features of inSync are:
The amount of data being sent from the laptop to the backup server is greatly reduced (often by over 90%) compared to standard backup solutions. This results in much faster backups and lower consumption of expensive WAN bandwidth.
It stores all copies of the data, and hence allows timeline based recovery. You can recover any version of any document as it existed at any point of time in the past. Imagine you plugged in your friend’s USB drive at 2:30pm, and that resulted in a virus that totally screwed up your system. Simply use inSync to restore your system to the state that existed at 2:29pm and you are done. This is possible because Druvaa backs up your data continuously and automatically. This is far better than having to restore from last night’s backup and losing all data from this morning.
It intelligently senses the kind of network connection that exists between the laptop and the backup server, and will correspondingly throttle its own usage of the network (possibly based on customer policies) to ensure that it does not interfere with the customer’s YouTube video browsing habits.
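Timeline based recovery boils down to keeping every saved version with its timestamp and, at restore time, picking the newest version that is not later than the requested moment. The sketch below shows just that lookup. The `VersionedFile` class is hypothetical and says nothing about how inSync actually stores versions; times are written as minutes since midnight to keep the example short.

```python
import bisect


class VersionedFile:
    """Keeps every saved version of one document, oldest first."""

    def __init__(self):
        self.times = []     # save times, assumed to arrive in increasing order
        self.contents = []  # contents[i] was saved at times[i]

    def save(self, when, content):
        self.times.append(when)
        self.contents.append(content)

    def as_of(self, when):
        # Index of the first version saved strictly after 'when'
        index = bisect.bisect_right(self.times, when)
        if index == 0:
            raise LookupError("no version existed at that time")
        return self.contents[index - 1]


doc = VersionedFile()
doc.save(10 * 60, "clean report")       # saved at 10:00am
doc.save(14 * 60 + 30, "infected junk")  # overwritten at 2:30pm

restored = doc.as_of(14 * 60 + 29)  # state as of 2:29pm
latest = doc.as_of(14 * 60 + 30)    # state as of 2:30pm
```

Asking for 2:29pm gives back `"clean report"`, while asking for 2:30pm gives the damaged version.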
Let’s dig a little deeper into the claim of 90% reduction of data transfer. The basic technology behind this is called data de-duplication. Imagine an enterprise with 10 employees. All their laptops have been backed up to a single central server. At this point, data de-duplication software can realize that there is a lot of data that has been duplicated across the different backups. That is, the 10 different backups contain a lot of files that are common: most of the files in the C:\WINDOWS directory, or all those large PowerPoint documents that got mail-forwarded around the office. In such cases, the de-duplication software can save disk space by keeping just one copy of the file and deleting all the other copies. In place of the deleted copies, it can store a shortcut indicating that if this user tries to restore this file, it should be fetched from the other backup and then restored.
Data de-duplication doesn’t have to be at the level of whole files. Imagine a long and complex document you created and sent to your boss. Your boss simply changed the first three lines and saved it into a document with a different name. These files have different names, and different contents, but most of the data (other than the first few lines) is the same. De-duplication software can detect such copies of the data too, and is smart enough to store only one copy of this document in the first backup, and just the differences in the second backup.
The way to detect duplicates is through a mechanism called document fingerprinting. Each document is broken up into smaller chunks. (How to determine what constitutes one chunk is an advanced topic beyond the scope of this article.) Now, a short “fingerprint” is created for each chunk. A fingerprint is a short string (e.g. 16 bytes) that is determined entirely by the contents of the chunk. The computation of a fingerprint is done in such a way that if even a single byte of the chunk is changed, the fingerprint changes. (It’s something like a checksum, but a little more complicated, to make it extremely unlikely that two different chunks accidentally end up with the same fingerprint.)
All the fingerprints of all the chunks are then stored in a database. Now every time a new document is encountered, it is broken up into chunks, fingerprints are computed, and these fingerprints are looked up in the database of fingerprints. If a fingerprint is found in the database, then we know that this particular chunk already exists somewhere in one of the backups, and the database will tell us the location of the chunk. Now this chunk in the new file can be replaced by a shortcut to the old chunk. Rinse. Repeat. And we get 90% savings of disk space. The interested reader is encouraged to google Rabin fingerprinting, shingling, Rsync for hours of fascinating algorithms in this area. Before you know it, you’ll be trying to figure out how to use these techniques to find who is plagiarising your blog content on the internet.
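The chunk-and-fingerprint loop described above can be sketched in a few lines of Python. This is a rough illustration under some loud assumptions: the `ChunkStore` class is invented for this example, chunks are cut at a fixed (and absurdly small) size rather than by the content-based methods real products use, and the fingerprint is a 16-byte BLAKE2b digest from the standard `hashlib` module.

```python
import hashlib

CHUNK_SIZE = 8  # unrealistically small, so the example is easy to follow


def fingerprint(chunk: bytes) -> bytes:
    # 16-byte digest: a collision is possible in principle, but very unlikely
    return hashlib.blake2b(chunk, digest_size=16).digest()


class ChunkStore:
    def __init__(self):
        self.chunks = {}  # the "database": fingerprint -> chunk, stored once

    def backup(self, data: bytes) -> list:
        """Store any unseen chunks; return the list of fingerprints."""
        recipe = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            fp = fingerprint(chunk)
            self.chunks.setdefault(fp, chunk)  # keep only the first copy
            recipe.append(fp)                  # the "shortcut" to the chunk
        return recipe

    def restore(self, recipe: list) -> bytes:
        return b"".join(self.chunks[fp] for fp in recipe)


store = ChunkStore()
your_doc = b"AAAAAAAABBBBBBBBCCCCCCCC"
boss_doc = b"XXXXXXXXBBBBBBBBCCCCCCCC"  # only the first chunk was changed

your_recipe = store.backup(your_doc)
boss_recipe = store.backup(boss_doc)
stored_chunks = len(store.chunks)  # 4 unique chunks instead of 6
```

Both documents can be rebuilt exactly with `store.restore(...)`, yet only four chunks are kept. Fixed-size cutting is the weak point of this sketch: inserting a single byte near the start of a file shifts every later chunk, which is one reason the choice of chunk boundaries is an advanced topic.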
Back to Druvaa inSync. inSync does fingerprinting at the laptop itself, before the data is sent to the central server. So, it is able to detect duplicate content before it gets sent over the slow and expensive net connection and consumes time and bandwidth. This is in contrast to most other systems that do de-duplication as a post-processing step at the server. At a Fortune 500 customer site, inSync was able to reduce the backup time from 30 minutes to 4 minutes, and the disk space required on the server went down from 7TB to 680GB. (source.)
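A rough sketch of why fingerprinting on the laptop saves bandwidth: the client computes fingerprints locally and ships only the chunks the server has never seen. Everything here is an assumption made for illustration, including the `chunks_to_send` function, the fixed 8-byte chunks, and the idea that the client already holds a set of fingerprints known to the server. It is not a description of inSync's actual protocol.

```python
import hashlib


def fingerprint(chunk: bytes) -> bytes:
    return hashlib.blake2b(chunk, digest_size=16).digest()


def chunks_to_send(data: bytes, known_on_server: set, size: int = 8):
    """Return (recipe, new_chunks).

    recipe lists every fingerprint in order; new_chunks holds the bodies
    of only those chunks the server does not already have.
    """
    recipe, new_chunks = [], {}
    for start in range(0, len(data), size):
        chunk = data[start:start + size]
        fp = fingerprint(chunk)
        recipe.append(fp)
        if fp not in known_on_server and fp not in new_chunks:
            new_chunks[fp] = chunk
    return recipe, new_chunks


server_index = set()  # fingerprints the server already stores

recipe1, upload1 = chunks_to_send(b"AAAAAAAABBBBBBBBCCCCCCCC", server_index)
server_index.update(upload1)  # the server now knows these three chunks

recipe2, upload2 = chunks_to_send(b"XXXXXXXXBBBBBBBBCCCCCCCC", server_index)

first_payload = sum(len(c) for c in upload1.values())   # 24 bytes
second_payload = sum(len(c) for c in upload2.values())  # 8 bytes
```

The second backup sends 8 bytes of chunk data instead of 24, plus the short list of fingerprints. A post-processing approach would have pushed all 24 bytes over the WAN first and only then discovered the duplicates.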
Again, this was just one example used to give an idea of the complexities involved in building inSync. For more information on other distinguishing features, check out the inSync product overview page.
Have questions about the technology, or about Druvaa in general? Ask them in the comments section below (or email me). I’m sure Milind/Jaspreet will be happy to answer them.
Also, this long, tech-heavy article was an experiment. Did you like it? Was it too long? Too technical? Do you want more articles like this, or less? Please let me know.