All posts by Navin Kabra

Call for Speakers: IndicThreads Conference Pune

The call for speakers for the IndicThreads software technology conference is open. The conference is in December, but the CFP closes this week (22 September), and you should submit a proposal.

IndicThreads have been holding tech conferences in Pune for the last 7 years, and their conferences are the top pure technology conferences in Pune. An IndicThreads conference is one of the best places to hear about the latest trends in the software industry, and to meet techies from large and small companies of not only Pune, but the rest of the country too.

The conference itself is paid, but becoming a speaker is a good way to get into the conference for free.

This time around, the conference will cover a wide range of technologies, from Java, Cloud Computing and Mobile App Development to emerging technologies like Big Data, Gamification and HTML5. (Traditionally, IndicThreads used to have a Java conference – but this year, they are broadening the theme.)

The CFP calls for submissions in these areas:

  • Software Architecture
  • Cloud Computing: IaaS, SaaS, PaaS
  • Design methodology
  • Mobile Software Platforms
  • Mobile Software Development
  • Software Testing
  • Optimization, Scaling, Caching and Performance Tuning
  • Java Language Specs & Standards
  • Enterprise Java (JavaEE)
  • JVM Languages
  • Software Security
  • Development Frameworks
  • Big Data
  • NoSQL Software Development
  • Agile
  • HTML5
  • New and emerging technologies
  • Case Studies and Real World Experiences

But feel free to submit other topics in software technology too. The audience consists mostly of software architects and project leads from various product and services companies across India. If you have done any interesting work in one of the areas above, you should submit a proposal. For now, all you need to do is submit a one paragraph abstract of what you’d like to talk about.

Why?

  • Because you get a free pass to the conference
  • Because you get recognized in the community as an expert in an area
  • It strengthens the tech community in Pune, which benefits all of us.

The submission deadline is 22 Sept, so submit your proposal now. For more details about the conference itself, see the conference webpage.

The Net Neutrality Debate – A Supply/Demand Perspective – V. Sridhar, Sasken

(This is a liveblog of a lecture on Network Neutrality by V. Sridhar, a Fellow at Sasken. This talk was delivered as a part of the Turing100@Persistent Lecture Series in Pune. Since it is being typed as the event is happening, it is not really well structured, but should rather be viewed as a collection of bullet points of interesting things said during the talk. For more information about Dr. Sridhar, see his website)

The Problem of Net Neutrality

The principle of “Net Neutrality” states that all traffic on the internet should be treated equally. Thus, the principle states that network service providers (i.e. the telecom companies) should not be allowed to discriminate (i.e. limit or disallow) on network connections and speeds based on the type of traffic. Thus, for example, under net neutrality, a telecom should not be allowed to disallow BitTorrent Downloads, or limit bandwidth for Skype or Video streaming, or provide higher speeds and better quality of service guarantees for just traffic generated by iPhones or US-based companies.

Telecom companies are trying to introduce systems by which different levels of service are provided for different types of traffic because, they argue, network neutrality is not economically viable.

The Demand for Network Services

  • Mobile broadband and 3G traffic is increasing exponentially
    • Even in India! In the last 7 months there has been 78% growth in 3G traffic, and 47% growth in 2G. India loves mobile broadband
    • Users are getting hooked to 3G. An average 3G user consumes 4 times more data than a 2G user. 3G is an acceptable alternative to wired broadband
    • Mobile data is growing fastest in smaller towns and villages (category B & C circles)
  • Video, voice, and streaming data are taking up huge chunks of bandwidth

NetHeads vs BellHeads

There are two major approaches to the network: the traditional telephone providers, who come from a circuit-switched telephone background (the BellHeads), and the people who come from the packet-switched internet protocol background (the NetHeads). The BellHeads believe that the network is smart and the endpoints are dumb; they believe in closed, proprietary networks; they expect payment for each service, often with per-minute charges; and they want to control the evolution of the network, along with everything else about it. They want strong regulations. The NetHeads’ philosophy is that the network is dumb and the endpoints are smart, so users should take all the decisions; they believe in an open community; they expect cheap or free services, with no per-minute charges; and they want the network to evolve organically without regulations.

To a large extent, the NetHeads are for net neutrality, while the BellHeads would rather abolish it in favor of carefully controlled tiered traffic.

The Supply Side

Land-line penetration is decreasing. On the other hand, mobile penetration continues to increase and is showing no signs of saturation. Fixed-line is losing its relevance, especially in emerging countries like India. This means that an increasing chunk of the internet bandwidth is going to be consumed by mobile devices.

The LTE (Long Term Evolution) mobile network is the fastest growing network ever. 300+ different operators all over the world are investing in LTE. This will come to India soon.

Mobile technologies are improving, and individual devices will soon be capable of handling 1Gbps data connections. This means that the capacity of the core network will have to go up to provide the speeds that the device is capable of consuming. And the NetHeads are making good progress in being able to provide high capacities for the core networks.

The problem is that the mobile spectrum is a scarce resource, and will soon become the bottleneck. The other problem is that chunks of the spectrum have to be exclusively allocated to individual operators. And then that operator has to operate just within that chunk.

The Problem of the Commons

When people have shared, unlimited access to a common resource, each will consume the resource without recognizing that this results in costs for everyone else. When the total amount that everybody would like to consume exceeds what is available, everybody suffers. This problem will affect the mobile spectrum: the spectrum gets congested, and bandwidth suffers.

How to solve the congestion problem?

  • Congestion pricing. For example, cheaper access after 9pm is an instance of congestion pricing – an attempt to convince some of the users to consume resources when they’re less congested.
  • During periods of congestion, bandwidth is scarce and hence should have high prices. On the other hand, when the network is not congested, then the additional cost of supporting an additional user’s downloads is minimal, hence the user should be given free or very cheap access.
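
As a rough illustration of the congestion-pricing idea above, here is a minimal sketch of a price that depends on how loaded the network is. The function name, the base price, the threshold and the linear ramp are all assumptions made for illustration; the talk did not give any specific pricing formula.

```python
def congestion_price(utilization, base_price=1.0, threshold=0.5, max_multiplier=4.0):
    """Hypothetical per-MB price as a function of network utilization (0.0 to 1.0).

    At or below `threshold` the network counts as uncongested and access is free.
    Above it, the price ramps linearly up to base_price * max_multiplier.
    All parameters are illustrative assumptions, not real tariffs.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    if utilization <= threshold:
        return 0.0
    # Fraction of the way from the threshold to a fully loaded network.
    ramp = (utilization - threshold) / (1.0 - threshold)
    return base_price * max_multiplier * ramp


for load in (0.2, 0.75, 1.0):
    print(load, congestion_price(load))
```

A real operator would more likely use time-of-day slabs (like the "cheaper after 9pm" example) than a continuous function, but the principle is the same: the price goes up only when an extra user imposes a cost on everyone else.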

The Net Neutrality Debate

The net neutrality position is that the maximum good of the maximum number of people will happen if network service providers do not discriminate amongst their customers.

No discrimination means:

  • No blocking of content based on its source, ownership or destination
  • No slowing down or speeding up of content based on source, ownership or destination

Examples of discrimination:

  • In 2005, Madison River Communications (an ISP) blocked all Vonage VoIP phone traffic
  • In 2007, Comcast in the US, restricted some P2P applications (like BitTorrent)
  • In 2009, AT&T put restrictions on what iPhone apps can run on its network
    • Disallowed SlingPlayer (IP based video broadcast) over its 3G network
    • Skype was not allowed to run over AT&T’s 3G network

The case for net neutrality:

  • Innovation: Operators/ISPs can kill innovative and disruptive apps if they’re allowed to discriminate
  • Competition: Operators/ISPs can kill competition by selectively disallowing certain applications. For example, if AT&T slows down Google Search, but speeds up Bing Search, this can cause Google Search to die.
  • Consumers: Operators/ISPs will have a strong grip on the consumers and other players will not get easy access to them. This will hurt the consumers in the long run.

The case against net neutrality:

  • Capacity is finite. Especially in the case of mobile broadband (because the spectrum is limited)
  • If there is no prioritization, a few apps will consume too much bandwidth and hurt everybody; and also it reduces the service provider’s motivation to increase bandwidth
  • Prioritization, and higher pricing for specific apps can be used to pay for new innovations in future network capacity increases

Broadband is a two-sided market:

  • Apps and broadband form a two-sided market.
    • Both, applications and bandwidth are needed by consumers
    • Without applications, users will not consume the bandwidth, because they have nothing interesting to do
    • Without bandwidth, users will not use applications, because they’ll be too slow
    • Hence both have to be promoted simultaneously
  • How should a two-sided market be handled?
    • Usually, one side needs to be subsidized so it can grow and help the other grow
    • Somebody needs to break this cycle and grow one side of this market, so that the other can then grow
    • For example, Google (an app/content provider) is buying fiber and providing 1Gbps connection in Kansas for $70 per month. Thus Google is subsidizing the bandwidth increase, and hopes that the users and apps will increase in proportion.
  • Regulatory and Policy implications
    • Two ways to handle this:
      • Ex Ante: come up with regulations and policies before problems occur
        • Because lawsuits are expensive
        • US is trying to do this – they have exempted mobile providers from net neutrality principles
        • Netherlands has passed net neutrality regulations – first country in the world. Mobile operators are not allowed to disallow or discriminate against services like Skype
        • Rest of Europe: public consultations going on
      • Ex Post: Let the problems occur and then figure out how to deal with them
  • Net Neutrality and India
    • No mention of net neutrality in the NTP (National Telecom Policy 2012)
    • Fair Usage Policy (FUP)
      • Is against net neutrality (maybe)
      • It discriminates against users, but does not discriminate against applications
      • But it is indirect discrimination against applications – because users who use BitTorrent and other bandwidth heavy applications will be more affected by FUP
      • Affects innovation – because users are discouraged from using innovative, bandwidth heavy applications

Event Report: The Work and Impact of Bob Kahn and Vint Cerf

(This is a liveblog of the Turing100@Persistent Lecture on Bob Kahn and Vint Cerf by R. Venkateswaran, CTO of Persistent Systems. Since it is being typed as the event is happening, it is not really well structured, but should rather be viewed as a collection of bullet points of interesting things said during the talk.)

Vint Cerf and Bob Kahn

Vint Cerf: Widely known as the father of the internet. He is President of the ACM, Chief Internet Evangelist at Google, and a former Chairman of ICANN, and has held many other influential positions. In addition to the Turing Award, he received the Presidential Medal of Freedom in 2005 and was elected to the Internet Hall of Fame in 2012.

Bob Kahn: Worked at AT&T Bell Labs and MIT. Then, while working at BBN, he got involved with DARPA and Vint Cerf, and together they worked on packet-switching networks and invented IP and TCP.

The birth of the internet: TCP and IP. 70s and 80s.

  • The Internet:

    • The first 20 years:
      • Trusted network
      • Defense, Research and Academic network
      • Non-commercial
      • Popular apps: email, ftp, telnet
    • Next 20 years:
      • Commercial use
      • Multiple levels of ownership – increased distrust and security concerns
      • Wide range of apps: email, WWW, etc
  • What did Vint Cerf and Bob Kahn do?

    • The problem:
      • There were many packet switched networks at that time
      • But very small, limited and self contained
      • The different networks did not talk to each other
      • Vint Cerf and Bob Kahn worked on interconnecting these networks
    • The approach

      • Wanted a very simple, and reliable interface
      • Non-proprietary solution. Standardized, non-patented, “open”
      • Each network talked its own protocol, so they wanted a protocol neutral mechanism of connecting the networks.
      • Each network had its own addressing scheme, so they had to invent a universal addressing scheme.
      • Packets (information slices) forwarded from one host to another via the “internetwork”
      • Packets sent along different routes, no guarantees of in-order delivery. Actually no guarantee of delivery
      • Packets have sequence numbers, so end point needs to reassemble them in order
    • The protocol

      • A “process header” identifies which process on the end host should be delivered the packets. This is today called the “port”
      • Retransmissions to ensure reliable delivery. And duplicate detection.
      • Flow control – to limit number of un-acknowledged packets, prevent bandwidth hogging
      • A conceptual “connection” created between the end processes (TCP), but the actual network (IP) does not know or understand this
      • Mechanism to set up and tear down the “connection” – the three-way handshake
      • These are the main contributions of their seminal paper
    • The Layered Network Architecture
      • Paper in 1974 defining a 4 layered network model based on TCP/IP.
      • This later became the basis of the 7 layer network architecture
    • The Internet Protocol
    • Packet-switched datagram network
    • Is the glue between the physical network and the logical higher layers
    • Key ideas:
      • Network is very simple
      • Just route the packets
      • Robust and scalable
      • Network does not guarantee anything other than best effort
        • No SLA, no guarantee of delivery, no guarantee of packet ordering
      • Dumb network, smart end-host
      • Very different from the existing, major networks of that time (the “circuit-switched” telephone networks of that time)
      • No state maintained at any node of the network
    • Advantages
      • Can accommodate many different types of protocols and technologies
      • Very scalable
    • The Transport Layer
    • UDP
      • Most simplistic higher level protocol
      • Unreliable, datagram-based protocol
      • Detect errors, but no error corrections
      • No reliability guarantees
      • Great for applications like audio/video (which are not too affected by packet losses) or DNS (short transactions)
    • TCP
      • Reliable service on top of the unreliable underlying network
      • Connection oriented, ordered-stream based, with congestion and flow control, bi-directional
      • State only maintained at the end hosts, not at the intermediate hosts
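
To make the "reliable, ordered stream on top of an unreliable network" idea concrete, here is a minimal sketch of what the receiving end host has to do with sequence numbers: discard duplicates and put the packets back in order. The function name and the packet format are assumptions for illustration, and sequence numbers here count whole packets starting at 0, which is a simplification (real TCP numbers bytes).

```python
def reassemble(packets):
    """Rebuild a byte stream from (sequence_number, payload) pairs.

    Packets may arrive out of order or more than once. Returns the
    in-order data available so far, stopping at the first missing packet.
    """
    received = {}
    for seq, payload in packets:
        # Duplicate detection: keep only the first copy of each sequence number.
        received.setdefault(seq, payload)

    data = []
    expected = 0
    while expected in received:
        data.append(received[expected])
        expected += 1
    return b"".join(data)


# Out-of-order arrival, with packet 2 delivered twice.
arrived = [(2, b"lo "), (0, b"he"), (1, b"l"), (2, b"lo "), (3, b"world")]
print(reassemble(arrived))  # prints b'hello world'
```

Note that all of this state lives at the end host. The network in between only forwards packets, which is exactly the "dumb network, smart end-host" split described above.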

Internet 2.0 – Commercialization

  • The birth of the world wide web: late 80s early 90s
    • Tim Berners-Lee came up with the idea of the world-wide-web
    • 1993: Mosaic, the first graphical web browser
    • First Commercial ISP (Internet Service Provider) – Dial up internet
    • Bandwidth doubling every 6 months
    • Push for multi-media apps
  • Push for higher bandwidth and rich apps
    • Net apps (like VoIP, streaming video) demand higher bandwidth
    • Higher bandwidth enables other new applications
    • Apps: email, email with attachments, streaming video, intranets, e-commerce, ERP, Voice over Internet, Interactive Video Conferencing
  • Dumb Network no longer works
    • Single, dumb network cannot handle all these different applications
    • Next Generation Networks evolved
    • Single, packet-switched network for data, voice and video
    • But with different levels of QoS guarantees for different services
  • Clash of Network Philosophies: BellHeads vs NetHeads (mid-90s)
    • Two major approaches: the BellHeads (circuit switched Telephone background), and the NetHeads (from the IP background)
    • BellHeads philosophy: network is smart, endpoints are dumb; closed, proprietary communities; expect payment for service; per-minute charges; Control the evolution of the network; want strong regulations
    • NetHeads philosophy: network is dumb, endpoints are smart; open community; expect cheap or free services; no per-minute charges; want network to evolve organically without regulations.
    • These two worlds were merging, and there were lots of clashes
    • BellHead network example: Asynchronous Transfer Mode (ATM) network
      • Fixed sized packets over a connection oriented network
      • Circuit setup from source to destination; all packets use same route
      • Low per-packet processing at each intermediate node
      • Much higher speeds than TCP/IP (10Gbps)
      • A major challenge for the NetHeads
    • Problems for NetHeads
      • To support 10Gbps and above, each packet needs to be processed in about 30ns or less (a minimum-size 40-byte packet is 320 bits, which takes 32ns to arrive at 10Gbps). This is very difficult to do because of all the processing needed (reduce TTL, lookup destination address, manipulate headers, etc)
      • As sizes of networks increased, sizes of lookup tables increased
      • Almost ready to concede defeat
    • IP Switching: Breakthrough for NetHeads
      • Use IP routing on top of ATM hardware
      • Switch to ATM circuit switching (and bypass the routing layer) if a long-running connection detected.
      • Late 90s, all IP networking companies started implementing variations on this concept
    • MPLS: Multi-Protocol Label Switching
      • Standard developed by IP networking companies
      • Insert a layer between the data link layer (layer 2) and IP (layer 3), hence considered layer 2.5
      • Separates packet forwarding from packet routing
      • Edges of the network do the full IP routing
      • Internal nodes only forward packets, and don’t do full routing
      • Separate forwarding information from routing information, and put forwarding info in an extra header (MPLS label – layer 2.5)
      • MPLS Protocol (mid-97)
        • First node (edge; ingress LSR) determines path, inserts MPLS label header
        • Internal nodes only look at the MPLS label, and forward appropriately, without doing any routing and without looking at the IP packet
        • Last node (edge; egress LSR) removes the MPLS label
        • Label switching at intermediate nodes can be implemented in hardware; significant reduction in total latency
      • MPLS is now the basis of most internet networking
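
The split between routing at the edge and label-only forwarding inside the network can be sketched roughly as follows. This is a toy model, not a real MPLS implementation: the router names ("A", "B", "C"), the prefixes, the label values and the table layout are all made up for illustration.

```python
import ipaddress

# Ingress LSR "A": does full IP routing (longest-prefix match), then pushes a label.
INGRESS_ROUTES = {
    ipaddress.ip_network("10.1.0.0/16"): (17, "B"),
    ipaddress.ip_network("10.1.2.0/24"): (18, "B"),
}

# Internal LSRs: exact-match lookup on the incoming label only.
# Each entry maps incoming label -> (outgoing label, next hop).
LABEL_TABLES = {
    "B": {17: (42, "C"), 18: (43, "C")},
    "C": {42: (None, "net-1"), 43: (None, "net-2")},  # None = pop the label (egress)
}


def ingress(dst_ip):
    """Pick the most specific matching prefix and return (label, next_hop)."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [net for net in INGRESS_ROUTES if addr in net]
    if not matches:
        raise LookupError(f"no route to {dst_ip}")
    best = max(matches, key=lambda net: net.prefixlen)
    return INGRESS_ROUTES[best]


def forward(dst_ip):
    """Trace a packet along the label-switched path and return the hops taken."""
    label, node = ingress(dst_ip)
    hops = ["A"]
    while label is not None:
        hops.append(node)
        # Internal nodes never look at dst_ip, only at the label.
        label, node = LABEL_TABLES[node][label]
    hops.append(node)
    return hops


print(forward("10.1.2.5"))
print(forward("10.1.9.9"))
```

The expensive step (scanning prefixes for the longest match) happens once, at the ingress. Every later hop is a single exact-match lookup on a small integer, which is the kind of operation that is easy to do in hardware.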

Internet 3.0: The Future

End of the network centric viewpoint. (Note: These are futuristic predictions, not facts. But, for students, there should be lots of good project topics here.)

  • Problems with today’s internet
    • Support for mobility is pretty bad with TCP/IP.
    • Security: viruses, spams, bots, DDOS attacks, hacks
      • Internet was designed for co-operative use; not ideal for today’s climate
    • Multi-homing not well supported by TCP/IP
      • Change in IP address results in service disruption
      • What if you change your ISP, your machine, etc?
      • Cannot be done seamlessly
    • Network is very machine/ip centric (“Where”)
      • What is needed are People-centric networks (“Who”) and content centric (“What”)
      • IP address ties together identity and location; this is neither necessary, nor desirable
  • Three areas of future research:
    • Delay Tolerant Network (DTN) Architecture
      • Whenever end-to-end delay is more than a few hundred milliseconds, various things start breaking in today’s networks
      • DTNs are characterized by:
        • Things that are not always connected to the network. For example, sensor networks, gadgets, remote locations. Another Example: remote villages in Africa have a bus visiting them periodically, and that gives them internet access for a limited time every day.
        • Extremely Long Delays
        • Asymmetric Data Rates
        • High Error Rates
      • Needs a store-and-forward network
    • Content-centric Networks
      • Instead of everything being based on IP-address, how about giving unique identifiers to chunks of content, and define a networking protocol based on this
      • Strategy: let the network figure out where the content is and how to deliver it
      • Security: the content carries the authorization info, and unauthorized access is prevented
    • Software Defined Networks
      • Virtualizing the Network
      • Search the net for: “OpenFlow”
      • Hardware Router only does packet forwarding, but end applications can update the routing tables of the router using the OpenFlow protocol. App has an OpenFlow controller that sends updates to the OpenFlow agent on the Hardware Router.
      • In the hardware/OS world, virtualization (VMWare, Xen, VirtualBox) is slowly taking over; OpenFlow is a similar idea for network hardware
      • Oracle, VMWare have had major acquisitions in this space recently
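
For the content-centric idea in the list above (naming chunks of content instead of naming machines), here is a minimal sketch. It assumes that a chunk's identifier is simply the SHA-256 hash of its bytes. That is one possible naming scheme chosen for illustration; the talk did not specify one, and the class and method names are hypothetical.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: chunks are named by the SHA-256 of their bytes."""

    def __init__(self):
        self._chunks = {}

    def publish(self, data: bytes) -> str:
        """Store a chunk and return its name (a hex digest)."""
        name = hashlib.sha256(data).hexdigest()
        self._chunks[name] = data
        return name

    def fetch(self, name: str) -> bytes:
        """Return the chunk with this name, checking that it matches the name."""
        data = self._chunks[name]
        # Anyone holding the name can verify the chunk, whichever node served it.
        if hashlib.sha256(data).hexdigest() != name:
            raise ValueError("chunk does not match its name")
        return data


store = ContentStore()
name = store.publish(b"Turing100 lecture notes")
print(name[:16], store.fetch(name))
```

The requester asks for "what" (the name) and never mentions "where" (an IP address), so any node holding a copy could answer the request.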

Turing100 Lecture: Vint Cerf + Bob Kahn – “Fathers of the Internet” – 8 Sept

Vinton Cerf and Robert Kahn invented TCP and IP, the two protocols at the heart of the internet, and are hence considered the “Fathers of the Internet”. For this and other fundamental contributions, they were awarded the Turing award in 2004.

On 8th September, R. Venkateswaran, CTO of Persistent Systems, will give a talk on the life and work of Vint Cerf and Bob Kahn as a part of the Turing 100 Lecture Series organized by Persistent in Pune on the first Saturday of every month (although this month it was shifted to the second Saturday).

In addition, this Saturday’s event will also feature a talk on “Net Neutrality: The Supply and Demand Side Perspective” by Dr. V Sridhar, a research fellow with Sasken.

About the Turing Awards

The Turing awards, named after Alan Turing, given every year, are the highest achievement that a computer scientist can earn. And the contributions of each Turing award winner are then, arguably, the most important topics in computer science.

About Turing 100 @ Persistent Lecture Series

This year, the Turing 100 @ Persistent Lecture Series will celebrate the 100th anniversary of Alan Turing’s birth by having a monthly lecture series. Each lecture will be presented by an eminent personality from the computer science / technology community in India, and will cover the work done by one Turing award winner.

The lecture series will feature talks on Ted Codd (Relational Databases), Vint Cerf and Robert Kahn (Internet), Ken Thompson and Dennis Ritchie (Unix), Jim Gray, Barbara Liskov, and others. Full schedule is here

This is a lecture series that anyone in the field of computer science must attend. These lectures will cover the fundamentals of computer science, and all of them are very relevant today.

Fees and Registration

This is a free event. Anyone can attend.

The event will be at Dewang Mehta Auditorium, Persistent Systems, SB Road, from 2pm to 5pm on Saturday 8th September. Register here

VCCircle Event: Meet investors from all over the country: Aug 23

(Normally, PuneTech does not promote paid events, unless under special circumstances. In this case we are making an exception because we believe that the event is likely to be interesting for tech entrepreneurs in Pune, and also because the organizers have promised a 30% discount for PuneTech readers.)

VCCircle is one of the top forums/platforms in the country for investing, funding and VC activity in startups in India. Given the amount of activity in the technology startup space in Pune, and the number of techies quitting their jobs to start companies, networking with investors is the need of the hour. VCCircle is organizing an “Investment Forum” at Le Meridien, Pune, on 23rd August.

What to expect at the event?

Approximately 200 participants consisting of established and emerging entrepreneurs, CEOs, PE/VC investors, bankers.

Look here for the detailed agenda.

Who should attend this event?

This forum will give an excellent platform for POCC/PuneTech members to network with the leading investors of the country. These investors will have varied portfolios and will be from all three categories: Angels, VCs and PEs. The badges will be color coded so entrepreneurs can easily decide whom to talk to.

The agenda will cover what investments have happened in Pune already, which sectors are seeing lots of investments, and which sectors are expected to see more activity in the future. There will be a special enclosed speed-pitching session with investors for selected companies.

How much does it cost?

It costs Rs. 6000 for Entrepreneurs/Companies/Startups. However there is a 30% discount for PuneTech readers. Use the discount code VCCPE30. (If you are a banker/Investor/Consultant, it will cost you Rs. 9000. You can still get the PuneTech discount – use the discount code VCCNPE30.)

Registration details are here. For questions, other details, or group discounts contact register@vccircle.com or call Kanika / Sandeep at 0120-4171111.

Click here for more details of the event

Event Report: Sham Navathe on E.F. Codd

(This is a liveblog of Sham Navathe’s lecture on E.F. Codd as part of the Turing 100 @ Persistent lecture series.)

Sham Navathe does not really need an introduction, since he is famous for his book “Fundamentals of Database Systems”, written with Ramez Elmasri, which is prescribed for undergraduate database courses all over the world. His full background can be looked up on his Wikipedia page, but it is worth mentioning that Navathe is a Punekar, being a board topper from Pune in his HSC. He regularly visits Pune since he has family here.

Today’s lecture is about E.F. Codd, a British mathematician, logician, and analyst, who was given the Turing award in 1981 for his invention of the relational database model. He is one of the 3 people to have received a Turing award for work in databases (the other two being Charlie Bachman in 1973 for introducing the concept of data structures and the network database model, and Jim Gray in 1998 for his work on database transaction processing research and technical leadership in system implementation).

Ted Codd studied at Oxford, initially studying Chemistry, before doing a stint with the Royal Air Force and then getting a degree in Maths. He later emigrated to the US, worked at IBM, did a PhD at the University of Michigan, and finally went back to IBM. At that time, he led the development of the world’s first “multiprogramming system” – sort of an operating system.

Codd quit IBM in 1984 because he was not happy with the “incomplete implementation of the relational model.” He believed that SQL is a “convenient” and “informal” representation of the relational model. He published rules that any system must follow before it could be called a relational database management system, and complained that most commercial systems were not really relational in that sense – and some were simply a thin pseudo-relational layer on top of older technology.

Invention of the Relational Model

In 1963-64, IBM developed the IMS database management system based on the hierarchical model. In 1964-65 Honeywell developed IDS, based on the network model. In 1968, Dave Childs of Michigan first proposed a set-oriented database management system. In 1969 Codd published “The derivability, redundancy, and consistency of relations stored in large databases” (IBM research report, RJ-599, 1969). This was the work that led to the seminal paper, “A Relational Model for Large Shared Data Banks” (CACM, 13:6, June 1970). Another classic paper is “Extending the Relational Model to capture more meaning” (ACM TODS, 4:4, Dec 1979), which introduced what is called the RM/T model. He is also the inventor of the term OLAP (Online Analytical Processing).

After Codd’s proposal of the relational model, IBM was initially reluctant to commercialize the idea. Instead, Michael Stonebraker of UC-Berkeley along with PhD students created INGRES, the first fully relational system. INGRES ultimately became the Postgres database, which is one of the leading open source databases in the world today. In the meantime, Relational Software Inc. brought another relational database product to the market. This ultimately became Oracle. After this, IBM heavily invested in System R, which developed the relational DBMS ideas fully. Codd was involved in the development of System R – and most of the fundamental ideas and algorithms underlying most modern RDBMS today are heavily influenced by System R.

Interesting RDBMS developments after Codd’s paper:

  • 1975: PhD students in Berkeley develop an RDBMS
  • 1976: System R, the first relational prototype system from IBM
  • 1979: First proposal for a cost based optimizer for database queries
  • 1981: Transactions (by Jim Gray)
  • 1981: SQL/DS, first commercial RDBMS

Two main motivations for the relational model:

  • In DBMS before RDBMS, there was a heavy dependence of the program (and programmer) on the way the data is modeled, stored and navigated:
    • Ordering dependence
    • Indexing dependence
    • Access path dependence
    All of this was hardcoded in the program. And Codd wanted to simplify database programming by removing these dependencies.
  • Loss of programmer productivity due to manual optimization.

Codd’s fundamental insight was that freeing up the application programmer from knowing about the layout of the data on disk would lead to huge improvements in productivity. For example, in the network or hierarchical models, a data model in which a Student has a link to the Department that he is enrolled in, is very different from a model in which each Department links to all the students that are enrolled there. Depending upon which model is used, all application programs would be different, and switching from one model to another would be difficult later on. Instead, Codd proposed the relational model which would store this as the Student relation, the Department relation, and finally the Enrolment relation that connects Student and Department.
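
As a small illustration of the Student / Department / Enrolment example, the sketch below stores the three relations in an in-memory SQLite database and answers the question in both directions from the same schema. The column names and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student    (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE department (dept_id    INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrolment  (student_id INTEGER REFERENCES student,
                             dept_id    INTEGER REFERENCES department);
    INSERT INTO student    VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meera');
    INSERT INTO department VALUES (10, 'Physics'), (20, 'Maths');
    INSERT INTO enrolment  VALUES (1, 10), (2, 10), (3, 20);
""")

# "Which department is this student in?" (student -> department direction)
dept_of_asha = conn.execute("""
    SELECT d.name FROM department d
    JOIN enrolment e ON e.dept_id = d.dept_id
    JOIN student s   ON s.student_id = e.student_id
    WHERE s.name = 'Asha'
""").fetchall()

# "Which students are in this department?" (department -> student direction)
physics_students = conn.execute("""
    SELECT s.name FROM student s
    JOIN enrolment e  ON e.student_id = s.student_id
    JOIN department d ON d.dept_id = e.dept_id
    WHERE d.name = 'Physics'
    ORDER BY s.name
""").fetchall()

print(dept_of_asha)      # prints [('Physics',)]
print(physics_students)  # prints [('Asha',), ('Ravi',)]
```

Neither query depends on which way a link was stored, because there are no links: the connection between a student and a department is just data in the enrolment relation.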

The Relational Data Model

A relation is simply an unordered set of tuples. On these relations, the following operations are permitted:

  • Permutation
  • Projection
  • Join
  • Composition

Of course, other operators from set theory can be applied to relations, but then the result will not be a relation. However, the operations given above take relations and the results are also relations. Thus, all the relational operators can again be applied to the results of these operations.
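
The point that these operations take relations and return relations can be shown with a minimal sketch that models a relation as a set of tuples. The `Relation` type and the helper names are assumptions made for illustration, `join` here is a natural join on shared attribute names, and composition is left out for brevity.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    attrs: tuple      # attribute names, in column order
    rows: frozenset   # set of value tuples: no duplicates, no ordering


def relation(attrs, rows):
    return Relation(tuple(attrs), frozenset(tuple(r) for r in rows))


def project(rel, keep):
    """Projection: keep only the named columns; duplicates collapse automatically."""
    idx = [rel.attrs.index(a) for a in keep]
    return relation(keep, (tuple(row[i] for i in idx) for row in rel.rows))


def permute(rel, order):
    """Permutation: the same relation with its columns rearranged."""
    if sorted(order) != sorted(rel.attrs):
        raise ValueError("order must be a rearrangement of the attributes")
    return project(rel, order)


def join(left, right):
    """Natural join: combine rows that agree on all shared attribute names."""
    shared = [a for a in left.attrs if a in right.attrs]
    extra = [a for a in right.attrs if a not in shared]
    rows = set()
    for lrow in left.rows:
        for rrow in right.rows:
            if all(lrow[left.attrs.index(a)] == rrow[right.attrs.index(a)]
                   for a in shared):
                rows.add(lrow + tuple(rrow[right.attrs.index(a)] for a in extra))
    return relation(left.attrs + tuple(extra), rows)


student = relation(("sid", "name"), [(1, "Asha"), (2, "Ravi")])
enrol = relation(("sid", "dept"), [(1, "Physics"), (2, "Physics")])

# Because every result is again a Relation, the operators compose freely.
depts = project(join(student, enrol), ("dept",))
print(depts.attrs, set(depts.rows))
```

Two students enrolled in Physics produce two rows in the join, but projecting onto `dept` leaves a single row, because a relation is a set and cannot hold duplicates.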

He defined 3 types of relations:

  • Expressible: is any relation that can be expressed in the data model, using the data access language
  • Named: is any relation that has been given a specific name (i.e. is listed in the schema)
  • Stored: is a relation that is physically stored on disk

He also talked about 3 important properties of relations:

  • Derivability: A relation is derivable if it can be expressed in terms of the data access language (i.e. can be expressed as a sequence of relational operations)
  • Redundancy: A set of relations is called strongly redundant if one of the relations can be derived from the other relations. i.e. it is possible to write a relational operation on some of the relations of the set whose result is the same as one of the other relations. A set of relations is weakly redundant if there is a relation in that set which has a projection that is derivable from the other relations. Good database design entails that strongly redundant sets of relations should not be used because of problems with inconsistency. However, weakly redundant relations are OK, and used for performance purposes. (Materialized views.)
  • Consistency / Inconsistency: Codd allowed the definition of constraints governing the data in a set of relations, and a database is said to be consistent if all the data in the database satisfies those constraints, and is said to be inconsistent if not.
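The redundancy and consistency ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relations, the student_dept copy, and the constraint that every enrolment must refer to a known student are all invented for the example.

```python
# Relations as sets of plain tuples; the column meaning is given in comments.
student = {(1, "Asha"), (2, "Ravi")}         # (sid, name)
enrolment = {(1, 10), (2, 10)}               # (sid, did)
# Hypothetical stored relation that repeats what the other two already say.
student_dept = {("Asha", 10), ("Ravi", 10)}  # (name, did)


def derive_student_dept(students, enrolments):
    """Join on sid, then project onto (name, did)."""
    return {
        (name, did)
        for sid, name in students
        for e_sid, did in enrolments
        if sid == e_sid
    }


def is_derivable_copy(stored, students, enrolments):
    """True if the stored relation equals what the others already imply."""
    return stored == derive_student_dept(students, enrolments)


def is_consistent(students, enrolments):
    """Assumed constraint: every enrolment refers to a known student."""
    known = {sid for sid, _ in students}
    return all(sid in known for sid, _ in enrolments)


redundant_before = is_derivable_copy(student_dept, student, enrolment)

# Move Ravi to department 20 but forget to update the redundant copy.
enrolment_updated = {(1, 10), (2, 20)}
redundant_after = is_derivable_copy(student_dept, student, enrolment_updated)
```

Here redundant_before is True and redundant_after is False: the stored copy has silently drifted away from the data it duplicates, which is the kind of inconsistency that redundant storage invites.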

In the years that followed, IBM published a series of research reports on the normalization of databases.

Turing Award Lecture

His Turing Award lecture was titled “Relational Database: A Practical Foundation for Productivity”. His concerns at that time:

  • Put users in direct touch with databases
  • Increase productivity of DP professionals in developing applications
  • Concerned that the term “relational” was being misused

He points out that in the relational data model, data can be addressed symbolically, as “relation name, primary key value, attribute name”. This is much better than embedding links, or positional addressing (X(i, j)).
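A small Python sketch of the difference between symbolic and positional addressing. The in-memory layout, the lookup function and the sample data are assumptions for illustration only, not how any real DBMS stores data.

```python
# Hypothetical in-memory database:
# relation name -> primary key value -> {attribute name: value}
database = {
    "Student": {
        1: {"name": "Asha", "year": 2},
        2: {"name": "Ravi", "year": 1},
    },
    "Department": {10: {"dname": "CS"}},
}


def lookup(db, relation_name, key, attribute):
    """Symbolic address: (relation name, primary key value, attribute name)."""
    return db[relation_name][key][attribute]


# Positional addressing, by contrast, depends on where a value happens to sit.
rows = [["Ravi", 1], ["Asha", 2]]
first_name_before = rows[0][0]
rows.sort()  # any reordering changes what X(i, j) means
first_name_after = rows[0][0]

# The symbolic address is unaffected by how the data is ordered or laid out.
ravi = lookup(database, "Student", 2, "name")
```

After the sort, rows[0][0] refers to a different student, while the symbolic lookup still returns the same value.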

The relational data model encompasses structure, manipulation and integrity of data. Hence, it is a complete model, because all 3 aspects are important for data management.

Characteristics of relational systems:

  • MUST have a data sub-language that allows users to query the data using SELECT, PROJECT and JOIN operators
  • MUST NOT have user visible navigation links between relations
  • MUST NOT convey any information in the way tuples are ordered

He was worried that relational systems might not be able to perform as well as non-relational systems. He talked about:

  • performance oriented data structures
  • efficient algorithms for converting user requests into optimal code

Under future work, he mentioned the following:

  1. Domains and primary keys
  2. Updating join-type views
  3. Outer-joins
  4. Catalogs
  5. Design aids at logical and physical level
  6. Location and replication transparency in distributed databases
  7. Capture meaning of data
  8. Improved treatment of missing, null and inapplicable values
  9. Heterogeneous data

This was a remarkably prescient list. In the 30 years since this talk, most of this has actually happened either in commercial databases or in research labs. We have pretty much achieved #1 to #6, while #7 to #9 have seen a lot of research work but not wide commercial acceptance yet.

Concluding Thoughts

  • The relational model is a firm foundation for data management. Nothing else compares.
  • On this foundation we were able to tackle difficult problems in the areas of design, derivability, redundancy, consistency and replication, as well as language issues. All of these would have been very difficult otherwise.
  • Proponents of NoSQL databases as well as map-reduce/hadoop type of systems need to keep in mind that large data management cannot really be done in an ad hoc manner.
  • Codd’s RM/T model was an attempt to capture metadata management, but fell short of what was needed.

Audience Questions

Q: Why did object-oriented databases not catch on?

A: There was a lack of understanding amongst the wider community as to the best way of using object-oriented ideas for data management. OODBMS companies were not really able to educate the wider community, and hence failed. Another problem is that object-oriented DBMS systems made the data model complex, but there were no corresponding developments in the query language and optimization tools.

Q: When the relational model was developed, did they constrain themselves due to the hardware limitations of those days?

A: Codd did mention that when deciding on a set of operations for the relational model, one consideration was ‘Which of these operations can be implemented on today’s hardware’. On the other hand, there were lots of companies in the 80s which tried to implement specialized hardware for relational/DBMS. However, none of those companies really succeeded.

In the future, it is very unlikely that improvements in hardware and processing power will lead anyone to develop a new data model. However, new operators and new ways of parallelizing them will certainly be developed.

Q: There are areas of data management like incomplete, inexact data; common sense understanding of data; deduction and inferencing capabilities. These are not really handled by today’s DBMS systems. How will this be handled in the future?

A: There have been many interesting and elegant systems proposed for these areas, but none has seen commercial success. So we’ll have to wait a while for any of these to happen.

SLP-Pune: Startup Leadership Program for Entrepreneurs

The Startup Leadership Program, a global entrepreneur discussion/training group, now has a Pune Chapter. Apply before 7th Aug here – http://www.startupleadership.com

The Startup Leadership Program (SLP) is a selective training program for entrepreneurs who are, or want to be, startup CEOs and who want to be connected to a global network. SLP Fellows have founded over 300 breakthrough startups including Duron Energy, Gharpay, ixigo, Innoz, Momelan, Runkeeper, SideTour, Shareaholic, Solar Junction, Ubersense, Savored, Voicetap, and have won many awards.

The program brings together about 25 high impact entrepreneurs in every city to coach and create the next generation CEOs. The unique curriculum is co-designed by Serial Entrepreneurs, Academicians, Researchers, VCs, and Analysts. The complete list of Mentors can be seen here http://www.startupleadership.com/main_nav/mentors-2/.

The Startup Leadership Program is a NOT FOR PROFIT entity, registered in the US and India.

What will entrepreneurs get?

  1. Avoid entrepreneurial mistakes – as you learn from your peer group (which comprises entrepreneurs from different backgrounds), mentors (serial entrepreneurs, VCs, bankers, etc) and experts.
  2. Get solutions to your growth challenges – as you get feedback from a super peer group and mentors.
  3. Connect to VCs/Investors and raise funding. Make real-life pitches. Understand what VCs look for.
  4. Get an understanding of term sheets, legal aspects and exiting ventures.
  5. Be part of the high-profile, high-impact SLP Global alumni network that will help you to scale up.
  6. Last but not least, make friends with entrepreneurs, as we all know – it’s lonely at the top.

As a testimonial: from the Mumbai Chapter class of 2012-2013, 3 people raised capital (to the tune of USD 5 Mn in total!). One of them is part of the SLP Pune organizing team.

Commitments and pedagogy

  1. The program runs from mid-Sept to March and requires about 60 hours of time commitment on the part of the entrepreneur. Usually there is a session once every three weeks, on a Saturday.
  2. The sessions comprise brainstorming, role plays, VC pitches, HBR case studies, etc.
  3. There are usually 2-3 mentors per session. You can expect people from the Angel Community and VC firms (last year our chapter had mentors from Indian Angels, Mumbai Angels, Canaan Partners, IvyCap Ventures, Blume Ventures, Sequoia, Kanakia Ventures, Anand Rathi Group, Kae Capital, Nexus Capital, etc) and serial entrepreneurs!
  4. SLP is volunteer run and not for profit. The entrepreneur will need to pay a fee of Rs. 6000, which covers the costs of food, certification and logistics.

It is observed that the SLP Class usually hangs out for a beer or coffee and develops a strong bond.

Venue

For Pune, the sessions will be held at Venture Center, NCL Innovation Park.

Contact

For any details/doubts on SLP Pune, please contact Dhiraj Khot (dhirajk@gmail.com) +91-9850 682 789

POCC Event: Pune’s June Software’s Y-Combinator Experience – 4th Aug

Tomorrow, you will get a chance to hear a Pune startup tell its story of how they got into the famous Y-Combinator program, what it was like to spend a few months in Silicon Valley, and tips for other entrepreneurs. This is the usual Pune Open Coffee Club event, at the usual place (SICSR, Model Colony), and the usual time (4pm, 1st Saturday of the month).

Paul Graham’s Y-Combinator is arguably one of the most famous startup incubators in the world, and selection into one of Y-Combinator’s batches is a guarantee of visibility and exposure all over the world (or at least in Silicon Valley and the US) for any startup.

Pune’s June Software is the only Pune company to have made it into Y-Combinator – they were part of the Summer 2012 batch of Y-Combinator.

In addition, their TapToLearn sub-brand was selected by ImagineK12 – a similar incubator that invests in education startups.

Here is more information about June Software in their own words:

We started June Software to create new and innovative products in the areas of E-Learning, Customer Relationship Management and Data Management platforms. In May of 2010 we established our first sub brand – TapToLearn.com to focus exclusively on building learning textbooks, workbooks for English and Maths for Age Groups 9 – 13. We were the pioneers of introducing touch for learning via the unique model of combining three senses for learning including sight, sound and for the first time – touch.

Our unique approach to e-learning paid off with Apple marking us as a New and Noteworthy Application and our Grammar App becoming the Number 1 Ranked App in the World in multiple countries. We were also among the top earning apps in the US App Store in June 2010.

So make it a point to come for the event:

Date: Saturday, 4th August, 4pm
Venue: 7th floor, SICSR, (Symbiosis Institute of Computer Studies and Research, near Om Market, Model Colony)
Fees: This event is free, and open for anyone to attend. Register here

Lecture on Turing Award Winner Ted Codd (Databases) by Sham Navathe – 4 Aug

Ted Codd was awarded the Turing Award in 1981 for “his fundamental and continuing contributions to the theory and practice of database management systems.” A simpler way to put it would be that Codd was given the award for inventing relational databases (RDBMS).

On 4th August, Prof. Sham Navathe of Georgia Tech, who is visiting Pune, will talk about Ted Codd’s work. This talk is a part of the Turing Awards lecture series that happens at Persistent’s Dewang Mehta Auditorium at 2pm on the first Saturday of every month this year.

About the Turing Awards

The Turing awards, named after Alan Turing, given every year, are the highest achievement that a computer scientist can earn. And the contributions of each Turing award winner are then, arguably, the most important topics in computer science.

About Turing 100 @ Persistent Lecture Series

This year, the Turing 100 @ Persistent lecture series will celebrate the 100th anniversary of Alan Turing’s birth with a lecture every month. Each lecture will be presented by an eminent personality from the computer science / technology community in India, and will cover the work done by one Turing award winner.

The lecture series will feature talks on Ted Codd (Relational Databases), Vint Cerf and Robert Kahn (Internet), Ken Thompson and Dennis Ritchie (Unix), Jim Gray, Barbara Liskov, and others. Full schedule is here

This is a lecture series that anyone in the field of computer science must attend. These lectures will cover the fundamentals of computer science, and all of them are very relevant today.

Fees and Registration

This is a free event. Anyone can attend.

The event will be at Dewang Mehta Auditorium, Persistent Systems, SB Road, from 2pm to 5pm on Saturday, 4th August. Register here

Activities of SEAP – the Software Exporters Association of Pune

SEAP (the Software Exporters Association of Pune), the organization consisting of top software companies of Pune, was very active last year. On 27th July SEAP had its AGM, at which Gaurav Mehra, president of SEAP, gave a report of its activities. This is a quick capture of his report – and should give an idea of the various SEAP activities in Pune.

These were the major activities of SEAP last year:

  • Advocacy. Represent Pune’s software companies at:
    • RAC Customs,
    • STPI, Hinjewadi
    • ESI Inspector
    • PF Office
  • Ideas Exchange and Education
    • SEAP Book Club meets on the first Saturday of every month – 10am at Sungard Aundh. 13 books have been presented so far, and this will continue
    • Breakfast series – 3rd Wednesday of every month, at Sumant Moolgaokar Auditorium, ICC, SB Road. Covers topics of interest to middle management and higher. Topics covered in the 4 sessions so far – Innovation, Security, etc.
    • Leadership Forum. 15 member companies trained on Crucial Conversations. Atyaasaa did a session on managing human resources in turbulent times.
    • SEAP Education.
  • Collaboration and Connection
    • Working with and expanding the eco-system
    • Working closely with NASSCOM to bring their events to Pune in a much more aggressive manner
    • PuneConnect (done with PuneTech, Pune Open Coffee Club, TiE Pune, and ET Now) put small Pune startups in touch with the established companies.
    • SEAP-Zinnov event.
    • SEAP’s other collaborations with the ecosystem
      • TiE – exchange invitations and merge calendars
      • IPMA Pune hosted along with the SEAP Book Club
      • PuneTech – exchange of calendars and invitations
      • CSI Pune – exchange of invitations
  • Networking
    • SEAP Golf: Golf tournament and clinic. 50 players. (Thanks Dell computers)
  • Value Added Services
    • Research, communication and partner networks
    • Create SEAP Associate Members – a bunch of companies, who are “recommended” providers of products and services that are of interest to SEAP member companies.
    • Research and Publications: Compensation and Benefits study for Pune by Hexagon
    • Pune Advantage Study by Zinnov
  • Communication
    • Brand new SEAP website – with the help of Aadi Ventures
      • Member areas
      • Features for Colleges
      • Creation of Companies Directories by area
    • Facebook page
    • LinkedIn Group
    • YouTube Page
  • Corporate Social Responsibility
    • Hosted Bhimthadi Jatra in SEAP member companies
    • Supported Students FUEL
  • SEAP Advisory Council:
    • Creation of a SEAP Advisory Council consisting of past SEAP Presidents – Anand Deshpande, Nitin Deshpande, Abhijit Atre, Chetan Shah – who will advise SEAP on a formal biannual basis.
  • SEAP Ambassador in Silicon Valley Area:
    • Parag Mehta, of QLogic, past president of SEAP, is formally named as the “ambassador” of SEAP in the Silicon Valley. He will be the evangelist for Pune and SEAP there.