TechWeekend 8: Web Development Frameworks – Rails, Grails, Django

TechWeekend #8 (#tw8) this Saturday will focus on Web Development Frameworks. We have the following talks lined up, and one more is likely to be added in the next day or two:

  • Grails, and other web development techniques in Groovy, by Saager Mhatre
  • Interesting new things in Rails3, by Gautam Rege
  • “A Django Case-Study: Use of advanced features of Django in http://wogma.com” by Navin Kabra. This talk will be structured in such a way that people who are not familiar with Python/Django might find the features interesting; while Django developers will be interested in how they were implemented in Django.

TW8 will be this Saturday, 19th March, from 10am to 2pm, at the Sumant Moolgaonkar Auditorium, Ground Floor, A Wing, ICC Trade Center, SB Road.

About Techweekend

TechWeekend Pune is a volunteer run activity. TechWeekend talks are held on the 3rd Saturday of every month from 10am to 2pm at Sumant Moolgaonkar Auditorium, Ground Floor, ICC Trade Center, SB Road. Each TechWeekend event features 3 or 4 talks on advanced technical topics. These events are free for all to attend. See PuneTech articles about past techweekends to get an idea of the events.

Join the techweekend mailing list to keep in touch with the latest TechWeekend activities.

About the Sponsor – Microsoft

Many thanks to Microsoft for sponsoring the venue for Techweekend. Microsoft wants to get more closely involved with the tech community in Pune, and particularly the open source enthusiasts – with the intention of making everybody aware that their cloud technologies (like Azure) actually play well with open source, and that you can deploy your php applications, your drupal/joomla installs on Azure.

When, Where, How much

TechWeekend #8 will be on Saturday, 19th March, from 10am to 2pm, at Sumant Moolgaonkar Auditorium, Ground Floor, Wing A, ICC Trade Center, SB Road.

This event is free and open for anybody to attend. Register here.

Event Report: “Amplify Mobility” event on Mobile Tech at Bharati Vidyapeeth

This Saturday, there was a day-long event on mobile technology organized by Amplify Mindware, a group of educational institutions housed at Bharati Vidyapeeth. This is a live-blog of three of the talks at the event (unfortunately posted with a 2-day delay because my internet connection did not work at the venue). Other talks haven’t been captured, either because I missed them (I had to leave early), or they weren’t interesting enough, or they were student presentations that were appropriate for the audience but not for this blog. I also couldn’t blog my own presentation on “Mobile Technology Trends”.

Anyway, here are my notes on 3 of the talks:

Enhance Education’s talk about MyOpenCampus portal and the e-Pad tablet

Lots of people have lots of ideas on how to improve education. And most of them will not work because that’s not what the students want. Amit Sharma of Enhance Education claims that the right approach is to ask the students what they want. And his research indicates that students just want answers to their questions without having to ask the question in highly public forums.

Enhance Education’s MyOpenCampus product tries to fill this requirement. It is an educational social network and content portal that provides content specifically for your degree, your discipline, your year. Basically it is a general educational social network, but it has groups for the specific classes you’re taking, information about your curriculum, notes, and other study material, and groups of your classmates for interaction. The difference between a general repository of such information from the internet and this repository is that this is found and uploaded by your teacher or fellow-students and validated by the teacher. This is further supported by question & answer forums and discussion forums.

All of this is delivered to the students via the e-pad, a 7-inch resistive touchscreen Android tablet, which is smaller than a book, is always on, is always connected to the internet, and is cheap. It’s portable, can be used for accessing all the data from MyOpenCampus, all the documents, the study material, audio/video lectures, and it can also be used as an entertainment device. It will cost about Rs. 6000, and the first batch will go to Amplify Mindware students in June 2011.

Binoy Samuel from Digital Spice:

Media companies, design companies, publishing companies, gaming companies are all moving to mobile platforms from their usual medium. This is a huge market opportunity.

He gave a few examples of apps that have come through this route. One of the interesting examples was a book called “Bio-replenishment” on Bone Health, which has lots of information about the health of your bones, what causes problems, and how it affects you. The book was expensive, at $50, and still not compelling enough for readers. They converted it to an iPad app with a lot of 3-D animations to explain the issues, and that is a much better format for this material.

In this space, there are opportunities in healthcare, animation, wildlife, e-Publishing, e-Learning, and retail design.

Anthony Hsiao from Sapna Solutions:

On why becoming a mobile developer is cool:

  • Because it is new and exciting and unknown
  • Because it is a completely new way to try and benefit the bottom of the pyramid
  • Because you can develop things and immediately try them on yourself
  • Because you can use maths and science along with computer science when developing for mobiles (e.g. accelerometer, gps, etc.)
  • Because you want to build things for users. Real users. Non-techies. Kids. And cats.

Mobile development is fast, always moving, high pressure. It’s a lot of hard work. It is not for everyone. Choose wisely.

What will be big in mobile:

  • Money transfer
  • Location based services
  • Mobile search
  • Mobile browsing
  • Mobile health monitoring
  • Mobile payment
  • Near field communication services
  • Mobile advertising
  • Mobile instant messaging
  • Mobile music

Anuj Tandon from Rolocule Games:

Quote: “I was a techie first. Then Infosys made me a donkey. Then I quit to join Rolocule and became a techie again”

Mobile Gaming is a hot area.

In Asia, the mobile gaming industry will grow at 73% CAGR.

The biggest entertainment launch this year was not a movie, it was a mobile game. Consumers are willing to pay for quality mobile games ($9.99 per game). There has already been good M&A activity among mobile game development companies in the 3 years since the launch of the Apple App Store, e.g. ngmoco acquired by DeNA, Tapulous acquired by Disney, Freeverse acquired by ngmoco. VCs have already made investments of over $100 in iPhone gaming related companies.

In India, the gaming industry was worth $7.9B in 2009 and will grow to $32B by 2014. Globally, the gaming industry will grow 18%, but in India it will grow 32%.

TechWeekend LiveBlog: NoSQL + Database in the Cloud #tw6

This is a quick-and-dirty live-blog of TechWeekend 6 on NoSQL and Databases in the Cloud.

First, I (Navin Kabra) gave an overview of NoSQL systems. Since I was talking, I wasn’t able to live-blog it.

When not to use NoSQL

Next, Dhananjay Nene talked about when to not use NoSQL. Main points:

  • People know SQL. They can leverage it much faster than if they were to use one of the non-standard interfaces of these new-fangled systems.
  • When reporting is very important, having SQL is much better. Reporting systems support SQL. Re-doing that with NoSQL will be more difficult.
  • Consistency, and Transactions are often important. Going to NoSQL usually involves giving them up. And unless you are really, really sure you don’t need them, this issue might come and bite you.
  • If you’re considering using NoSQL, you better know what the CAP theorem is; you better really understand what C, A, and P in that mean; don’t even consider NoSQL until you’re very well versed with these concepts
  • RDBMS can really scale quite a lot – especially if you optimize them well. So 90% of the time, it is very likely that the RDBMS is good enough for your situation and you don’t need NoSQL. So don’t go for NoSQL unless you are really sure that your RDBMS won’t scale.

MongoDB the Infinitely Scalable

Next up is BG, talking about MongoDB, the Infinitely Scalable. They are using MongoDB in production for http://paisa.com (Infinitely Beta). The main points he made:

  • Based on the idea that JSON is a well understood format for data, and it is possible to build a database based on JSON as the primary data structuring format.
  • The data is stored on disk using BSON, a binary format for storing JSON
  • Obviously, JavaScript is the natural language for working with MongoDB. So you can use JavaScript to query the database, and also for “stored procedures”
  • MongoDB does not really allow joins; but with proper structuring of your data, you will not need joins
  • You can do very rich querying, deeply nested, in MongoDB
  • MongoDB has native support for ‘sharding’ (i.e. breaking up your data into chunks to be spread across multiple servers). This is really difficult to do.
  • MongoDB is screaming fast.
  • It is free and open source, but it is also backed by a commercial company, so you can get paid support if you want. There are hosting solutions (including free plans) where you can host your MongoDB instances (e.g. http://mongohq.com)
  • You store “documents” in MongoDB. Since you can’t really do joins, the solution is to de-normalize your data. Everything you need should be in the one document, so you don’t need joins to fetch related data. e.g. if you were storing a blog post in MongoDB, you’ll store the post, all its meta-data, and all the comments in a single document.
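As a rough illustration of the denormalized layout described in the last point, a blog post with its metadata and comments can be held in one nested document. The sketch below uses plain Python dictionaries rather than a MongoDB driver; the field names and the `find_posts_with_commenter` helper are hypothetical and only imitate a query on a nested field.

```python
import json

# A single denormalized "document": the post, its metadata, and its
# comments all live together, so no join is needed to render the page.
# All field names here are illustrative assumptions.
post = {
    "title": "TechWeekend 6 notes",
    "author": "navin",
    "tags": ["nosql", "mongodb"],
    "comments": [
        {"by": "bg", "text": "Nice summary"},
        {"by": "gautam", "text": "Redis slides are up"},
    ],
}


def find_posts_with_commenter(posts, name):
    """Return the posts that have at least one comment by `name`.

    This imitates a query on a nested field, done here by hand.
    """
    return [
        p for p in posts
        if any(c["by"] == name for c in p.get("comments", []))
    ]


# The document round-trips through JSON, the format that the data
# model is based on.
restored = json.loads(json.dumps(post))
matches = find_posts_with_commenter([restored], "bg")
print(len(matches), matches[0]["title"])
```

The trade-off is the usual one for denormalized data: reading a post is a single lookup, but anything stored in more than one document has to be updated in every copy.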

MongoDB Use Cases:

  • Great for “web stuff”
  • High Speed Logging (because MongoDB has extremely fast writes)
  • Data Warehousing – great because of the schema flexibility
  • Storing binary files via GridFS – which are queryable!

MongoDB is used in production by these popular services:

FourSquare recently had a major unplanned downtime – because they did not understand how to really use MongoDB. That underscores the importance of understanding the guarantees given by your NoSQL system – otherwise you could run into major problems including downtime, or even data loss. See this blog post for more on the FourSquare outage.

Some stats about use of MongoDB at paisa.com. 54 million documents. 80GB of data. 6GB of indexes. All of this on 2 nodes (master-slave setup).

Redis

Gautam Rege now talking about his experiences with Redis. Main points made:

  • Redis is a key-value database with an attitude. Nothing more.
  • Important feature: in (key, value), the value can be a list, hash, set.
  • 1 million key lookups in 40ms. Because it keeps data in memory.
  • Persistence is lazy – save to disk every x seconds. So you can lose data in case of a crash. So you need to be sure that your app can handle this.
  • Redis is a “main memory database” (which can handle virtual memory – so your database does not really have to fit in memory)
  • All get and set operations on Redis are atomic. A lot of concurrency problems and race conditions disappear because of atomicity.
  • Sets in redis allow union, intersection, difference. Accessed like a hash.
  • Sorted sets combine hashes and arrays. Can lookup by key, but can also scan sequentially.
  • Redis allows real-time publish-subscribe.
  • Redis is simple. Redis is for specific small applications. Not intended for being the general purpose database for your app. Use where it makes sense. For example:
    • Lots of small updates
    • Vote up, vote down
    • Counters
    • Tagging. Implementing a tagging solution is a pain – becomes easy with Redis
    • Cross-referencing small data
  • Don’t use Redis for ORM (object-relational mapping)
  • Don’t use Redis if memory is limited
  • Sites like digg use Redis for tagging
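To make the tagging and counter use cases above concrete, here is a small in-memory model of the idea. It is not a Redis client: it uses ordinary Python sets and integers to mirror what the Redis commands SADD, SINTER and INCR/DECR do, and the key names and helper functions are made up for illustration.

```python
from collections import defaultdict

# In-memory stand-in for a key-value store whose values can be sets
# or counters. This is NOT a Redis client; it only mirrors the idea
# behind Redis commands such as SADD, SINTER and INCR.
tag_index = defaultdict(set)   # tag name -> set of item ids
votes = defaultdict(int)       # item id  -> vote counter


def tag_item(item_id, *tags):
    """Add item_id to the set kept for each tag (like SADD)."""
    for tag in tags:
        tag_index[tag].add(item_id)


def items_with_all_tags(*tags):
    """Intersect the per-tag sets (like SINTER)."""
    sets = [tag_index[t] for t in tags]
    return set.intersection(*sets) if sets else set()


def vote(item_id, delta):
    """Bump a counter by +1 or -1 (like INCR / DECR)."""
    votes[item_id] += delta
    return votes[item_id]


tag_item("post:1", "nosql", "redis")
tag_item("post:2", "nosql", "mongodb")
tag_item("post:3", "redis")

print(sorted(items_with_all_tags("nosql", "redis")))
vote("post:1", +1)
vote("post:1", +1)
print(vote("post:1", -1))
```

Unlike this sketch, the real server performs each of these operations atomically, which is what removes the race conditions mentioned above when many clients vote or tag at once.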

SQL Azure

Saranya Sriram talking about SQL Azure and data in the cloud. SQL Azure is pretty much SQL Server in the cloud, retrofitted for the cloud:

  • Exposes a RESTful interface
  • Has language bindings for python, rails, java, etc.
  • Gives full SQL / Relational database in the cloud
  • The standard tools used to access SQLServer locally can also be used to access SQL Azure from the cloud
  • For Azure you get a cloud simulation on your local machine to develop and test your application. For SQL Azure, you simply test with your local SQL Server edition. If you don’t have a SQL Server license, you can download SQL Server Express, which is free.
  • You can develop applications in Microsoft Visual Studio. You can incorporate PHP also in this.
  • You can also use Eclipse for developing applications.
  • SQL Azure has a maximum size limit of 50GB. (Started with 1 GB last year)
  • There is no free plan for Azure. You have to pay. “Enthusiasts” can use it free for 180 days. If you sign up for the Bizspark program (for small startups, for the first 3 years) it is free. Similarly students can use it for free by signing up for the DreamSpark program. (Actually, the Bizspark and DreamSpark programs give you free access to lots of Microsoft software.)

LiveBlog #tw5: Intro to Functional Programming & Why it’s important

This is a live-blog of TechWeekend 5 on Functional Programming. Please keep checking regularly, this will be updated once every 15 minutes until 1pm.

Why Functional Programming Matters by Dhananjay Nene

Dhananjay Nene started off with an introductory talk on FP – what it is, and why it is important.

FP is a style of programming in which functions have no side-effects, i.e., the result of a function is purely dependent on its inputs. There is no state maintained.

Effects/Implications of “no side effects”

  • Side-effects are necessary: FP doesn’t mean completely side-effect free. If you have no side-effects, you can’t do IO. So, FP really means “largely side-effect free”. Specifically, there are very few parts of the code that have side-effects, and you know exactly which those are.
  • Testability: Unit Testing becomes much easier. There are no “bizarre interactions” between different parts of the code. “Integration” testing becomes much easier, because there are no hidden effects.
  • Immutability: There are no “variables”. Once a value has been assigned to a ‘name’, that value is ‘final’. You can’t change the value of that ‘name’ since that would be ‘state’ and need ‘side-effects’ to change it.
  • Lazy Evaluation: Since a function always produces the same result, the compiler is free to decide when to execute the function. Thus, it might decide to not execute a function until that value is really needed. This gives rise to lazy evaluation.
  • Concurrency control is not so much of a problem. Concurrency control and locks are really needed because you’re afraid that your data might be modified by someone else while you’re accessing it. This issue disappears if your data is immutable.
  • Easier parallelization: The biggest problem with parallelizing programs is handling all the concurrency control issues correctly. This becomes a much smaller problem with FP.
  • Good for multi-core: As the world moves to multi-core architectures, more and more parallelism will be needed. And humans are terrible at writing parallel programs. FP can help, because FP programs are intrinsically, automatically parallelizable.
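A few of the points above (pure functions, immutability, lazy evaluation) can be shown in a handful of lines. This is a minimal sketch in Python, which is not a functional language; the `add_tax` function and the sample prices are invented for the example.

```python
from itertools import count, islice


def add_tax(price, rate):
    """Pure function: the result depends only on the arguments."""
    return round(price * (1 + rate), 2)


# Immutability: a tuple cannot be changed in place, so "updating"
# means building a new value and leaving the old one untouched.
prices = (100.0, 250.0)
more_prices = prices + (40.0,)

# Lazy evaluation: this generator describes an infinite sequence of
# squares, but nothing is computed until a value is requested.
squares = (n * n for n in count(1))
first_five = list(islice(squares, 5))

print(add_tax(100.0, 0.18))
print(prices, more_prices)
print(first_five)
```

Because `add_tax` touches nothing outside its arguments, it can be unit-tested with a single call and run on any number of threads without locks, which is the testability and concurrency argument made above.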

Another important feature of functional programming languages is the existence of higher order functions. Basically in FP, functions can be treated just like data structures. They can be passed in as parameters to other functions, and they can be returned as the results of functions. This makes much more powerful abstractions possible. (If you know dependency injection, then higher-order functions are dependency injection on steroids.)
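A short sketch of what "functions as data" looks like in practice: one function that takes functions as parameters and one that returns a function. The names `compose` and `make_scaler` are illustrative and not taken from the talk.

```python
def compose(f, g):
    """Return a new function that applies g, then f."""
    return lambda x: f(g(x))


def make_scaler(factor):
    """Return a function that multiplies its argument by `factor`."""
    def scale(x):
        return x * factor
    return scale


def increment(x):
    return x + 1


double = make_scaler(2)
double_then_increment = compose(increment, double)

print(list(map(double_then_increment, [1, 2, 3])))
```

The dependency-injection comparison shows up in `compose` and `map`: the caller decides which behaviour to plug in, and the receiving function does not need to know anything about it beyond "it can be called".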

FP gives brevity. Programs written in FP will typically be much shorter than comparable imperative programs. This is probably because of higher-order functions and closures. Compare the size of the quicksort code in Haskell vs. Java at this page.

You need to think differently when you start doing functional programming.

Think different:

  • Use recursion or comprehensions instead of loops
  • Use pattern matching instead of if conditions
  • Use pattern matching instead of state machines
  • Information transformation instead of sequence of tasks
  • Software Transactional Memory FTW!
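The first item in this list, recursion and comprehensions instead of loops, is easy to try with quicksort, the example used earlier for brevity. The version below is a sketch in Python rather than Haskell; it favours clarity over speed and builds new lists instead of sorting in place.

```python
def quicksort(items):
    """Return a new sorted list; the input is never modified."""
    if not items:
        return []
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)


data = [3, 1, 2, 3, 0]
print(quicksort(data))
print(data)
```

There is no loop counter and no swapping of elements: the function reads as a description of what a sorted list is, which is the "information transformation instead of sequence of tasks" point from the list.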

Advantages of FP:

  • After initial ramp-up issues, development will be faster in FP
  • Code is shorter (easier to read, understand)
  • Clearer expression of intention of developer
  • Big ball of mud is harder to achieve with pure functions. You will not really see comments like “I don’t know why this piece of code works, but it works. Please don’t change it.”
  • Once you get used to FP, it is much more enjoyable.
  • Faster, better, cheaper and more enjoyable. What’s not to like?

The cost of doing FP:

  • Re-training the developers’ brains (this is a fixed cost). Because of having to think differently. Can’t just get this from books. Must do some FP programming.
  • You can suffer from a lack of third-party libraries(?), but if you pick a language like Clojure which sits on the JVM, then you can easily access java libraries for the things that don’t exist natively in your language.

Should a company do its next project in a functional programming language? Dhananjay’s recommendation: start with small projects, and check whether you have the organizational capacity for FP. Then move on to larger and larger projects. If you’re sure that you have good programmers, and there happens to be a 6-month project for which you’re OK if it actually becomes a 12-month project, then definitely do it in FP. BG’s correction (based on his own experience): the 6-month project will only become an 8-month project.

Some things to know about Erlang by Bhasker Kode

Bhasker is the CEO of http://hover.in. They use Erlang in production for their web service.

Erlang was created in 1986 by developers at Ericsson for their telecom stack. This was later open-sourced and is now a widely used language.

Erlang is made up of many “processes”. These are programming language constructs – not real operating system processes. But otherwise, they are similar to OS processes. Each process executes independently of other processes. Processes do not share any data. Only message passing is allowed between processes. There are a number of schedulers which schedule processes to run. Normally, you will have as many schedulers as you have cores on your machine. Erlang processes are very lightweight.

Garbage collection is very easy, because as soon as a process dies, all its private data can be garbage collected, since it is not shared with anyone else.

Another interesting thing about Erlang is that the pattern matching (which is used in all functional programming languages) can actually match binary strings also. This makes it much easier to deal with binary data packets.
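Erlang expresses this kind of binary matching directly in a pattern. For readers who have not seen the idea, the sketch below shows the nearest everyday equivalent in Python using the standard `struct` module. The packet layout (a 1-byte version, a 2-byte big-endian length, then the payload) is an assumption made up for this example.

```python
import struct

# Hypothetical layout: 1-byte version, 2-byte big-endian length.
HEADER = struct.Struct("!BH")


def parse_packet(data):
    """Split a packet into (version, payload), validating the length."""
    version, length = HEADER.unpack_from(data, 0)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated packet")
    return version, payload


packet = HEADER.pack(1, 5) + b"hello"
print(parse_packet(packet))
```

In Python the layout lives in a format string and the checks are written by hand; the attraction of the Erlang approach described above is that the layout and the checks are part of the function's pattern itself.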

Erlang has inbuilt support and language features for handling failures of processes: supervisor processes, deciding which process takes over the job, and so on.

Erlang allows you to think beyond for loops. Create processes which sit around waiting for instructions from you. And then the primary paradigm of programming is to send a bunch of tasks to a bunch of processes in parallel, and wait for results to come back.
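The shape of that paradigm (workers that wait for messages, a caller that scatters tasks and gathers replies) can be sketched in Python with threads and queues. This is only an analogy: Python threads are OS threads and far heavier than Erlang processes, and the worker function and message format are invented for the example.

```python
import queue
import threading


def worker(inbox, outbox):
    """Wait for messages; reply to each one; stop on None."""
    while True:
        msg = inbox.get()
        if msg is None:
            break
        task_id, n = msg
        outbox.put((task_id, n * n))


inbox, outbox = queue.Queue(), queue.Queue()
workers = [threading.Thread(target=worker, args=(inbox, outbox))
           for _ in range(4)]
for w in workers:
    w.start()

# Scatter: send every task as a message.
numbers = [1, 2, 3, 4, 5]
for task_id, n in enumerate(numbers):
    inbox.put((task_id, n))

# Gather: wait for exactly one reply per task.
results = dict(outbox.get() for _ in numbers)

# Ask each worker to stop, then wait for them to finish.
for _ in workers:
    inbox.put(None)
for w in workers:
    w.join()

print([results[i] for i in range(len(numbers))])
```

The workers share no data with the caller; everything they know arrives in a message and everything they produce leaves in one, which is the property that makes the Erlang model easy to reason about.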

Some erlang applications for developers:

  • Webservers built in erlang: Yaws, mochiweb, nitrogen, misultin
  • Databases built in erlang: amazon simpledb, riak, couch, dynomite, hibari, scalaris
  • Testing frameworks: distil, eunit, quickcheck, tsung

Who is using erlang? Amazon (simpledb), Facebook (facebook chat), microsoft, github, nokia (disco crawler), ea (the games company), rabbitmq (a messaging application), ejabberd (the chat server, which has not crashed in 10 years). Indian companies using erlang: geodesic, http://hover.in.

How Clojure handles the Expression Problem by Baishampayan Ghose

If you’ve gone deep into any programming language, you will find a reference to lisp somewhere. So, every programmer must be interested in lisp. To quote Eric Raymond:

LISP is worth learning for the profound enlightenment experience you will have when you finally get it. That experience will make you a better programmer for the rest of your days, even if you never actually use LISP itself a lot.

BG had conducted a 2 day Clojure tutorial in Pune a few months back, and he will happily do that again if there is enough interest. This talk is not about the basics of Clojure. It is talking about a specific problem, and how it is solved in Clojure, in the hope that it gives some interesting insights into Clojure.

Clojure is a dialect of lisp. And the first thing that anybody notices about lisp is all the parentheses. Don’t be afraid of the parentheses. After a few days of coding in lisp, you will stop noticing them.

Clojure has:

  • first-class regular expressions. A # followed by a string is a regular expression.
  • arbitrary precision integers and doubles. So don’t worry about the data-type of your numbers. (It internally uses the appropriately sized data types.)
  • code as data and data as code. Clojure (and lisp) is homoiconic. So lisp code is just lists, and hence can be manipulated in the program by your program to create new program constructs. This is the most ‘difficult’ and most powerful part of all lisp based languages. Google for “macros in lisp” to learn more. Most people don’t “get” this for a long time, and when they “get” lisp macros, they suddenly become very productive in lisp.
  • has a nice way to attach metadata to functions. For example, type hints attached to functions can help improve performance
  • possibility of speed. With proper type-hints, Clojure can be as fast as Java
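The "code as data" point is the hardest one in this list to picture, so here is a toy illustration. It is written in Python, not Clojure: programs are represented as nested lists in prefix form, and a function rewrites a program before it is evaluated, which is roughly the job a macro does in a lisp. The `square` form and both function names are invented for this sketch.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul, "-": operator.sub}


def evaluate(expr):
    """Evaluate a prefix expression such as ["+", 1, ["*", 2, 3]]."""
    if not isinstance(expr, list):
        return expr
    op, *args = expr
    values = [evaluate(a) for a in args]
    result = values[0]
    for v in values[1:]:
        result = OPS[op](result, v)
    return result


def square_to_multiply(expr):
    """Rewrite every ["square", x] into ["*", x, x] before evaluation.

    This plays the role a macro plays in a lisp: code that receives
    code as a data structure and returns new code.
    """
    if not isinstance(expr, list):
        return expr
    expr = [square_to_multiply(e) for e in expr]
    if expr[0] == "square":
        return ["*", expr[1], expr[1]]
    return expr


program = ["+", 1, ["square", ["+", 2, 3]]]
expanded = square_to_multiply(program)
print(expanded)
print(evaluate(expanded))
```

The evaluator never learns about `square`; the new construct is added purely by transforming the program. In a lisp the program text already is such a list, so no separate representation is needed.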

(Sorry: had to leave the talk early because of some other commitments. Will try to update this article later (in a day or two) based on inputs from other people.)

Clojure, Erlang, & Functional Programming – Intro to FP & Why It’s Important – TechWeekend5 18 Dec

Have you heard of Clojure, Erlang, Scala, F# and wondered why people are getting all excited about these new fangled languages? Then this is your chance to find out. And if you are a programmer or are otherwise working in the software technology space and have not heard any of those names, then you need to start reading more, and you certainly need to attend this TechWeekend5 in Pune this Saturday. Register for the event here.

Vayana Services and TechWeekend Pune present a detailed session on Functional Programming this Saturday, 18th December from 10am to 1pm, at Sumant Moolgaonkar Auditorium, MCCIA in ICC Trade Tower (A Wing, Ground floor), S.B. Road. You must attend.

Object-Oriented Programming is now passe, and all the cool kids (i.e. the star programmers) have started looking very seriously at functional programming languages like Clojure and Erlang. The more visionary ones (like our speakers this week: Dhananjay Nene, Bhasker Kode, and Baishampayan Ghose) are building the next generation of products in these languages.

Find out the What, the Why and the How on Saturday.

There will be three talks, listed below, and some time for general discussions around this topic.

Why you should care about functional programming – by Dhananjay Nene

This talk will focus on important characteristics of functional programming and the current landscape in terms of variety of languages and its adoption. The talk will also refer to how leveraging it can help you in terms of brevity, concurrency, better abstractions, testability, economics and particularly enjoyability. A small part of the talk will also focus very superficially on the Scala programming language.

About the Speaker – Dhananjay Nene

Dhananjay is a passionate programmer and a consulting software architect. He loves to learn, research, prototype and deploy new technologies and languages even as he is strongly focused on ensuring that the choices are made consistent with the business objectives and landscape. He currently writes code for and advises Vayana Enterprises in his role as its Chief Architect.

An Introduction to Erlang – by Bhasker Kode

While ideating hover.in towards the end of 2007, Bhasker soon became an ardent evangelist of Erlang and its fault-tolerant nature, traditionally intended for use in telecom & messaging circles. Following its rising use in building real-time and low-latency applications at web scale, Bhasker has presented Hover’s Erlang growth stories at the Commercial Users of Functional Programming Conference in Edinburgh along with Facebook, Erlang Factory in London, and Foss.in in Bangalore, talking about the role of functional programming. Hover’s engineering efforts can be tracked at http://developers.hover.in

About the Speaker – Bhasker Kode

Bhasker is the CEO and Co-Founder of Pune-based Hover Technologies, a user-engagement platform that allows web publishers to add a new channel of earning ad revenue through the use of in-text “tooltip” based ads. He has always been captured by the potential of the internet as part of the core team behind several destination portals and startups from his college days in Chennai. His introduction to functional programming came from his stint as one of the first few developers at Bangalore based Tutorvista where he built the calendar, syndication, whiteboard among other products used by thousands across the world everyday.

Clojure & its solution to the Expression Problem – Baishampayan Ghose

The “Expression Problem” arises when we want to add new functionality to a library that we don’t control. Most popular programming languages accomplish this task by Monkey Patching, Wrapper Classes, etc. In this talk, BG will discuss the demerits of traditional approaches to the problem and how Clojure solves this problem using Protocols. This talk is intended to show-off the real power of Clojure in solving complex problems.
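To give a feel for the problem before the talk, here is a sketch of the same idea in Python: adding a new operation to a type we do not control, without monkey patching it or wrapping it. It uses the standard library's `functools.singledispatch`, which is only an analogy and not how Clojure Protocols are implemented; the `describe` function is made up for the example.

```python
from fractions import Fraction
from functools import singledispatch


@singledispatch
def describe(value):
    """Fallback used for any type without a registered version."""
    return "something else"


@describe.register(int)
def _describe_int(value):
    return "a whole number"


# Later, possibly in another module: extend `describe` to a type we
# did not write (Fraction comes from the standard library) without
# editing Fraction or the original describe function.
@describe.register(Fraction)
def _describe_fraction(value):
    return "an exact fraction"


print(describe(3))
print(describe(Fraction(1, 2)))
print(describe("text"))
```

Neither `Fraction` nor the original `describe` had to be modified to support the new case, which is the property the traditional approaches named above struggle to provide.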

BG has chosen to talk about a particular feature of Clojure in depth instead of skimming over many things in a hurry because he believes that Clojure’s approach to solving the Expression Problem clearly demonstrates the thought process that has gone into designing the language and shows how it’s different from most other programming languages. He will also cover the very basics of reading Clojure code in just a few minutes which will also demonstrate the simplicity of the language itself.

About the Speaker – Baishampayan Ghose

Baishampayan Ghose (mostly known as BG) is the co-founder & CTO of http://Paisa.com. He has been a career Functional Programmer and has programmed professionally in Common Lisp, Clojure & now Erlang.

About the Sponsor – Vayana Services

Vayana Services offers an easier option for small and medium enterprises to obtain working capital financing from banks by electronically sourcing, transferring and tracking digitally signed trade documents across trading parties and banks. It is a financial service backed by a cloud based offering with its development and operations management team based in Pune. With a strong belief that healthy businesses are greatly assisted by using healthy technology, Vayana Services looks forward to an increasingly frequent and high quality interaction within the software technology community in Pune and welcomes you all to Techweekend 5.

Logistics

This event is free for all to attend, but please register here. The event is in MCCIA’s Sumant Moolgaokar Auditorium, ICC Towers, Wing A, Ground Floor. From 10am-1pm. The hashtag for the event is #tw5

Top 5 things to worry about when designing a Cloud Based SaaS

(This article on things you need to be careful about when designing the architecture of a cloud based Software-as-a-Service offering is a guest post by Mukul Kumar, who, as SVP of Engineering at Pubmatic, has a lot of hands-on experience with designing, building and maintaining a very high performance, high scalability cloud-based service.)

Designing a SaaS software stack poses challenges that are very different from the considerations for host-based software design. Designing for the performance, scalability and reliability of a SaaS with lots of servers and lots of data is an interesting problem, and very different from designing software that is installed on a host and is used by that host.

Here I list the top 5 design elements for Cloud Based SaaS.

High availability

A SaaS software stack is built on top of several disparate elements. Most of the time these elements are hosted by different software vendors, such as Rackspace, Amazon, Akamai, etc. The software stack consists of several layers, such as – application server, database server, data-mining server, DNS, CDN, ISP, load-balancer, firewall, router, etc. High availability of SaaS actually means thinking about the high availability of all or most of these components. Designing high availability of each of these components is a non-trivial exercise and the cost shoots up as you keep on adding layers of HA. Such design requires thinking deeply about the software architecture and each component of the architecture. Two years back I wrote an article on Cloud High Availability, where I described some of these issues; you can read it here.
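A quick back-of-the-envelope calculation shows why every layer matters. If a request needs all layers to be up, and the layers fail independently (an assumption that real systems only approximate), the overall availability is the product of the individual availabilities. The figures below are illustrative only and do not describe any real vendor.

```python
from math import prod


def serial_availability(parts):
    """Availability of a chain where every part must be up.

    Assumes the parts fail independently of one another.
    """
    return prod(parts)


def redundant_availability(a, copies):
    """Availability of `copies` independent replicas of one part,
    where a single working replica is enough."""
    return 1 - (1 - a) ** copies


# Illustrative figures only, not measurements of any real vendor.
stack = {"dns": 0.999, "load_balancer": 0.999,
         "app_server": 0.995, "database": 0.995}

overall = serial_availability(stack.values())
hours_down_per_year = (1 - overall) * 24 * 365

print(round(overall, 4), round(hours_down_per_year, 1))
print(round(redundant_availability(0.995, 2), 6))
```

The chain is always less available than its weakest layer, and adding a replica to one layer improves only that layer's term, which is why the cost rises quickly as HA is added layer by layer.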

Centralized Manageability

As you keep on adding more and more servers to your application cluster the manageability gets hugely complex. This means:

  • you have to employ more people to do the management,
  • human errors would increase, and
  • the rate at which you can deploy more servers goes down.

And, don’t just think of managing the OS on these servers, or these virtual machines. You have to manage the entire application and all the services that the application depends on. The only way to get around this problem is to have centralized management of your cluster. Centralized management is not an easy thing to do: since every application is different, a generalized management tool oversimplifies the problem and is not a full solution.

Online Upgradability

This is probably the most complex problem after high availability. When you have a cluster of thousands of hosts, live upgradability is a key requirement. When you release a new software revision, you need to be able to upgrade it across the servers in a controlled way, with the ability to roll it back whenever you want – at the instant that you want, across the exact number of servers that you want. You would also need to control database and cache coherency and invalidation across the cluster in a controlled way. Again, this cannot be solved in a very generic way; every software stack has its own specificity, which needs to be solved in its own specific ways.
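As a deliberately simplified sketch of the control flow involved, the function below upgrades servers in batches and rolls back everything it has touched if a batch fails its health check. The server names, version strings and the `healthy` callback are all hypothetical; a real system would also have to deal with the database and cache issues mentioned above.

```python
def rolling_upgrade(servers, new_version, batch_size, healthy):
    """Upgrade `servers` in batches; roll everything back on failure.

    servers:  dict mapping server name -> currently deployed version
    healthy:  callable(name, version) -> bool, a hypothetical health check
    Returns True if every server ended up on new_version.
    """
    previous = {}                      # name -> version before upgrade
    names = list(servers)
    for start in range(0, len(names), batch_size):
        batch = names[start:start + batch_size]
        for name in batch:
            previous[name] = servers[name]
            servers[name] = new_version
        if not all(healthy(name, new_version) for name in batch):
            # Restore every server touched so far, newest first.
            for name in reversed(list(previous)):
                servers[name] = previous[name]
            return False
    return True


cluster = {"web1": "1.0", "web2": "1.0", "web3": "1.0"}
ok = rolling_upgrade(cluster, "1.1", 2, lambda name, version: True)
print(ok, cluster)

cluster_b = {"web1": "1.0", "web2": "1.0", "web3": "1.0"}
ok_b = rolling_upgrade(cluster_b, "1.1", 2,
                       lambda name, version: name != "web3")
print(ok_b, cluster_b)
```

Choosing `batch_size` is the "exact number of servers that you want" knob: a small batch limits the blast radius of a bad release at the cost of a slower rollout.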

Live testability

Testing your application in a controlled way with real traffic and data is another key aspect of SaaS design. You should be able to sample real traffic and use it for testing your application without compromising on user experience or data integrity. Lab testing has severe limitations, especially when you are testing performance and scalability of your application. Real traffic patterns and seasonality of data can only be tested with real traffic. Don’t start your beta until you have tested on real traffic.
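One small building block for this is deciding which live requests belong to the test sample. The sketch below makes that decision by hashing a request id, so that the same request gets the same answer on every server. The request ids and the 5% rate are assumptions for illustration; what is then done with the sampled traffic (for example replaying a copy against a test build) is left out.

```python
import hashlib


def in_sample(request_id, percent):
    """Decide deterministically whether a request is in the sample.

    Hashing the id (instead of calling random()) means the same
    request id always gets the same answer on every server.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


request_ids = [f"req-{i}" for i in range(10000)]
sampled = [r for r in request_ids if in_sample(r, 5)]
print(len(sampled))
```

Keeping the sampling decision separate from the request handling path is one way to honour the constraint above that testing must not compromise user experience or data integrity.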

Monitor-ability

The more servers and applications that you add to your cluster the more things can fail and in very different ways. For example – network (NIC), memory, disk and many other things. It is extremely important to monitor each of these, and many more, constantly, with alarms using different communication formats (email, SMS, etc.). There are many online services that can be used for monitoring services, and they provide a host of different services and have widely varying pricing. Amazon too recently introduced CloudWatch, which can monitor various aspects of a host such as CPU Utilization, Disk I/O, Network I/O etc.
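At its core, this kind of monitoring is a comparison of measured values against limits, followed by a notification on each configured channel. The sketch below shows that core only; the metric names, limits and channels are invented, and the channels are plain functions where a real setup would call an email or SMS gateway.

```python
def check_metrics(host, metrics, limits):
    """Return alarm messages for every metric above its limit."""
    alarms = []
    for name, value in sorted(metrics.items()):
        limit = limits.get(name)
        if limit is not None and value > limit:
            alarms.append(f"{host}: {name}={value} exceeds {limit}")
    return alarms


def notify(alarms, channels):
    """Send each alarm to every channel (email, SMS, ...).

    `channels` are plain callables here; real ones would call an
    email or SMS gateway.
    """
    for alarm in alarms:
        for send in channels:
            send(alarm)


limits = {"cpu_percent": 90, "disk_percent": 85}
sent = []
alarms = check_metrics("web1", {"cpu_percent": 97, "disk_percent": 40}, limits)
notify(alarms, [sent.append, print])
```

Sending every alarm over more than one channel guards against the case where the failure being reported also takes out one of the notification paths.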

As you grow your cluster of servers you will need to think of these design aspects and keep on tuning your system. And, like the guys at YouTube said:

Recipe for handling rapid growth

    while (true)
    {
        identify_and_fix_bottlenecks();
        drink();
        sleep();
        notice_new_bottleneck();
    }

About the Author – Mukul Kumar

Mukul Kumar is the Co-Founder & Senior Vice President Engineering at PubMatic. PubMatic, an online advertising company that helps premium publishers maximize their revenue and protect their brands online, has its Research & Development center in Pune.

Mukul is responsible for PubMatic’s Engineering team and resides in Pune, India. Mukul was previously the Director of Engineering at PANTA Systems, a high-performance computing startup. Before that he was at VERITAS India, where he joined as the 13th employee and helped it grow to over 2,000 individuals. Mukul has filed for 14 patents in systems software, storage software, and application software. Mukul is a graduate of IIT Kharagpur with a degree in Electrical Engineering.

Mukul is very passionate about technology, and building world-class teams. His interests include architecting scalable and high-performance web-applications, handling and mining massive amounts of data and system & storage architecture.

Mukul’s email address is mukul at pubmatic.com.

DocType HTML5 – Free 1-day conference on html5/css in Pune – Dec 4

DocType HTML5 is a one day conference on HTML5, CSS3 and related technologies. This is a free, technology-focused event aimed at helping folks get started with HTML5 as a rich application platform.

DocType HTML5 will be held in Pune this Saturday, December 4, from 9am to 6pm, at COEP (College of Engineering, Pune).

This is a free event, and anybody can attend. You need to register here

The first edition of DocType HTML5 was in Bangalore on October 9. A full report of that edition is available here. That should give you an idea of what the conference is about. The schedule and list of speakers for the Pune event haven’t yet been finalized, but the talks are likely to be similar. Each edition has a different set of speakers and is customized around the interests of participants. After you register, you will be asked to pick the topics you’re interested in. They will customize the sessions and find subject matter experts based on your choices.

About the Organizers – HasGeek.in

DocType HTML5 is organized by HasGeek, a new initiative focused on creating high quality community-driven technology events.

HasGeek was created by Kiran Jonnalagadda after he realized that technology events “by the community” could be improved significantly if someone else were to take over the job of logistics and of finding sponsorships. That way, the community could focus on the content. This is what HasGeek does. It is a private company that organizes the DocType HTML5 conference in various cities (and will presumably start organizing other tech conferences in the future). The conferences are free for anybody to attend, and HasGeek takes care of the logistics (venue, lunch, tea/coffee, registration) and getting enough sponsorships to pay for all of that.

What I really like about HasGeek is the professional way in which it appears to be run. Poke around on the HasGeek wiki to understand what I mean. See the detailed updates on the feedback received from previous conferences. Look up the conference router project.

What should you do

If you would like to attend, register here.

If you would like to be a speaker, get in touch with Kiran at HasGeek.in.

Android/iPhone/BlackBerry/Nokia – Which platform(s) should developers target

(I attended the IndicThreads Conference on Mobile Application Development today. This article is based on presentations made there and conversations I had with some of the presenters.)

The smartphone market is very fragmented.

In 3Q2010, Symbian had 37% of the smartphone market, Android was second with 25% (it was at 2% 18 months ago), and iOS was in third place with 16%. RIM (BlackBerry) was next. Windows was losing.

So, what should a developer do? Which to target?

I talked to Romin Irani of Xoriant about this problem, and whether HTML5 is the answer to these issues. My key takeaways from this conversation were:

  • HTML5 is here already. I was under the impression that HTML5 is something that will arrive sometime in the near future. Romin pointed out that HTML5 support is pretty good even today, especially if you’re thinking of mobile phone browsers.
  • But HTML5 is not the answer to all your problems. If you need access to device sensors, you’re probably better off with a native app. If you want access to the appstore/marketplace, then you need a native app. HTML5 doesn’t qualify!
  • If you’re a new startup, and you want to build a mobile app, what should you do? These are the guidelines:
    • If you don’t need device sensors, and don’t need to be in the appstore/marketplace, strongly consider an HTML5+CSS+JavaScript app
    • If you want to go after the US market, you must have an iPhone native app. (Maybe followed by Android)
    • If you want to go after the European market, then you will need to have a Nokia based native app, just for the sheer numbers they have

Rohit Nayak of Talentica had talked about the use of cross-platform app development frameworks like Titanium and PhoneGap. Both allow you to write apps in JavaScript. Titanium cross-compiles them to native apps on each platform. PhoneGap uses a modified version of the browser so that your app is HTML+CSS+JavaScript, but there are modifications that allow you to access native phone features (like sensors).

There are some limitations, and such apps aren’t as good as native apps.

So, would he really recommend the use of PhoneGap/Titanium for developing apps? Rohit had this to say:

  • Titanium and PhoneGap are rapidly getting better and better. More and more apps built using them are showing up on the Android marketplace.
  • If you already know JavaScript, and need to get to the market quickly, you should definitely consider using one of these tools
  • If you don’t really need advanced native features of any specific platform, then it makes a lot of sense to go this route
  • If you are a software outsourcing company that’s building apps for third parties, you should seriously consider building a team that uses Titanium. For most of your customers, you’ll be able to quickly complete an app that satisfies them. Otherwise, you’re faced with a nightmare – you’ll need to build teams with expertise in each of the major platforms, and this is almost impossible to do with today’s attrition.

The last few points seem very similar to the advantages of HTML5, so I asked Rohit whether PhoneGap/Titanium had any advantages over HTML5. Answer:

  • PhoneGap/Titanium generally support more native features than HTML5 is planning on supporting
  • An app built with Titanium/PhoneGap can go on the appstore/marketplace.
  • An HTML5 app necessarily requires you to have a “cloud” presence – a web server and an API, and support for all the online connections. A PhoneGap/Titanium application does not require any of that.

Live-Blog: Overview of High Performance Computing by Dr. Vipin Chaudhary

(This is a live-blog of Dr. Vipin Chaudhary’s talk on Trends in High Performance Computing, organized by the IEEE Pune sub-section. Since this is being typed while the talk is going on, it might not be as well organized, or as coherent as other PuneTech articles. Also, links will usually be missing.)

Dr. Vipin Chaudhary is the CEO of CRL, the makers of Eka, one of the world's fastest privately funded supercomputers. The talk was held at the Institute of Engineers, Pune.

Myths about High Performance Computing:

  • Commonly associated with scientific computing
  • Only used for large problems
  • Expensive
  • Applicable to niche areas
  • Understood by only a few people
  • Lots of servers and storage
  • Difficult to use
  • Not scalable and reliable

This is not the reality. HPC is:

  • Backbone for national development
  • Will enable economic growth. Everything from toilets to potato chips is designed using HPC
  • Lots of supercomputing is throughput computing – i.e. used to solve lots of small problems
  • “Mainstream” businesses like Walmart, and entertainment companies like DreamWorks Studios use HPC.
  • (and a bunch of other reasons that I did not catch)

China is really catching up in the area of HPC, and Vipin correlates China’s GDP with the development of supercomputers in China. Point: technology is a driver for economic growth. We need to also invest in this.

Problems solved using HPC:

  • Movie making (like Avatar)
  • Real time data analysis
    • weather forecasting
    • oil spill impact analysis
    • forest fire tracking and monitoring
    • biological contamination prediction
  • Drug discovery
    • reduce experimental costs through simulations
  • Terrain modeling for wind-farms
    • e.g. optimized site selection, maintenance scheduling
    • and other alternate energy sources
  • Geophysical imaging
    • oil industry
    • earthquake analysis
  • Designing airplanes (Virtual wind tunnel)

Trends in HPC.

The Manycore trend.

Putting many CPUs inside a single chip. Multi-core is when you have a few cores, manycore is when you have many, many cores. This has challenges. Programming manycore processors is very cumbersome. Debugging is much harder. For example, if you need to get good performance out of these chips, then you need to do parallel assembly programming. Parallel programming is hard. Assembly programming is hard. Both together will kill you.

This will be one of the biggest challenges in computer science in the near future. A typical laptop might have 8 to 10 processes running concurrently. So there is automatic parallelism, as long as the number of cores is less than 10. But as chips get 30, 40 cores or more, individual processes will need to be parallel. This will be very challenging.

Oceans of Data but the Pipes are Skinny

Data is growing fast in the sciences, humanities, commerce, medicine and entertainment. The amount of information being created in the world is huge: emails, photos, audio, documents etc. Genomic (bio-informatics) data is also huge.

Note: data is growing way, way faster than Moore’s law!

Storing things is not a problem – we have lots of disk space. Fetching and finding stuff is a pain.

Challenges in data-intensive systems:

  • Amount of data to be accessed by the application is huge
  • This requires huge amounts of disk, and very fat interconnects
  • And fast processors to process that data

Conventional supercomputing was CPU bound. Now, we are in the age of data-intensive supercomputing. Difference: old supercomputing had storage elsewhere (away from the processor farm). Now the disks have to be much closer.
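A back-of-the-envelope calculation shows why the disks have to move closer to the processors. The numbers below are assumptions chosen for illustration and were not given in the talk: a 1 TB data set, a 10 MB analysis program, and a 1 Gbit/s link, ignoring protocol overhead.

```python
def transfer_seconds(num_bytes: float, link_bits_per_second: float) -> float:
    """Time to push num_bytes through a link of the given raw bandwidth."""
    return num_bytes * 8 / link_bits_per_second


DATASET_BYTES = 1e12   # 1 TB of data (assumed)
PROGRAM_BYTES = 10e6   # 10 MB of analysis code (assumed)
LINK_BPS = 1e9         # 1 Gbit/s interconnect (assumed)

# Shipping the data to the processors: 8000 seconds, i.e. over two hours.
move_data = transfer_seconds(DATASET_BYTES, LINK_BPS)

# Shipping the program to where the data lives: 0.08 seconds.
move_code = transfer_seconds(PROGRAM_BYTES, LINK_BPS)
```

Under these assumptions, moving the data takes about 100,000 times longer than moving the code, which is the gap that co-locating storage and compute is meant to close.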

Conventional supercomputing was batch processed. Now, we want everything in real-time. Need interactive access. To be able to run analytic and ad hoc queries. This is a new, and difficult challenge.

While Vipin was on the faculty at SUNY Buffalo, they started a data-intensive discovery initiative (Di2). Now, CRL is participating. The data sets are large and ever-changing. Collecting and maintaining data is of course a major problem, but the primary focus of Di2 is searching this data, e.g. security (finding patterns in huge logs of user actions). This requires a new architecture, different from traditional supercomputing, and the resulting Di2 system significantly outperforms the traditional system.

This also has applications in marketing analysis, financial services, web analytics, genetics, aerospace, and healthcare.

High Performance Cloud Services at CRL

Cloud computing makes sense. It is here to stay. But energy consumption of clouds is a problem.

Hence, CRL is focusing on a green cloud. What does that mean?

Data center optimization:

  • Power consumption optimization on hardware
  • Optimization of the power system itself
  • Optimized cooling subsystem
  • CFD modeling of the power consumption
  • Power dashboards

Workflow optimization (reduce computing resource consumption via efficiencies):

  • Cloud offerings
  • Virtualization
  • Workload based power management
  • Temperature aware distribution
  • Compute cycle optimization

Green applications being run in CRL

  • Terrain modeling
  • Wind farm design and simulation
  • Geophysical imaging
  • Virtual wind tunnel

Summary of talk

  • Manycore processors are here to stay
    • Programmability has to improve
    • Must match application requirements to processor architecture (one size does not fit all)
  • Computation has to move to where the data is, and not vice versa
  • Data scale is the biggest issue
    • must co-locate data with computing
  • Cloud computing will continue to grow rapidly
    • Bandwidth is an issue
    • Security is an issue
    • These issues need to be solved

6 events in next 4 days: science, maths, cleantech, IP and open source

The events in Pune in the next four days are a great example of the diversity of Pune in the “science and technology” sector. Far too often, we assume that technology means software technology, but Pune does have much more. NCL is one of the top institutes in the country for chemical technology, and has a history of coming up with chemical science breakthroughs that make it into commercial products. Today, a scientist from NCL will give a talk on the patent and other intellectual property issues that scientists and small businesses should know about. The Bhaskaracharya Pratishthana is a great institute of Mathematics, and it regularly schedules very interesting talks for people interested in Mathematics. (And if you’re a software engineer who is not interested in Mathematics, you should be ashamed of yourself.) Monday will have a talk on probiotics – the use of bacteria and other micro-organisms in industrial waste treatments and other cleantech. And by the way, if you’re interested in finding out what other world-class institutions Pune has (and it’s a huge number!), check out PuneTech’s top ranked websites of Pune page.

Click on the logo to get all PuneTech articles about events in Pune

And all of this is in addition to our usual talks on open source (the Pune Linux Users Group), issues for small startups (the Pune Open Coffee Club), and Microsoft Technologies (the Pune User Group).

This weekend – try to get exposure to a different science & technology community than the one you normally hang out with.

Here are the details:

Jul 3, 2010: Ancient Indian Combinatorial Methods – by Prof Sridharan, CMI, at Bhaskaracharya Pratishthana

Posted: 29 Jun 2010 11:08 PM PDT

Professor Sridharan, Chennai Mathematical Institute, Chennai, will
give a lecture at Bhaskaracharya Pratishthana.

Topic: Differences in Style but not in Substance: Ancient Indian
Combinatorial Methods

This lecture is free for all to attend. No registration required.

Jul 5, 2010: PuneCleanTech event: Probiotic applications in CleanTech at Venture Center, NCL Innovation Park

Posted: 29 Jun 2010 09:21 PM PDT

PuneCleanTech is proud to present an enlightening talk on ‘Probiotics in CleanTech’ on July 5th, 2010 at 4:00pm at the NCL Innovation Center. The talk will be presented by Dr. Pillai, a renowned authority on the subject. This event is supported by Fusiontech Ventures and NCL Venture Center.

As you know, Probiotics is the use of beneficial micro-organisms to increase the health, vitality and efficiency of various animal processes. The same techniques can be applied to Industrial activity in areas such as soil remediation, effluent treatment, waste management etc. The talk will focus on such applications of Probiotics.

The talk will be suitable for all entities that are actively dealing with such technologies (such as Praj) or might benefit from their applications to industrial and municipal waste management. As a result, institutions such as MCCIA and Pune Municipal Corporation might benefit from this talk. If you agree, please canvass it within your or affiliated organizations.

This broad-ranging talk should be interesting also for concerned citizenry (such as ecological society) and the scientific/technological elites (such as NCL), as well as, educational and research institutes.

As always, the talk is free but the seating is limited to first 60 people. There is no RSVP and the seating will be on a ‘first at the door gets the first chair’ basis 🙂

Jul 2, 2010: Venture Center’s IP Center Event: IP overview by Dr. Tiwari of NCL IP Group at Venture Center, NCL Innovation Park

Posted: 29 Jun 2010 09:18 PM PDT

Dr. Nitin Tiwari, a scientist with NCL and part of the NCL IP Group will talk about Intellectual Property. The focus will be general awareness of IP for small and medium enterprises.

This is a free event. It is open to all.

Jul 3, 2010: POCC Meet: “Contracts and Intellectual Property” at GrubShup

Posted: 29 Jun 2010 09:15 PM PDT

Are the following significant problem areas for your startup?
* Non-payment from clients who have already taken delivery (ITES, other domains)
* Intellectual Property (trademark violations, copyright enforcement)
* Industry Ethics, price cutting by competitors (who then don’t deliver quality)

Our next meetup is focused on how entrepreneurs deal with these issues.

Attending Counsels:
Kaushik Kute http://www.linkedin.com/pub/kaushik-kute/8/b26/1bb

This is a free event. Anybody can attend. Register here: http://punestartups.ning.com/events/event/show?id=1988582%3AEvent%3A35767&xg_source=msg_invite_event

Jul 3, 2010: Pune Linux Users Group – Monthly Meeting at Symbiosis Institute of Computer Studies and Research

Posted: 29 Jun 2010 09:11 PM PDT

PLUG meeting for July is scheduled on Saturday 3rd July, 4 pm @ SICSR

These are the details:
Location: SICSR, Atur Centre, Model Colony.
Room No 704. 7th floor ( room no. may change )
Time: 4 pm

Agenda:

1. We will have a talk on distributed version control and TeamGit by
Abhijit Bhopatkar. Abhijit Bhopatkar is the author of TeamGit
(http://www.devslashzero.com/teamgit).
Audience: Anyone interested in version control
(http://en.wikipedia.org/wiki/Revision_control), TeamGit, and/or
contributing to an interesting Qt project.
2. Open discussion and Q&A session

This event is free for all to attend. No registration required.

Jul 3, 2010: Microsoft Community Tech Day at Shekhar Natu Hall

Posted: 29 Jun 2010 07:24 PM PDT

Agenda:
9:00am – 9:30am Registration
9:30am – 9:45am Tea Break
9:45am – 10:00am Keynote
10:00am – 11:00am What’s new in Windows Server 2008 R2 SP 1 – Aviraj Ajgekar
11:00am – 12:00pm Setting Up Remote Access Service on Windows Server 2008 R2 for VPN – Dev Chaudhari
12:00pm – 1:00pm Lunch
1:00pm – 2:00pm Introduction to Forefront Identity Manager 2010 – Mayur Deshpande
2:00pm – 3:00pm Deploying application using Application Virtualization (App-V) – Ninad Doshi
3:00pm – 4:00pm Tea Break & Networking

This event is free for all to attend. Register here: http://www.communitytechdays.com/Registration1.aspx?Status=NotFound&login=offline