Web Scalability and Performance – Real Life Lessons (Pune TechWeekend #3)

Last Saturday, we had TechWeekend #3 in Pune, on the theme of Website Scalability and Performance.  Mukul Kumar, co-founder, and VP of Engineering at Pubmatic, talked about the hard lessons in scalability they learnt on their way to building a web service that serves billions of ad impressions per month.

Here are the slides used by Mukul. If you cannot see the slides, click here.
Web Scalability & Performance

The talk was live-tweeted by @punetechlive and @d7ylive. Here are a few highlights from the talk:

  • Keep it simple: If you cannot explain your application to your sales staff, you probably won’t be able to scale it!
  • Use JMeter to monitor performance, to a good job of scaling your site
  • Performance testing idea: Take 15/20 Amazon EC2 servers, run JMeter with 200threads on each for 10 hours. Bang on your website! (a few days later, @d7y pointed out that using openSTA instead of JMeter can give you upto 500 threads per server even on old machines.)
  • Scaling your application: have a loosely coupled, shared nothing, stateless, distributed architecture
  • Mysql scalability tip: Be careful before using new features, or new versions. Or don’t use them at all!
  • Website scalability: think global. Some servers in California, some servers in London, etc. Similarly, think global when designing your app. Having servers across the world will drive architecture decisions. When half your data-center is 3000 miles from the other half, interesting, non-trivial problems start cropping up. Also, think carefully about horizontal scaling (lots of cheap servers) vs vertical scaling (few big fat servers)
  • memcache tip: pre-populate memcache with most common objects
  • Scalability tip: Get a hardware load balancer (if you can afford one). Amazon AWS has some load-balancers, but they don’t perform so well
  • Remember the YouTube algo for scaling:
    while(1){
    identify_and_fix_bottlenecks();
    eat_drink();
    sleep();
    notice_new_bottleneck();
    }

    there’s no alternative to this.
  • Scalability tip: You can’t be sure of your performance unless you test with real load, real env, real hardware, real software!
  • Scalability tip – keep the various replicated copies of data loosely consistent. Speeds up your updates. But, figure out which parts of your database must be consistent at all times, and which ones can have “eventual consisteny”
  • Hard lessons: keep spare servers at all times. Keep servers independent – on failure shouldn’t affect other servers
  • Hard lessons: Keep all commands in a script. You will have to run them at 2am. Then 3am. Then 7am.
  • Hard lessons: Have a well defined process for fault identification, communication and resolution (because figuring these things out at 2am, with a site that is down, is terrible.)
  • Hard lessons: Monitor your web service from 12 cities around the world!
  • Hard lesson, Be Paranoid – At any time: servers can go down, DDOS can happen, NICs can become slow or fail!

Note: a few readers of of the live-tweets asked questions from Nashik and Bombay, and got them answered by Mukul. +1 for twitter. You should also start following.

Reblog this post [with Zemanta]

1 thought on “Web Scalability and Performance – Real Life Lessons (Pune TechWeekend #3)

Comments are closed.