All posts by Navin Kabra

Hi-Tech Pune Maharashtra 2008 Conference – Day 1

Hi-Tech Pune Maharashtra Conference 2008

The Hi-Tech Pune Maharashtra 2008 conference got underway today. Organized by the Suresh Kalmadi-backed Pune Vyaspeeth, this is the 5th installment of the conference, and in addition to IT, the focus this time is on Biotechnology and Animation. The conference is spread out over three days (18th June to 20th June) and there is a fairly interesting schedule of presentations by a diverse set of speakers.

I am live-blogging this conference so, 1) refresh on a regular basis if you’re reading this on Wednesday evening (Pune time), and 2) please excuse the terse and ungrammatical language.

The event is being live-webcast by the organizers. Go to the Pune Vyaspeeth homepage and click on the broadcast link at the bottom of the page.

The first day is mostly talks by dignitaries – Suresh Kalmadi, Jyotiraditya Scindia (Minister of State for Communications & IT Government of India), Dr. Ashok Kolaskar (VC UoP), Narayan Murthy, Dr. K. I. Varaprasad Reddy (MD, Shantha Biotechnics).

The talks:

  • Missed talks by Deepak Shikarpur, Suresh Kalmadi and Dr. Kolaskar
  • Anand Khandekar, Director, Pune Development Center & Chief Mentor, NVIDIA: “Animation is going to be the next big thing, especially in Maharashtra and Pune. And it is not restricted to the elite – it will create jobs for the rural sector too. The government must extend the same incentives to the budding animation industry as it did for the IT industry earlier.”
  • Mr. Ashish Kulkarni, CEO, BIG Animation: “Animation for a bunch of recent movies – Dashavatar, The Golden Compass – was done in Pune. All of the animation for the upcoming Krishna movie will be done in Pune.”
  • Lifetime achievement award for Narayan Murthy
  • Lifetime achievement award for Dr. Reddy.
  • Dr. Reddy heard comments that India was a beggar country, begging for vaccines from the West. At that time one of the vaccines (I forgot which one) cost $28 – completely out of reach for most poor Indians. Stung by the criticism, he gave up his career in electronics and started Shantha Biotechnics. He tried to acquire the technology and was told by the company that recombinant DNA technology was so far ahead of the capabilities of Indian scientists that it would take them 20 years to absorb it – and hence there was no point in transferring the technology to India. Miffed, Dr. Reddy hired local scientists, developed the technology indigenously in about 5 years and introduced it at a price of Rs. 50. Today it sells for Rs. 20.
  • But Dr. Reddy worries that the situation today is less than ideal. Due to the booming IT sector and the huge salaries offered there, people are no longer opting for careers in sciences. (At least not people that you would actually want to hire.)
  • Jyotiraditya Scindia is a great speaker. Spoke very well about innovation. Spoke about India’s tradition of innovation and education. Said that in modern times, our temples should be the IITs and other great educational institutions. Spoke about the need for greater collaboration between industry and educational institutions. I am not doing justice to his speech – maybe someone else who attended will do that tomorrow.
  • William A. Haseltine, President, Haseltine Foundation: India is not a subset of the world. India is a representative of the world. You have everything, from large businesses and high tech to tribal communities and poverty. You solve the problems in India and you solve the problems of the world.

The scheduled presentations are over and I am heading off to the “networking dinner”. I hadn’t expected to get an invitation to this conference, so I haven’t really made room in my calendar for attending the next two days. I might drop in for a couple of hours each day, but can’t stay the whole day. If anybody reading this is attending the conference and would like to write a report on the sessions, please let me know.


Seminar on OpenSocial presented by Google – 20 June

(This information was sent in by PuneTech reader Shardul Mohite)

CSI Pune presents
What: Seminar on Google’s OpenSocial platform
When: 6:30pm to 8:30pm, Friday 20th June, 2008
Where: Persistent Systems, “Bhageerath”, SB Road
Who can attend: Free for CSI members, Rs. 100 for others. Registration will be at the venue

Details:

Writing Apps for Orkut and Other OpenSocial Containers

The web is better when it’s social

The web is more interesting when you can build apps that easily interact with your friends and colleagues. But with the trend towards more social applications also comes a growing list of site-specific APIs that developers must learn.

OpenSocial defines a common API for social applications across multiple websites. With standard JavaScript and HTML, developers can create apps that access a social network’s friends and update feeds.

Many sites, one API

A common API means you have less to learn to build for multiple websites. OpenSocial is currently being developed by a broad set of members of the web community. The ultimate goal is for any social website to be able to implement the API and host 3rd party social applications. There are many websites implementing OpenSocial, including Engage.com, Friendster, hi5, Hyves, imeem, LinkedIn, MySpace, Ning, Oracle, orkut, Plaxo, Salesforce.com, Six Apart, Tianji, Viadeo, and XING.

Overview of the talk

  • What is OpenSocial?
  • OpenSocial current status
  • Building an OpenSocial Application
  • OpenSocial container
  • What is next for OpenSocial?
  • Where to find the information about OpenSocial?

About the speakers

Rajdeep Dua

Rajdeep is with the Google Developer API Evangelism team, working on OpenSocial advocacy. He has around 10 years of experience in the middleware, web services and integration space. Before joining Google, he led the development effort for the Connected Services Framework (CSF) initiative at Microsoft India. He has also contributed to JBoss open source development in the past.

Rajdeep holds an MBA from IIM Lucknow, India.

Rohit Ghatol

Rohit is part of the Google Developer API Evangelism team, working on OpenSocial advocacy and support. He has around 6 years of experience in the Java enterprise and Web 2.0 technologies space. Before joining Google, he was a project manager with a firm in Pune working exclusively on Ajax technology.


Pune engineer honored as RHCE of the year

From Red Hat News:

The achievements of Red Hat Certified Engineers (RHCEs) from around the world will be honored for the third consecutive year at the upcoming Red Hat Summit, June 18 – 20 in Boston, Mass. RHCE of the Year awards will be granted to five individuals – one each from the United States, Canada, Asia-Pacific, India and Europe.

The RHCE of the Year award gives Red Hat an opportunity to acknowledge the contributions of five extremely resourceful and capable individuals whose winning submissions highlight the value of the Red Hat certifications for the enterprise, the career and the community. The awards also recognize the contributions of our certified community as a whole. More than 500 RHCEs entered this year’s contest by answering the question: “why should you be considered RHCE of the Year?” in 1,000 words or less.

[…]

Anil Waychal, India’s winner, led his company’s efforts in migrating more than 350 systems to Red Hat Enterprise Linux. The end result was a large cost savings and a significant boost for Suma Soft’s overall security.

Suma Soft is a Pune-based company that provides development and support services in web technologies and security.


Pune OpenCoffee Club Meeting – 14 June

What: Meeting of the Pune OpenCoffee Club – theme: startup mentoring, incubating and collaborations

When: Saturday, 14th June, 5pm

Where: Barista, Law College Road, opposite IndSearch

This is the (loose) agenda:

  • Synergising startups through service provider – client relationships, or just good old-fashioned business collaboration.
  • Freeman Murray will lead a discussion about the possibility of Y!Combinator style startup incubation in Pune
  • Hemant Joshi from nFactorial software will talk about his experiences and thoughts on mentoring startups and entrepreneurs in Pune

Data Leakage Prevention – Overview

A few days ago, we posted a news article on how Reconnex has been named a top leader in Data Leakage Prevention (DLP) technology by Forrester Research. We asked Ankur Panchbudhe of Reconnex, Pune to write an article giving us a background on what DLP is, and why it is important.

Data leakage protection (DLP) is a solution for identifying, monitoring and protecting sensitive data or information in an organization according to policies. Organizations can have varied policies, but typically they tend to focus on preventing sensitive data from leaking out of the organization and identifying people or places that should not have access to certain data or information.

DLP is also known by many other names: information security, content monitoring and filtering (CMF), extrusion prevention, outbound content management, insider threat protection, information leak prevention (ILP), etc.

Need for DLP

Until a few years ago, organizations thought of data/information security only in terms of protecting their network from intruders (e.g. hackers). But with the growing amount of data, rapid growth in the size of organizations (e.g. due to globalization), a rise in the number of data points (machines and servers) and easier modes of communication (e.g. IM, USB, cellphones), accidental or even deliberate leakage of data from within the organization has become a painful reality. This has led to growing awareness about information security in general and about outbound content management in particular.

Following are the major reasons (and examples) that make an organization think about deploying DLP solutions:

  • growing cases of data and IP leakages
  • regulatory mandates to protect private and personal information
    • for example, the case of Monster.com losing over a million private customer records due to phishing
  • protection of brand value and reputation
    • see above example
  • compliance (e.g. HIPAA, GLBA, SOX, PCI, FERPA)
    • for example, Ferrari and McLaren engaging in anti-competitive practices by allegedly stealing internal technical documents
  • internal policies
    • for example, Facebook leaking some pieces of their code
  • profiling for weaknesses
    • Who has access to what data? Is sensitive data lying on public servers? Are employees doing what they are not supposed to do with data?

Components of DLP

Broadly, the core DLP process has three components: identification, monitoring and prevention.

The first, identification, is a process of discovering what constitutes sensitive content within an organization. For this, an organization first has to define “sensitive”. This is done using policies, which are composed of rules, which in turn could be composed of words, patterns or something more complicated. These rules are then fed to a content discovery engine that “crawls” data sources in the organization for sensitive content. Data sources could include application data like HTTP/FTP servers, Exchange, Notes, SharePoint and database servers, repositories like filers and SANs, and end-user data sources like laptops, desktops and removable media. There could be different policies for different classes of data sources; for example, the policies for SharePoint could try to identify design documents whereas those for Oracle could be tuned to discover credit card numbers. All DLP products ship with pre-defined policy “packages” for well-known scenarios, like PCI compliance, credit card and social security leakage.
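
To make the identification step concrete, here is a minimal sketch (in Python, purely illustrative – the rule patterns, names and structure are my own assumptions, not any DLP product's actual design) of a policy expressed as named regular-expression rules and a toy discovery engine that crawls a directory tree for matches.

```python
# Minimal sketch of the identification step: a "policy" is a set of named rules
# (regular expressions here), and a toy discovery engine walks a directory tree
# looking for matches. All names and patterns are illustrative assumptions.
import os
import re

POLICY = {
    # 16-digit card-like numbers, optionally separated by spaces or dashes
    "credit_card": re.compile(r"\b(?:\d[ -]?){15}\d\b"),
    # US social security number pattern
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def discover(root):
    """Crawl `root` and report files whose text matches any policy rule."""
    findings = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="ignore") as fh:
                    text = fh.read()
            except OSError:
                continue
            for rule_name, pattern in POLICY.items():
                if pattern.search(text):
                    findings.append((path, rule_name))
    return findings

if __name__ == "__main__":
    for path, rule in discover("."):
        print(f"{path}: matches rule '{rule}'")
```

A real engine would add connectors for non-filesystem sources and far richer rule types, but the shape of the loop – enumerate data, apply policy rules, record matches – stays the same.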

The second component, monitoring, typically deployed at the network egress point or on end-user endpoints, is used to flag data or information that should not be going out of the organization. This flagging is done using a set of rules and policies, which could be written independently for monitoring purposes, or could be derived from information gleaned during the identification process (previous paragraph). The monitoring component taps into raw data going over the wire, does some (optional) semantic reconstruction and applies policies to it. Raw data can be captured at many levels – network level (e.g. TCP/IP), session level (e.g. HTTP, FTP) or application level (e.g. Yahoo! Mail, GMail). The level at which raw data is captured determines whether, and how much, semantic reconstruction is required. The reconstruction process tries to assemble fragments of raw data into processable information, on which policies can be applied.

The third component, prevention, is the process of taking some action on the data flagged by the identification or monitoring component. Many types of actions are possible – blocking the data, quarantining it, deleting, encrypting, compressing, notifying and more. Prevention actions are also typically configured using policies and hook into identification and/or monitoring policies. This component is typically deployed along with the monitoring or identification component.

In addition to the above three core components, there is a fourth piece which can be called control. This is the component through which the user can centrally manage and monitor the whole DLP process. It typically includes the GUI, the policy/rule definition and deployment module, process control, reporting and various dashboards.

Flavors of DLP

DLP products are generally sold in three “flavors”:

  • Data in motion. This flavor corresponds to a combination of the monitoring and prevention components described in the previous section. It is used to monitor and control outgoing traffic. This is the hottest-selling DLP solution today.
  • Data at rest. This is the content discovery flavor that scours an organization’s machines for sensitive data. This solution usually also includes a prevention component.
  • Data in use. This solution consists of agents that run on servers and end-users’ laptops or desktops, keeping watch on all activities related to data. They typically monitor and prevent activity on file systems and removable media such as USB drives, CDs and Bluetooth.

These individual solutions can be (and are) combined to create a much more effective DLP setup. For example, data at rest could be used to identify sensitive information, fingerprint it and deploy those fingerprints with data in motion and data in use products for an all-scenario DLP solution.
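
As a rough illustration of that combination (the hashing scheme and function names below are my own assumptions, not Reconnex's or any vendor's actual approach), the sketch fingerprints documents found during a data-at-rest scan and lets a data-in-motion check test outbound payloads against those fingerprints.

```python
# Sketch of combining DLP flavors: a data-at-rest scan fingerprints sensitive
# documents, and the data-in-motion monitor checks outbound payloads against
# those fingerprints. The normalization and hashing scheme is illustrative only.
import hashlib

def fingerprint(text):
    """Whole-document fingerprint over lower-cased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Built during the data-at-rest scan (e.g. by the discovery sketch shown earlier).
sensitive_fingerprints = {
    fingerprint("Project Phoenix design document - proprietary and confidential"),
}

def is_sensitive(outbound_payload):
    """Data-in-motion check: does the payload match a known fingerprint?"""
    return fingerprint(outbound_payload) in sensitive_fingerprints

print(is_sensitive("project phoenix   design document - proprietary and confidential"))  # True
print(is_sensitive("lunch menu for Friday"))                                             # False
```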

Technology

DLP solutions classify data in motion, at rest, and in use, and then dynamically apply the desired type and level of control, including the ability to perform mandatory access control that can’t be circumvented by the user. DLP solutions typically:

  • Perform content-aware deep packet inspection on outbound network communication including email, IM, FTP, HTTP and other TCP/IP protocols
  • Track complete sessions for analysis, not individual packets, with full understanding of application semantics
  • Detect (or filter) content based on policy-defined rules
  • Use linguistic analysis techniques beyond simple keyword matching for monitoring – e.g. advanced regular expressions, partial document matching, Bayesian analysis and machine learning (a sketch of partial document matching follows this list)
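
Here is the promised sketch of partial document matching, one way to go beyond simple keyword matching: hash overlapping word shingles of a protected document and flag outbound text that shares enough of them. The shingle size, hash choice and threshold are arbitrary illustrative values, not anything a particular product uses.

```python
# Sketch of partial document matching via word shingles: outbound text that
# reuses enough contiguous word sequences from a protected document is flagged,
# even if it is only an excerpt. Parameters below are illustrative assumptions.
import hashlib

def shingles(text, k=5):
    """Hashes of all overlapping k-word sequences in `text`."""
    words = text.lower().split()
    return {
        hashlib.md5(" ".join(words[i:i + k]).encode("utf-8")).hexdigest()
        for i in range(max(len(words) - k + 1, 1))
    }

protected = shingles(
    "the quarterly revenue projections for the new product line "
    "are strictly confidential and must not leave the company"
)

def partial_match(outbound, threshold=0.3):
    """Flag outbound text if a large enough fraction of its shingles is protected."""
    out = shingles(outbound)
    overlap = len(out & protected) / len(out)
    return overlap >= threshold

# An excerpt of the protected document still triggers a match.
print(partial_match("revenue projections for the new product line are strictly confidential"))
print(partial_match("see you at the team lunch on friday afternoon everyone"))
```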

Content discovery makes use of crawlers to find sensitive content in an organization’s network of machines. Each crawler is composed of a connector, a browser, a filtering module and a reader. A connector is a data-source-specific module that helps in connecting to, browsing and reading from a data source; so there are connectors for various types of data sources like CIFS, NFS, HTTP, FTP, Exchange, Notes, databases and so on. The browser module lists all the data that is accessible within a data source. This listing is then filtered depending on the requirements of discovery. For example, if the requirement is to discover and analyze only source code files, then all other types of files will be filtered out of the listing. There are many dimensions (depending on metadata specific to a piece of data) on which filtering can be done: name, size, content type, folder, sender, subject, author, dates etc. Once the filtered list is ready, the reader module does the job of actually downloading the data and any related metadata.
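
The sketch below shows one possible decomposition (an assumption for illustration, not any vendor's actual design) of such a crawler into the connector, browser, filter and reader modules described above, using the local file system as the data source.

```python
# Illustrative crawler decomposition: connector (reaches a data source),
# browser (lists accessible items with metadata), filter (keeps only what the
# discovery run cares about) and reader (fetches the actual content).
import os

class FileSystemConnector:
    """Connector for one kind of data source (here, a local directory tree)."""

    def __init__(self, root):
        self.root = root

    def browse(self):
        """Browser: yield every accessible item along with basic metadata."""
        for dirpath, _dirs, files in os.walk(self.root):
            for name in files:
                path = os.path.join(dirpath, name)
                yield {"path": path, "name": name, "size": os.path.getsize(path)}

    def read(self, item):
        """Reader: download the actual data for one listed item."""
        with open(item["path"], "r", errors="ignore") as fh:
            return fh.read()

def keep(item):
    """Filter: e.g. only source code files smaller than 1 MB."""
    return item["name"].endswith((".c", ".py", ".java")) and item["size"] < 1_000_000

connector = FileSystemConnector(".")
for item in connector.browse():
    if keep(item):
        text = connector.read(item)
        # ...hand `text` to the policy engine (see the identification sketch above)
```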

The monitoring component is typically composed of the following modules: data tap, reassembly, protocol analysis, content analysis, indexing engine, rule engine and incident management. The data tap captures data from the wire for further analysis (e.g. WireShark, aka Ethereal). As mentioned earlier, this capture can happen at any protocol level – this differs from vendor to vendor, depending on design philosophy. After data is captured from the wire, it is beaten into a form that is suitable for further analysis. For example, captured TCP packets could be reassembled into a higher-level protocol like HTTP and further into application-level data like Yahoo! Mail.

Once the data is in an analyzable form, a first level of policy/rule evaluation is done using protocol analysis. Here, the data is parsed for protocol-specific fields like IP addresses, ports, possible geographic locations of IPs, To, From, Cc, FTP commands, Yahoo! Mail XML tags, GTalk commands and so on. Policy rules that depend on any such protocol-level information are evaluated at this stage. An example: outbound FTP to any IP address in Russia. If a match occurs, it is recorded with all relevant information in a database.

The next step, content analysis, is more involved: first, the actual data and metadata are extracted out of the assembled packets, and then the content type of the data (e.g. PPT, PDF, ZIP, C source, Python source) is determined using signatures and rule-based classification techniques (the Unix “file” command does something similar, but less powerful). Depending on the content type, text is extracted along with as much metadata as possible. Now content-based rules are applied – for example, disallow all Java source code. Again, matches are stored. Depending on the rules, more involved analysis like classification (e.g. Bayesian), entity recognition, tagging and clustering can also be done. The extracted text and metadata are passed on to the indexing engine, where they are indexed and made searchable. Another set of rules, which depend on the contents of the data, are evaluated at this point; an example: stop all MS Office or PDF files containing the words “proprietary and confidential” with a frequency of at least once per page. The indexing engine typically makes use of an inverted index, but there are other ways too. This index can also be used later for ad-hoc searches (e.g. for deeper analysis of a policy match).

All along this process, the rule engine keeps evaluating many rules against many pieces of data and keeps track of all the matches. The matches are collated into what are called incidents (i.e. actionable events, from an organization’s perspective) with as much detail as possible. These incidents are then notified or shown to the user and/or sent to the prevention module for further action.
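
A much-simplified sketch of that flow is below; capture and reassembly are skipped, and the rules are invented examples in the spirit of the ones quoted above (outbound FTP to Russia, a confidentiality marking on every page of an Office or PDF file). None of the names, fields or thresholds reflect a real product.

```python
# Simplified monitoring pipeline: protocol-level rules look at metadata from
# protocol analysis, content-level rules look at extracted text and metadata,
# and every match is collated into an "incident". All rules are invented.
import re

def protocol_rules(meta):
    """Rules over protocol-level fields (protocol, destination, sender, ...)."""
    if meta.get("protocol") == "ftp" and meta.get("dest_country") == "RU":
        yield "outbound FTP to Russia"

def content_rules(meta, text):
    """Rules over extracted content and its metadata."""
    pages = max(meta.get("pages", 1), 1)
    hits = len(re.findall(r"proprietary and confidential", text, re.IGNORECASE))
    if meta.get("content_type") in ("pdf", "ms-office") and hits >= pages:
        yield "confidential marking on every page"

def analyze(meta, text):
    """Collate all rule matches into incidents for the prevention module."""
    matches = list(protocol_rules(meta)) + list(content_rules(meta, text))
    return [{"rule": rule, "meta": meta} for rule in matches]

incidents = analyze(
    {"protocol": "smtp", "content_type": "pdf", "pages": 2, "dest_country": "US"},
    "Page 1: proprietary and confidential ... Page 2: Proprietary and Confidential ...",
)
print(incidents)   # one incident: the per-page confidentiality rule matched
```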

The prevention module contains a rule engine, an action module and (possibly) connectors. The rule engine evaluates incoming incidents to determine the action(s) that need to be taken. Then the action module kicks in and does the appropriate thing, like blocking the data, encrypting it and sending it on, quarantining it and so on. In some scenarios, the action module may require help from connectors to take the action. For example, for quarantining, a NAS connector may be used, or for putting data on legal hold, a CAS system like Centera may be deployed. Prevention during content discovery also needs connectors to take actions on data sources like Exchange, databases and file systems.
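
To round off the picture, here is a minimal, assumed sketch of how the prevention module's rule engine and action module could fit together. The actions only print what they would do; a real deployment would call into network devices or storage connectors instead.

```python
# Sketch of the prevention module: a rule engine maps an incident to an action,
# and an action module carries it out. Actions here are simulated with prints.
def decide_action(incident):
    """Rule engine: pick an action for an incident (illustrative policy)."""
    if incident["rule"] == "outbound FTP to Russia":
        return "block"
    if "confidential" in incident["rule"]:
        return "quarantine"
    return "notify"

def take_action(action, incident):
    """Action module: perform (or here, just describe) the chosen action."""
    if action == "block":
        print("blocking transfer:", incident["meta"])
    elif action == "quarantine":
        print("quarantining content via a storage connector:", incident["meta"])
    else:
        print("notifying the administrator:", incident["meta"])

# Example incident, shaped like the ones produced in the monitoring sketch above.
sample_incident = {
    "rule": "confidential marking on every page",
    "meta": {"protocol": "smtp", "content_type": "pdf"},
}
take_action(decide_action(sample_incident), sample_incident)
```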

Going Further

There are many “value-added” things that are done on top of the functionality described above. These are sometimes sold as separate features or products altogether.

  • Reporting and OLAP. Information from matches and incidents is fed into cubes and data warehouses so that OLAP and advanced reporting can be done with it.
  • Data mining. Incident/match information or even stored captured data is mined to discover patterns and trends, plot graphs and generate fancier reports. The possibilities here are endless and this seems to be the hottest field of research in DLP right now.
  • E-discovery. Here, factors important from an e-discovery perspective are extracted from the incident database or captured data and then pushed into e-discovery products or services for processing, review or production purposes. This process may also involve some data mining.
  • Learning. Incidents and mined information are used to provide feedback to the DLP setup. Over time, this can improve existing policies and even suggest new policy options.
  • Integration with third-parties. For example, integration with BlueCoat provides setups that can capture and analyze HTTPS/SSL traffic.

DLP in Reconnex

Reconnex is a leader in DLP technology and the DLP market. Its products and solutions deliver accurate protection against known data loss, and provide the only solution in the market that automatically learns what your sensitive data is as it evolves in your organization. As of today, Reconnex protects information for more than one million users. Reconnex starts with the protection of obvious sensitive information like credit card numbers, social security numbers and known sensitive files, but goes further by storing and indexing up to all communications and all content – it is the only company in this field to do so. Capturing and indexing all content enables organizations to learn what information is sensitive and who is allowed to see it, or conversely, who should not see it. Reconnex is also well known for its unique case management capabilities, where incidents and their disposition can be grouped, tracked and managed as cases.

Reconnex is also the only solution in the market that is protocol-agnostic. It captures data at the network level and reconstructs it to higher levels – from TCP/IP to HTTP, SMTP and FTP to GMail, Yahoo! Chat and Live.

Reconnex offers all three flavors of DLP through its three flagship products: iGuard (data-in-motion), iDiscover (data-at-rest) and Data-in-Use. All its products have consistently been rated highly in almost all surveys and opinion polls. Industry analysts Forrester and Gartner also consider Reconnex a leader in this domain.

About the author: Ankur Panchbudhe is a principal software engineer at Reconnex, Pune. He has more than 6 years of R&D experience in the domains of data security, archiving, content management, data mining and storage software. He has 3 patents granted and more than 25 pending in the fields of electronic discovery, data mining, archiving, email systems, content management, compliance, data protection/security, replication and storage. You can find Ankur on Twitter.


Reconnex named a leader in DLP by Forrester

(News item forwarded to PuneTech by Anand Kekre of Reconnex)

Reconnex has been named a “top leader” in the data leak prevention space by Forrester in its DLP Q2 2008 report.

DLP software allows a company to monitor all data movements in the company and ensure that “sensitive” data (i.e. intellectual property, financial information, etc.) does not go out of the company. Reconnex and Websense have been named as the two top leaders in this space by Forrester.

Forrester employed approximately 74 criteria in the categories of current offering, strategy, and market presence to evaluate participating vendors on a scale from 0 (weak) to 5 (strong).

  • Reconnex received a perfect score of 5.0 in the sub-categories of data-in-motion (i.e., the network piece of DLP), unified management, and administration
  • Reconnex tied for the top scores in the sub-categories of data-at-rest (i.e., discovery) and forensics

“Reconnex offers best-in-class product functionality through its automated classification and analysis engine, which allows customers to sift through the actual data that the engine monitors to learn what is important to protect,” according to the Forrester Wave: Data Leak Prevention, Q2 2008 Report. “This solution stands out because it is the only one that automatically discovers and classifies sensitive data without prior knowledge of what needs to be protected.”

For more information about this award, see Reconnex’s press release.

For more information about Reconnex technology, see the PuneTech wiki profile of Reconnex.


User groups in Pune & meetings this weekend: Java, Linux, Flex, Ruby

I just added information about four user groups in Pune to the Groups and Organizations page in the PuneTech wiki: PuneRuby, PuneJava, Pune Linux Users Group (PLUG), and Pune Flex Users Group (PuneFUG). Please take a look at their pages to get an idea of their activities. PLUG and PuneJava have regular meetings. PuneRuby and PuneJava have very active mailing lists. PuneFUG is relatively new, but it looks like they will have regular meetings.

PuneJava has a talk on agile development this Saturday at 6pm. (You should already know this; otherwise subscribe to PuneTech updates!) Just before that, at the same location, PLUG is holding its monthly meeting (see their website for details). Pune Flex Users Group is holding a meeting on Sunday at 5pm (see their wiki for details).

Also, the Pune OpenCoffee Club (for entrepreneurs, and others interested in the startup ecosystem in Pune) is planning on a meeting next weekend. Chip in on that discussion if you want to influence the time/location or agenda.

Update: Rohit points out in the comments that there’s a Pune bloggers lunch on Saturday at 12:30pm.

Agile Development for Java Enterprise Applications – 7 June

What: Talk on “Agile Development for Java Enterprise Applications” by Prerna Patil from Oracle Financial Services (formerly i-Flex solutions)

When: Saturday, 7th June 2008, 6.00 pm – 7.30 pm
Where: Symbiosis Institute of Computer Studies and Research (SICSR), 7th floor, Atur Center, Model Colony, Pune
* Entry is free of cost. No registration required. Entry on a first-come, first-served basis.
This event is an activity of the PuneJava group.

Session Overview: Agile is a way to quickly develop working applications by focusing on evolving requirements rather than processes. Agile development is done in an iterative manner, with short requirement cycles, quick builds and frequent releases. Compared to traditional practices like the waterfall model, agile methodology makes development easier, faster and more adaptive.

The session would provide a roadmap for building enterprise-class Java applications using agile methods. It would include an introduction to agile methodology and when and why it should be used. Various practices used for agile development (agile modeling, Agile Draw, agile estimation) would be discussed. An agile-development case study would be built using lightweight technologies like Spring, ORM for database handling, a test-driven development approach, and build management and configuration control in a concurrent development environment. The session would also include coding practices to make code adaptable to new requirements, and tips for using IDEs (Eclipse and NetBeans) for agile development.

Speaker Bio: Prerana Patil has over 5 years of experience working with Java and Java enterprise applications. She currently works in the Technology Practice group of Oracle Financial Services (formerly i-flex Solutions Limited). She holds a Masters in Computer Science from UoP and loves exploring new things in the software world. She has been involved in various training programs on Java and Java EE.
* For those not in Pune or unable to attend, add your queries in the comments section at http://www.indicthreads.com/news/1228/agile_development_enterprise_java_meet.html and the organizers promise to get them answered at the meet.

Linux Kernel Internals – 7 June

Posted on behalf of Anurag Agarwal of KQ Infotech. This is a paid training program. See the end of this article for fees and other logistics. Disclaimer: PuneTech does not accept any remuneration, monetary or otherwise, for publishing content. Postings of a commercial nature (e.g. a paid training program) are published solely on the basis of whether or not they fit in with the charter of PuneTech and whether readers would find them interesting. Please let me know your views on this issue. I’m posting this one because it involves “deep” technology, I think it would be of interest to a number of PuneTech readers, and because I can recommend the program based on the reputation of the trainers.

Are you planning to make a career in Linux kernel development but don’t have the right skills?
Have you burned your fingers writing your own code in the Linux kernel?

KQ Infotech is launching a unique training and mentoring program for you. There are two parts to this program: the first is a forty-hour intensive training, and the second is long-term mentoring.

The training will provide you with a good understanding of all the subsystems of Linux. It will enable you to debug and modify the Linux kernel and write various device drivers. There will be a number of theory sessions, practical sessions, peer code reviews and code walkthroughs. All assignments will be targeted towards specific Linux kernel functionality.

But forty hours of training alone is unlikely to make you a good Linux kernel programmer; in the absence of a real Linux kernel project, it is very unlikely that you will become one. For this purpose, a special mentoring program will follow the forty hours of training.

All students will be provided with a medium-term Linux kernel project. KQ Infotech will facilitate a two-hour weekly meeting to discuss and solve problems related to this project. This program will enable one to become a good Linux kernel programmer in three to six months.

About the Trainers
We have 7 to 11 years of experience developing systems storage software and virtualization solutions at Symantec (formerly Veritas) and are well versed in the challenges and techniques of engineering in kernel space. We believe that our strong development background on heterogeneous platforms (Linux, Solaris and AIX) will enable us to give our audience a better perspective on developing in the Linux kernel.

Course Contents
This course assumes good knowledge of C, familiarity with editors (vi/emacs) and basic concepts of userland and kernel space.

This course is designed for people with day jobs. It will be delivered on Saturdays and Sundays, for five hours each day. There will be assignments, code reviews and code walkthroughs.

A brief overview of the course contents is below

Day 1: Introduction to Linux

  • Linux kernel features
  • Code layout
  • Building the Linux kernel
  • Introduction to the kernel module API
  • Legal issues with the Linux kernel
  • Introduction to system calls

Day 2: Process Management and Scheduling

  • Introduction to process management
  • The Linux kernel O(1) scheduler
  • The fork system call
  • Signals and signal handlers
  • Debugging techniques for the Linux kernel
  • The proc file system

Day 3: Synchronization

  • Important data structures in the kernel
  • Synchronization primitives in the Linux kernel

Day 4: Interrupts

  • Introduction to interrupts and ISRs
  • Introduction to bottom halves and tasklets

Day 5: File Systems

  • Introduction to the VFS
  • The ext2/ext3 file systems

Day 6: Device Drivers

  • Introduction to character device drivers
  • Introduction to block device drivers

Days 7 and 8: Memory Management

  • Introduction to Linux memory management
  • Allocators and allocation schemes
  • Process view of memory, kernel view of memory
  • Drill-down on x86 specifics: L1, L2, TLB
  • Paging and swapping

Logistics:
The first batch of this course will start on 7th June.
The course will be delivered at Pune IT Park.
There will be a five-hour session on both Saturday and Sunday.

Fee:
There is one combined fee for the training and mentoring program: Rs. 25,000. There is a special discount of 20% for the first batch, so the fee for the first batch will be Rs. 20,000.

Contact us:
Anurag Agarwal: talk2anurag@gmail.com or 9881254401
Anand Mitra: anand.mitra@gmail.com or 9881296791

CSI-Pune’s ILM Seminar – A Report

CSI-Pune conducted a half-day workshop on Information Lifecycle Management (ILM). T.M. Ravi, founder and CEO of Mimosa Systems, gave the keynote presentation. There were product/project pitches from IBM, Zmanda and Coriolis, a talk on storage trends by Abhinav Jawadekar, and finally a panel discussion with representation from Symantec (V. Ganesh), BMC (BladeLogic; Monish Darda), Zmanda (K K George), IBM, Symphony (Surya Narayanan) and nFactorial (Hemant Joshi).

Here are my cryptic notes from the conference:

  • T.M. Ravi, CEO of Mimosa, gave a talk on what he sees as the challenges in storage/ILM. New requirements coming from customers – huge amounts of user-generated unstructured data in enterprises, which must be managed properly for legal, security and business reasons. Interesting new trends coming from the technology side – new/cheap disks, de-duplication, storage-intensive apps (e.g. video), flash storage, green storage (i.e. energy-conscious storage), SaaS and storage in the cloud (e.g. Amazon S3). Based on this, storage software should focus on three things: 1. increase the information content of data, 2. improve security, 3. reduce legal risk. He then segued into a pitch for Mimosa’s products – i.e. you must have an enterprise-wide archive: 1. continuous capture (i.e. store all versions of the data), 2. full-text indexing of all content so users can search by keyword, 3. single instance storage (SIS), aka de-duplication, to reduce storage requirements, 4. retention policies. Mimosa is an archiving appliance that can be used for 1. e-discovery, 2. recovery, 3. end-user searches, 4. storage cost reduction.
  • Then there was a presentation from IBM on General Parallel File System (GPFS). Parallel, highly available distributed file system. I did not really understand how this is significantly different from all the other such products already out there. Also, I am not sure what part of this work is being done in Pune. Caching of files over WAN in GPFS (to improve performance when it is being accessed from a remote location) is being developed here (Ujjwal Lanjewar).
  • There was also a presentation on the SAN simulator tool. This is something that allows you to simulate a storage area network, including switches and disk arrays. It has been open-sourced and can be downloaded here. A lot of the work for this tool happens in Pune (Pallavi Galgali).
  • K K George from Zmanda demonstrated the recovery manager for MySQL. This whole product has been architected and developed in Pune.
  • Bernali from Coriolis demonstrated CoLaMa – a virtual machine lifecycle manager. This is essentially CVS for virtual machine images: version management software to keep track of all your VM images. Check out an image, work on it, check it in, and a new version gets stored in the repository. It only stores the differences between images, so there are space savings. It auto-extracts info like the OS, patch level, etc.
  • Coriolis’ demo was the only live demo. The others were flash demos, which looked lame (and had audio problems). Suggestion to all – if you are going to give a flash demo, at least turn off the audio and do the talking yourself. This would engage the audience much better.
  • Abhinav Jawadekar gave a nice introductory talk on the various interesting technologies and trends in storage. It would have been very useful for someone new to the field; however, in this case, I think it was wasted on an audience most of whom have been doing this for 5+ years. The only new stuff was in the last few slides, which were about energy-aware storage (aka green storage). (For example, he pointed out that data-center-class storage in Pune is very expensive – due to power, cooling, UPS and genset, the operating costs of a 42U rack are $800 to $900 per month.)
  • The panel discussion touched upon a number of topics, not all of them interesting. I did not really capture notes of that.

Overall, it was an interesting evening. With about 50 people attending, the turnout was a little lower than I expected. I’m not sure what needs to be done in Pune to get people to attend. If you have suggestions, let me know. If you are interested in getting in touch with any of the people mentioned above, let me know, and I can connect you.