Patrick McGovern - Splunk
Speaking Topic: Leveraging the IT community

Prior to joining Splunk, Patrick oversaw SourceForge.net, the world's largest Open Source development website, for five years. Patrick grew the site from a few thousand users to over a million registered Open Source developers and 97,000 Open Source software projects. SourceForge.net is now the de facto repository for Open Source software.

Troubleshooting servers can be a daunting task, particularly at 3am when everyone else is asleep. Leveraging the expertise and knowledge of IT professionals around the world to find a solution can greatly reduce the mean time to resolution. Many resources on the web hold parts of the solution, but none of them is reasonably complete. This talk focuses on building a new, broad resource for sysadmins, by sysadmins. Starting with the concepts that made Wikipedia a great success and tools like grep so universal, this talk will discuss the challenges of building a large IT community and knowledge base, and how to bring IT troubleshooting to the next level.

Applications continue to become more on-demand, and application environments continue to grow in complexity and scale. In the early days of IT, maintaining availability was straightforward because components from a single vendor were built to work well together. But the days of homogeneous data centers are past. As our IT infrastructures become complex combinations of components from different vendors, architects and engineers, managed by different system administrators, we are less able to anticipate the interactions among components. Only at runtime do we discover the issues and problems. Operator errors, configuration changes, dependency and integration issues, ineffective architectures and discrete component defects can now snowball, quickly degrading performance and functionality and causing outages. The sheer amount of data generated by hundreds of components in a single environment can be overwhelming. Trying to sort through it at runtime is like drinking from the proverbial fire hose. Yet real-time troubleshooting is now a requirement for every application administrator, developer and support person. In this presentation, Patrick McGovern will discuss how system administrators can understand what's happening with their distributed applications and troubleshoot problems more effectively.

We'll examine the tools and techniques necessary for troubleshooting and provide practical advice through real-world examples. Attendees will learn more effective approaches to identifying, analyzing and resolving both common and complex problems.

Listen Now! [.mp3]