Diving into Production Issues at Scale
Problems with single hosts are challenging enough. Scaling up to hundreds or thousands of running hosts only multiplies the problems. However, troubleshooting and remediating production issues at scale can also be much easier than on smaller installations. In this talk, we will explore a real production issue or two around a Python application to highlight some sound techniques and approaches to handling services at scale. We will explore some specific how-tos, ranging from simple tasks such as reading logs to more complicated matters pertaining to the overall state of the runtime environment.