Software mitigations for hardware vulnerabilities
This presentation focuses on the common characteristics of recently disclosed methods that target the internal structures of modern CPUs: how they compare to traditional attack techniques, how malicious actors might try to infer secret data, and what software mitigations exist for them. It will describe some of the software changes introduced in both operating systems and microcode to mitigate recently disclosed hardware vulnerabilities, and it will detail the threat model of some of these methods. This threat model is key to performing the necessary risk analysis before mitigations are deployed and enabled on production systems.

We will also walk through the phases that this mitigation process follows. The first step is to mitigate the issue itself. Sometimes new hardware functionality is required, and this functionality can only be provided by new microcode. A microcode update might provide a tool to deal with potential implementations of these methods, but that tool still needs to be used at runtime. We have seen that these changes are often applied on ring transitions, so that the new functionality is invoked during system calls.

Changes in the Linux kernel also include methods to enable, disable, and configure these mitigations at boot time: the kernel is modified to accept new command-line parameters that change its behavior. However, this means the system would need to be rebooted whenever the sysadmin wants to change the configuration of a mitigation, so additional options to reconfigure the mitigations at runtime are sometimes provided as well. sysfs is also used to report the status of the system: whether it is vulnerable, whether it has been mitigated, what type of mitigation is currently in place, and whether more actions are required.
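As a hedged illustration of this workflow (the parameter values shown are examples and depend on kernel version and CPU), a sysadmin might select mitigations on the kernel command line and then query their status through sysfs. On an x86 Linux system the status files live under /sys/devices/system/cpu/vulnerabilities/; the helper below reads an arbitrary directory, so it can be demonstrated here against a mock tree without assuming a particular machine.

```shell
# Example boot-time configuration via kernel command-line parameters;
# exact names and values vary by kernel version and CPU:
#   mitigations=auto,nosmt   l1tf=full,force   spectre_v2=retpoline

# Print "name: status" for every mitigation status file in a directory.
# On a real x86 Linux system, the directory would be
# /sys/devices/system/cpu/vulnerabilities.
report_vulns() {
    dir="$1"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s: %s\n' "${f##*/}" "$(cat "$f")"
    done
}

# Demonstrate against a mock sysfs tree so the example is self-contained;
# the file names and status strings mirror what real kernels report.
mock=$(mktemp -d)
printf 'Vulnerable\n' > "$mock/l1tf"
printf 'Mitigation: PTI\n' > "$mock/meltdown"
report_vulns "$mock"
```

On a vulnerable-but-mitigated system the real sysfs files report strings of exactly this shape, which is what lets administrators decide whether further action is required.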
This presentation will show some of these mitigations for specific hardware issues and point the audience to websites where they can find more information about other vulnerabilities.

So far, we have seen reactive changes that apply mitigations in the Linux kernel and microcode against specific methods. But there are also ongoing efforts that proactively focus on increasing process isolation on shared multicore systems. Improved process isolation would significantly limit potential implementations of these methods to successfully carry out user-to-user attacks. It requires changes to the OS scheduler to ensure that processes from different trust domains do not share certain hardware resources at the same time. This approach, however, has many implications: How do these changes affect the overall throughput of the system? How can we ensure that there is no resource starvation? Is scheduling still fair to all processes? These questions must be answered before the updates are released for deployment, and answering them requires a large amount of effort and many considerations, since a change in a key component of the OS might have unexpected side effects. Some of these side effects might even be positive: for example, we are studying how this type of scheduling technique can be used to improve the quality of service of the system and ensure that the performance of individual processes is not affected by other processes running on the same system.

So, we have a scheduling technique that improves process isolation and makes user-to-user attacks more difficult. But how do we make sure that it is usable and deployed in production systems? We propose tagging processes with preexisting tools so that different tags identify different trust domains. The presentation will discuss different ideas for implementing this technique in current production systems without much intervention from sysadmins.
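One way such tagging could be realized with preexisting tools, sketched here under the assumption of a cgroup-v2 hierarchy with the cpuset controller (the domain names, CPU ranges, and pids below are purely illustrative), is to place each trust domain in its own cgroup pinned to dedicated CPUs, so that processes from different domains never share the same cores. The helper writes into an arbitrary root directory, so it can be demonstrated without privileges; on a real system the root would be /sys/fs/cgroup and writes would require root.

```shell
# Put every process of one trust domain into its own cgroup restricted
# to a dedicated set of CPUs (cgroup v2 cpuset controller assumed).
assign_domain() {
    root="$1"; domain="$2"; cpus="$3"; pid="$4"
    mkdir -p "$root/$domain"
    printf '%s\n' "$cpus" > "$root/$domain/cpuset.cpus"
    printf '%s\n' "$pid"  > "$root/$domain/cgroup.procs"
}

# Demonstrate against a mock cgroup root so the example is self-contained;
# pids 1234 and 5678 are placeholders for real processes.
root=$(mktemp -d)
assign_domain "$root" trusted   "0-1" 1234
assign_domain "$root" untrusted "2-3" 5678
cat "$root/untrusted/cpuset.cpus"
```

The design question the presentation raises is exactly how such tags get applied automatically, so that sysadmins do not have to partition every workload by hand.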
But we would also like to hear from the audience: this is ongoing work, and any feedback should be considered when making such a large change to how the process scheduler in the Linux kernel works.