Fast HTTP string processing algorithms


HTTP/2 introduces binary optimizations, so the protocol becomes less about string processing. However, strings, some of them quite large such as a URI or a Cookie header, still exist in HTTP. A typical program working with HTTP must perform various string operations, e.g. tokenization, string matching, and pattern searching. Classic computer science describes many string processing algorithms, but HTTP strings are special, and specialized algorithms can make string processing several times faster.

This talk describes:

  • how an HTTP flood can make your HTTP parser the bottleneck
  • x86-64 issues with branch mispredictions, caching and unaligned memory access
  • C compiler optimizations for multi-branch statements and autovectorization
  • switch-driven finite state machines (FSM) versus direct jumps (e.g. Ragel)
  • what makes HTTP strings special and why libc functions are a poor fit for them
  • strspn()- and strcasecmp()-like algorithms for HTTP strings using SSE and AVX
  • efficient custom filtering to prevent injection attacks using AVX
  • the cost of an FPU context switch and how the Linux kernel works with SIMD
  • all the topics are illustrated with microbenchmarks
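
To give a flavor of the strspn()-like item above, here is a minimal scalar sketch in C. It is not the implementation presented in the talk, and it uses no SSE or AVX; it only shows the kind of specialization that HTTP strings allow. The names tchar_table, init_tchar_table() and http_token_span() are hypothetical, and the accepted character set is assumed to be the "tchar" set from the RFC 7230 token grammar.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical 256-entry membership table: 1 if the byte is an
 * RFC 7230 "tchar" (a character allowed in an HTTP token). */
static unsigned char tchar_table[256];

/* Fill the table once at startup; lookups afterwards are a single load. */
static void init_tchar_table(void)
{
    static const char extra[] = "!#$%&'*+-.^_`|~";

    memset(tchar_table, 0, sizeof(tchar_table));
    for (int c = '0'; c <= '9'; c++)
        tchar_table[c] = 1;
    for (int c = 'a'; c <= 'z'; c++)
        tchar_table[c] = 1;
    for (int c = 'A'; c <= 'Z'; c++)
        tchar_table[c] = 1;
    for (const char *p = extra; *p; p++)
        tchar_table[(unsigned char)*p] = 1;
}

/* Return the length of the leading run of token characters in
 * buf[0..len). The buffer does not need to be NUL-terminated. */
static size_t http_token_span(const char *buf, size_t len)
{
    size_t i = 0;

    while (i < len && tchar_table[(unsigned char)buf[i]])
        i++;
    return i;
}
```

Unlike strspn(), this sketch takes an explicit length, so it can run directly on a network buffer that is not NUL-terminated, and the accept set is fixed ahead of time instead of being passed in as a string on every call. After calling init_tchar_table() once, http_token_span("Content-Length: 42", 18) stops at the colon and returns 14.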
Ballroom C
Saturday, March 9, 2019 - 18:00 to 19:00