• DocumentCode
    154264
  • Title

    Finite State Machine Parsing for Internet Protocols: Faster Than You Think

  • Author

    Graham, Robert David ; Johnson, Peter C.

  • fYear
    2014
  • fDate
    17-18 May 2014
  • Firstpage
    185
  • Lastpage
    190
  • Abstract
    A parser´s job is to take unstructured, opaque data and convert it to a structured, semantically meaningful format. As such, parsers often operate at the border between untrusted data sources (e.g., the Internet) and the soft, chewy center of computer systems, where performance and security are paramount. A firewall, for instance, is precisely a trust-creating parser for Internet protocols, permitting valid packets to pass through and dropping or actively rejecting malformed packets. Despite the prevalence of finite state machines (FSMs) in both protocol specifications and protocol implementations, they have gained little traction in parser code for such protocols. Typical reasons for avoiding the FSM computation model claim poor performance, poor scalability, poor expressibility, and difficult or time-consuming programming. In this research report, we present our motivations for and designs of finite state machines to parse a variety of existing Internet protocols, both binary and ASCII. Our hand-written parsers explicitly optimize around L1 cache hit latency, branch misprediction penalty, and program-wide memory overhead to achieve aggressive performance and scalability targets. Our work demonstrates that such parsers are, contrary to popular belief, sufficiently expressive for meaningful protocols, sufficiently performant for high-throughput applications, and sufficiently simple to construct and maintain. We hope that, in light of other research demonstrating the security benefits of such parsers over more complex, Turing-complete codes, our work serves as evidence that certain ``practical´´ reasons for avoiding FSM-based parsers are invalid.
  • Keywords
    Internet; finite state machines; protocols; FSM; Internet protocols; L1 cache hit latency; branch misprediction penalty; finite state machine parsing; program-wide memory overhead; protocol implementations; protocol specifications; Automata; Internet; Pipelines; Program processors; Protocols; Servers; Switches;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Security and Privacy Workshops (SPW), 2014 IEEE
  • Conference_Location
    San Jose, CA
  • Type

    conf

  • DOI
    10.1109/SPW.2014.34
  • Filename
    6957302