Extended Journal Paper: Recursive Multi-Tree Construction with Efficient Rule Sifting for Packet Classification on FPGA (IEEE/ACM Transactions on Networking, 2024)

  • Authors: Yao Xin, Wenjun Li*, Chengjun Jia, Xianfeng Li, Yang Xu, Bin Liu, Zhihong Tian, and Weizhe Zhang
  • This paper (i.e., KickTree_Systolic) was conducted under the guidance of the corresponding author Wenjun Li (e-mail: wenjunli@pku.org.cn).

Abstract

As a programmable accelerator, SmartNIC provides more opportunities for algorithmic packet classification. Our aim in this work is to achieve both line-speed rule search and efficient rule update, two highly desired metrics for SDN data plane. We leverage the parallelism offered by the FPGA in SmartNIC following an algorithm/hardware co-design paradigm. Particularly, we first design an algorithm that constructs multiple trees for the rule set with a recursive rule sifting process. Unlike traditional space-cutting-based multi-tree construction, our rule sifting mechanism breaks the space constraints of rule-to-tree mapping and enables bounded height on each tree, thus providing the potential of bounded worst-case and line-speed performance. We then design a flexible hardware architecture with multiple systolic arrays that can be implemented in parallel on FPGA. Each systolic array works as a coarse-grained pipeline, and the multiple trees constructed earlier will be mapped onto these pipeline stages. This hardware-software mapping enables bounded worst-case rule searching. Additionally, incremental rule update is achieved simply by traversing the pipeline in one pass, with little and bounded impact on rule searching. Experimental results show that our design achieves an average classification throughput of 600.8/147.5 MPPS and an update throughput of 8.2/5.9 MUPS for 10k/100k-scale 5-tuple and OpenFlow rule sets.



 

  A Parallel and Updatable Architecture for FPGA-based Packet Classification with Large-Scale Rule Sets (IEEE Micro, 2023)

  • Authors: Yao Xin, Wenjun Li*, Gaogang Xie, Yang Xu, and Yi Wang
  • This work was conducted under the guidance of the corresponding author Wenjun Li (e-mail: wenjunli@pku.org.cn).
  • This paper (i.e., KickTree_Parallel) is a special issue paper selected and recommended by IEEE Hot Interconnects 2022.

Abstract

As a programmable hardware, FPGA provides more opportunities for algorithmic network packet classification. Despite more than ten years of research, the most actively investigated pipeline architectures still struggle to support fast rule search and efficient rule update for large-scale rule sets. In this paper, we design and implement a novel architecture for multi-tree based packet classification on FPGA, where the search and update processes are decoupled. A strategy of multi-PE, parallel search, and serial update is adopted. The parsing of multiple tree search results adopts a modular and hierarchical design, supporting architecture with various tree numbers. Additionally, incremental rule updates can be achieved simply by traversing all PEs in one pass, with little and bounded impact on rule searching. Compared with TcbTree, the state-of-the-art updatable classifier, the experimental results on FPGA show that the classification performance of our design improves 3.4X on average for various 100k-scale rule sets.



 

  KickTree: A Recursive Algorithmic Scheme for Packet Classification with Bounded Worst-Case Performance (ACM/IEEE ANCS, 2021)

  • Authors: Yao Xin, Yuxi Liu, Wenjun Li*, Ruyi Yao, Yang Xu, and Yi Wang
  • Yao Xin and Yuxi Liu contributed equally to this paper, and they conducted this work under the guidance of the corresponding author Wenjun Li (e-mail: wenjunli@pku.org.cn).

Abstract

As a promising alternative to TCAM-based solutions for packet classification, FPGA has received increasing attention. Although extensive research has been conducted in this area, existing FPGA-based packet classifiers cannot satisfy the burgeoning needs from OpenFlow, which demands large-scale rule sets and frequent rule updates. As a recently proposed hardware-specific approach, TabTree avoids rule replication and supports dynamic rule update. However, it still faces problems of unbalanced partition of rule subsets, unevenly distributed subtrees and excessive TSS leaf nodes when implemented on FPGA. In this paper, we propose a hardware-friendly packet classification approach called KickTree, which is elaborated by considering the hardware properties. To take advantage of intrinsic parallelism of FPGA, KickTree adopts multiple balanced decision trees which can run simultaneously. The bit selection is more flexible which breaks the restriction of rule subset. Moreover, the size of each subset is strictly limited, leading to bounded and evenly-distributed trees. Experimental results show that KickTree outperforms TabTree significantly in terms of the number of memory accesses for each classification operation while providing a rule update performance comparable to TabTree. In summation, KickTree is more practical for implementations on FPGA.

Downloads

News

2023.11.11: The website of KickTree series has been updated. If you are interested in pure algorithm design of KickTree, please refer to ANCS paper; if you are interested in straightforward architecture design of KickTree (i.e., KickTree_Parallel), please refer to Micro paper; if you are interested in algorithm/architecture co-design of KickTree (i.e., KickTree_Systolic), please refer to TON paper.

2023.10.18: KickTree_Systolic has been accepted by IEEE/ACM Transactions on Networking.

2023.1.6: KickTree_Parallel has been accepted by IEEE Micro, which is a special issue paper selected and recommended by IEEE Hot Interconnects 2022.

2023.1.1: To verify the effectiveness of KickTree for OpenFlow rules, we also test with ClassBench-ng rules [ANCS 2017]. Since the original ClassBench-ng did not provide the trace function, we modified ClassBench-ng based on the trace function of ClassBench [INFOCOM 2005]. The modified ClassBench-ng has been uploaded.

2021.11.17: Source code v1.0 of KickTree has been uploaded. Three types of rule sets are generated by ClassBench using default parameters (in README file), which are ACL, FW and IPC. The rule set size varies from 1k, 10k to 100k. For each size, 12 rule sets based on 12 seed parameter files (i.e, 5 ACL, 5 FW, and 2 IPC) are generated in ClassBench.

2021.11.16: The website of KickTree is under construction and will be finished soon.

Contact

E-mail: wenjunli@pku.org.cn, wenjun0816@qq.com.