Tuple Space Assisted Packet Classification with High Performance on Both Search and Update (IEEE JSAC, 2020.07)

 

Abstract

Software switches are being deployed in SDN to enable a wide spectrum of non-traditional applications. The popular Open vSwitch uses a variant of Tuple Space Search (TSS) for packet classifications. Although it has good performance on rule updates, it is less efficient than decision trees on lookups. In this paper, we propose a two-stage framework consisting of heterogeneous algorithms to adaptively exploit different characteristics of the rule sets at different scales. In the first stage, partial decision trees are constructed from several rule subsets grouped with respect to their small fields. This grouping eliminates rule replications at large scales, thereby enabling very efficient pre-cuttings. The second stage handles packet classification at small scales for non-leaf terminal nodes, where rule replications within each subspace may lead to inefficient cuttings. A salient fact is that small space means long address prefixes or less nesting levels of ranges, both indicating a very limited tuple space. To exploit this favorable property, we employ a TSS-based algorithm for these subsets following tree constructions. Experimental results show that our work has comparable update performance to TSS in Open vSwitch, while achieving almost an order-of-magnitude improvement on classification performance over TSS.


Ideas&Framework

According to our in-depth analysis, cutting techniques can separate searching space into smaller sub-spaces quickly for faster classification, but it suffers from serious rule replications. In contrast, TSS can completely avoid rule replications and support fast rule updates, but it has longer classification time due to tuple expansions, especially for large rule sets. Intuitively, Figure(a) fosters the strengths and circumvents the weakness of decision tree and TSS based schemes. However, in order to design scalable algorithms to meet performance metrics in Subsection III_A, the initial framework still faces several difficulties and challenges. Figure(b) shows the refined framework of CutTSS.


Observations

Due to space limitation of the manuscript, more observations can refer to this page. Additionally, these observations are still effective for OpenFlow-like rule tables, which are generated based on 216 real-life rules from enterprise customers. We are very grateful to the help from Prof. Jun Li (Tsinghua Unsiversity) during these evaluations.


Q&A for Readers

Q1: Why the pre-cuttings can avoid rule replications at large scales?

A1: Essentially, each field sub-range generated by equal-sized cuttings can be viewed as a binary prefix on the cutting field, where its super-prefix can be indicated by concatenating cutting bits from each cutting stage. For example, if the source IPv4 address field is cut into 256 (i.e., 28) equal-sized sub-ranges, all generated sub-ranges can be equivalently represented by binary prefixes with a prefix length of 8, indicated from 0000000*/8 to 1111111*/8 respectively. There are many interesting properties among prefixes, which had been wildly studied in IP routing table lookups. Among them, one is a key for this question: longer (i.e., narrow) prefix can never be overlapped with two shorter (i.e., wide) prefix with the same prefix length. In other word, if two prefixes are overlapped, the narrow prefix must be completely covered by the wide prefix. Thus, rule replication can never happen if the cutting sub-space is larger than the length of small field. Note that this conclusion is drawn based on small prefix fields. For small range fields, we can further draw the conclusion that pre-cuttings may lead to at most one replication for each rule at large scales. This conclusion can be easily proved based on the concept of LCP (Longest Common Prefix), which had been well studied in these papers ( Sorting and Searching using Ternary CAMs, IEEE Hot Interconnects, 2002; A 2-Level TCAM Architecture for Ranges, IEEE Transactions on Computers, 2006). Since only prefix and exact value fields are adopted for pre-classifications in the PSTSS algorithm, CutTSS also only conduct partitioning and pre-cuttings on these fields. For small range fields, please refer to the Subsection III.C of TabTree[ANCS 2019], which is our latest work designed for FPGA platform.


Q2: What is the difference of partitioning between CutTSS and EffiCuts?

A2: Firstly, CutTSS partitions rules based on a few distinct fields especially on prefix and exact value fields, while all rule fields are considered in EffiCuts. Secondly, CutTSS has a more flexible definition about small/big field, while it is too strict in EffiCuts. These two differences enable much more flexible partitioning to generate fewer subsets in CutTSS. Finally, subsets merging will not lead to rule replication in CutTSS, while rule replication may become worse after merging in EffiCuts. Thus, the partitioning in CutTSS is designed for efficient pre-cuttings called FiCuts, while it is designed for HyperCuts in EffiCuts. Essentially, the partitioning algorithm in CutTSS is a tradeoff between the partitioning algorithm in HybridCuts and EffiCuts, where at most F+1 and 2F subsets may be generated for F-tuple rule set respectively.


Q3: How to conduct a packet classification in CutTSS?

A3: For each incoming packet, it is searched from the greatest to the least maximum priority on subsets. If a matching is found in the searching subset, it will terminate searching process immediately, since at that point no better match (matched with higher priority) can be found. For each TSS assisted decision tree of partitioned subsets, the incoming packet searches along the decision tree using its field values as traditional cutting based decision trees until a leaf node or a non-leaf terminal node is hit. For the first case, a linear search is taken to get a best matched rule, if any, among the rules in the leaf node. For the second case, a PSTSS search is taken among rules in the non-leaf terminal node for matching.


Q4: How to conduct a rule update in CutTSS?

A4: Since no rules are replicated in CutTSS, the rule update process is quite similar to the packet lookup process described in above Q3. Thus, for each rule update, the first thing we need to do is to find a subset for each updated rule at which the rule is supposed to reside. Recall that all rules are partitioned based on small fields, so it is easy to find the resided subset based on the updated rule field length. If the rule is a big rule, the update operation is then conducted in big subset with the PSTSS algorithm. Otherwise, the update operation will be conducted in the corresponding TSS assisted decision tree, where a terminal node is first found at which the rule is supposed to reside. After that, the rule insertion or deletion can be conduct easily either on a leaf node (i.e., liner search) or a non-leaf terminal node (i.e., PSTSS search).


Q5: What if the observations are not valid? In other words, what if all rules are big rules?

A5: Extremely, all rules are big rules is itself a big useful observation, because all rules are only distinguished by a few super bits in all fields, indicating much fewer tuples in the big subset. On the other hand, the definition of small and big rule is a relative concept in CutTSS, so it can still work well by changing the threshold values of small fields. Actually, small rules are always the dominant portion in a non-trivial rule database, as big rules with all fields being large means that they have short address prefixes and large ranges, which are scarce compared to long prefixes or ranges.


Q?: If you have any questions? Please feel free to contact with me (e-mail: wenjunli@pku.edu.cn).

A?: I will do my best to answer for you.


Downloads

 

News

2022.03.17: The code of CutTSS_V3.1 has been uploaded, the revised code fixed the printing errors of HyperSplit in CutSplit.

2021.08.08: The code of CutTSS_V3 has been uploaded, finished by Yuxi Liu (under the guidance of Wenjun Li), a graduate student from the Southern University of Science and Technology. The modified CutSplit can avoid previous faults and acheive faster classification performance.

2021.08.01: The code of CutTSS has been uploaded (to fixed the problems of segment error and misclassification in CutSplit).

2021.07.21: There is an error in the published paper(i.e., Fig.15a is updated wrong), we have updated this figure in the website attached paper.

2020.06.21: The website of CutTSS has been updated, and the final paper of CutTSS has been attached.

2020.04.02: Affected by the COVID-19, the website will be updated in a few days (after being allowed to go back to the school).

2020.03.02: CutTSS has been accepted by IEEE Journal on Selected Areas in Communications, Special Issue on Network Softwarization & Enablers (JSAC).

2019.04.20: The first version source code of CutTSS as well as three other compared algorithms (i.e., PartitionSort, PSTSS and CutSplit) have been uploaded.

2019.04.18: Some Questions and Answers (Q&A) will be attached in this page. Feel free to contact with me if you have any questions.

2019.04.17: The conference version called CutSplit [INFOCOM 2018] has been attached.

2019.04.17: The experimental results of observations have been updated.

2019.04.15: The manuscript of CutTSS has been submitted to the IEEE JSAC (Issue: Series on Network Softwarization & Enablers).

2019.04.05: The website of CutTSS is under construction.

 

Contact

My Contact Info.: wenjunli@pku.edu.cn, wenjun0816@qq.com.