PSSketch: Finding Persistent and Sparse Flow with High Accuracy and Efficiency (ACM SIGKDD, 2025)
Finding persistent and sparse (PS) flows is critical to the early warning of various threats. Previous work has focused predominantly on either heavy flows or persistent flows, with limited attention given to PS flows. Although some recent studies address PS flows, they struggle to establish an objective criterion because they lack data-driven observations, which reduces their accuracy. In this paper, we define a new criterion, the "anomaly boundary", to distinguish PS flows from regular flows. Specifically, a flow whose persistence exceeds a threshold is protected, and a protected flow whose density is lower than a threshold is reported as a PS flow. We then introduce PSSketch, a high-precision layered sketch for finding PS flows. PSSketch employs variable-length bitwise counters: the first layer tracks the frequency and persistence of all flows, and the second layer protects potential PS flows and records overflow counts from the first layer. Further optimizations reduce memory consumption and improve accuracy. Experiments show that PSSketch reduces memory consumption by 1-2 orders of magnitude compared with a strawman solution that combines existing work. Compared with state-of-the-art solutions for finding PS flows, it achieves an F1 score up to 2.94x higher and reduces ARE by 1-2 orders of magnitude. PSSketch also achieves higher throughput than these solutions.
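The two-step criterion in the abstract (protect a flow once its persistence exceeds a threshold, then report a protected flow whose density is below a threshold) can be illustrated with a minimal exact-counting sketch. This is not PSSketch itself: it uses plain dictionaries rather than layered bitwise counters. The function name `find_ps_flows`, its parameters, and in particular the definition of density as packets per window in which the flow appears are assumptions made for illustration; the paper's definitions may differ.

```python
from collections import defaultdict


def find_ps_flows(windows, persistence_threshold, density_threshold):
    """Exact-counting illustration of the two-step PS-flow criterion.

    windows: list of time windows; each window is a list of flow IDs,
             one entry per packet (hypothetical input format).
    Returns a dict mapping each reported flow ID to (persistence, density).
    """
    frequency = defaultdict(int)    # total packets seen per flow
    persistence = defaultdict(int)  # number of windows in which the flow appears

    for window in windows:
        for flow_id in window:
            frequency[flow_id] += 1
        # Persistence grows by at most one per window, however many packets arrive.
        for flow_id in set(window):
            persistence[flow_id] += 1

    ps_flows = {}
    for flow_id, p in persistence.items():
        # Step 1: only flows whose persistence exceeds the threshold are protected.
        if p <= persistence_threshold:
            continue
        # Step 2: ASSUMED density definition: packets per window of appearance.
        density = frequency[flow_id] / p
        if density < density_threshold:
            ps_flows[flow_id] = (p, density)
    return ps_flows
```

For example, with four windows in which flow "a" sends one packet per window and flow "b" sends several packets in three of them, calling `find_ps_flows(windows, persistence_threshold=2, density_threshold=2.0)` reports only "a": "b" is persistent but too dense, and any flow seen in a single window is never protected. A sketch-based design such as PSSketch approximates these counts in far less memory than the per-flow dictionaries used here.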
2025.7.31: The slides of PSSketch have been uploaded.
2025.6.28: A brief introduction video of PSSketch has been uploaded.
2025.5.30: The source code and the final paper of PSSketch have been uploaded.
2025.5.16: The website of PSSketch is under construction and will be finished soon.
E-mail: wenjunli@pku.org.cn.