Q-Factor is a framework to enable ultra-high-speed data transfer optimization based on real-time network state information provided by programmable data planes.
Communication networks are critical components of today’s scientific workflows. Researchers require long-distance, ultra-high-speed networks to transfer very large data sets from acquisition sites (such as the Vera C. Rubin Observatory in Chile, also known as the Large Synoptic Survey Telescope) to processing sites, and to share measurements with scientists worldwide. However, while network bandwidth is continuously increasing, the majority of data transfers cannot use the added capacity efficiently, because of inherent limitations in the parameter settings of network transport protocols and the lack of network state information at the end hosts. To address these challenges, Q-Factor plans to use temporal network state data to dynamically configure current transport protocol parameters, reaching higher network utilization and, as a result, improving scientific workflows.
Q-Factor leverages programmable network devices with the In-band Network Telemetry (INT) application and delivers a software solution to process in-band measurements at the end hosts. With Q-Factor running on Data Transfer Nodes (DTNs), TCP/IP parameters will be configured according to temporal network characteristics such as round-trip time, network utilization, and network congestion. This tuning is expected to result in increased network utilization, shorter flow completion times, and significantly fewer packet drops caused by network buffer overflow. Additionally, Q-Factor is designed to save host memory by tailoring kernel parameters and buffers to optimal sizes.
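As a rough illustration of how round-trip time and utilization could feed into buffer sizing, the sketch below computes a bandwidth-delay product (BDP) and clamps it to a range. It is not Q-Factor's implementation: the function names, the integer units, the idea of scaling by the unused share of the link, and the floor and ceiling defaults are all assumptions made for this example.

```python
def bdp_bytes(bandwidth_bps: int, rtt_us: int) -> int:
    """Bandwidth-delay product in bytes.

    bandwidth_bps: path bandwidth in bits per second.
    rtt_us: round-trip time in microseconds.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    return bandwidth_bps * rtt_us // 8_000_000


def suggest_buffer_bytes(
    bandwidth_bps: int,
    rtt_us: int,
    utilization_pct: int = 0,   # hypothetical input: share of the link already in use
    floor: int = 4096,          # assumed lower bound for the example
    ceiling: int = 2**31 - 1,   # assumed upper bound for the example
) -> int:
    """Suggest a socket buffer size from measured network state."""
    # Assumption: size the buffer for the share of the link that is still free.
    available_bps = bandwidth_bps * (100 - utilization_pct) // 100
    return max(floor, min(ceiling, bdp_bytes(available_bps, rtt_us)))


if __name__ == "__main__":
    # Example: a 100 Gbit/s path with a 100 ms round-trip time.
    print(suggest_buffer_bytes(100_000_000_000, 100_000))
```

A buffer smaller than the BDP cannot keep the path full, while a much larger one wastes host memory, which is the trade-off that tuning from live measurements is meant to address.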
Q-Factor targets a timely issue in communication networks: underutilization of ultra-high-speed networks for science workflows. To keep scientific progress unconstrained, future science workflows need to support emerging data-intensive science experiments (e.g., the Vera C. Rubin Observatory and the High-Luminosity Large Hadron Collider) where data generation grows significantly, reaching exabytes of traffic each year. Results of this project will also allow a better understanding of optimal buffer sizes in network devices for huge flows and of the interaction of various congestion control algorithms.
Experimental measurement data, network state information, network topology, software code, TCP tuning guidelines, and results will be available on the Q-Factor website https://q-factor.io, which will be maintained and indexed for at least three years after the completion of the project.