Triumphs in big data, from creating the ability to search for anything on the web within milliseconds to decoding the human genome, have been made possible by cloud computing – using remote servers (the "cloud") to store data and process complex computations. To keep cloud servers running as smoothly and predictably as possible, developers running applications on the cloud have applied various techniques to minimize disruption of servers' central processing units (CPUs).
Two of these techniques have been historically incompatible with each other, holding back the performance of cloud applications.
The first, "containerization," creates an isolated computing environment inside which applications can run without disrupting a machine's CPU. The second, known as remote direct memory access (RDMA), allows developers to access memory on a remote server without interrupting the server's CPU.
"Before FreeFlow, no system could use RDMA for containerized applications," says Daehyeok Kim, a Ph.D. student in the Computer Science Department (CSD) at Carnegie Mellon. "FreeFlow makes this possible."
Kim presented FreeFlow earlier this year at the 16th USENIX Symposium on Networked Systems Design and Implementation in Boston.
As an illustrative example, consider TensorFlow, a popular open-source machine learning framework used for tasks such as image and speech recognition by companies like Google. If a developer deploys TensorFlow on the cloud using containers, a TensorFlow instance running inside one container can't remotely access data inside another container without invoking the hosting server's CPU. That is, unless the developer is using FreeFlow.
The use of containers, Kim says, has made it possible for people to deploy applications like TensorFlow without using a lot of time and computing resources. But until now, they could not use RDMA as a communication method in their application, so application performance has been limited.
"This boils down to companies like Intel running deep learning applications at much faster speeds with FreeFlow than they were able to before," says Kim. "That's because RDMA is 15 times faster than traditional networking."
Other researchers on the study included CSD Ph.D. student Tianlong Yu, Hongqiang Harry Liu from Alibaba, Jitu Padhye and Shachar Raindel from Microsoft Research, Chuanxiong Guo and Yibo Zhu from Bytedance, CyLab and Electrical and Computer Engineering professor Vyas Sekar, and CSD Head Srinivasan Seshan.