Serving DNNs like Clockwork: Performance Predictability from the Bottom Up

Published in the 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2020

Recommended citation: Arpan Gujarati, Reza Karimi, Safya Alzayat, Wei Hao, Antoine Kaufmann, Ymir Vigfusson, Jonathan Mace, "Serving DNNs like Clockwork: Performance Predictability from the Bottom Up", 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2020. https://arxiv.org/abs/2006.02464

In this paper, starting with the predictable execution times of individual DNN inferences, we adopt a principled design methodology to successively build a fully distributed model serving system that achieves predictable end-to-end performance.