Composable Benchmark Generation to Reduce Deep Learning Benchmarking Effort on CPUs.
Automatic μBenchmark Generation to Compute “Lower-bound” Latency and Inform Optimizations of Deep Learning Models on GPUs.
An open-source, framework- and hardware-agnostic, extensible and customizable distributed platform design for evaluating and profiling ML models across datasets, frameworks, and systems.
Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function as a Service Environments.