Chris Fregly

Founder and Research Engineer at PipelineAI


Training, Optimizing, and Deploying High Performance, Distributed, and Streaming TensorFlow AI Models in Production with GPUs
Thursday 10:50 - 12:35
machine learning
artificial intelligence
intelligent infrastructure
science ops
ai ops
model training
model serving
predictive analytics

Following the famous Netflix culture of "Freedom and Responsibility," I use this talk to demonstrate how data scientists can use PipelineAI to safely deploy their ML / AI pipelines into production using live data.

Using live demos, I will show how to train, optimize, profile, deploy, and monitor high-performance, distributed TensorFlow AI models in production with Docker, Kubernetes, and GPUs.

I will then optimize the TensorFlow models using training-time optimization techniques such as TensorFlow's Accelerated Linear Algebra (XLA) framework and its JIT compiler, which fuse operators (for example, the small ops that make up dropout and batch normalization) into larger compiled kernels.
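As a minimal sketch of how XLA's JIT can be enabled for an existing script (the `TF_XLA_FLAGS` variable is TensorFlow's real switch for auto-clustering; `train.py` is a hypothetical training script, not part of the talk):

```shell
# Turn on XLA auto-clustering (JIT) for the whole process; XLA then
# fuses eligible ops (e.g. matmul + bias-add + activation) into
# single compiled kernels. "train.py" is a placeholder script name.
TF_XLA_FLAGS=--tf_xla_auto_jit=2 python train.py
```

Auto-clustering lets you measure XLA's effect without changing model code, which makes it a convenient first profiling experiment.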

Next, I will discuss post-training model-optimization techniques, including TensorFlow's Graph Transform Tool for weight quantization, batch-normalization folding, and layer fusion.
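A Graph Transform Tool invocation along these lines applies the transforms mentioned above (the transform names are the tool's real ones; the file paths and input/output node names are placeholders):

```shell
# Build the tool from the TensorFlow source tree first:
#   bazel build tensorflow/tools/graph_transforms:transform_graph
# "frozen_model.pb" and the node names below are hypothetical examples.
bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
  --in_graph=frozen_model.pb \
  --out_graph=optimized_model.pb \
  --inputs='input' \
  --outputs='softmax' \
  --transforms='fold_batch_norms fold_old_batch_norms quantize_weights'
```

`fold_batch_norms` folds batch-normalization multiplies into the preceding convolution weights, and `quantize_weights` shrinks the on-disk model by storing weights as 8-bit values.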

Finally, I will demonstrate and compare several GPU-based TensorFlow model-serving runtimes, including TensorFlow Serving, TensorFlow Lite, and NVIDIA's GPU-optimized TensorRT runtime.
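For example, TensorFlow Serving's official GPU image can serve a SavedModel and answer REST requests along these lines (the model name `mnist`, the host path, and the request payload are placeholders):

```shell
# Serve a SavedModel on the GPU with TensorFlow Serving's official image.
# "/models/mnist" and MODEL_NAME are hypothetical example values.
docker run --runtime=nvidia -p 8501:8501 \
  -v /models/mnist:/models/mnist \
  -e MODEL_NAME=mnist \
  tensorflow/serving:latest-gpu

# Query the model over TensorFlow Serving's REST predict endpoint:
curl -d '{"instances": [[1.0, 2.0, 5.0]]}' \
  http://localhost:8501/v1/models/mnist:predict
```

The same SavedModel can then be benchmarked against a TensorRT-optimized build to compare GPU serving latency and throughput.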



Chris Fregly is Founder and Research Engineer at PipelineAI, a real-time machine learning and artificial intelligence startup based in San Francisco.

He is also an Apache Spark Contributor, a Netflix Open Source Committer, founder of the Global Advanced Spark and TensorFlow Meetup, and author of the O'Reilly training and video series "High Performance TensorFlow in Production."

Previously, Chris was a Distributed Systems Engineer at Netflix, a Data Solutions Engineer at Databricks, and a Founding Member and Principal Engineer at the IBM Spark Technology Center in San Francisco.