Note: this is a past event.

Chris Fregly

Founder and Research Engineer at PipelineAI


Training, Optimizing, and Deploying High Performance, Distributed, and Streaming TensorFlow AI Models in Production with GPUs
Thursday 10:50 - 12:35
machine learning
artificial intelligence
intelligent infrastructure
science ops
ai ops
model training
model serving
predictive analytics

In the spirit of the famous Netflix culture of "Freedom and Responsibility," I use this talk to demonstrate how data scientists can use PipelineAI to safely deploy their ML/AI pipelines into production using live data.

Using live demos, I will show how to train, optimize, profile, deploy, and monitor high-performance, distributed TensorFlow AI models in production with Docker, Kubernetes, and GPUs.

I then optimize the TensorFlow models using training-time techniques such as TensorFlow's Accelerated Linear Algebra (XLA) framework and JIT compiler, which fuse operators in patterns such as dropout and batch normalization into more efficient kernels.
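As a minimal sketch of XLA JIT compilation (not code from the talk; it assumes the current TensorFlow 2 API, where `jit_compile=True` replaces the session-level JIT flag used in TensorFlow 1.x, and `dense_relu` with its shapes is purely illustrative):

```python
import tensorflow as tf

# jit_compile=True asks TensorFlow's XLA JIT to compile the traced
# function, fusing the matmul, bias-add, and ReLU into fewer kernels.
@tf.function(jit_compile=True)
def dense_relu(x, w, b):
    return tf.nn.relu(tf.matmul(x, w) + b)

# Illustrative call: a [2, 3] batch through a 3->4 dense layer.
x = tf.ones([2, 3])
w = tf.ones([3, 4])
b = tf.zeros([4])
y = dense_relu(x, w, b)
```

The numerical result is unchanged by compilation; only the generated kernels differ.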

Next, I discuss some post-training model optimization techniques including TensorFlow's Graph Transform Tool for weight quantization, batch normalization folding, and layer fusing.
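For reference, the Graph Transform Tool is driven from the command line against a frozen graph; a sketch of such an invocation (not from the talk; the file names and the `input`/`softmax` node names are illustrative, and the tool must be built from the TensorFlow source tree):

```
bazel run //tensorflow/tools/graph_transforms:transform_graph -- \
  --in_graph=frozen_model.pb \
  --out_graph=optimized_model.pb \
  --inputs='input' \
  --outputs='softmax' \
  --transforms='fold_batch_norms quantize_weights'
```

Here `fold_batch_norms` folds batch-normalization multiplies into the preceding convolution weights, and `quantize_weights` stores weights as 8-bit values to shrink the model file.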

Finally, I will demonstrate and compare several GPU-based TensorFlow model-serving runtimes, including TensorFlow Serving, TensorFlow Lite, and NVIDIA's GPU-optimized TensorRT runtime.

Watch the talk    Check the slides


Chris Fregly is Founder and Research Engineer at PipelineAI, a real-time machine learning and artificial intelligence startup based in San Francisco.

He is also an Apache Spark contributor, a Netflix Open Source committer, founder of the Global Advanced Spark and TensorFlow Meetup, and author of the O'Reilly training and video series "High Performance TensorFlow in Production."

Previously, Chris was a Distributed Systems Engineer at Netflix, a Data Solutions Engineer at Databricks, and a Founding Member and Principal Engineer at the IBM Spark Technology Center in San Francisco.