Model and Framework selection in Production: A Case of Object Detection with TensorFlow

This is a series of write-ups on a modern data pipeline implementation, with a case study of Tyco Cam security solutions using data tooling such as Pachyderm, Kubernetes, Docker, TensorFlow and storage infrastructure like AWS.

Table of Contents

  1. Intro - Analytics and Deployment: Why should Data Scientist care about Production?
  2. Cloud Locations: Case for Objects Storage
  3. Model and Framework selection in Production: A Case of Object Detection with TensorFlow
  4. Building the full Data Pipeline - I
  5. Containers - boxing code dependencies with Docker
  6. Building the full Data Pipeline - II
  7. Update, Maintain and Scale your Data Science Pipeline
●●●

This is the third article in the series, discussing the elements required to take a locally developed data science model to a production-ready machine.

Choosing models that meet a business objective is the most fundamental aspect of analytics after understanding the business itself. In this regard, distributed deep-learning frameworks have fundamentally transformed the way analytics is done at scale. But they are not a panacea: traditional ML techniques like logistic regression or k-means clustering can be more effective on a manageable dataset and offer easier interpretability to the business. Nevertheless, where the dataset is ever growing and ever varying, it is paramount to consider deep-learning platforms as a serious option.

In our continuing example of home security solutions, a neural network implemented on the distributed deep-learning framework TensorFlow is an ideal choice for the needs at hand. Tyco's systems stream data from a large number of clients' cameras. Their utility lies in the ability to detect moving objects and to identify intruders correctly (animal, person, etc.). With this business understanding, the rest of the post explains why these two choices make sense for a production environment.

Neural nets are an extremely robust class of algorithms that generally provide higher accuracy in prediction. A network consists of a labyrinthine set of nodes grouped into an input layer, one or more hidden layers and an output layer. Its internal workings mimic brain neurons to interpret data: each node can be thought of as computing a weighted sum of its incoming data plus a bias term, passed through an activation function. Once a neural network is trained on a sufficiently large amount of data, it can be relied upon to give acceptable predictions. One possible explanation for this excellent performance is that the hidden units are able to create new types of features in the hidden layers that make classification efficient. Many variants of neural networks are available for different needs, but given our business understanding it is crucial that the chosen one supports embedded architectures and is compatible with mobile technology. The main cost of using ANNs is computational: training typically demands distributed computing (often involving GPUs), and this is where the utility of a good framework prevails.
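To make the node computation concrete, here is a minimal sketch of a single neuron in plain NumPy; the input values, weights and bias below are hypothetical stand-ins for values a trained network would hold:

```python
import numpy as np

def node_output(inputs, weights, bias):
    """Output of one artificial neuron: a weighted sum of its inputs
    plus a bias, passed through a nonlinear activation (sigmoid here)."""
    z = np.dot(weights, inputs) + bias   # weighted sum + bias
    return 1.0 / (1.0 + np.exp(-z))      # sigmoid squashes z into (0, 1)

x = np.array([0.5, -1.2, 3.0])   # incoming data at the node
w = np.array([0.4, 0.1, -0.6])   # weights learned during training
b = 0.2                          # bias term
print(node_output(x, w, b))
```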

TensorFlow is a hugely popular (see the Kaggle survey results below) open-source library released by Google Brain in 2015. This deep-learning framework allows us to implement flavors of various ML models built to scale. There are also pre-built models available in the framework that give a head start in implementation or in playing with a proof of concept. A good analogy for the relationship between TensorFlow and an ANN is the scikit-learn package: the popular Python library lets us implement models with NumPy as the building block, even though any of those models could be implemented in NumPy alone.

[Figure: Survey responses to the question "What is the Next Big Thing in Data Science?" (Kaggle ML and Data Science Survey, 2017)]


TensorFlow models are especially useful in computer vision applications, including object detection. Other applications include completing noisy images and automatically generating maps from imagery. Out-of-the-box TensorFlow models trained on the COCO dataset provide a low barrier to entry for the Tyco team, which adds to the framework's appeal. These pre-built models have shown run speeds (on an Nvidia GeForce GTX TITAN X card) ranging from 30 ms to 540 ms per 600x600 image, depending on the model. In Tyco's case, the API also allows us to prepare our own inputs so the model can be tweaked for special cases encountered in the real world; for instance, it can be trained to distinguish animals from intruders dressed in animal costumes.
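As a sketch of how low that barrier is, the following loads a frozen detection model from the TensorFlow model zoo and runs it on a single frame (TensorFlow 1.x style; the local path is hypothetical, and the tensor names are the ones exposed by the Object Detection API's exported graphs):

```python
import numpy as np
import tensorflow as tf

# Hypothetical local path to a frozen model downloaded from the
# TensorFlow detection model zoo (e.g. an SSD trained on COCO).
PATH_TO_FROZEN_GRAPH = 'ssd_mobilenet_v1_coco/frozen_inference_graph.pb'

detection_graph = tf.Graph()
with detection_graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

with tf.Session(graph=detection_graph) as sess:
    # One 600x600 RGB frame; in production this would be a camera still.
    frame = np.zeros((1, 600, 600, 3), dtype=np.uint8)
    boxes, scores, classes = sess.run(
        ['detection_boxes:0', 'detection_scores:0', 'detection_classes:0'],
        feed_dict={'image_tensor:0': frame})
```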


But before diving into the TensorFlow application, we should ask ourselves a basic question: why TensorFlow rather than other popular libraries such as Torch or Theano?

Part of the answer has to do with open-source technologies being backed by tech giants and the presence of huge community support. The TensorFlow framework is actively managed and very prompt in fixing bugs. There is also a degree of risk with open source regarding the robustness of external support: for example, the TensorFlow community will not be available for the on-demand support requests that Tyco may need immediately. But the positives of open source far outweigh its negatives.

The other part of the argument in favor of TensorFlow has to do with its foundation in dataflow programming, an architecture very well suited to image and graph processing. Theano is just as good a candidate for graphical processing, but it lacks processor portability (e.g. to mobile) and a purpose-built API for object detection, which would give Tyco's data science team a steeper learning curve and demand a larger team effort when building a Theano-based solution. TensorFlow is growing fast and keeps rebuilding itself from the lessons of other deep-learning frameworks. For example, sensing a lack of visualization in deep-learning libraries, the community introduced TensorBoard to better assist in understanding and debugging. Any shortcomings TensorFlow may have compared with other libraries are likely to be overcome soon by its big pool of contributors. If you have to bet on a horse in technology, the wisdom of the crowd is generally the best guide, because that is what attracts more talent to solve problems.

More on TensorFlow and its scaling capability

Dataflow architecture of TensorFlow: nodes in the graph represent mathematical operations, while the edges represent the n-dimensional arrays (tensors) that serve as the communication medium between operations. For example, the MatMul nodes in the logit and ReLU layers of the graph below each represent a matrix multiplication with two incoming tensors and one outgoing tensor. This is a low-level programming model (like C), but designed exclusively for parallel computing.

[Figure: Tensors flowing across layers of nodes in a dataflow architecture (Source: tensorflow.org)]
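A minimal sketch of such a graph in TensorFlow 1.x style, with a MatMul node feeding a ReLU node (the layer sizes here are hypothetical):

```python
import numpy as np
import tensorflow as tf

# Nodes are operations; the edges between them are tensors.
x = tf.placeholder(tf.float32, shape=[None, 784], name='input')
W = tf.Variable(tf.random_normal([784, 10]), name='weights')
b = tf.Variable(tf.zeros([10]), name='bias')

logits = tf.matmul(x, W) + b            # MatMul node: two tensors in, one out
relu = tf.nn.relu(logits, name='relu')  # ReLU node consuming the result

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    out = sess.run(relu, feed_dict={x: np.random.rand(1, 784)})
```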


These dataflow capabilities allow TensorFlow to scale on the following grounds:

  • Portability: The dataflow graph supports multiple client languages, which allows it to truly abstract away the devices across different architectures. What this means is that I can code a layer of TensorFlow logic in Python on Windows and use it directly in an IoT sensor architecture driven by C++ on a Raspberry Pi or on a GPU.

In the home security case, customers will appreciate multi-channel support (e.g., watching the stream with live detection on iOS/Android), which is made possible by TensorFlow's native support for mobile. In fact, model implementations that run at low power and low latency will meet the objectives more gracefully than high-powered, time-consuming desktop applications. A sketch of this portability follows.
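The sketch below builds a tiny graph in Python and serializes it as a GraphDef protobuf, which a non-Python runtime (say, a C++ client on a Raspberry Pi) could then import; the tensor shapes and file name are hypothetical:

```python
import tensorflow as tf

# Define a graph in Python, then write it out so another runtime
# can load the very same dataflow graph.
x = tf.placeholder(tf.float32, shape=[None, 4], name='sensor_input')
w = tf.Variable(tf.ones([4, 1]), name='w')
score = tf.matmul(x, w, name='score')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Writes a binary protobuf ('portable_graph.pb' is a made-up name).
    tf.train.write_graph(sess.graph_def, '/tmp', 'portable_graph.pb',
                         as_text=False)
```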

  • Distribution: The dataflow graph lets TensorFlow insert Send/Receive operations that transport tensors across devices and run in parallel, and each added GPU helps speed up the algorithm as your needs change (see the sketch after the next paragraph).

When load increases (or declines) with rapid expansion (or contraction) of the customer base, Tyco's production team will be able to scale up (or down) quickly without committing costly processing resources. Internally, TensorFlow will maintain the performance level of the neural-network-based algorithm as long as processors are added in proportion.
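As a minimal sketch of that data parallelism (assuming two GPUs are available; the batch and layer sizes are hypothetical), the same subgraph can be replicated as a "tower" per GPU, with TensorFlow inserting the Send/Recv operations for us:

```python
import tensorflow as tf

# Replicate the same subgraph on each GPU and gather results on the CPU.
def tower(images):
    w = tf.get_variable('w', shape=[784, 10])
    return tf.matmul(images, w)

batch = tf.placeholder(tf.float32, shape=[128, 784])
halves = tf.split(batch, num_or_size_splits=2, axis=0)

outputs = []
for i, half in enumerate(halves):
    # reuse=True makes both towers share the same weights.
    with tf.device('/gpu:%d' % i), tf.variable_scope('model', reuse=(i > 0)):
        outputs.append(tower(half))

with tf.device('/cpu:0'):
    merged = tf.concat(outputs, axis=0)  # collect tower outputs on the CPU
```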

  • Flexibility: With TensorFlow, all cores of a GPU (60, say) can be optimized for, in addition to the virtual CPU cores of a multi-core CPU. While developing production-scale analytics, this allows the team the flexibility to heterogeneously utilize either or both hardware resources, since the whole hardware layer is abstracted.
[Figure: assigning operations from GPU 0 to the CPU]


Home security cams can be locally embedded with processing capabilities that allow images to be compiled at the source and moved to the cloud as tensors, with all the training and validation done in parallel. TensorFlow can handle splitting the graph between the cam's chip architecture and cloud GPUs/CPUs, as sketched below.
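A sketch of such heterogeneous placement: lightweight pre-processing pinned to the CPU, heavier math pinned to the GPU, with soft placement as a fallback when a device is absent (the frame shape and operations are hypothetical):

```python
import numpy as np
import tensorflow as tf

# Pin pre-processing to the CPU and heavier math to the GPU.
# allow_soft_placement falls back to the CPU when no GPU exists
# (e.g. on a camera chip), so the same graph runs everywhere.
with tf.device('/cpu:0'):
    frame = tf.placeholder(tf.float32, shape=[1, 600, 600, 3])
    gray = tf.image.rgb_to_grayscale(frame)   # pre-process at the source

with tf.device('/gpu:0'):
    pooled = tf.nn.avg_pool(gray, ksize=[1, 2, 2, 1],
                            strides=[1, 2, 2, 1], padding='VALID')

config = tf.ConfigProto(allow_soft_placement=True,
                        log_device_placement=True)
with tf.Session(config=config) as sess:
    out = sess.run(pooled, feed_dict={frame: np.zeros((1, 600, 600, 3))})
```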

  • Compilation: TensorFlow's XLA compiler has proven very effective at generating fast code for linear algebra calculations. XLA uses JIT compilation techniques to compile TensorFlow subgraphs at runtime, gauging the environment and producing native machine code for devices like CPUs, GPUs and custom accelerators (e.g. Google's TPU). Compiling subgraphs for the specific machine enables optimizations such as constant propagation, and can lead to faster execution than frameworks like Theano. A sketch of turning JIT compilation on follows.
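In TensorFlow 1.x, XLA's JIT compiler can be enabled globally through the session config, as in this minimal sketch (the matrix sizes are hypothetical):

```python
import numpy as np
import tensorflow as tf

# Enable XLA's JIT compiler for the whole session (TF 1.x API).
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = (
    tf.OptimizerOptions.ON_1)

x = tf.placeholder(tf.float32, shape=[None, 256])
w = tf.Variable(tf.random_normal([256, 256]))
y = tf.matmul(x, w)   # a linear-algebra subgraph XLA can fuse and compile

with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    out = sess.run(y, feed_dict={x: np.random.rand(8, 256)})
```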