# TensorFlow Serving + Java + Kafka Streams + gRPC
This project demonstrates model inference with Apache Kafka, Kafka Streams, and a TensorFlow model deployed using [TensorFlow Serving](https://www.tensorflow.org/serving/) (leveraging [Google Cloud ML Engine](https://cloud.google.com/ml-engine/docs/tensorflow/deploying-models) in this example). The concepts are very similar for other ML frameworks and cloud providers; for example, you could use Apache MXNet and the [AWS model server](https://github.com/awslabs/mxnet-model-server).

## Model Serving: Stream Processing vs. Request Response
Machine learning / deep learning models can be used in different ways to make predictions. The preferred way is to deploy an analytic model directly into a Kafka Streams application, e.g. using the [TensorFlow for Java API](https://www.tensorflow.org/install/install_java). See [Model Inference within Kafka Streams Microservices](https://github.com/kaiwaehner/kafka-streams-machine-learning-examples) for examples.
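
For comparison, here is a minimal sketch of the embedded approach using the TensorFlow for Java API. The model path and the tensor names (`input`, `output`) are placeholders that depend on your exported model:

```java
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Tensor;

public class EmbeddedModelSketch {

    public static void main(String[] args) {
        // load the exported SavedModel once at startup (path is a placeholder)
        try (SavedModelBundle model = SavedModelBundle.load("/models/census/1", "serve");
             Tensor<?> input = Tensor.create(new float[][] {{1.0f, 2.0f, 3.0f}})) {

            // feed/fetch names depend on the exported signature (placeholders here)
            try (Tensor<?> output = model.session().runner()
                    .feed("input", input)
                    .fetch("output")
                    .run()
                    .get(0)) {
                float[][] prediction = new float[1][1];
                output.copyTo(prediction);
                System.out.println("Prediction: " + prediction[0][0]);
            }
        }
    }
}
```

With this approach, inference happens locally inside the JVM, with no remote call per record.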
However, this is not always feasible. Sometimes it makes sense, or is even required, to deploy a model in a separate serving infrastructure such as TensorFlow Serving for TensorFlow models. This project shows how to access such an infrastructure via Apache Kafka and Kafka Streams.

*Pros of an external model serving infrastructure like TensorFlow Serving:*
- Simple integration with existing systems and technologies
- Easier to understand if you come from a non-streaming world
- Later migration to real streaming is also possible

*Cons:*
- Framework-specific deployment (e.g. only TensorFlow models)
- Couples the availability, scalability, and latency/throughput of your Kafka Streams application to the SLAs of the RPC interface
- Side effects (e.g. in case of failure) are not covered by Kafka processing guarantees (e.g. exactly-once semantics)
- Higher latency, since communication over the network is required
- No local inference (offline, devices, edge processing, etc.)

## TensorFlow Serving (using Google Cloud ML Engine)
The blog post "[How to deploy TensorFlow models to production using TF Serving](https://medium.freecodecamp.org/how-to-deploy-tensorflow-models-to-production-using-tf-serving-4b4b78d41700)" is a great explanation of how to export and deploy trained TensorFlow models to a TensorFlow Serving infrastructure. You can either deploy your own infrastructure anywhere or leverage a cloud service like Google Cloud ML Engine. A [SavedModel](https://www.tensorflow.org/programmers_guide/saved_model#build_and_load_a_savedmodel) is TensorFlow's recommended format for saving models, and it is the required format for deploying trained TensorFlow models using TensorFlow Serving or Google Cloud ML Engine.

Things to do:
1. Set up Google Cloud ML Engine
2. Deploy prebuilt TensorFlow model
3. Create Kafka cluster
4. Implement Kafka Streams application
5. Deploy Kafka Streams application (e.g. to a Kubernetes cluster)
6. Generate streaming data to test the combination of Kafka Streams and TensorFlow Serving

### Step 1: Create a TensorFlow model and export it to 'SavedModel' format
I simply added an existing pretrained image recognition model built with TensorFlow (Inception V1).

I also created a new model for census predictions following the "[ML Engine getting started guide](https://cloud.google.com/ml-engine/docs/tensorflow/getting-started-training-prediction)". The training data is in the 'data' folder.

### Step 2: Deploy the model to Google Cloud ML Engine
[Getting Started with Google Cloud ML Engine](https://cloud.google.com/ml-engine/docs/tensorflow/deploying-models)

### Step 3: Create a Kafka cluster using Confluent Cloud on GCP
[Confluent Cloud - Apache Kafka as a Service](https://www.confluent.io/confluent-cloud/)
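
A hedged sketch of the Kafka Streams connection settings for such a Confluent Cloud cluster; the broker endpoint, API key, and API secret are placeholders you get from the Confluent Cloud UI:

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class ConfluentCloudConfigSketch {

    public static Properties streamsConfig() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "tensorflow-serving-grpc-example");
        // placeholders: copy the real values from your Confluent Cloud cluster settings
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "<broker-endpoint>:9092");
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                + "username=\"<api-key>\" password=\"<api-secret>\";");
        return props;
    }
}
```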

### TODO: Implement and deploy the Streams app

### Example 4 - Census Prediction with TensorFlow Serving
This example shows how to use TensorFlow Serving to deploy a model. The Kafka Streams app can access it via HTTP or gRPC to do the inference. You could also deploy the TensorFlow model to a public cloud the same way, e.g. with Google Cloud ML Engine.
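
A minimal sketch of the Kafka Streams side, using the topic names from this project; `TensorFlowServingClient` is a hypothetical wrapper around the generated gRPC stub (see the client sketch further below):

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class StreamsServingSketch {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "tf-serving-image-recognition");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // hypothetical wrapper around the generated PredictionService gRPC stub
        TensorFlowServingClient client = new TensorFlowServingClient("localhost", 9000);

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> images = builder.stream("ImageInputTopic");

        // for each image reference, make a blocking RPC to TensorFlow Serving;
        // this is where the stream's latency couples to the RPC's SLA
        images.mapValues(client::predict).to("ImageOutputTopic");

        new KafkaStreams(builder.build(), props).start();
    }
}
```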

TODO: more details are discussed in another GitHub project.

Steps:
- Install and run TensorFlow Serving locally (e.g. in a [Docker container](https://www.tensorflow.org/serving/docker)):

      docker build --pull -t tensorflow-serving-devel -f Dockerfile.devel .
      docker run -it tensorflow-serving-devel

      git clone --recurse-submodules https://github.com/tensorflow/serving
      cd serving/tensorflow
      ./configure
      cd ..
      bazel test tensorflow_serving/...

  Building from source takes a long time; it is easier to use a prebuilt container (see below).
- [Deploy the TensorFlow model to TensorFlow Serving](https://www.tensorflow.org/programmers_guide/saved_model#load_and_serve_a_savedmodel_in_tensorflow_serving)

- Build the project:

      mvn clean package

- Start Kafka and create the topics:

      confluent start kafka

      kafka-topics --zookeeper localhost:2181 --create --topic ImageInputTopic --partitions 3 --replication-factor 1

      kafka-topics --zookeeper localhost:2181 --create --topic ImageOutputTopic --partitions 3 --replication-factor 1

- Start the Kafka Streams application:

      java -cp target/kafka-streams-machine-learning-examples-1.0-SNAPSHOT-jar-with-dependencies.jar com.github.megachucky.kafka.streams.machinelearning.Kafka_Streams_TensorFlow_Serving_gRPC_Image_Recognition_Example

      java -cp target/kafka-streams-machine-learning-examples-1.0-SNAPSHOT-jar-with-dependencies.jar com.github.megachucky.kafka.streams.machinelearning.Main

- TODO: Start Streams app
- TODO: Start Kafka and create topic
- TODO: Send test message

- Send messages, e.g. with kafkacat (a Java producer alternative is sketched after this list):

      echo -e "src/main/resources/TensorFlow_Images/dog.jpg" | kafkacat -b localhost:9092 -P -t ImageInputTopic

- Consume predictions:

      kafka-console-consumer --bootstrap-server localhost:9092 --topic ImageOutputTopic --from-beginning

- Find more details in the unit test...
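
If you prefer plain Java over kafkacat, a minimal producer that sends the same test message:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TestMessageProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // send the path of a test image, mirroring the kafkacat command above
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("ImageInputTopic",
                    "src/main/resources/TensorFlow_Images/dog.jpg"));
        }
    }
}
```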

## Prebuilt TensorFlow Serving container and gRPC client
An easier alternative to building TensorFlow Serving from source is the prebuilt container from [inception-java-client](https://github.com/gameofdimension/inception-java-client):

    # pull and start the prebuilt container, forward port 9000
    docker run -it -p 9000:9000 tgowda/inception_serving_tika

    # inside the container, start the TensorFlow Serving service
    root@8311ea4e8074:/# /serving/server.sh

This hosts the model. The client just uses gRPC and Protobuf; it does not include any TensorFlow APIs.
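
A sketch of such a client, assuming the TensorFlow Serving protos were compiled into the `tensorflow.serving` Java package (as in the clients linked here); the model name (`inception`) and signature name (`predict_images`) match the prebuilt Inception container:

```java
import com.google.protobuf.ByteString;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import org.tensorflow.framework.DataType;
import org.tensorflow.framework.TensorProto;
import org.tensorflow.framework.TensorShapeProto;
import tensorflow.serving.Model;
import tensorflow.serving.Predict;
import tensorflow.serving.PredictionServiceGrpc;

public class GrpcPredictSketch {

    public static String classify(String host, int port, byte[] imageBytes) {
        // plain (non-TLS) gRPC channel to the TensorFlow Serving endpoint
        ManagedChannel channel = ManagedChannelBuilder.forAddress(host, port)
                .usePlaintext()
                .build();
        PredictionServiceGrpc.PredictionServiceBlockingStub stub =
                PredictionServiceGrpc.newBlockingStub(channel);

        // model and signature names must match the deployed model's metadata
        Model.ModelSpec modelSpec = Model.ModelSpec.newBuilder()
                .setName("inception")
                .setSignatureName("predict_images")
                .build();

        // the Inception signature takes the raw image bytes as a DT_STRING tensor
        TensorProto imageTensor = TensorProto.newBuilder()
                .setDtype(DataType.DT_STRING)
                .setTensorShape(TensorShapeProto.newBuilder()
                        .addDim(TensorShapeProto.Dim.newBuilder().setSize(1)))
                .addStringVal(ByteString.copyFrom(imageBytes))
                .build();

        Predict.PredictRequest request = Predict.PredictRequest.newBuilder()
                .setModelSpec(modelSpec)
                .putInputs("images", imageTensor)
                .build();

        Predict.PredictResponse response = stub.predict(request);
        channel.shutdown();
        return response.getOutputsMap().toString();
    }
}
```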

Run the client (see [TensorflowObjectRecogniser](https://github.com/thammegowda/tensorflow-grpc-java/blob/master/src/main/java/edu/usc/irds/tensorflow/grpc/TensorflowObjectRecogniser.java) for a complete gRPC client implementation):

    mvn clean compile exec:java -Dexec.args="localhost:9000 example.jpg"

    java -cp target/kafka-streams-machine-learning-examples-1.0-SNAPSHOT-jar-with-dependencies.jar com.github.megachucky.kafka.streams.machinelearning.Main localhost:9000 src/main/resources/TensorFlow_Images/dog.jpg

    java -cp target/kafka-streams-machine-learning-examples-1.0-SNAPSHOT-jar-with-dependencies.jar com.github.megachucky.kafka.streams.machinelearning.Kafka_Streams_TensorFlow_Serving_gRPC_Image_Recognition_Example localhost:9000 src/main/resources/TensorFlow_Images/dog.jpg