Misc · Dec 29, 2019
Appian Ejvt23

Python and TensorFlow in Parallel

Hey all, unfortunately I've been stuck on this task for a couple of weeks now and would really like to figure it out soon. Since most people here are in tech, I was curious whether anyone would be able to provide some assistance.

Background: I'm on Windows, using Python and TensorFlow to run inference with 3 deep learning models. I have a function, let's call it func_load, that loads the 3 models into memory (takes a few minutes). Then I have another function, let's call it func_infer, that runs inference on the 3 models sequentially. My goal is to run the 3 models' inference in parallel using either multiprocessing or multithreading. I've already tried both.

1. For multithreading, I can't seem to get it running in parallel. All I need is to call session.run() in TensorFlow for each inference. If I'm understanding correctly, session.run() releases the Python Global Interpreter Lock (GIL), so if I call it on the 3 different sessions in 3 different threads, theoretically it should all run in parallel. But it still takes the same amount of time as running sequentially! For reference, here's what I'm doing with multithreading (I use the lines below for my 3 threads):

workers = [lambda input_list: sess.run(y_op, {x_inp: input_list, tflag_op: False}) for sess in sessions]
t = threading.Thread(target=workers[idx], args=(input_list,))
t.start()
...
t.join()

2. For multiprocessing, I'd like to create 3 different processes using multiprocessing.Process() and call func_load() in each of them with one model each. Then I'd like to use those same processes to call func_infer(). I haven't been able to figure out how to do that with processes, though: calling 2 different functions at different points in the script.

Does anyone have suggestions on how to approach this? Any help would be greatly appreciated! (I've actually already posted about this on SO, btw, but haven't gotten much further there either.)
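One thing I'm second-guessing as I write this: lambdas in a list comprehension bind sess late, so all 3 of my workers may actually end up calling the last session. Here's my threading attempt spelled out with sess pinned via a default argument instead; just a sketch, where y_op, x_inp, tflag_op, sessions, and input_list are from my own code:

import threading

# one worker per session; sess=sess avoids the late-binding pitfall
# where every lambda would otherwise close over the last session
workers = [
    lambda input_list, sess=sess: sess.run(y_op, {x_inp: input_list, tflag_op: False})
    for sess in sessions
]

threads = [threading.Thread(target=w, args=(input_list,)) for w in workers]
for t in threads:
    t.start()  # session.run() should release the GIL while each thread runs
for t in threads:
    t.join()   # wait for all 3 inferences to finish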
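And for (2), this is the kind of long-lived worker I have in mind: each process loads one model up front, then sits in a loop serving inference requests off a queue, so the loading and the inference calls can happen at different points in the script. Just a sketch, assuming func_load/func_infer can be refactored to handle a single model, and with model_paths as a placeholder:

import multiprocessing as mp

def worker(model_path, task_q, result_q):
    model = func_load(model_path)   # load once, up front (the slow part)
    while True:
        inp = task_q.get()          # block until the main script sends work
        if inp is None:             # sentinel value: shut down
            break
        result_q.put(func_infer(model, inp))

if __name__ == "__main__":          # guard required on Windows (spawn)
    task_qs = [mp.Queue() for _ in range(3)]
    result_qs = [mp.Queue() for _ in range(3)]
    procs = [
        mp.Process(target=worker, args=(path, tq, rq))
        for path, tq, rq in zip(model_paths, task_qs, result_qs)
    ]
    for p in procs:
        p.start()

    # ...later in the script: fan out one request per model, then collect
    for tq in task_qs:
        tq.put(input_list)
    outputs = [rq.get() for rq in result_qs]

    for tq in task_qs:
        tq.put(None)                # tell each worker to exit
    for p in procs:
        p.join()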

Snapchat ADDX44 Dec 29, 2019

Consider using Apache Beam for inference. It's easy to parallelize, and you can find examples in the TFX project under the TensorFlow GitHub org.
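The usual pattern there is a DoFn that loads the model once per worker in setup() and runs inference per element in process(). Rough sketch, where load_model/run_model stand in for your own loading and inference code:

import apache_beam as beam

class RunInference(beam.DoFn):
    def setup(self):
        # called once per worker, so the model isn't reloaded per element
        self.model = load_model()

    def process(self, element):
        yield run_model(self.model, element)

with beam.Pipeline() as pipeline:
    _ = (
        pipeline
        | beam.Create(inputs)         # inputs: your list of input batches
        | beam.ParDo(RunInference())
    )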

Appian Ejvt23 OP Dec 29, 2019

Thanks for the suggestion! This is my first time hearing of Apache Beam. I took a look at it, but I haven't been able to find any examples of parallel TensorFlow inference. Do you know if it's able to do that?

Appian Ejvt23 OP Dec 29, 2019

Also, I'd like for this all to happen locally. Apache Beam seems to be targeted at the cloud.

NIO nio666 Dec 29, 2019

Check out Ray. Under the hood it uses cloudpickle to serialize/deserialize (unlike the multiprocessing module, which has some deficiencies when it comes to TF models).
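With Ray the usual shape is one actor per model: each actor is its own process, loads the model once in __init__, and then you call its infer method remotely. Rough sketch, where load_model/run_model/model_paths are placeholders for your own code:

import ray

ray.init()

@ray.remote
class ModelWorker:
    def __init__(self, model_path):
        self.model = load_model(model_path)  # loaded once per actor process

    def infer(self, inputs):
        return run_model(self.model, inputs)

workers = [ModelWorker.remote(path) for path in model_paths]
# kick off all 3 inferences without blocking, then collect the results
futures = [w.infer.remote(input_list) for w in workers]
outputs = ray.get(futures)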

Appian Ejvt23 OP Dec 29, 2019

Yeah, I looked into Ray earlier and really wanted to use it, but then I realized it's currently not supported on Windows :(

Appian Ejvt23 OP Dec 29, 2019

Thanks though!

Amazon nfdo67fr7 Dec 29, 2019

What kind of models are you trying to run in parallel on the CPU? I think you simply have a hardware bottleneck here, hence you don't see any speedup.

Appian Ejvt23 OP Dec 29, 2019

They're pretty large deep learning models. I have 2 NVIDIA Titan RTX GPUs though, so I don't think my hardware would be the bottleneck. Please correct me if I'm wrong.

Amazon nfdo67fr7 Dec 29, 2019

How do you load 3 models onto 2 GPUs?
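With TF1-style graphs you'd normally have to pin each one to a device explicitly when you build it, something like this (build_model_* are placeholders for your graph-building code):

import tensorflow as tf

with tf.device('/GPU:0'):
    model_a = build_model_a()
with tf.device('/GPU:1'):
    model_b = build_model_b()
# the third model has to share one of the two GPUs (or fall back to CPU)

So I'm curious where your third model ends up.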