Asynchronous inference with a Keras model and prefetch_to_device
I have built and trained a model with the Keras API. Now I need an efficient framework to run inference on a large number of input samples. The tricky part is that not all of those samples are available from the start; they are selected during inference based on the results of previous samples.
I can build a basic pipeline for that, with one process selecting samples to push into a queue and a second process retrieving and preprocessing them. Those samples are then fed to a model on the GPU (initialized only once at the beginning) via Keras' model.predict_on_batch(batch).
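For concreteness, here is roughly what that basic pipeline looks like, reduced to plain Python so it runs standalone. predict_fn stands in for Keras' model.predict_on_batch; all other names are placeholders of my own, not an established API:

```python
# Sketch of the two-stage pipeline: a worker selects samples and pushes
# them into a queue, the main loop batches them and runs inference.
import queue
import threading

def producer(sample_queue, samples):
    # In the real pipeline this worker selects new samples based on
    # earlier predictions; here it just pushes a fixed list.
    for s in samples:
        sample_queue.put(s)
    sample_queue.put(None)  # sentinel: no more samples

def run_inference(predict_fn, samples, batch_size=2):
    # predict_fn is a stand-in for model.predict_on_batch on a Keras
    # model that was loaded onto the GPU once, before this loop starts.
    sample_queue = queue.Queue()
    threading.Thread(target=producer, args=(sample_queue, samples)).start()
    results, batch = [], []
    while True:
        item = sample_queue.get()
        if item is None:
            if batch:  # flush the final, possibly partial batch
                results.extend(predict_fn(batch))
            break
        batch.append(item)
        if len(batch) == batch_size:
            results.extend(predict_fn(batch))
            batch = []
    return results
```

With a dummy predict function, run_inference(lambda b: [x * 2 for x in b], [1, 2, 3]) returns [2, 4, 6]. The problem is that each predict call here only starts after its batch has been assembled and copied to the GPU.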
However, this would be quite slow. I'd rather have a small queue on the GPU, so that there are no loading delays while the next batch is transferred from RAM to the GPU.
This seems to be possible with the TensorFlow Dataset API and prefetch_to_device [1]. But it also seems that using the Dataset API with a Keras model for inference is not straightforward:
- Inference with tf.data.Dataset has been asked about multiple times: [2], [3]. But the answers aren't really explanatory beyond the given code snippets, and I'm not sure how to apply the suggestions to my Keras model checkpoint (an .hdf5 file).
- How can I asynchronously feed the Dataset without rebuilding or reloading the graph each time the first process selects new samples? [4], [5]
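To make the second bullet concrete, the pattern I imagine is a generator that blocks on a queue; wrapped once in tf.data.Dataset.from_generator, the same graph could keep pulling new samples without being rebuilt. Below is the plain-Python core of that idea (the TensorFlow wrapping is only sketched in comments and not executed, so the names there should be taken as my assumption of the API, not verified usage):

```python
# A generator that blocks on a queue until the selection process enqueues
# something, and stops on a None sentinel. Because the generator itself is
# the source, new samples arrive without touching the graph.
import queue

def sample_stream(sample_queue):
    while True:
        item = sample_queue.get()  # blocks until a sample (or sentinel) arrives
        if item is None:
            return
        yield item

# Roughly how I imagine wrapping this in TensorFlow (not executed here):
# ds = tf.data.Dataset.from_generator(lambda: sample_stream(q), tf.float32)
# ds = ds.batch(32).apply(tf.data.experimental.prefetch_to_device('/gpu:0'))
```

Whether this interacts cleanly with a Keras model loaded from an .hdf5 checkpoint is exactly what I can't figure out from the references.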
I'm not too familiar with plain TensorFlow code without the Keras abstractions, so I might have overlooked something obvious in the references. I'd be very grateful for a detailed explanation or pointers to more sources.
asynchronous keras gpu tensorflow-datasets inference
edited Nov 16 '18 at 9:12
Johnny TGun
asked Nov 16 '18 at 8:18
Johnny TGun
114