Disable cache while running SQL - Google Big Query
Need some advice on cache issue we are facing in Google Big Query.
I'm using Talend tBigQueryInput component to run a Google Big Query SQL. However the SQL is not returning current data. We are suspecting it is returning stale data from cache.
Is there a disable cache option which can be embeded with in the SQL so that we are saying Big Query strictly not to take from cache?
sql caching google-bigquery talend
add a comment |
Need some advice on cache issue we are facing in Google Big Query.
I'm using Talend tBigQueryInput component to run a Google Big Query SQL. However the SQL is not returning current data. We are suspecting it is returning stale data from cache.
Is there a disable cache option which can be embeded with in the SQL so that we are saying Big Query strictly not to take from cache?
sql caching google-bigquery talend
You have a low rate. Important on SO - you canmark accepted answer
by using the tick on the left of the posted answer, below the voting. See meta.stackexchange.com/questions/5234/… for why it is important! Also important tovote on answer
. Vote up answers that are helpful. ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. Following these simple rules you increase your own reputation score and at the same time you keep us motivated to answer your questions :o) please consider!
– Mikhail Berlyant
Dec 6 '18 at 20:59
add a comment |
Need some advice on cache issue we are facing in Google Big Query.
I'm using Talend tBigQueryInput component to run a Google Big Query SQL. However the SQL is not returning current data. We are suspecting it is returning stale data from cache.
Is there a disable cache option which can be embeded with in the SQL so that we are saying Big Query strictly not to take from cache?
sql caching google-bigquery talend
Need some advice on cache issue we are facing in Google Big Query.
I'm using Talend tBigQueryInput component to run a Google Big Query SQL. However the SQL is not returning current data. We are suspecting it is returning stale data from cache.
Is there a disable cache option which can be embeded with in the SQL so that we are saying Big Query strictly not to take from cache?
sql caching google-bigquery talend
sql caching google-bigquery talend
asked Nov 15 '18 at 6:14
Pal1989Pal1989
62
62
You have a low rate. Important on SO - you canmark accepted answer
by using the tick on the left of the posted answer, below the voting. See meta.stackexchange.com/questions/5234/… for why it is important! Also important tovote on answer
. Vote up answers that are helpful. ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. Following these simple rules you increase your own reputation score and at the same time you keep us motivated to answer your questions :o) please consider!
– Mikhail Berlyant
Dec 6 '18 at 20:59
add a comment |
You have a low rate. Important on SO - you canmark accepted answer
by using the tick on the left of the posted answer, below the voting. See meta.stackexchange.com/questions/5234/… for why it is important! Also important tovote on answer
. Vote up answers that are helpful. ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. Following these simple rules you increase your own reputation score and at the same time you keep us motivated to answer your questions :o) please consider!
– Mikhail Berlyant
Dec 6 '18 at 20:59
You have a low rate. Important on SO - you can
mark accepted answer
by using the tick on the left of the posted answer, below the voting. See meta.stackexchange.com/questions/5234/… for why it is important! Also important to vote on answer
. Vote up answers that are helpful. ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. Following these simple rules you increase your own reputation score and at the same time you keep us motivated to answer your questions :o) please consider!– Mikhail Berlyant
Dec 6 '18 at 20:59
You have a low rate. Important on SO - you can
mark accepted answer
by using the tick on the left of the posted answer, below the voting. See meta.stackexchange.com/questions/5234/… for why it is important! Also important to vote on answer
. Vote up answers that are helpful. ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. Following these simple rules you increase your own reputation score and at the same time you keep us motivated to answer your questions :o) please consider!– Mikhail Berlyant
Dec 6 '18 at 20:59
add a comment |
2 Answers
2
active
oldest
votes
Two quick thoughts come to mind:
You could examine the job history in the UI to see the jobs that Talend is running. From the statistics, they will report whether or not the results were served from cache. You could also examine the audit logs for this information.
You could execute an altered test query leveraging a non-deterministic aspect, which will ensure the results cannot be served from cache. Functions like
CURRENT_TIMESTAMP()
,CURRENT_DATE()
etc should be sufficient for this purpose.
From scanning the public Talend docs, there's no configuration setting I can see that maps into setting configuration.query.useQueryCache
to false for the query job, which is the behavior you're requesting.
Hi Shollyman, thanks for your response. For point 1, could you please show where in UI will I be able to see the statistics about results served from cache? In Google Big Query--> Job History . There is no job history for the job run via Talend. Also for the jobs available there, there is no where it says if data is from cache. Also from where can I get the audit logs information. Thanks a lot for your help on this.
– Pal1989
Nov 15 '18 at 22:55
Sorry, I was unclear. By UI I meant the BigQuery web UI, not a Talend interface, which expose job history information. If you're using the interface in cloud console, the Query History link in the left nav will expose queries run in a project. cloud.google.com/bigquery/audit-logs has more information about BigQuery's audit log functionality.
– shollyman
Nov 16 '18 at 1:39
add a comment |
Is there a disable cache option which can be embeded with in the SQL
No! There is no such option to be used within the query
Meantime, to enforce not using cache you can add something like below to your query
WHERE RAND() < 2
This is just silly example - but you should get an idea :o)
Hi Mikhail, thanks for your response. Can you please explain how using something like RAND() < 2 will force the query not to use cache results?
– Pal1989
Nov 15 '18 at 23:01
query result cache is valid for 24 hours and is used if underlying data is not changed, query itself is not changed and result is deterministic. by using non-deterministic function you effectively will force bq engine to not to use cached result - see more about Exceptions to query caching
– Mikhail Berlyant
Nov 15 '18 at 23:10
... and obviously because RAND() is always less than 1 it will not affect result of your query :o)
– Mikhail Berlyant
Nov 15 '18 at 23:14
did this explained? make sense?
– Mikhail Berlyant
Nov 15 '18 at 23:37
Hi Mikhail, yes this makes sense now. Let me add this condition to SQL and monitor the query for a while. Thanks for your help.
– Pal1989
Nov 15 '18 at 23:42
|
show 1 more comment
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53313468%2fdisable-cache-while-running-sql-google-big-query%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
Two quick thoughts come to mind:
You could examine the job history in the UI to see the jobs that Talend is running. From the statistics, they will report whether or not the results were served from cache. You could also examine the audit logs for this information.
You could execute an altered test query leveraging a non-deterministic aspect, which will ensure the results cannot be served from cache. Functions like
CURRENT_TIMESTAMP()
,CURRENT_DATE()
etc should be sufficient for this purpose.
From scanning the public Talend docs, there's no configuration setting I can see that maps into setting configuration.query.useQueryCache
to false for the query job, which is the behavior you're requesting.
Hi Shollyman, thanks for your response. For point 1, could you please show where in UI will I be able to see the statistics about results served from cache? In Google Big Query--> Job History . There is no job history for the job run via Talend. Also for the jobs available there, there is no where it says if data is from cache. Also from where can I get the audit logs information. Thanks a lot for your help on this.
– Pal1989
Nov 15 '18 at 22:55
Sorry, I was unclear. By UI I meant the BigQuery web UI, not a Talend interface, which expose job history information. If you're using the interface in cloud console, the Query History link in the left nav will expose queries run in a project. cloud.google.com/bigquery/audit-logs has more information about BigQuery's audit log functionality.
– shollyman
Nov 16 '18 at 1:39
add a comment |
Two quick thoughts come to mind:
You could examine the job history in the UI to see the jobs that Talend is running. From the statistics, they will report whether or not the results were served from cache. You could also examine the audit logs for this information.
You could execute an altered test query leveraging a non-deterministic aspect, which will ensure the results cannot be served from cache. Functions like
CURRENT_TIMESTAMP()
,CURRENT_DATE()
etc should be sufficient for this purpose.
From scanning the public Talend docs, there's no configuration setting I can see that maps into setting configuration.query.useQueryCache
to false for the query job, which is the behavior you're requesting.
Hi Shollyman, thanks for your response. For point 1, could you please show where in UI will I be able to see the statistics about results served from cache? In Google Big Query--> Job History . There is no job history for the job run via Talend. Also for the jobs available there, there is no where it says if data is from cache. Also from where can I get the audit logs information. Thanks a lot for your help on this.
– Pal1989
Nov 15 '18 at 22:55
Sorry, I was unclear. By UI I meant the BigQuery web UI, not a Talend interface, which expose job history information. If you're using the interface in cloud console, the Query History link in the left nav will expose queries run in a project. cloud.google.com/bigquery/audit-logs has more information about BigQuery's audit log functionality.
– shollyman
Nov 16 '18 at 1:39
add a comment |
Two quick thoughts come to mind:
You could examine the job history in the UI to see the jobs that Talend is running. From the statistics, they will report whether or not the results were served from cache. You could also examine the audit logs for this information.
You could execute an altered test query leveraging a non-deterministic aspect, which will ensure the results cannot be served from cache. Functions like
CURRENT_TIMESTAMP()
,CURRENT_DATE()
etc should be sufficient for this purpose.
From scanning the public Talend docs, there's no configuration setting I can see that maps into setting configuration.query.useQueryCache
to false for the query job, which is the behavior you're requesting.
Two quick thoughts come to mind:
You could examine the job history in the UI to see the jobs that Talend is running. From the statistics, they will report whether or not the results were served from cache. You could also examine the audit logs for this information.
You could execute an altered test query leveraging a non-deterministic aspect, which will ensure the results cannot be served from cache. Functions like
CURRENT_TIMESTAMP()
,CURRENT_DATE()
etc should be sufficient for this purpose.
From scanning the public Talend docs, there's no configuration setting I can see that maps into setting configuration.query.useQueryCache
to false for the query job, which is the behavior you're requesting.
answered Nov 15 '18 at 7:55
shollymanshollyman
1,63699
1,63699
Hi Shollyman, thanks for your response. For point 1, could you please show where in UI will I be able to see the statistics about results served from cache? In Google Big Query--> Job History . There is no job history for the job run via Talend. Also for the jobs available there, there is no where it says if data is from cache. Also from where can I get the audit logs information. Thanks a lot for your help on this.
– Pal1989
Nov 15 '18 at 22:55
Sorry, I was unclear. By UI I meant the BigQuery web UI, not a Talend interface, which expose job history information. If you're using the interface in cloud console, the Query History link in the left nav will expose queries run in a project. cloud.google.com/bigquery/audit-logs has more information about BigQuery's audit log functionality.
– shollyman
Nov 16 '18 at 1:39
add a comment |
Hi Shollyman, thanks for your response. For point 1, could you please show where in UI will I be able to see the statistics about results served from cache? In Google Big Query--> Job History . There is no job history for the job run via Talend. Also for the jobs available there, there is no where it says if data is from cache. Also from where can I get the audit logs information. Thanks a lot for your help on this.
– Pal1989
Nov 15 '18 at 22:55
Sorry, I was unclear. By UI I meant the BigQuery web UI, not a Talend interface, which expose job history information. If you're using the interface in cloud console, the Query History link in the left nav will expose queries run in a project. cloud.google.com/bigquery/audit-logs has more information about BigQuery's audit log functionality.
– shollyman
Nov 16 '18 at 1:39
Hi Shollyman, thanks for your response. For point 1, could you please show where in UI will I be able to see the statistics about results served from cache? In Google Big Query--> Job History . There is no job history for the job run via Talend. Also for the jobs available there, there is no where it says if data is from cache. Also from where can I get the audit logs information. Thanks a lot for your help on this.
– Pal1989
Nov 15 '18 at 22:55
Hi Shollyman, thanks for your response. For point 1, could you please show where in UI will I be able to see the statistics about results served from cache? In Google Big Query--> Job History . There is no job history for the job run via Talend. Also for the jobs available there, there is no where it says if data is from cache. Also from where can I get the audit logs information. Thanks a lot for your help on this.
– Pal1989
Nov 15 '18 at 22:55
Sorry, I was unclear. By UI I meant the BigQuery web UI, not a Talend interface, which expose job history information. If you're using the interface in cloud console, the Query History link in the left nav will expose queries run in a project. cloud.google.com/bigquery/audit-logs has more information about BigQuery's audit log functionality.
– shollyman
Nov 16 '18 at 1:39
Sorry, I was unclear. By UI I meant the BigQuery web UI, not a Talend interface, which expose job history information. If you're using the interface in cloud console, the Query History link in the left nav will expose queries run in a project. cloud.google.com/bigquery/audit-logs has more information about BigQuery's audit log functionality.
– shollyman
Nov 16 '18 at 1:39
add a comment |
Is there a disable cache option which can be embeded with in the SQL
No! There is no such option to be used within the query
Meantime, to enforce not using cache you can add something like below to your query
WHERE RAND() < 2
This is just silly example - but you should get an idea :o)
Hi Mikhail, thanks for your response. Can you please explain how using something like RAND() < 2 will force the query not to use cache results?
– Pal1989
Nov 15 '18 at 23:01
query result cache is valid for 24 hours and is used if underlying data is not changed, query itself is not changed and result is deterministic. by using non-deterministic function you effectively will force bq engine to not to use cached result - see more about Exceptions to query caching
– Mikhail Berlyant
Nov 15 '18 at 23:10
... and obviously because RAND() is always less than 1 it will not affect result of your query :o)
– Mikhail Berlyant
Nov 15 '18 at 23:14
did this explained? make sense?
– Mikhail Berlyant
Nov 15 '18 at 23:37
Hi Mikhail, yes this makes sense now. Let me add this condition to SQL and monitor the query for a while. Thanks for your help.
– Pal1989
Nov 15 '18 at 23:42
|
show 1 more comment
Is there a disable cache option which can be embeded with in the SQL
No! There is no such option to be used within the query
Meantime, to enforce not using cache you can add something like below to your query
WHERE RAND() < 2
This is just silly example - but you should get an idea :o)
Hi Mikhail, thanks for your response. Can you please explain how using something like RAND() < 2 will force the query not to use cache results?
– Pal1989
Nov 15 '18 at 23:01
query result cache is valid for 24 hours and is used if underlying data is not changed, query itself is not changed and result is deterministic. by using non-deterministic function you effectively will force bq engine to not to use cached result - see more about Exceptions to query caching
– Mikhail Berlyant
Nov 15 '18 at 23:10
... and obviously because RAND() is always less than 1 it will not affect result of your query :o)
– Mikhail Berlyant
Nov 15 '18 at 23:14
did this explained? make sense?
– Mikhail Berlyant
Nov 15 '18 at 23:37
Hi Mikhail, yes this makes sense now. Let me add this condition to SQL and monitor the query for a while. Thanks for your help.
– Pal1989
Nov 15 '18 at 23:42
|
show 1 more comment
Is there a disable cache option which can be embeded with in the SQL
No! There is no such option to be used within the query
Meantime, to enforce not using cache you can add something like below to your query
WHERE RAND() < 2
This is just silly example - but you should get an idea :o)
Is there a disable cache option which can be embeded with in the SQL
No! There is no such option to be used within the query
Meantime, to enforce not using cache you can add something like below to your query
WHERE RAND() < 2
This is just silly example - but you should get an idea :o)
answered Nov 15 '18 at 16:14
Mikhail BerlyantMikhail Berlyant
59.7k43671
59.7k43671
Hi Mikhail, thanks for your response. Can you please explain how using something like RAND() < 2 will force the query not to use cache results?
– Pal1989
Nov 15 '18 at 23:01
query result cache is valid for 24 hours and is used if underlying data is not changed, query itself is not changed and result is deterministic. by using non-deterministic function you effectively will force bq engine to not to use cached result - see more about Exceptions to query caching
– Mikhail Berlyant
Nov 15 '18 at 23:10
... and obviously because RAND() is always less than 1 it will not affect result of your query :o)
– Mikhail Berlyant
Nov 15 '18 at 23:14
did this explained? make sense?
– Mikhail Berlyant
Nov 15 '18 at 23:37
Hi Mikhail, yes this makes sense now. Let me add this condition to SQL and monitor the query for a while. Thanks for your help.
– Pal1989
Nov 15 '18 at 23:42
|
show 1 more comment
Hi Mikhail, thanks for your response. Can you please explain how using something like RAND() < 2 will force the query not to use cache results?
– Pal1989
Nov 15 '18 at 23:01
query result cache is valid for 24 hours and is used if underlying data is not changed, query itself is not changed and result is deterministic. by using non-deterministic function you effectively will force bq engine to not to use cached result - see more about Exceptions to query caching
– Mikhail Berlyant
Nov 15 '18 at 23:10
... and obviously because RAND() is always less than 1 it will not affect result of your query :o)
– Mikhail Berlyant
Nov 15 '18 at 23:14
did this explained? make sense?
– Mikhail Berlyant
Nov 15 '18 at 23:37
Hi Mikhail, yes this makes sense now. Let me add this condition to SQL and monitor the query for a while. Thanks for your help.
– Pal1989
Nov 15 '18 at 23:42
Hi Mikhail, thanks for your response. Can you please explain how using something like RAND() < 2 will force the query not to use cache results?
– Pal1989
Nov 15 '18 at 23:01
Hi Mikhail, thanks for your response. Can you please explain how using something like RAND() < 2 will force the query not to use cache results?
– Pal1989
Nov 15 '18 at 23:01
query result cache is valid for 24 hours and is used if underlying data is not changed, query itself is not changed and result is deterministic. by using non-deterministic function you effectively will force bq engine to not to use cached result - see more about Exceptions to query caching
– Mikhail Berlyant
Nov 15 '18 at 23:10
query result cache is valid for 24 hours and is used if underlying data is not changed, query itself is not changed and result is deterministic. by using non-deterministic function you effectively will force bq engine to not to use cached result - see more about Exceptions to query caching
– Mikhail Berlyant
Nov 15 '18 at 23:10
... and obviously because RAND() is always less than 1 it will not affect result of your query :o)
– Mikhail Berlyant
Nov 15 '18 at 23:14
... and obviously because RAND() is always less than 1 it will not affect result of your query :o)
– Mikhail Berlyant
Nov 15 '18 at 23:14
did this explained? make sense?
– Mikhail Berlyant
Nov 15 '18 at 23:37
did this explained? make sense?
– Mikhail Berlyant
Nov 15 '18 at 23:37
Hi Mikhail, yes this makes sense now. Let me add this condition to SQL and monitor the query for a while. Thanks for your help.
– Pal1989
Nov 15 '18 at 23:42
Hi Mikhail, yes this makes sense now. Let me add this condition to SQL and monitor the query for a while. Thanks for your help.
– Pal1989
Nov 15 '18 at 23:42
|
show 1 more comment
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53313468%2fdisable-cache-while-running-sql-google-big-query%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
You have a low rate. Important on SO - you can
mark accepted answer
by using the tick on the left of the posted answer, below the voting. See meta.stackexchange.com/questions/5234/… for why it is important! Also important tovote on answer
. Vote up answers that are helpful. ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. Following these simple rules you increase your own reputation score and at the same time you keep us motivated to answer your questions :o) please consider!– Mikhail Berlyant
Dec 6 '18 at 20:59