Is it faster to call procedure twice and get COUNT(*) on the second call or faster to do COUNT(*) OVER() when fetching a list of records? (PostgreSQL)
My coworker and I were discussing today which way is the faster and more efficient way to perform a count over a list of records.
Basically, the scenario is, when we want to fetch a list of records from a procedure, is it quicker/more efficient to calculate the total number of records from the procedure as we fetch the list of records in one query or is it better to fetch the list of records first and then call COUNT again on the same procedure (in 2 calls).
Which one of these examples is more efficient? Why?
Example 1
SELECT *, COUNT(1) OVER() AS total_rows FROM some_procedure();
This returns a total_rows
column on every row along with the rest of the columns.
-or-
Example 2
SELECT * FROM some_procedure();
SELECT COUNT(*) FROM some_procedure();
This way requires 2 calls, but doesn't calculate a total_rows
column.
Additionally, does the COUNT in Example 1 get calculated for every record or only once?
sql database postgresql
add a comment |
My coworker and I were discussing today which way is the faster and more efficient way to perform a count over a list of records.
Basically, the scenario is, when we want to fetch a list of records from a procedure, is it quicker/more efficient to calculate the total number of records from the procedure as we fetch the list of records in one query or is it better to fetch the list of records first and then call COUNT again on the same procedure (in 2 calls).
Which one of these examples is more efficient? Why?
Example 1
SELECT *, COUNT(1) OVER() AS total_rows FROM some_procedure();
This returns a total_rows
column on every row along with the rest of the columns.
-or-
Example 2
SELECT * FROM some_procedure();
SELECT COUNT(*) FROM some_procedure();
This way requires 2 calls, but doesn't calculate a total_rows
column.
Additionally, does the COUNT in Example 1 get calculated for every record or only once?
sql database postgresql
2
what do you see when you time them both?
– Mitch Wheat
Nov 14 '18 at 23:57
1
Eric Lippert sayRace Your Horse
– dwir182
Nov 15 '18 at 0:04
1
I'd be surprised if the second was quicker. Assuming the function reads from a table, it will result in two table reads. And I would imagine the COUNT in the first would be optimised to only be executed/calculated once, after all it's guaranteed to be the same value for every row. Just speculation though.
– 404
Nov 15 '18 at 0:04
On a table with 471 records, Example 1 takes about8ms
whereas Example 2 takes about11ms
. I'll try to see if I can find a larger data set though, not sure if that's enough to accurately depict performance.
– Graham S.
Nov 15 '18 at 0:08
add a comment |
My coworker and I were discussing today which way is the faster and more efficient way to perform a count over a list of records.
Basically, the scenario is, when we want to fetch a list of records from a procedure, is it quicker/more efficient to calculate the total number of records from the procedure as we fetch the list of records in one query or is it better to fetch the list of records first and then call COUNT again on the same procedure (in 2 calls).
Which one of these examples is more efficient? Why?
Example 1
SELECT *, COUNT(1) OVER() AS total_rows FROM some_procedure();
This returns a total_rows
column on every row along with the rest of the columns.
-or-
Example 2
SELECT * FROM some_procedure();
SELECT COUNT(*) FROM some_procedure();
This way requires 2 calls, but doesn't calculate a total_rows
column.
Additionally, does the COUNT in Example 1 get calculated for every record or only once?
sql database postgresql
My coworker and I were discussing today which way is the faster and more efficient way to perform a count over a list of records.
Basically, the scenario is, when we want to fetch a list of records from a procedure, is it quicker/more efficient to calculate the total number of records from the procedure as we fetch the list of records in one query or is it better to fetch the list of records first and then call COUNT again on the same procedure (in 2 calls).
Which one of these examples is more efficient? Why?
Example 1
SELECT *, COUNT(1) OVER() AS total_rows FROM some_procedure();
This returns a total_rows
column on every row along with the rest of the columns.
-or-
Example 2
SELECT * FROM some_procedure();
SELECT COUNT(*) FROM some_procedure();
This way requires 2 calls, but doesn't calculate a total_rows
column.
Additionally, does the COUNT in Example 1 get calculated for every record or only once?
sql database postgresql
sql database postgresql
asked Nov 14 '18 at 23:55
Graham S.Graham S.
86811121
86811121
2
what do you see when you time them both?
– Mitch Wheat
Nov 14 '18 at 23:57
1
Eric Lippert sayRace Your Horse
– dwir182
Nov 15 '18 at 0:04
1
I'd be surprised if the second was quicker. Assuming the function reads from a table, it will result in two table reads. And I would imagine the COUNT in the first would be optimised to only be executed/calculated once, after all it's guaranteed to be the same value for every row. Just speculation though.
– 404
Nov 15 '18 at 0:04
On a table with 471 records, Example 1 takes about8ms
whereas Example 2 takes about11ms
. I'll try to see if I can find a larger data set though, not sure if that's enough to accurately depict performance.
– Graham S.
Nov 15 '18 at 0:08
add a comment |
2
what do you see when you time them both?
– Mitch Wheat
Nov 14 '18 at 23:57
1
Eric Lippert sayRace Your Horse
– dwir182
Nov 15 '18 at 0:04
1
I'd be surprised if the second was quicker. Assuming the function reads from a table, it will result in two table reads. And I would imagine the COUNT in the first would be optimised to only be executed/calculated once, after all it's guaranteed to be the same value for every row. Just speculation though.
– 404
Nov 15 '18 at 0:04
On a table with 471 records, Example 1 takes about8ms
whereas Example 2 takes about11ms
. I'll try to see if I can find a larger data set though, not sure if that's enough to accurately depict performance.
– Graham S.
Nov 15 '18 at 0:08
2
2
what do you see when you time them both?
– Mitch Wheat
Nov 14 '18 at 23:57
what do you see when you time them both?
– Mitch Wheat
Nov 14 '18 at 23:57
1
1
Eric Lippert say
Race Your Horse
– dwir182
Nov 15 '18 at 0:04
Eric Lippert say
Race Your Horse
– dwir182
Nov 15 '18 at 0:04
1
1
I'd be surprised if the second was quicker. Assuming the function reads from a table, it will result in two table reads. And I would imagine the COUNT in the first would be optimised to only be executed/calculated once, after all it's guaranteed to be the same value for every row. Just speculation though.
– 404
Nov 15 '18 at 0:04
I'd be surprised if the second was quicker. Assuming the function reads from a table, it will result in two table reads. And I would imagine the COUNT in the first would be optimised to only be executed/calculated once, after all it's guaranteed to be the same value for every row. Just speculation though.
– 404
Nov 15 '18 at 0:04
On a table with 471 records, Example 1 takes about
8ms
whereas Example 2 takes about 11ms
. I'll try to see if I can find a larger data set though, not sure if that's enough to accurately depict performance.– Graham S.
Nov 15 '18 at 0:08
On a table with 471 records, Example 1 takes about
8ms
whereas Example 2 takes about 11ms
. I'll try to see if I can find a larger data set though, not sure if that's enough to accurately depict performance.– Graham S.
Nov 15 '18 at 0:08
add a comment |
3 Answers
3
active
oldest
votes
If the function returns only a few rows (compared to the number of rows in the table it reads from), I would count the returned rows:
with result as (
select *
from some_function()
)
select *, (select count(*) from result) as total_rows
from result;
Again, this only makes sense if the function does some heavy lifting that you only want to do once and the number of rows is reasonably small (how much that is depends on your server, anything in the "many thousands" is probably OK, anything like "millions of rows" is probably not going to be efficient.
add a comment |
There are a lot of "it depends", but as a general rule, I make the database do the work by doing a single call rather than multiple.
Single calls allow the DB to optimize, if possible.
With multiple calls, good deal of time is spent marshaling and unmarshaling parameters, network traffic, having the DB interpret the call, causing the interrupt back into the calling process to make the second call.
But, testing is the only way to know for sure.
add a comment |
The only way to know is to test.
Although there is overhead for multiple calls to the database, reading all the rows of a table can be quite expensive. Of course, if the table has 10 rows, then it is not expensive. A billion rows is a different matter.
Calculating count(*)
in Postgres should require scanning all the rows -- and testing for locking (to handle against concurrent transactions). This is, unfortunately, rather expensive.
If you are reading all the rows, why not just count the ones that are returned?
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53310535%2fis-it-faster-to-call-procedure-twice-and-get-count-on-the-second-call-or-fast%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
If the function returns only a few rows (compared to the number of rows in the table it reads from), I would count the returned rows:
with result as (
select *
from some_function()
)
select *, (select count(*) from result) as total_rows
from result;
Again, this only makes sense if the function does some heavy lifting that you only want to do once and the number of rows is reasonably small (how much that is depends on your server, anything in the "many thousands" is probably OK, anything like "millions of rows" is probably not going to be efficient.
add a comment |
If the function returns only a few rows (compared to the number of rows in the table it reads from), I would count the returned rows:
with result as (
select *
from some_function()
)
select *, (select count(*) from result) as total_rows
from result;
Again, this only makes sense if the function does some heavy lifting that you only want to do once and the number of rows is reasonably small (how much that is depends on your server, anything in the "many thousands" is probably OK, anything like "millions of rows" is probably not going to be efficient.
add a comment |
If the function returns only a few rows (compared to the number of rows in the table it reads from), I would count the returned rows:
with result as (
select *
from some_function()
)
select *, (select count(*) from result) as total_rows
from result;
Again, this only makes sense if the function does some heavy lifting that you only want to do once and the number of rows is reasonably small (how much that is depends on your server, anything in the "many thousands" is probably OK, anything like "millions of rows" is probably not going to be efficient.
If the function returns only a few rows (compared to the number of rows in the table it reads from), I would count the returned rows:
with result as (
select *
from some_function()
)
select *, (select count(*) from result) as total_rows
from result;
Again, this only makes sense if the function does some heavy lifting that you only want to do once and the number of rows is reasonably small (how much that is depends on your server, anything in the "many thousands" is probably OK, anything like "millions of rows" is probably not going to be efficient.
answered Nov 15 '18 at 6:41
a_horse_with_no_namea_horse_with_no_name
298k46454548
298k46454548
add a comment |
add a comment |
There are a lot of "it depends", but as a general rule, I make the database do the work by doing a single call rather than multiple.
Single calls allow the DB to optimize, if possible.
With multiple calls, good deal of time is spent marshaling and unmarshaling parameters, network traffic, having the DB interpret the call, causing the interrupt back into the calling process to make the second call.
But, testing is the only way to know for sure.
add a comment |
There are a lot of "it depends", but as a general rule, I make the database do the work by doing a single call rather than multiple.
Single calls allow the DB to optimize, if possible.
With multiple calls, good deal of time is spent marshaling and unmarshaling parameters, network traffic, having the DB interpret the call, causing the interrupt back into the calling process to make the second call.
But, testing is the only way to know for sure.
add a comment |
There are a lot of "it depends", but as a general rule, I make the database do the work by doing a single call rather than multiple.
Single calls allow the DB to optimize, if possible.
With multiple calls, good deal of time is spent marshaling and unmarshaling parameters, network traffic, having the DB interpret the call, causing the interrupt back into the calling process to make the second call.
But, testing is the only way to know for sure.
There are a lot of "it depends", but as a general rule, I make the database do the work by doing a single call rather than multiple.
Single calls allow the DB to optimize, if possible.
With multiple calls, good deal of time is spent marshaling and unmarshaling parameters, network traffic, having the DB interpret the call, causing the interrupt back into the calling process to make the second call.
But, testing is the only way to know for sure.
answered Nov 15 '18 at 0:05
pbuckpbuck
2,46321221
2,46321221
add a comment |
add a comment |
The only way to know is to test.
Although there is overhead for multiple calls to the database, reading all the rows of a table can be quite expensive. Of course, if the table has 10 rows, then it is not expensive. A billion rows is a different matter.
Calculating count(*)
in Postgres should require scanning all the rows -- and testing for locking (to handle against concurrent transactions). This is, unfortunately, rather expensive.
If you are reading all the rows, why not just count the ones that are returned?
add a comment |
The only way to know is to test.
Although there is overhead for multiple calls to the database, reading all the rows of a table can be quite expensive. Of course, if the table has 10 rows, then it is not expensive. A billion rows is a different matter.
Calculating count(*)
in Postgres should require scanning all the rows -- and testing for locking (to handle against concurrent transactions). This is, unfortunately, rather expensive.
If you are reading all the rows, why not just count the ones that are returned?
add a comment |
The only way to know is to test.
Although there is overhead for multiple calls to the database, reading all the rows of a table can be quite expensive. Of course, if the table has 10 rows, then it is not expensive. A billion rows is a different matter.
Calculating count(*)
in Postgres should require scanning all the rows -- and testing for locking (to handle against concurrent transactions). This is, unfortunately, rather expensive.
If you are reading all the rows, why not just count the ones that are returned?
The only way to know is to test.
Although there is overhead for multiple calls to the database, reading all the rows of a table can be quite expensive. Of course, if the table has 10 rows, then it is not expensive. A billion rows is a different matter.
Calculating count(*)
in Postgres should require scanning all the rows -- and testing for locking (to handle against concurrent transactions). This is, unfortunately, rather expensive.
If you are reading all the rows, why not just count the ones that are returned?
answered Nov 15 '18 at 0:19
Gordon LinoffGordon Linoff
775k35306409
775k35306409
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53310535%2fis-it-faster-to-call-procedure-twice-and-get-count-on-the-second-call-or-fast%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
2
what do you see when you time them both?
– Mitch Wheat
Nov 14 '18 at 23:57
1
Eric Lippert say
Race Your Horse
– dwir182
Nov 15 '18 at 0:04
1
I'd be surprised if the second was quicker. Assuming the function reads from a table, it will result in two table reads. And I would imagine the COUNT in the first would be optimised to only be executed/calculated once, after all it's guaranteed to be the same value for every row. Just speculation though.
– 404
Nov 15 '18 at 0:04
On a table with 471 records, Example 1 takes about
8ms
whereas Example 2 takes about11ms
. I'll try to see if I can find a larger data set though, not sure if that's enough to accurately depict performance.– Graham S.
Nov 15 '18 at 0:08