Is it faster to call procedure twice and get COUNT(*) on the second call or faster to do COUNT(*) OVER() when fetching a list of records? (PostgreSQL)










-1















My coworker and I were discussing today which way is the faster and more efficient way to perform a count over a list of records.



Basically, the scenario is, when we want to fetch a list of records from a procedure, is it quicker/more efficient to calculate the total number of records from the procedure as we fetch the list of records in one query or is it better to fetch the list of records first and then call COUNT again on the same procedure (in 2 calls).



Which one of these examples is more efficient? Why?



Example 1



SELECT *, COUNT(1) OVER() AS total_rows FROM some_procedure();


This returns a total_rows column on every row along with the rest of the columns.



-or-



Example 2



SELECT * FROM some_procedure();
SELECT COUNT(*) FROM some_procedure();


This way requires 2 calls, but doesn't calculate a total_rows column.



Additionally, does the COUNT in Example 1 get calculated for every record or only once?










share|improve this question

















  • 2





    what do you see when you time them both?

    – Mitch Wheat
    Nov 14 '18 at 23:57






  • 1





    Eric Lippert say Race Your Horse

    – dwir182
    Nov 15 '18 at 0:04






  • 1





    I'd be surprised if the second was quicker. Assuming the function reads from a table, it will result in two table reads. And I would imagine the COUNT in the first would be optimised to only be executed/calculated once, after all it's guaranteed to be the same value for every row. Just speculation though.

    – 404
    Nov 15 '18 at 0:04











  • On a table with 471 records, Example 1 takes about 8ms whereas Example 2 takes about 11ms. I'll try to see if I can find a larger data set though, not sure if that's enough to accurately depict performance.

    – Graham S.
    Nov 15 '18 at 0:08















-1















My coworker and I were discussing today which way is the faster and more efficient way to perform a count over a list of records.



Basically, the scenario is, when we want to fetch a list of records from a procedure, is it quicker/more efficient to calculate the total number of records from the procedure as we fetch the list of records in one query or is it better to fetch the list of records first and then call COUNT again on the same procedure (in 2 calls).



Which one of these examples is more efficient? Why?



Example 1



SELECT *, COUNT(1) OVER() AS total_rows FROM some_procedure();


This returns a total_rows column on every row along with the rest of the columns.



-or-



Example 2



SELECT * FROM some_procedure();
SELECT COUNT(*) FROM some_procedure();


This way requires 2 calls, but doesn't calculate a total_rows column.



Additionally, does the COUNT in Example 1 get calculated for every record or only once?










share|improve this question

















  • 2





    what do you see when you time them both?

    – Mitch Wheat
    Nov 14 '18 at 23:57






  • 1





    Eric Lippert say Race Your Horse

    – dwir182
    Nov 15 '18 at 0:04






  • 1





    I'd be surprised if the second was quicker. Assuming the function reads from a table, it will result in two table reads. And I would imagine the COUNT in the first would be optimised to only be executed/calculated once, after all it's guaranteed to be the same value for every row. Just speculation though.

    – 404
    Nov 15 '18 at 0:04











  • On a table with 471 records, Example 1 takes about 8ms whereas Example 2 takes about 11ms. I'll try to see if I can find a larger data set though, not sure if that's enough to accurately depict performance.

    – Graham S.
    Nov 15 '18 at 0:08













-1












-1








-1








My coworker and I were discussing today which way is the faster and more efficient way to perform a count over a list of records.



Basically, the scenario is, when we want to fetch a list of records from a procedure, is it quicker/more efficient to calculate the total number of records from the procedure as we fetch the list of records in one query or is it better to fetch the list of records first and then call COUNT again on the same procedure (in 2 calls).



Which one of these examples is more efficient? Why?



Example 1



SELECT *, COUNT(1) OVER() AS total_rows FROM some_procedure();


This returns a total_rows column on every row along with the rest of the columns.



-or-



Example 2



SELECT * FROM some_procedure();
SELECT COUNT(*) FROM some_procedure();


This way requires 2 calls, but doesn't calculate a total_rows column.



Additionally, does the COUNT in Example 1 get calculated for every record or only once?










share|improve this question














My coworker and I were discussing today which way is the faster and more efficient way to perform a count over a list of records.



Basically, the scenario is, when we want to fetch a list of records from a procedure, is it quicker/more efficient to calculate the total number of records from the procedure as we fetch the list of records in one query or is it better to fetch the list of records first and then call COUNT again on the same procedure (in 2 calls).



Which one of these examples is more efficient? Why?



Example 1



SELECT *, COUNT(1) OVER() AS total_rows FROM some_procedure();


This returns a total_rows column on every row along with the rest of the columns.



-or-



Example 2



SELECT * FROM some_procedure();
SELECT COUNT(*) FROM some_procedure();


This way requires 2 calls, but doesn't calculate a total_rows column.



Additionally, does the COUNT in Example 1 get calculated for every record or only once?







sql database postgresql






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 14 '18 at 23:55









Graham S.Graham S.

86811121




86811121







  • 2





    what do you see when you time them both?

    – Mitch Wheat
    Nov 14 '18 at 23:57






  • 1





    Eric Lippert say Race Your Horse

    – dwir182
    Nov 15 '18 at 0:04






  • 1





    I'd be surprised if the second was quicker. Assuming the function reads from a table, it will result in two table reads. And I would imagine the COUNT in the first would be optimised to only be executed/calculated once, after all it's guaranteed to be the same value for every row. Just speculation though.

    – 404
    Nov 15 '18 at 0:04











  • On a table with 471 records, Example 1 takes about 8ms whereas Example 2 takes about 11ms. I'll try to see if I can find a larger data set though, not sure if that's enough to accurately depict performance.

    – Graham S.
    Nov 15 '18 at 0:08












  • 2





    what do you see when you time them both?

    – Mitch Wheat
    Nov 14 '18 at 23:57






  • 1





    Eric Lippert say Race Your Horse

    – dwir182
    Nov 15 '18 at 0:04






  • 1





    I'd be surprised if the second was quicker. Assuming the function reads from a table, it will result in two table reads. And I would imagine the COUNT in the first would be optimised to only be executed/calculated once, after all it's guaranteed to be the same value for every row. Just speculation though.

    – 404
    Nov 15 '18 at 0:04











  • On a table with 471 records, Example 1 takes about 8ms whereas Example 2 takes about 11ms. I'll try to see if I can find a larger data set though, not sure if that's enough to accurately depict performance.

    – Graham S.
    Nov 15 '18 at 0:08







2




2





what do you see when you time them both?

– Mitch Wheat
Nov 14 '18 at 23:57





what do you see when you time them both?

– Mitch Wheat
Nov 14 '18 at 23:57




1




1





Eric Lippert say Race Your Horse

– dwir182
Nov 15 '18 at 0:04





Eric Lippert say Race Your Horse

– dwir182
Nov 15 '18 at 0:04




1




1





I'd be surprised if the second was quicker. Assuming the function reads from a table, it will result in two table reads. And I would imagine the COUNT in the first would be optimised to only be executed/calculated once, after all it's guaranteed to be the same value for every row. Just speculation though.

– 404
Nov 15 '18 at 0:04





I'd be surprised if the second was quicker. Assuming the function reads from a table, it will result in two table reads. And I would imagine the COUNT in the first would be optimised to only be executed/calculated once, after all it's guaranteed to be the same value for every row. Just speculation though.

– 404
Nov 15 '18 at 0:04













On a table with 471 records, Example 1 takes about 8ms whereas Example 2 takes about 11ms. I'll try to see if I can find a larger data set though, not sure if that's enough to accurately depict performance.

– Graham S.
Nov 15 '18 at 0:08





On a table with 471 records, Example 1 takes about 8ms whereas Example 2 takes about 11ms. I'll try to see if I can find a larger data set though, not sure if that's enough to accurately depict performance.

– Graham S.
Nov 15 '18 at 0:08












3 Answers
3






active

oldest

votes


















2














If the function returns only a few rows (compared to the number of rows in the table it reads from), I would count the returned rows:



with result as (
select *
from some_function()
)
select *, (select count(*) from result) as total_rows
from result;


Again, this only makes sense if the function does some heavy lifting that you only want to do once and the number of rows is reasonably small (how much that is depends on your server, anything in the "many thousands" is probably OK, anything like "millions of rows" is probably not going to be efficient.






share|improve this answer






























    1














    There are a lot of "it depends", but as a general rule, I make the database do the work by doing a single call rather than multiple.



    Single calls allow the DB to optimize, if possible.



    With multiple calls, good deal of time is spent marshaling and unmarshaling parameters, network traffic, having the DB interpret the call, causing the interrupt back into the calling process to make the second call.



    But, testing is the only way to know for sure.






    share|improve this answer






























      1














      The only way to know is to test.



      Although there is overhead for multiple calls to the database, reading all the rows of a table can be quite expensive. Of course, if the table has 10 rows, then it is not expensive. A billion rows is a different matter.



      Calculating count(*) in Postgres should require scanning all the rows -- and testing for locking (to handle against concurrent transactions). This is, unfortunately, rather expensive.



      If you are reading all the rows, why not just count the ones that are returned?






      share|improve this answer






















        Your Answer






        StackExchange.ifUsing("editor", function ()
        StackExchange.using("externalEditor", function ()
        StackExchange.using("snippets", function ()
        StackExchange.snippets.init();
        );
        );
        , "code-snippets");

        StackExchange.ready(function()
        var channelOptions =
        tags: "".split(" "),
        id: "1"
        ;
        initTagRenderer("".split(" "), "".split(" "), channelOptions);

        StackExchange.using("externalEditor", function()
        // Have to fire editor after snippets, if snippets enabled
        if (StackExchange.settings.snippets.snippetsEnabled)
        StackExchange.using("snippets", function()
        createEditor();
        );

        else
        createEditor();

        );

        function createEditor()
        StackExchange.prepareEditor(
        heartbeatType: 'answer',
        autoActivateHeartbeat: false,
        convertImagesToLinks: true,
        noModals: true,
        showLowRepImageUploadWarning: true,
        reputationToPostImages: 10,
        bindNavPrevention: true,
        postfix: "",
        imageUploader:
        brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
        contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
        allowUrls: true
        ,
        onDemand: true,
        discardSelector: ".discard-answer"
        ,immediatelyShowMarkdownHelp:true
        );



        );













        draft saved

        draft discarded


















        StackExchange.ready(
        function ()
        StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53310535%2fis-it-faster-to-call-procedure-twice-and-get-count-on-the-second-call-or-fast%23new-answer', 'question_page');

        );

        Post as a guest















        Required, but never shown

























        3 Answers
        3






        active

        oldest

        votes








        3 Answers
        3






        active

        oldest

        votes









        active

        oldest

        votes






        active

        oldest

        votes









        2














        If the function returns only a few rows (compared to the number of rows in the table it reads from), I would count the returned rows:



        with result as (
        select *
        from some_function()
        )
        select *, (select count(*) from result) as total_rows
        from result;


        Again, this only makes sense if the function does some heavy lifting that you only want to do once and the number of rows is reasonably small (how much that is depends on your server, anything in the "many thousands" is probably OK, anything like "millions of rows" is probably not going to be efficient.






        share|improve this answer



























          2














          If the function returns only a few rows (compared to the number of rows in the table it reads from), I would count the returned rows:



          with result as (
          select *
          from some_function()
          )
          select *, (select count(*) from result) as total_rows
          from result;


          Again, this only makes sense if the function does some heavy lifting that you only want to do once and the number of rows is reasonably small (how much that is depends on your server, anything in the "many thousands" is probably OK, anything like "millions of rows" is probably not going to be efficient.






          share|improve this answer

























            2












            2








            2







            If the function returns only a few rows (compared to the number of rows in the table it reads from), I would count the returned rows:



            with result as (
            select *
            from some_function()
            )
            select *, (select count(*) from result) as total_rows
            from result;


            Again, this only makes sense if the function does some heavy lifting that you only want to do once and the number of rows is reasonably small (how much that is depends on your server, anything in the "many thousands" is probably OK, anything like "millions of rows" is probably not going to be efficient.






            share|improve this answer













            If the function returns only a few rows (compared to the number of rows in the table it reads from), I would count the returned rows:



            with result as (
            select *
            from some_function()
            )
            select *, (select count(*) from result) as total_rows
            from result;


            Again, this only makes sense if the function does some heavy lifting that you only want to do once and the number of rows is reasonably small (how much that is depends on your server, anything in the "many thousands" is probably OK, anything like "millions of rows" is probably not going to be efficient.







            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered Nov 15 '18 at 6:41









            a_horse_with_no_namea_horse_with_no_name

            298k46454548




            298k46454548























                1














                There are a lot of "it depends", but as a general rule, I make the database do the work by doing a single call rather than multiple.



                Single calls allow the DB to optimize, if possible.



                With multiple calls, good deal of time is spent marshaling and unmarshaling parameters, network traffic, having the DB interpret the call, causing the interrupt back into the calling process to make the second call.



                But, testing is the only way to know for sure.






                share|improve this answer



























                  1














                  There are a lot of "it depends", but as a general rule, I make the database do the work by doing a single call rather than multiple.



                  Single calls allow the DB to optimize, if possible.



                  With multiple calls, good deal of time is spent marshaling and unmarshaling parameters, network traffic, having the DB interpret the call, causing the interrupt back into the calling process to make the second call.



                  But, testing is the only way to know for sure.






                  share|improve this answer

























                    1












                    1








                    1







                    There are a lot of "it depends", but as a general rule, I make the database do the work by doing a single call rather than multiple.



                    Single calls allow the DB to optimize, if possible.



                    With multiple calls, good deal of time is spent marshaling and unmarshaling parameters, network traffic, having the DB interpret the call, causing the interrupt back into the calling process to make the second call.



                    But, testing is the only way to know for sure.






                    share|improve this answer













                    There are a lot of "it depends", but as a general rule, I make the database do the work by doing a single call rather than multiple.



                    Single calls allow the DB to optimize, if possible.



                    With multiple calls, good deal of time is spent marshaling and unmarshaling parameters, network traffic, having the DB interpret the call, causing the interrupt back into the calling process to make the second call.



                    But, testing is the only way to know for sure.







                    share|improve this answer












                    share|improve this answer



                    share|improve this answer










                    answered Nov 15 '18 at 0:05









                    pbuckpbuck

                    2,46321221




                    2,46321221





















                        1














                        The only way to know is to test.



                        Although there is overhead for multiple calls to the database, reading all the rows of a table can be quite expensive. Of course, if the table has 10 rows, then it is not expensive. A billion rows is a different matter.



                        Calculating count(*) in Postgres should require scanning all the rows -- and testing for locking (to handle against concurrent transactions). This is, unfortunately, rather expensive.



                        If you are reading all the rows, why not just count the ones that are returned?






                        share|improve this answer



























                          1














                          The only way to know is to test.



                          Although there is overhead for multiple calls to the database, reading all the rows of a table can be quite expensive. Of course, if the table has 10 rows, then it is not expensive. A billion rows is a different matter.



                          Calculating count(*) in Postgres should require scanning all the rows -- and testing for locking (to handle against concurrent transactions). This is, unfortunately, rather expensive.



                          If you are reading all the rows, why not just count the ones that are returned?






                          share|improve this answer

























                            1












                            1








                            1







                            The only way to know is to test.



                            Although there is overhead for multiple calls to the database, reading all the rows of a table can be quite expensive. Of course, if the table has 10 rows, then it is not expensive. A billion rows is a different matter.



                            Calculating count(*) in Postgres should require scanning all the rows -- and testing for locking (to handle against concurrent transactions). This is, unfortunately, rather expensive.



                            If you are reading all the rows, why not just count the ones that are returned?






                            share|improve this answer













                            The only way to know is to test.



                            Although there is overhead for multiple calls to the database, reading all the rows of a table can be quite expensive. Of course, if the table has 10 rows, then it is not expensive. A billion rows is a different matter.



                            Calculating count(*) in Postgres should require scanning all the rows -- and testing for locking (to handle against concurrent transactions). This is, unfortunately, rather expensive.



                            If you are reading all the rows, why not just count the ones that are returned?







                            share|improve this answer












                            share|improve this answer



                            share|improve this answer










                            answered Nov 15 '18 at 0:19









                            Gordon LinoffGordon Linoff

                            775k35306409




                            775k35306409



























                                draft saved

                                draft discarded
















































                                Thanks for contributing an answer to Stack Overflow!


                                • Please be sure to answer the question. Provide details and share your research!

                                But avoid


                                • Asking for help, clarification, or responding to other answers.

                                • Making statements based on opinion; back them up with references or personal experience.

                                To learn more, see our tips on writing great answers.




                                draft saved


                                draft discarded














                                StackExchange.ready(
                                function ()
                                StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53310535%2fis-it-faster-to-call-procedure-twice-and-get-count-on-the-second-call-or-fast%23new-answer', 'question_page');

                                );

                                Post as a guest















                                Required, but never shown





















































                                Required, but never shown














                                Required, but never shown












                                Required, but never shown







                                Required, but never shown

































                                Required, but never shown














                                Required, but never shown












                                Required, but never shown







                                Required, but never shown







                                Popular posts from this blog

                                Top Tejano songwriter Luis Silva dead of heart attack at 64

                                政党

                                天津地下鉄3号線