How to have App Engine avoid cold starts?










Even when there are instances already running, I am still experiencing cold starts on some of the requests.



I thought that GAE would start some instances in the background and add them to the pool of active instances that serve requests only once the instances are started. Is that not the case? Is there a way to configure GAE to make it so?



Instead, it seems that some requests wait for the full startup of a new instance, which can take up to 10 seconds, even though the existing instances alone could have served all the benchmark traffic in under a couple of seconds.



UPDATE:
This is my app.yaml config:

    runtime: nodejs10
    env: standard
    instance_class: F1
    handlers:
    - url: '.*'
      script: auto
    automatic_scaling:
      min_instances: 1
      max_instances: 3









      google-app-engine














      asked Nov 13 '18 at 23:33 by ben (edited Nov 14 '18 at 1:33)






















          1 Answer














          What you're looking for are warmup requests:




          Warmup requests are a specific type of loading request that load
          application code into an instance ahead of time, before any live
          requests are made. Manual or basic scaling instances do not receive an
          /_ah/warmup request.




          And from Configuring warmup requests:




          Loading your app's code to a new instance can result in loading
          requests. Loading requests can result in increased request latency
          for your users, but you can avoid this latency using warmup
          requests. Warmup requests load your app's code into a new instance
          before any live requests reach that instance.




          Not 100% perfect - there are some limitations, but they're the next best thing.



          Configuring warmup requests means:




          • Enabling warmup requests in your app.yaml file:



            inbound_services:
            - warmup


          • Creating your handler for the '/_ah/warmup' warmup requests URL
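As a rough illustration of the second step, here is a minimal sketch of a handler for the '/_ah/warmup' URL, using only Node's built-in http module. The names handleRequest, warmUp and warmedUp are hypothetical and not part of any App Engine API, and the body of warmUp is a placeholder for whatever startup work your app actually needs (loading config, opening connections, and so on).

```javascript
const http = require('http');

// Hypothetical flag and helper: stand-ins for real startup work such as
// loading configuration or opening database connections.
let warmedUp = false;
function warmUp() {
  warmedUp = true;
}

// Answers the warmup URL with 200 after doing the startup work;
// every other URL is treated as a normal (live) request.
function handleRequest(req, res) {
  if (req.url === '/_ah/warmup') {
    warmUp();
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('warmed up');
    return;
  }
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('hello');
}

// The server is created here but not started; see the note below.
const server = http.createServer(handleRequest);
```

To serve traffic, call server.listen(process.env.PORT || 8080) at the end of the file; App Engine passes the port to listen on in the PORT environment variable. The same idea should carry over to a framework route if you use one.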
















          answered Nov 14 '18 at 2:31 by Dan Cornilescu












          • So that will ensure there is at least 1 instance ready. Isn't that equivalent to min_instances: 1?

            – ben
            Nov 14 '18 at 5:03

          • Hm, nope. They're orthogonal. min_instances: 1 means at least 1 instance should be running at all times. But your problem is about how instances are started (say, when a 2nd instance is needed). Without warmups, an actual live request is used to bring the instance up, so that request is hit with a long response time (it includes the instance startup time). When warmup requests are configured, a warmup request (not a live one) is used to bring the instance up, and live requests are sent to it only after that request is answered OK (i.e. the instance is up and running).

            – Dan Cornilescu
            Nov 14 '18 at 11:13
















