Airflow deprecation warning Invalid arguments were passed









I have the following code on Airflow 1.9:



    import_op = MySqlToGoogleCloudStorageOperator(
        task_id='import',
        mysql_conn_id='oproduction',
        google_cloud_storage_conn_id='gcpm',
        provide_context=True,
        approx_max_file_size_bytes=100000000,  # 100 MB per file
        sql='import.sql',
        params={'next_to_import': NEXT_TO_IMPORT, 'table_name': TABLE_NAME},
        bucket=GCS_BUCKET_ID,
        filename=file_name_orders,
        dag=dag)


Why does it generate this warning:




    /usr/local/lib/python2.7/dist-packages/airflow/models.py:2160:
    PendingDeprecationWarning: Invalid arguments were passed to
    MySqlToGoogleCloudStorageOperator. Support for passing such arguments
    will be dropped in Airflow 2.0. Invalid arguments were:
    *args: ()
    **kwargs: {'provide_context': True} category=PendingDeprecationWarning




What is the problem with the provide_context? To the best of my knowledge it is needed for the usage of params.










asked Nov 11 at 12:27 by Luis






















          1 Answer



          accepted










provide_context is not needed for params.

The params parameter (a dict) can be passed to any operator.

You would mostly use provide_context with PythonOperator and BranchPythonOperator. A good example is https://airflow.readthedocs.io/en/latest/howto/operator.html#pythonoperator.

MySqlToGoogleCloudStorageOperator has no provide_context parameter, so the argument ends up in **kwargs and triggers the deprecation warning.
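
To make the mechanism concrete, here is a minimal sketch of how a base class that accepts **kwargs can warn about arguments no subclass claimed. BaseOperatorSketch, ExportOperatorSketch and collect_warnings are hypothetical names invented for this illustration; they only imitate the behaviour described above and are not Airflow's actual classes.

```python
import warnings


class BaseOperatorSketch(object):
    """Hypothetical stand-in for a base operator; not the real Airflow class."""

    def __init__(self, task_id, params=None, *args, **kwargs):
        # Anything left over here was not declared by any class in the chain.
        if args or kwargs:
            warnings.warn(
                'Invalid arguments were passed to {c}. Invalid arguments were:'
                '\n*args: {a}\n**kwargs: {k}'.format(
                    c=self.__class__.__name__, a=args, k=kwargs),
                category=PendingDeprecationWarning,
            )
        self.task_id = task_id
        self.params = params or {}


class ExportOperatorSketch(BaseOperatorSketch):
    """Hypothetical operator that declares 'sql' but no 'provide_context'."""

    def __init__(self, sql, *args, **kwargs):
        super(ExportOperatorSketch, self).__init__(*args, **kwargs)
        self.sql = sql


def collect_warnings(**operator_kwargs):
    """Build the operator and return the warning messages it triggered."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        ExportOperatorSketch(**operator_kwargs)
    return [str(w.message) for w in caught]
```

Calling collect_warnings(task_id='import', sql='import.sql', params={'table_name': 't'}, provide_context=True) returns one message that names provide_context, while the same call without provide_context returns an empty list. In this sketch params is accepted either way, which mirrors the point that params does not depend on provide_context.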



The PythonOperator docstring describes provide_context as follows:

    if set to true, Airflow will pass a set of keyword arguments that can
    be used in your function. This set of kwargs correspond exactly to
    what you can use in your jinja templates. For this to work, you need
    to define **kwargs in your function header.




The PythonOperator source code contains the following:

    if self.provide_context:
        context.update(self.op_kwargs)
        context['templates_dict'] = self.templates_dict
        self.op_kwargs = context


So, in simple terms, it passes the following dictionary, together with templates_dict, to the function you give as python_callable:

    {
        'END_DATE': ds,
        'conf': configuration,
        'dag': task.dag,
        'dag_run': dag_run,
        'ds': ds,
        'ds_nodash': ds_nodash,
        'end_date': ds,
        'execution_date': self.execution_date,
        'latest_date': ds,
        'macros': macros,
        'params': params,
        'run_id': run_id,
        'tables': tables,
        'task': task,
        'task_instance': self,
        'task_instance_key_str': ti_key_str,
        'test_mode': self.test_mode,
        'ti': self,
        'tomorrow_ds': tomorrow_ds,
        'tomorrow_ds_nodash': tomorrow_ds_nodash,
        'ts': ts,
        'ts_nodash': ts_nodash,
        'yesterday_ds': yesterday_ds,
        'yesterday_ds_nodash': yesterday_ds_nodash,
    }



So this can be used in the function as follows:

    from pprint import pprint

    def print_context(ds, **kwargs):
        # With provide_context=True, the context arrives as keyword arguments.
        pprint(kwargs)
        ti = kwargs['task_instance']
        exec_date = kwargs['execution_date']
        print(ds)
        return 'Whatever you return gets printed in the logs'


    run_this = PythonOperator(
        task_id='print_the_context',
        provide_context=True,
        python_callable=print_context,
        dag=dag,
    )
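
To see what the flag changes without running Airflow, here is a rough, self-contained sketch of the merge step quoted from the source code. call_with_context, describe_run and fake_context are hypothetical names made up for this illustration, and the sketch copies the context instead of updating it in place, which is a simplification of the quoted code.

```python
def call_with_context(python_callable, context, op_kwargs=None,
                      templates_dict=None, provide_context=False):
    """Simplified, hypothetical imitation of the provide_context merge step."""
    op_kwargs = dict(op_kwargs or {})
    if provide_context:
        merged = dict(context)      # copy, so the caller's dict is left untouched
        merged.update(op_kwargs)    # explicit op_kwargs override context keys
        merged['templates_dict'] = templates_dict
        op_kwargs = merged
    return python_callable(**op_kwargs)


def describe_run(ds, **kwargs):
    # 'ds' is picked out by name; everything else lands in kwargs.
    return 'ds={} table={}'.format(ds, kwargs['params']['table_name'])


# Made-up values standing in for the context Airflow would build for a task run.
fake_context = {'ds': '2018-11-11', 'params': {'table_name': 'orders'}}
```

With provide_context=True, call_with_context(describe_run, fake_context, provide_context=True) hands ds and params to the function. With the flag left at False, the function is called with op_kwargs only, so describe_run raises a TypeError for the missing ds argument.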





          • OK. So can you please explain why provide_context is even needed? provide_context will always be true when params is added for PythonOperator. It seems like a parameter whose value Airflow could figure out on its own... Nothing is gained by asking the user to specify it.
            – Luis
            Nov 12 at 9:51











          • You would use provide_context so that the context variables are passed to the function given as python_callable in PythonOperator
            – kaxil
            Nov 12 at 11:06











          • I have updated the answer with this info
            – kaxil
            Nov 12 at 11:18










answered Nov 11 at 19:25 by kaxil, edited Nov 12 at 11:18










