Pandas: TypeError: '>=' not supported between instances of 'str' and 'float' [duplicate]






















This question already has an answer here:



  • Pandas: Converting to numeric, creating NaNs when necessary

    4 answers



I'm trying to find the maximum wind gust for each month for the following sample of data:



            maxtemp  mintemp  meantemp  heatdays  cooldays  rain  snow  precip  groundsnow  maxgustdir  maxgustspd
time
2018-01-01     -1.3     -8.1      -4.7      22.7       0.0   0.0   1.0     0.2        17.0        26.0          54
2018-01-02     -0.9     -7.4      -4.2      22.2       0.0   0.0   0.0     0.0        17.0        26.0          41
2018-01-03     -3.0     -7.9      -5.5      23.5       0.0   0.0   0.4     0.2        17.0        27.0          70
2018-01-04      0.0    -11.0      -5.5      23.5       0.0   2.4   7.2     8.4        11.0        12.0          96
2018-01-05     10.0     -0.3       4.9      13.1       0.0  11.0   0.0    11.0        10.0        14.0          70


Here's my code:



w['maxgustspd'].resample('M').max()


As you can see, I've resampled the data to monthly frequency and am trying to get the max value for each month. The problem is that the column holds a mix of float and string (i.e. <31) values, so I get the error:



TypeError: '>=' not supported between instances of 'str' and 'float'


Any ideas on how to ignore the string values?










python pandas
















asked Nov 10 at 21:26









Steve Power





marked as duplicate by jpp pandas
Nov 10 at 22:09


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.



















  • Ignore them how? Just give a nan?
    – roganjosh
    Nov 10 at 21:28










  • Are all your numbers actually stored as floats, or are some of them saved as strings? You may want to convert the numbers to floats if it's the latter.
    – Paritosh Singh
    Nov 10 at 21:43










  • The numbers are stored as floats in general, except for the (<31) values. I'm guessing that was a threshold for recording the data? I can convert them to NaN in this case, since I'm looking for the max and they obviously wouldn't qualify.
    – Steve Power
    Nov 11 at 1:42
















2 Answers













If the <31 values are of interest, you'll want to do some cleanup to remove the < and convert them to floats. If the str values are not of interest, then you can convert them to NaN and .max() will ignore them.



import numpy as np
# Replace any string entry (such as "<31") with NaN so that .max() skips it
w['maxgustspd'] = w['maxgustspd'].map(lambda x: np.nan if isinstance(x, str) else x)
w['maxgustspd'].resample('M').max()
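
As a rough sketch of both options (not the answerer's own code), pd.to_numeric with errors='coerce' turns every non-numeric entry into NaN in one call, which is in line with the title of the linked duplicate. The sample values and the names gust_nan and gust_capped are made up for illustration. The monthly grouping here uses to_period('M') instead of resample, so it does not depend on the resample alias, which is 'M' in older pandas and 'ME' in newer releases.

```python
import pandas as pd

# Hypothetical sample: gust speeds where low readings were logged as "<31".
idx = pd.to_datetime(
    ["2018-01-01", "2018-01-02", "2018-01-03", "2018-02-01", "2018-02-02"]
)
w = pd.DataFrame({"maxgustspd": [54.0, "<31", 70.0, "<31", 41.0]}, index=idx)

# Option 1: treat anything non-numeric as missing (NaN); max() skips NaN.
gust_nan = pd.to_numeric(w["maxgustspd"], errors="coerce")

# Option 2: keep the threshold value by stripping the "<" before converting.
gust_capped = pd.to_numeric(
    w["maxgustspd"].astype(str).str.lstrip("<"), errors="coerce"
)

# Maximum per calendar month, grouped on the month of each timestamp.
monthly_max = gust_nan.groupby(gust_nan.index.to_period("M")).max()
print(monthly_max)
```

Option 2 reads "<31" as 31, which is only an upper bound on the real reading. That is harmless for a monthly maximum but would bias something like a mean.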






answered Nov 10 at 21:56
python_data_egghead














The spacing of your data is a little odd and you didn't post how it is being imported, so I can only hazard a guess.

Are you sure that all the rows in maxgustspd have data? I have seen the issue you describe when a Series of strings has a few gaps in it. The gaps get filled in as NaN (which is a float), while the rest of the Series stays as strings.

So, check the types of the numeric data you have imported (and convert them to float if necessary). If the import leaves odd gaps, consider fixing the data at the source, or importing with delim_whitespace=True if the columns keep shifting as they do in the posted data.
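
A minimal sketch of that check, assuming the data lives in a whitespace-separated text file. The inline text, the column subset, and the name bad are invented for illustration. It uses sep=r"\s+", the regular-expression equivalent of delim_whitespace=True, since I believe newer pandas releases prefer that spelling.

```python
import io
import pandas as pd

# Hypothetical whitespace-separated text standing in for the real file.
raw = """time maxgustdir maxgustspd
2018-01-01 26.0 54
2018-01-02 26.0 <31
2018-01-03 27.0 70
"""

# sep=r"\s+" splits on any run of spaces or tabs.
w = pd.read_csv(io.StringIO(raw), sep=r"\s+", index_col="time", parse_dates=True)

# A numeric column that shows up as object (or a string dtype) holds
# at least one entry that could not be parsed as a number.
print(w.dtypes)

# List the entries that are present but fail numeric conversion.
col = w["maxgustspd"]
bad = col[pd.to_numeric(col, errors="coerce").isna() & col.notna()]
print(bad)
```

Once the offending entries are known, it is easier to decide whether to drop them, convert them, or fix the import.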






answered Nov 10 at 21:56
jtwilson
