How to calculate columns that have circular dependency in pandas dataframe?

I have a pandas DataFrame like this:

                Tstamp   Token     LTP  Cum_bsdiffs  Cum_ltpdiffs  counts  Entry  Correl  Exit  ltpchange  ltpcumchange  ltppercumchange
0  2018-10-29 11:40:33  415745  138.40          NaN           NaN       0      0     NaN     0          0             0                0
1  2018-10-29 11:40:34  415745  138.40       -200.0          0.00       1      0     NaN     0          0             0                0
2  2018-10-29 11:40:34  415745  138.35      -1437.0         -0.05       2      0     NaN     0          0             0                0
3  2018-10-29 11:40:36  415745  138.35      -1337.0         -0.05       3      0     NaN     0          0             0                0

The columns Entry, Exit, ltpchange and ltpcumchange are interdependent as follows:

Entry becomes "Buy" or "Sell" based on a condition that depends on other columns. Otherwise it remains 0.

As soon as Entry is no longer 0, ltpchange starts recording the changes in subsequent values of LTP. Otherwise it remains 0.

ltpcumchange is the cumulative sum of ltpchange.

As soon as ltpcumchange reaches a target value (in either direction), Exit becomes 1.

Entry keeps the "Buy" or "Sell" value of its previous row until Exit becomes 1, after which it reverts to 0.

I have implemented this logic with iterrows(), but it is extremely slow. My DataFrame contains more than 2 million rows, and the loop processes only about 5 rows per second.

I tried to express the logic with column-wise DataFrame operations but could not get the desired result. Can anyone help me out here?

asked Nov 16 '18 at 11:49

Sagar Upadhyay

It seems that your DataFrame changes based only on those 4 columns, so you can exclude the rest. Also, given the above example, what would the desired output look like? Should you need help editing your question, take a look at how to create a Minimal, Complete, and Verifiable example.

– zipa
Nov 16 '18 at 12:04
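
A reduced example of the kind this comment asks for might look like the sketch below. The values are copied from the four sample rows in the question; which columns are kept, and the initial values of the computed columns, are assumptions.

```python
import numpy as np
import pandas as pd

# The four sample rows from the question, restricted to a few input columns.
# Which columns actually matter is an assumption.
df = pd.DataFrame(
    {
        "Tstamp": pd.to_datetime(
            [
                "2018-10-29 11:40:33",
                "2018-10-29 11:40:34",
                "2018-10-29 11:40:34",
                "2018-10-29 11:40:36",
            ]
        ),
        "LTP": [138.40, 138.40, 138.35, 138.35],
        "Cum_bsdiffs": [np.nan, -200.0, -1437.0, -1337.0],
        "Cum_ltpdiffs": [np.nan, 0.00, -0.05, -0.05],
    }
)

# The interdependent columns start at their "no position" defaults.
df["Entry"] = 0
df["Exit"] = 0
df["ltpchange"] = 0.0
df["ltpcumchange"] = 0.0

print(df)
```

A frame like this, together with the expected values of the four computed columns, would make the question reproducible.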

I guess you want to create a trading robot and don't want to share your strategy. But if you do not share your code, we cannot optimize it. iterrows() is slow; there might be a way to avoid using it.

– Charles R
Nov 16 '18 at 12:43

@CharlesR Yes, you are right, but I have mentioned all the logic that I want to be 'vectorized'.

– Sagar Upadhyay
Nov 16 '18 at 13:20

Maybe try to use .loc as much as possible to filter the rows you want to modify at each step of your code. NumPy is also good for vectorizing your code; try np.where, np.select, np.choose.

– Charles R
Nov 16 '18 at 13:38
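
As a rough illustration of the np.select suggestion, the sketch below computes a per-row candidate signal in one vectorized step. The entry condition is not stated in the question, so the thresholds BUY_LEVEL and SELL_LEVEL, the column name raw_signal, and the last sample value are all hypothetical.

```python
import numpy as np
import pandas as pd

# First three values come from the question's sample; 1500.0 is made up
# so that both branches are exercised.
df = pd.DataFrame({"Cum_bsdiffs": [np.nan, -200.0, -1437.0, 1500.0]})

# Hypothetical thresholds; the real entry condition is not given.
BUY_LEVEL = 1000.0
SELL_LEVEL = -1000.0

conditions = [
    df["Cum_bsdiffs"] > BUY_LEVEL,   # candidate "Buy"
    df["Cum_bsdiffs"] < SELL_LEVEL,  # candidate "Sell"
]
choices = ["Buy", "Sell"]

# Rows matching neither condition (NaN compares as False) get the string "0".
df["raw_signal"] = np.select(conditions, choices, default="0")

print(df)
```

This only yields the signal each row would produce on its own. Whether a position is actually open on a row still depends on the earlier rows through Exit, which is the part that column-wise operations cannot express directly.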

iterrows() will indeed be slow. I also recommend using NumPy and modifying the elements of the arrays as you loop over their length.

– kon_u
Nov 16 '18 at 14:34
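
One way to read this suggestion is a single pass over plain NumPy arrays that carries the position state from row to row. The sketch below is an interpretation of the rules in the question, not the asker's code: the function name run_state_machine, the integer encoding of the signal (+1 Buy, -1 Sell, 0 none), the target parameter, and the sample prices are assumptions.

```python
import numpy as np


def run_state_machine(ltp, signal, target):
    """Compute Entry, Exit, ltpchange and ltpcumchange in one pass.

    ltp    : float array of last traded prices
    signal : int array, +1 = Buy candidate, -1 = Sell candidate, 0 = none
    target : absolute cumulative LTP move that triggers an exit
    """
    n = len(ltp)
    entry = np.zeros(n, dtype=np.int8)
    exit_ = np.zeros(n, dtype=np.int8)
    change = np.zeros(n)
    cum = np.zeros(n)

    position = 0   # 0 = flat, +1 = Buy held, -1 = Sell held
    running = 0.0  # cumulative LTP change since the entry row

    for i in range(n):
        if position == 0:
            # Flat: open a position only if this row carries a signal.
            if signal[i] != 0:
                position = signal[i]
                running = 0.0
                entry[i] = position
        else:
            # In a position: carry Entry forward and track the LTP change.
            entry[i] = position
            change[i] = ltp[i] - ltp[i - 1]
            running += change[i]
            cum[i] = running
            if abs(running) >= target:
                exit_[i] = 1
                position = 0  # Entry reverts to 0 from the next row on
    return entry, exit_, change, cum


# Made-up prices and signals, for illustration only.
ltp = np.array([100.0, 100.5, 101.0, 101.2, 101.1])
signal = np.array([0, 1, 0, 0, -1])
entry, exit_, change, cum = run_state_machine(ltp, signal, target=0.5)

# Map the integer codes back to the labels used in the question.
labels = np.array(["Sell", "0", "Buy"])[entry + 1]
print(labels, exit_, cum)
```

Working on arrays rather than on rows returned by iterrows() avoids building a Series per row, and the results can be assigned back to the DataFrame columns afterwards. A loop of this shape is also the kind of code a JIT compiler such as Numba is typically applied to, though that step is not shown here.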

python pandas dataframe

