Manipulating Pandas Dataframe

I have a DataFrame A with one column location_ms. I want to split by ; and : to get DataFrame B.

DataFrame A (Beginning):

[image not preserved: Beginning]

DataFrame B (Final):

[image not preserved: Final]

My code below seems very roundabout, and I would love to see a better implementation. The splits produce a Series in which each element is a list of lists; I then flatten those lists to build the final DataFrame.

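import pandas as pd

def locpapersrc_table(df):
    # Split each cell on ';', then split every piece on ':' (a list of lists per row)
    toflattenrows = df['location_ms'].str.split(';').apply(lambda x: [c.split(':') for c in x]).values.tolist()
    # Flatten the per-row lists into a single list of pairs
    singlelistoflist = [item for sublist in toflattenrows for item in sublist]
    tmp = pd.DataFrame(singlelistoflist)
    return tmp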
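The list-of-lists step described above can be traced on a small frame. This is only a sketch: the sample values are hypothetical, since the original frames were posted as images, and the variable names `nested`, `flat`, and `result` are illustrative.

```python
import pandas as pd

# Hypothetical sample data; the real contents of location_ms are not known.
df = pd.DataFrame({"location_ms": ["a:1;b:2", "c:3"]})

# Step 1: split each cell on ';', then split every piece on ':'.
# Each element of `nested` is now a list of [left, right] pairs.
nested = df["location_ms"].str.split(";").apply(
    lambda parts: [p.split(":") for p in parts]
)

# Step 2: flatten the per-row lists into one list of pairs.
flat = [pair for row in nested for pair in row]

# Step 3: one output row per pair.
result = pd.DataFrame(flat)
print(result)
```

With this input the result has three rows and two columns, one row for each `left:right` piece.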

A second approach, version2, is slower than the first and just as roundabout.

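def version2(df):
    # One column per ';'-separated piece, transposed so each original row becomes a column
    xx = df["location_ms"].str.split(';', expand=True).T
    # Melt into one long column, drop missing pieces, then split each piece on ':'
    tmp = pd.melt(xx).dropna().drop(['variable'], axis=1)['value'].str.split(':', expand=True)
    return tmp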
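For comparison, here is a minimal sketch of the same reshaping with `Series.explode`. It assumes pandas 0.25 or later, which postdates this post, and it again uses hypothetical sample values because the original data was shown only as images.

```python
import pandas as pd

# Hypothetical sample data; the real contents of location_ms are not known.
df = pd.DataFrame({"location_ms": ["a:1;b:2", "c:3"]})

result = (
    df["location_ms"]
    .str.split(";")               # each cell becomes a list of 'left:right' pieces
    .explode()                    # one row per piece (original index is repeated)
    .str.split(":", expand=True)  # split each piece into two columns
    .reset_index(drop=True)       # replace the repeated index with 0..n-1
)
print(result)
```

This keeps everything in pandas string methods and avoids both the Python-level flattening and the transpose/melt round trip, though whether it is faster on real data would need to be measured.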

Thank You!

edited Nov 16 '18 at 21:33

asked Nov 16 '18 at 1:34

kradja


Please do not post code or DataFrames as images; post them as text.

– U9-Forward
Nov 16 '18 at 1:40

python pandas


1 Answer

Try something like this.

split_df = df['location_ms'].str.split(pat=";", expand=True)

Add something like this if you want to merge the result back into the original DataFrame.

df = df.merge(split_df, left_index=True, right_index=True)
df = df.drop(columns='location_ms')

For your new problem (splitting by ; and :):

split_df = df['location_ms'].str.split(pat=";", expand=True)
subsplit_df = pd.DataFrame(index = split_df.index)
for i in range(split_df.shape[1]):
 subsplit_df = subsplit_df.merge(split_df.iloc[:, i].str.split(pat=":", expand=True), left_index=True, right_index=True)
subsplit_df.columns = range(subsplit_df.shape[1])

You can merge it back in as above if you want.
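To see what shape this two-stage split produces, here is a small sketch on hypothetical sample values (the original data is not available). It is a variation rather than the code above: it swaps the repeated `merge` calls for a single `pd.concat(..., axis=1)`, which sidesteps the overlapping column labels that each sub-split shares.

```python
import pandas as pd

# Hypothetical sample data; the real contents of location_ms are not known.
df = pd.DataFrame({"location_ms": ["a:1;b:2", "c:3"]})

# First split: one column per ';'-separated piece.
split_df = df["location_ms"].str.split(pat=";", expand=True)

# Second split: break every piece on ':' and place the results side by side.
pieces = [split_df[col].str.split(pat=":", expand=True) for col in split_df.columns]
subsplit_df = pd.concat(pieces, axis=1)
subsplit_df.columns = range(subsplit_df.shape[1])
print(subsplit_df)
```

The result is wide: it keeps one row per original row and pads shorter rows with missing values. That differs from the long table built in the question, where every `left:right` piece gets its own row.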

edited Nov 16 '18 at 19:46

answered Nov 16 '18 at 1:54

CJR


You need to split by both delimiters, the ";" and the ":"

– kradja
Nov 16 '18 at 19:04

This does not work: delimiting by both characters gives a list of lists, which then has to be manipulated into the format of the final DataFrame.

– kradja
Nov 16 '18 at 19:11

"I want to split by ; and ; to get DataFrame B" is a direct quote from your problem. I've edited the answer to match your new criteria.

– CJR
Nov 16 '18 at 19:41

Oops, sorry about that typo! If you look at the initial DataFrame and the code, that sentence would not make sense. This is more roundabout than the code that I have. You should be using apply instead of iterating through the DataFrame.

– kradja
Nov 16 '18 at 21:30
