Manipulating Pandas Dataframe
I have a DataFrame A with one column location_ms. I want to split by ; and : to get DataFrame B.
DataFrame A(Beginning):
DataFrame B(Final):
My code below seems very roundabout, and I would love to see a better implementation. Doing the splits gives me a list of lists for each row, which I then flatten to create the final DataFrame.
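import pandas as pd

def locpapersrc_table(df):
    # Split each row on ';', then split each piece on ':' (a list of lists per row)
    toflattenrows = df['location_ms'].str.split(';').apply(lambda x: [c.split(':') for c in x]).values.tolist()
    # Flatten one level so that each split piece becomes one row
    singlelistoflist = [item for sublist in toflattenrows for item in sublist]
    tmp = pd.DataFrame(singlelistoflist)
    return tmp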
This version2 is slower than the first, but it is another method that is also very roundabout.
def version2(df):
    xx = df["location_ms"].str.split(';', expand=True).T
    tmp = pd.melt(xx).dropna().drop(['variable'], axis=1)['value'].str.split(':', expand=True)
    return tmp
Thank You!
python pandas
3
Please do not post code or dataframes as images, make them text please
– U9-Forward
Nov 16 '18 at 1:40
edited Nov 16 '18 at 21:33 by kradja
asked Nov 16 '18 at 1:34 by kradja
1 Answer
Try something like this.
split_df = df['location_ms'].str.split(pat=";", expand=True)
Throw in something like this if you want to merge it back into the original dataframe.
df = df.merge(split_df, left_index=True, right_index=True)
df = df.drop(columns='location_ms')
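As a rough sketch of that split-and-merge step on a tiny frame: the sample values and the extra id column below are made up for illustration, since the real data was not posted as text.

```python
import pandas as pd

# Hypothetical sample data; the real values are not known.
df = pd.DataFrame({"id": [1, 2], "location_ms": ["a:1;b:2", "c:3"]})

# One column per ';'-separated piece; rows with fewer pieces are padded with missing values.
split_df = df["location_ms"].str.split(pat=";", expand=True)

# Join the pieces back on the index, then drop the original column by name.
merged = df.merge(split_df, left_index=True, right_index=True)
merged = merged.drop(columns="location_ms")
print(merged)
```

Note that drop needs columns= (or axis=1) here; without it, pandas looks for a row labelled 'location_ms'.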
For your new problem (splitting by ; and :):
split_df = df['location_ms'].str.split(pat=";", expand=True)
subsplit_df = pd.DataFrame(index = split_df.index)
for i in range(split_df.shape[1]):
    subsplit_df = subsplit_df.merge(split_df.iloc[:, i].str.split(pat=":", expand=True), left_index=True, right_index=True)
    # Renumber the columns on each pass so the next merge does not collide on column names
    subsplit_df.columns = range(subsplit_df.shape[1])
You can merge it back in as above if you want.
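If the goal is one row per ;-separated piece (as in the flattening code in the question), a shorter route may be Series.explode, which is available in pandas 0.25 and later. This is only a sketch: the sample values and the helper name split_to_rows are made up for illustration, since the real data was not posted as text.

```python
import pandas as pd

# Hypothetical sample data; the real values are not known.
df = pd.DataFrame({"location_ms": ["a:1;b:2", "c:3"]})

def split_to_rows(frame, column="location_ms"):
    # Split on ';' into lists, then give every list element its own row.
    pieces = frame[column].str.split(";").explode()
    # Drop missing values and empty strings (e.g. from a trailing ';').
    pieces = pieces[pieces.notna() & (pieces != "")]
    # Split each piece on ':' into separate columns and renumber the rows.
    return pieces.str.split(":", expand=True).reset_index(drop=True)

result = split_to_rows(df)
print(result)
```

This avoids both the Python-level flattening and the column-by-column merge, at the cost of discarding the original row index.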
You need to split by both delimiters, the ";" and the ":"
– kradja
Nov 16 '18 at 19:04
This does not work, since you have a list of lists when delimiting by both characters, which then has to be manipulated into the format of the final dataframe.
– kradja
Nov 16 '18 at 19:11
"I want to split by ; and ; to get DataFrame B" is a direct quote from your problem. I've edited the answer to match your new criteria.
– CJR
Nov 16 '18 at 19:41
Oops sorry about that typo! If you look at the initial Dataframe and code that sentence would not make sense. Sorry about the mistake! This is more roundabout than the code that I have. You should be using apply instead of iterating through the dataframe.
– kradja
Nov 16 '18 at 21:30
edited Nov 16 '18 at 19:46
answered Nov 16 '18 at 1:54 by CJR