Reshaping data.frame from wide to long format
I have some trouble to convert my data.frame from a wide table to a long table.
At the moment it looks like this:
Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246
Now I like to transform this data.frame into a long data.frame.
Something like this:
Code Country Year Value
AFG Afghanistan 1950 20,249
AFG Afghanistan 1951 21,352
AFG Afghanistan 1952 22,532
AFG Afghanistan 1953 23,557
AFG Afghanistan 1954 24,555
ALB Albania 1950 8,097
ALB Albania 1951 8,986
ALB Albania 1952 10,058
ALB Albania 1953 11,123
ALB Albania 1954 12,246
I have looked and tried it already with the melt() and the reshape() functions
as some people were suggesting similar questions.
However, so far I only get messy results.
If it is possible I would like to do it with the reshape() function since
it looks a little bit nicer to handle.
r dataframe reshape r-faq
add a comment |
I have some trouble to convert my data.frame from a wide table to a long table.
At the moment it looks like this:
Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246
Now I like to transform this data.frame into a long data.frame.
Something like this:
Code Country Year Value
AFG Afghanistan 1950 20,249
AFG Afghanistan 1951 21,352
AFG Afghanistan 1952 22,532
AFG Afghanistan 1953 23,557
AFG Afghanistan 1954 24,555
ALB Albania 1950 8,097
ALB Albania 1951 8,986
ALB Albania 1952 10,058
ALB Albania 1953 11,123
ALB Albania 1954 12,246
I have looked and tried it already with the melt() and the reshape() functions
as some people were suggesting similar questions.
However, so far I only get messy results.
If it is possible I would like to do it with the reshape() function since
it looks a little bit nicer to handle.
r dataframe reshape r-faq
2
Don't know if that was the problem, but the functions in the reshape package are melt and cast (and recast.)
– Eduardo Leoni
Feb 2 '10 at 17:51
1
And the reshape package has been superseded by reshape2.
– 42-
Sep 16 '14 at 0:10
3
And now reshape2 has been superseded by tidyr.
– drhagen
Feb 15 '16 at 13:37
add a comment |
I have some trouble to convert my data.frame from a wide table to a long table.
At the moment it looks like this:
Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246
Now I like to transform this data.frame into a long data.frame.
Something like this:
Code Country Year Value
AFG Afghanistan 1950 20,249
AFG Afghanistan 1951 21,352
AFG Afghanistan 1952 22,532
AFG Afghanistan 1953 23,557
AFG Afghanistan 1954 24,555
ALB Albania 1950 8,097
ALB Albania 1951 8,986
ALB Albania 1952 10,058
ALB Albania 1953 11,123
ALB Albania 1954 12,246
I have looked and tried it already with the melt() and the reshape() functions
as some people were suggesting similar questions.
However, so far I only get messy results.
If it is possible I would like to do it with the reshape() function since
it looks a little bit nicer to handle.
r dataframe reshape r-faq
I have some trouble to convert my data.frame from a wide table to a long table.
At the moment it looks like this:
Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246
Now I like to transform this data.frame into a long data.frame.
Something like this:
Code Country Year Value
AFG Afghanistan 1950 20,249
AFG Afghanistan 1951 21,352
AFG Afghanistan 1952 22,532
AFG Afghanistan 1953 23,557
AFG Afghanistan 1954 24,555
ALB Albania 1950 8,097
ALB Albania 1951 8,986
ALB Albania 1952 10,058
ALB Albania 1953 11,123
ALB Albania 1954 12,246
I have looked and tried it already with the melt() and the reshape() functions
as some people were suggesting similar questions.
However, so far I only get messy results.
If it is possible I would like to do it with the reshape() function since
it looks a little bit nicer to handle.
r dataframe reshape r-faq
r dataframe reshape r-faq
edited Jul 14 at 19:09
Saranjith
4,43612361
4,43612361
asked Feb 2 '10 at 15:36
mropa
5,97392729
5,97392729
2
Don't know if that was the problem, but the functions in the reshape package are melt and cast (and recast.)
– Eduardo Leoni
Feb 2 '10 at 17:51
1
And the reshape package has been superseded by reshape2.
– 42-
Sep 16 '14 at 0:10
3
And now reshape2 has been superseded by tidyr.
– drhagen
Feb 15 '16 at 13:37
add a comment |
2
Don't know if that was the problem, but the functions in the reshape package are melt and cast (and recast.)
– Eduardo Leoni
Feb 2 '10 at 17:51
1
And the reshape package has been superseded by reshape2.
– 42-
Sep 16 '14 at 0:10
3
And now reshape2 has been superseded by tidyr.
– drhagen
Feb 15 '16 at 13:37
2
2
Don't know if that was the problem, but the functions in the reshape package are melt and cast (and recast.)
– Eduardo Leoni
Feb 2 '10 at 17:51
Don't know if that was the problem, but the functions in the reshape package are melt and cast (and recast.)
– Eduardo Leoni
Feb 2 '10 at 17:51
1
1
And the reshape package has been superseded by reshape2.
– 42-
Sep 16 '14 at 0:10
And the reshape package has been superseded by reshape2.
– 42-
Sep 16 '14 at 0:10
3
3
And now reshape2 has been superseded by tidyr.
– drhagen
Feb 15 '16 at 13:37
And now reshape2 has been superseded by tidyr.
– drhagen
Feb 15 '16 at 13:37
add a comment |
5 Answers
5
active
oldest
votes
reshape() takes a while to get used to, just as melt/cast. Here is a solution with reshape, assuming your data frame is called d:
reshape(d, direction = "long", varying = list(names(d)[3:7]), v.names = "Value",
idvar = c("Code","Country"), timevar = "Year", times = 1950:1954)
add a comment |
Three alternative solutions:
1: With reshape2
library(reshape2)
long <- melt(wide, id.vars = c("Code", "Country"))
giving:
Code Country variable value
1 AFG Afghanistan 1950 20,249
2 ALB Albania 1950 8,097
3 AFG Afghanistan 1951 21,352
4 ALB Albania 1951 8,986
5 AFG Afghanistan 1952 22,532
6 ALB Albania 1952 10,058
7 AFG Afghanistan 1953 23,557
8 ALB Albania 1953 11,123
9 AFG Afghanistan 1954 24,555
10 ALB Albania 1954 12,246
Some alternative notations that give the same result:
# you can also define the id-variables by column number
melt(wide, id.vars = 1:2)
# as an alternative you can also specify the measure-variables
# all other variables will then be used as id-variables
melt(wide, measure.vars = 3:7)
melt(wide, measure.vars = as.character(1950:1954))
2: With data.table
You can use the same melt function as in the reshape2 package (which is an extended & improved implementation). melt from data.table has also more parameters that the melt-function from reshape2. You can for example also specify the name of the variable-column:
library(data.table)
long <- melt(setDT(wide), id.vars = c("Code","Country"), variable.name = "year")
Some alternative notations:
melt(setDT(wide), id.vars = 1:2, variable.name = "year")
melt(setDT(wide), measure.vars = 3:7, variable.name = "year")
melt(setDT(wide), measure.vars = as.character(1950:1954), variable.name = "year")
3: With tidyr
library(tidyr)
long <- wide %>% gather(year, value, -c(Code, Country))
Some alternative notations:
wide %>% gather(year, value, -Code, -Country)
wide %>% gather(year, value, -1:-2)
wide %>% gather(year, value, -(1:2))
wide %>% gather(year, value, -1, -2)
wide %>% gather(year, value, 3:7)
wide %>% gather(year, value, `1950`:`1954`)
If you want to exclude NA values, you can add na.rm = TRUE to the melt as well as the gather functions.
Another problem with the data is that the values will be read by R as character-values (as a result of the , in the numbers). You can repair that with gsub and as.numeric:
long$value <- as.numeric(gsub(",", "", long$value))
Or directly with data.table or dplyr:
# data.table
long <- melt(setDT(wide),
id.vars = c("Code","Country"),
variable.name = "year")[, value := as.numeric(gsub(",", "", value))]
# tidyr and dplyr
long <- wide %>% gather(year, value, -c(Code,Country)) %>%
mutate(value = as.numeric(gsub(",", "", value)))
Data:
wide <- read.table(text="Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246", header=TRUE, check.names=FALSE)
great answer, just one more tiny reminder : do not put any variables other thanidandtimein your data frame,meltcould not tell what you want to do in this case.
– Jason Goal
Oct 19 '17 at 11:11
1
@JasonGoal Could you elaborate on that? As I'm interpreting you comment, it shouldn't be a problem. Just specify both theid.varsand themeasure.vars.
– Jaap
Oct 19 '17 at 11:55
,then that's good for me, don't knowid.varsand themeasure.varscan be specified in the first alternative,sorry for the mess, its my fault.
– Jason Goal
Oct 20 '17 at 5:34
add a comment |
Using reshape package:
#data
x <- read.table(textConnection(
"Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246"), header=TRUE)
library(reshape)
x2 <- melt(x, id = c("Code", "Country"), variable_name = "Year")
x2[,"Year"] <- as.numeric(gsub("X", "" , x2[,"Year"]))
add a comment |
Since this answer is tagged with r-faq, I felt it would be useful to share another alternative from base R: stack.
Note, however, that stack does not work with factors--it only works if is.vector is TRUE, and from the documentation for is.vector, we find that:
is.vectorreturnsTRUEif x is a vector of the specified mode having no attributes other than names. It returnsFALSEotherwise.
I'm using the sample data from @Jaap's answer, where the values in the year columns are factors.
Here's the stack approach:
cbind(wide[1:2], stack(lapply(wide[-c(1, 2)], as.character)))
## Code Country values ind
## 1 AFG Afghanistan 20,249 1950
## 2 ALB Albania 8,097 1950
## 3 AFG Afghanistan 21,352 1951
## 4 ALB Albania 8,986 1951
## 5 AFG Afghanistan 22,532 1952
## 6 ALB Albania 10,058 1952
## 7 AFG Afghanistan 23,557 1953
## 8 ALB Albania 11,123 1953
## 9 AFG Afghanistan 24,555 1954
## 10 ALB Albania 12,246 1954
never heard ofstack, but it does exactly what I need and does so as simply as possible. Thanks! +1
– theforestecologist
Jun 17 at 14:35
add a comment |
Here is another example showing the use of gather from tidyr. You can select the columns to gather either by removing them individually (as I do here), or by including the years you want explicitly.
Note that, to handle the commas (and X's added if check.names = FALSE is not set), I am also using dplyr's mutate with parse_number from readr to convert the text values back to numbers. These are all part of the tidyverse and so can be loaded together with library(tidyverse)
wide %>%
gather(Year, Value, -Code, -Country) %>%
mutate(Year = parse_number(Year)
, Value = parse_number(Value))
Returns:
Code Country Year Value
1 AFG Afghanistan 1950 20249
2 ALB Albania 1950 8097
3 AFG Afghanistan 1951 21352
4 ALB Albania 1951 8986
5 AFG Afghanistan 1952 22532
6 ALB Albania 1952 10058
7 AFG Afghanistan 1953 23557
8 ALB Albania 1953 11123
9 AFG Afghanistan 1954 24555
10 ALB Albania 1954 12246
add a comment |
protected by Jaap May 2 '16 at 5:11
Thank you for your interest in this question.
Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).
Would you like to answer one of these unanswered questions instead?
5 Answers
5
active
oldest
votes
5 Answers
5
active
oldest
votes
active
oldest
votes
active
oldest
votes
reshape() takes a while to get used to, just as melt/cast. Here is a solution with reshape, assuming your data frame is called d:
reshape(d, direction = "long", varying = list(names(d)[3:7]), v.names = "Value",
idvar = c("Code","Country"), timevar = "Year", times = 1950:1954)
add a comment |
reshape() takes a while to get used to, just as melt/cast. Here is a solution with reshape, assuming your data frame is called d:
reshape(d, direction = "long", varying = list(names(d)[3:7]), v.names = "Value",
idvar = c("Code","Country"), timevar = "Year", times = 1950:1954)
add a comment |
reshape() takes a while to get used to, just as melt/cast. Here is a solution with reshape, assuming your data frame is called d:
reshape(d, direction = "long", varying = list(names(d)[3:7]), v.names = "Value",
idvar = c("Code","Country"), timevar = "Year", times = 1950:1954)
reshape() takes a while to get used to, just as melt/cast. Here is a solution with reshape, assuming your data frame is called d:
reshape(d, direction = "long", varying = list(names(d)[3:7]), v.names = "Value",
idvar = c("Code","Country"), timevar = "Year", times = 1950:1954)
edited Feb 12 at 10:41
Jaap
54.9k20117130
54.9k20117130
answered Feb 2 '10 at 16:07
Aniko
13.9k33743
13.9k33743
add a comment |
add a comment |
Three alternative solutions:
1: With reshape2
library(reshape2)
long <- melt(wide, id.vars = c("Code", "Country"))
giving:
Code Country variable value
1 AFG Afghanistan 1950 20,249
2 ALB Albania 1950 8,097
3 AFG Afghanistan 1951 21,352
4 ALB Albania 1951 8,986
5 AFG Afghanistan 1952 22,532
6 ALB Albania 1952 10,058
7 AFG Afghanistan 1953 23,557
8 ALB Albania 1953 11,123
9 AFG Afghanistan 1954 24,555
10 ALB Albania 1954 12,246
Some alternative notations that give the same result:
# you can also define the id-variables by column number
melt(wide, id.vars = 1:2)
# as an alternative you can also specify the measure-variables
# all other variables will then be used as id-variables
melt(wide, measure.vars = 3:7)
melt(wide, measure.vars = as.character(1950:1954))
2: With data.table
You can use the same melt function as in the reshape2 package (which is an extended & improved implementation). melt from data.table has also more parameters that the melt-function from reshape2. You can for example also specify the name of the variable-column:
library(data.table)
long <- melt(setDT(wide), id.vars = c("Code","Country"), variable.name = "year")
Some alternative notations:
melt(setDT(wide), id.vars = 1:2, variable.name = "year")
melt(setDT(wide), measure.vars = 3:7, variable.name = "year")
melt(setDT(wide), measure.vars = as.character(1950:1954), variable.name = "year")
3: With tidyr
library(tidyr)
long <- wide %>% gather(year, value, -c(Code, Country))
Some alternative notations:
wide %>% gather(year, value, -Code, -Country)
wide %>% gather(year, value, -1:-2)
wide %>% gather(year, value, -(1:2))
wide %>% gather(year, value, -1, -2)
wide %>% gather(year, value, 3:7)
wide %>% gather(year, value, `1950`:`1954`)
If you want to exclude NA values, you can add na.rm = TRUE to the melt as well as the gather functions.
Another problem with the data is that the values will be read by R as character-values (as a result of the , in the numbers). You can repair that with gsub and as.numeric:
long$value <- as.numeric(gsub(",", "", long$value))
Or directly with data.table or dplyr:
# data.table
long <- melt(setDT(wide),
id.vars = c("Code","Country"),
variable.name = "year")[, value := as.numeric(gsub(",", "", value))]
# tidyr and dplyr
long <- wide %>% gather(year, value, -c(Code,Country)) %>%
mutate(value = as.numeric(gsub(",", "", value)))
Data:
wide <- read.table(text="Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246", header=TRUE, check.names=FALSE)
great answer, just one more tiny reminder : do not put any variables other thanidandtimein your data frame,meltcould not tell what you want to do in this case.
– Jason Goal
Oct 19 '17 at 11:11
1
@JasonGoal Could you elaborate on that? As I'm interpreting you comment, it shouldn't be a problem. Just specify both theid.varsand themeasure.vars.
– Jaap
Oct 19 '17 at 11:55
,then that's good for me, don't knowid.varsand themeasure.varscan be specified in the first alternative,sorry for the mess, its my fault.
– Jason Goal
Oct 20 '17 at 5:34
add a comment |
Three alternative solutions:
1: With reshape2
library(reshape2)
long <- melt(wide, id.vars = c("Code", "Country"))
giving:
Code Country variable value
1 AFG Afghanistan 1950 20,249
2 ALB Albania 1950 8,097
3 AFG Afghanistan 1951 21,352
4 ALB Albania 1951 8,986
5 AFG Afghanistan 1952 22,532
6 ALB Albania 1952 10,058
7 AFG Afghanistan 1953 23,557
8 ALB Albania 1953 11,123
9 AFG Afghanistan 1954 24,555
10 ALB Albania 1954 12,246
Some alternative notations that give the same result:
# you can also define the id-variables by column number
melt(wide, id.vars = 1:2)
# as an alternative you can also specify the measure-variables
# all other variables will then be used as id-variables
melt(wide, measure.vars = 3:7)
melt(wide, measure.vars = as.character(1950:1954))
2: With data.table
You can use the same melt function as in the reshape2 package (which is an extended & improved implementation). melt from data.table has also more parameters that the melt-function from reshape2. You can for example also specify the name of the variable-column:
library(data.table)
long <- melt(setDT(wide), id.vars = c("Code","Country"), variable.name = "year")
Some alternative notations:
melt(setDT(wide), id.vars = 1:2, variable.name = "year")
melt(setDT(wide), measure.vars = 3:7, variable.name = "year")
melt(setDT(wide), measure.vars = as.character(1950:1954), variable.name = "year")
3: With tidyr
library(tidyr)
long <- wide %>% gather(year, value, -c(Code, Country))
Some alternative notations:
wide %>% gather(year, value, -Code, -Country)
wide %>% gather(year, value, -1:-2)
wide %>% gather(year, value, -(1:2))
wide %>% gather(year, value, -1, -2)
wide %>% gather(year, value, 3:7)
wide %>% gather(year, value, `1950`:`1954`)
If you want to exclude NA values, you can add na.rm = TRUE to the melt as well as the gather functions.
Another problem with the data is that the values will be read by R as character-values (as a result of the , in the numbers). You can repair that with gsub and as.numeric:
long$value <- as.numeric(gsub(",", "", long$value))
Or directly with data.table or dplyr:
# data.table
long <- melt(setDT(wide),
id.vars = c("Code","Country"),
variable.name = "year")[, value := as.numeric(gsub(",", "", value))]
# tidyr and dplyr
long <- wide %>% gather(year, value, -c(Code,Country)) %>%
mutate(value = as.numeric(gsub(",", "", value)))
Data:
wide <- read.table(text="Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246", header=TRUE, check.names=FALSE)
great answer, just one more tiny reminder : do not put any variables other thanidandtimein your data frame,meltcould not tell what you want to do in this case.
– Jason Goal
Oct 19 '17 at 11:11
1
@JasonGoal Could you elaborate on that? As I'm interpreting you comment, it shouldn't be a problem. Just specify both theid.varsand themeasure.vars.
– Jaap
Oct 19 '17 at 11:55
,then that's good for me, don't knowid.varsand themeasure.varscan be specified in the first alternative,sorry for the mess, its my fault.
– Jason Goal
Oct 20 '17 at 5:34
add a comment |
Three alternative solutions:
1: With reshape2
library(reshape2)
long <- melt(wide, id.vars = c("Code", "Country"))
giving:
Code Country variable value
1 AFG Afghanistan 1950 20,249
2 ALB Albania 1950 8,097
3 AFG Afghanistan 1951 21,352
4 ALB Albania 1951 8,986
5 AFG Afghanistan 1952 22,532
6 ALB Albania 1952 10,058
7 AFG Afghanistan 1953 23,557
8 ALB Albania 1953 11,123
9 AFG Afghanistan 1954 24,555
10 ALB Albania 1954 12,246
Some alternative notations that give the same result:
# you can also define the id-variables by column number
melt(wide, id.vars = 1:2)
# as an alternative you can also specify the measure-variables
# all other variables will then be used as id-variables
melt(wide, measure.vars = 3:7)
melt(wide, measure.vars = as.character(1950:1954))
2: With data.table
You can use the same melt function as in the reshape2 package (which is an extended & improved implementation). melt from data.table has also more parameters that the melt-function from reshape2. You can for example also specify the name of the variable-column:
library(data.table)
long <- melt(setDT(wide), id.vars = c("Code","Country"), variable.name = "year")
Some alternative notations:
melt(setDT(wide), id.vars = 1:2, variable.name = "year")
melt(setDT(wide), measure.vars = 3:7, variable.name = "year")
melt(setDT(wide), measure.vars = as.character(1950:1954), variable.name = "year")
3: With tidyr
library(tidyr)
long <- wide %>% gather(year, value, -c(Code, Country))
Some alternative notations:
wide %>% gather(year, value, -Code, -Country)
wide %>% gather(year, value, -1:-2)
wide %>% gather(year, value, -(1:2))
wide %>% gather(year, value, -1, -2)
wide %>% gather(year, value, 3:7)
wide %>% gather(year, value, `1950`:`1954`)
If you want to exclude NA values, you can add na.rm = TRUE to the melt as well as the gather functions.
Another problem with the data is that the values will be read by R as character-values (as a result of the , in the numbers). You can repair that with gsub and as.numeric:
long$value <- as.numeric(gsub(",", "", long$value))
Or directly with data.table or dplyr:
# data.table
long <- melt(setDT(wide),
id.vars = c("Code","Country"),
variable.name = "year")[, value := as.numeric(gsub(",", "", value))]
# tidyr and dplyr
long <- wide %>% gather(year, value, -c(Code,Country)) %>%
mutate(value = as.numeric(gsub(",", "", value)))
Data:
wide <- read.table(text="Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246", header=TRUE, check.names=FALSE)
Three alternative solutions:
1: With reshape2
library(reshape2)
long <- melt(wide, id.vars = c("Code", "Country"))
giving:
Code Country variable value
1 AFG Afghanistan 1950 20,249
2 ALB Albania 1950 8,097
3 AFG Afghanistan 1951 21,352
4 ALB Albania 1951 8,986
5 AFG Afghanistan 1952 22,532
6 ALB Albania 1952 10,058
7 AFG Afghanistan 1953 23,557
8 ALB Albania 1953 11,123
9 AFG Afghanistan 1954 24,555
10 ALB Albania 1954 12,246
Some alternative notations that give the same result:
# you can also define the id-variables by column number
melt(wide, id.vars = 1:2)
# as an alternative you can also specify the measure-variables
# all other variables will then be used as id-variables
melt(wide, measure.vars = 3:7)
melt(wide, measure.vars = as.character(1950:1954))
2: With data.table
You can use the same melt function as in the reshape2 package (which is an extended & improved implementation). melt from data.table has also more parameters that the melt-function from reshape2. You can for example also specify the name of the variable-column:
library(data.table)
long <- melt(setDT(wide), id.vars = c("Code","Country"), variable.name = "year")
Some alternative notations:
melt(setDT(wide), id.vars = 1:2, variable.name = "year")
melt(setDT(wide), measure.vars = 3:7, variable.name = "year")
melt(setDT(wide), measure.vars = as.character(1950:1954), variable.name = "year")
3: With tidyr
library(tidyr)
long <- wide %>% gather(year, value, -c(Code, Country))
Some alternative notations:
wide %>% gather(year, value, -Code, -Country)
wide %>% gather(year, value, -1:-2)
wide %>% gather(year, value, -(1:2))
wide %>% gather(year, value, -1, -2)
wide %>% gather(year, value, 3:7)
wide %>% gather(year, value, `1950`:`1954`)
If you want to exclude NA values, you can add na.rm = TRUE to the melt as well as the gather functions.
Another problem with the data is that the values will be read by R as character-values (as a result of the , in the numbers). You can repair that with gsub and as.numeric:
long$value <- as.numeric(gsub(",", "", long$value))
Or directly with data.table or dplyr:
# data.table
long <- melt(setDT(wide),
id.vars = c("Code","Country"),
variable.name = "year")[, value := as.numeric(gsub(",", "", value))]
# tidyr and dplyr
long <- wide %>% gather(year, value, -c(Code,Country)) %>%
mutate(value = as.numeric(gsub(",", "", value)))
Data:
wide <- read.table(text="Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246", header=TRUE, check.names=FALSE)
edited Aug 22 at 14:46
answered Sep 15 '14 at 20:09
Jaap
54.9k20117130
54.9k20117130
great answer, just one more tiny reminder : do not put any variables other thanidandtimein your data frame,meltcould not tell what you want to do in this case.
– Jason Goal
Oct 19 '17 at 11:11
1
@JasonGoal Could you elaborate on that? As I'm interpreting you comment, it shouldn't be a problem. Just specify both theid.varsand themeasure.vars.
– Jaap
Oct 19 '17 at 11:55
,then that's good for me, don't knowid.varsand themeasure.varscan be specified in the first alternative,sorry for the mess, its my fault.
– Jason Goal
Oct 20 '17 at 5:34
add a comment |
great answer, just one more tiny reminder : do not put any variables other thanidandtimein your data frame,meltcould not tell what you want to do in this case.
– Jason Goal
Oct 19 '17 at 11:11
1
@JasonGoal Could you elaborate on that? As I'm interpreting you comment, it shouldn't be a problem. Just specify both theid.varsand themeasure.vars.
– Jaap
Oct 19 '17 at 11:55
,then that's good for me, don't knowid.varsand themeasure.varscan be specified in the first alternative,sorry for the mess, its my fault.
– Jason Goal
Oct 20 '17 at 5:34
great answer, just one more tiny reminder : do not put any variables other than
id andtime in your data frame, melt could not tell what you want to do in this case.– Jason Goal
Oct 19 '17 at 11:11
great answer, just one more tiny reminder : do not put any variables other than
id andtime in your data frame, melt could not tell what you want to do in this case.– Jason Goal
Oct 19 '17 at 11:11
1
1
@JasonGoal Could you elaborate on that? As I'm interpreting you comment, it shouldn't be a problem. Just specify both the
id.vars and the measure.vars.– Jaap
Oct 19 '17 at 11:55
@JasonGoal Could you elaborate on that? As I'm interpreting you comment, it shouldn't be a problem. Just specify both the
id.vars and the measure.vars.– Jaap
Oct 19 '17 at 11:55
,then that's good for me, don't know
id.vars and the measure.vars can be specified in the first alternative,sorry for the mess, its my fault.– Jason Goal
Oct 20 '17 at 5:34
,then that's good for me, don't know
id.vars and the measure.vars can be specified in the first alternative,sorry for the mess, its my fault.– Jason Goal
Oct 20 '17 at 5:34
add a comment |
Using reshape package:
#data
x <- read.table(textConnection(
"Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246"), header=TRUE)
library(reshape)
x2 <- melt(x, id = c("Code", "Country"), variable_name = "Year")
x2[,"Year"] <- as.numeric(gsub("X", "" , x2[,"Year"]))
add a comment |
Using reshape package:
#data
x <- read.table(textConnection(
"Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246"), header=TRUE)
library(reshape)
x2 <- melt(x, id = c("Code", "Country"), variable_name = "Year")
x2[,"Year"] <- as.numeric(gsub("X", "" , x2[,"Year"]))
add a comment |
Using reshape package:
#data
x <- read.table(textConnection(
"Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246"), header=TRUE)
library(reshape)
x2 <- melt(x, id = c("Code", "Country"), variable_name = "Year")
x2[,"Year"] <- as.numeric(gsub("X", "" , x2[,"Year"]))
Using reshape package:
#data
x <- read.table(textConnection(
"Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246"), header=TRUE)
library(reshape)
x2 <- melt(x, id = c("Code", "Country"), variable_name = "Year")
x2[,"Year"] <- as.numeric(gsub("X", "" , x2[,"Year"]))
edited Jan 9 at 10:20
zx8754
29.1k76398
29.1k76398
answered Feb 2 '10 at 16:08
Shane
77.3k27197211
77.3k27197211
add a comment |
add a comment |
Since this answer is tagged with r-faq, I felt it would be useful to share another alternative from base R: stack.
Note, however, that stack does not work with factors--it only works if is.vector is TRUE, and from the documentation for is.vector, we find that:
is.vectorreturnsTRUEif x is a vector of the specified mode having no attributes other than names. It returnsFALSEotherwise.
I'm using the sample data from @Jaap's answer, where the values in the year columns are factors.
Here's the stack approach:
cbind(wide[1:2], stack(lapply(wide[-c(1, 2)], as.character)))
## Code Country values ind
## 1 AFG Afghanistan 20,249 1950
## 2 ALB Albania 8,097 1950
## 3 AFG Afghanistan 21,352 1951
## 4 ALB Albania 8,986 1951
## 5 AFG Afghanistan 22,532 1952
## 6 ALB Albania 10,058 1952
## 7 AFG Afghanistan 23,557 1953
## 8 ALB Albania 11,123 1953
## 9 AFG Afghanistan 24,555 1954
## 10 ALB Albania 12,246 1954
never heard ofstack, but it does exactly what I need and does so as simply as possible. Thanks! +1
– theforestecologist
Jun 17 at 14:35
add a comment |
Since this answer is tagged with r-faq, I felt it would be useful to share another alternative from base R: stack.
Note, however, that stack does not work with factors--it only works if is.vector is TRUE, and from the documentation for is.vector, we find that:
is.vectorreturnsTRUEif x is a vector of the specified mode having no attributes other than names. It returnsFALSEotherwise.
I'm using the sample data from @Jaap's answer, where the values in the year columns are factors.
Here's the stack approach:
cbind(wide[1:2], stack(lapply(wide[-c(1, 2)], as.character)))
## Code Country values ind
## 1 AFG Afghanistan 20,249 1950
## 2 ALB Albania 8,097 1950
## 3 AFG Afghanistan 21,352 1951
## 4 ALB Albania 8,986 1951
## 5 AFG Afghanistan 22,532 1952
## 6 ALB Albania 10,058 1952
## 7 AFG Afghanistan 23,557 1953
## 8 ALB Albania 11,123 1953
## 9 AFG Afghanistan 24,555 1954
## 10 ALB Albania 12,246 1954
never heard ofstack, but it does exactly what I need and does so as simply as possible. Thanks! +1
– theforestecologist
Jun 17 at 14:35
add a comment |
Since this answer is tagged with r-faq, I felt it would be useful to share another alternative from base R: stack.
Note, however, that stack does not work with factors--it only works if is.vector is TRUE, and from the documentation for is.vector, we find that:
is.vectorreturnsTRUEif x is a vector of the specified mode having no attributes other than names. It returnsFALSEotherwise.
I'm using the sample data from @Jaap's answer, where the values in the year columns are factors.
Here's the stack approach:
cbind(wide[1:2], stack(lapply(wide[-c(1, 2)], as.character)))
## Code Country values ind
## 1 AFG Afghanistan 20,249 1950
## 2 ALB Albania 8,097 1950
## 3 AFG Afghanistan 21,352 1951
## 4 ALB Albania 8,986 1951
## 5 AFG Afghanistan 22,532 1952
## 6 ALB Albania 10,058 1952
## 7 AFG Afghanistan 23,557 1953
## 8 ALB Albania 11,123 1953
## 9 AFG Afghanistan 24,555 1954
## 10 ALB Albania 12,246 1954
Since this answer is tagged with r-faq, I felt it would be useful to share another alternative from base R: stack.
Note, however, that stack does not work with factors--it only works if is.vector is TRUE, and from the documentation for is.vector, we find that:
is.vectorreturnsTRUEif x is a vector of the specified mode having no attributes other than names. It returnsFALSEotherwise.
I'm using the sample data from @Jaap's answer, where the values in the year columns are factors.
Here's the stack approach:
cbind(wide[1:2], stack(lapply(wide[-c(1, 2)], as.character)))
## Code Country values ind
## 1 AFG Afghanistan 20,249 1950
## 2 ALB Albania 8,097 1950
## 3 AFG Afghanistan 21,352 1951
## 4 ALB Albania 8,986 1951
## 5 AFG Afghanistan 22,532 1952
## 6 ALB Albania 10,058 1952
## 7 AFG Afghanistan 23,557 1953
## 8 ALB Albania 11,123 1953
## 9 AFG Afghanistan 24,555 1954
## 10 ALB Albania 12,246 1954
edited Jan 9 at 5:52
answered Jan 9 at 5:31
A5C1D2H2I1M1N2O1R2T1
152k18282375
152k18282375
never heard ofstack, but it does exactly what I need and does so as simply as possible. Thanks! +1
– theforestecologist
Jun 17 at 14:35
add a comment |
never heard ofstack, but it does exactly what I need and does so as simply as possible. Thanks! +1
– theforestecologist
Jun 17 at 14:35
never heard of
stack, but it does exactly what I need and does so as simply as possible. Thanks! +1– theforestecologist
Jun 17 at 14:35
never heard of
stack, but it does exactly what I need and does so as simply as possible. Thanks! +1– theforestecologist
Jun 17 at 14:35
add a comment |
Here is another example showing the use of gather from tidyr. You can select the columns to gather either by removing them individually (as I do here), or by including the years you want explicitly.
Note that, to handle the commas (and X's added if check.names = FALSE is not set), I am also using dplyr's mutate with parse_number from readr to convert the text values back to numbers. These are all part of the tidyverse and so can be loaded together with library(tidyverse)
wide %>%
gather(Year, Value, -Code, -Country) %>%
mutate(Year = parse_number(Year)
, Value = parse_number(Value))
Returns:
Code Country Year Value
1 AFG Afghanistan 1950 20249
2 ALB Albania 1950 8097
3 AFG Afghanistan 1951 21352
4 ALB Albania 1951 8986
5 AFG Afghanistan 1952 22532
6 ALB Albania 1952 10058
7 AFG Afghanistan 1953 23557
8 ALB Albania 1953 11123
9 AFG Afghanistan 1954 24555
10 ALB Albania 1954 12246
add a comment |
Here is another example showing the use of gather from tidyr. You can select the columns to gather either by removing them individually (as I do here), or by including the years you want explicitly.
Note that, to handle the commas (and X's added if check.names = FALSE is not set), I am also using dplyr's mutate with parse_number from readr to convert the text values back to numbers. These are all part of the tidyverse and so can be loaded together with library(tidyverse)
wide %>%
gather(Year, Value, -Code, -Country) %>%
mutate(Year = parse_number(Year)
, Value = parse_number(Value))
Returns:
Code Country Year Value
1 AFG Afghanistan 1950 20249
2 ALB Albania 1950 8097
3 AFG Afghanistan 1951 21352
4 ALB Albania 1951 8986
5 AFG Afghanistan 1952 22532
6 ALB Albania 1952 10058
7 AFG Afghanistan 1953 23557
8 ALB Albania 1953 11123
9 AFG Afghanistan 1954 24555
10 ALB Albania 1954 12246
add a comment |
Here is another example showing the use of gather from tidyr. You can select the columns to gather either by removing them individually (as I do here), or by including the years you want explicitly.
Note that, to handle the commas (and X's added if check.names = FALSE is not set), I am also using dplyr's mutate with parse_number from readr to convert the text values back to numbers. These are all part of the tidyverse and so can be loaded together with library(tidyverse)
wide %>%
gather(Year, Value, -Code, -Country) %>%
mutate(Year = parse_number(Year)
, Value = parse_number(Value))
Returns:
Code Country Year Value
1 AFG Afghanistan 1950 20249
2 ALB Albania 1950 8097
3 AFG Afghanistan 1951 21352
4 ALB Albania 1951 8986
5 AFG Afghanistan 1952 22532
6 ALB Albania 1952 10058
7 AFG Afghanistan 1953 23557
8 ALB Albania 1953 11123
9 AFG Afghanistan 1954 24555
10 ALB Albania 1954 12246
Here is another example showing the use of gather from tidyr. You can select the columns to gather either by removing them individually (as I do here), or by including the years you want explicitly.
Note that, to handle the commas (and X's added if check.names = FALSE is not set), I am also using dplyr's mutate with parse_number from readr to convert the text values back to numbers. These are all part of the tidyverse and so can be loaded together with library(tidyverse)
wide %>%
gather(Year, Value, -Code, -Country) %>%
mutate(Year = parse_number(Year)
, Value = parse_number(Value))
Returns:
Code Country Year Value
1 AFG Afghanistan 1950 20249
2 ALB Albania 1950 8097
3 AFG Afghanistan 1951 21352
4 ALB Albania 1951 8986
5 AFG Afghanistan 1952 22532
6 ALB Albania 1952 10058
7 AFG Afghanistan 1953 23557
8 ALB Albania 1953 11123
9 AFG Afghanistan 1954 24555
10 ALB Albania 1954 12246
answered Dec 4 '16 at 19:20
Mark Peterson
6,89721333
6,89721333
add a comment |
add a comment |
protected by Jaap May 2 '16 at 5:11
Thank you for your interest in this question.
Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).
Would you like to answer one of these unanswered questions instead?
2
Don't know if that was the problem, but the functions in the reshape package are melt and cast (and recast.)
– Eduardo Leoni
Feb 2 '10 at 17:51
1
And the reshape package has been superseded by reshape2.
– 42-
Sep 16 '14 at 0:10
3
And now reshape2 has been superseded by tidyr.
– drhagen
Feb 15 '16 at 13:37