Reshaping data.frame from wide to long format










105














I have some trouble to convert my data.frame from a wide table to a long table.
At the moment it looks like this:



Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246


Now I like to transform this data.frame into a long data.frame.
Something like this:



Code Country Year Value
AFG Afghanistan 1950 20,249
AFG Afghanistan 1951 21,352
AFG Afghanistan 1952 22,532
AFG Afghanistan 1953 23,557
AFG Afghanistan 1954 24,555
ALB Albania 1950 8,097
ALB Albania 1951 8,986
ALB Albania 1952 10,058
ALB Albania 1953 11,123
ALB Albania 1954 12,246


I have looked and tried it already with the melt() and the reshape() functions
as some people were suggesting similar questions.
However, so far I only get messy results.



If it is possible I would like to do it with the reshape() function since
it looks a little bit nicer to handle.










share|improve this question



















  • 2




    Don't know if that was the problem, but the functions in the reshape package are melt and cast (and recast.)
    – Eduardo Leoni
    Feb 2 '10 at 17:51






  • 1




    And the reshape package has been superseded by reshape2.
    – 42-
    Sep 16 '14 at 0:10






  • 3




    And now reshape2 has been superseded by tidyr.
    – drhagen
    Feb 15 '16 at 13:37















105














I have some trouble to convert my data.frame from a wide table to a long table.
At the moment it looks like this:



Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246


Now I like to transform this data.frame into a long data.frame.
Something like this:



Code Country Year Value
AFG Afghanistan 1950 20,249
AFG Afghanistan 1951 21,352
AFG Afghanistan 1952 22,532
AFG Afghanistan 1953 23,557
AFG Afghanistan 1954 24,555
ALB Albania 1950 8,097
ALB Albania 1951 8,986
ALB Albania 1952 10,058
ALB Albania 1953 11,123
ALB Albania 1954 12,246


I have looked and tried it already with the melt() and the reshape() functions
as some people were suggesting similar questions.
However, so far I only get messy results.



If it is possible I would like to do it with the reshape() function since
it looks a little bit nicer to handle.










share|improve this question



















  • 2




    Don't know if that was the problem, but the functions in the reshape package are melt and cast (and recast.)
    – Eduardo Leoni
    Feb 2 '10 at 17:51






  • 1




    And the reshape package has been superseded by reshape2.
    – 42-
    Sep 16 '14 at 0:10






  • 3




    And now reshape2 has been superseded by tidyr.
    – drhagen
    Feb 15 '16 at 13:37













105












105








105


54





I have some trouble to convert my data.frame from a wide table to a long table.
At the moment it looks like this:



Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246


Now I like to transform this data.frame into a long data.frame.
Something like this:



Code Country Year Value
AFG Afghanistan 1950 20,249
AFG Afghanistan 1951 21,352
AFG Afghanistan 1952 22,532
AFG Afghanistan 1953 23,557
AFG Afghanistan 1954 24,555
ALB Albania 1950 8,097
ALB Albania 1951 8,986
ALB Albania 1952 10,058
ALB Albania 1953 11,123
ALB Albania 1954 12,246


I have looked and tried it already with the melt() and the reshape() functions
as some people were suggesting similar questions.
However, so far I only get messy results.



If it is possible I would like to do it with the reshape() function since
it looks a little bit nicer to handle.










share|improve this question















I have some trouble to convert my data.frame from a wide table to a long table.
At the moment it looks like this:



Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
ALB Albania 8,097 8,986 10,058 11,123 12,246


Now I like to transform this data.frame into a long data.frame.
Something like this:



Code Country Year Value
AFG Afghanistan 1950 20,249
AFG Afghanistan 1951 21,352
AFG Afghanistan 1952 22,532
AFG Afghanistan 1953 23,557
AFG Afghanistan 1954 24,555
ALB Albania 1950 8,097
ALB Albania 1951 8,986
ALB Albania 1952 10,058
ALB Albania 1953 11,123
ALB Albania 1954 12,246


I have looked and tried it already with the melt() and the reshape() functions
as some people were suggesting similar questions.
However, so far I only get messy results.



If it is possible I would like to do it with the reshape() function since
it looks a little bit nicer to handle.







r dataframe reshape r-faq






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Jul 14 at 19:09









Saranjith

4,43612361




4,43612361










asked Feb 2 '10 at 15:36









mropa

5,97392729




5,97392729







  • 2




    Don't know if that was the problem, but the functions in the reshape package are melt and cast (and recast.)
    – Eduardo Leoni
    Feb 2 '10 at 17:51






  • 1




    And the reshape package has been superseded by reshape2.
    – 42-
    Sep 16 '14 at 0:10






  • 3




    And now reshape2 has been superseded by tidyr.
    – drhagen
    Feb 15 '16 at 13:37












  • 2




    Don't know if that was the problem, but the functions in the reshape package are melt and cast (and recast.)
    – Eduardo Leoni
    Feb 2 '10 at 17:51






  • 1




    And the reshape package has been superseded by reshape2.
    – 42-
    Sep 16 '14 at 0:10






  • 3




    And now reshape2 has been superseded by tidyr.
    – drhagen
    Feb 15 '16 at 13:37







2




2




Don't know if that was the problem, but the functions in the reshape package are melt and cast (and recast.)
– Eduardo Leoni
Feb 2 '10 at 17:51




Don't know if that was the problem, but the functions in the reshape package are melt and cast (and recast.)
– Eduardo Leoni
Feb 2 '10 at 17:51




1




1




And the reshape package has been superseded by reshape2.
– 42-
Sep 16 '14 at 0:10




And the reshape package has been superseded by reshape2.
– 42-
Sep 16 '14 at 0:10




3




3




And now reshape2 has been superseded by tidyr.
– drhagen
Feb 15 '16 at 13:37




And now reshape2 has been superseded by tidyr.
– drhagen
Feb 15 '16 at 13:37












5 Answers
5






active

oldest

votes


















65














reshape() takes a while to get used to, just as melt/cast. Here is a solution with reshape, assuming your data frame is called d:



reshape(d, direction = "long", varying = list(names(d)[3:7]), v.names = "Value", 
idvar = c("Code","Country"), timevar = "Year", times = 1950:1954)





share|improve this answer






























    92














    Three alternative solutions:



    1: With reshape2



    library(reshape2)
    long <- melt(wide, id.vars = c("Code", "Country"))


    giving:



     Code Country variable value
    1 AFG Afghanistan 1950 20,249
    2 ALB Albania 1950 8,097
    3 AFG Afghanistan 1951 21,352
    4 ALB Albania 1951 8,986
    5 AFG Afghanistan 1952 22,532
    6 ALB Albania 1952 10,058
    7 AFG Afghanistan 1953 23,557
    8 ALB Albania 1953 11,123
    9 AFG Afghanistan 1954 24,555
    10 ALB Albania 1954 12,246


    Some alternative notations that give the same result:



    # you can also define the id-variables by column number
    melt(wide, id.vars = 1:2)

    # as an alternative you can also specify the measure-variables
    # all other variables will then be used as id-variables
    melt(wide, measure.vars = 3:7)
    melt(wide, measure.vars = as.character(1950:1954))


    2: With data.table



    You can use the same melt function as in the reshape2 package (which is an extended & improved implementation). melt from data.table has also more parameters that the melt-function from reshape2. You can for example also specify the name of the variable-column:



    library(data.table)
    long <- melt(setDT(wide), id.vars = c("Code","Country"), variable.name = "year")


    Some alternative notations:



    melt(setDT(wide), id.vars = 1:2, variable.name = "year")
    melt(setDT(wide), measure.vars = 3:7, variable.name = "year")
    melt(setDT(wide), measure.vars = as.character(1950:1954), variable.name = "year")


    3: With tidyr



    library(tidyr)
    long <- wide %>% gather(year, value, -c(Code, Country))


    Some alternative notations:



    wide %>% gather(year, value, -Code, -Country)
    wide %>% gather(year, value, -1:-2)
    wide %>% gather(year, value, -(1:2))
    wide %>% gather(year, value, -1, -2)
    wide %>% gather(year, value, 3:7)
    wide %>% gather(year, value, `1950`:`1954`)



    If you want to exclude NA values, you can add na.rm = TRUE to the melt as well as the gather functions.




    Another problem with the data is that the values will be read by R as character-values (as a result of the , in the numbers). You can repair that with gsub and as.numeric:



    long$value <- as.numeric(gsub(",", "", long$value))


    Or directly with data.table or dplyr:



    # data.table
    long <- melt(setDT(wide),
    id.vars = c("Code","Country"),
    variable.name = "year")[, value := as.numeric(gsub(",", "", value))]

    # tidyr and dplyr
    long <- wide %>% gather(year, value, -c(Code,Country)) %>%
    mutate(value = as.numeric(gsub(",", "", value)))



    Data:



    wide <- read.table(text="Code Country 1950 1951 1952 1953 1954
    AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
    ALB Albania 8,097 8,986 10,058 11,123 12,246", header=TRUE, check.names=FALSE)





    share|improve this answer






















    • great answer, just one more tiny reminder : do not put any variables other than id andtime in your data frame, melt could not tell what you want to do in this case.
      – Jason Goal
      Oct 19 '17 at 11:11







    • 1




      @JasonGoal Could you elaborate on that? As I'm interpreting you comment, it shouldn't be a problem. Just specify both the id.vars and the measure.vars.
      – Jaap
      Oct 19 '17 at 11:55










    • ,then that's good for me, don't know id.vars and the measure.vars can be specified in the first alternative,sorry for the mess, its my fault.
      – Jason Goal
      Oct 20 '17 at 5:34


















    29














    Using reshape package:



    #data
    x <- read.table(textConnection(
    "Code Country 1950 1951 1952 1953 1954
    AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
    ALB Albania 8,097 8,986 10,058 11,123 12,246"), header=TRUE)

    library(reshape)

    x2 <- melt(x, id = c("Code", "Country"), variable_name = "Year")
    x2[,"Year"] <- as.numeric(gsub("X", "" , x2[,"Year"]))





    share|improve this answer






























      6














      Since this answer is tagged with r-faq, I felt it would be useful to share another alternative from base R: stack.



      Note, however, that stack does not work with factors--it only works if is.vector is TRUE, and from the documentation for is.vector, we find that:




      is.vector returns TRUE if x is a vector of the specified mode having no attributes other than names. It returns FALSE otherwise.




      I'm using the sample data from @Jaap's answer, where the values in the year columns are factors.



      Here's the stack approach:



      cbind(wide[1:2], stack(lapply(wide[-c(1, 2)], as.character)))
      ## Code Country values ind
      ## 1 AFG Afghanistan 20,249 1950
      ## 2 ALB Albania 8,097 1950
      ## 3 AFG Afghanistan 21,352 1951
      ## 4 ALB Albania 8,986 1951
      ## 5 AFG Afghanistan 22,532 1952
      ## 6 ALB Albania 10,058 1952
      ## 7 AFG Afghanistan 23,557 1953
      ## 8 ALB Albania 11,123 1953
      ## 9 AFG Afghanistan 24,555 1954
      ## 10 ALB Albania 12,246 1954





      share|improve this answer






















      • never heard of stack, but it does exactly what I need and does so as simply as possible. Thanks! +1
        – theforestecologist
        Jun 17 at 14:35


















      5














      Here is another example showing the use of gather from tidyr. You can select the columns to gather either by removing them individually (as I do here), or by including the years you want explicitly.



      Note that, to handle the commas (and X's added if check.names = FALSE is not set), I am also using dplyr's mutate with parse_number from readr to convert the text values back to numbers. These are all part of the tidyverse and so can be loaded together with library(tidyverse)



      wide %>%
      gather(Year, Value, -Code, -Country) %>%
      mutate(Year = parse_number(Year)
      , Value = parse_number(Value))


      Returns:



       Code Country Year Value
      1 AFG Afghanistan 1950 20249
      2 ALB Albania 1950 8097
      3 AFG Afghanistan 1951 21352
      4 ALB Albania 1951 8986
      5 AFG Afghanistan 1952 22532
      6 ALB Albania 1952 10058
      7 AFG Afghanistan 1953 23557
      8 ALB Albania 1953 11123
      9 AFG Afghanistan 1954 24555
      10 ALB Albania 1954 12246





      share|improve this answer



















        protected by Jaap May 2 '16 at 5:11



        Thank you for your interest in this question.
        Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).



        Would you like to answer one of these unanswered questions instead?














        5 Answers
        5






        active

        oldest

        votes








        5 Answers
        5






        active

        oldest

        votes









        active

        oldest

        votes






        active

        oldest

        votes









        65














        reshape() takes a while to get used to, just as melt/cast. Here is a solution with reshape, assuming your data frame is called d:



        reshape(d, direction = "long", varying = list(names(d)[3:7]), v.names = "Value", 
        idvar = c("Code","Country"), timevar = "Year", times = 1950:1954)





        share|improve this answer



























          65














          reshape() takes a while to get used to, just as melt/cast. Here is a solution with reshape, assuming your data frame is called d:



          reshape(d, direction = "long", varying = list(names(d)[3:7]), v.names = "Value", 
          idvar = c("Code","Country"), timevar = "Year", times = 1950:1954)





          share|improve this answer

























            65












            65








            65






            reshape() takes a while to get used to, just as melt/cast. Here is a solution with reshape, assuming your data frame is called d:



            reshape(d, direction = "long", varying = list(names(d)[3:7]), v.names = "Value", 
            idvar = c("Code","Country"), timevar = "Year", times = 1950:1954)





            share|improve this answer














            reshape() takes a while to get used to, just as melt/cast. Here is a solution with reshape, assuming your data frame is called d:



            reshape(d, direction = "long", varying = list(names(d)[3:7]), v.names = "Value", 
            idvar = c("Code","Country"), timevar = "Year", times = 1950:1954)






            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Feb 12 at 10:41









            Jaap

            54.9k20117130




            54.9k20117130










            answered Feb 2 '10 at 16:07









            Aniko

            13.9k33743




            13.9k33743























                92














                Three alternative solutions:



                1: With reshape2



                library(reshape2)
                long <- melt(wide, id.vars = c("Code", "Country"))


                giving:



                 Code Country variable value
                1 AFG Afghanistan 1950 20,249
                2 ALB Albania 1950 8,097
                3 AFG Afghanistan 1951 21,352
                4 ALB Albania 1951 8,986
                5 AFG Afghanistan 1952 22,532
                6 ALB Albania 1952 10,058
                7 AFG Afghanistan 1953 23,557
                8 ALB Albania 1953 11,123
                9 AFG Afghanistan 1954 24,555
                10 ALB Albania 1954 12,246


                Some alternative notations that give the same result:



                # you can also define the id-variables by column number
                melt(wide, id.vars = 1:2)

                # as an alternative you can also specify the measure-variables
                # all other variables will then be used as id-variables
                melt(wide, measure.vars = 3:7)
                melt(wide, measure.vars = as.character(1950:1954))


                2: With data.table



                You can use the same melt function as in the reshape2 package (which is an extended & improved implementation). melt from data.table has also more parameters that the melt-function from reshape2. You can for example also specify the name of the variable-column:



                library(data.table)
                long <- melt(setDT(wide), id.vars = c("Code","Country"), variable.name = "year")


                Some alternative notations:



                melt(setDT(wide), id.vars = 1:2, variable.name = "year")
                melt(setDT(wide), measure.vars = 3:7, variable.name = "year")
                melt(setDT(wide), measure.vars = as.character(1950:1954), variable.name = "year")


                3: With tidyr



                library(tidyr)
                long <- wide %>% gather(year, value, -c(Code, Country))


                Some alternative notations:



                wide %>% gather(year, value, -Code, -Country)
                wide %>% gather(year, value, -1:-2)
                wide %>% gather(year, value, -(1:2))
                wide %>% gather(year, value, -1, -2)
                wide %>% gather(year, value, 3:7)
                wide %>% gather(year, value, `1950`:`1954`)



                If you want to exclude NA values, you can add na.rm = TRUE to the melt as well as the gather functions.




                Another problem with the data is that the values will be read by R as character-values (as a result of the , in the numbers). You can repair that with gsub and as.numeric:



                long$value <- as.numeric(gsub(",", "", long$value))


                Or directly with data.table or dplyr:



                # data.table
                long <- melt(setDT(wide),
                id.vars = c("Code","Country"),
                variable.name = "year")[, value := as.numeric(gsub(",", "", value))]

                # tidyr and dplyr
                long <- wide %>% gather(year, value, -c(Code,Country)) %>%
                mutate(value = as.numeric(gsub(",", "", value)))



                Data:



                wide <- read.table(text="Code Country 1950 1951 1952 1953 1954
                AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
                ALB Albania 8,097 8,986 10,058 11,123 12,246", header=TRUE, check.names=FALSE)





                share|improve this answer






















                • great answer, just one more tiny reminder : do not put any variables other than id andtime in your data frame, melt could not tell what you want to do in this case.
                  – Jason Goal
                  Oct 19 '17 at 11:11







                • 1




                  @JasonGoal Could you elaborate on that? As I'm interpreting you comment, it shouldn't be a problem. Just specify both the id.vars and the measure.vars.
                  – Jaap
                  Oct 19 '17 at 11:55










                • ,then that's good for me, don't know id.vars and the measure.vars can be specified in the first alternative,sorry for the mess, its my fault.
                  – Jason Goal
                  Oct 20 '17 at 5:34















                92














                Three alternative solutions:



                1: With reshape2



                library(reshape2)
                long <- melt(wide, id.vars = c("Code", "Country"))


                giving:



                 Code Country variable value
                1 AFG Afghanistan 1950 20,249
                2 ALB Albania 1950 8,097
                3 AFG Afghanistan 1951 21,352
                4 ALB Albania 1951 8,986
                5 AFG Afghanistan 1952 22,532
                6 ALB Albania 1952 10,058
                7 AFG Afghanistan 1953 23,557
                8 ALB Albania 1953 11,123
                9 AFG Afghanistan 1954 24,555
                10 ALB Albania 1954 12,246


                Some alternative notations that give the same result:



                # you can also define the id-variables by column number
                melt(wide, id.vars = 1:2)

                # as an alternative you can also specify the measure-variables
                # all other variables will then be used as id-variables
                melt(wide, measure.vars = 3:7)
                melt(wide, measure.vars = as.character(1950:1954))


                2: With data.table



                You can use the same melt function as in the reshape2 package (which is an extended & improved implementation). melt from data.table has also more parameters that the melt-function from reshape2. You can for example also specify the name of the variable-column:



                library(data.table)
                long <- melt(setDT(wide), id.vars = c("Code","Country"), variable.name = "year")


                Some alternative notations:



                melt(setDT(wide), id.vars = 1:2, variable.name = "year")
                melt(setDT(wide), measure.vars = 3:7, variable.name = "year")
                melt(setDT(wide), measure.vars = as.character(1950:1954), variable.name = "year")


                3: With tidyr



                library(tidyr)
                long <- wide %>% gather(year, value, -c(Code, Country))


                Some alternative notations:



                wide %>% gather(year, value, -Code, -Country)
                wide %>% gather(year, value, -1:-2)
                wide %>% gather(year, value, -(1:2))
                wide %>% gather(year, value, -1, -2)
                wide %>% gather(year, value, 3:7)
                wide %>% gather(year, value, `1950`:`1954`)



                If you want to exclude NA values, you can add na.rm = TRUE to the melt as well as the gather functions.




                Another problem with the data is that the values will be read by R as character-values (as a result of the , in the numbers). You can repair that with gsub and as.numeric:



                long$value <- as.numeric(gsub(",", "", long$value))


                Or directly with data.table or dplyr:



                # data.table
                long <- melt(setDT(wide),
                id.vars = c("Code","Country"),
                variable.name = "year")[, value := as.numeric(gsub(",", "", value))]

                # tidyr and dplyr
                long <- wide %>% gather(year, value, -c(Code,Country)) %>%
                mutate(value = as.numeric(gsub(",", "", value)))



                Data:



                wide <- read.table(text="Code Country 1950 1951 1952 1953 1954
                AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
                ALB Albania 8,097 8,986 10,058 11,123 12,246", header=TRUE, check.names=FALSE)





                share|improve this answer






















                • great answer, just one more tiny reminder : do not put any variables other than id andtime in your data frame, melt could not tell what you want to do in this case.
                  – Jason Goal
                  Oct 19 '17 at 11:11







                • 1




                  @JasonGoal Could you elaborate on that? As I'm interpreting you comment, it shouldn't be a problem. Just specify both the id.vars and the measure.vars.
                  – Jaap
                  Oct 19 '17 at 11:55










                • ,then that's good for me, don't know id.vars and the measure.vars can be specified in the first alternative,sorry for the mess, its my fault.
                  – Jason Goal
                  Oct 20 '17 at 5:34













                92












                92








                92






                Three alternative solutions:



                1: With reshape2



                library(reshape2)
                long <- melt(wide, id.vars = c("Code", "Country"))


                giving:



                 Code Country variable value
                1 AFG Afghanistan 1950 20,249
                2 ALB Albania 1950 8,097
                3 AFG Afghanistan 1951 21,352
                4 ALB Albania 1951 8,986
                5 AFG Afghanistan 1952 22,532
                6 ALB Albania 1952 10,058
                7 AFG Afghanistan 1953 23,557
                8 ALB Albania 1953 11,123
                9 AFG Afghanistan 1954 24,555
                10 ALB Albania 1954 12,246


                Some alternative notations that give the same result:



                # you can also define the id-variables by column number
                melt(wide, id.vars = 1:2)

                # as an alternative you can also specify the measure-variables
                # all other variables will then be used as id-variables
                melt(wide, measure.vars = 3:7)
                melt(wide, measure.vars = as.character(1950:1954))


                2: With data.table



                You can use the same melt function as in the reshape2 package (which is an extended & improved implementation). melt from data.table has also more parameters that the melt-function from reshape2. You can for example also specify the name of the variable-column:



                library(data.table)
                long <- melt(setDT(wide), id.vars = c("Code","Country"), variable.name = "year")


                Some alternative notations:



                melt(setDT(wide), id.vars = 1:2, variable.name = "year")
                melt(setDT(wide), measure.vars = 3:7, variable.name = "year")
                melt(setDT(wide), measure.vars = as.character(1950:1954), variable.name = "year")


                3: With tidyr



                library(tidyr)
                long <- wide %>% gather(year, value, -c(Code, Country))


                Some alternative notations:



                wide %>% gather(year, value, -Code, -Country)
                wide %>% gather(year, value, -1:-2)
                wide %>% gather(year, value, -(1:2))
                wide %>% gather(year, value, -1, -2)
                wide %>% gather(year, value, 3:7)
                wide %>% gather(year, value, `1950`:`1954`)



                If you want to exclude NA values, you can add na.rm = TRUE to the melt as well as the gather functions.




                Another problem with the data is that the values will be read by R as character-values (as a result of the , in the numbers). You can repair that with gsub and as.numeric:



                long$value <- as.numeric(gsub(",", "", long$value))


                Or directly with data.table or dplyr:



                # data.table
                long <- melt(setDT(wide),
                id.vars = c("Code","Country"),
                variable.name = "year")[, value := as.numeric(gsub(",", "", value))]

                # tidyr and dplyr
                long <- wide %>% gather(year, value, -c(Code,Country)) %>%
                mutate(value = as.numeric(gsub(",", "", value)))



                Data:



                wide <- read.table(text="Code Country 1950 1951 1952 1953 1954
                AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
                ALB Albania 8,097 8,986 10,058 11,123 12,246", header=TRUE, check.names=FALSE)





                share|improve this answer














                Three alternative solutions:



                1: With reshape2



                library(reshape2)
                long <- melt(wide, id.vars = c("Code", "Country"))


                giving:



                 Code Country variable value
                1 AFG Afghanistan 1950 20,249
                2 ALB Albania 1950 8,097
                3 AFG Afghanistan 1951 21,352
                4 ALB Albania 1951 8,986
                5 AFG Afghanistan 1952 22,532
                6 ALB Albania 1952 10,058
                7 AFG Afghanistan 1953 23,557
                8 ALB Albania 1953 11,123
                9 AFG Afghanistan 1954 24,555
                10 ALB Albania 1954 12,246


                Some alternative notations that give the same result:



                # you can also define the id-variables by column number
                melt(wide, id.vars = 1:2)

                # as an alternative you can also specify the measure-variables
                # all other variables will then be used as id-variables
                melt(wide, measure.vars = 3:7)
                melt(wide, measure.vars = as.character(1950:1954))


                2: With data.table



                You can use the same melt function as in the reshape2 package (which is an extended & improved implementation). melt from data.table has also more parameters that the melt-function from reshape2. You can for example also specify the name of the variable-column:



                library(data.table)
                long <- melt(setDT(wide), id.vars = c("Code","Country"), variable.name = "year")


                Some alternative notations:



                melt(setDT(wide), id.vars = 1:2, variable.name = "year")
                melt(setDT(wide), measure.vars = 3:7, variable.name = "year")
                melt(setDT(wide), measure.vars = as.character(1950:1954), variable.name = "year")


                3: With tidyr



                library(tidyr)
                long <- wide %>% gather(year, value, -c(Code, Country))


                Some alternative notations:



                wide %>% gather(year, value, -Code, -Country)
                wide %>% gather(year, value, -1:-2)
                wide %>% gather(year, value, -(1:2))
                wide %>% gather(year, value, -1, -2)
                wide %>% gather(year, value, 3:7)
                wide %>% gather(year, value, `1950`:`1954`)



                If you want to exclude NA values, you can add na.rm = TRUE to the melt as well as the gather functions.




                Another problem with the data is that the values will be read by R as character-values (as a result of the , in the numbers). You can repair that with gsub and as.numeric:



                long$value <- as.numeric(gsub(",", "", long$value))


                Or directly with data.table or dplyr:



                # data.table
                long <- melt(setDT(wide),
                id.vars = c("Code","Country"),
                variable.name = "year")[, value := as.numeric(gsub(",", "", value))]

                # tidyr and dplyr
                long <- wide %>% gather(year, value, -c(Code,Country)) %>%
                mutate(value = as.numeric(gsub(",", "", value)))



                Data:



                wide <- read.table(text="Code Country 1950 1951 1952 1953 1954
                AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
                ALB Albania 8,097 8,986 10,058 11,123 12,246", header=TRUE, check.names=FALSE)






                share|improve this answer














                share|improve this answer



                share|improve this answer








                edited Aug 22 at 14:46

























                answered Sep 15 '14 at 20:09









                Jaap

                54.9k20117130




                54.9k20117130











                • great answer, just one more tiny reminder : do not put any variables other than id andtime in your data frame, melt could not tell what you want to do in this case.
                  – Jason Goal
                  Oct 19 '17 at 11:11







                • 1




                  @JasonGoal Could you elaborate on that? As I'm interpreting you comment, it shouldn't be a problem. Just specify both the id.vars and the measure.vars.
                  – Jaap
                  Oct 19 '17 at 11:55










                • ,then that's good for me, don't know id.vars and the measure.vars can be specified in the first alternative,sorry for the mess, its my fault.
                  – Jason Goal
                  Oct 20 '17 at 5:34
















                • great answer, just one more tiny reminder : do not put any variables other than id andtime in your data frame, melt could not tell what you want to do in this case.
                  – Jason Goal
                  Oct 19 '17 at 11:11







                • 1




                  @JasonGoal Could you elaborate on that? As I'm interpreting you comment, it shouldn't be a problem. Just specify both the id.vars and the measure.vars.
                  – Jaap
                  Oct 19 '17 at 11:55










                • ,then that's good for me, don't know id.vars and the measure.vars can be specified in the first alternative,sorry for the mess, its my fault.
                  – Jason Goal
                  Oct 20 '17 at 5:34















                great answer, just one more tiny reminder : do not put any variables other than id andtime in your data frame, melt could not tell what you want to do in this case.
                – Jason Goal
                Oct 19 '17 at 11:11





                great answer, just one more tiny reminder : do not put any variables other than id andtime in your data frame, melt could not tell what you want to do in this case.
                – Jason Goal
                Oct 19 '17 at 11:11





                1




                1




                @JasonGoal Could you elaborate on that? As I'm interpreting you comment, it shouldn't be a problem. Just specify both the id.vars and the measure.vars.
                – Jaap
                Oct 19 '17 at 11:55




                @JasonGoal Could you elaborate on that? As I'm interpreting you comment, it shouldn't be a problem. Just specify both the id.vars and the measure.vars.
                – Jaap
                Oct 19 '17 at 11:55












                ,then that's good for me, don't know id.vars and the measure.vars can be specified in the first alternative,sorry for the mess, its my fault.
                – Jason Goal
                Oct 20 '17 at 5:34




                ,then that's good for me, don't know id.vars and the measure.vars can be specified in the first alternative,sorry for the mess, its my fault.
                – Jason Goal
                Oct 20 '17 at 5:34











                29














                Using reshape package:



                #data
                x <- read.table(textConnection(
                "Code Country 1950 1951 1952 1953 1954
                AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
                ALB Albania 8,097 8,986 10,058 11,123 12,246"), header=TRUE)

                library(reshape)

                x2 <- melt(x, id = c("Code", "Country"), variable_name = "Year")
                x2[,"Year"] <- as.numeric(gsub("X", "" , x2[,"Year"]))





                share|improve this answer



























                  29














                  Using reshape package:



                  #data
                  x <- read.table(textConnection(
                  "Code Country 1950 1951 1952 1953 1954
                  AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
                  ALB Albania 8,097 8,986 10,058 11,123 12,246"), header=TRUE)

                  library(reshape)

                  x2 <- melt(x, id = c("Code", "Country"), variable_name = "Year")
                  x2[,"Year"] <- as.numeric(gsub("X", "" , x2[,"Year"]))





                  share|improve this answer

























                    29












                    29








                    29






                    Using reshape package:



                    #data
                    x <- read.table(textConnection(
                    "Code Country 1950 1951 1952 1953 1954
                    AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
                    ALB Albania 8,097 8,986 10,058 11,123 12,246"), header=TRUE)

                    library(reshape)

                    x2 <- melt(x, id = c("Code", "Country"), variable_name = "Year")
                    x2[,"Year"] <- as.numeric(gsub("X", "" , x2[,"Year"]))





                    share|improve this answer














                    Using reshape package:



                    #data
                    x <- read.table(textConnection(
                    "Code Country 1950 1951 1952 1953 1954
                    AFG Afghanistan 20,249 21,352 22,532 23,557 24,555
                    ALB Albania 8,097 8,986 10,058 11,123 12,246"), header=TRUE)

                    library(reshape)

                    x2 <- melt(x, id = c("Code", "Country"), variable_name = "Year")
                    x2[,"Year"] <- as.numeric(gsub("X", "" , x2[,"Year"]))






                    share|improve this answer














                    share|improve this answer



                    share|improve this answer








                    edited Jan 9 at 10:20









                    zx8754

                    29.1k76398




                    29.1k76398










                    answered Feb 2 '10 at 16:08









                    Shane

                    77.3k27197211




                    77.3k27197211





















                        6














                        Since this answer is tagged with r-faq, I felt it would be useful to share another alternative from base R: stack.



                        Note, however, that stack does not work with factors--it only works if is.vector is TRUE, and from the documentation for is.vector, we find that:




                        is.vector returns TRUE if x is a vector of the specified mode having no attributes other than names. It returns FALSE otherwise.




                        I'm using the sample data from @Jaap's answer, where the values in the year columns are factors.



                        Here's the stack approach:



                        cbind(wide[1:2], stack(lapply(wide[-c(1, 2)], as.character)))
                        ## Code Country values ind
                        ## 1 AFG Afghanistan 20,249 1950
                        ## 2 ALB Albania 8,097 1950
                        ## 3 AFG Afghanistan 21,352 1951
                        ## 4 ALB Albania 8,986 1951
                        ## 5 AFG Afghanistan 22,532 1952
                        ## 6 ALB Albania 10,058 1952
                        ## 7 AFG Afghanistan 23,557 1953
                        ## 8 ALB Albania 11,123 1953
                        ## 9 AFG Afghanistan 24,555 1954
                        ## 10 ALB Albania 12,246 1954





                        share|improve this answer






















                        • never heard of stack, but it does exactly what I need and does so as simply as possible. Thanks! +1
                          – theforestecologist
                          Jun 17 at 14:35















                        6














                        Since this answer is tagged with r-faq, I felt it would be useful to share another alternative from base R: stack.



                        Note, however, that stack does not work with factors--it only works if is.vector is TRUE, and from the documentation for is.vector, we find that:




                        is.vector returns TRUE if x is a vector of the specified mode having no attributes other than names. It returns FALSE otherwise.




                        I'm using the sample data from @Jaap's answer, where the values in the year columns are factors.



                        Here's the stack approach:



                        cbind(wide[1:2], stack(lapply(wide[-c(1, 2)], as.character)))
                        ## Code Country values ind
                        ## 1 AFG Afghanistan 20,249 1950
                        ## 2 ALB Albania 8,097 1950
                        ## 3 AFG Afghanistan 21,352 1951
                        ## 4 ALB Albania 8,986 1951
                        ## 5 AFG Afghanistan 22,532 1952
                        ## 6 ALB Albania 10,058 1952
                        ## 7 AFG Afghanistan 23,557 1953
                        ## 8 ALB Albania 11,123 1953
                        ## 9 AFG Afghanistan 24,555 1954
                        ## 10 ALB Albania 12,246 1954





                        share|improve this answer






















                        • never heard of stack, but it does exactly what I need and does so as simply as possible. Thanks! +1
                          – theforestecologist
                          Jun 17 at 14:35













                        6












                        6








                        6






                        Since this answer is tagged with r-faq, I felt it would be useful to share another alternative from base R: stack.



                        Note, however, that stack does not work with factors--it only works if is.vector is TRUE, and from the documentation for is.vector, we find that:




                        is.vector returns TRUE if x is a vector of the specified mode having no attributes other than names. It returns FALSE otherwise.




                        I'm using the sample data from @Jaap's answer, where the values in the year columns are factors.



                        Here's the stack approach:



                        cbind(wide[1:2], stack(lapply(wide[-c(1, 2)], as.character)))
                        ## Code Country values ind
                        ## 1 AFG Afghanistan 20,249 1950
                        ## 2 ALB Albania 8,097 1950
                        ## 3 AFG Afghanistan 21,352 1951
                        ## 4 ALB Albania 8,986 1951
                        ## 5 AFG Afghanistan 22,532 1952
                        ## 6 ALB Albania 10,058 1952
                        ## 7 AFG Afghanistan 23,557 1953
                        ## 8 ALB Albania 11,123 1953
                        ## 9 AFG Afghanistan 24,555 1954
                        ## 10 ALB Albania 12,246 1954





                        share|improve this answer














                        Since this answer is tagged with r-faq, I felt it would be useful to share another alternative from base R: stack.



                        Note, however, that stack does not work with factors--it only works if is.vector is TRUE, and from the documentation for is.vector, we find that:




                        is.vector returns TRUE if x is a vector of the specified mode having no attributes other than names. It returns FALSE otherwise.




                        I'm using the sample data from @Jaap's answer, where the values in the year columns are factors.



                        Here's the stack approach:



                        cbind(wide[1:2], stack(lapply(wide[-c(1, 2)], as.character)))
                        ## Code Country values ind
                        ## 1 AFG Afghanistan 20,249 1950
                        ## 2 ALB Albania 8,097 1950
                        ## 3 AFG Afghanistan 21,352 1951
                        ## 4 ALB Albania 8,986 1951
                        ## 5 AFG Afghanistan 22,532 1952
                        ## 6 ALB Albania 10,058 1952
                        ## 7 AFG Afghanistan 23,557 1953
                        ## 8 ALB Albania 11,123 1953
                        ## 9 AFG Afghanistan 24,555 1954
                        ## 10 ALB Albania 12,246 1954






                        share|improve this answer














                        share|improve this answer



                        share|improve this answer








                        edited Jan 9 at 5:52

























                        answered Jan 9 at 5:31









                        A5C1D2H2I1M1N2O1R2T1

                        152k18282375




                        152k18282375











                        • never heard of stack, but it does exactly what I need and does so as simply as possible. Thanks! +1
                          – theforestecologist
                          Jun 17 at 14:35
















                        • never heard of stack, but it does exactly what I need and does so as simply as possible. Thanks! +1
                          – theforestecologist
                          Jun 17 at 14:35















                        never heard of stack, but it does exactly what I need and does so as simply as possible. Thanks! +1
                        – theforestecologist
                        Jun 17 at 14:35




                        never heard of stack, but it does exactly what I need and does so as simply as possible. Thanks! +1
                        – theforestecologist
                        Jun 17 at 14:35











                        5














                        Here is another example showing the use of gather from tidyr. You can select the columns to gather either by removing them individually (as I do here), or by including the years you want explicitly.



                        Note that, to handle the commas (and X's added if check.names = FALSE is not set), I am also using dplyr's mutate with parse_number from readr to convert the text values back to numbers. These are all part of the tidyverse and so can be loaded together with library(tidyverse)



                        wide %>%
                        gather(Year, Value, -Code, -Country) %>%
                        mutate(Year = parse_number(Year)
                        , Value = parse_number(Value))


                        Returns:



                         Code Country Year Value
                        1 AFG Afghanistan 1950 20249
                        2 ALB Albania 1950 8097
                        3 AFG Afghanistan 1951 21352
                        4 ALB Albania 1951 8986
                        5 AFG Afghanistan 1952 22532
                        6 ALB Albania 1952 10058
                        7 AFG Afghanistan 1953 23557
                        8 ALB Albania 1953 11123
                        9 AFG Afghanistan 1954 24555
                        10 ALB Albania 1954 12246





                        share|improve this answer

























                          5














                          Here is another example showing the use of gather from tidyr. You can select the columns to gather either by removing them individually (as I do here), or by including the years you want explicitly.



                          Note that, to handle the commas (and X's added if check.names = FALSE is not set), I am also using dplyr's mutate with parse_number from readr to convert the text values back to numbers. These are all part of the tidyverse and so can be loaded together with library(tidyverse)



                          wide %>%
                          gather(Year, Value, -Code, -Country) %>%
                          mutate(Year = parse_number(Year)
                          , Value = parse_number(Value))


                          Returns:



                           Code Country Year Value
                          1 AFG Afghanistan 1950 20249
                          2 ALB Albania 1950 8097
                          3 AFG Afghanistan 1951 21352
                          4 ALB Albania 1951 8986
                          5 AFG Afghanistan 1952 22532
                          6 ALB Albania 1952 10058
                          7 AFG Afghanistan 1953 23557
                          8 ALB Albania 1953 11123
                          9 AFG Afghanistan 1954 24555
                          10 ALB Albania 1954 12246





                          share|improve this answer























                            5












                            5








                            5






                            Here is another example showing the use of gather from tidyr. You can select the columns to gather either by removing them individually (as I do here), or by including the years you want explicitly.



                            Note that, to handle the commas (and X's added if check.names = FALSE is not set), I am also using dplyr's mutate with parse_number from readr to convert the text values back to numbers. These are all part of the tidyverse and so can be loaded together with library(tidyverse)



                            wide %>%
                            gather(Year, Value, -Code, -Country) %>%
                            mutate(Year = parse_number(Year)
                            , Value = parse_number(Value))


                            Returns:



                             Code Country Year Value
                            1 AFG Afghanistan 1950 20249
                            2 ALB Albania 1950 8097
                            3 AFG Afghanistan 1951 21352
                            4 ALB Albania 1951 8986
                            5 AFG Afghanistan 1952 22532
                            6 ALB Albania 1952 10058
                            7 AFG Afghanistan 1953 23557
                            8 ALB Albania 1953 11123
                            9 AFG Afghanistan 1954 24555
                            10 ALB Albania 1954 12246





                            share|improve this answer












                            Here is another example showing the use of gather from tidyr. You can select the columns to gather either by removing them individually (as I do here), or by including the years you want explicitly.



                            Note that, to handle the commas (and X's added if check.names = FALSE is not set), I am also using dplyr's mutate with parse_number from readr to convert the text values back to numbers. These are all part of the tidyverse and so can be loaded together with library(tidyverse)



                            wide %>%
                            gather(Year, Value, -Code, -Country) %>%
                            mutate(Year = parse_number(Year)
                            , Value = parse_number(Value))


                            Returns:



                             Code Country Year Value
                            1 AFG Afghanistan 1950 20249
                            2 ALB Albania 1950 8097
                            3 AFG Afghanistan 1951 21352
                            4 ALB Albania 1951 8986
                            5 AFG Afghanistan 1952 22532
                            6 ALB Albania 1952 10058
                            7 AFG Afghanistan 1953 23557
                            8 ALB Albania 1953 11123
                            9 AFG Afghanistan 1954 24555
                            10 ALB Albania 1954 12246






                            share|improve this answer












                            share|improve this answer



                            share|improve this answer










                            answered Dec 4 '16 at 19:20









                            Mark Peterson

                            6,89721333




                            6,89721333















                                protected by Jaap May 2 '16 at 5:11



                                Thank you for your interest in this question.
                                Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).



                                Would you like to answer one of these unanswered questions instead?



                                Popular posts from this blog

                                27

                                Top Tejano songwriter Luis Silva dead of heart attack at 64

                                Category:Rhetoric