Plotting the mean and/or median of a set of observations

The data:



df = structure(list(obs_date = structure(c(17728, 17759, 17750, 17751, 
17759, 17777, 17778, 17779, 17780, 17751, 17759, 17773, 17779,
17759, 17773, 17777, 17784, 17722, 17759, 17750, 17759, 17724,
17759, 17760, 17780, 17781, 17740, 17759, 17779, 17780, 17777,
17759, 17765, 17759, 17760, 17766, 17774, 17750, 17759, 17779,
17740, 17759, 17779, 17716, 17732, 17735, 17736, 17760, 17740,
17759, 17765), class = "Date", tzone = "Australia/Sydney"),
obs_value = c(0.104669, 0.109833, 0.196295, 0.2, 0.21, 0.21422, 0.21, 0.202339, 0.2,
0.24, 0.24, 0.25, 0.24, 0.209645, 0.204462, 0.204462, 0.2042,
NA, NA, 0.204, 0.224486, 0.142, 0.142, 0.144, 0.144, 0.15, NA,
0.22, 0.22, 0.22, 0.23, 0.208, 0.208, 0.213781, 0.213781, 0.23111,
0.23111, 0.2, 0.190581, 0.188411, 0.318, 0.208, 0.204, 0.31,
0.31, 0.21, 0.21, 0.21, 0.25, 0.21, 0.21),
obs_id = c("2HN", "2HN", "4GJ", "4GJ", "4GJ", "4GJ", "4GJ", "4GJ", "4GJ", "KFM",
"KFM", "KFM", "KFM", "N9S", "N9S", "N9S", "N9S", "NF7", "NF7",
"7Q6", "7Q6", "M6Q", "M6Q", "M6Q", "M6Q", "M6Q", "MW6", "YP0",
"YP0", "YP0", "ZG9", "D14", "D14", "MDY", "MDY", "MDY", "MDY",
"G3S", "G3S", "G3S", "J6Z", "J6Z", "J6Z", "6RU", "6RU", "6RU",
"6RU", "6RU", "6ZE", "6ZE", "6ZE")), class = "data.frame", row.names = c(NA, -51L))


In the data frame df, each obs_id identifies an individual who estimates the value of a variable, and obs_value is the value that individual recorded on obs_date.
An observation persists until a new observation is recorded, according to the observation date.



The observations are plotted below:



library(plotly)

# One stepped line per observer (obs_id)
plot_ly(data = df, x = ~obs_date, y = ~obs_value,
        type = 'scatter', mode = 'lines',
        line = list(shape = "hvh"),
        color = ~obs_id)


The question:
Is there a way to overlay/display the median/mean observation (over the full observation period) in the same chart?










Tags: r, charts, plotly

asked Nov 16 '18 at 4:30 by Tony2016

1 Answer

You can try a tidyverse approach:



library(tidyverse)

df %>%
  group_by(obs_id) %>%
  # Per-observer date range and mean value
  mutate(start = min(obs_date),
         end   = max(obs_date),
         Mean  = mean(obs_value, na.rm = TRUE)) %>%
  ggplot(aes(obs_date, obs_value, color = obs_id)) +
  geom_point() +
  # One horizontal segment per observer, drawn at that observer's mean
  geom_segment(data = . %>% distinct(obs_id, start, end, Mean),
               aes(x = start, xend = end, y = Mean, yend = Mean))





Afterwards, you can try converting the plot to a plotly object: store the ggplot in a variable (for example the_plot) and call library(plotly); ggplotly(the_plot).
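A minimal sketch of that conversion step. The data frame toy and its values are hypothetical stand-ins for df, and the variable names the_plot and interactive_plot are only illustrative:

```r
library(ggplot2)
library(plotly)

# Toy data standing in for df (hypothetical values, not from the question)
toy <- data.frame(
  obs_date  = as.Date(c("2018-07-01", "2018-07-10", "2018-07-05", "2018-07-15")),
  obs_value = c(0.20, 0.21, 0.24, 0.25),
  obs_id    = c("A", "A", "B", "B")
)

# Store the ggplot in a variable instead of printing it directly
the_plot <- ggplot(toy, aes(obs_date, obs_value, color = obs_id)) +
  geom_point()

# Convert the static ggplot into an interactive plotly widget
interactive_plot <- ggplotly(the_plot)
interactive_plot
```

Not every ggplot2 layer translates perfectly, so it is worth checking the converted plot against the static one.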



Following your comment, you can try:



df %>%
  ggplot(aes(obs_date, obs_value)) +
  geom_point(aes(color = obs_id)) +
  # Bars show the mean of the values recorded on each date;
  # `fun` replaces `fun.y`, which is deprecated since ggplot2 3.3.0
  stat_summary(fun = "mean", geom = "bar", alpha = 0.2)


• I should have been more specific. What I want is the global mean/median of all observations, not the individual means/medians of the various obs_ids. So the point on Jul 15 is the mean/median of all obs_value, where each obs_value is the last observation carried forward.
  – Tony2016, Nov 18 '18 at 4:57

• @Tony2016 see my edits.
  – Jimbou, Nov 19 '18 at 8:35
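For the global summary described in the first comment, one possible approach (a sketch, not the answerer's method) is to expand the data to a full date-by-observer grid, carry each observer's last value forward, and then summarise across observers per date. The data frame toy and its values are hypothetical stand-ins for df; dropping NA observations before filling is an assumption, as is the "hv" step shape (the question used "hvh"), chosen here so each value is held until the next date.

```r
library(dplyr)
library(tidyr)
library(plotly)

# Toy data standing in for df (hypothetical values, not from the question)
toy <- data.frame(
  obs_date  = as.Date(c("2018-07-01", "2018-07-03", "2018-07-02", "2018-07-04")),
  obs_value = c(0.20, 0.30, 0.10, 0.50),
  obs_id    = c("A", "A", "B", "B")
)

# Full date x observer grid, then last observation carried forward per observer.
# Assumption: NA observations are treated as "nothing recorded" and dropped first.
locf <- toy %>%
  filter(!is.na(obs_value)) %>%
  complete(obs_id,
           obs_date = seq(min(obs_date), max(obs_date), by = "day")) %>%
  arrange(obs_id, obs_date) %>%
  group_by(obs_id) %>%
  fill(obs_value, .direction = "down") %>%
  ungroup()

# Summarise across observers for each date; an observer who has not
# reported anything yet is still NA on that date and is ignored
daily <- locf %>%
  group_by(obs_date) %>%
  summarise(mean_value   = mean(obs_value, na.rm = TRUE),
            median_value = median(obs_value, na.rm = TRUE),
            .groups = "drop")

# Individual observers as stepped lines, with the global mean and median on top.
# inherit = FALSE stops the summary traces from inheriting color = ~obs_id.
p <- plot_ly(data = toy, x = ~obs_date, y = ~obs_value, color = ~obs_id,
             type = "scatter", mode = "lines",
             line = list(shape = "hv")) %>%
  add_lines(data = daily, x = ~obs_date, y = ~mean_value,
            name = "mean (LOCF)", inherit = FALSE,
            line = list(shape = "hv", color = "black", dash = "dash")) %>%
  add_lines(data = daily, x = ~obs_date, y = ~median_value,
            name = "median (LOCF)", inherit = FALSE,
            line = list(shape = "hv", color = "grey", dash = "dot"))
p
```

With the toy data, observer A still holds 0.20 on 2018-07-02 and observer B has just reported 0.10, so the carried-forward mean for that date is 0.15. To apply the sketch to the question's data, replace toy with df.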










answered Nov 16 '18 at 9:43 by Jimbou (edited Nov 19 '18 at 8:35)