How to extract sentences that contain a citation mark with R

For example, I have this string:



string = "The present paper describes an analysis of data from a cohort study of occupational stress in the Royal Navy (Bridger et al., 2010). Data from 2008 Phase III and 2010 Phase V of the survey were analysed to determine whether (cumulative) scores on the General Health Questionnaire (Goldberg and Williams, 1988) and the CFQ (Broadbent et al., 1982), were related to the occurrence of accidents over a three-year period (2007–2010)"


The result should be like this:



"The present paper describes an analysis of data from a cohort study of occupational stress in the Royal Navy (Bridger et al., 2010)."


Please help me!

  • There are multiple sentences with citations, e.g. "Goldberg and Williams, 1988" is in the second sentence. Do you not want those as well?

    – Mike
    Nov 13 '18 at 20:57

  • But the second sentence also contains (... et al., XXXX). What do you mean by a citation mark? What have you tried?

    – Wiktor Stribiżew
    Nov 13 '18 at 21:46

  • This appears to be homework, and the OP is also trying to subvert intellectual property rights (not that I like paywalled journals, but the law is the law) and get the SO community to do that for them. Anyone coming to this should review the OP's later questions to see the progression.

    – hrbrmstr
    Nov 14 '18 at 19:43

Tags: r, regex, string

edited Nov 13 '18 at 20:35 by markus
asked Nov 13 '18 at 20:33 by Alfonso Sorrentino

2 Answers
How about using stringi, the powerful underlying library that stringr wraps, to its fullest potential rather than relying on the crutch of regex hacks:

stringi::stri_split_boundaries(string, type="sentence")[[1]][1]

answered Nov 14 '18 at 2:03 by hrbrmstr
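
As the comments point out, the second sentence also contains citations, so taking only element [1] is specific to this example. Here is a minimal sketch, not part of the answer above, that splits the text with stringi and keeps every sentence containing a citation. It assumes a citation looks like "(Author ..., YYYY)"; the names citation_pattern, sentences and cited are made up for illustration.

```r
library(stringi)

# The string from the question; \u2013 is the en dash in "2007–2010".
string <- "The present paper describes an analysis of data from a cohort study of occupational stress in the Royal Navy (Bridger et al., 2010). Data from 2008 Phase III and 2010 Phase V of the survey were analysed to determine whether (cumulative) scores on the General Health Questionnaire (Goldberg and Williams, 1988) and the CFQ (Broadbent et al., 1982), were related to the occurrence of accidents over a three-year period (2007\u20132010)"

# Split on ICU sentence boundaries; each piece keeps its trailing
# whitespace, so trim it afterwards.
sentences <- stri_trim_both(
  stri_split_boundaries(string, type = "sentence")[[1]]
)

# Assumed citation shape: "(", text without parentheses, ", ",
# a four-digit year, ")". Adjust this for other citation styles.
citation_pattern <- "\\([^()]*, \\d{4}\\)"

# Keep only the sentences that contain at least one such citation.
cited <- sentences[stri_detect_regex(sentences, citation_pattern)]
print(cited)
```

With this pattern, "(cumulative)" and "(2007–2010)" are not treated as citations because they lack the ", YYYY" part.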

You could start with something like the following, where:
.* matches any character zero or more times
, \d{4}\)\. matches a comma followed by a space, exactly 4 digits, a closing parenthesis and a period, e.g. ", 2010)."
If you think the string might contain that sequence somewhere other than in a citation, or that the sentence you want is not at the start of the string, you may have to get more specific. Note that each backslash has to be doubled inside an R string literal.

library(stringr)
str_extract(string, ".*, \\d{4}\\)\\.")
#[1] "The present paper describes an analysis of data from a cohort study of occupational stress in the Royal Navy (Bridger et al., 2010)."

edited Nov 13 '18 at 20:57
answered Nov 13 '18 at 20:46 by jasbner
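
One way to "get more specific", sketched here in base R and not taken from the answer above, is to match the citation itself instead of everything up to it. The pattern assumes citations of the form "(Author ..., YYYY)", and the sentence split is a rough heuristic (whitespace after ".", "!" or "?" and before a capital letter); both are illustrative assumptions, as are the variable names.

```r
# The string from the question; \u2013 is the en dash in "2007–2010".
string <- "The present paper describes an analysis of data from a cohort study of occupational stress in the Royal Navy (Bridger et al., 2010). Data from 2008 Phase III and 2010 Phase V of the survey were analysed to determine whether (cumulative) scores on the General Health Questionnaire (Goldberg and Williams, 1988) and the CFQ (Broadbent et al., 1982), were related to the occurrence of accidents over a three-year period (2007\u20132010)"

# Assumed citation shape: "(", text without parentheses, ", ",
# a four-digit year, ")".
citation_pattern <- "\\([^()]*, \\d{4}\\)"

# Pull out every citation in the text.
citations <- regmatches(string, gregexpr(citation_pattern, string, perl = TRUE))[[1]]
print(citations)
#[1] "(Bridger et al., 2010)"        "(Goldberg and Williams, 1988)"
#[3] "(Broadbent et al., 1982)"

# Heuristic sentence split: break on whitespace that follows ., ! or ?
# and precedes a capital letter. "et al., " is not split because the
# period there is followed by a comma, not whitespace.
sentences <- strsplit(string, "(?<=[.!?])\\s+(?=[A-Z])", perl = TRUE)[[1]]

# Keep the sentences that contain at least one citation.
cited_sentences <- sentences[grepl(citation_pattern, sentences, perl = TRUE)]
print(cited_sentences)
```

The heuristic split will break on abbreviations such as "Dr. Smith", so a boundary-aware splitter is the safer choice for real text.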
