How to extract sentences that contain a citation mark with R
For example, I have this string:
string = "The present paper describes an analysis of data from a cohort study of occupational stress in the Royal Navy (Bridger et al., 2010). Data from 2008 Phase III and 2010 Phase V of the survey were analysed to determine whether (cumulative) scores on the General Health Questionnaire (Goldberg and Williams, 1988) and the CFQ (Broadbent et al., 1982), were related to the occurrence of accidents over a three-year period (2007–2010)"
The result should be like this:
"The present paper describes an analysis of data from a cohort study of occupational stress in the Royal Navy (Bridger et al., 2010)."
Please help me!
r regex string
edited Nov 13 '18 at 20:35 by markus
asked Nov 13 '18 at 20:33 by Alfonso Sorrentino
There are multiple sentences with citations, e.g. Goldberg and Williams, 1988 is in the second sentence. Do you not want them as well?
– Mike
Nov 13 '18 at 20:57
But the second sentence also contains (... et al., XXXX). What do you mean by a citation mark? What have you tried?
– Wiktor Stribiżew
Nov 13 '18 at 21:46
This appears to be homework, and the OP is also trying to subvert intellectual property rights (not that I like paywalled journals, but the law is the law) and get the SO community to do that for them. Anyone coming to this question should review the questions asked after it to see the progression.
– hrbrmstr
Nov 14 '18 at 19:43
2 Answers
How about just using stringi, the powerful underlying library that stringr wraps, and using it to its fullest potential rather than relying on crutches and regex hacks:
stringi::stri_split_boundaries(string, type="sentence")[[1]][1]
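As the comments point out, more than one sentence can carry a citation, so taking only the first element may not be enough. A minimal sketch of one way to extend the idea: split on sentence boundaries, then keep every sentence that matches a citation pattern. The sample text and the pattern (an opening parenthesis, some text, a comma, a four-digit year, a closing parenthesis) are assumptions for illustration and would need adjusting for other citation styles.

```r
library(stringi)

# Hypothetical sample text: the middle sentence has no citation.
text <- paste(
  "Stress was measured in a naval cohort (Bridger et al., 2010).",
  "The sample was large.",
  "Cognitive failures were scored with the CFQ (Broadbent et al., 1982)."
)

# Split on ICU sentence boundaries, then trim the whitespace
# that each piece keeps at its end.
sentences <- stri_trim_both(
  stri_split_boundaries(text, type = "sentence")[[1]]
)

# Assumed citation shape: "(..., YYYY)" with a four-digit year before ")".
citation_pattern <- "\\([^()]*, \\d{4}\\)"

# Keep only the sentences that contain at least one such citation.
cited <- sentences[stri_detect_regex(sentences, citation_pattern)]
print(cited)
```

Requiring the comma before the year keeps plain year ranges such as (2007–2010) from counting as citations, at the cost of missing styles that omit the comma.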
answered Nov 14 '18 at 2:03 by hrbrmstr
You could start with something like the following, where:
.* matches any character zero or more times, and
, \\d{4}\\)\\. matches a comma followed by a space, exactly 4 digits, a closing parenthesis and a period, e.g. ", 2010).".
If you think the string might contain that sequence somewhere other than in a citation, or that the sentence you want is not at the start of the string, you may have to get more specific.
library(stringr)
str_extract(string, ".*, \\d{4}\\)\\.")
#[1] "The present paper describes an analysis of data from a cohort study of occupational stress in the Royal Navy (Bridger et al., 2010)."
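One thing to watch for: `.*` is greedy, so when several sentences end in a citation the match runs from the start of the string to the last one. The sketch below contrasts that with a lazy `.*?` that returns one chunk per citation-ending sentence. It uses base R's `regmatches()` rather than stringr so that it has no package dependency, and the sample string is made up for illustration.

```r
# Hypothetical sample: two sentences, each ending in "(..., YYYY)."
x <- "A was shown (Smith, 2001). B was shown (Jones et al., 2005)."

# Greedy: .* expands as far as possible, so the single match
# ends at the LAST ", YYYY)." in the string.
greedy <- regmatches(x, regexpr(".*, \\d{4}\\)\\.", x, perl = TRUE))

# Lazy: .*? stops at the first ", YYYY)." it can reach, and
# gregexpr() repeats the search to collect every such chunk.
lazy <- trimws(
  regmatches(x, gregexpr(".*?, \\d{4}\\)\\.", x, perl = TRUE))[[1]]
)

print(greedy)
print(lazy)
```

Note that each lazy chunk starts where the previous match ended, so a sentence without a citation would be absorbed into the next chunk that does end in one. Splitting into sentences first avoids that.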
edited Nov 13 '18 at 20:57
answered Nov 13 '18 at 20:46 by jasbner