Parsing a number with commas with Javascript regex

I'm trying to parse numbers between 1 and 10,000,000 which can be straight digits (e.g. 123456) or with separating commas (1,234,567) between groups of 3 digits. The commas could also be spaces (1 234 567) or periods (1.234.567) but consistently used.
I have written the following:
<script type="text/javascript">
var re = /(\d{1,3})[ |,|.]?(\d{3})(?:[ |,|.]?(\d{3}))?/i;
function testStr(input) {
  var str = input.value;
  var newstr = str.replace(re, '[1]: $1\n[2]: $2\n[3]: $3');
  alert(newstr);
}
</script>
This works well, except that it also accepts input such as 1234,567,890 or 1,234,5678.
Groups of four consecutive digits should not be allowed. Why is this happening?
Thanks for any help.
javascript regex
asked Nov 16 '18 at 5:33 by Leo
I think it would be easier, and the code would be more readable if you tested against two separate regexes, one with and one without separators.
– Joel Cornett
Nov 16 '18 at 5:45
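A minimal sketch of the two-regex idea from the comment above. The names plainDigits, commaGrouped, and looksLikeNumber are hypothetical, only commas are handled for brevity, and the 1 to 10,000,000 range check is left out:

```javascript
// Hypothetical names; a sketch of the suggestion, not code from the thread.
// Pattern 1: plain digits with no separators (up to 8 digits, e.g. 10000000).
const plainDigits = /^\d{1,8}$/;
// Pattern 2: one to three digits, then one or two comma-separated groups of three.
const commaGrouped = /^\d{1,3}(?:,\d{3}){1,2}$/;

function looksLikeNumber(str) {
  return plainDigits.test(str) || commaGrouped.test(str);
}

console.log(looksLikeNumber('123456'));     // plain digits
console.log(looksLikeNumber('1,234,567'));  // comma-grouped
console.log(looksLikeNumber('1234,567'));   // mixed styles
```

Each pattern stays short enough to read at a glance, at the cost of needing one more pattern per additional separator style.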
I don't think this is a job for a regex. Unless you are absolutely certain the number is not mistyped (and you're not), a regex is a bad choice here. For example, what of the number 1,234.456 789? It has each of the separators, but they are not consistent. It does conform to the regex, though. You can write a regular expression to reject that, but it's not going to be pretty or easy to maintain, since you'd need a separate branch for each separator. It'd be easier to parse the number, find the separators, and then check that they are consistent as part of the other validations.
– VLAZ
Nov 16 '18 at 6:33
As a side note, you can even use a library to do the number parsing for you. That way, you won't have to tear your hair out when you find out about, say, the Indian system of writing numbers, where the grouping is done not only in threes or only in twos but in both, and a number can look like this: 1,00,000,00,00,000
– VLAZ
Nov 16 '18 at 6:36
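The "find the separators, then validate" approach from the first of these comments could look roughly like the sketch below. The function name parseGroupedInteger and the choice to return null on bad input are assumptions made for illustration; only Western groups of three are handled, not the Indian system:

```javascript
// Hypothetical helper: returns the parsed integer, or null if the input is malformed.
function parseGroupedInteger(str) {
  // Collect every separator character that appears in the input.
  const separators = str.match(/[ ,.]/g) || [];
  // All separators must be the same character.
  if (new Set(separators).size > 1) return null;

  const groups = separators.length ? str.split(separators[0]) : [str];
  // Every group must consist of digits only (this also rejects empty groups).
  if (!groups.every(g => /^\d+$/.test(g))) return null;

  if (separators.length) {
    // Leading group: 1 to 3 digits; every later group: exactly 3 digits.
    if (groups[0].length > 3) return null;
    if (!groups.slice(1).every(g => g.length === 3)) return null;
  }

  // Range check taken from the question: 1 to 10,000,000.
  const value = Number(groups.join(''));
  return value >= 1 && value <= 10000000 ? value : null;
}

console.log(parseGroupedInteger('1 234 567'));
console.log(parseGroupedInteger('1,234.456 789'));
```

Because each rule is a separate statement, a rejected input can be traced to the specific check that failed, which is harder to do with a single pattern.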
1 Answer
One option is
^(\d{1,3})(?:([ ,.]?)(\d{3})(?:\2(\d{3}))?)?$
The idea is to capture the separator used (if any; if there is no separator, the empty string is captured). Then, at each later point where a separator is expected, backreference the separator that was found before, to ensure that all separators are the same, whether they're spaces, commas, periods, or nothing at all. Also, if you need to parse numbers between 1 and 10,000,000, then you should put everything past the initial (\d{1,3}) in an optional group.
Note that commas and periods do not need to be escaped in a character set, and | in a character set indicates a literal pipe character, so just use [ ,.] instead.
Also use ^ and $ anchors to ensure the match starts at the very beginning of the string and runs to the end of the string (forcing the match to fail otherwise).
https://regex101.com/r/2dFk0f/1
(\d{1,3}) - One to three digits, followed by an optional big non-capturing group of ([ ,.]?)(\d{3})(?:\2(\d{3}))?, which is:
([ ,.]?) - Capture the separator used
(\d{3}) - Three digits
(?:\2(\d{3}))? - If the number is 1m or greater, a separator is expected, so backreference the separator that was captured before, followed by three more digits. (If the number is less than 1m, then this optional group won't match)
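As a rough illustration of the pattern above in use, here is a small sketch. The helper name toInteger is hypothetical. Because the separator now occupies capture group 2, the digit groups are groups 1, 3, and 4, not 1, 2, and 3 as in the question's replacement string:

```javascript
// The regex from the answer, with the separator captured in group 2.
const re = /^(\d{1,3})(?:([ ,.]?)(\d{3})(?:\2(\d{3}))?)?$/;

// Hypothetical helper: join the digit groups (1, 3, 4) into a number, or return null.
function toInteger(str) {
  const m = re.exec(str);
  if (!m) return null;
  // Groups that did not participate in the match are undefined; drop them.
  return Number([m[1], m[3], m[4]].filter(Boolean).join(''));
}

const samples = ['123456', '1,234,567', '1 234 567', '1.234.567',
                 '1234,567', '1,234,5678', '1,234.567'];
for (const s of samples) {
  console.log(s, '->', toInteger(s));
}
```

Note that the pattern only checks the shape of the input; it accepts up to nine digits, so the 1 to 10,000,000 bound from the question would still need a separate numeric check.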
answered Nov 16 '18 at 5:47 by CertainPerformance
The backreference works, but what if you also have to support a delimiter for decimal fractions? $1,000,000.50 can be read as "one million dollars and 50 cents", while another culture could express this as $1 000 000,50 or $1.000.000,50. You'd be able to backreference the separators, but I don't think you can say "but the decimal delimiter is the same sans the back reference". And then you run into a problem with, say, 3.500 - did somebody type three and a half with an extra zero, or three thousand five hundred? I know it's not a requirement of the OP's, but I fear that's only "not yet".
– VLAZ
Nov 16 '18 at 6:45
Indeed, if you add decimals, it gets a whole lot more convoluted, especially since periods can be thousands separators too.
– CertainPerformance
Nov 16 '18 at 6:52
I think that the issue of the separators is distracting us from the real issue, so let's simplify my question by allowing only commas and making them mandatory. So if I write /(\d{1,3}),(\d{3})[,(\d{3})]?/i; the problem still remains that it will parse groups of four digits, such as with 1,234,3456. How can I parse 1,234,567 but not 1,234,5678?
– Leo
Nov 16 '18 at 6:54
@Leo The answer does address that - see regex101.com/r/2dFk0f/2 (same regex, just with your string above added to the input text). The idea is, if the number is more than 3 digits long, to require the final 3-digit string to be preceded by a separator (if there are any separators at all).
– CertainPerformance
Nov 16 '18 at 6:55
@Leo Two reasons: one is that you're lacking the start-of-line and end-of-line anchors, as said in the answer (so, for example, the string 1,234,5678 passes because 1,234,567 matches the RE). The other is that you need to backreference the first separator found wherever a separator is expected; otherwise, for example, in 1234,567, there's no separator between the 1 and the 2 (allowed if the input was 1234567), and there's a , between the 4 and 5 (allowed if the input was 1,234,567), but you need the backreference to enforce consistency (see answer). regex101.com/r/2dFk0f/3
– CertainPerformance
Nov 16 '18 at 7:31
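The effect of the missing anchors can be seen in a short sketch. It uses a comma-only pattern in the spirit of Leo's simplified version, with the optional last group written as a non-capturing group (an assumption about the intent behind the original [,(\d{3})]?); the names unanchored and anchored are illustrative:

```javascript
// Same pattern twice: once without anchors, once with ^ and $.
const unanchored = /(\d{1,3}),(\d{3})(?:,(\d{3}))?/;
const anchored = /^(\d{1,3}),(\d{3})(?:,(\d{3}))?$/;

// Without anchors the regex is free to match any substring of the input.
console.log(unanchored.exec('1,234,5678')[0]); // '1,234,567' (the trailing 8 is ignored)
console.log(unanchored.exec('1234,567')[0]);   // '234,567' (the leading 1 is skipped)

// With anchors the whole string has to fit the pattern.
console.log(anchored.test('1,234,5678')); // false
console.log(anchored.test('1234,567'));   // false
console.log(anchored.test('1,234,567'));  // true
```

With str.replace, only the matched substring is replaced, so the skipped characters are carried through to the output unchanged, which makes the malformed input look as if it had been parsed.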