Parsing a number with commas with JavaScript regex

I'm trying to parse numbers between 1 and 10,000,000, which can be written as plain digits (e.g. 123456) or with commas (1,234,567) separating groups of 3 digits. The separator could also be a space (1 234 567) or a period (1.234.567), but it must be used consistently.
I have written the following:

<script type="text/javascript">
var re = /(\d{1,3})[ |,|.]?(\d{3})(?:[ |,|.]?(\d{3}))?/i;
function testStr(input) {
  var str = input.value;
  var newstr = str.replace(re, '[1]: $1\n[2]: $2\n[3]: $3');
  alert(newstr);
}
</script>

This works well, except that it also accepts input such as 1234,567,890 or 1,234,5678.
Groups of 4 consecutive digits should not be allowed. Why is this happening?
Thanks for any help.

  • I think it would be easier, and the code would be more readable, if you tested against two separate regexes, one with and one without separators.

    – Joel Cornett
    Nov 16 '18 at 5:45

  • I don't think this is a job for a regex. Unless you are absolutely certain the number is not mistyped (and you're not), a regex is a bad choice here. For example, take the number 1,234.456 789: it uses each of the separators, but not consistently. It does conform to the regex, though. You can write a regular expression to reject that, but it's not going to be pretty or easy to maintain, since you'd need a separate branch for each separator. It'd be easier to parse the number, find the separators, and then check that they are consistent as part of other validations.

    – VLAZ
    Nov 16 '18 at 6:33

  • As a side note, you can even use a library to do the number parsing for you. That way, you won't have to tear your hair out when you find out about, say, the Indian system of writing numbers, where the grouping is done neither only in threes nor only in twos but in both, and a number can look like this: 1,00,000,00,00,000

    – VLAZ
    Nov 16 '18 at 6:36
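
The "parse first, then validate" approach suggested in the comments above could look roughly like the sketch below. The function name `parseGroupedInteger` and its "return null on invalid input" convention are assumptions made for illustration; nobody in the thread posted this code.

```javascript
// Hypothetical helper: returns the integer value, or null if the input is invalid.
function parseGroupedInteger(text) {
  // Every non-digit character is a candidate separator.
  const separators = text.replace(/\d/g, "").split("");
  if (new Set(separators).size > 1) return null; // mixed separators, e.g. "1,234.567"

  const sep = separators[0]; // undefined when the input has no separators at all
  if (sep !== undefined && ![" ", ",", "."].includes(sep)) return null;

  const groups = sep === undefined ? [text] : text.split(sep);
  if (sep === undefined) {
    if (text.length === 0) return null; // empty input
  } else {
    // With separators: the first group has 1-3 digits, every later group exactly 3.
    const [first, ...rest] = groups;
    if (first.length < 1 || first.length > 3) return null;
    if (rest.some((group) => group.length !== 3)) return null;
  }

  // Range check from the question: 1 to 10,000,000 inclusive.
  const value = Number(groups.join(""));
  return value >= 1 && value <= 10000000 ? value : null;
}

console.log(parseGroupedInteger("1 234 567"));     // 1234567
console.log(parseGroupedInteger("1,234.456 789")); // null
console.log(parseGroupedInteger("1234,567"));      // null
```

Compared with a single regex, each rule (consistent separator, group sizes, numeric range) is a separate check here, which may be easier to extend later.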
javascript regex

asked Nov 16 '18 at 5:33
Leo

1 Answer

One option is

^(\d{1,3})(?:([ ,.]?)(\d{3})(?:\2(\d{3}))?)?$

The idea is to capture the separator used (if there is no separator, the empty string is captured). Then, at each later point where a separator is expected, backreference the separator that was found before. This ensures that all separators are the same, whether they're spaces, commas, periods, or nothing at all. Also, since you need to parse numbers between 1 and 10,000,000, everything past the initial (\d{1,3}) should go in an optional group, so that numbers below 1,000 still match.
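
As a quick check of the backreference idea, the pattern can be run against the inputs mentioned in the question. The variable names `groupedNumber` and `samples` are only illustrative.

```javascript
// The pattern from this answer; \2 must repeat whatever separator group 2 captured.
const groupedNumber = /^(\d{1,3})(?:([ ,.]?)(\d{3})(?:\2(\d{3}))?)?$/;

const samples = [
  "7", "123456", "1,234,567", "1 234 567", "1.234.567", "10,000,000", // accepted
  "1234,567,890", "1,234,5678", "1,234.567", "1234,567",              // rejected
];

for (const sample of samples) {
  console.log(sample.padEnd(14), groupedNumber.test(sample));
}
```

The first six samples print `true` and the last four print `false`.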



Note that commas and periods do not need to be escaped in a character set, and | in a character set indicates a literal pipe character - just use [ ,.] instead.
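
A small demonstration of the character-set point, with illustrative variable names:

```javascript
// Inside [...], "|" is a literal pipe and "." is a literal period.
const withPipes = /^[ |,|.]$/;
const withoutPipes = /^[ ,.]$/;

console.log(withPipes.test("|"));    // true: the pipe itself is accepted as a separator
console.log(withoutPipes.test("|")); // false
console.log(withoutPipes.test(".")); // true: the unescaped period matches a period...
console.log(withoutPipes.test("x")); // false: ...but not an arbitrary character
```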



Also use ^ and $ anchors to ensure you start at the very beginning of the string and match till the end of the string (forcing the match to fail otherwise).
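
To see what the anchors change, here is a minimal sketch using a simplified comma-only pattern (an assumption for brevity, not the full pattern from this answer):

```javascript
const unanchored = /\d{1,3},\d{3},\d{3}/;
const anchored = /^\d{1,3},\d{3},\d{3}$/;
const input = "1,234,5678";

// Without anchors, the regex is satisfied by a substring of the input.
console.log(unanchored.test(input));     // true
console.log(input.match(unanchored)[0]); // "1,234,567"

// With anchors, the whole string must match, so the trailing 8 causes a failure.
console.log(anchored.test(input));       // false
```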



https://regex101.com/r/2dFk0f/1



(\d{1,3}) - One to three digits, followed by a big optional non-capturing group of



([ ,.]?)(\d{3})(?:\2(\d{3}))?, which is:



([ ,.]?) - Capture the separator used (possibly the empty string)



(\d{3}) - Three digits



(?:\2(\d{3}))? - If the number is 1m or greater, a separator is expected, so backreference the separator that was captured before, followed by three more digits. (If the number is less than 1m, then this optional group won't match)
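
The capture groups described above can also be used to recover the numeric value. The sketch below is one way to do that; the helper name `toInteger` and the explicit range check are assumptions added here, since the pattern by itself also accepts values outside 1 to 10,000,000 (for example 999,999,999 or 000).

```javascript
const pattern = /^(\d{1,3})(?:([ ,.]?)(\d{3})(?:\2(\d{3}))?)?$/;

// Hypothetical helper: returns the integer value, or null if the input is invalid.
function toInteger(text) {
  const match = pattern.exec(text);
  if (match === null) return null;

  // match[1], match[3], match[4] are the digit groups; match[2] is the separator.
  // Groups that did not take part in the match are undefined.
  const digits = match[1] + (match[3] || "") + (match[4] || "");
  const value = Number(digits);

  // The regex checks the shape only, so the range is enforced separately.
  return value >= 1 && value <= 10000000 ? value : null;
}

console.log(toInteger("1.234.567"));   // 1234567
console.log(toInteger("42"));          // 42
console.log(toInteger("999,999,999")); // null (out of range)
console.log(toInteger("1,234,5678"));  // null (wrong shape)
```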






  • The backreference works, but what if you also have to support a delimiter for decimal fractions? $1,000,000.50 can be read as "one million dollars and 50 cents", while another culture could express this as $1 000 000,50 or $1.000.000,50. You'd be able to backreference the separators, but I don't think you can say "the decimal delimiter is one of the same characters, except the backreferenced one". And then you run into a problem with, say, 3.500: did somebody type three and a half with extra zeros, or three thousand five hundred? I know it's not one of OP's requirements, but I fear that's only "not yet".

    – VLAZ
    Nov 16 '18 at 6:45

  • Indeed, if you add decimals, it gets a whole lot more convoluted, especially since periods can be thousands-separators too.

    – CertainPerformance
    Nov 16 '18 at 6:52

  • I think that the issue of the separators is distracting us from the real issue, so let's simplify my question by allowing only commas and making them mandatory. If I write /(\d{1,3}),(\d{3})[,(\d{3})]?/i; the problem remains: it still accepts groups of four digits, as in 1,234,3456. How can I parse 1,234,567 but not 1,234,5678?

    – Leo
    Nov 16 '18 at 6:54

  • @Leo The answer does address that - see regex101.com/r/2dFk0f/2 (same regex, just with your string above added to the input text). The idea is that, if the number is more than 3 digits long, the final 3-digit group must be preceded by the separator (if any separators are used at all).

    – CertainPerformance
    Nov 16 '18 at 6:55

  • @Leo Two reasons. One is that you're lacking the start-of-string and end-of-string anchors, as said in the answer (so, for example, the string 1,234,5678 passes because its substring 1,234,567 matches the RE). The other is that you need to backreference the first separator found wherever a separator is expected. Otherwise, for example, in 1234,567 there's no separator between the 1 and the 2 (allowed if the input was 1234567) and there's a , between the 4 and the 5 (allowed if the input was 1,234,567); you need the backreference to enforce consistency (see answer). regex101.com/r/2dFk0f/3

    – CertainPerformance
    Nov 16 '18 at 7:31
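
For the simplified case raised in the comments above (commas only, and mandatory), an anchored pattern could look like the sketch below. The pattern `commaGrouped` is an illustration, not something posted in the thread. Note also that in the simplified regex from the comment, the square brackets in `[,(\d{3})]?` form a character set that matches at most one character, not an optional group.

```javascript
// 1-3 digits, then one or two ",ddd" groups, and nothing else.
const commaGrouped = /^\d{1,3}(?:,\d{3}){1,2}$/;

console.log(commaGrouped.test("1,234,567"));  // true
console.log(commaGrouped.test("1,234"));      // true
console.log(commaGrouped.test("1,234,5678")); // false
console.log(commaGrouped.test("1234,567"));   // false

// Square brackets build a character set: this matches a single "(" character.
const bracketSet = /^[,(\d{3})]$/;
console.log(bracketSet.test("("));            // true
console.log(bracketSet.test(",567"));         // false: a set matches one character only
```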
answered Nov 16 '18 at 5:47
CertainPerformance
  • The backreference works but what if you have to also support a delimiter for decimal fractions? The $1,000,000.50 can be read as one million dollars and 50 cents" while another culture could express this as $1 000 000,50` or $1.000.000,50. You'd be able to back reference the separators but I don't think you can say "but the decimal delimiter is the same sans the back reference". And then you run into a problem with, say, 3.500 - did somebody type three and a half with an extra zero or three hundred and fifty? I know it's not a requirement by OP but I fear it's only not yet.

    – VLAZ
    Nov 16 '18 at 6:45











  • Indeed, if you add decimals, it gets a whole lot more convoluted, especially since periods can be thousands-separators too

    – CertainPerformance
    Nov 16 '18 at 6:52











  • I think that the issue of the separators is distracting us from the real issue, so let's simplify my question by allowing only commas and making them mandatory to make it even simpler. So if I write /(d1,3),(d3)[,(d3)]?/i; the problem still remains that it will parse groups of four digits, such as with 1,234,3456. How can I parse 1,234,567 but not 1,234,5678?

    – Leo
    Nov 16 '18 at 6:54











  • @Leo The answer does address that - see regex101.com/r/2dFk0f/2 (same regex, just with your string above added to the input text). The idea is to, if the number is more than 3 digits long, require the final 3-digit string to be preceded by a separator (if there are any separators at all)

    – CertainPerformance
    Nov 16 '18 at 6:55







  • 1





    @Leo Two reasons: one is that you're lacking the start-of-line and end-of-line anchors, as said in the answer (so, for example, the string 1,234,5678 passes because 1,234,567 matches the RE). The other is that you need to backreference the first separator found wherever a separator is expected, otherwise, for example, in 1234,567, there's no separator between the 1 and the 2 (allowed if input was 1234567), and there's a , between the 4 and 5 (allowed if input was 1,234,567), but you need the backreference to enforce consistency (see answer). regex101.com/r/2dFk0f/3

    – CertainPerformance
    Nov 16 '18 at 7:31


















  • The backreference works but what if you have to also support a delimiter for decimal fractions? The $1,000,000.50 can be read as one million dollars and 50 cents" while another culture could express this as $1 000 000,50` or $1.000.000,50. You'd be able to back reference the separators but I don't think you can say "but the decimal delimiter is the same sans the back reference". And then you run into a problem with, say, 3.500 - did somebody type three and a half with an extra zero or three hundred and fifty? I know it's not a requirement by OP but I fear it's only not yet.

    – VLAZ
    Nov 16 '18 at 6:45











  • Indeed, if you add decimals, it gets a whole lot more convoluted, especially since periods can be thousands-separators too

    – CertainPerformance
    Nov 16 '18 at 6:52











  • I think that the issue of the separators is distracting us from the real issue, so let's simplify my question by allowing only commas and making them mandatory to make it even simpler. So if I write /(d1,3),(d3)[,(d3)]?/i; the problem still remains that it will parse groups of four digits, such as with 1,234,3456. How can I parse 1,234,567 but not 1,234,5678?

    – Leo
    Nov 16 '18 at 6:54











  • @Leo The answer does address that - see regex101.com/r/2dFk0f/2 (same regex, just with your string above added to the input text). The idea is to, if the number is more than 3 digits long, require the final 3-digit string to be preceded by a separator (if there are any separators at all)

    – CertainPerformance
    Nov 16 '18 at 6:55







  • 1





    @Leo Two reasons: one is that you're lacking the start-of-line and end-of-line anchors, as said in the answer (so, for example, the string 1,234,5678 passes because 1,234,567 matches the RE). The other is that you need to backreference the first separator found wherever a separator is expected, otherwise, for example, in 1234,567, there's no separator between the 1 and the 2 (allowed if input was 1234567), and there's a , between the 4 and 5 (allowed if input was 1,234,567), but you need the backreference to enforce consistency (see answer). regex101.com/r/2dFk0f/3

    – CertainPerformance
    Nov 16 '18 at 7:31

















The backreference works but what if you have to also support a delimiter for decimal fractions? The $1,000,000.50 can be read as one million dollars and 50 cents" while another culture could express this as $1 000 000,50` or $1.000.000,50. You'd be able to back reference the separators but I don't think you can say "but the decimal delimiter is the same sans the back reference". And then you run into a problem with, say, 3.500 - did somebody type three and a half with an extra zero or three hundred and fifty? I know it's not a requirement by OP but I fear it's only not yet.

– VLAZ
Nov 16 '18 at 6:45





The backreference works but what if you have to also support a delimiter for decimal fractions? The $1,000,000.50 can be read as one million dollars and 50 cents" while another culture could express this as $1 000 000,50` or $1.000.000,50. You'd be able to back reference the separators but I don't think you can say "but the decimal delimiter is the same sans the back reference". And then you run into a problem with, say, 3.500 - did somebody type three and a half with an extra zero or three hundred and fifty? I know it's not a requirement by OP but I fear it's only not yet.

– VLAZ
Nov 16 '18 at 6:45













Indeed, if you add decimals, it gets a whole lot more convoluted, especially since periods can be thousands-separators too

– CertainPerformance
Nov 16 '18 at 6:52


















I think that the issue of the separators is distracting us from the real issue, so let's simplify my question by allowing only commas and making them mandatory to make it even simpler. So if I write /(\d{1,3}),(\d{3})[,(\d{3})]?/i; the problem still remains that it will parse groups of four digits, such as with 1,234,3456. How can I parse 1,234,567 but not 1,234,5678?

– Leo
Nov 16 '18 at 6:54


















@Leo The answer does address that; see regex101.com/r/2dFk0f/2 (same regex, just with your string above added to the input text). The idea is that if the number is more than 3 digits long and uses separators at all, the final 3-digit group must be preceded by a separator.

– CertainPerformance
Nov 16 '18 at 6:55
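
For the simplified case from the question above (commas only, mandatory between groups), a minimal sketch might look like this. The pattern and the isCommaGrouped name are assumptions for illustration, not the answer's exact regex; the important parts are the ^ and $ anchors and the repeated ,\d{3} group.

```javascript
// Sketch, assuming commas are the only separator and are mandatory:
// 1-3 leading digits, then zero or more groups of "comma + exactly 3 digits".
const COMMA_GROUPED = /^\d{1,3}(?:,\d{3})*$/;

// Hypothetical helper name, used only for this example.
function isCommaGrouped(input) {
  return COMMA_GROUPED.test(input);
}

console.log(isCommaGrouped("1,234,567"));  // true
console.log(isCommaGrouped("1,234,5678")); // false
console.log(isCommaGrouped("1,234,3456")); // false
console.log(isCommaGrouped("1234,567"));   // false
```

Because the whole string must match, a trailing fourth digit has nowhere to go and the test fails. The 1 to 10,000,000 range would still need a separate check.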
















@Leo Two reasons. First, your pattern lacks the start-of-line and end-of-line anchors, as noted in the answer, so the string 1,234,5678 passes because its substring 1,234,567 matches the RE. Second, you need to backreference the first separator found wherever a later separator is expected. Without it, 1234,567 passes: having no separator between the 1 and the 2 is allowed (as in 1234567), and having a , between the 4 and 5 is allowed (as in 1,234,567). The backreference is what enforces consistency (see answer). regex101.com/r/2dFk0f/3

– CertainPerformance
Nov 16 '18 at 7:31
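
Both points can be sketched together as follows. This is an assumption modelled on the comment above, not a copy of the answer's regex (which is only linked here); GROUPED_NUMBER and parseGrouped are hypothetical names.

```javascript
// Either plain digits, or 1-3 digits followed by 3-digit groups that all
// reuse the first separator seen (\1 is the backreference to it).
// The ^ and $ anchors force the whole input to match.
const GROUPED_NUMBER = /^(?:\d+|\d{1,3}([ ,.])\d{3}(?:\1\d{3})*)$/;

// Hypothetical helper: returns the numeric value, or null when the input
// is malformed or falls outside 1..10,000,000.
function parseGrouped(input) {
  if (!GROUPED_NUMBER.test(input)) return null;
  const value = Number(input.replace(/[ ,.]/g, ""));
  return value >= 1 && value <= 10000000 ? value : null;
}

console.log(parseGrouped("1234567"));    // 1234567
console.log(parseGrouped("1,234,567"));  // 1234567
console.log(parseGrouped("1 234 567"));  // 1234567
console.log(parseGrouped("1234,567"));   // null (mixes "no separator" and ",")
console.log(parseGrouped("1,234.567"));  // null (inconsistent separators)
console.log(parseGrouped("1,234,5678")); // null (group of four digits)
```

Keeping the range check outside the regex is a design choice made here for readability; it avoids encoding the upper bound in the pattern itself.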






























































































