Remove a set of characters using Regex including the space character doesn't work










Currently I am using a StringBuilder to remove a list of characters from a string, as below:

    char[] charArray = { '%', '&', '=', '?', '{', '}', '|', '<', '>',
        ';', ':', ',', '"', '(', ')', '[', ']', '\\',
        '/', '*', '+', ' ' };

    // Remove special characters that aren't allowed
    var sanitizedAddress = new StringBuilder();
    foreach (var character in emailAddress.ToCharArray())
    {
        if (Array.IndexOf(charArray, character) < 0)
            sanitizedAddress.Append(character);
    }

I tried to do the same with Regex as follows, but it doesn't work:

    var invalidCharacters = Regex.Escape(@"%&=?{}|<>;:,\"()[]\\/*+\s");
    emailAddress = Regex.Replace(emailAddress, invalidCharacters, "");









      c# .net regex






edited Nov 13 '18 at 8:00, Dmitry Bychenko
asked Nov 13 '18 at 4:32, lohiarahul






















2 Answers






You can try using Linq (filtering out the unwanted characters with the help of Where) instead of regular expressions:

    using System.Collections.Generic;
    using System.Linq;

    ...

    // A hash set is faster on Contains than an array: O(1) vs. O(N)
    // (the characters below are the ones from the question's array)
    HashSet<char> toRemove = new HashSet<char>()
    {
        '%', '&', '=', '?', '{', '}', '|', '<', '>', ';', ':', ',',
        '"', '(', ')', '[', ']', '\\', '/', '*', '+'
    };

    string emailAddress = ...

    emailAddress = string.Concat(emailAddress
        .Where(c => !toRemove.Contains(c)));

You can add more Where clauses, e.g.

    emailAddress = string.Concat(emailAddress
        .Where(c => !toRemove.Contains(c))
        .Where(c => !char.IsWhiteSpace(c))); // get rid of white spaces as well
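
For reference, the Linq filtering described above could be wrapped into a small self-contained helper. This is only a sketch: the class name EmailSanitizer and the method name Sanitize are made up for illustration, and the character list is assumed to match the question's array.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmailSanitizer
{
    // Assumed list of unwanted characters (taken from the question's array, minus the space).
    private static readonly HashSet<char> ToRemove = new HashSet<char>
    {
        '%', '&', '=', '?', '{', '}', '|', '<', '>', ';', ':', ',',
        '"', '(', ')', '[', ']', '\\', '/', '*', '+'
    };

    // Keeps every character that is neither in the set nor white space.
    public static string Sanitize(string input) =>
        string.Concat(input.Where(c => !ToRemove.Contains(c) && !char.IsWhiteSpace(c)));
}
```

With this sketch, EmailSanitizer.Sanitize("john <doe>@example.com") returns "johndoe@example.com".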


In case you insist on regular expressions, you have to build the pattern, e.g.:

    char[] charArray = { '%', '&', '=', '?', '{', '}', '|', '<', '>',
        ';', ':', ',', '"', '(', ')', '[', ']', '\\',
        '/', '*', '+', ' ' };

    // Join all the characters (escaped!) with | ("or" in regular expressions)
    string pattern = string.Join("|", charArray
        .Select(c => Regex.Escape(c.ToString())));

And then you can Replace:

    emailAddress = Regex.Replace(emailAddress, pattern, "");
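
Put together, the alternation approach might look like the sketch below. The class name AlternationSanitizer and its members are hypothetical, and the character list is again assumed to be the one from the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class AlternationSanitizer
{
    // Assumed list of unwanted characters (the question's array, including the space).
    private static readonly char[] CharArray =
    {
        '%', '&', '=', '?', '{', '}', '|', '<', '>', ';', ':', ',',
        '"', '(', ')', '[', ']', '\\', '/', '*', '+', ' '
    };

    // Each character is escaped on its own, then the pieces are joined with "|".
    public static readonly string Pattern =
        string.Join("|", CharArray.Select(c => Regex.Escape(c.ToString())));

    // Removes every match of the alternation from the input.
    public static string Sanitize(string input) => Regex.Replace(input, Pattern, "");
}
```

Escaping each character separately matters here: escaping the whole string at once yields a pattern that only matches that exact sequence of characters, not any single one of them.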






edited Nov 13 '18 at 7:29
answered Nov 13 '18 at 7:02, Dmitry Bychenko























You can use a character set [...] for this:

    var invalidCharacters = "[" + Regex.Escape(@"%&=?{}|<>;:,""()*/+") + @"\]\[\s]";
    emailAddress = Regex.Replace(emailAddress, invalidCharacters, "");

Some side notes:

• When using a double quote inside a verbatim ("@") string, you should write "", not \".

• \s is already an escape sequence, so Regex.Escape will render it as \\s, which is not what you wanted.

• Regex.Escape does not escape the ] character, which is why it is added separately.
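
To make the side notes concrete, here is a minimal sketch of the character-set approach together with a check of what Regex.Escape does to \s. The class name CharClassSanitizer and its members are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassSanitizer
{
    // Regex.Escape leaves ']' untouched, so ']' and '[' are appended already escaped,
    // followed by \s (any white space) and the closing bracket of the set.
    public static readonly string Pattern =
        "[" + Regex.Escape(@"%&=?{}|<>;:,""()*/+") + @"\]\[\s]";

    // Removes every character that belongs to the set.
    public static string Sanitize(string input) => Regex.Replace(input, Pattern, "");

    // Escaping "\s" yields a pattern for a literal backslash followed by 's'.
    public static string EscapedWhitespaceToken() => Regex.Escape(@"\s");
}
```

Here CharClassSanitizer.EscapedWhitespaceToken() returns \\s, which is why the question's pattern never matched a space.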






edited Nov 13 '18 at 5:08
answered Nov 13 '18 at 5:03, qbik


























