Random alphanumeric password generator with GOTOs









up vote
5
down vote

favorite












I was inspired after this thread on Stack Overflow to create a random 8 character alphanumeric password generator.



Sadly, it is closed, so I cannot provide an answer there. Anyway, here is the code. Let me know if there are any bugs/bias going on.



using System.Collections.Generic;
using System.Security.Cryptography;

public class AlphanumericGen

public virtual string Generate(byte length)

ICollection<char> chars = new LinkedList<char>();
var buffer = new byte[1];
short counter = 0;
using (var rng = new RNGCryptoServiceProvider())

while (counter < length)

rng.GetBytes(buffer);
var nextChar = (char)buffer[0];

if (nextChar >= '0' && nextChar <= '9')

goto addChar;


if (nextChar >= 'A' && nextChar <= 'Z')

goto addChar;


if (nextChar >= 'a' && nextChar <= 'z')

goto addChar;


continue;

addChar:
chars.Add(nextChar);
++counter;



return string.Concat(chars);











share|improve this question



















  • 1




    Why aren't you using Char.IsDigit() and Char.IsLetter() instead of your own range tests?
    – Barmar
    Nov 8 at 3:57










  • @Barmar: In this case Char.IsDigit() and Char.IsLetter() are probably the wrong choice, since the aim is to create a password with characters from a known range.
    – Jack Aidley
    2 days ago










  • Of course it's slightly more complex because you may need to concatenate mulitple calls but its should be trivial. Given that you need 8 characters and time spent to generate 256 random bytes is negligible...you may even optimize your code for the best case (just use StringBuilder to store chars and put above code in a while (stringBuffer.length < length) (change Take() to actually take only required number of characters...)
    – Adriano Repetti
    2 days ago







  • 1




    @Shelby - Of course they include Unicode characters (0, for example). In this case, we seem to be selecting from the Latin-1 set, given that buffer contains a byte. We could mask that down to ASCII using & 0x7F (and increase the number of random values we actually use, rather than discard).
    – Toby Speight
    2 days ago






  • 4




    Moderator note: Not everyone has to mention how bad goto is. Many of the comments that were just moved to chat did not add anything constructive at all. You don't have to come up with new ways to say how bad goto is.
    – Simon Forsberg
    yesterday














up vote
5
down vote

favorite












I was inspired after this thread on Stack Overflow to create a random 8 character alphanumeric password generator.



Sadly, it is closed, so I cannot provide an answer there. Anyway, here is the code. Let me know if there are any bugs/bias going on.



using System.Collections.Generic;
using System.Security.Cryptography;

public class AlphanumericGen

public virtual string Generate(byte length)

ICollection<char> chars = new LinkedList<char>();
var buffer = new byte[1];
short counter = 0;
using (var rng = new RNGCryptoServiceProvider())

while (counter < length)

rng.GetBytes(buffer);
var nextChar = (char)buffer[0];

if (nextChar >= '0' && nextChar <= '9')

goto addChar;


if (nextChar >= 'A' && nextChar <= 'Z')

goto addChar;


if (nextChar >= 'a' && nextChar <= 'z')

goto addChar;


continue;

addChar:
chars.Add(nextChar);
++counter;



return string.Concat(chars);











share|improve this question



















  • 1




    Why aren't you using Char.IsDigit() and Char.IsLetter() instead of your own range tests?
    – Barmar
    Nov 8 at 3:57










  • @Barmar: In this case Char.IsDigit() and Char.IsLetter() are probably the wrong choice, since the aim is to create a password with characters from a known range.
    – Jack Aidley
    2 days ago










  • Of course it's slightly more complex because you may need to concatenate mulitple calls but its should be trivial. Given that you need 8 characters and time spent to generate 256 random bytes is negligible...you may even optimize your code for the best case (just use StringBuilder to store chars and put above code in a while (stringBuffer.length < length) (change Take() to actually take only required number of characters...)
    – Adriano Repetti
    2 days ago







  • 1




    @Shelby - Of course they include Unicode characters (0, for example). In this case, we seem to be selecting from the Latin-1 set, given that buffer contains a byte. We could mask that down to ASCII using & 0x7F (and increase the number of random values we actually use, rather than discard).
    – Toby Speight
    2 days ago






  • 4




    Moderator note: Not everyone has to mention how bad goto is. Many of the comments that were just moved to chat did not add anything constructive at all. You don't have to come up with new ways to say how bad goto is.
    – Simon Forsberg
    yesterday












up vote
5
down vote

favorite









up vote
5
down vote

favorite











I was inspired after this thread on Stack Overflow to create a random 8 character alphanumeric password generator.



Sadly, it is closed, so I cannot provide an answer there. Anyway, here is the code. Let me know if there are any bugs/bias going on.



using System.Collections.Generic;
using System.Security.Cryptography;

public class AlphanumericGen

public virtual string Generate(byte length)

ICollection<char> chars = new LinkedList<char>();
var buffer = new byte[1];
short counter = 0;
using (var rng = new RNGCryptoServiceProvider())

while (counter < length)

rng.GetBytes(buffer);
var nextChar = (char)buffer[0];

if (nextChar >= '0' && nextChar <= '9')

goto addChar;


if (nextChar >= 'A' && nextChar <= 'Z')

goto addChar;


if (nextChar >= 'a' && nextChar <= 'z')

goto addChar;


continue;

addChar:
chars.Add(nextChar);
++counter;



return string.Concat(chars);











share|improve this question















I was inspired after this thread on Stack Overflow to create a random 8 character alphanumeric password generator.



Sadly, it is closed, so I cannot provide an answer there. Anyway, here is the code. Let me know if there are any bugs/bias going on.



using System.Collections.Generic;
using System.Security.Cryptography;

public class AlphanumericGen

public virtual string Generate(byte length)

ICollection<char> chars = new LinkedList<char>();
var buffer = new byte[1];
short counter = 0;
using (var rng = new RNGCryptoServiceProvider())

while (counter < length)

rng.GetBytes(buffer);
var nextChar = (char)buffer[0];

if (nextChar >= '0' && nextChar <= '9')

goto addChar;


if (nextChar >= 'A' && nextChar <= 'Z')

goto addChar;


if (nextChar >= 'a' && nextChar <= 'z')

goto addChar;


continue;

addChar:
chars.Add(nextChar);
++counter;



return string.Concat(chars);








c# random






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 7 at 20:23









t3chb0t

33.2k744106




33.2k744106










asked Nov 7 at 20:08









Sam Pearson

277210




277210







  • 1




    Why aren't you using Char.IsDigit() and Char.IsLetter() instead of your own range tests?
    – Barmar
    Nov 8 at 3:57










  • @Barmar: In this case Char.IsDigit() and Char.IsLetter() are probably the wrong choice, since the aim is to create a password with characters from a known range.
    – Jack Aidley
    2 days ago










  • Of course it's slightly more complex because you may need to concatenate mulitple calls but its should be trivial. Given that you need 8 characters and time spent to generate 256 random bytes is negligible...you may even optimize your code for the best case (just use StringBuilder to store chars and put above code in a while (stringBuffer.length < length) (change Take() to actually take only required number of characters...)
    – Adriano Repetti
    2 days ago







  • 1




    @Shelby - Of course they include Unicode characters (0, for example). In this case, we seem to be selecting from the Latin-1 set, given that buffer contains a byte. We could mask that down to ASCII using & 0x7F (and increase the number of random values we actually use, rather than discard).
    – Toby Speight
    2 days ago






  • 4




    Moderator note: Not everyone has to mention how bad goto is. Many of the comments that were just moved to chat did not add anything constructive at all. You don't have to come up with new ways to say how bad goto is.
    – Simon Forsberg
    yesterday












  • 1




    Why aren't you using Char.IsDigit() and Char.IsLetter() instead of your own range tests?
    – Barmar
    Nov 8 at 3:57










  • @Barmar: In this case Char.IsDigit() and Char.IsLetter() are probably the wrong choice, since the aim is to create a password with characters from a known range.
    – Jack Aidley
    2 days ago










  • Of course it's slightly more complex because you may need to concatenate mulitple calls but its should be trivial. Given that you need 8 characters and time spent to generate 256 random bytes is negligible...you may even optimize your code for the best case (just use StringBuilder to store chars and put above code in a while (stringBuffer.length < length) (change Take() to actually take only required number of characters...)
    – Adriano Repetti
    2 days ago







  • 1




    @Shelby - Of course they include Unicode characters (0, for example). In this case, we seem to be selecting from the Latin-1 set, given that buffer contains a byte. We could mask that down to ASCII using & 0x7F (and increase the number of random values we actually use, rather than discard).
    – Toby Speight
    2 days ago






  • 4




    Moderator note: Not everyone has to mention how bad goto is. Many of the comments that were just moved to chat did not add anything constructive at all. You don't have to come up with new ways to say how bad goto is.
    – Simon Forsberg
    yesterday







1




1




Why aren't you using Char.IsDigit() and Char.IsLetter() instead of your own range tests?
– Barmar
Nov 8 at 3:57




Why aren't you using Char.IsDigit() and Char.IsLetter() instead of your own range tests?
– Barmar
Nov 8 at 3:57












@Barmar: In this case Char.IsDigit() and Char.IsLetter() are probably the wrong choice, since the aim is to create a password with characters from a known range.
– Jack Aidley
2 days ago




@Barmar: In this case Char.IsDigit() and Char.IsLetter() are probably the wrong choice, since the aim is to create a password with characters from a known range.
– Jack Aidley
2 days ago












Of course it's slightly more complex because you may need to concatenate mulitple calls but its should be trivial. Given that you need 8 characters and time spent to generate 256 random bytes is negligible...you may even optimize your code for the best case (just use StringBuilder to store chars and put above code in a while (stringBuffer.length < length) (change Take() to actually take only required number of characters...)
– Adriano Repetti
2 days ago





Of course it's slightly more complex because you may need to concatenate mulitple calls but its should be trivial. Given that you need 8 characters and time spent to generate 256 random bytes is negligible...you may even optimize your code for the best case (just use StringBuilder to store chars and put above code in a while (stringBuffer.length < length) (change Take() to actually take only required number of characters...)
– Adriano Repetti
2 days ago





1




1




@Shelby - Of course they include Unicode characters (0, for example). In this case, we seem to be selecting from the Latin-1 set, given that buffer contains a byte. We could mask that down to ASCII using & 0x7F (and increase the number of random values we actually use, rather than discard).
– Toby Speight
2 days ago




@Shelby - Of course they include Unicode characters (0, for example). In this case, we seem to be selecting from the Latin-1 set, given that buffer contains a byte. We could mask that down to ASCII using & 0x7F (and increase the number of random values we actually use, rather than discard).
– Toby Speight
2 days ago




4




4




Moderator note: Not everyone has to mention how bad goto is. Many of the comments that were just moved to chat did not add anything constructive at all. You don't have to come up with new ways to say how bad goto is.
– Simon Forsberg
yesterday




Moderator note: Not everyone has to mention how bad goto is. Many of the comments that were just moved to chat did not add anything constructive at all. You don't have to come up with new ways to say how bad goto is.
– Simon Forsberg
yesterday










10 Answers
10






active

oldest

votes

















up vote
49
down vote













As a general rule of thumb, any time you feel the need to use goto, take a couple of aspirin and lay down until the feeling passes. They probably should have been deprecated decades ago.



In this particular case, using a string of allowed characters and randomly picking an index in that string would do away with your goto's.



I think too a StringBuilder would do a lot better for storing the characters than a LinkedList.






share|improve this answer






















  • Comments are not for extended discussion; this conversation has been moved to chat.
    – Jamal
    yesterday










  • As a general rule of thumb, any time you feel the fear of using goto, drink several drops of valerian and lay down until the feeling passes - then start using goto's. Sometimes goto is an invaluable tool - for example to break N nested loops if some language doesn't support break n syntax construct
    – Agnius Vasiliauskas
    17 hours ago







  • 1




    @AgniusVasiliauskas if you feel the need to break several loops at once, your code might benefit from some factoring or a custom iterator
    – NieDzejkob
    16 hours ago










  • @NieDzejkob 1) In general - employer don't wants to pay salary for code refactoring - it gives no added value for a business, so you can't refactor just because "code looks ugly" for you. 2) There are a lot of legacy code which are very complex in structure and may be not-refactorable easily at all. Big TTA (Time-To-Accomplish) is a no go for business 3) "Refactoring" means different things to different people (no single cure). I for example- would refactor bunch of iterators into several goto's, if that would help to decrease code size, etc.
    – Agnius Vasiliauskas
    13 hours ago










  • @AgniusVasiliauskas factoring. Not refactoring.
    – NieDzejkob
    3 hours ago

















up vote
20
down vote













Data structure



Why are you using a LinkedList<char>? Linked lists involve a lot of overhead for each node. You know exactly how long the result should be, so why not write to a char array?



Actually, you may want to consider returning the result as a char rather than as a string. A security-sensitive program may wish to wipe the contents of the secret by overwriting it with NUL bytes, rather than waiting for the garbage collector to clean up the string.



Flow control



goto should be avoided, and its use is not justified here.



The loop is a bit clumsy, referring to counter all over the place.



char chars = new char[length];
using (var rng = new RNGCryptoServiceProvider())

for (int counter = 0; counter < length; )

return new string(chars);



Algorithm



The algorithm is rather inefficient:



  • It reads a byte at a time from the random number generator. .GetBytes() is thread safe, so each call should incur some synchronization overhead.

  • It throws away most of the bytes it reads, keeping each byte with a probability of $62 over 256$.

A smarter approach would be to fetch a bit more than $3 over 4 mathrmlength$ random bytes, base64-encode it, and discard just the resulting + and / characters.






share|improve this answer






















  • Linked lists involve a lot of overhead for each node. - do they? I many cases they are pretty damn fast.
    – t3chb0t
    Nov 7 at 20:49






  • 4




    @t3chb0t "Another disadvantage of linked lists is the extra storage needed for references, which often makes them impractical for lists of small data items such as characters or boolean values, because the storage overhead for the links may exceed by a factor of two or more the size of the data. … It can also be slow, and with a naïve allocator, wasteful, to allocate memory separately for each new element, a problem generally solved using memory pools." Definitely much much worse than a char for this application.
    – 200_success
    Nov 7 at 20:53










  • ok, in general this is true but for a small password generator not really critical - I wouldn't panic - unless OP is generating several millions of them - they are like linq, convinient and with convenience there is a price :-)
    – t3chb0t
    Nov 7 at 21:00







  • 3




    @t3chb0t That makes no sense. Both versions should call GetBytes() the same number of times (statistically speaking). When I measured, mine is marginally faster. Furthermore, if I comment out rng.GetBytes() and produce the same character every time, mine is much much faster. So, the running time is completely dominated by rng.GetBytes(), and my Base64 suggestion should make the biggest difference of all.
    – 200_success
    Nov 7 at 21:43






  • 3




    @t3chb0t When the results make no sense, then it is time to consider whether your test is wrong.
    – 200_success
    2 days ago

















up vote
12
down vote













Readability & GoTo



If you want to do the same action for 3 different if-statements make them one if-statement or make them a function. goto causes massive headaches when it's been months since you've touched the code and you come back to it not remembering a thing.



rng.GetBytes(buffer);
var nextChar = (char)buffer[0];

var isNumber = nextChar >= '0' && nextChar <= '9';
var isUppercaseLetter = nextChar >= 'A' && nextChar <= 'Z';
var isLowercaseLetter = nextChar >= 'a' && nextChar <= 'z';

if (isNumber || isUppercaseLetter || isLowercaseLetter)

chars.Add(nextChar);
++counter;



Now with the conditions named even an idiot like me can come in here and see that you're checking if the character is alpha-numeric. If you wanted to make it even more obvious you could go one step further and make it a function.



public bool IsAlphaNumericCharacter(char c)
isUppercaseLetter



Then your loop becomes shorter and clearer.



while (counter < length)

rng.GetBytes(buffer);
var nextChar = (char)buffer[0];

if (IsAlphaNumericCharacter(nextChar))

chars.Add(nextChar);
++counter;




Bytes



Is there a particular reason you're using byte for length instead of int? Also, was there a reason for var buffer = new byte[1]; being an array instead of just var buffer = new byte();



If the answer to both of those questions is no, then you could have a character array of size length instead of a LinkedList<char>.



StringBuilder



StringBuilder could be beneficial, but if you're at a low amount of characters (like your link suggests) then it's probably not going to make a difference performance-wise (always best to test yourself instead of listening to a stranger on the internet though :P).






share|improve this answer


















  • 1




    You might be doing more work than the usual short-circuit equivalent, by storing comparison results in those (nicely named!) variables before testing. Unless your compiler optimises that away, of course.
    – Toby Speight
    Nov 7 at 20:53







  • 1




    I trust the compiler to do great things, hasn't caused me problems yet. That said, 99% of my performance issues are database bottlenecks (mostly do ASP.NET) so I've never really needed to care about my code's performance haha.
    – Shelby115
    Nov 8 at 2:18











  • "|| is short-circuit" even though you use variables -- Are you sure about that? No, really, that's not me being sarcastic! I'm curious now if the same bytecode is generated. If it is, that'd be worth adding to your answer as a side note.
    – Nic Hartley
    2 days ago










  • uhh whoops, slight oversight on my part. Good catch, my bad.
    – Shelby115
    2 days ago

















up vote
9
down vote













Avoid premature optimizations, they're really hurting this code.



In the comments you defend the use of goto using the argument that it avoids branch prediction.



Accepting the dubious premise that the compiler produces more optimized code from a series of sequential if statements than it would from a conventional if-else construct, lets look at the context of that optimization to see what it might get you.



As other answers have pointed out, the data structure is poorly chosen, as the cost of a node allocation is going to dwarf the cost of a missed branch prediction.



Assuming you swap that out, as pointed out in another answer, you'll also want to grab multiple bytes at a time from the RNG. That'll help, but even fully optimized that call will cost more than you could possibly save using a questionable branch prediction optimization.



That final bit converting the characters to a string? Yep, that's a traversal so you're looking at one conditional per node, plenty of chances for a bad branch prediction.



The bottom line is that, by focusing on something which makes you feel like you're "thinking for yourself" and "questioning authority" but doesn't actually help, you've given away readability for no measurable benefit.






share|improve this answer



























    up vote
    5
    down vote













    In this case, if performance are not a MEASURED issue (read: you're not generating 1,000,000 passwords) then I'd definitely pick the most readable version.



    It might even be something like this (completely and absolutely untested) which - in average - should also perform better than original implementation:



    using (var rng = new RNGCryptoServiceProvider())

    var buffer = new byte[length * 5];

    var password = new StringBuilder(length);
    while (password.Length < length)

    rng.GetBytes(buffer);
    password.Append(buffer
    .Select(x => (char)x)
    .Where(Char.IsLetterOrDigit)
    .Take(length - password.Length)
    .ToArray()
    );




    Notes:



    • Decision to use length * 5 is arbitrary (you take only 58 values from the 256 possible). For small strings there are good chances that it doesn't work well, run this few thousands times and see how many times one call to GetBytes() is not enough.

    • It performs pretty well (GetBytes() overhead is - typically - payed only once) and you're not building lists of expensive structures (StringBuilder is highly optimized to build strings).

    • It should be slightly more fast than base64 encoding (which otherwise is a clever idea).

    • It's really easy to optimize for the best case scenario adding some more code (for example when ToArray() returns the required string you can avoid StringBuilder() - and its initial allocation - all together).

    • It scales well (parallel generation of N passwords or even generation of a huge password in parallel from M chunks).

    • Given that your input is a single byte then you can use Char.IsLetterOrDigit directly because all the other non Latin letters and digits are > 255 when UTF-16 encoded).

    The very next (or previous?) step is to test this code. Can you reliably an extensively test a function which uses a random number generator? Hardly. t3schb0t covers this aspect in his answer.



    Are we done? There are some security concerns you should consider, see last section.




    As you did imagine goto is the controversial part of your code. Let me address this separately because, it seems, you're prone to use it.



    goto usually hurts legibility and you're also (probably) preventing some optimizations compiler might do. Depending where in the compiler pipeline the optimization is applied (near the front-end where goto and if/else differ, or near the back-end - or even after JIT - where they already generated the exact same code) a compiler is probably not optimizing goto (because it's uncommon to be worth of the time and when you're using it then you know what you're doing). In short: if it's because of performance than 99/100 compiler can do a better job than you.



    I'm not saying that there are not use-cases (there are) but hey're so rare that hat you may live without writing a goto for most of your career (with few more chances for its little brother goto case). Please do not generalize this to every language, in C - for example - it might be useful for clean-up code and error handling but in higher level languages we have other powerful tools.



    Does it make the code shorter or more readable? In your specific case, even without changing anything else, the very same code might be rewritten as:



    while (chars.Count < length)

    rng.GetBytes(buffer);
    char nextChar = (char)buffer[0];

    if (Char.IsLetterOrDigit(nextChar))
    chars.Add(nextChar);



    What did we do? We moved some logic into a separate function, this has two advantages:



    • It makes logic easier to follow because we can read the function name instead of understanding the code.

    • If frees us from the need of that goto because we can reduce those multiple ifs.

    What if there wasn't Char.IsLetterOrDigit()? You write your own and 90% you will get rid of those jumps. There are more things in the original code (beside goto): counter is useless, you can use an array instead of a linked list (size is known), it's preferable to avoid var for primitive types, you're hard-coding magic constants, and...you're reducing the entropy of the generated password.




    Security considerations



    You're correctly using RNGCryptoServiceProvider to generate the random numbers however you're not using a proper one-to-one encoding algorithm to transform those numbers into text. Simply dropping some values reduces the entropy. This is unavoidable if you want a fixed length alphanumeric string.



    However if your goal is to generate a password with the desired amount of entropy then for this step you should use base62 (as elsewhere suggested base64 for performance reasons), a Diceware algorithm, or a perfect hash function (no, you can't use modulus for this because it, obviously, has multiple collisions).






    share|improve this answer






















    • I'm pretty sure the outher while could be squeezed into a nice linq chain by using TakeUntil ;-]
      – t3chb0t
      2 days ago











    • definitely! If working with RNG OP has already (probably) - a generator function and everything might be reduced to rng.Generate()....Take().ToArray()!
      – Adriano Repetti
      2 days ago











    • I'm not sure, you can make that cast from byte to char using Cast<char>() - you'll have to use Select(b => (char)b). If you move rng.GetBytes(buffer); inside the while-loop as the first thing you'll have a fresh set of bytes for every iteration to avoid (or at least minimize) the risk of having the buffer filled with only non-nums/digits
      – Henrik Hansen
      2 days ago






    • 2




      if performance are not a MEASURED issue (read: you're not generating 1,000,000 passwords) then I'd definitely pick the most readable version. - this is the wisest sentence here - OP didn't say anthing about performance but all reviews are extremely focused on it and everyone tries to write the fastest code... why?
      – t3chb0t
      2 days ago







    • 1




      Btw, coincidentally the optimization with getting a larger buffer and callig GetBytes fewer time beats any other version presented here.
      – t3chb0t
      2 days ago

















    up vote
    4
    down vote













    There is only one answer so far, that by @Adriano Repetti where clean code is mentioned. All others are havily focused on performane.



    I'd like to add to this one answer and say that reusibility comes always first. I agree that if performance is not of great importance, you should primarily create an API that is easy to use, mantain and test.



    Organizing your code from this point of view would mean to first create an extension for the RandomNumberGenerator that is the base class of the RNGCryptoServiceProvider. It would enumerate only chars that you are interested in. It doesn't have to know how many, that's what we have LINQ for so it creates an endless collection of chars.



    public static IEnumerable<char> EnumerateLetterOrDigit(this RandomNumberGenerator rng, int bufferSize = 256)

    var buffer = new byte[bufferSize];
    while (true)

    rng.GetBytes(buffer);
    foreach (var c in buffer.Select(b => (char)b).Where(char.IsLetterOrDigit))

    yield return c;





    By doing so you could use a different provider or even mock it for testing and always return the same values. The options are there.



    From here, it's now very easy to build your password generator. Create any random-number-generator, Take as many chars as you need and create the string.



    public static string Generate(int length)

    using (var rng = new RNGCryptoServiceProvider())

    return new string(rng.EnumerateLetterOrDigit().Take(length).ToArray());




    And even though it havily uses LINQ, it's accidentally faster then your code because of the reduced number of calls to GetBytes (discovered by @Adriano?).



    So saying that LINQ is always slow is once again proven wrong because most of the time there will be something else that isn't optimal. LINQ is just convenient and this is so great about it!




    There is one thing about goto that hasn't been said yet... and you're using something very similar everyday, it's called switch:



    public static void Main()

    var num = 1;

    switch (num)

    case 1:
    goto case 3;
    case 2:
    Console.WriteLine("Hi goto!");
    break;
    case 3:
    goto case 2;




    Yes, you can jump within its scope from case to case in any order.






    share|improve this answer





























      up vote
      3
      down vote













      goto, part deux




      I don't agree with the mass consciousness that goto is a code smell.




      code smell means a possible problem, not that is is a problem. So yes, it is a code smell. I too have used goto, only once in the last decade - and it was with C#.



      Why goto could be bad



      We do not trust ourselves to use it wisely and well. This is borne out in history. There are millllions of COBOL LOC demonstrating that when used routinely goto tends to corrupt flow control, create inter and intra method coupling, and spawns "one off" code duplication; all to such a degree that one should assume the code will break when changed. In other words the code becomes unmaintainable.




      Why goto is bad



      goto murders the bedrock of all good programs - structured programming and modular construction. goto literally goes around them. Anything you can say about modern languages, coding principles, object orientation, etc. is fundamentally rooted in these two precepts. full stop.



      As for history: the unstructured nature of COBOL, for example, gave rise to an "hard jump" coding style and mind set. It's fair to say the liberal use of goto is a code lobotomy. Many factors contribute to crumbling code but the more gotos the more schizophrenic the code becomes. I have worked on a system that gave a dreadful feeling of random execution when stepping through with the debugger. Many existing bugs could not be duplicated and almost no bug could be reproduced reliably. 99 percent of code tweaks induced bugs. I am not exaggerating at all here! Our in house customer head guy told me "we don't bring up problems anymore because we know they won't be fixed."



      So wide spread was the problem that Edsger Dijkstra wrote a seminal paper GOTO statement considered harmful



      A very readable, practical, nicely concise "using goto wisely" discussion






      share|improve this answer






















      • These are the best arguments against goto I've ever seen.
        – t3chb0t
        yesterday

















      up vote
      1
      down vote













      I can only add to the already said, that your algorithm has the potential to run for ever - theoretically speaking.




      The most effective approach is to select chars from a list of valid chars. If using the numbers [0, 9] and the English letters [a, z] and [A, Z] we end up with 62 entries which will cause a byte based random selection be biased because 256 % 62 != 0. The problem can be solved by either adding two more chars resulting in 64 char:



       const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+";


      or reduce the valid chars to 32:



      const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUV";


      There are pros and cons to both obviously, but both can be used i an efficient algorithm like:



      public string Generate(int length)

      const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+";

      using (var rng = new RNGCryptoServiceProvider())

      char result = new char[length];
      byte buffer = new byte[length];
      int range = chars.Length;
      rng.GetBytes(buffer);

      for (int i = 0; i < length; i++)

      result[i] = chars[buffer[i] % range];


      return new string(result);




      This should be unbiased.



      I deliberately avoid the use of LINQ, because it is nearly always slower than more "native" code. This not to say, I in general avoid using LINQ, because I use it very often and gladly, but it is my experience that when it comes to performance, LINQ is often significantly slower than comparable, well written traditional code.






      share|improve this answer





























        up vote
        1
        down vote













        Testing for character ranges presumes ordering and continuity of those code points. Assuming the Latin alphabet, the algorithm of course works as expected using ASCII and Unicode, but breaks down if another alphabet should be used or in the highly unlikely case of more obscure character sets such as EBCDIC.



        An improved solution would be defining an array of allowable code points (as in "ABC...YZabc...xyz012..89", and then using the random number to index into this array. This resolves all above issues, and the function is trivially extendable to support additional characters as desired.



        As regards the goto statements, I don't see any real fault here. It is as easy to read and logically identical to a similarly constructed switch() statement with range cases.






        share|improve this answer








        New contributor




        stephan mantler is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.
























          up vote
          -3
          down vote













          I would use a HashSet:



          HashSet<byte> validChars = new HashSet<byte> 48, 49, 50 ; //add all valid

          ........

          rng.GetBytes(buffer);
          if(validChars.Contains(buffer[0]))

          chars.Add(char(buffer[0]));
          ++counter;






          share|improve this answer


















          • 3




            Mind elaborating on why you'd use a HashSet?
            – Shelby115
            2 days ago










          • @Shelby115 Uh, O(1) speed?
            – paparazzo
            2 days ago






          • 2




            @paparazzo: but this aspect of the original code is also O(1). Using a HashSet here is slightly slower due to extra overhead.
            – Pieter Witvoet
            yesterday











          • @PieterWitvoet Not buying. Good day.
            – paparazzo
            yesterday






          • 2




            Feel free to measure for yourself if you don't believe me.
            – Pieter Witvoet
            yesterday










          Your Answer





          StackExchange.ifUsing("editor", function ()
          return StackExchange.using("mathjaxEditing", function ()
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
          );
          );
          , "mathjax-editing");

          StackExchange.ifUsing("editor", function ()
          StackExchange.using("externalEditor", function ()
          StackExchange.using("snippets", function ()
          StackExchange.snippets.init();
          );
          );
          , "code-snippets");

          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "196"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader:
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          ,
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );













           

          draft saved


          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f207171%2frandom-alphanumeric-password-generator-with-gotos%23new-answer', 'question_page');

          );

          Post as a guest






























          10 Answers
          10






          active

          oldest

          votes








          10 Answers
          10






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes








          up vote
          49
          down vote













          As a general rule of thumb, any time you feel the need to use goto, take a couple of aspirin and lay down until the feeling passes. They probably should have been deprecated decades ago.



          In this particular case, using a string of allowed characters and randomly picking an index in that string would do away with your goto's.



          I think too a StringBuilder would do a lot better for storing the characters than a LinkedList.






          share|improve this answer






















          • Comments are not for extended discussion; this conversation has been moved to chat.
            – Jamal
            yesterday










          • As a general rule of thumb, any time you feel the fear of using goto, drink several drops of valerian and lay down until the feeling passes - then start using goto's. Sometimes goto is an invaluable tool - for example to break N nested loops if some language doesn't support break n syntax construct
            – Agnius Vasiliauskas
            17 hours ago







          • 1




            @AgniusVasiliauskas if you feel the need to break several loops at once, your code might benefit from some factoring or a custom iterator
            – NieDzejkob
            16 hours ago










          • @NieDzejkob 1) In general - employer don't wants to pay salary for code refactoring - it gives no added value for a business, so you can't refactor just because "code looks ugly" for you. 2) There are a lot of legacy code which are very complex in structure and may be not-refactorable easily at all. Big TTA (Time-To-Accomplish) is a no go for business 3) "Refactoring" means different things to different people (no single cure). I for example- would refactor bunch of iterators into several goto's, if that would help to decrease code size, etc.
            – Agnius Vasiliauskas
            13 hours ago










          • @AgniusVasiliauskas factoring. Not refactoring.
            – NieDzejkob
            3 hours ago














          up vote
          49
          down vote













          As a general rule of thumb, any time you feel the need to use goto, take a couple of aspirin and lay down until the feeling passes. They probably should have been deprecated decades ago.



          In this particular case, using a string of allowed characters and randomly picking an index in that string would do away with your goto's.



          I think too a StringBuilder would do a lot better for storing the characters than a LinkedList.






          share|improve this answer






















          • Comments are not for extended discussion; this conversation has been moved to chat.
            – Jamal
            yesterday










          • As a general rule of thumb, any time you feel the fear of using goto, drink several drops of valerian and lay down until the feeling passes - then start using goto's. Sometimes goto is an invaluable tool - for example to break N nested loops if some language doesn't support break n syntax construct
            – Agnius Vasiliauskas
            17 hours ago







          • 1




            @AgniusVasiliauskas if you feel the need to break several loops at once, your code might benefit from some factoring or a custom iterator
            – NieDzejkob
            16 hours ago










          • @NieDzejkob 1) In general - employer don't wants to pay salary for code refactoring - it gives no added value for a business, so you can't refactor just because "code looks ugly" for you. 2) There are a lot of legacy code which are very complex in structure and may be not-refactorable easily at all. Big TTA (Time-To-Accomplish) is a no go for business 3) "Refactoring" means different things to different people (no single cure). I for example- would refactor bunch of iterators into several goto's, if that would help to decrease code size, etc.
            – Agnius Vasiliauskas
            13 hours ago










          • @AgniusVasiliauskas factoring. Not refactoring.
            – NieDzejkob
            3 hours ago












          up vote
          49
          down vote










          up vote
          49
          down vote









          As a general rule of thumb, any time you feel the need to use goto, take a couple of aspirin and lay down until the feeling passes. They probably should have been deprecated decades ago.



          In this particular case, using a string of allowed characters and randomly picking an index in that string would do away with your goto's.



          I think too a StringBuilder would do a lot better for storing the characters than a LinkedList.






          share|improve this answer














          As a general rule of thumb, any time you feel the need to use goto, take a couple of aspirin and lay down until the feeling passes. They probably should have been deprecated decades ago.



          In this particular case, using a string of allowed characters and randomly picking an index in that string would do away with your goto's.



          I think too a StringBuilder would do a lot better for storing the characters than a LinkedList.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited 2 days ago









          Wai Ha Lee

          1506




          1506










          answered Nov 7 at 20:37









          tinstaafl

          6,240727




          6,240727











          • Comments are not for extended discussion; this conversation has been moved to chat.
            – Jamal
            yesterday










          • As a general rule of thumb, any time you feel the fear of using goto, drink several drops of valerian and lay down until the feeling passes - then start using goto's. Sometimes goto is an invaluable tool - for example to break N nested loops if some language doesn't support break n syntax construct
            – Agnius Vasiliauskas
            17 hours ago







          • 1




            @AgniusVasiliauskas if you feel the need to break several loops at once, your code might benefit from some factoring or a custom iterator
            – NieDzejkob
            16 hours ago










          • @NieDzejkob 1) In general - employer don't wants to pay salary for code refactoring - it gives no added value for a business, so you can't refactor just because "code looks ugly" for you. 2) There are a lot of legacy code which are very complex in structure and may be not-refactorable easily at all. Big TTA (Time-To-Accomplish) is a no go for business 3) "Refactoring" means different things to different people (no single cure). I for example- would refactor bunch of iterators into several goto's, if that would help to decrease code size, etc.
            – Agnius Vasiliauskas
            13 hours ago










          • @AgniusVasiliauskas factoring. Not refactoring.
            – NieDzejkob
            3 hours ago
















          • Comments are not for extended discussion; this conversation has been moved to chat.
            – Jamal
            yesterday










          • As a general rule of thumb, any time you feel the fear of using goto, drink several drops of valerian and lay down until the feeling passes - then start using goto's. Sometimes goto is an invaluable tool - for example to break N nested loops if some language doesn't support break n syntax construct
            – Agnius Vasiliauskas
            17 hours ago







          • 1




            @AgniusVasiliauskas if you feel the need to break several loops at once, your code might benefit from some factoring or a custom iterator
            – NieDzejkob
            16 hours ago










          • @NieDzejkob 1) In general - employer don't wants to pay salary for code refactoring - it gives no added value for a business, so you can't refactor just because "code looks ugly" for you. 2) There are a lot of legacy code which are very complex in structure and may be not-refactorable easily at all. Big TTA (Time-To-Accomplish) is a no go for business 3) "Refactoring" means different things to different people (no single cure). I for example- would refactor bunch of iterators into several goto's, if that would help to decrease code size, etc.
            – Agnius Vasiliauskas
            13 hours ago










          • @AgniusVasiliauskas factoring. Not refactoring.
            – NieDzejkob
            3 hours ago















          Comments are not for extended discussion; this conversation has been moved to chat.
          – Jamal
          yesterday




          Comments are not for extended discussion; this conversation has been moved to chat.
          – Jamal
          yesterday












          As a general rule of thumb, any time you feel the fear of using goto, drink several drops of valerian and lay down until the feeling passes - then start using goto's. Sometimes goto is an invaluable tool - for example to break N nested loops if some language doesn't support break n syntax construct
          – Agnius Vasiliauskas
          17 hours ago





          As a general rule of thumb, any time you feel the fear of using goto, drink several drops of valerian and lay down until the feeling passes - then start using goto's. Sometimes goto is an invaluable tool - for example to break N nested loops if some language doesn't support break n syntax construct
          – Agnius Vasiliauskas
          17 hours ago





          1




          1




          @AgniusVasiliauskas if you feel the need to break several loops at once, your code might benefit from some factoring or a custom iterator
          – NieDzejkob
          16 hours ago




          @AgniusVasiliauskas if you feel the need to break several loops at once, your code might benefit from some factoring or a custom iterator
          – NieDzejkob
          16 hours ago












          @NieDzejkob 1) In general - employer don't wants to pay salary for code refactoring - it gives no added value for a business, so you can't refactor just because "code looks ugly" for you. 2) There are a lot of legacy code which are very complex in structure and may be not-refactorable easily at all. Big TTA (Time-To-Accomplish) is a no go for business 3) "Refactoring" means different things to different people (no single cure). I for example- would refactor bunch of iterators into several goto's, if that would help to decrease code size, etc.
          – Agnius Vasiliauskas
          13 hours ago




          @NieDzejkob 1) In general - employer don't wants to pay salary for code refactoring - it gives no added value for a business, so you can't refactor just because "code looks ugly" for you. 2) There are a lot of legacy code which are very complex in structure and may be not-refactorable easily at all. Big TTA (Time-To-Accomplish) is a no go for business 3) "Refactoring" means different things to different people (no single cure). I for example- would refactor bunch of iterators into several goto's, if that would help to decrease code size, etc.
          – Agnius Vasiliauskas
          13 hours ago












          @AgniusVasiliauskas factoring. Not refactoring.
          – NieDzejkob
          3 hours ago




          @AgniusVasiliauskas factoring. Not refactoring.
          – NieDzejkob
          3 hours ago












          up vote
          20
          down vote













          Data structure



          Why are you using a LinkedList<char>? Linked lists involve a lot of overhead for each node. You know exactly how long the result should be, so why not write to a char array?



          Actually, you may want to consider returning the result as a char rather than as a string. A security-sensitive program may wish to wipe the contents of the secret by overwriting it with NUL bytes, rather than waiting for the garbage collector to clean up the string.



          Flow control



          goto should be avoided, and its use is not justified here.



          The loop is a bit clumsy, referring to counter all over the place.



          char chars = new char[length];
          using (var rng = new RNGCryptoServiceProvider())

          for (int counter = 0; counter < length; )

          return new string(chars);



          Algorithm



          The algorithm is rather inefficient:



          • It reads a byte at a time from the random number generator. .GetBytes() is thread safe, so each call should incur some synchronization overhead.

          • It throws away most of the bytes it reads, keeping each byte with a probability of $62 over 256$.

          A smarter approach would be to fetch a bit more than $3 over 4 mathrmlength$ random bytes, base64-encode it, and discard just the resulting + and / characters.






          share|improve this answer






















          • Linked lists involve a lot of overhead for each node. - do they? I many cases they are pretty damn fast.
            – t3chb0t
            Nov 7 at 20:49






          • 4




            @t3chb0t "Another disadvantage of linked lists is the extra storage needed for references, which often makes them impractical for lists of small data items such as characters or boolean values, because the storage overhead for the links may exceed by a factor of two or more the size of the data. … It can also be slow, and with a naïve allocator, wasteful, to allocate memory separately for each new element, a problem generally solved using memory pools." Definitely much much worse than a char for this application.
            – 200_success
            Nov 7 at 20:53










          • ok, in general this is true but for a small password generator not really critical - I wouldn't panic - unless OP is generating several millions of them - they are like linq, convinient and with convenience there is a price :-)
            – t3chb0t
            Nov 7 at 21:00







          • 3




            @t3chb0t That makes no sense. Both versions should call GetBytes() the same number of times (statistically speaking). When I measured, mine is marginally faster. Furthermore, if I comment out rng.GetBytes() and produce the same character every time, mine is much much faster. So, the running time is completely dominated by rng.GetBytes(), and my Base64 suggestion should make the biggest difference of all.
            – 200_success
            Nov 7 at 21:43






          • 3




            @t3chb0t When the results make no sense, then it is time to consider whether your test is wrong.
            – 200_success
            2 days ago














          up vote
          20
          down vote













          Data structure



          Why are you using a LinkedList<char>? Linked lists involve a lot of overhead for each node. You know exactly how long the result should be, so why not write to a char array?



          Actually, you may want to consider returning the result as a char rather than as a string. A security-sensitive program may wish to wipe the contents of the secret by overwriting it with NUL bytes, rather than waiting for the garbage collector to clean up the string.



          Flow control



          goto should be avoided, and its use is not justified here.



          The loop is a bit clumsy, referring to counter all over the place.



          char chars = new char[length];
          using (var rng = new RNGCryptoServiceProvider())

          for (int counter = 0; counter < length; )

          return new string(chars);



          Algorithm



          The algorithm is rather inefficient:



          • It reads a byte at a time from the random number generator. .GetBytes() is thread safe, so each call should incur some synchronization overhead.

          • It throws away most of the bytes it reads, keeping each byte with a probability of $62 over 256$.

          A smarter approach would be to fetch a bit more than $3 over 4 mathrmlength$ random bytes, base64-encode it, and discard just the resulting + and / characters.






          share|improve this answer






















          • Linked lists involve a lot of overhead for each node. - do they? I many cases they are pretty damn fast.
            – t3chb0t
            Nov 7 at 20:49






          • 4




            @t3chb0t "Another disadvantage of linked lists is the extra storage needed for references, which often makes them impractical for lists of small data items such as characters or boolean values, because the storage overhead for the links may exceed by a factor of two or more the size of the data. … It can also be slow, and with a naïve allocator, wasteful, to allocate memory separately for each new element, a problem generally solved using memory pools." Definitely much much worse than a char for this application.
            – 200_success
            Nov 7 at 20:53










          • ok, in general this is true but for a small password generator not really critical - I wouldn't panic - unless OP is generating several millions of them - they are like linq, convinient and with convenience there is a price :-)
            – t3chb0t
            Nov 7 at 21:00







          • 3




            @t3chb0t That makes no sense. Both versions should call GetBytes() the same number of times (statistically speaking). When I measured, mine is marginally faster. Furthermore, if I comment out rng.GetBytes() and produce the same character every time, mine is much much faster. So, the running time is completely dominated by rng.GetBytes(), and my Base64 suggestion should make the biggest difference of all.
            – 200_success
            Nov 7 at 21:43






          • 3




            @t3chb0t When the results make no sense, then it is time to consider whether your test is wrong.
            – 200_success
            2 days ago












          up vote
          20
          down vote










          up vote
          20
          down vote









          Data structure



          Why are you using a LinkedList<char>? Linked lists involve a lot of overhead for each node. You know exactly how long the result should be, so why not write to a char array?



          Actually, you may want to consider returning the result as a char rather than as a string. A security-sensitive program may wish to wipe the contents of the secret by overwriting it with NUL bytes, rather than waiting for the garbage collector to clean up the string.



          Flow control



          goto should be avoided, and its use is not justified here.



          The loop is a bit clumsy, referring to counter all over the place.



          char chars = new char[length];
          using (var rng = new RNGCryptoServiceProvider())

          for (int counter = 0; counter < length; )

          return new string(chars);



          Algorithm



          The algorithm is rather inefficient:



          • It reads a byte at a time from the random number generator. .GetBytes() is thread safe, so each call should incur some synchronization overhead.

          • It throws away most of the bytes it reads, keeping each byte with a probability of $62 over 256$.

          A smarter approach would be to fetch a bit more than $3 over 4 mathrmlength$ random bytes, base64-encode it, and discard just the resulting + and / characters.






          share|improve this answer














          Data structure



          Why are you using a LinkedList<char>? Linked lists involve a lot of overhead for each node. You know exactly how long the result should be, so why not write to a char array?



          Actually, you may want to consider returning the result as a char rather than as a string. A security-sensitive program may wish to wipe the contents of the secret by overwriting it with NUL bytes, rather than waiting for the garbage collector to clean up the string.



          Flow control



          goto should be avoided, and its use is not justified here.



          The loop is a bit clumsy, referring to counter all over the place.



          char chars = new char[length];
          using (var rng = new RNGCryptoServiceProvider())

          for (int counter = 0; counter < length; )

          return new string(chars);



          Algorithm



          The algorithm is rather inefficient:



          • It reads a byte at a time from the random number generator. .GetBytes() is thread safe, so each call should incur some synchronization overhead.

          • It throws away most of the bytes it reads, keeping each byte with a probability of $62 over 256$.

          A smarter approach would be to fetch a bit more than $3 over 4 mathrmlength$ random bytes, base64-encode it, and discard just the resulting + and / characters.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Nov 7 at 21:03

























          answered Nov 7 at 20:47









          200_success

          127k14148410




          127k14148410











          • Linked lists involve a lot of overhead for each node. - do they? I many cases they are pretty damn fast.
            – t3chb0t
            Nov 7 at 20:49






          • 4




            @t3chb0t "Another disadvantage of linked lists is the extra storage needed for references, which often makes them impractical for lists of small data items such as characters or boolean values, because the storage overhead for the links may exceed by a factor of two or more the size of the data. … It can also be slow, and with a naïve allocator, wasteful, to allocate memory separately for each new element, a problem generally solved using memory pools." Definitely much much worse than a char for this application.
            – 200_success
            Nov 7 at 20:53










          • ok, in general this is true but for a small password generator not really critical - I wouldn't panic - unless OP is generating several millions of them - they are like linq, convinient and with convenience there is a price :-)
            – t3chb0t
            Nov 7 at 21:00







          • 3




            @t3chb0t That makes no sense. Both versions should call GetBytes() the same number of times (statistically speaking). When I measured, mine is marginally faster. Furthermore, if I comment out rng.GetBytes() and produce the same character every time, mine is much much faster. So, the running time is completely dominated by rng.GetBytes(), and my Base64 suggestion should make the biggest difference of all.
            – 200_success
            Nov 7 at 21:43






          • 3




            @t3chb0t When the results make no sense, then it is time to consider whether your test is wrong.
            – 200_success
            2 days ago
















          • Linked lists involve a lot of overhead for each node. - do they? I many cases they are pretty damn fast.
            – t3chb0t
            Nov 7 at 20:49






          • 4




            @t3chb0t "Another disadvantage of linked lists is the extra storage needed for references, which often makes them impractical for lists of small data items such as characters or boolean values, because the storage overhead for the links may exceed by a factor of two or more the size of the data. … It can also be slow, and with a naïve allocator, wasteful, to allocate memory separately for each new element, a problem generally solved using memory pools." Definitely much much worse than a char for this application.
            – 200_success
            Nov 7 at 20:53










          • ok, in general this is true but for a small password generator not really critical - I wouldn't panic - unless OP is generating several millions of them - they are like linq, convinient and with convenience there is a price :-)
            – t3chb0t
            Nov 7 at 21:00







          • 3




            @t3chb0t That makes no sense. Both versions should call GetBytes() the same number of times (statistically speaking). When I measured, mine is marginally faster. Furthermore, if I comment out rng.GetBytes() and produce the same character every time, mine is much much faster. So, the running time is completely dominated by rng.GetBytes(), and my Base64 suggestion should make the biggest difference of all.
            – 200_success
            Nov 7 at 21:43






          • 3




            @t3chb0t When the results make no sense, then it is time to consider whether your test is wrong.
            – 200_success
            2 days ago















          Linked lists involve a lot of overhead for each node. - do they? I many cases they are pretty damn fast.
          – t3chb0t
          Nov 7 at 20:49




          Linked lists involve a lot of overhead for each node. - do they? I many cases they are pretty damn fast.
          – t3chb0t
          Nov 7 at 20:49




          4




          4




          @t3chb0t "Another disadvantage of linked lists is the extra storage needed for references, which often makes them impractical for lists of small data items such as characters or boolean values, because the storage overhead for the links may exceed by a factor of two or more the size of the data. … It can also be slow, and with a naïve allocator, wasteful, to allocate memory separately for each new element, a problem generally solved using memory pools." Definitely much much worse than a char for this application.
          – 200_success
          Nov 7 at 20:53




          @t3chb0t "Another disadvantage of linked lists is the extra storage needed for references, which often makes them impractical for lists of small data items such as characters or boolean values, because the storage overhead for the links may exceed by a factor of two or more the size of the data. … It can also be slow, and with a naïve allocator, wasteful, to allocate memory separately for each new element, a problem generally solved using memory pools." Definitely much much worse than a char for this application.
          – 200_success
          Nov 7 at 20:53












          ok, in general this is true but for a small password generator not really critical - I wouldn't panic - unless OP is generating several millions of them - they are like linq, convinient and with convenience there is a price :-)
          – t3chb0t
          Nov 7 at 21:00





          ok, in general this is true but for a small password generator not really critical - I wouldn't panic - unless OP is generating several millions of them - they are like linq, convinient and with convenience there is a price :-)
          – t3chb0t
          Nov 7 at 21:00





          3




          3




          @t3chb0t That makes no sense. Both versions should call GetBytes() the same number of times (statistically speaking). When I measured, mine is marginally faster. Furthermore, if I comment out rng.GetBytes() and produce the same character every time, mine is much much faster. So, the running time is completely dominated by rng.GetBytes(), and my Base64 suggestion should make the biggest difference of all.
          – 200_success
          Nov 7 at 21:43




          @t3chb0t That makes no sense. Both versions should call GetBytes() the same number of times (statistically speaking). When I measured, mine is marginally faster. Furthermore, if I comment out rng.GetBytes() and produce the same character every time, mine is much much faster. So, the running time is completely dominated by rng.GetBytes(), and my Base64 suggestion should make the biggest difference of all.
          – 200_success
          Nov 7 at 21:43




          3




          3




          @t3chb0t When the results make no sense, then it is time to consider whether your test is wrong.
          – 200_success
          2 days ago




          @t3chb0t When the results make no sense, then it is time to consider whether your test is wrong.
          – 200_success
          2 days ago










          up vote
          12
          down vote













          Readability & GoTo



          If you want to do the same action for 3 different if-statements make them one if-statement or make them a function. goto causes massive headaches when it's been months since you've touched the code and you come back to it not remembering a thing.



          rng.GetBytes(buffer);
          var nextChar = (char)buffer[0];

          var isNumber = nextChar >= '0' && nextChar <= '9';
          var isUppercaseLetter = nextChar >= 'A' && nextChar <= 'Z';
          var isLowercaseLetter = nextChar >= 'a' && nextChar <= 'z';

          if (isNumber || isUppercaseLetter || isLowercaseLetter)

          chars.Add(nextChar);
          ++counter;



          Now with the conditions named even an idiot like me can come in here and see that you're checking if the character is alpha-numeric. If you wanted to make it even more obvious you could go one step further and make it a function.



          public bool IsAlphaNumericCharacter(char c)
          isUppercaseLetter



          Then your loop becomes shorter and clearer.



          while (counter < length)

          rng.GetBytes(buffer);
          var nextChar = (char)buffer[0];

          if (IsAlphaNumericCharacter(nextChar))

          chars.Add(nextChar);
          ++counter;




          Bytes



          Is there a particular reason you're using byte for length instead of int? Also, was there a reason for var buffer = new byte[1]; being an array instead of just var buffer = new byte();



          If the answer to both of those questions is no, then you could have a character array of size length instead of a LinkedList<char>.



          StringBuilder



          StringBuilder could be beneficial, but if you're at a low amount of characters (like your link suggests) then it's probably not going to make a difference performance-wise (always best to test yourself instead of listening to a stranger on the internet though :P).






          share|improve this answer


















          • 1




            You might be doing more work than the usual short-circuit equivalent, by storing comparison results in those (nicely named!) variables before testing. Unless your compiler optimises that away, of course.
            – Toby Speight
            Nov 7 at 20:53







          • 1




            I trust the compiler to do great things, hasn't caused me problems yet. That said, 99% of my performance issues are database bottlenecks (mostly do ASP.NET) so I've never really needed to care about my code's performance haha.
            – Shelby115
            Nov 8 at 2:18











          • "|| is short-circuit" even though you use variables -- Are you sure about that? No, really, that's not me being sarcastic! I'm curious now if the same bytecode is generated. If it is, that'd be worth adding to your answer as a side note.
            – Nic Hartley
            2 days ago










          • uhh whoops, slight oversight on my part. Good catch, my bad.
            – Shelby115
            2 days ago














          up vote
          12
          down vote













          Readability & GoTo



          If you want to do the same action for 3 different if-statements make them one if-statement or make them a function. goto causes massive headaches when it's been months since you've touched the code and you come back to it not remembering a thing.



          rng.GetBytes(buffer);
          var nextChar = (char)buffer[0];

          var isNumber = nextChar >= '0' && nextChar <= '9';
          var isUppercaseLetter = nextChar >= 'A' && nextChar <= 'Z';
          var isLowercaseLetter = nextChar >= 'a' && nextChar <= 'z';

          if (isNumber || isUppercaseLetter || isLowercaseLetter)

          chars.Add(nextChar);
          ++counter;



          Now with the conditions named even an idiot like me can come in here and see that you're checking if the character is alpha-numeric. If you wanted to make it even more obvious you could go one step further and make it a function.



          public bool IsAlphaNumericCharacter(char c)
          isUppercaseLetter



          Then your loop becomes shorter and clearer.



          while (counter < length)

          rng.GetBytes(buffer);
          var nextChar = (char)buffer[0];

          if (IsAlphaNumericCharacter(nextChar))

          chars.Add(nextChar);
          ++counter;




          Bytes



          Is there a particular reason you're using byte for length instead of int? Also, was there a reason for var buffer = new byte[1]; being an array instead of just var buffer = new byte();



          If the answer to both of those questions is no, then you could have a character array of size length instead of a LinkedList<char>.



          StringBuilder



          StringBuilder could be beneficial, but if you're at a low amount of characters (like your link suggests) then it's probably not going to make a difference performance-wise (always best to test yourself instead of listening to a stranger on the internet though :P).






          share|improve this answer


















          • 1




            You might be doing more work than the usual short-circuit equivalent, by storing comparison results in those (nicely named!) variables before testing. Unless your compiler optimises that away, of course.
            – Toby Speight
            Nov 7 at 20:53







          • 1




            I trust the compiler to do great things, hasn't caused me problems yet. That said, 99% of my performance issues are database bottlenecks (mostly do ASP.NET) so I've never really needed to care about my code's performance haha.
            – Shelby115
            Nov 8 at 2:18











          • "|| is short-circuit" even though you use variables -- Are you sure about that? No, really, that's not me being sarcastic! I'm curious now if the same bytecode is generated. If it is, that'd be worth adding to your answer as a side note.
            – Nic Hartley
            2 days ago










          • uhh whoops, slight oversight on my part. Good catch, my bad.
            – Shelby115
            2 days ago












          up vote
          12
          down vote










          up vote
          12
          down vote









          Readability & GoTo



          If you want to do the same action for 3 different if-statements make them one if-statement or make them a function. goto causes massive headaches when it's been months since you've touched the code and you come back to it not remembering a thing.



          rng.GetBytes(buffer);
          var nextChar = (char)buffer[0];

          var isNumber = nextChar >= '0' && nextChar <= '9';
          var isUppercaseLetter = nextChar >= 'A' && nextChar <= 'Z';
          var isLowercaseLetter = nextChar >= 'a' && nextChar <= 'z';

          if (isNumber || isUppercaseLetter || isLowercaseLetter)

          chars.Add(nextChar);
          ++counter;



          Now with the conditions named even an idiot like me can come in here and see that you're checking if the character is alpha-numeric. If you wanted to make it even more obvious you could go one step further and make it a function.



          public bool IsAlphaNumericCharacter(char c)
          isUppercaseLetter



          Then your loop becomes shorter and clearer.



          while (counter < length)

          rng.GetBytes(buffer);
          var nextChar = (char)buffer[0];

          if (IsAlphaNumericCharacter(nextChar))

          chars.Add(nextChar);
          ++counter;




          Bytes



          Is there a particular reason you're using byte for length instead of int? Also, was there a reason for var buffer = new byte[1]; being an array instead of just var buffer = new byte();



          If the answer to both of those questions is no, then you could have a character array of size length instead of a LinkedList<char>.



          StringBuilder



          StringBuilder could be beneficial, but if you're at a low amount of characters (like your link suggests) then it's probably not going to make a difference performance-wise (always best to test yourself instead of listening to a stranger on the internet though :P).






          share|improve this answer














          Readability & GoTo



          If you want to do the same action for 3 different if-statements make them one if-statement or make them a function. goto causes massive headaches when it's been months since you've touched the code and you come back to it not remembering a thing.



          rng.GetBytes(buffer);
          var nextChar = (char)buffer[0];

          var isNumber = nextChar >= '0' && nextChar <= '9';
          var isUppercaseLetter = nextChar >= 'A' && nextChar <= 'Z';
          var isLowercaseLetter = nextChar >= 'a' && nextChar <= 'z';

          if (isNumber || isUppercaseLetter || isLowercaseLetter)

          chars.Add(nextChar);
          ++counter;



          Now with the conditions named even an idiot like me can come in here and see that you're checking if the character is alpha-numeric. If you wanted to make it even more obvious you could go one step further and make it a function.



          public bool IsAlphaNumericCharacter(char c)
          isUppercaseLetter



          Then your loop becomes shorter and clearer.



          while (counter < length)

          rng.GetBytes(buffer);
          var nextChar = (char)buffer[0];

          if (IsAlphaNumericCharacter(nextChar))

          chars.Add(nextChar);
          ++counter;




          Bytes



          Is there a particular reason you're using byte for length instead of int? Also, was there a reason for var buffer = new byte[1]; being an array instead of just var buffer = new byte();



          If the answer to both of those questions is no, then you could have a character array of size length instead of a LinkedList<char>.



          StringBuilder



          StringBuilder could be beneficial, but if you're at a low amount of characters (like your link suggests) then it's probably not going to make a difference performance-wise (always best to test yourself instead of listening to a stranger on the internet though :P).







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited 2 days ago

























          answered Nov 7 at 20:41









          Shelby115

          1,523517




          1,523517







          • 1




            You might be doing more work than the usual short-circuit equivalent, by storing comparison results in those (nicely named!) variables before testing. Unless your compiler optimises that away, of course.
            – Toby Speight
            Nov 7 at 20:53







          • 1




            I trust the compiler to do great things, hasn't caused me problems yet. That said, 99% of my performance issues are database bottlenecks (mostly do ASP.NET) so I've never really needed to care about my code's performance haha.
            – Shelby115
            Nov 8 at 2:18











          • "|| is short-circuit" even though you use variables -- Are you sure about that? No, really, that's not me being sarcastic! I'm curious now if the same bytecode is generated. If it is, that'd be worth adding to your answer as a side note.
            – Nic Hartley
            2 days ago










          • uhh whoops, slight oversight on my part. Good catch, my bad.
            – Shelby115
            2 days ago












          • 1




            You might be doing more work than the usual short-circuit equivalent, by storing comparison results in those (nicely named!) variables before testing. Unless your compiler optimises that away, of course.
            – Toby Speight
            Nov 7 at 20:53







          • 1




            I trust the compiler to do great things, hasn't caused me problems yet. That said, 99% of my performance issues are database bottlenecks (mostly do ASP.NET) so I've never really needed to care about my code's performance haha.
            – Shelby115
            Nov 8 at 2:18











          • "|| is short-circuit" even though you use variables -- Are you sure about that? No, really, that's not me being sarcastic! I'm curious now if the same bytecode is generated. If it is, that'd be worth adding to your answer as a side note.
            – Nic Hartley
            2 days ago










          • uhh whoops, slight oversight on my part. Good catch, my bad.
            – Shelby115
            2 days ago







          1




          1




          You might be doing more work than the usual short-circuit equivalent, by storing comparison results in those (nicely named!) variables before testing. Unless your compiler optimises that away, of course.
          – Toby Speight
          Nov 7 at 20:53





          You might be doing more work than the usual short-circuit equivalent, by storing comparison results in those (nicely named!) variables before testing. Unless your compiler optimises that away, of course.
          – Toby Speight
          Nov 7 at 20:53





          1




          1




          I trust the compiler to do great things, hasn't caused me problems yet. That said, 99% of my performance issues are database bottlenecks (mostly do ASP.NET) so I've never really needed to care about my code's performance haha.
          – Shelby115
          Nov 8 at 2:18





          I trust the compiler to do great things, hasn't caused me problems yet. That said, 99% of my performance issues are database bottlenecks (mostly do ASP.NET) so I've never really needed to care about my code's performance haha.
          – Shelby115
          Nov 8 at 2:18













          "|| is short-circuit" even though you use variables -- Are you sure about that? No, really, that's not me being sarcastic! I'm curious now if the same bytecode is generated. If it is, that'd be worth adding to your answer as a side note.
          – Nic Hartley
          2 days ago




          "|| is short-circuit" even though you use variables -- Are you sure about that? No, really, that's not me being sarcastic! I'm curious now if the same bytecode is generated. If it is, that'd be worth adding to your answer as a side note.
          – Nic Hartley
          2 days ago












          uhh whoops, slight oversight on my part. Good catch, my bad.
          – Shelby115
          2 days ago




          uhh whoops, slight oversight on my part. Good catch, my bad.
          – Shelby115
          2 days ago










          up vote
          9
          down vote













          Avoid premature optimizations, they're really hurting this code.



          In the comments you defend the use of goto using the argument that it avoids branch prediction.



          Accepting the dubious premise that the compiler produces more optimized code from a series of sequential if statements than it would from a conventional if-else construct, lets look at the context of that optimization to see what it might get you.



          As other answers have pointed out, the data structure is poorly chosen, as the cost of a node allocation is going to dwarf the cost of a missed branch prediction.



          Assuming you swap that out, as pointed out in another answer, you'll also want to grab multiple bytes at a time from the RNG. That'll help, but even fully optimized that call will cost more than you could possibly save using a questionable branch prediction optimization.



          That final bit converting the characters to a string? Yep, that's a traversal so you're looking at one conditional per node, plenty of chances for a bad branch prediction.



          The bottom line is that, by focusing on something which makes you feel like you're "thinking for yourself" and "questioning authority" but doesn't actually help, you've given away readability for no measurable benefit.






          share|improve this answer
























            up vote
            9
            down vote













            Avoid premature optimizations, they're really hurting this code.



            In the comments you defend the use of goto using the argument that it avoids branch prediction.



            Accepting the dubious premise that the compiler produces more optimized code from a series of sequential if statements than it would from a conventional if-else construct, lets look at the context of that optimization to see what it might get you.



            As other answers have pointed out, the data structure is poorly chosen, as the cost of a node allocation is going to dwarf the cost of a missed branch prediction.



            Assuming you swap that out, as pointed out in another answer, you'll also want to grab multiple bytes at a time from the RNG. That'll help, but even fully optimized that call will cost more than you could possibly save using a questionable branch prediction optimization.



            That final bit converting the characters to a string? Yep, that's a traversal so you're looking at one conditional per node, plenty of chances for a bad branch prediction.



            The bottom line is that, by focusing on something which makes you feel like you're "thinking for yourself" and "questioning authority" but doesn't actually help, you've given away readability for no measurable benefit.






            share|improve this answer






















              up vote
              9
              down vote










              up vote
              9
              down vote









              Avoid premature optimizations, they're really hurting this code.



              In the comments you defend the use of goto using the argument that it avoids branch prediction.



              Accepting the dubious premise that the compiler produces more optimized code from a series of sequential if statements than it would from a conventional if-else construct, lets look at the context of that optimization to see what it might get you.



              As other answers have pointed out, the data structure is poorly chosen, as the cost of a node allocation is going to dwarf the cost of a missed branch prediction.



              Assuming you swap that out, as pointed out in another answer, you'll also want to grab multiple bytes at a time from the RNG. That'll help, but even fully optimized that call will cost more than you could possibly save using a questionable branch prediction optimization.



              That final bit converting the characters to a string? Yep, that's a traversal so you're looking at one conditional per node, plenty of chances for a bad branch prediction.



              The bottom line is that, by focusing on something which makes you feel like you're "thinking for yourself" and "questioning authority" but doesn't actually help, you've given away readability for no measurable benefit.






              share|improve this answer












              Avoid premature optimizations, they're really hurting this code.



              In the comments you defend the use of goto using the argument that it avoids branch prediction.



              Accepting the dubious premise that the compiler produces more optimized code from a series of sequential if statements than it would from a conventional if-else construct, lets look at the context of that optimization to see what it might get you.



              As other answers have pointed out, the data structure is poorly chosen, as the cost of a node allocation is going to dwarf the cost of a missed branch prediction.



              Assuming you swap that out, as pointed out in another answer, you'll also want to grab multiple bytes at a time from the RNG. That'll help, but even fully optimized that call will cost more than you could possibly save using a questionable branch prediction optimization.



              That final bit converting the characters to a string? Yep, that's a traversal so you're looking at one conditional per node, plenty of chances for a bad branch prediction.



              The bottom line is that, by focusing on something which makes you feel like you're "thinking for yourself" and "questioning authority" but doesn't actually help, you've given away readability for no measurable benefit.







              share|improve this answer












              share|improve this answer



              share|improve this answer










              answered Nov 8 at 3:26









              Morgen

              63139




              63139




















                  up vote
                  5
                  down vote













                  In this case, if performance are not a MEASURED issue (read: you're not generating 1,000,000 passwords) then I'd definitely pick the most readable version.



                  It might even be something like this (completely and absolutely untested) which - in average - should also perform better than original implementation:



                  using (var rng = new RNGCryptoServiceProvider())

                  var buffer = new byte[length * 5];

                  var password = new StringBuilder(length);
                  while (password.Length < length)

                  rng.GetBytes(buffer);
                  password.Append(buffer
                  .Select(x => (char)x)
                  .Where(Char.IsLetterOrDigit)
                  .Take(length - password.Length)
                  .ToArray()
                  );




                  Notes:



                  • Decision to use length * 5 is arbitrary (you take only 58 values from the 256 possible). For small strings there are good chances that it doesn't work well, run this few thousands times and see how many times one call to GetBytes() is not enough.

                  • It performs pretty well (GetBytes() overhead is - typically - payed only once) and you're not building lists of expensive structures (StringBuilder is highly optimized to build strings).

                  • It should be slightly more fast than base64 encoding (which otherwise is a clever idea).

                  • It's really easy to optimize for the best case scenario adding some more code (for example when ToArray() returns the required string you can avoid StringBuilder() - and its initial allocation - all together).

                  • It scales well (parallel generation of N passwords or even generation of a huge password in parallel from M chunks).

                  • Given that your input is a single byte then you can use Char.IsLetterOrDigit directly because all the other non Latin letters and digits are > 255 when UTF-16 encoded).

                  The very next (or previous?) step is to test this code. Can you reliably an extensively test a function which uses a random number generator? Hardly. t3schb0t covers this aspect in his answer.



                  Are we done? There are some security concerns you should consider, see last section.




                  As you did imagine goto is the controversial part of your code. Let me address this separately because, it seems, you're prone to use it.



                  goto usually hurts legibility and you're also (probably) preventing some optimizations compiler might do. Depending where in the compiler pipeline the optimization is applied (near the front-end where goto and if/else differ, or near the back-end - or even after JIT - where they already generated the exact same code) a compiler is probably not optimizing goto (because it's uncommon to be worth of the time and when you're using it then you know what you're doing). In short: if it's because of performance than 99/100 compiler can do a better job than you.



                  I'm not saying that there are not use-cases (there are) but hey're so rare that hat you may live without writing a goto for most of your career (with few more chances for its little brother goto case). Please do not generalize this to every language, in C - for example - it might be useful for clean-up code and error handling but in higher level languages we have other powerful tools.



                  Does it make the code shorter or more readable? In your specific case, even without changing anything else, the very same code might be rewritten as:



                  while (chars.Count < length)

                  rng.GetBytes(buffer);
                  char nextChar = (char)buffer[0];

                  if (Char.IsLetterOrDigit(nextChar))
                  chars.Add(nextChar);



                  What did we do? We moved some logic into a separate function, this has two advantages:



                  • It makes logic easier to follow because we can read the function name instead of understanding the code.

                  • If frees us from the need of that goto because we can reduce those multiple ifs.

                  What if there wasn't Char.IsLetterOrDigit()? You write your own and 90% you will get rid of those jumps. There are more things in the original code (beside goto): counter is useless, you can use an array instead of a linked list (size is known), it's preferable to avoid var for primitive types, you're hard-coding magic constants, and...you're reducing the entropy of the generated password.




                  Security considerations



                  You're correctly using RNGCryptoServiceProvider to generate the random numbers however you're not using a proper one-to-one encoding algorithm to transform those numbers into text. Simply dropping some values reduces the entropy. This is unavoidable if you want a fixed length alphanumeric string.



                  However if your goal is to generate a password with the desired amount of entropy then for this step you should use base62 (as elsewhere suggested base64 for performance reasons), a Diceware algorithm, or a perfect hash function (no, you can't use modulus for this because it, obviously, has multiple collisions).






                  share|improve this answer






















                  • I'm pretty sure the outher while could be squeezed into a nice linq chain by using TakeUntil ;-]
                    – t3chb0t
                    2 days ago











                  • definitely! If working with RNG OP has already (probably) - a generator function and everything might be reduced to rng.Generate()....Take().ToArray()!
                    – Adriano Repetti
                    2 days ago











                  • I'm not sure, you can make that cast from byte to char using Cast<char>() - you'll have to use Select(b => (char)b). If you move rng.GetBytes(buffer); inside the while-loop as the first thing you'll have a fresh set of bytes for every iteration to avoid (or at least minimize) the risk of having the buffer filled with only non-nums/digits
                    – Henrik Hansen
                    2 days ago






                  • 2




                    if performance are not a MEASURED issue (read: you're not generating 1,000,000 passwords) then I'd definitely pick the most readable version. - this is the wisest sentence here - OP didn't say anthing about performance but all reviews are extremely focused on it and everyone tries to write the fastest code... why?
                    – t3chb0t
                    2 days ago







                  • 1




                    Btw, coincidentally the optimization with getting a larger buffer and callig GetBytes fewer time beats any other version presented here.
                    – t3chb0t
                    2 days ago














                  up vote
                  5
                  down vote













                  In this case, if performance are not a MEASURED issue (read: you're not generating 1,000,000 passwords) then I'd definitely pick the most readable version.



                  It might even be something like this (completely and absolutely untested) which - in average - should also perform better than original implementation:



                  using (var rng = new RNGCryptoServiceProvider())

                  var buffer = new byte[length * 5];

                  var password = new StringBuilder(length);
                  while (password.Length < length)

                  rng.GetBytes(buffer);
                  password.Append(buffer
                  .Select(x => (char)x)
                  .Where(Char.IsLetterOrDigit)
                  .Take(length - password.Length)
                  .ToArray()
                  );




                  Notes:



                  • Decision to use length * 5 is arbitrary (you take only 58 values from the 256 possible). For small strings there are good chances that it doesn't work well, run this few thousands times and see how many times one call to GetBytes() is not enough.

                  • It performs pretty well (GetBytes() overhead is - typically - payed only once) and you're not building lists of expensive structures (StringBuilder is highly optimized to build strings).

                  • It should be slightly more fast than base64 encoding (which otherwise is a clever idea).

                  • It's really easy to optimize for the best case scenario adding some more code (for example when ToArray() returns the required string you can avoid StringBuilder() - and its initial allocation - all together).

                  • It scales well (parallel generation of N passwords or even generation of a huge password in parallel from M chunks).

                  • Given that your input is a single byte then you can use Char.IsLetterOrDigit directly because all the other non Latin letters and digits are > 255 when UTF-16 encoded).

                  The very next (or previous?) step is to test this code. Can you reliably an extensively test a function which uses a random number generator? Hardly. t3schb0t covers this aspect in his answer.



                  Are we done? There are some security concerns you should consider, see last section.




                  As you did imagine goto is the controversial part of your code. Let me address this separately because, it seems, you're prone to use it.



                  goto usually hurts legibility and you're also (probably) preventing some optimizations compiler might do. Depending where in the compiler pipeline the optimization is applied (near the front-end where goto and if/else differ, or near the back-end - or even after JIT - where they already generated the exact same code) a compiler is probably not optimizing goto (because it's uncommon to be worth of the time and when you're using it then you know what you're doing). In short: if it's because of performance than 99/100 compiler can do a better job than you.



                  I'm not saying that there are not use-cases (there are) but hey're so rare that hat you may live without writing a goto for most of your career (with few more chances for its little brother goto case). Please do not generalize this to every language, in C - for example - it might be useful for clean-up code and error handling but in higher level languages we have other powerful tools.



                  Does it make the code shorter or more readable? In your specific case, even without changing anything else, the very same code might be rewritten as:



                  while (chars.Count < length)

                  rng.GetBytes(buffer);
                  char nextChar = (char)buffer[0];

                  if (Char.IsLetterOrDigit(nextChar))
                  chars.Add(nextChar);



                  What did we do? We moved some logic into a separate function, this has two advantages:



                  • It makes logic easier to follow because we can read the function name instead of understanding the code.

                  • If frees us from the need of that goto because we can reduce those multiple ifs.

                  What if there wasn't Char.IsLetterOrDigit()? You write your own and 90% you will get rid of those jumps. There are more things in the original code (beside goto): counter is useless, you can use an array instead of a linked list (size is known), it's preferable to avoid var for primitive types, you're hard-coding magic constants, and...you're reducing the entropy of the generated password.




                  Security considerations



                  You're correctly using RNGCryptoServiceProvider to generate the random numbers however you're not using a proper one-to-one encoding algorithm to transform those numbers into text. Simply dropping some values reduces the entropy. This is unavoidable if you want a fixed length alphanumeric string.



                  However if your goal is to generate a password with the desired amount of entropy then for this step you should use base62 (as elsewhere suggested base64 for performance reasons), a Diceware algorithm, or a perfect hash function (no, you can't use modulus for this because it, obviously, has multiple collisions).






                  share|improve this answer






















                  • I'm pretty sure the outher while could be squeezed into a nice linq chain by using TakeUntil ;-]
                    – t3chb0t
                    2 days ago











                  • definitely! If working with RNG OP has already (probably) - a generator function and everything might be reduced to rng.Generate()....Take().ToArray()!
                    – Adriano Repetti
                    2 days ago











                  • I'm not sure, you can make that cast from byte to char using Cast<char>() - you'll have to use Select(b => (char)b). If you move rng.GetBytes(buffer); inside the while-loop as the first thing you'll have a fresh set of bytes for every iteration to avoid (or at least minimize) the risk of having the buffer filled with only non-nums/digits
                    – Henrik Hansen
                    2 days ago






                  • 2




                    if performance are not a MEASURED issue (read: you're not generating 1,000,000 passwords) then I'd definitely pick the most readable version. - this is the wisest sentence here - OP didn't say anthing about performance but all reviews are extremely focused on it and everyone tries to write the fastest code... why?
                    – t3chb0t
                    2 days ago







                  • 1




                    Btw, coincidentally the optimization with getting a larger buffer and callig GetBytes fewer time beats any other version presented here.
                    – t3chb0t
                    2 days ago












                  up vote
                  5
                  down vote










                  up vote
                  5
                  down vote









                  In this case, if performance are not a MEASURED issue (read: you're not generating 1,000,000 passwords) then I'd definitely pick the most readable version.



                  It might even be something like this (completely and absolutely untested) which - in average - should also perform better than original implementation:



                  using (var rng = new RNGCryptoServiceProvider())

                  var buffer = new byte[length * 5];

                  var password = new StringBuilder(length);
                  while (password.Length < length)

                  rng.GetBytes(buffer);
                  password.Append(buffer
                  .Select(x => (char)x)
                  .Where(Char.IsLetterOrDigit)
                  .Take(length - password.Length)
                  .ToArray()
                  );




                  Notes:



                  • Decision to use length * 5 is arbitrary (you take only 58 values from the 256 possible). For small strings there are good chances that it doesn't work well, run this few thousands times and see how many times one call to GetBytes() is not enough.

                  • It performs pretty well (GetBytes() overhead is - typically - payed only once) and you're not building lists of expensive structures (StringBuilder is highly optimized to build strings).

                  • It should be slightly more fast than base64 encoding (which otherwise is a clever idea).

                  • It's really easy to optimize for the best case scenario adding some more code (for example when ToArray() returns the required string you can avoid StringBuilder() - and its initial allocation - all together).

                  • It scales well (parallel generation of N passwords or even generation of a huge password in parallel from M chunks).

                  • Given that your input is a single byte then you can use Char.IsLetterOrDigit directly because all the other non Latin letters and digits are > 255 when UTF-16 encoded).

                  The very next (or previous?) step is to test this code. Can you reliably an extensively test a function which uses a random number generator? Hardly. t3schb0t covers this aspect in his answer.



                  Are we done? There are some security concerns you should consider, see last section.




                  As you did imagine goto is the controversial part of your code. Let me address this separately because, it seems, you're prone to use it.



                  goto usually hurts legibility and you're also (probably) preventing some optimizations compiler might do. Depending where in the compiler pipeline the optimization is applied (near the front-end where goto and if/else differ, or near the back-end - or even after JIT - where they already generated the exact same code) a compiler is probably not optimizing goto (because it's uncommon to be worth of the time and when you're using it then you know what you're doing). In short: if it's because of performance than 99/100 compiler can do a better job than you.



                  I'm not saying that there are not use-cases (there are) but hey're so rare that hat you may live without writing a goto for most of your career (with few more chances for its little brother goto case). Please do not generalize this to every language, in C - for example - it might be useful for clean-up code and error handling but in higher level languages we have other powerful tools.



                  Does it make the code shorter or more readable? In your specific case, even without changing anything else, the very same code might be rewritten as:



                  while (chars.Count < length)

                  rng.GetBytes(buffer);
                  char nextChar = (char)buffer[0];

                  if (Char.IsLetterOrDigit(nextChar))
                  chars.Add(nextChar);



                  What did we do? We moved some logic into a separate function, this has two advantages:



                  • It makes logic easier to follow because we can read the function name instead of understanding the code.

                  • If frees us from the need of that goto because we can reduce those multiple ifs.

                  What if there wasn't Char.IsLetterOrDigit()? You write your own and 90% you will get rid of those jumps. There are more things in the original code (beside goto): counter is useless, you can use an array instead of a linked list (size is known), it's preferable to avoid var for primitive types, you're hard-coding magic constants, and...you're reducing the entropy of the generated password.




                  Security considerations



                  You're correctly using RNGCryptoServiceProvider to generate the random numbers however you're not using a proper one-to-one encoding algorithm to transform those numbers into text. Simply dropping some values reduces the entropy. This is unavoidable if you want a fixed length alphanumeric string.



                  However if your goal is to generate a password with the desired amount of entropy then for this step you should use base62 (as elsewhere suggested base64 for performance reasons), a Diceware algorithm, or a perfect hash function (no, you can't use modulus for this because it, obviously, has multiple collisions).






                  share|improve this answer














                  In this case, if performance are not a MEASURED issue (read: you're not generating 1,000,000 passwords) then I'd definitely pick the most readable version.



                  It might even be something like this (completely and absolutely untested) which - in average - should also perform better than original implementation:



                  using (var rng = new RNGCryptoServiceProvider())

                  var buffer = new byte[length * 5];

                  var password = new StringBuilder(length);
                  while (password.Length < length)

                  rng.GetBytes(buffer);
                  password.Append(buffer
                  .Select(x => (char)x)
                  .Where(Char.IsLetterOrDigit)
                  .Take(length - password.Length)
                  .ToArray()
                  );




                  Notes:



                  • Decision to use length * 5 is arbitrary (you take only 58 values from the 256 possible). For small strings there are good chances that it doesn't work well, run this few thousands times and see how many times one call to GetBytes() is not enough.

                  • It performs pretty well (GetBytes() overhead is - typically - payed only once) and you're not building lists of expensive structures (StringBuilder is highly optimized to build strings).

                  • It should be slightly more fast than base64 encoding (which otherwise is a clever idea).

                  • It's really easy to optimize for the best case scenario adding some more code (for example when ToArray() returns the required string you can avoid StringBuilder() - and its initial allocation - all together).

                  • It scales well (parallel generation of N passwords or even generation of a huge password in parallel from M chunks).

                  • Given that your input is a single byte then you can use Char.IsLetterOrDigit directly because all the other non Latin letters and digits are > 255 when UTF-16 encoded).

                  The very next (or previous?) step is to test this code. Can you reliably an extensively test a function which uses a random number generator? Hardly. t3schb0t covers this aspect in his answer.



                  Are we done? There are some security concerns you should consider, see last section.




                  As you did imagine goto is the controversial part of your code. Let me address this separately because, it seems, you're prone to use it.



                  goto usually hurts legibility and you're also (probably) preventing some optimizations compiler might do. Depending where in the compiler pipeline the optimization is applied (near the front-end where goto and if/else differ, or near the back-end - or even after JIT - where they already generated the exact same code) a compiler is probably not optimizing goto (because it's uncommon to be worth of the time and when you're using it then you know what you're doing). In short: if it's because of performance than 99/100 compiler can do a better job than you.



                  I'm not saying that there are not use-cases (there are) but hey're so rare that hat you may live without writing a goto for most of your career (with few more chances for its little brother goto case). Please do not generalize this to every language, in C - for example - it might be useful for clean-up code and error handling but in higher level languages we have other powerful tools.



                  Does it make the code shorter or more readable? In your specific case, even without changing anything else, the very same code might be rewritten as:



                  while (chars.Count < length)

                  rng.GetBytes(buffer);
                  char nextChar = (char)buffer[0];

                  if (Char.IsLetterOrDigit(nextChar))
                  chars.Add(nextChar);



                  What did we do? We moved some logic into a separate function, this has two advantages:



                  • It makes logic easier to follow because we can read the function name instead of understanding the code.

                  • If frees us from the need of that goto because we can reduce those multiple ifs.

                  What if there wasn't Char.IsLetterOrDigit()? You write your own and 90% you will get rid of those jumps. There are more things in the original code (beside goto): counter is useless, you can use an array instead of a linked list (size is known), it's preferable to avoid var for primitive types, you're hard-coding magic constants, and...you're reducing the entropy of the generated password.




                  Security considerations



                  You're correctly using RNGCryptoServiceProvider to generate the random numbers however you're not using a proper one-to-one encoding algorithm to transform those numbers into text. Simply dropping some values reduces the entropy. This is unavoidable if you want a fixed length alphanumeric string.



                  However if your goal is to generate a password with the desired amount of entropy then for this step you should use base62 (as elsewhere suggested base64 for performance reasons), a Diceware algorithm, or a perfect hash function (no, you can't use modulus for this because it, obviously, has multiple collisions).







                  share|improve this answer














                  share|improve this answer



                  share|improve this answer








                  edited yesterday

























                  answered 2 days ago









                  Adriano Repetti

                  9,61511440




                  9,61511440











                  • I'm pretty sure the outher while could be squeezed into a nice linq chain by using TakeUntil ;-]
                    – t3chb0t
                    2 days ago











                  • definitely! If working with RNG OP has already (probably) - a generator function and everything might be reduced to rng.Generate()....Take().ToArray()!
                    – Adriano Repetti
                    2 days ago











                  • I'm not sure, you can make that cast from byte to char using Cast<char>() - you'll have to use Select(b => (char)b). If you move rng.GetBytes(buffer); inside the while-loop as the first thing you'll have a fresh set of bytes for every iteration to avoid (or at least minimize) the risk of having the buffer filled with only non-nums/digits
                    – Henrik Hansen
                    2 days ago






                  • 2




                    if performance are not a MEASURED issue (read: you're not generating 1,000,000 passwords) then I'd definitely pick the most readable version. - this is the wisest sentence here - OP didn't say anthing about performance but all reviews are extremely focused on it and everyone tries to write the fastest code... why?
                    – t3chb0t
                    2 days ago







                  • 1




                    Btw, coincidentally the optimization with getting a larger buffer and callig GetBytes fewer time beats any other version presented here.
                    – t3chb0t
                    2 days ago
















                  • I'm pretty sure the outher while could be squeezed into a nice linq chain by using TakeUntil ;-]
                    – t3chb0t
                    2 days ago











                  • definitely! If working with RNG OP has already (probably) - a generator function and everything might be reduced to rng.Generate()....Take().ToArray()!
                    – Adriano Repetti
                    2 days ago











                  • I'm not sure, you can make that cast from byte to char using Cast<char>() - you'll have to use Select(b => (char)b). If you move rng.GetBytes(buffer); inside the while-loop as the first thing you'll have a fresh set of bytes for every iteration to avoid (or at least minimize) the risk of having the buffer filled with only non-nums/digits
                    – Henrik Hansen
                    2 days ago






                  • 2




                    if performance are not a MEASURED issue (read: you're not generating 1,000,000 passwords) then I'd definitely pick the most readable version. - this is the wisest sentence here - OP didn't say anthing about performance but all reviews are extremely focused on it and everyone tries to write the fastest code... why?
                    – t3chb0t
                    2 days ago







                  • 1




                    Btw, coincidentally the optimization with getting a larger buffer and callig GetBytes fewer time beats any other version presented here.
                    – t3chb0t
                    2 days ago















                  I'm pretty sure the outher while could be squeezed into a nice linq chain by using TakeUntil ;-]
                  – t3chb0t
                  2 days ago





                  I'm pretty sure the outher while could be squeezed into a nice linq chain by using TakeUntil ;-]
                  – t3chb0t
                  2 days ago













                  definitely! If working with RNG OP has already (probably) - a generator function and everything might be reduced to rng.Generate()....Take().ToArray()!
                  – Adriano Repetti
                  2 days ago





                  definitely! If working with RNG OP has already (probably) - a generator function and everything might be reduced to rng.Generate()....Take().ToArray()!
                  – Adriano Repetti
                  2 days ago













                  I'm not sure, you can make that cast from byte to char using Cast<char>() - you'll have to use Select(b => (char)b). If you move rng.GetBytes(buffer); inside the while-loop as the first thing you'll have a fresh set of bytes for every iteration to avoid (or at least minimize) the risk of having the buffer filled with only non-nums/digits
                  – Henrik Hansen
                  2 days ago




                  I'm not sure, you can make that cast from byte to char using Cast<char>() - you'll have to use Select(b => (char)b). If you move rng.GetBytes(buffer); inside the while-loop as the first thing you'll have a fresh set of bytes for every iteration to avoid (or at least minimize) the risk of having the buffer filled with only non-nums/digits
                  – Henrik Hansen
                  2 days ago




                  2




                  2




                  if performance are not a MEASURED issue (read: you're not generating 1,000,000 passwords) then I'd definitely pick the most readable version. - this is the wisest sentence here - OP didn't say anthing about performance but all reviews are extremely focused on it and everyone tries to write the fastest code... why?
                  – t3chb0t
                  2 days ago





                  if performance are not a MEASURED issue (read: you're not generating 1,000,000 passwords) then I'd definitely pick the most readable version. - this is the wisest sentence here - OP didn't say anthing about performance but all reviews are extremely focused on it and everyone tries to write the fastest code... why?
                  – t3chb0t
                  2 days ago





                  1




                  1




                  Btw, coincidentally the optimization with getting a larger buffer and callig GetBytes fewer time beats any other version presented here.
                  – t3chb0t
                  2 days ago




                  Btw, coincidentally the optimization with getting a larger buffer and callig GetBytes fewer time beats any other version presented here.
                  – t3chb0t
                  2 days ago










                  up vote
                  4
                  down vote













                  There is only one answer so far, that by @Adriano Repetti where clean code is mentioned. All others are havily focused on performane.



                  I'd like to add to this one answer and say that reusibility comes always first. I agree that if performance is not of great importance, you should primarily create an API that is easy to use, mantain and test.



                  Organizing your code from this point of view would mean to first create an extension for the RandomNumberGenerator that is the base class of the RNGCryptoServiceProvider. It would enumerate only chars that you are interested in. It doesn't have to know how many, that's what we have LINQ for so it creates an endless collection of chars.



                  public static IEnumerable<char> EnumerateLetterOrDigit(this RandomNumberGenerator rng, int bufferSize = 256)

                  var buffer = new byte[bufferSize];
                  while (true)

                  rng.GetBytes(buffer);
                  foreach (var c in buffer.Select(b => (char)b).Where(char.IsLetterOrDigit))

                  yield return c;





                  By doing so you could use a different provider or even mock it for testing and always return the same values. The options are there.



                  From here, it's now very easy to build your password generator. Create any random-number-generator, Take as many chars as you need and create the string.



                  public static string Generate(int length)

                  using (var rng = new RNGCryptoServiceProvider())

                  return new string(rng.EnumerateLetterOrDigit().Take(length).ToArray());




                  And even though it havily uses LINQ, it's accidentally faster then your code because of the reduced number of calls to GetBytes (discovered by @Adriano?).



                  So saying that LINQ is always slow is once again proven wrong because most of the time there will be something else that isn't optimal. LINQ is just convenient and this is so great about it!




                  There is one thing about goto that hasn't been said yet... and you're using something very similar everyday, it's called switch:



                  public static void Main()

                  var num = 1;

                  switch (num)

                  case 1:
                  goto case 3;
                  case 2:
                  Console.WriteLine("Hi goto!");
                  break;
                  case 3:
                  goto case 2;




                  Yes, you can jump within its scope from case to case in any order.






                  share|improve this answer


























                    up vote
                    4
                    down vote













                    There is only one answer so far, that by @Adriano Repetti where clean code is mentioned. All others are havily focused on performane.



                    I'd like to add to this one answer and say that reusibility comes always first. I agree that if performance is not of great importance, you should primarily create an API that is easy to use, mantain and test.



                    Organizing your code from this point of view would mean to first create an extension for the RandomNumberGenerator that is the base class of the RNGCryptoServiceProvider. It would enumerate only chars that you are interested in. It doesn't have to know how many, that's what we have LINQ for so it creates an endless collection of chars.



                    public static IEnumerable<char> EnumerateLetterOrDigit(this RandomNumberGenerator rng, int bufferSize = 256)

                    var buffer = new byte[bufferSize];
                    while (true)

                    rng.GetBytes(buffer);
                    foreach (var c in buffer.Select(b => (char)b).Where(char.IsLetterOrDigit))

                    yield return c;





                    By doing so you could use a different provider or even mock it for testing and always return the same values. The options are there.



                    From here, it's now very easy to build your password generator. Create any random-number-generator, Take as many chars as you need and create the string.



                    public static string Generate(int length)

                    using (var rng = new RNGCryptoServiceProvider())

                    return new string(rng.EnumerateLetterOrDigit().Take(length).ToArray());




                    And even though it havily uses LINQ, it's accidentally faster then your code because of the reduced number of calls to GetBytes (discovered by @Adriano?).



                    So saying that LINQ is always slow is once again proven wrong because most of the time there will be something else that isn't optimal. LINQ is just convenient and this is so great about it!




                    There is one thing about goto that hasn't been said yet... and you're using something very similar everyday, it's called switch:



                    public static void Main()

                    var num = 1;

                    switch (num)

                    case 1:
                    goto case 3;
                    case 2:
                    Console.WriteLine("Hi goto!");
                    break;
                    case 3:
                    goto case 2;




                    Yes, you can jump within its scope from case to case in any order.






                    share|improve this answer
























                      up vote
                      4
                      down vote










                      up vote
                      4
                      down vote









                      There is only one answer so far, that by @Adriano Repetti where clean code is mentioned. All others are havily focused on performane.



                      I'd like to add to this one answer and say that reusibility comes always first. I agree that if performance is not of great importance, you should primarily create an API that is easy to use, mantain and test.



                      Organizing your code from this point of view would mean to first create an extension for the RandomNumberGenerator that is the base class of the RNGCryptoServiceProvider. It would enumerate only chars that you are interested in. It doesn't have to know how many, that's what we have LINQ for so it creates an endless collection of chars.



                      public static IEnumerable<char> EnumerateLetterOrDigit(this RandomNumberGenerator rng, int bufferSize = 256)

                      var buffer = new byte[bufferSize];
                      while (true)

                      rng.GetBytes(buffer);
                      foreach (var c in buffer.Select(b => (char)b).Where(char.IsLetterOrDigit))

                      yield return c;





                      By doing so you could use a different provider or even mock it for testing and always return the same values. The options are there.



                      From here, it's now very easy to build your password generator. Create any random-number-generator, Take as many chars as you need and create the string.



                      public static string Generate(int length)

                      using (var rng = new RNGCryptoServiceProvider())

                      return new string(rng.EnumerateLetterOrDigit().Take(length).ToArray());




                      And even though it havily uses LINQ, it's accidentally faster then your code because of the reduced number of calls to GetBytes (discovered by @Adriano?).



                      So saying that LINQ is always slow is once again proven wrong because most of the time there will be something else that isn't optimal. LINQ is just convenient and this is so great about it!




                      There is one thing about goto that hasn't been said yet... and you're using something very similar everyday, it's called switch:



                      public static void Main()

                      var num = 1;

                      switch (num)

                      case 1:
                      goto case 3;
                      case 2:
                      Console.WriteLine("Hi goto!");
                      break;
                      case 3:
                      goto case 2;




                      Yes, you can jump within its scope from case to case in any order.






                      share|improve this answer














                      There is only one answer so far, that by @Adriano Repetti where clean code is mentioned. All others are havily focused on performane.



                      I'd like to add to this one answer and say that reusibility comes always first. I agree that if performance is not of great importance, you should primarily create an API that is easy to use, mantain and test.



                      Organizing your code from this point of view would mean to first create an extension for the RandomNumberGenerator that is the base class of the RNGCryptoServiceProvider. It would enumerate only chars that you are interested in. It doesn't have to know how many, that's what we have LINQ for so it creates an endless collection of chars.



                      public static IEnumerable<char> EnumerateLetterOrDigit(this RandomNumberGenerator rng, int bufferSize = 256)

                      var buffer = new byte[bufferSize];
                      while (true)

                      rng.GetBytes(buffer);
                      foreach (var c in buffer.Select(b => (char)b).Where(char.IsLetterOrDigit))

                      yield return c;





                      By doing so you could use a different provider or even mock it for testing and always return the same values. The options are there.



                      From here, it's now very easy to build your password generator. Create any random-number-generator, Take as many chars as you need and create the string.



                      public static string Generate(int length)

                      using (var rng = new RNGCryptoServiceProvider())

                      return new string(rng.EnumerateLetterOrDigit().Take(length).ToArray());




                      And even though it havily uses LINQ, it's accidentally faster then your code because of the reduced number of calls to GetBytes (discovered by @Adriano?).



                      So saying that LINQ is always slow is once again proven wrong because most of the time there will be something else that isn't optimal. LINQ is just convenient and this is so great about it!




                      There is one thing about goto that hasn't been said yet... and you're using something very similar everyday, it's called switch:



                      public static void Main()

                      var num = 1;

                      switch (num)

                      case 1:
                      goto case 3;
                      case 2:
                      Console.WriteLine("Hi goto!");
                      break;
                      case 3:
                      goto case 2;




                      Yes, you can jump within its scope from case to case in any order.







                      share|improve this answer














                      share|improve this answer



                      share|improve this answer








                      edited yesterday

























                      answered 2 days ago









                      t3chb0t

                      33.2k744106




                      33.2k744106




















                          up vote
                          3
                          down vote













                          goto, part deux




                          I don't agree with the mass consciousness that goto is a code smell.




                          code smell means a possible problem, not that is is a problem. So yes, it is a code smell. I too have used goto, only once in the last decade - and it was with C#.



                          Why goto could be bad



                          We do not trust ourselves to use it wisely and well. This is borne out in history. There are millllions of COBOL LOC demonstrating that when used routinely goto tends to corrupt flow control, create inter and intra method coupling, and spawns "one off" code duplication; all to such a degree that one should assume the code will break when changed. In other words the code becomes unmaintainable.




                          Why goto is bad



                          goto murders the bedrock of all good programs - structured programming and modular construction. goto literally goes around them. Anything you can say about modern languages, coding principles, object orientation, etc. is fundamentally rooted in these two precepts. full stop.



                          As for history: the unstructured nature of COBOL, for example, gave rise to an "hard jump" coding style and mind set. It's fair to say the liberal use of goto is a code lobotomy. Many factors contribute to crumbling code but the more gotos the more schizophrenic the code becomes. I have worked on a system that gave a dreadful feeling of random execution when stepping through with the debugger. Many existing bugs could not be duplicated and almost no bug could be reproduced reliably. 99 percent of code tweaks induced bugs. I am not exaggerating at all here! Our in house customer head guy told me "we don't bring up problems anymore because we know they won't be fixed."



                          So wide spread was the problem that Edsger Dijkstra wrote a seminal paper GOTO statement considered harmful



                          A very readable, practical, nicely concise "using goto wisely" discussion






                          share|improve this answer






















                          • These are the best arguments against goto I've ever seen.
                            – t3chb0t
                            yesterday














                          up vote
                          3
                          down vote













                          goto, part deux




                          I don't agree with the mass consciousness that goto is a code smell.




                          code smell means a possible problem, not that is is a problem. So yes, it is a code smell. I too have used goto, only once in the last decade - and it was with C#.



                          Why goto could be bad



                          We do not trust ourselves to use it wisely and well. This is borne out in history. There are millllions of COBOL LOC demonstrating that when used routinely goto tends to corrupt flow control, create inter and intra method coupling, and spawns "one off" code duplication; all to such a degree that one should assume the code will break when changed. In other words the code becomes unmaintainable.




                          Why goto is bad



                          goto murders the bedrock of all good programs - structured programming and modular construction. goto literally goes around them. Anything you can say about modern languages, coding principles, object orientation, etc. is fundamentally rooted in these two precepts. full stop.



                          As for history: the unstructured nature of COBOL, for example, gave rise to an "hard jump" coding style and mind set. It's fair to say the liberal use of goto is a code lobotomy. Many factors contribute to crumbling code but the more gotos the more schizophrenic the code becomes. I have worked on a system that gave a dreadful feeling of random execution when stepping through with the debugger. Many existing bugs could not be duplicated and almost no bug could be reproduced reliably. 99 percent of code tweaks induced bugs. I am not exaggerating at all here! Our in house customer head guy told me "we don't bring up problems anymore because we know they won't be fixed."



                          So wide spread was the problem that Edsger Dijkstra wrote a seminal paper GOTO statement considered harmful



                          A very readable, practical, nicely concise "using goto wisely" discussion






                          share|improve this answer






















                          • These are the best arguments against goto I've ever seen.
                            – t3chb0t
                            yesterday












                          up vote
                          3
                          down vote










                          up vote
                          3
                          down vote









                          goto, part deux




                          I don't agree with the mass consciousness that goto is a code smell.




                          code smell means a possible problem, not that is is a problem. So yes, it is a code smell. I too have used goto, only once in the last decade - and it was with C#.



                          Why goto could be bad



                          We do not trust ourselves to use it wisely and well. This is borne out in history. There are millllions of COBOL LOC demonstrating that when used routinely goto tends to corrupt flow control, create inter and intra method coupling, and spawns "one off" code duplication; all to such a degree that one should assume the code will break when changed. In other words the code becomes unmaintainable.




                          Why goto is bad



                          goto murders the bedrock of all good programs - structured programming and modular construction. goto literally goes around them. Anything you can say about modern languages, coding principles, object orientation, etc. is fundamentally rooted in these two precepts. full stop.



                          As for history: the unstructured nature of COBOL, for example, gave rise to an "hard jump" coding style and mind set. It's fair to say the liberal use of goto is a code lobotomy. Many factors contribute to crumbling code but the more gotos the more schizophrenic the code becomes. I have worked on a system that gave a dreadful feeling of random execution when stepping through with the debugger. Many existing bugs could not be duplicated and almost no bug could be reproduced reliably. 99 percent of code tweaks induced bugs. I am not exaggerating at all here! Our in house customer head guy told me "we don't bring up problems anymore because we know they won't be fixed."



                          So wide spread was the problem that Edsger Dijkstra wrote a seminal paper GOTO statement considered harmful



                          A very readable, practical, nicely concise "using goto wisely" discussion






                          share|improve this answer














                          goto, part deux




                          I don't agree with the mass consciousness that goto is a code smell.




                          code smell means a possible problem, not that is is a problem. So yes, it is a code smell. I too have used goto, only once in the last decade - and it was with C#.



                          Why goto could be bad



                          We do not trust ourselves to use it wisely and well. This is borne out in history. There are millllions of COBOL LOC demonstrating that when used routinely goto tends to corrupt flow control, create inter and intra method coupling, and spawns "one off" code duplication; all to such a degree that one should assume the code will break when changed. In other words the code becomes unmaintainable.




                          Why goto is bad



                          goto murders the bedrock of all good programs - structured programming and modular construction. goto literally goes around them. Anything you can say about modern languages, coding principles, object orientation, etc. is fundamentally rooted in these two precepts. full stop.



                          As for history: the unstructured nature of COBOL, for example, gave rise to an "hard jump" coding style and mind set. It's fair to say the liberal use of goto is a code lobotomy. Many factors contribute to crumbling code but the more gotos the more schizophrenic the code becomes. I have worked on a system that gave a dreadful feeling of random execution when stepping through with the debugger. Many existing bugs could not be duplicated and almost no bug could be reproduced reliably. 99 percent of code tweaks induced bugs. I am not exaggerating at all here! Our in house customer head guy told me "we don't bring up problems anymore because we know they won't be fixed."



                          So wide spread was the problem that Edsger Dijkstra wrote a seminal paper GOTO statement considered harmful



                          A very readable, practical, nicely concise "using goto wisely" discussion







                          share|improve this answer














                          share|improve this answer



                          share|improve this answer








                          edited 2 days ago

























                          answered 2 days ago









                          radarbob

                          5,2541025




                          5,2541025











                          • These are the best arguments against goto I've ever seen.
                            – t3chb0t
                            yesterday
















                          • These are the best arguments against goto I've ever seen.
                            – t3chb0t
                            yesterday















                          These are the best arguments against goto I've ever seen.
                          – t3chb0t
                          yesterday




                          These are the best arguments against goto I've ever seen.
                          – t3chb0t
                          yesterday










                          up vote
                          1
                          down vote













                          I can only add to the already said, that your algorithm has the potential to run for ever - theoretically speaking.




                          The most effective approach is to select chars from a list of valid chars. If using the numbers [0, 9] and the English letters [a, z] and [A, Z] we end up with 62 entries which will cause a byte based random selection be biased because 256 % 62 != 0. The problem can be solved by either adding two more chars resulting in 64 char:



                           const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+";


                          or reduce the valid chars to 32:



                          const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUV";


                          There are pros and cons to both obviously, but both can be used i an efficient algorithm like:



                          public string Generate(int length)

                          const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+";

                          using (var rng = new RNGCryptoServiceProvider())

                          char result = new char[length];
                          byte buffer = new byte[length];
                          int range = chars.Length;
                          rng.GetBytes(buffer);

                          for (int i = 0; i < length; i++)

                          result[i] = chars[buffer[i] % range];


                          return new string(result);




                          This should be unbiased.



                          I deliberately avoid the use of LINQ, because it is nearly always slower than more "native" code. This not to say, I in general avoid using LINQ, because I use it very often and gladly, but it is my experience that when it comes to performance, LINQ is often significantly slower than comparable, well written traditional code.






                          share|improve this answer


























                            up vote
                            1
                            down vote













                            I can only add to the already said, that your algorithm has the potential to run for ever - theoretically speaking.




                            The most effective approach is to select chars from a list of valid chars. If using the numbers [0, 9] and the English letters [a, z] and [A, Z] we end up with 62 entries which will cause a byte based random selection be biased because 256 % 62 != 0. The problem can be solved by either adding two more chars resulting in 64 char:



                             const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+";


                            or reduce the valid chars to 32:



                            const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUV";


                            There are pros and cons to both obviously, but both can be used i an efficient algorithm like:



                            public string Generate(int length)

                            const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+";

                            using (var rng = new RNGCryptoServiceProvider())

                            char result = new char[length];
                            byte buffer = new byte[length];
                            int range = chars.Length;
                            rng.GetBytes(buffer);

                            for (int i = 0; i < length; i++)

                            result[i] = chars[buffer[i] % range];


                            return new string(result);




                            This should be unbiased.



                            I deliberately avoid the use of LINQ, because it is nearly always slower than more "native" code. This not to say, I in general avoid using LINQ, because I use it very often and gladly, but it is my experience that when it comes to performance, LINQ is often significantly slower than comparable, well written traditional code.






                            share|improve this answer
























                              up vote
                              1
                              down vote










                              up vote
                              1
                              down vote









                              I can only add to the already said, that your algorithm has the potential to run for ever - theoretically speaking.




                              The most effective approach is to select chars from a list of valid chars. If using the numbers [0, 9] and the English letters [a, z] and [A, Z] we end up with 62 entries which will cause a byte based random selection be biased because 256 % 62 != 0. The problem can be solved by either adding two more chars resulting in 64 char:



                               const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+";


                              or reduce the valid chars to 32:



                              const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUV";


                              There are pros and cons to both obviously, but both can be used i an efficient algorithm like:



                              public string Generate(int length)

                              const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+";

                              using (var rng = new RNGCryptoServiceProvider())

                              char result = new char[length];
                              byte buffer = new byte[length];
                              int range = chars.Length;
                              rng.GetBytes(buffer);

                              for (int i = 0; i < length; i++)

                              result[i] = chars[buffer[i] % range];


                              return new string(result);




                              This should be unbiased.



                              I deliberately avoid the use of LINQ, because it is nearly always slower than more "native" code. This not to say, I in general avoid using LINQ, because I use it very often and gladly, but it is my experience that when it comes to performance, LINQ is often significantly slower than comparable, well written traditional code.






                              share|improve this answer














                              I can only add to the already said, that your algorithm has the potential to run for ever - theoretically speaking.




                              The most effective approach is to select chars from a list of valid chars. If using the numbers [0, 9] and the English letters [a, z] and [A, Z] we end up with 62 entries which will cause a byte based random selection be biased because 256 % 62 != 0. The problem can be solved by either adding two more chars resulting in 64 char:



                               const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+";


                              or reduce the valid chars to 32:



                              const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUV";


                              There are pros and cons to both obviously, but both can be used i an efficient algorithm like:



                              public string Generate(int length)

                              const string chars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+";

                              using (var rng = new RNGCryptoServiceProvider())

                              char result = new char[length];
                              byte buffer = new byte[length];
                              int range = chars.Length;
                              rng.GetBytes(buffer);

                              for (int i = 0; i < length; i++)

                              result[i] = chars[buffer[i] % range];


                              return new string(result);




                              This should be unbiased.



                              I deliberately avoid the use of LINQ, because it is nearly always slower than more "native" code. This not to say, I in general avoid using LINQ, because I use it very often and gladly, but it is my experience that when it comes to performance, LINQ is often significantly slower than comparable, well written traditional code.







                              share|improve this answer














                              share|improve this answer



                              share|improve this answer








                              edited 2 days ago

























                              answered 2 days ago









                              Henrik Hansen

                              5,5781622




                              5,5781622




















                                  up vote
                                  1
                                  down vote













                                  Testing for character ranges presumes ordering and continuity of those code points. Assuming the Latin alphabet, the algorithm of course works as expected using ASCII and Unicode, but breaks down if another alphabet should be used or in the highly unlikely case of more obscure character sets such as EBCDIC.



                                  An improved solution would be defining an array of allowable code points (as in "ABC...YZabc...xyz012..89", and then using the random number to index into this array. This resolves all above issues, and the function is trivially extendable to support additional characters as desired.



                                  As regards the goto statements, I don't see any real fault here. It is as easy to read and logically identical to a similarly constructed switch() statement with range cases.






                                  share|improve this answer








                                  New contributor




                                  stephan mantler is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                  Check out our Code of Conduct.





















                                    up vote
                                    1
                                    down vote













                                    Testing for character ranges presumes ordering and continuity of those code points. Assuming the Latin alphabet, the algorithm of course works as expected using ASCII and Unicode, but breaks down if another alphabet should be used or in the highly unlikely case of more obscure character sets such as EBCDIC.



                                    An improved solution would be defining an array of allowable code points (as in "ABC...YZabc...xyz012..89", and then using the random number to index into this array. This resolves all above issues, and the function is trivially extendable to support additional characters as desired.



                                    As regards the goto statements, I don't see any real fault here. It is as easy to read and logically identical to a similarly constructed switch() statement with range cases.






                                    share|improve this answer








                                    New contributor




                                    stephan mantler is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                    Check out our Code of Conduct.



















                                      up vote
                                      1
                                      down vote










                                      up vote
                                      1
                                      down vote









                                      Testing for character ranges presumes ordering and continuity of those code points. Assuming the Latin alphabet, the algorithm of course works as expected using ASCII and Unicode, but breaks down if another alphabet should be used or in the highly unlikely case of more obscure character sets such as EBCDIC.



                                      An improved solution would be defining an array of allowable code points (as in "ABC...YZabc...xyz012..89", and then using the random number to index into this array. This resolves all above issues, and the function is trivially extendable to support additional characters as desired.



                                      As regards the goto statements, I don't see any real fault here. It is as easy to read and logically identical to a similarly constructed switch() statement with range cases.






                                      share|improve this answer








                                      New contributor




                                      stephan mantler is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                      Check out our Code of Conduct.









                                      Testing for character ranges presumes ordering and continuity of those code points. Assuming the Latin alphabet, the algorithm of course works as expected using ASCII and Unicode, but breaks down if another alphabet should be used or in the highly unlikely case of more obscure character sets such as EBCDIC.



                                      An improved solution would be defining an array of allowable code points (as in "ABC...YZabc...xyz012..89", and then using the random number to index into this array. This resolves all above issues, and the function is trivially extendable to support additional characters as desired.



                                      As regards the goto statements, I don't see any real fault here. It is as easy to read and logically identical to a similarly constructed switch() statement with range cases.







                                      share|improve this answer








                                      New contributor




                                      stephan mantler is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                      Check out our Code of Conduct.









                                      share|improve this answer



                                      share|improve this answer






                                      New contributor




                                      stephan mantler is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                      Check out our Code of Conduct.









                                      answered 16 hours ago









                                      stephan mantler

                                      112




                                      112




                                      New contributor




                                      stephan mantler is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                      Check out our Code of Conduct.





                                      New contributor





                                      stephan mantler is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                      Check out our Code of Conduct.






                                      stephan mantler is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                      Check out our Code of Conduct.




















                                          up vote
                                          -3
                                          down vote













                                          I would use a HashSet:



                                          HashSet<byte> validChars = new HashSet<byte> 48, 49, 50 ; //add all valid

                                          ........

                                          rng.GetBytes(buffer);
                                          if(validChars.Contains(buffer[0]))

                                          chars.Add(char(buffer[0]));
                                          ++counter;






                                          share|improve this answer


















                                          • 3




                                            Mind elaborating on why you'd use a HashSet?
                                            – Shelby115
                                            2 days ago










                                          • @Shelby115 Uh, O(1) speed?
                                            – paparazzo
                                            2 days ago






                                          • 2




                                            @paparazzo: but this aspect of the original code is also O(1). Using a HashSet here is slightly slower due to extra overhead.
                                            – Pieter Witvoet
                                            yesterday











                                          • @PieterWitvoet Not buying. Good day.
                                            – paparazzo
                                            yesterday






                                          • 2




                                            Feel free to measure for yourself if you don't believe me.
                                            – Pieter Witvoet
                                            yesterday














                                          up vote
                                          -3
                                          down vote













                                          I would use a HashSet:



                                          HashSet<byte> validChars = new HashSet<byte> 48, 49, 50 ; //add all valid

                                          ........

                                          rng.GetBytes(buffer);
                                          if(validChars.Contains(buffer[0]))

                                          chars.Add(char(buffer[0]));
                                          ++counter;






                                          share|improve this answer


















                                          • 3




                                            Mind elaborating on why you'd use a HashSet?
                                            – Shelby115
                                            2 days ago










                                          • @Shelby115 Uh, O(1) speed?
                                            – paparazzo
                                            2 days ago






                                          • 2




                                            @paparazzo: but this aspect of the original code is also O(1). Using a HashSet here is slightly slower due to extra overhead.
                                            – Pieter Witvoet
                                            yesterday











                                          • @PieterWitvoet Not buying. Good day.
                                            – paparazzo
                                            yesterday






                                          • 2




                                            Feel free to measure for yourself if you don't believe me.
                                            – Pieter Witvoet
                                            yesterday












                                          up vote
                                          -3
                                          down vote










                                          up vote
                                          -3
                                          down vote









                                          I would use a HashSet:



                                          HashSet<byte> validChars = new HashSet<byte> 48, 49, 50 ; //add all valid

                                          ........

                                          rng.GetBytes(buffer);
                                          if(validChars.Contains(buffer[0]))

                                          chars.Add(char(buffer[0]));
                                          ++counter;






                                          share|improve this answer














                                          I would use a HashSet:



                                          HashSet<byte> validChars = new HashSet<byte> 48, 49, 50 ; //add all valid

                                          ........

                                          rng.GetBytes(buffer);
                                          if(validChars.Contains(buffer[0]))

                                          chars.Add(char(buffer[0]));
                                          ++counter;







                                          share|improve this answer














                                          share|improve this answer



                                          share|improve this answer








                                          edited yesterday

























                                          answered 2 days ago









                                          paparazzo

                                          4,9911733




                                          4,9911733







                                          • 3




                                            Mind elaborating on why you'd use a HashSet?
                                            – Shelby115
                                            2 days ago










                                          • @Shelby115 Uh, O(1) speed?
                                            – paparazzo
                                            2 days ago






                                          • 2




                                            @paparazzo: but this aspect of the original code is also O(1). Using a HashSet here is slightly slower due to extra overhead.
                                            – Pieter Witvoet
                                            yesterday











                                          • @PieterWitvoet Not buying. Good day.
                                            – paparazzo
                                            yesterday






                                          • 2




                                            Feel free to measure for yourself if you don't believe me.
                                            – Pieter Witvoet
                                            yesterday












                                          • 3




                                            Mind elaborating on why you'd use a HashSet?
                                            – Shelby115
                                            2 days ago










                                          • @Shelby115 Uh, O(1) speed?
                                            – paparazzo
                                            2 days ago






                                          • 2




                                            @paparazzo: but this aspect of the original code is also O(1). Using a HashSet here is slightly slower due to extra overhead.
                                            – Pieter Witvoet
                                            yesterday











                                          • @PieterWitvoet Not buying. Good day.
                                            – paparazzo
                                            yesterday






                                          • 2




                                            Feel free to measure for yourself if you don't believe me.
                                            – Pieter Witvoet
                                            yesterday







                                          3




                                          3




                                          Mind elaborating on why you'd use a HashSet?
                                          – Shelby115
                                          2 days ago




                                          Mind elaborating on why you'd use a HashSet?
                                          – Shelby115
                                          2 days ago












                                          @Shelby115 Uh, O(1) speed?
                                          – paparazzo
                                          2 days ago




                                          @Shelby115 Uh, O(1) speed?
                                          – paparazzo
                                          2 days ago




                                          2




                                          2




                                          @paparazzo: but this aspect of the original code is also O(1). Using a HashSet here is slightly slower due to extra overhead.
                                          – Pieter Witvoet
                                          yesterday





                                          @paparazzo: but this aspect of the original code is also O(1). Using a HashSet here is slightly slower due to extra overhead.
                                          – Pieter Witvoet
                                          yesterday













                                          @PieterWitvoet Not buying. Good day.
                                          – paparazzo
                                          yesterday




                                          @PieterWitvoet Not buying. Good day.
                                          – paparazzo
                                          yesterday




                                          2




                                          2




                                          Feel free to measure for yourself if you don't believe me.
                                          – Pieter Witvoet
                                          yesterday




                                          Feel free to measure for yourself if you don't believe me.
                                          – Pieter Witvoet
                                          yesterday

















                                           

                                          draft saved


                                          draft discarded















































                                           


                                          draft saved


                                          draft discarded














                                          StackExchange.ready(
                                          function ()
                                          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f207171%2frandom-alphanumeric-password-generator-with-gotos%23new-answer', 'question_page');

                                          );

                                          Post as a guest














































































                                          Popular posts from this blog

                                          Top Tejano songwriter Luis Silva dead of heart attack at 64

                                          政党

                                          天津地下鉄3号線