Black-Box Assessment of Pseudorandom Algorithms
By Derek Soeder
Thanks to everyone who attended our Black Hat presentation! For those who couldn't make it, the single-sentence version is that we presented a methodology based on our Prangster tool for recovering the internal state of many pseudorandom number generators (PRNGs) given only an output string of pseudorandom characters, and from there, predicting future and past output for the sake of exploitation.
Originally we wanted to talk about the math and algorithms that made our attacks so much faster than naive brute-force, but we realized the presentation would be more useful to practitioners—assessors, auditors, penetration testers—and most vulnerability researchers if we focused on how to perform practical attacks. If you have an appetite for the algorithmic, be sure to read our white paper.
In this post, we'll walk through the four-step methodology in greater detail and point out some possibilities and considerations that there just wasn't time to cover on stage. The presentation established some important background on how to identify a PRNG and especially what we mean by "alphabet," so we certainly recommend checking out the slides if you missed the talk or could use a refresher.
1. Identify pseudorandom application output
Given some application (really, any code making use of a PRNG) that you want to attack, the first step is to identify pseudorandom output being generated by the application and used in a security-sensitive context. Interact with the application and identify random-looking portions of its output, which we'll call samples. If a sample appears to be encoded (such as in hexadecimal or base64), try decoding it and see if the decoded bytes still appear random.
The sample should be considered in context. If you know or suspect its intended meaning to the application, does a need for unpredictability or collision resistance make sense there? Some common situations where security depends on the unpredictability of the sample include session IDs, temporary tokens, generated passwords, authentication challenge nonces, and CAPTCHA answers. Generated file names might also need to be unpredictable to prevent deliberate access or accidental name collisions.
Often a visual inspection is enough to find data that looks like it could be random, but to rule out a candidate sample as a red herring takes some effort. In the rest of this section, we'll discuss some misleading forms of data which Prangster isn't able to handle, and we'll provide what tips we can for identifying them.
If the sample is four bytes in size and (as of this writing) appears to constitute a number in the neighborhood of 0x51000000, or if it's eight bytes in size and in the vicinity of 0x01CE000000000000, then it might be a timestamp. Timestamps can appear random in isolation but should be easy enough to identify during the second step. Checksums, hashes, and ciphertexts, on the other hand, also appear random yet are far harder to distinguish from pseudorandom samples. In these cases, consider if hashing or encryption makes sense in context, and try comparing the samples to hashes or checksums of other nearby data, computed using common algorithms such as CRC-32, MD5, and SHA-1. A twenty-byte sample with no apparent byte value restrictions is fairly likely to be a SHA-1 hash.
Finally, try searching the Internet for various representations of the bytes (e.g., "AA BB", "AABB", "BB AA", and "BBAA", with and without "0x" prefixes) to determine if they might actually constitute a well-known "magic" constant. In the case of a sixteen-byte sample with no obvious patterns or restrictions in its byte values, it might be a Universally Unique Identifier, or UUID, also known as a Globally Unique Identifier (GUID) or in some Windows contexts a Class Identifier (CLSID) or Interface Identifier (IID). As described in RFC 4122, UUIDs have some internal structure which helps identify them. If the top four bits of the eighth byte are 0001, then the sample is most likely a UUID if the last six bytes make sense as a MAC address (search the IEEE OUI public listing for the first three bytes of the presumptive MAC address) and if the first 60 bits make sense as a timestamp (the eighth and seventh bytes should fall in the range 11CE through 11E3 for a timestamp in the years 1994 through 2013). If the sample appears largely random but the top four bits of the eighth byte are 0100, then it might still be a UUID; this should become clear during the second step if that four-bit field remains static. In either case, also search the Windows registry and the Windows SDK's header and IDL files for hexadecimal representations of portions of the sample.
We'll consider the value of gathering multiple samples in the next section, but the approach does have applicability in spotting pseudorandomness or the lack thereof. For example, the determination that a sample only monotonically increases over time suggests that it might be a counter or timestamp, while samples that only change when some accompanying bytes change might be a checksum or hash. If a sample's length varies, but only in multiples of 8 or 16 bytes, then it could be ciphertext encrypted using a block cipher; the repetition of some random-looking blocks (suggestive of ECB mode encryption) would corroborate this possibility.
If you still suspect you've identified a sample of pseudorandom output and not one of these red herrings, then it's time for analysis.
2. Analyze multiple samples
The second step involves gathering as many samples as possible and analyzing them collectively. Typically this means issuing multiple requests to the application and extracting samples from the same place in each response. If the samples don't vary, you might need to modify the request (including any associated state or metadata) to provoke a different response, although it could be the case that the sample is somehow based on the request and isn't actually pseudorandom. If the samples vary predictably, like a counter or timestamp, or if certain portions remain fixed while others change, then you might need to refine which portion of the output you're collecting as a sample or give up on this portion altogether.
At the very least, you need enough samples to know or guess the full set of values from which the application can choose the symbols that constitute each sample. If you're not at liberty to gather many samples, you might have to guess whether or not certain symbol values can occur; otherwise, more samples are always better. Count the number of times each symbol value occurs in the sample set. It takes a lot (dozens to potentially millions) of samples to infer a statistically significant bias, but to the extent possible, attempt to identify which symbol values occur more or less frequently than the others. As described in our white paper, subtle biases can provide important hints about which PRNG is being used and how it is being used, the size of the alphabet, and the ordering of symbol values within the alphabet. Large biases, such as a symbol value being favored two-to-one over other values, can indicate that a symbol value appears in the alphabet multiple times.
Once you have the full set of symbol values, the tricky part is arranging them into the correct alphabet. Often digits and letters appear sequentially, but they could reasonably be expected to appear in keyboard order (QWERTYUIOP or 1234567890). Punctuation could be ordered according to ASCII code, keyboard position, or arbitrarily. And while it's common for classes of symbol values to be grouped together, even then it isn't clear how those groups might be ordered relative to one another.
In any case, the point of this step is just to gather information for making educated guesses about the alphabet and the PRNG; in the next step, we'll put these guesses to the test.
3. Seed recovery
Now our goal is to determine the seed that produced a given sample of pseudorandom output, and in doing so, conclude with certainty the use of an insecure PRNG and prepare to attack the application that uses it. This is the primary function of the Prangster tool; all it needs is the output and the right PRNG and alphabet.
In our presentation, we emphasized the difference between a PRNG's state and the seed used to initialize it, but for now, we'll only consider PRNGs where the state and seed are effectively interchangeable. Most of the PRNGs we've examined satisfy this condition—specifically, all but the array-based PRNGs of the Microsoft .NET Framework, the GNU C Library (glibc), and PureBasic. Since every function of Prangster requires the PRNG to be explicitly specified, this seems like an important point to call out before we get started.
So with that in mind, it's time to run Prangster. For reference, Prangster's usage text is as follows:
Usage: Prangster.exe r <prng> <alphabet> Prangster.exe g <prng> <alphabet> <seed> <length> Prangster.exe s <prng> <seed> <offset> Where: r recovers seeds that generate the output read from stdin (use "echo xxx|" or "<xxx.txt", or type and press Ctrl+Z to end) g if <length> is positive, reproduces the next <length> output symbol(s) generated after initializing to state <seed>; if <length> is negative, reproduces in forward order the previous -<length> output symbol(s) generated before reaching state <seed> s seeks from <seed> by <offset> state(s) and displays the seed representing the new state; a negative <offset> seeks backward <prng> is one of the following PRNGs: PrngBsdLibc PrngBsdLibcOld PrngDotNet PrngDotNetMod PrngGlibc3 PrngJava PrngMssql PrngMsvcrt PrngMsvcrtMul PrngMysql PrngPureBasic PrngV8 PrngVbscript <alphabet> is a string that maps the symbols comprising the application's pseudorandom output to numbers, based on a given symbol's position in the string or the symbol at a given position
To begin, run echo sample | Prangster.exe R PRNG alphabet, where sample is the sample of pseudorandom output, PRNG is your best guess at the non-array-based PRNG being used (from the choices listed by Prangster), and alphabet is a guess at the alphabet. Depending on the PRNG, the alphabet, and the length of the sample, this could take from less than a second to many minutes to complete. Prangster will print any seed values it recovers to stdout, in decimal and one per line, and it will also print dots to stderr to indicate progress for some PRNGs.
If Prangster doesn't recover any seeds, most likely the choice of PRNG or alphabet was incorrect. Try other PRNGs, based on information about the application or the platform hosting it (for example, from OS fingerprinting, file extensions, "Server" and "Powered-By" banners, etc.), or just guess. Looking for biases and evidence of recurrence relations might also be of use here. Separately, try rearranging the alphabet to address the uncertainties mentioned in the previous section. With any luck, some combination will yield exactly one seed.
Of course, there is the possibility that Prangster will find too many seeds. This usually means that the sample is too short to be unique—many seeds, probably on many PRNGs, will produce that same sample. A decent rule of thumb for a minimum sample size is logN of the number of seed values possible for the PRNG, where N is the size of the alphabet. If you're seeing more than a couple seeds for a given sample, try another sample (especially a longer sample if their lengths vary), or try concatenating two samples, perhaps with a guessed symbol in between. Internally Prangster has the concept of "wildcard" symbols, which would be a big help when trying to join two samples with some amount of hidden pseudorandomness consumption in between, but this functionality isn't currently exposed by the tool. If you find yourself with a surfeit of seeds and you're feeling brave, consider diving into the Prangster source code and hacking in some ad hoc wildcard support.
For the sake of this post, let's assume you hit on a PRNG-plus-alphabet combination that yields exactly one seed for some sample. This is a very significant finding: you've just determined that the application uses an insecure PRNG, and it's probably attackable. If your mission is just to point out a potential vulnerability, then that finding might be good enough. But if only exploitation will do, let's take it one step further.
Earlier we mentioned the concept of "security through unpredictability" to explain how an application's use of a random number generator might arise in a security-sensitive context. With knowledge of the PRNG's full state, we can violate this security assumption by predicting the unpredictable, our primary goal at this step. Recovering the entropy used to derive the seed could also reveal interesting information and enable further exploitation, so we'll subsequently explore that possibility. But first, let's examine some of the practical considerations around prediction.
After step 3, we have some crucial information about how the application obtains and processes pseudorandom numbers, but we could always use more. First, it's important to know what the application does with the PRNG when it's not generating the output we get to see. Try collecting two samples as close together in time as possible, and determine if there's a gap between them. If the sample size is uniform and the gap is some multiple of the sample size, then it's possible that other samples were generated between yours. If the gap isn't exactly a multiple, then the application must be consuming pseudorandomness for some purpose other than emitting pseudorandom symbols. (If sample lengths vary, then the pseudorandom number drawn before generating a sample's symbols is likely used to decide a length.) Either way, this gap is important for being able to extract the correct portion of the pseudorandom stream and thereby predict the samples received by others, so study it carefully. (As an aside, measuring gaps over time can also provide interesting details about the utilization of an application.)
While it is possible to seek a seed forward or backward using Prangster.exe S PRNG seed offset until it matches another seed, an easier way to detect and measure a gap is to generate a large amount of output after or before one sample and search for the other sample in Prangster's output. Use Prangster.exe G PRNG alphabet seed length and redirect stdout to a text file or some utility so that the position of the second sample can be determined relative to the beginning (when generating forward) or end (when generating backward) of Prangster's output.
If you're not able to discern a sensible gap between samples, this generally means that the samples effectively (but not technically) came from different pseudorandom streams—in other words, they were generated by separate instances of the PRNG or by a single instance that is being reseeded. To address the former case, try requesting many samples in rapid succession and check for a gap between every possible pair of samples. The idea is that some hopefully finite number of processes or threads is serving up samples, each with its own instance of the PRNG, and that eventually two of your requests will be served by the same worker. If you're interested in sampling every worker (useful in a practical attack), consider leaving connections open or crafting long-running requests to ensure that workers stay busy for as long as possible, although be aware that this could cause the application to create new workers if it starts feeling starved. Another approach, useful both in the case of new workers and in the case of a reseeded PRNG, is to recover the entropy and use that to predict future seedings. If a single PRNG instance is reseeded before generating each sample, then the seeds recovered for two samples should be close assuming that a common entropy source—a time, counter, or process or thread ID—is used. Even if some pseudorandomness is consumed between reseeding and sample generation, it should be easy to rewind two recovered PRNG seeds to a point where they appear proximate. Of course, it's possible that the application uses a more chaotic entropy source than we're anticipating here, in which case prediction would depend on a deep analysis of recovered seeds which is far beyond the scope of this post.
With any luck, by this point you've been able to arrive at an understanding of how the application uses its PRNG, to the extent that you can now predict the pseudorandom output that will be or has been generated for other users or other purposes. Obtain a seed, manually adjust it if necessary to compensate for reseeding (this could require a number of attempts to get right), seek the seed forward or backward if predicting into the distant future or past, and then have Prangster generate a length of output sufficient to reveal the pseudorandom application output of interest. How to make use of this output is application-specific, but generally it involves using prediction to achieve "psychic" read access to pseudorandom information you don't receive—for example, opportunistic prediction of some other user's session ID, or the determination of a password reset token emailed to an account you can't access. In an ideal attack scenario, everything should now be predictable and pseudorandomness no better than obfuscation.
As promised, we wanted to include a few thoughts on the value of recovering entropy from seeds. The most obvious application, which came up earlier in the context of accommodating PRNG reseeding, is determining what data is being used as entropy (for instance, some timer or counter) and then using knowledge of that data and predictions about how it changes over time to guess seeds. But typical sources of entropy can be interesting for other reasons. For example, knowing a system's clock time could reveal its clock skew and clock drift (interesting for fingerprinting a system), or perhaps its time zone (and thus some clue to its physical location). System uptime is another common entropy source, and is at least more unique than clock time. When they're included in the seed, it might also be possible to extract process IDs and thread IDs, although it's more difficult to suggest compelling uses for these. In general, most of the common entropy sources could likely be used to fingerprint a platform—for instance, on Windows, the system uptime is most often taken as a count of milliseconds that increases by 15 or 16 at a time, while the clock time might be expressed as a count of tenths of microseconds, and process and thread IDs are a multiple of four on all modern versions.
Of course, in attacking pseudorandom behavior, as in all exploitation, tools and methodologies can only get you so far. Beyond those tasks where computers excel, ultimately your imagination, resourcefulness, and persistence as a tester or a researcher are the most enabling factors. We leave you with this final thought:
java.util.Random prng = new java.util.Random(113301085335054L); String alphabet = "abcdefghijklmnopqrstuvwxyz !"; for (int i = 0; i < 10; i++) System.out.write(alphabet.charAt(prng.nextInt(alphabet.length())));
Sr. Security Researcher