August 29, 2012

Statistical Tests. Runs Test

According to Wikipedia, a numeric sequence is said to be statistically random when it contains no recognizable patterns or regularities; pseudorandomness is sufficient for many uses, such as statistics, hence the namestatistical randomness.

The main idea of a random sequence is that each number has equal chance of occurring.

It exists some tests for proving that a sequence of numbers are random. The first tests were published by M.G Kendall and Bernard Babington Smith, those tests was built on statisticals tools like Pearson's che-squared test.

The National Institute of Standards and Technology (NIST) have made some tests for random and pseudo-randomnumber generators that may be used for many purposes including cryptographic, modelingand simulation applications. There is a total of 16 tests; 14 of 16 of these tests are designed for generators that produce a sequence of bits, being therefore focused on cryptographic applications.

The main distributions used in these tests are the Standard Normal distribution and the Chi-square distribution. The first one is used to compare the test statistic obtained for the random number generator with the expected value of the statistic under the assumption of randomness. The Chi-square distribution is used to compare the goodness-of-fit of the observed frequencies of a sample measure to the corresponding expected frequencies of the hypothesis distribution

Well, for this week I choose the Runs Tests from these 14 tests, and here is some information about this test and also a program in python that takes the keys that I used to cypher the text in the One Time Pad programa of last week. The purpose of this test is to prove that Python's random it is actually random and no pseudorandom numbers.

What is the Runs Test?

Well, the main purpose of this test is to determine if the number of runs of ones and zeros are a random sequence. In this context a run is an uninterrupted sequence of identical bits. A run of length k consists of exactly k identical bits bounded before and after with a bit of the opposite value.

In general, the parameters of this test are:
  • n = the length of the input sequence in bits.
  • ε = the sequence of bits that is going to be tested. ε = ε1ε2… εn
  • The total number of sequences.
  • The total number of the first element (in cryptography is 1 or 0)
  • The total number of the second elelement (in cryptography is 1 or 0)
Having all this parameters now is turn to know which operations are going to be used:
  • Mean:

  • Variance:




  • Probability





Now, below is the code that asks for a message to the user and this message is converted into binary and after that the keys are generated



And finally here is the code of the Runs Test, that takes as input the file that has the keys to encrypt the message.


At the end of the program the probability is compared with 1.96 because this number is from a standard normal table; and at the 5 % significance level, a test statistic with an absolute value greater than 1.96 indicates non-randomness.



Here is the program running


Sources


1 comment:

  1. It would have been good to include multiple tests. 5 pts.

    ReplyDelete