|
RSA: The Complete Picture
INTRODUCTION:
This paper is the first one in the series of papers that lead to the design of RSA Crypto Chip using VHDL and as such can be used to understand RSA Algorithm. The first part lays down the foundation of RSA cryptosystem , all of it’s mathematical intricacies have been investigated and elucidated in the easiest possible manner. I have provided "Mathematical insights" to the underlying principles of RSA so as to make the whole thing more understandable and a pleasant learning experience. BACKGROUND:
RSA is a public-key cryptosystem developed by MIT professors: Ronald L. Rivest, Adi Shamir, and Leonard M. Adleman in 1977 in an effort to help ensure internet security.This system is based on several mathematical principles in number theory. The basic technique was first discovered in 1973 by Clifford Cocks. A cryptosystem is an algorithm that can convert input data into something unrecognizable (encryption), and convert the unrecognizable data back to its original form (decryption). In a RSA cryptosystem, the sender encrypts a message with a private key. The sender's public key is posted (usually in a table). The recipient looks up the senders public key and uses it and her/his own private key to unlock the message. Private and public keys are associated by a function. In the RSA cryptosystem, the private and public keys are linked by the factorization of prime numbers The challenge of public-key cryptography is developing a system in which it is impossible to determine the private key. This is accomplished through the use of a one-way function. With a one-way function, it is relatively easy to compute a result given some input values. However, it is extremely difficult, nearly impossible, to determine the original values if you start with the result. In mathematical terms, given x, computing f(x) is easy, but given f(x), computing x is nearly impossible. The one-way function used in RSA is multiplication of prime numbers. It is easy to multiply two big prime numbers, but for most very large primes, it is extremely time-consuming to factor them. Public-key cryptography uses this function by building a cryptosystem which uses two large primes to build the private key and the product of those primes to build the public key. KEY GENERATION ALGORITHM
1 Generate two large random prime numbers say, p and q, of approximately equal size such that their product n(n is known as the modulus) = pq is of the required bit length, e.g. 1024 bits.
1.1 To generate the primes p and q, generate a random number of bit length b/2 where b is the required bit lengthof n
1.2 Set the low bit (this ensures the number is odd) and set the two highest bits (this ensures that the high bit of n is also set)
1.3 Check if prime (use the Rabin-Miller test); if not, increment the number by two and check again. This is p
1.4 Repeat for q starting with an integer of length b-b/2.
1.5 If p1.6 In the extremely unlikely event that p = q, check your random number generator. For greater security, instead of incrementing by 2, generate another random number each time.
2 Compute modulus,n = pq and phi(n) = (p-1)(q-1).
In the eighteenth century, the mathematician Leonhard Euler (pronounced "Oiler") described j(n) as the number of numbers less than n that are relatively prime to n. The character is the Greek letter "phi" (in math circles it rhymes with "tea," in the academic organization Phi Beta Kappa it rhymes with "tie"). This is known as Euler's phi-function.
Two numbers are relatively prime if they share only one factor, namely 1. For example, 10 and 21 are relatively prime. Neither is prime, but the numbers that evenly divide 10 are 1, 2, 5 and 10, whereas the numbers that evenly divide 21 are 1, 3, 7 and 21. The only number in both lists is 1, so the numbers are relatively prime.
So j(6), for instance, is 2, since of all the numbers less than 6 (1, 2, 3, 4 and 5), only two of them (1 and 5) are relatively prime with 6. The numbers 2 and 4 share with 6 a common factor other than 1, namely 2. And 3 and 6 share 3 as a common factor.
What about j(7)? Because 7 is prime, its only factors are 1 and 7. Hence, any number less than 7 can share with 7 only 1 as a common factor. Without even examining those numbers less than 7, we know they are all relatively prime with 7. Since there are 6 numbers less than 7,j (7) = 6. Clearly this result will extend to all prime numbers. Hence if n is prime number and p and q are it's factors which themself are prime then , j(n) = (p - 1)(q-1) >
3 Choose an integer e know as the public exponent or encryption exponent, 1 < e < phi, such that gcd(e, j) = 1.
"gcd" stands for greatest common divisor and Euclid's algorithm finds the greatest common divisor of two positive integers. The greatest common divisor (gcd) is the largest number that evenly divides two numbers. For instance, the gcd of 12 and 16 is 4.
Euclid's algorithm is as following. Given two positive integers (u, v),
STEP 1: Does v divide u evenly? Compute u/v, if there is no remainder, the answer is v. If not, continue to Step 2.
STEP 2: Replace u with the remainder of Step 1. If the remainder of u/v (which can be denoted u mod v) is 1, the numbers are relatively prime and the answer is 1. If not, return to step 1 with the numbers (v, u mod v).
For example, find the gcd of (945, 217).
Step 1: 945/217 = 4 rem 77
Step 2: find gcd(217, 77)
Step 1: 217/77 = 2 rem 63
Step 2: find gcd(77, 63)
Step 1: 77/63 = 1 rem 14
Step 2: find gcd(63, 14)
Step 1: 63/14 = 4 rem 7
Step 2: find gcd(14, 7)
Step 1: 14/7 = 2, no remainder
Hence, the gcd of (945, 217) is 7.>
3.1 In practice, common choices for e are 3, 17 and 65537(2^16+1). These are Fermat primes and are chosen because they make the modular exponentiation operation faster.
3.2 Also, having chosen e, it is simpler to test whether gcd(e, p-1)=1 and gcd(e, q-1)=1 while generating and testing the primes in step 1. Values of p or q that fail this test can be rejected there and then
4 Compute the secret exponent or decryption exponent d, 1 < d < j, such that ed = 1 (mod j).
4.1 To compute the value for d, use the Extended Euclidean Algorithm to calculate d = e^-1 mod phi (this is known as modular inversion).
5 The public key is (n, e) and the private key is (n, d). The values of p, q, and phi should also be kept secret. APLICABILITY OF RSA:
The RSA algorithm can be used for both encryption and digital signatures, and each of these have been explained as below: ENCRYPTION
Let us assume that a sender A is sending an encrypted message to receiver B.
Sender A does the following:-
1 Obtains the recipient B's public key (n, e).
2 Represents the plaintext message as a positive integer m < n.
2.1 When representing the plaintext octets as the representative integer m, it is usual to add random padding characters to make the size of the integer m large and less susceptible to certain types of attack.
2.2 NoteIf m = 0 or 1 there is no security as the ciphertext has the same value.
3 Computes the ciphertext c = m^e mod n.
It turns out that prime numbers possess various useful properties when used in modular math. The RSA algorithm will take advantage of these properties. Modular math means that the only numbers under consideration are the non-negative integers less than the modulus. So for mod n, only the integers from 0 to (n - 1) are valid operands and results of operations will always be numbers from 0 to (n - 1). Think of military time where the modulus is 2400. For instance, 2200 plus 400 (10:00 PM plus 4 hours) is not 2600. Once you reach 2400, you start over at 0. Hence, 2200 + 400 mod 2400 is 2600 - 2400 = 0200, or 2:00 in the morning. Likewise, if we start at 0, or midnight, 6 times 500 (say six 5-hour shifts) is not 3000, but 0600, or 6:00 AM the following day. Hence (7 * 343) mod 2400 = 2401 - 2400 = 1 mod 2400
DECRYPTION
Recipient B does the following:-
1 Uses his private key (n, d) to compute m = c^d mod n(this calculation is called modular exponentiation ). Because of the relationship between d and e, the algorithm correctly recovers the original message m, since Cd mod n = (me)d = med = m1 = m mod n
2 Extracts the plaintext from the integer representative m. DIGITAL SIGNING
Sender A does the following:-
1. Creates a message digest of the information to be sent.
2. Represents this digest as an integer m between 0 and n-1
Decryption and signing are identical as far as the mathematics is concerned as both use the private key. Similarly, encryption and verification both use the same mathematical operation with the public key. However, note these important differences:-
The signature is derived from a message digest of the original information. The recipient will need to follow exactly the same process to derive the message digest.
The recommended methods for deriving the representative integers are different for encryption and signing
3. Uses her private key (n, d) to compute the signature s = m^d mod n.
4. Sends this signature s to the recipient, B.
SINGNATURE VERIFICATION
Recipient B does the following:-
1. Uses sender A's public key (n, e) to compute integer v = s^e mod n.
2. Extracts the message digest from this integer.
3. Independently computes the message digest of the information that has been signed.
4. If both message digests are identical, the signature is valid. SUMMARY OF RSA · n = pq where p and q are distinct primes. · phi, ö = (p-1)(q-1) · e < n such that gcd(e, j)=1 · d = e^-1 mod j. · c = m^e mod n. · m = c^d mod n.
EXAMPLE : First two prime numbers are chosen.
prime1 7
prime2 11
n = prime1 * prime2 77
j 60
Public key calculation:
The "keys" used to encrypt and dencrypt the data are pairs of numbers, (e,n) and (d,n) where "e" stands for encryption and "d" for decryption.
'e' is a number chosen to be less than, and which shares no factors with, the j=60
It is often prime but does not have to be; clearly no even number will do, however. In this case the value 13 works
So the receiver has created a public key (13, 77) and publishes it to the world
Using this public key anyone encrypt messages using the process below and safely transmit them to the receiver. No one can decrypt the message unless they can calculate the private key, which requires them to know the factors, because they must use the value j. So the message is secure until the value of 'n' is factorised, when the messages can be decrypted. (In this case, of course factorising 77 is easy).
Private key calculation:
n =77, is known but to calculate 'd', the prime factors of 'n': 7 and 11 are needed to get the product because 'd' is an integer that obeys the equation: d*e MOD j = 1,
in this case: d * 13 MOD 60 = 1
This equation says that if d is multipled by e and divided by the value j, the remainder is 1.
In this case the smallest value of d which which does this is 37 since
37 * 13 = 481 and 481/60 = 8, remainder 1
Encrypting a message:
Now we have calculated n, j, e and d.
Messages sent to the receiver are encrypted by the transmitter according to the formula:
C = M^e MOD n where M represents a message unit (e.g a byte) and C the corresponding Cypher unit. i.e. each byte is raised to the power e, and the remainder found when the result is divided by n (For real messages some very large numbers must be calculated, but as shown below in decrypting, there is a way of keeping the values manageable by doing the calculation in stages).
For example the Message 12345 is broken into Message units:
Message Units 1 2 3 4 5
M^13 1 8192 1594323 67108864 1220703125
M^13 MOD 77 1 30 38 53 26
OutGoing 1 30 38 53 26
And these are then sent. Note that neither 0 nor 1 cannot be encrypted, since raising either to any positive integer power does not change them.
To decrypt the message the reverse has to be done. Trying this directly and it will not work; if 6^13 is too big, then 30^37 is much too big and gives a #NUM error!.
Incoming 1 30 38 53 26
=M^37 1 4.50 E+54 2.83 E+58 6.28 E+63 2.25 1E+52
M^37 MOD 77 1 #NUM! #NUM! #NUM! #NUM!
However, since it is a MODULUS process it can be tackled in stages, but each time finding the remainder after division by 77. The full table looks like this:
Incoming 1 30 38 53 26
=M^2 1 900 1444 2809 676
=M^2 MOD 77 1 53 58 37 60
=M^3 MOD 77 1 50 48 36 20
=M^4 MOD 77 1 37 53 60 58
=M^5 MOD 77 1 32 12 23 45
=M^6 MOD 77 1 36 71 64 15
=M^7 MOD 77 1 2 3 4 5
=M^8 MOD 77 1 60 37 58 53
=M^9 MOD 77 1 29 20 71 69
=M^10 MOD 77 1 23 67 67 23
=M^11 MOD 77 1 74 5 9 59
=M^12 MOD 77 1 64 36 15 71
=M^13 MOD 77 1 72 59 25 75
=M^14 MOD 77 1 4 9 16 25
=M^15 MOD 77 1 43 34 1 34
=M^16 MOD 77 1 58 60 53 37
=M^17 MOD 77 1 46 47 37 38
=M^18 MOD 77 1 71 15 36 64
=M^19 MOD 77 1 51 31 60 47
=M^20 MOD 77 1 67 23 23 67
=M^21 MOD 77 1 8 27 64 48
=M^22 MOD 77 1 9 25 4 16
=M^23 MOD 77 1 39 26 58 31
=M^24 MOD 77 1 15 64 71 36
=M^25 MOD 77 1 65 45 67 12
=M^26 MOD 77 1 25 16 9 4
=M^27 MOD 77 1 57 69 15 27
=M^28 MOD 77 1 16 4 25 9
=M^29 MOD 77 1 18 75 16 3
=M^30 MOD 77 1 1 1 1 1
=M^31 MOD 77 1 30 38 53 26
=M^32 MOD 77 1 53 58 37 60
=M^33 MOD 77 1 50 48 36 20
=M^34 MOD 77 1 37 53 60 58
=M^35 MOD 77 1 32 12 23 45
=M^36 MOD 77 1 36 71 64 15
=M^37 MOD 77 1 2 3 4 5
The last line gives back the original message, .
decrypted 1 2 3 4 5 EFFICIENCY OF RSA: 1 Key generation is only carried out occasionally and so computational efficiency is less of an issue.
2 On average, doubling the modulus length will increase the time required for public key operations (encryption and signature verification) by a factor of four. The time for private key operations (decryption and signing) by a factor of eight
3 One efficient method to carry out modular exponentiation(a = b^e mod n ) on a computer is the binary left-to-right method and an appropriate choice of e can reduce the computational effort required to carry out the computation of c = m^e mod n. Popular choices like 3, 17 and 65537 are all primes with only two bits set: 3 = 0011'B, 17 = 0x11, 65537 = 0x10001.
4 The bits in the decryption exponent d, however, will not be so convenient and so decryption using the standard method of modular exponentiation will take longer than encryption. SECURITY ISSUE:
Remember that it is possible to obtain the private key d from the public key (n, e), by factoring n into p and q. In order to find d, one must know the product (p - 1)(q - 1). But to find that value, one must know p and q. For example, in the earlier example, an eavesdropper would know that pq = 55, but what is (p - 1)(q - 1)?
Finding the prime factors of 55 is easy, they are 5 and 11. However, for very large numbers, factoring is very difficult. The RSA Laboratories publication, "Frequently Asked Questions About Today's Cryptography," (the "FAQ") describes the state of the art in factoring. In short, factoring numbers takes a certain number of steps, and the number of steps increases exponentially as the size of the number increases. Even on supercomputers, the time to execute all the steps is so great that for large numbers it could take years to compute. According to the "FAQ," the current threshold of general numbers which can be factored is about 129 digits. Within a short time, that limit will probably rise to 155 digits. Most applications will use a 231- to 308- or even 616-digit RSA modulus. THE RSA CHALLENGE:
RSA Laboratories sponsors the RSA Factoring Challenge to encourage research into computational number theory and the practical difficulty of factoring large integers. The contest is based on two "challenge lists" of numbers. The first list, the "RSA List," contains numbers representative of those used in the RSA cryptosystem. The second list, the "Partition List," contains what mathematicians define as "partition numbers." The shortest number in each list is 100 decimal digits long (well within the current state of the art). The RSA List contains numbers up to 500 digits in length, while the Partition List contains arbitrarily large numbers.
The challenge is to factor the numbers on these lists The RSA List contains numbers of the kind we believe to be the hardest to factor; the numbers on this list should be particularly challenging. These are the kind of numbers used in devising secure RSA cryptosystems. The Partition List contains numbers that are more-or-less "random"; some of these numbers (even very large ones) may be quite easy to factor, although others may be quite difficult. Although RSA Data Security views the RSA List as the primary and most interesting challenge list, the Partition List challenge is also provided in order to encourage work on factoring in general as well as on factoring numbers of cryptographic interest.
One number on the RSA list was factored with the help of 800 computers that crunched numbers for eight months. For more details, see www.rsa.com <http://www.rsa.com/>. REFERENCE:
1 “A Method for Obtaining Digital Signatures and Public-Key Cryptosystems". Communications of the ACM, 21 (2), pp. 120-126, February 1978. Author: I hold a B.Tech(Hons) degree in Electronics & Communication from National Institute Of Technology, Hamirpur India, and work as a research associate in High End Technologies Group, a consortium of freelancers working towards the common goal of providing technology solutions. If you have any suggestions, queries with regard to this paper then I can be contacted on abhishek_sood@r....
|
 |