# Exercise 15

**note** this exercise is particularly challenging. If you are a
beginniner programmer, enter at your own risk!

## XOR decryption

There is a site called Project Euler that lists a set of challening mathematical / computer programming problems.

This exercise is Project Euler Problem 59, "XOR decryption":

Each character on a computer is assigned a unique code and the preferred standard is ASCII (American Standard Code for Information Interchange). For example, uppercase A = 65, asterisk (*) = 42, and lowercase k = 107.

A modern encryption method is to take a text file, convert the bytes
to ASCII, then **XOR each byte with a given value**, taken from a secret
key. The advantage with the XOR function is that **using the same
encryption key on the cipher text, restores the plain text**; for
example, 65 XOR 42 = 107, then 107 XOR 42 = 65.

You can read about how bitwise XOR works on Wikipedia: bitwise XOR or here. Below I show you code snippets in Python, R and MATLAB for how to achieve bitwise XOR using integers.

For unbreakable encryption, the key is the same length as the plain text message, and the key is made up of random bytes. The user would keep the encrypted message and the encryption key in different locations, and without both "halves", it is impossible to decrypt the message.

Unfortunately, this method is impractical for most users, so the modified method is to use a password as a key. If the password is shorter than the message, which is likely, the key is repeated cyclically throughout the message. The balance for this method is using a sufficiently long password key for security, but short enough to be memorable.

Your task has been made easy, as **the encryption key consists of three
lower case characters**. Using p059\_cipher.txt (right click and 'Save
Link/Target As…'), a file containing the encrypted ASCII codes, and
the knowledge that **the plain text must contain common English words**,
decrypt the message and find the sum of the ASCII values in the
original text.

## Hints

You will need three components to your program. **First** you will need
some facility to generate candidate encryption keys. **Second** you will
need some facility to generate plaintext, given the ciphertext and a
candidate encryption key (by using the XOR operation). **Third** you
will need some facility to analyse candidate plaintext to determine
whether it's the real plaintext. We know in this case the plaintext is
in English and so you'll need some facility for deciding whether a
certain plaintext is in English or not.

## Candidate encryption keys

We know that the encryption key consists of three lower case characters, and so we know there are \((26)(26)(26) = 17576\) possible keys. My suggestion would be to try a brute force method, e.g. for every possible encryption key, generate plaintext and decide whether it's English. One of them will be the correct one, you just have to find it.

## XOR in Python, MATLAB and R

In Python the function that returns the numerical ascii code for a
letter is `ord()`

. So for example `ord('a')`

returns `97`

. The
function that goes the other way, that returns a character given a
code, is `chr()`

so for example `chr(97)`

returns `a`

. To apply the XOR operator to two numbers we can use the `^`

operator, so for example `ord('a') ^ ord('b')`

returns `3`

.

In MATLAB the `double()`

function will return the ascii code for a
letter, so `double('a')`

returns `97`

. Going the other way, the
`char()`

function will convert an ascii code to a character, so
`char(97)`

returns `a`

. In MATLAB the `bitxor()`

function will perform
the XOR operation we need here.

In R there are the `charToRaw()`

and `rawToChar()`

functions but they
require a bit of TLC to work as you wish. Here are two wrapper
functions that help. The `asc()`

function converts a character into
the corresponding ascii value, so `asc("a")`

returns `97`

. The `chr()`

function goes the other way, so `chr(97)`

returns `a`

. Don't worry
about the details here for now.

asc <- function(x) { strtoi(charToRaw(x),16L) } chr <- function(n) { rawToChar(as.raw(n)) }

In R the XOR function `bitXor()`

in the `bitops`

package is what we
need. So first load the `bitops`

library using: `library(bitops)`

and
then the `bitXor()`

function will be available to us.

## Is the plaintext English?

One possibility is to simply take the words in the candidate plaintext
and see if they are found in an english dictionary. On Unix machines
(Linux, Mac OS X) you have a dictionary file in:
`/usr/share/dict/words`

.

Another possibility is to search the candidate plaintext for common English words such as "the" or "and" and so on.

Another possibility is to examine the relative frequencies of various letters in the candidate plaintext, and compare them to known frequencies of letters in English. Wikipedia has a nice page on frequency analysis in cryptanalysis. This might help you write a function to determine whether a given candidate plaintext is English or not.

## Super Hints

Here is some sample code that illustrates how to generate plaintext given the ciphertext and a candidate encryption key.

The first 5 ciphertext values in p059\_cipher.txt are
`79,59,12,2,79`

. Let's assume that our candidate encryption key is
`abc`

. Here is how you would use the candidate encryption key to
generate the first 5 characters of plaintext from the first 5 values
in the ciphertext:

In Python:

plaintext = "" plaintext += chr(79 ^ ord('a')) plaintext += chr(59 ^ ord('b')) plaintext += chr(12 ^ ord('c')) plaintext += chr(2 ^ ord('a')) plaintext += chr(79 ^ ord('b')) print "plaintext = %s" % (plaintext)

In MATLAB:

plaintext = ''; plaintext = [plaintext char(bitxor(79, double('a')))]; plaintext = [plaintext char(bitxor(59, double('b')))]; plaintext = [plaintext char(bitxor(12, double('c')))]; plaintext = [plaintext char(bitxor(2, double('a')))]; plaintext = [plaintext char(bitxor(79, double('b')))]; disp(['plaintext = ',plaintext]);

In R:

asc <- function(x) { strtoi(charToRaw(x),16L) } chr <- function(n) { rawToChar(as.raw(n)) } library(bitops) plaintext = "" plaintext = paste(plaintext, chr(bitXor(79, asc('a'))), sep="") plaintext = paste(plaintext, chr(bitXor(59, asc('b'))), sep="") plaintext = paste(plaintext, chr(bitXor(12, asc('c'))), sep="") plaintext = paste(plaintext, chr(bitXor(2, asc('a'))), sep="") plaintext = paste(plaintext, chr(bitXor(79, asc('b'))), sep="") cat("plaintext = ", plaintext,"\n")

In C:

#include <stdio.h> int main(int argc, char *argv[]) { char plaintext[6]; plaintext[0] = 79 ^ 'a'; plaintext[1] = 59 ^ 'b'; plaintext[2] = 12 ^ 'c'; plaintext[3] = 2 ^ 'a'; plaintext[4] = 79 ^ 'b'; plaintext[5] = '\0'; // null termination for the string printf("plaintext = %s\n", plaintext); return 0; }