wordsTOcodons
This simple function takes a single argument of a text string containing any of the letters A to Z and spaces. It treats this as a protein sequence represented with single letter IUPAC amino acid codes. It converts this to the DNA sequence that would code for this protein. This function makes use of a couple of codons that are normally stop codons but in certain organisms have been coopted to code for unusual amino acids. This is how we are able to expand the alphabet to include O and U. We further expand the alphabet to include B, J, X, and Z by utilizing ambiguity abbreviations from IUPAC
U is the abbreviation for selenocysteine it is encoded by a TGA codon, which is normally a stop codon.
O is the abbreviation for pyrrolysine it is encoded by a TAG codon, also normally a stop codon.
B is the abbreviation for aspartic acid (GAT) or asparagine (AAT) so we represent this with RAT since R is the ambiguity code for A or G
J is the abbreviation for leucine (CTT) or isoleucine (ATT) so we represent this with MTT since M is the ambiguity code for C or A.
X is the abbreviation for any amino acid so we represent it with NNN
Z is the abbreviation for glutamic acid (GAA) or glutamine (CAA) so we represent this with SAA since S is the ambiguity code for C or G
wordTOcodons <- function(x){
table <- matrix(c('A', 'R', 'N', 'D', 'C', 'Q', 'E', 'G', 'H', 'I',
'L', 'K', 'M', 'F', 'P', 'S', 'T', 'W', 'Y', 'V', 'O', 'U',
'B', 'J', 'X', 'Z', ' ',
'GCT', 'CGC', 'AAT', 'GAT', 'TGT', 'CAG',
'GAA', 'GGT', 'CAT', 'ATT', 'TTA', 'AAG', 'ATG', 'TTC',
'CCA', 'TCA', 'ACT', 'TGG', 'TAC', 'GTT', 'TAG', 'TGA',
'RAT', 'MTT', 'NNN', 'SAA', ' '),27,2)
x <- toupper(x)
x <- strsplit(x, "")[[1]]
result <- list()
for(i in 1:length(x)){
result[[i]] <- table[table[,1] == x[i] , 2]
if(!x[i] %in% table[,1]){
result[[i]] <- '-'
}
}
return(paste(unlist(result), sep="", collapse=" "))
}
> wordTOcodons("Heath Blackmon")
RandomMonkey <- function(target){
target <- toupper(target)
i <- 0
ptm <- proc.time()
target <- strsplit(target, split="")[[1]]
mutations <- c(toupper(letters)," ", ".",",",";")
result <- sample(mutations, size=length(target), replace=T)
check <- F
while(check == F){
result <- sample(mutations, size=length(target), replace=T)
if(sum(result == target) == length(target)) check <- T
mx <- 40
if(length(result)<40) mx <- length(result)
cat(paste(paste(result[1:mx], collapse=""), "\n"))
i <- i + 1
}
x<-proc.time() - ptm
cat(paste(round(x[1], digits=4), "time elapsed. Perfect match required", i, "attempts"))
}
SelectedMonkey <- function(target){
target <- toupper(target)
i <- 0
ptm <- proc.time()
target <- strsplit(target, split="")[[1]]
mutations <- c(toupper(letters)," ", ".",",",";")
result <- sample(mutations, size=length(target), replace=T)
check <- F
while(check == F){
result[result!=target] <- sample(mutations, size = sum(result!=target), replace=T)
if(sum(result == target) == length(target)) check <- T
mx <- 40
if(length(result)<40) mx <- length(result)
cat(paste(paste(result[1:mx], collapse=""), "\n"))
i <- i + 1
}
x<-proc.time() - ptm
cat(paste(round(x[1], digits=4), "time elapsed. Perfect match required", i, "attempts"))
}
First we try "evolving" my sons name (Leslie) by just randomly picking letters the correct number of letters. If we fail in any way we throw it out and try again.
RandomMonkey("leslie")
SelectedMonkey("leslie")

To forestall comments I should say that I am aware that Dawkins was not the first to apply this infinite monkey theorem to thinking about evolution. Likely one of the Huxley's did much earlier. However, it was Richard's discussion that made me want to write my own little snippet of code.
Add a comment