We have designed codon-optimized genes for Novozymes. To boost protein production in a host organism, scientists sometimes replace a gene with a synonymous DNA sequence: the protein sequence stays the same, but the DNA sequence encoding it is different. A simple codon optimization might work like this: an enzyme gene from one organism is moved into a strongly producing host, and the DNA of that gene is modified so that its three-letter codons match the codon usage frequencies of that host. We used machine learning to improve on this further, by looking at adjacent amino acids to predict the most native-like sequence and by including other features such as protein secondary structure and known expression levels of genes.
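
The simple frequency-matching approach described above can be sketched roughly as follows. This is only a minimal illustration, not the method used in the project: the table `HOST_CODON_USAGE` and the functions `optimize_most_frequent` and `optimize_sampled` are hypothetical names, the table covers just a few amino acids, and its frequencies are made-up placeholder values rather than measurements from any real organism.

```python
import random

# Hypothetical host codon usage: amino acid -> {codon: relative frequency}.
# The frequencies are invented for illustration and are NOT real data.
# Only a handful of amino acids are included to keep the sketch short.
HOST_CODON_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAA": 0.3, "AAG": 0.7},
    "E": {"GAA": 0.6, "GAG": 0.4},
    "L": {"CTG": 0.4, "CTC": 0.2, "CTT": 0.1,
          "CTA": 0.05, "TTA": 0.05, "TTG": 0.2},
    "*": {"TAA": 0.5, "TAG": 0.2, "TGA": 0.3},  # stop codons
}

# Reverse lookup (codon -> amino acid), used to confirm that the
# rewritten DNA still encodes the same protein.
CODON_TO_AA = {
    codon: aa for aa, codons in HOST_CODON_USAGE.items() for codon in codons
}


def optimize_most_frequent(protein, usage=HOST_CODON_USAGE):
    """Encode each amino acid with the host's single most frequent codon."""
    return "".join(max(usage[aa], key=usage[aa].get) for aa in protein)


def optimize_sampled(protein, usage=HOST_CODON_USAGE, seed=0):
    """Sample a codon per amino acid in proportion to host frequencies,
    so the overall codon mix of the gene resembles the host's."""
    rng = random.Random(seed)
    codons = []
    for aa in protein:
        options = list(usage[aa])
        weights = [usage[aa][codon] for codon in options]
        codons.append(rng.choices(options, weights=weights, k=1)[0])
    return "".join(codons)


def translate(dna):
    """Translate DNA back to protein using the toy table above."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    protein = "MKLE*"
    print(optimize_most_frequent(protein))  # -> ATGAAGCTGGAATAA
    print(translate(optimize_sampled(protein)) == protein)  # -> True
```

Either variant changes only the DNA, never the protein, which the `translate` round trip checks. Choosing the most frequent codon every time is the bluntest option; sampling in proportion to host usage keeps the codon mix closer to the host's overall distribution. Neither looks at neighbouring amino acids, protein structure, or expression data, which are the features the machine learning approach adds.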

Machine learning can improve codon optimizations