DNA is intended to carry data and replicate it without error. Therefore, it would seem to be the perfect storage medium. Researchers from the New York Genome Center and the Center for Computational Biology and Bioinformatics at Columbia University wrote six different files into DNA. These were a computer operating system, the 1895 French film "Arrival of a Train at La Ciotat", an image of the Pioneer plaque, a 1948 study by information theorist Claude Shannon, a computer virus, and a $50 Amazon gift card.

They compressed the files into a master file, and then split the data into short strings of binary code made up of ones and zeros. Using an erasure-correcting algorithm called fountain codes, they randomly packaged the strings into so-called droplets, and mapped the ones and zeros in each droplet to the four nucleotide bases in DNA—adenine, cytosine, guanine and thymine. The algorithm deleted letter combinations known to create errors, and added a barcode to each droplet to help reassemble the files later.
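The mapping and screening steps can be illustrated with a short Python sketch. This is not the researchers' actual pipeline: the two-bits-per-base table, the limits on repeated letters and on GC content, and the function names are all assumptions chosen for illustration, and the fountain-code step that randomly packages strings into droplets is left out.

```python
# Illustrative only: the mapping table and the screening thresholds are
# assumptions, not values taken from the study.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bits_to_dna(bits: str) -> str:
    """Map each pair of bits to one nucleotide letter."""
    if len(bits) % 2:
        raise ValueError("bit string must have an even length")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bits(dna: str) -> str:
    """Reverse the mapping, as would be done after sequencing."""
    return "".join(BASE_TO_BITS[base] for base in dna)


def passes_screen(dna: str, max_run: int = 3,
                  gc_low: float = 0.45, gc_high: float = 0.55) -> bool:
    """Reject strands with long single-letter runs or unbalanced GC content."""
    if not dna:
        return False
    run = 1
    for prev, cur in zip(dna, dna[1:]):
        run = run + 1 if prev == cur else 1
        if run > max_run:
            return False
    gc_fraction = (dna.count("G") + dna.count("C")) / len(dna)
    return gc_low <= gc_fraction <= gc_high


print(bits_to_dna("00011011"))    # ACGT
print(passes_screen("ACGT"))      # True
print(passes_screen("AAAACGCG"))  # False: run of four A's
```

In this sketch a candidate strand that fails the screen would simply be thrown away and another generated in its place, which is the general idea behind deleting letter combinations known to create errors.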

They then synthesized those organic molecules into DNA strands and stored the DNA in a test tube. To extract the information, they sequenced that DNA. DNA sequencing is the process of determining the precise order of nucleotides within a DNA molecule.

DNA sequencing may be used to determine the arrangement of individual genes, clusters of genes, full chromosomes or entire genomes, of any organism. In fact, DNA sequencing has become a key technology in many areas of biology and other sciences, such as medicine, forensics or anthropology.

The result of the sequencing was a perfect copy of the original data. Since DNA is designed for storage, it turns out to be much better at it than anything we have invented.

DNA has some great advantages. It is much smaller than traditional media. Researchers found that they can reach a density of 215 petabytes per gram of DNA. As an example, one petabyte is roughly 16,000 times the data that your 64GB iPhone can store, and DNA can store 215 petabytes in just 0.035 ounces. Also, DNA lasts for a prolonged length of time, over 100,000 years, which is orders of magnitude longer than traditional media.
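The density comparison can be checked with a few lines of Python. The sketch assumes decimal units (one petabyte is 10^15 bytes, one gigabyte is 10^9 bytes); the variable names are illustrative.

```python
# Assumption: decimal (SI) units for gigabytes and petabytes.
GIGABYTE = 10**9
PETABYTE = 10**15
DENSITY_PB_PER_GRAM = 215  # density figure quoted in the article
PHONE_CAPACITY_GB = 64

phones_per_petabyte = PETABYTE / (PHONE_CAPACITY_GB * GIGABYTE)
phones_per_gram = DENSITY_PB_PER_GRAM * phones_per_petabyte

print(f"{phones_per_petabyte:,.0f} phones per petabyte")  # 15,625
print(f"{phones_per_gram:,.0f} phones per gram of DNA")   # 3,359,375
```

The first figure, 15,625, is where the "roughly 16,000" comes from. One gram, about 0.035 ounces, would hold the contents of more than three million such phones.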

Another advantage of DNA is that it will never become obsolete. If you have cassette tapes of music, vinyl records, CDs or DVDs, they cannot be played unless you have a compatible playback device. These devices are available now, but what about in future decades or even centuries? DNA has been around for billions of years, and humanity is unlikely to lose its ability to read these molecules.

Technology companies, like Facebook, routinely build sprawling data centers to store all the baby pictures, financial transactions, cat videos and email messages their users store.

But a new technique developed by University of Washington and Microsoft researchers could shrink the space needed to store digital data that today would fill a Walmart supercenter down to the size of a sugar cube.

All the data contained in our computer files, historic archives, movies, photo collections and the exploding volume of digital information collected by businesses worldwide is expected to hit 44 trillion gigabytes by 2020. This represents enough data to fill more than six stacks of computer tablets stretching to the moon. The world is producing data faster than the capacity to store it.
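To put that volume in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes decimal units and applies the 215 petabytes-per-gram density quoted earlier in this article with no allowance for redundant copies, indexing overhead or packaging, so the result is a theoretical lower bound rather than a practical estimate.

```python
# Assumptions: decimal units, and the 215 PB/gram density quoted earlier
# applied with no overhead for extra copies or error correction.
WORLD_DATA_GB = 44 * 10**12  # 44 trillion gigabytes
BYTES_PER_GB = 10**9
BYTES_PER_PB = 10**15
DENSITY_PB_PER_GRAM = 215

world_data_pb = WORLD_DATA_GB * BYTES_PER_GB / BYTES_PER_PB
grams_of_dna = world_data_pb / DENSITY_PB_PER_GRAM
kilograms_of_dna = grams_of_dna / 1000

print(f"{world_data_pb:,.0f} petabytes")     # 44,000,000 petabytes
print(f"{kilograms_of_dna:,.0f} kg of DNA")  # 205 kg of DNA
```

Under those assumptions, all of the world's projected data would weigh about as much as two or three adults.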

Now in its centenary year, Technicolor runs laboratories at the cutting edge of the science of filmmaking, leading a worldwide revolution in immersive entertainment. The company's latest innovation is the encoding of movies into artificial, "non-biological" DNA. DNA is almost unimaginably small (up to 90,000 molecules can fit into the width of one human hair), so even a large movie library is invisible to the human eye. All you can see is the water in a small test tube.

DNA is a long, coiled molecular "ladder"—the famous double helix structure—comprising the four chemical rungs, adenine, cytosine, guanine and thymine, which team up in pairs. Technicolor digitized the 1902 movie, "A Trip to the Moon", into data in computing's binary code, and transcribed it into DNA code, which was then turned into molecules, using lab-dish chemicals. The contents are "read" by sequencing the DNA and turning it back into computer code. Converting movies into man-made DNA has huge advantages, as the archives of every Hollywood studio are currently taking up a huge amount of floor space. Now, their archives could fit into something the size of a domino.

So, how do you store a movie on DNA? The process of coding a film into DNA is complicated, but I will try to simplify the method. Each digital film image is made up of pixels. The technician allocates each pixel a code made of zeros and ones, based on its color. The code is then converted into the DNA chemical bases adenine, cytosine, guanine and thymine.

The audio is broken down into info-bytes, allotted a numerical code and then changed into DNA base pairs. Each DNA strand is labeled with a chemical index that communicates where the pixel or sound belongs in a movie. A computer program then organizes the millions of DNA strands representing all the pixels and sounds of the film so that the movie can be recreated.
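The role of the index can be shown with a small Python sketch. It is a simplification, not Technicolor's method: the data stays as zeros and ones rather than being converted to bases, and the payload size, index size and function names are assumptions made for the example.

```python
import random

PAYLOAD_BITS = 16  # assumed amount of data carried by each strand
INDEX_BITS = 8     # assumed index size, enough to label 256 strands


def split_with_index(bits: str) -> list[str]:
    """Cut the bit string into chunks and prefix each with its position."""
    chunks = [bits[i:i + PAYLOAD_BITS] for i in range(0, len(bits), PAYLOAD_BITS)]
    if len(chunks) > 2**INDEX_BITS:
        raise ValueError("too much data for the chosen index size")
    return [format(i, f"0{INDEX_BITS}b") + chunk for i, chunk in enumerate(chunks)]


def reassemble(strands: list[str]) -> str:
    """Sort strands by their index prefix, then strip it and join the payloads."""
    ordered = sorted(strands, key=lambda s: int(s[:INDEX_BITS], 2))
    return "".join(s[INDEX_BITS:] for s in ordered)


# Stand-in for pixel and sound data: the bytes of a short title.
message = "".join(format(byte, "08b") for byte in b"A Trip to the Moon")
strands = split_with_index(message)
random.shuffle(strands)  # strands in a test tube have no physical order
recovered = reassemble(strands)
print(recovered == message)  # True
```

Because every strand carries its own position, the order in which the strands are read back does not matter.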

Once the DNA strands are fabricated, many copies can be made. The DNA containing the movie will be smaller than a speck of dust. Then, the movie can be returned to its original form using a DNA sequencer, which works like a washing machine that spins and sorts the strands. The movie can then be read by a computer.

A practically unlimited number of copies of a movie could be created with this coding technique by multiplying the DNA sample through polymerase chain reaction (PCR), and those copies, and even copies of their copies, could be recovered with no errors.

Currently, data is kept on spools of magnetic tape in a data storage facility, which must be backed up every four years or so. There is no digital data storage technology that will last for hundreds of years. We're essentially repurposing DNA to store digital data (pictures, videos, documents) in a manageable way for hundreds or even thousands of years. Don't look for your favorite movie to be stored on DNA anytime soon. Commercial DNA data storage won't be available for at least a decade.

For more information:

  1. https://homes.cs.washington.edu/~luisceze/publications/dnastorage-asplos16.pdf
  2. http://www.afcinema.com/IMG/pdf/technicolor_storing_movies_in_dna.pdf
  3. http://ita.ucsd.edu/workshop/16/files/paper/paper_3187.pdf



Len Calderone – Contributing Editor

Len contributes to this publication on a regular basis. Past articles can be found with an Article Search and his profile on our Associates Page

He also writes short stories that always have a surprise ending. These can be found at http://www.smashwords.com/profile/view/Megalen

