Researchers from the University of Washington and Microsoft Research have developed one of the first complete systems to encode, store and retrieve digital data using DNA molecules, capable of storing information millions of times more compactly than current data storage technologies.
A paper presented this month at the Architectural Support for Programming Languages and Operating Systems (ASPLOS) conference in Atlanta describes how four images were successfully encoded into DNA snippets. More significantly, the researchers were able to reverse the process and perfectly retrieve the images without losing a single byte of information.
The images, which are encoded as a string of 0’s and 1’s, are converted into a string of A’s, C’s, T’s and G’s—the bases that pair to form DNA. A DNA molecule with that sequence is then chemically synthesized and dried out for storing with billions of other molecules.
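To make the conversion concrete, here is a minimal sketch of how a string of 0's and 1's could be turned into bases and back again. It assumes a naive mapping of two bits per base, chosen purely for illustration. The article does not describe the researchers' actual encoding, and the names `BITS_TO_BASE`, `encode` and `decode` are hypothetical.

```python
# Hypothetical mapping for illustration only: two bits per base.
# This is an assumed scheme, not the one described in the paper.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Convert bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse the mapping: read bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each byte becomes four bases under this mapping, and decoding returns exactly the bytes that went in. A practical scheme would also have to cope with errors introduced when DNA is synthesized and read, which this sketch ignores.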
“This works for any digital data, not just images,” says Luis Ceze, an associate professor of electrical engineering at the University of Washington and one of the authors of the study. “We used images because images and video tend to take lots of storage space.”
Digital data collected by devices, including videos, photos and text, is expected to hit 44 trillion gigabytes by 2020. At the current rate, the world is producing more data than it has the capacity to store. Ceze and his colleagues believe DNA could be the solution to this problem.
“It’s very dense and with the right storage conditions, DNA can be extremely long-lasting,” Georg Seelig, an associate professor at the University of Washington and co-author of the study, tells Newsweek. “However, reading and writing DNA is still very slow, so it’s good for applications where you want to keep information around for a long time but not access it often.”
Ceze adds: “DNA also never becomes obsolete, unlike that old dusty floppy disk at the bottom of your drawer.”
The team from the University of Washington is one of only two groups in the U.S. to have demonstrated the ability to perform “random access” to identify and retrieve data from DNA. Before this can be rolled out on a significant scale, however, the cost of synthesizing DNA (that is, writing it) for this purpose needs to be reduced, and the efficiency improved.
Both Ceze and Seelig agree that if the right incentives are in place, this could easily be achieved and sprawling data centers could become a thing of the past.