
Thu Jan 24, 2013, 06:44 AM

MP3 files written as DNA with storage density of 2.2 petabytes per gram

It's easy to get excited about the idea of encoding information in single molecules, which seems to be the ultimate end of the miniaturization that has been driving the electronics industry. But it's also easy to forget that we've been beaten there—by a few billion years. The chemical information present in biomolecules was critical to the origin of life and probably dates back to whatever interesting chemical reactions preceded it.

It's only within the past few decades, however, that humans have learned to speak DNA. Even then, it took a while to develop the technology needed to synthesize and determine the sequence of large populations of molecules. But we're there now, and people have started experimenting with putting binary data in biological form. Now, a new study has confirmed the flexibility of the approach by encoding everything from an MP3 to the decoding algorithm into fragments of DNA. The cost analysis done by the authors suggests that the technology may soon be suitable for decade-scale storage, provided current trends continue.

Trinary encoding

Computer data is in binary, while each location in a DNA molecule can hold any one of four bases (A, T, C, and G). Rather than using all that extra information capacity, however, the authors used it to avoid a technical problem. Stretches of a single type of base (say, TTTTT) are often not sequenced properly by current techniques—in fact, this was the biggest source of errors in the previous DNA data storage effort. So for this new encoding, they used one of the bases to break up long runs of any of the other three.

(To explain how this works practically, let's say the A, T, and C encoded information, while G represents "more of the same." If you had a run of four A's, you could represent it as AAGA. But since the G doesn't encode for anything in particular, TTGT can be used to represent four T's. The only thing that matters is that there are no more than two identical bases in a row.)
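The parenthetical example above can be sketched in a few lines of Python. This is only an illustration of the article's simplified explanation, not the authors' actual encoder: the function names `break_runs` and `restore_runs` are made up here, and the rule "emit G when the last two emitted bases both match the current one" is an assumption chosen so that AAAA becomes AAGA, as in the text.

```python
from itertools import groupby

FILLER = "G"                   # stands for "more of the same" (per the article's example)
DATA_BASES = {"A", "T", "C"}   # the three bases that carry information


def break_runs(seq):
    """Replace every third identical base in a row with the filler base."""
    out = []
    for base in seq:
        if base not in DATA_BASES:
            raise ValueError(f"unexpected base: {base!r}")
        # If the last two emitted bases already equal this one, emit G instead.
        if len(out) >= 2 and out[-1] == base and out[-2] == base:
            out.append(FILLER)
        else:
            out.append(base)
    return "".join(out)


def restore_runs(encoded):
    """Undo break_runs: each filler base repeats the previously decoded base."""
    out = []
    for base in encoded:
        if base == FILLER:
            if not out:
                raise ValueError("filler base cannot start a sequence")
            out.append(out[-1])
        else:
            out.append(base)
    return "".join(out)


def longest_run(seq):
    """Length of the longest stretch of identical bases."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)


print(break_runs("AAAA"))                  # AAGA
print(break_runs("TTTT"))                  # TTGT
print(restore_runs(break_runs("AAAATTTTTC")))
```

Because G is never a data base, the output alternates between at most two copies of a base and a single G, so no run is longer than two.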

That leaves three bases to encode information, so the authors converted their information into trinary. In all, they encoded a large number of works: all 154 Shakespeare sonnets, a PDF of a scientific paper, a photograph of the lab some of them work in, and an MP3 of part of Martin Luther King's "I have a dream" speech. For good measure, they also threw in the algorithm they use for converting binary data into trinary.
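For a feel of what "converting to trinary" means, here is a minimal Python sketch. The article does not describe the authors' conversion algorithm, so this is an assumption-laden stand-in: each byte is written as a fixed six base-3 digits (3**6 = 729 is the smallest power of three that covers 256 values), and the digit-to-base mapping 0=A, 1=T, 2=C is arbitrary. The names `bytes_to_trits`, `trits_to_bytes`, and `trits_to_bases` are hypothetical.

```python
TRITS_PER_BYTE = 6        # assumption: fixed width, since 3**6 = 729 >= 256
TRIT_TO_BASE = "ATC"      # assumption: arbitrary mapping 0->A, 1->T, 2->C


def bytes_to_trits(data):
    """Write each byte as six base-3 digits, most significant digit first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(TRITS_PER_BYTE):
            byte, remainder = divmod(byte, 3)
            digits.append(remainder)
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits):
    """Inverse of bytes_to_trits."""
    if len(trits) % TRITS_PER_BYTE:
        raise ValueError("trit count must be a multiple of TRITS_PER_BYTE")
    out = bytearray()
    for start in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[start:start + TRITS_PER_BYTE]:
            value = value * 3 + trit
        if value > 255:
            raise ValueError("trit group does not encode a byte")
        out.append(value)
    return bytes(out)


def trits_to_bases(trits):
    """Map each trit to one of the three information-carrying bases."""
    return "".join(TRIT_TO_BASE[trit] for trit in trits)


trits = bytes_to_trits(b"DNA")
print(trits_to_bases(trits))
print(trits_to_bytes(trits))
```

A fixed width wastes some capacity (729 codes for 256 values) but keeps the round trip trivial, which is the only point of the sketch.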

http://arstechnica.com/science/2013/01/mp3-files-written-as-dna-with-storage-density-of-2-2-petabytes-per-gram/

morningfog Jan 2013 OP
formercia Jan 2013 #1

Response to morningfog (Original post)

Thu Jan 24, 2013, 09:51 AM

1. Sorry, Mom, I didn't mean to eat your Library

At least it's compostable.
