Math doesn't add up for the NSA database, must have MORE INFORMATION!!!!!!

This topic is archived.
originalpckelly Donating Member (1000+ posts) Wed May-17-06 08:46 AM
Original message
Math doesn't add up for the NSA database, must have MORE INFORMATION!!!!!!
Remember this article saying the NSA database was the biggest database in the world?
"'It's the largest database ever assembled in the world,' said one person, who, like the others who agreed to talk about the NSA's activities, declined to be identified by name or affiliation. The agency's goal is 'to create a database of every call ever made' within the nation's borders, this person added."

Well this article today, though the author didn't realize it, implies there must be MORE information in the NSA's database. He said that even with the most generous math, the NSA database probably collects 219 terabytes per year. If we multiply that times 5 (we will give the NSA the most generous amounts) it comes out to 1095 terabytes in total size. Divide that by 1000 (the number of terabytes in a petabyte, the next rung up the size chain) and we get about 1.1 petabytes. This is the problem: later on in the article he said that Disney had a database of about 1.5 petabytes.

How can it be the biggest database in the world if Disney has a database about 36% larger? Plus, when I went and looked up petabyte (to figure out how many terabytes are in a petabyte), I found the Internet Archive has 2 petabytes of info. How is the NSA database bigger than one that is about 82% bigger? And then there is Google, who is supposed to have up to 5 petabytes of data; of course that isn't official, so I guess it doesn't count.
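For anyone who wants to redo the size comparison, here is a minimal Python sketch. The 219 TB/year, 5-year, 1.5 PB and 2 PB figures are the ones quoted above; the helper name `percent_larger` is made up for illustration, and the estimate is rounded to one decimal place to match the 1.1 petabyte figure.

```python
# Figures quoted in the post; the helper name is hypothetical.
TB_PER_YEAR = 219   # columnist's generous yearly estimate, in terabytes
YEARS = 5           # generous multiplier used in the post
TB_PER_PB = 1000    # decimal (storage) convention

nsa_pb_exact = TB_PER_YEAR * YEARS / TB_PER_PB   # 1.095 PB
nsa_pb = round(nsa_pb_exact, 1)                  # rounded to 1.1 PB, as in the post

def percent_larger(other_pb, base_pb):
    """How much larger other_pb is than base_pb, as a percentage."""
    return (other_pb / base_pb - 1) * 100

disney_pct = percent_larger(1.5, nsa_pb)    # Disney vs. the NSA estimate
archive_pct = percent_larger(2.0, nsa_pb)   # Internet Archive vs. the NSA estimate

print(f"NSA estimate: {nsa_pb_exact:.3f} PB (about {nsa_pb} PB)")
print(f"Disney: {disney_pct:.0f}% larger, Internet Archive: {archive_pct:.0f}% larger")
```

Against the rounded 1.1 petabyte estimate, this works out to roughly 36% larger for Disney and roughly 82% larger for the Internet Archive.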

Here are the two USA Today articles:
The original article:
http://www.usatoday.com/news/washington/2006-05-10-nsa_x.htm
The technology one talking about the NSA database and its size:
http://www.usatoday.com/tech/columnist/kevinmaney/2006-05-16-nsa-database_x.htm
originalpckelly Donating Member (1000+ posts) Wed May-17-06 08:48 AM
Response to Original message
1. kick
 
originalpckelly Donating Member (1000+ posts) Wed May-17-06 09:02 AM
Response to Reply #1
7. kick
 
originalpckelly Donating Member (1000+ posts) Wed May-17-06 09:03 AM
Response to Reply #7
8. kick
 
endarkenment Donating Member (1000+ posts) Wed May-17-06 08:50 AM
Response to Original message
2. The NSA's budget is huge
and they have been building their data collection and analysis system for decades, with billions of our tax dollars every year. That is pretty much all they do. Build, collect, analyse. Assume that their system, the details of which are very secret, is the biggest such system on the planet.
 
originalpckelly Donating Member (1000+ posts) Wed May-17-06 08:57 AM
Response to Reply #2
5. But...
this story with the math is right. Phone calls alone should not amount to this kind of size. It must mean this is a part of some larger database.

The source must have let something slip. You see what I mean? That database isn't just about our call records because those wouldn't take up enough space.
 
annabanana Donating Member (1000+ posts) Wed May-17-06 09:03 AM
Response to Reply #5
9. credit cards, debit cards, the grocery database, EZ Pass,
drug purchases (in fact ALL computerized medical records)

oh, and, sure, the IRS.

The amount of stuff on computers about our lives is nearly endless....
 
originalpckelly Donating Member (1000+ posts) Wed May-17-06 09:09 AM
Response to Reply #9
11. I wonder how many other companies have sold out to the NSA?
I know that they have the data, but they don't typically arrest and hold someone on suspicion of a crime. That is why these private companies have been allowed to get this data.

I am telling you, this thing must be huge, because the NSA probably knows how big Google's database is and Google is supposed to have "a heck of" a lot of information.

We now live in a digital prison.
 
endarkenment Donating Member (1000+ posts) Wed May-17-06 09:04 AM
Response to Reply #5
10. I am agreeing with you.
They have opened the kimono a bit here. I'd say they have done it on purpose - to let us know just how extensive their system of control is.
 
originalpckelly Donating Member (1000+ posts) Wed May-17-06 08:52 AM
Response to Original message
3. By the way, doing math is the reason we knew Divine Strake was different:
People added up the size of the explosion (equivalent to .5 kt of TNT) and we figured out there is no plane that can carry that big of a bomb. The Pentagon had to then give a more insane statement later. WOW!! We might have found something BIG! This might be what Russell Tice will be talking about.
 
OregonBlue Donating Member (1000+ posts) Wed May-17-06 09:25 AM
Response to Reply #3
15. Sorry, off topic but when is Tice testifying? Is it today?
 
Skidmore Donating Member (1000+ posts) Wed May-17-06 08:57 AM
Response to Original message
4. What's a petabyte? I've still not entirely figured out
how much bigger a gigabyte is than a megabyte.
 
originalpckelly Donating Member (1000+ posts) Wed May-17-06 08:58 AM
Response to Reply #4
6. 1000 terabytes
1 terabyte is 1000 gigabytes
1 gigabyte is 1000 megabytes
1 megabyte is 1000 kilobytes
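If it helps to see the ladder as numbers, here is a small Python sketch of the same decimal steps. It assumes 1 kilobyte is 1000 bytes (the bottom rung isn't spelled out above), and the `bytes_in` helper is just an illustrative name.

```python
# Decimal (power-of-1000) units, smallest to largest.
# Assumption: 1 kilobyte = 1000 bytes, which the reply does not state.
UNITS = ["kilobyte", "megabyte", "gigabyte", "terabyte", "petabyte"]

def bytes_in(unit):
    """Number of bytes in one decimal unit, e.g. 'gigabyte' -> 10**9."""
    return 1000 ** (UNITS.index(unit) + 1)

for unit in UNITS:
    print(f"1 {unit} = {bytes_in(unit):,} bytes")

# Each rung is 1000 times the one below it.
mb_per_gb = bytes_in("gigabyte") // bytes_in("megabyte")
tb_per_pb = bytes_in("petabyte") // bytes_in("terabyte")
print(mb_per_gb, tb_per_pb)
```

So a gigabyte is 1000 times bigger than a megabyte, and a petabyte is a million gigabytes.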
 
DS1 Donating Member (1000+ posts) Wed May-17-06 09:11 AM
Response to Original message
12. They could be going by row counts
If Disney has 10,000,000 rows with binary data, that would eat up disk space.

The NSA could have 10 trillion rows, each with a few columns: phone, time, where from, where to, length.

Now, of course, the NSA could easily compress your conversations into a small mp3 size, and store that, too.
 
originalpckelly Donating Member (1000+ posts) Wed May-17-06 09:17 AM
Response to Reply #12
13. Well I was just listening to the math in the USA Today article...
not any other math. I don't think you would have rows per se in a computer database. You usually have like arrays or something.
 
DS1 Donating Member (1000+ posts) Wed May-17-06 09:22 AM
Response to Reply #13
14. Databases are built around rows
 
Recursion Donating Member (1000+ posts) Wed May-17-06 09:54 AM
Response to Reply #14
16. Rows, records, tuples
There are lots of terms that are used in the field.

The physical size on disk of a database can vary a lot, based on a lot of factors. There's the metadata overhead, the indexing overhead (anything this big will need to be indexed like crazy, which will make it even bigger), the filesystem overhead if they aren't writing to raw block devices, etc. Also, they may or may not encrypt the data at row-level and may or may not compress the data.

If the database is really 1.1 petabytes, that's a lot. Let's say naively they are storing sending phone number, receiving phone number, and timestamp. If these were just stored in an ASCII flat file (which for large amounts of data would almost certainly be less efficient than a database in terms of storage size, since even naive compression would reduce it by almost half, but let's just keep it simple), then you would have something like this:
calling station phone number: 10 bytes (let's assume this is just US to US calls)
receiving station phone number: 10 bytes
start timestamp: 6 bytes
end timestamp: 6 bytes
for a total of a 32-byte record (which like I said is probably much bigger than how it would actually be stored)
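To make that layout concrete, here is a rough Python sketch of the naive ASCII record using the standard `struct` module. The field widths are the ones listed above; the HHMMSS timestamp format, the sample phone numbers, and the `pack_call` helper are assumptions for illustration only.

```python
import struct

# Naive ASCII flat-file record: two 10-digit phone numbers, two 6-byte timestamps.
# "=" disables padding so the size is exactly the sum of the field widths.
RECORD = struct.Struct("=10s10s6s6s")

def pack_call(caller, callee, start, end):
    """Pack one call record as fixed-width ASCII fields (hypothetical helper)."""
    return RECORD.pack(
        caller.encode("ascii"),
        callee.encode("ascii"),
        start.encode("ascii"),   # assumed HHMMSS format
        end.encode("ascii"),     # assumed HHMMSS format
    )

# Made-up sample values
record = pack_call("2025550101", "2025550102", "084600", "085230")
print(RECORD.size, len(record))
print(record)
```

Both sizes come out to 32 bytes, which is the record size the rest of the estimate is built on.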

That would make each call record 32 bytes. Now, we have to remember that data in memory like that goes up by factors of 1024, while storage goes up in factors of 1000. That means 1 petabyte of storage (let's assume they have a 10% storage overhead for the db, which would leave 1 petabyte of actual data) would be (1000 * 1000 * 1000 * 1000 * 1000) / (1024 * 1024 * 1024 * 1024 * 1024) = about 0.89 petabytes in memory. Now, if a phone call record is 32 bytes, that means there are 32 phone records per kilobyte, 32768 per megabyte, 33554432 per gigabyte, 34359738368 per terabyte, and 35184372088832 per petabyte, which times 0.89 ~ 3.131409 * 10^13, or 31,314,090,000,000 (my computer's floating point precision broke off at that point and I don't feel like loading my arbitrary-precision calculator program; that's a ballpark though).
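The record-count arithmetic can be rechecked with Python's arbitrary-precision integers. This is only a sketch of the same calculation; the variable names are made up, and the 32-byte record and 1 petabyte of actual data are the assumptions stated above.

```python
RECORD_BYTES = 32  # naive ASCII record size from the layout above

# Binary (power-of-1024) units, as used for the per-unit record counts
names = ["kilobyte", "megabyte", "gigabyte", "terabyte", "petabyte"]
records_per = {
    name: 1024 ** power // RECORD_BYTES
    for power, name in enumerate(names, start=1)
}

# One decimal petabyte expressed in binary petabytes (rounded to 0.89 in the post)
ratio = 1000 ** 5 / 1024 ** 5

estimate = records_per["petabyte"] * ratio   # float ballpark, same route as the post
exact = 1000 ** 5 // RECORD_BYTES            # integer shortcut: 10^15 bytes / 32

for name in names:
    print(f"{records_per[name]:,} records per {name}")
print(f"ratio = {ratio:.4f}")
print(f"estimate = {estimate:,.0f}, exact = {exact:,}")
```

With the unrounded ratio (about 0.888) the count is 31.25 trillion records; rounding the ratio up to 0.89 is what gives the slightly higher 31.3 trillion ballpark.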

Think about that for a second. 31 trillion phone calls *if* they used a storage protocol so naive a first-year comp sci student would blush at it. Binary encoding rather than ASCII would reduce it by a factor of four, and further simple compression would cut that in half. That would make 248 trillion phone calls, even going by the estimate of a 1.1 petabyte database. If there are 300,000,000 Americans, that's 826,667 phone calls per American, or to put it another way, roughly 480 phone calls per American per day, every day since 9/11/2001.
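Here is a quick Python sketch of the per-person arithmetic. The 31 trillion ballpark, the factor of four for binary encoding, the factor of two for compression, and the 300,000,000 population are the figures from this post; using the date of this thread (May 17, 2006) as the end date is an assumption.

```python
from datetime import date

calls_naive = 31e12         # ballpark record count for the 32-byte ASCII layout
shrink = 4 * 2              # binary encoding (x4) plus simple compression (x2)
population = 300_000_000    # rough number of Americans used in the post

calls = calls_naive * shrink        # total call records that would fit
per_person = calls / population     # call records per American

# Assumed window: 9/11/2001 up to the date of this thread
days = (date(2006, 5, 17) - date(2001, 9, 11)).days
per_person_per_day = per_person / days

print(f"{calls:,.0f} calls in total")
print(f"{per_person:,.0f} calls per American")
print(f"{days} days, so {per_person_per_day:,.0f} calls per American per day")
```

That window is 1709 days, so 826,667 calls per person comes to roughly 480 calls per American per day.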

Something other than just what number called what number when is being stored.

Check out the SELinux project that NSA put out some time ago... these guys do know what they're doing...
 
DS1 Donating Member (1000+ posts) Wed May-17-06 10:09 AM
Response to Reply #16
17. I was hinting at them storing more than what I listed in my first
reply in this thread ;-)
 
Recursion Donating Member (1000+ posts) Wed May-17-06 10:16 AM
Response to Reply #17
18. Yeah, sorry
Not trying to steal your thunder, just getting some actual numbers into the fray.
 
DS1 Donating Member (1000+ posts) Wed May-17-06 10:37 AM
Response to Reply #18
19. Not about thunder, numbers are a good thing
:-)