A fast-growing FBI data-mining system billed as a tool for hunting
terrorists is being used in hacker and domestic criminal
investigations, and now contains tens of thousands of records from
private corporate databases, including car-rental companies, large
hotel chains and at least one national department store, declassified
documents obtained by Wired.com show.
Headquartered in Crystal City, Virginia, just outside Washington,
the FBI’s National Security Branch Analysis Center (NSAC) maintains a
hodgepodge of data sets packed with more than 1.5 billion government
and private-sector records about citizens and foreigners, the documents
show, bringing the government closer than ever to implementing the
“Total Information Awareness” system first dreamed up by the Pentagon
in the days following the Sept. 11 attacks.
Such a system, if successful, would correlate data from scores of
different sources to automatically identify terrorists and other
threats before they could strike. The FBI is seeking to quadruple the
known staff of the program.
But the proposal has long been criticized by privacy groups as
ineffective and invasive. Critics say the new documents show that the
government is proceeding with the plan in private, and without
sufficient oversight.
The FBI’s Data-Mining Ore
Composed of government information, commercial databases and records
acquired in criminal and terrorism probes, the FBI’s National Security
Branch Analysis Center is too broad to be considered mission-focused,
but still too patchy to be Orwellian. Here’s the data we know about.
• International travel records of citizens and foreigners
• Financial forms filed with the Treasury by banks and casinos
• 55,000 entries on customers of Wyndham Worldwide, which includes Ramada Inn, Days Inn, Super 8, Howard Johnson and Hawthorn Suites
• 730 records from rental-car company Avis
• 165 credit card transaction histories from Sears
• Nearly 200 million records transferred from private data brokers such as Accurint, Acxiom and Choicepoint
• A reverse White Pages with 696 million names and addresses tied to U.S. phone numbers
• Log data on all calls made by federal prison inmates
• A list of all active pilots
• 500,000 names of suspected terrorists from the Unified Terrorist Watch List
• Nearly 3 million records on people cleared to drive hazardous materials on the nation’s highways
• Telephone records and wiretapped conversations captured by FBI investigations
• 17,000 traveler itineraries from the Airlines Reporting Corporation
“We have a situation where the government is spending fairly large
sums of money to use an unproven technology that has a possibility of
false positives that would subject innocent Americans to unnecessary
scrutiny and impinge on their freedom,” said Kurt Opsahl, a lawyer with
the Electronic Frontier Foundation. “Before the NSAC expands its
mission, there must be strict oversight from Congress and the public.”
The FBI declined to comment on the program.
Among the data in its archive, the NSAC houses more than 55,000 entries on customers of the Cendant Hotel chain, now known as Wyndham Worldwide,
which includes Ramada Inn, Days Inn, Super 8, Howard Johnson and
Hawthorn Suites. The entries are for hotel customers whose names
matched those on a long list the FBI provided to the company.
Another 730 records come from the rental car company Avis, which
used to be owned by Cendant. Those records were derived from a one-time
search of Avis’s database against the State Department’s old terrorist
watch list. An additional 165 entries are credit card transaction
histories from the Sears department store chain. Like much of the data
used by NSAC, the records were likely retained at the conclusion of an
investigation, and added to NSAC for future data mining.
It’s unclear how the FBI got the records. In the past, companies
have been known to voluntarily hand over customer data to government
data-mining experiments — notably, in 2002, JetBlue secretly provided
a Pentagon contractor with 5 million passenger itineraries, for which
it later apologized. But the FBI also has broad authority to demand
records under the Patriot Act, using so-called “national security
letters” — a kind of self-issued subpoena that’s led to repeated abuses
being uncovered by the Justice Department’s inspector general.
Wyndham Worldwide did not respond to repeated requests for comment. Sears declined comment.
Wired.com’s analysis of more than 800 pages of documents obtained
under our Freedom of Information Act request shows the FBI has been
continuously expanding the NSAC system and its goals since 2004. By
2008, NSAC comprised 103 full-time employees and contractors, and the
FBI was seeking budget approval for another 71 employees, plus more
than $8 million for outside contractors to help analyze its growing
pool of private and public data.
A long-term planning document from the same year shows the bureau ultimately wants to expand the center to 439 people.
As described in the documents, the system is both a meta-search
engine — querying many data sources at once — and a tool that performs
pattern and link analysis. The NSAC is an analytic Swiss army knife.
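For readers curious what that combination looks like in practice, here is a minimal Python sketch. It is not drawn from the FBI documents: the data sources, record fields and function names are all hypothetical, and a real system would be far more elaborate. It only illustrates the two ideas named above, one query fanned out across several sources, and links drawn between people who share an identifier.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical in-memory "data sources". Every source name, person and phone
# number here is invented for illustration; none comes from the documents.
SOURCES = {
    "hotel_stays": [
        {"name": "A. Example", "phone": "555-0101"},
        {"name": "B. Sample", "phone": "555-0102"},
    ],
    "car_rentals": [
        {"name": "A. Example", "phone": "555-0103"},
    ],
    "phone_directory": [
        {"name": "C. Placeholder", "phone": "555-0103"},
        {"name": "B. Sample", "phone": "555-0102"},
    ],
}


def meta_search(name, sources=SOURCES):
    """Query every source for one name and merge the hits, tagged by source."""
    hits = []
    for source_name, records in sources.items():
        for record in records:
            if record["name"] == name:
                hits.append({"source": source_name, **record})
    return hits


def link_analysis(sources=SOURCES):
    """Link any two distinct names that share a phone number in any source."""
    names_by_phone = defaultdict(set)
    for records in sources.values():
        for record in records:
            names_by_phone[record["phone"]].add(record["name"])

    links = set()
    for names in names_by_phone.values():
        # Each unordered pair of names on the same number becomes one link.
        for a, b in combinations(sorted(names), 2):
            links.add((a, b))
    return links


if __name__ == "__main__":
    print(meta_search("A. Example"))
    print(link_analysis())
```

In this toy data, the two people linked are connected only by a shared phone number, which could just as easily be a front-desk line or a reassigned number. That is the kind of false positive critics of data mining warn about.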
The FBI used the system to locate a suspected Al Qaeda operative
with expertise in biological agents who was hiding out in Houston. And
when law enforcement officials got information suggesting members of a
Pakistani terrorist group had obtained jobs as Philadelphia taxi
drivers, the NSAC was tapped to help the city’s police force run
background checks on Philadelphia cabbies.
(A Jordanian-born Philly cab driver was convicted in 2008 for his
part in a plot to attack the Fort Dix army base in New Jersey, but
there’s no evidence of a connection between the investigations.)
And when the FBI lost track of terrorism suspects swept in the
evacuation from Hurricane Katrina in 2005, it created a standing order
in the system to flag any activity by the missing targets.
Additionally, the FBI shared NSAC data with the Pentagon’s
controversial Counter-Intelligence Field Activity office, a secretive
domestic-spying unit which collected data on peace groups, including
the Quakers, until it was shut down in 2008. But the FBI told lawmakers
it would be careful in its interactions with that group.
Conventional criminal cases have also benefited. In a 2004 case
against a telemarketing company called Gecko Communications, NSAC used
its batch-searching capability to provide prosecutors with detailed
information on 192,000 alleged victims of a credit scam.
The feds suspected that Gecko had promised to help the victims
improve their credit scores, and then failed to produce results. NSAC
automatically analyzed the victims’ credit records to prove their
scores hadn’t improved, a task that took two days instead of the
four-and-a-half years that the U.S. Attorney’s Office had expected to
sink into the job. In December 2006, the owners and seven office
managers at the company were sentenced to prison.
The NSAC was born as two separate systems designed to improve
information-sharing between government agencies following the Sept. 11
attacks. The Foreign Terrorist Tracking Task Force database has been
used to screen flight-school candidates and assist anti-terror
investigations. The Investigative Data Warehouse is the more general
system, and is the principal element now under expansion.
“The IDW objective was to create a data warehouse that uses certain
data elements to provide a single-access repository for information
related to issues beyond counterterrorism to include
counterintelligence, criminal and cyber investigations,” stated a
formerly secret fiscal year 2008 budget request document. “These
missions will be refined and expanded as these capabilities are folded
into the NSAC.”
When the bureau unified the systems under the NSAC banner in 2007,
the move set off alarm bells with lawmakers, who thought it sounded a
lot like the Pentagon’s widely criticized Total Information Awareness
project, which had sought to identify terrorist sleeper cells by
linking up and searching through U.S. credit card, health and
communication databases. The TIA program had moved into the shadows of
the intelligence world after Congress voted to revoke most of its
funding.
In 2007, Republican congressman James Sensenbrenner asked the
Government Accountability Office to look into the NSAC. No report has
been made public yet. But the documents obtained by Wired.com show that
the FBI has repeatedly downplayed the database’s capabilities when
addressing critics in Congress, while simultaneously talking up — in
budget documents — the system’s power to spit out the names of newly
suspicious persons.
The FBI deflected criticism from a House
committee on June 29, 2007, by pointing out a major difference between
the NSAC and the shuttered TIA program: The NSAC, the bureau said, is
not as open-ended. “A mission is usually begun with a list of names or
personal identifiers that have arisen during a threat assessment,
preliminary or full investigation,” the unsigned response read. “Those
people under investigation are then assessed to determine if they have
any association with terrorism or foreign espionage.”
But a formerly secret 2008 funding justification document among the
newly released documents suggests the FBI’s pre-crime intentions are
much wider than the bureau acknowledged.
The NSAC will also pursue “pattern analysis” as part of its service
to the [National Security Branch]. Pattern analysis queries take a
predictive model or pattern of behavior and search for that pattern in
data sets. The FBI’s efforts to define predictive models … should
improve efforts to identify “sleeper cells.”
As an example, the FBI said its sophisticated data queries allowed
it to identify 165 licensed helicopter pilots who came from countries
of interest, and found that six of those had “derogatory” information
about them in the NSAC computers. It sent the leads to FBI field agents
in Los Angeles.
The FBI also has ambitious plans to expand its data set, the budget
request shows. Among the items on its wish list is the database of the
Airlines Reporting Corporation — a company that runs a backend system
for travel agencies and airlines. A complete database would include
billions of Americans’ itineraries, including all the information found
on the front of a ticket and the method of payment.*
So far, the company has given the FBI nearly 17,000 records, which
are now part of NSAC. Spokesman Allan Mutén said the company gives the
FBI records only when presented with a subpoena or a national security letter
— which, he adds, has happened quite a bit. “Nine-eleven was a time and
event that piqued the interest of the authorities in airline passenger
data,” Mutén said.
The ever-growing size of the database concerns EFF’s Opsahl, who has pieced together the best picture of the FBI’s data mining system through other government FOIA requests.
Opsahl cites an October 2008 National Research Council paper
that concluded that data mining is a dangerous and ineffective way to
identify potential terrorists, which will inevitably generate false
positives that subject innocent citizens to invasive scrutiny by their
government.
At the same time, Opsahl admits the NSAC is not at the moment the Orwellian system that TIA would have been.
“This is too massive to be based on a particular query, but too
narrow to reflect a policy that they are going to go out and collect this
kind of data systematically,” Opsahl said.
That could change if the FBI gets its hands on the data sources on
its 2008 wish list. That list includes airline manifests sent to the
Department of Homeland Security, the national Social Security number
database, and the Postal Service’s change-of-address database. There
are also 24 additional databases the FBI is seeking, but those names
were blacked out in the released data.
Graphic: Wired.com/Dennis Crothers
* Correction: This story reported that ARC’s database included
information such as date of birth, credit card numbers, names of
friends and family, e-mail addresses, meal preferences and health
information. ARC does not have access to the data, which lives in the
Passenger Name Record, which is handled by other entities. ARC only has
the data that appears on an airline ticket, and payment method, in
order to facilitate payment. Wired.com regrets the error.