December 31, 2017

Section 702 FAA expires: what are the problems with PRISM and Upstream?

(UPDATED: January 20, 2018)

Two important NSA programs, PRISM and Upstream, are based upon section 702 of the FISA Amendments Act (FAA), a law that was originally scheduled to expire today. Now the US Congress has to decide whether to continue or to reform this crucial legal authority.

Although PRISM became almost synonymous with NSA's alleged mass surveillance, it is actually, just like the Upstream program, targeted collection aimed at specific foreign targets. Still, many people think that these programs pull in far too much data (incidental collection), which can subsequently be queried in an illegal way (backdoor searches).

Here we'll show some of the complexities of these two collection programs, as well as the various internal procedures and methods that are meant to keep collection and analysis as focussed as possible.

Slide from the PRISM presentation that for the first time revealed PRISM
and Upstream as part of section 702 FAA collection.

Until recently, US lawmakers were too involved with president Trump's tax reform to devote enough attention to section 702 FAA. Therefore, on December 21, Congress extended the authority of this law through January 19, 2018. Lawyers from the Trump administration even concluded that the intelligence agencies can lawfully continue to operate under the FAA through late April (because the current FISA Court certification for the program actually expires late April 2018).

This leaves Congress some extra months to either reform or strengthen this important authority. There are several proposals, spanning from making the existing law permanent without changes, to imposing significant new limits to safeguard the privacy rights of Americans.

Meanwhile, the Office of the Director of National Intelligence (ODNI) released additional information about data collection under section 702 FAA and published, for example, a Section 702 Overview, which includes some nice infographics:

Diagram from ODNI about section 702 FAA collection.

702 FAA collection

The Snowden revelations have shown that under the legal authority of section 702 FAA, NSA conducts two types of data collection:

- Upstream collection, for both internet and telephone communications, which are filtered out based upon specific selectors at major telephone and internet backbone switches. This takes place under the collection programs FAIRVIEW and STORMBREW.

- Downstream collection, only for internet (including internet telephony) communications, based upon specific selectors, which are acquired from at least 9 major American internet companies. This takes place under the collection program PRISM.

The Upstream and Downstream programs differ from each other in many ways, but what they have in common is that collection takes place inside the United States while being aimed at foreign targets, although just one end of their communications has to be foreign. This means these programs also pull in communications between targeted foreigners and Americans - which is one of the main purposes of these programs: finding connections between terrorists inside and outside the US.

Slide showing the main differences between PRISM and Upstream
Published on October 22, 2013.

Upstream filtering

Although Upstream collection is based upon specific selectors, the American Civil Liberties Union (ACLU) presents it as "bulk surveillance", because in their opinion, the automated filtering actually means that NSA is "searching the contents of essentially everyone’s communications." They therefore call these searches extraordinarily far-reaching, unprecedented and unlawful.

The Electronic Frontier Foundation (EFF) has a similar position and says that splitting internet cables is "unconstitutional seizure", while the subsequent search for selectors is an "unconstitutional search."

These judgements seem based upon comparing digital filtering with intercepting letters or telegrams (like what happened under project SHAMROCK from 1945-1975), but this ignores the differences with computer technology: NSA does copy entire data streams, but at virtually the same moment the filter system picks out the communications associated with the selectors, the other data are gone.

Searching through data packets of innocent people means at the same time destroying them - except when they contain one of the selectors which NSA is interested in.
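
The filtering principle described above can be illustrated with a minimal sketch. It is not how NSA's actual equipment works: the `Message` type, the `filter_stream` function and the example addresses are all hypothetical, and only "to" and "from" matching is shown. The point is merely that non-matching traffic passes through the filter without being stored anywhere.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Message:
    """Hypothetical, simplified stand-in for an intercepted communication."""
    sender: str
    recipient: str
    body: str


def filter_stream(stream: Iterable[Message], selectors: set[str]) -> Iterator[Message]:
    """Yield only messages to or from a tasked selector."""
    for message in stream:
        if message.sender in selectors or message.recipient in selectors:
            yield message
        # Non-matching messages are never stored: once the loop moves on,
        # the filter keeps no reference to them.


# Hypothetical example data
SELECTORS = {"target@example.org"}
STREAM = [
    Message("alice@example.com", "bob@example.com", "lunch?"),
    Message("target@example.org", "carol@example.com", "hello"),
    Message("dave@example.com", "target@example.org", "re: hello"),
]

retained = list(filter_stream(STREAM, SELECTORS))
print(len(retained), "of", len(STREAM), "messages retained")
```

Note that the two retained messages still involve people (carol, dave) who are not targets themselves, which is the incidental collection discussed further on.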

Diagram from the EFF about Upstream collection.

Storage and classification

Under section 702 FAA, only data that are associated with a specific selector are stored. For Upstream collection, this means only the communications that remain after the filtering process. These are processed (decoded, formatted, etc.) and stored in NSA databases for a maximum of only 2 years.

Downstream collection under the PRISM program consists of all the data associated with specific selectors that the big internet companies hand over to the FBI, which then forwards them to NSA. These are also processed and then stored for a maximum of 5 years.

Data from FAA collection are usually stored in separate database partitions and are protected by the Exceptionally Controlled Information (ECI) compartment RAGTIME (RGT). Only analysts who are cleared for RAGTIME, have a specific need-to-know and are authorized by the data owner have access to these data.
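
The two retention limits mentioned above can be expressed as a small sketch. The function names are hypothetical, and we simply assume that the retention period is counted from the date of collection; the actual age-off rules may be defined differently.

```python
from datetime import date

# Maximum retention in years, as stated in the text above.
RETENTION_YEARS = {"upstream": 2, "prism": 5}


def expiry_date(collected_on: date, source: str) -> date:
    """Date on which a record reaches its maximum retention (assumption:
    counted from the day of collection)."""
    years = RETENTION_YEARS[source]
    try:
        return collected_on.replace(year=collected_on.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year: use 28 February.
        return collected_on.replace(year=collected_on.year + years, day=28)


def is_expired(collected_on: date, source: str, today: date) -> bool:
    return today >= expiry_date(collected_on, source)


collected = date(2016, 1, 15)
today = date(2018, 1, 20)
print(is_expired(collected, "upstream", today))  # Upstream limit of 2 years has passed
print(is_expired(collected, "prism", today))     # PRISM limit of 5 years has not
```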

A few months before the start of the Snowden revelations, a book had already revealed that RAGTIME has 4 components:
- RAGTIME-A: foreign-to-foreign counterterrorism (CT) data
- RAGTIME-B: data from foreign governments (FG) transiting the US
- RAGTIME-C: data related to counterproliferation (CP) activities
- RAGTIME-P: domestic bulk collection of internet metadata
Note that the first three components correspond to the first three FISA Court certifications that authorize section 702 FAA collection.

Last November, ZDNet reported on a leaked NSA document that lists a total of 11 components of RAGTIME. Besides the 4 known ones, the document also mentions RAGTIME-BQ, F, N, PQ, S, T and USP, but so far, we don't know what kind of data they protect.

On August 26, 2013, Der Spiegel published the only document so far from the RAGTIME (RGT)
compartment: the floorplan of the EU mission to the United Nations in New York.
Note the PINWALE ID (PWID): PWZA20120551215230001427125

Incidental collection

As almost every NSA target will communicate with at least some individuals who are not involved in terrorism or other threats to national security, it's inevitable that even targeted interception will result in storing communications of innocent (American) people too - NSA calls this "incidental collection".

The share of this incidental collection as part of the overall collection is not known: in early 2017, NSA agreed to provide some information about how many American citizens may be impacted, but later, Director of National Intelligence (DNI) Dan Coats said that it "remains infeasible" for the government to cite a meaningful number.

Actual intercepts

Edward Snowden was also eager to draw public attention to this issue, and maybe he took his last job for Booz Allen at NSA in Hawaii for the sole purpose of getting access to raw data collected under section 702 FAA. In his view, the PRISM and Upstream programs "crossed the line of proportionality."

He succeeded in his effort and was able to exfiltrate a cache of ca. 22,000 collection reports, containing 160,000 individual conversations (75% of which were instant messages), which were intercepted by NSA between 2009 and 2012 - a much more substantive leak than the usual internal PowerPoint and SharePoint material.

Snowden handed them over to The Washington Post, which reported on this cache on July 5, 2014. After a cumbersome investigation, it found that the intercepted communications contained valuable foreign intelligence information, but also that over 9 out of 10 account holders were not the intended surveillance targets and that nearly half of the files contained US person identifiers.

Breakdown of the intercepted messages collected under 702 FAA authority
that were reviewed by The Washington Post.

Targeted interception

The numbers from The Post do sound like massive overcollection, but we should keep in mind that this is still targeted collection, something that privacy advocates always prefer to bulk collection.

NSA's Upstream program will likely pull in just as many communications of innocent people as when the police tap phone numbers and IP addresses under a warrant, although NSA targets may be more careful in conducting private telecommunications than ordinary criminals.

From the dataset examined by The Washington Post, it becomes clear that innocent people can be affected in two ways: first, when they communicate directly with (or about) a foreign target, and second, by "joining a chat room, regardless of subject, or using an online service hosted on a server that a target used for something else entirely."

This shows that even with targeted interception, the technical configuration of certain internet platforms apparently makes it quite difficult, or even impossible, to isolate the conversations in which a target is personally involved.

As the dataset that Snowden exfiltrated seems to be derived from both Upstream and PRISM collection, it's hard to say which of these programs is more intrusive. Upstream became a less useful source since the most common communication services have been encrypted, while PRISM may also not be as productive as before, after it was exposed by the press.

Dataflow diagram for Upstream collection under the FAIRVIEW program.
Published on November 16, 2016.

Backdoor searches

On August 9, 2013, The Guardian disclosed the so-called "backdoor searches". This is a method used by NSA analysts that was approved by the FISA Court in October 2011, so these searches are not illegal, as the term "backdoor" might suggest.

Apparently these backdoor searches were introduced as a replacement for the bulk collection of domestic internet metadata under the PR/TT program, which NSA terminated by the end of 2011.

These backdoor searches are not about collecting new data by tapping telephone and internet cables or acquiring data from internet companies, but about conducting searches in data that have already been collected.

While in general, NSA is only allowed to collect new data when they are related to foreign targets, these backdoor searches may also involve identifiers (like names, e-mail addresses and phone numbers) of US citizens, hence they are now officially called "U.S. person queries".

Initially, these searches were only allowed for data from PRISM, because Upstream not only collected communications "to" and "from", but also "about" targets, which made it more sensitive than PRISM collection (Upstream appeared to pull in tens of thousands of purely domestic e-mails each year).

In April 2017, NSA halted this "about" collection, after which the FISA Court allowed NSA to also conduct US person queries on data collected through the Upstream program - something that had already been happening since at least mid-2013.

Risks and safeguards

NSA analysts retrieving communications of Americans is of course reminiscent of the notorious project MINARET (1967-1973), under which NSA targeted 1,650 US citizens, including civil rights leaders, journalists and even two senators.

After Glenn Greenwald tried, but failed, to prove that NSA is still monitoring American citizens in that way, it's now these backdoor searches which are considered the biggest privacy violations under section 702 FAA - the ACLU says that they allow "spying on U.S. residents without a warrant."

Even former NSA director Michael Hayden was aware of the privacy risks of these queries, but the PCLOB report about section 702 explains that NSA has procedures and requirements to limit these US person queries, although they are different for content and for metadata:

- Queries of content are only permitted for US person identifiers that have been pre-approved (i.e. added to a white list) through one of several processes, including other FISA processes. Such approvals are for example granted for US persons for whom there are already individual warrants from the FISA Court under section 105 FISA or section 704 FAA. US person identifiers can also be approved by the NSA's Office of General Counsel after showing that using a certain US person identifier would "reasonably likely return foreign intelligence information."

- Queries of metadata may only be conducted in a system that requires analysts to document the basis for their metadata query (a Foreign Intelligence (FI) justification) prior to conducting the query. An oversight report adds that "analysts are not required to check any specific database or seek any internal approvals prior to executing a query against [702 FAA] metadata."
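
The difference between these two regimes can be sketched as follows. This is only an illustration of the rules as described in the PCLOB report: the `QueryGate` class, its method names and the example identifiers are hypothetical and do not represent any actual NSA system.

```python
from dataclasses import dataclass, field


class QueryRejected(Exception):
    """Raised when a query does not meet the precondition for its type."""


@dataclass
class QueryGate:
    # Identifiers that were pre-approved for content queries (the "white list").
    approved_identifiers: set[str] = field(default_factory=set)
    # Every accepted query is recorded as (type, identifier, basis).
    audit_log: list[tuple[str, str, str]] = field(default_factory=list)

    def content_query(self, identifier: str) -> str:
        # Content: the identifier itself must have been approved beforehand.
        if identifier not in self.approved_identifiers:
            raise QueryRejected(f"{identifier!r} is not pre-approved for content queries")
        self.audit_log.append(("content", identifier, "pre-approved"))
        return f"content results for {identifier}"

    def metadata_query(self, identifier: str, fi_justification: str) -> str:
        # Metadata: no pre-approval, but the justification must be documented first.
        if not fi_justification.strip():
            raise QueryRejected("a foreign intelligence justification must be documented first")
        self.audit_log.append(("metadata", identifier, fi_justification))
        return f"metadata results for {identifier}"


gate = QueryGate(approved_identifiers={"approved@example.com"})
print(gate.content_query("approved@example.com"))
print(gate.metadata_query("other@example.com", "possible contact of a foreign target"))
```

In this sketch, a content query on `other@example.com` would be rejected, while a metadata query on the same identifier goes through as long as a justification is written down.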

Relevant queries

In general, NSA analysts are required to create queries that are as focussed as possible so they return information that is most useful and relevant for their foreign intelligence mission. According to the PCLOB report, analysts receive "training regarding how to use multiple query terms or other query discriminators (like a date range) to limit the information that is returned in response to their queries of the unminimized data."
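
What such query discriminators do can be shown with a toy example. The `Record` type, the `focused_query` function and the sample data are hypothetical; the sketch only demonstrates that adding terms and a date range narrows down what is returned.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    sent_on: date
    text: str


def focused_query(records, terms, start, end):
    """Return records dated within [start, end] that contain every query term."""
    lowered = [term.lower() for term in terms]
    return [
        record for record in records
        if start <= record.sent_on <= end
        and all(term in record.text.lower() for term in lowered)
    ]


# Hypothetical example data
RECORDS = [
    Record(date(2017, 3, 1), "Meeting in Berlin about the shipment"),
    Record(date(2017, 6, 10), "Shipment delayed, new meeting needed"),
    Record(date(2016, 6, 10), "Old note about a shipment and a meeting"),
]

broad = focused_query(RECORDS, ["shipment"], date.min, date.max)
narrow = focused_query(RECORDS, ["shipment", "berlin"], date(2017, 1, 1), date(2017, 12, 31))
print(len(broad), len(narrow))
```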

In the Section 702 Overview that was published by ODNI on December 20, it is explained that US person queries on metadata are useful as they are often the fastest and most efficient way to check whether and how a certain US person (either suspect or victim) is connected to foreign actors. The overview also provides some remarkably concrete examples:
- Using the name of a US person hostage to cull through communications of the terrorist network that kidnapped her to pinpoint her location and condition;
- Using the e-mail address of a US victim of a cyber-attack to quickly identify the scope of malicious cyber activities and to warn the U.S. person of the actual or pending intrusion;
- Using the name of a government employee that has been approached by foreign spies to detect foreign espionage networks and identify other potential victims;
- Using the name of a government official who will be traveling to identify any threats to the official by terrorists or other foreign adversaries.

Dataflow diagram for Downstream collection under the PRISM program.
Published on June 29, 2013.

Numbers of queries

While NSA and the Office of the Director of National Intelligence (ODNI) were apparently not able to provide numbers about the "incidental collection" under section 702 FAA, they do better when it comes to numbers about the backdoor searches.

In a letter to senator Wyden, then-DNI Clapper wrote that in 2013, NSA approved 198 US person identifiers for querying the content, and that there had been ca. 9,500 queries on metadata from data collected under the PRISM program, of which ca. 36% were duplicative or recurring queries.

ODNI's annual transparency report also provides numbers of US person queries. In 2016, there were 5,288 content queries, but this also includes CIA queries and NSA searches of content from Upstream collection, something that was actually unauthorized until April 2017 (see above), but which the agency is now trying to make visible.

The rise in the number of US person queries on metadata is even higher, as it went up from 9,500 in 2013 to 30,355 in 2016. The total presented in the ODNI report is supposed to apply to NSA, CIA and FBI, but actually it only shows the number for NSA, as the CIA isn't yet able to count such queries and the FBI isn't required to do so (see below).
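
To put the metadata figures in perspective, the following short calculation uses only the approximate numbers cited above (ca. 9,500 queries in 2013, of which ca. 36% were duplicative or recurring, and 30,355 queries in 2016). The variable names are our own, and the results are only as precise as those rounded inputs.

```python
metadata_queries_2013 = 9_500   # approximate figure from the letter to senator Wyden
duplicative_share = 0.36        # approximate share of duplicative or recurring queries
metadata_queries_2016 = 30_355  # figure from the ODNI transparency report

duplicative_2013 = round(metadata_queries_2013 * duplicative_share)
non_duplicative_2013 = metadata_queries_2013 - duplicative_2013
growth_factor = metadata_queries_2016 / metadata_queries_2013

print(f"2013: ca. {duplicative_2013} duplicative, ca. {non_duplicative_2013} other queries")
print(f"2013 to 2016: roughly a factor of {growth_factor:.1f}")
```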

It should be noted that for content, it's the particular identifier that is counted, not the number of times such an identifier is actually used to query the databases. For metadata this is different, as the agencies count each time a certain identifier is queried, which of course results in far higher numbers.
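
The effect of these two counting methods can be made visible with a small, entirely hypothetical query log: for content the number of distinct identifiers is counted, for metadata every single query.

```python
# Hypothetical query log: (query type, identifier)
QUERY_LOG = [
    ("content", "id-1"), ("content", "id-1"), ("content", "id-2"),
    ("metadata", "id-3"), ("metadata", "id-3"), ("metadata", "id-3"),
    ("metadata", "id-4"),
]

# Content: each identifier counts once, however often it is used.
content_count = len({ident for kind, ident in QUERY_LOG if kind == "content"})

# Metadata: every executed query counts, including repeated ones.
metadata_count = sum(1 for kind, _ in QUERY_LOG if kind == "metadata")

print(content_count, metadata_count)
```

With three content queries and four metadata queries in the log, the reported numbers would be 2 and 4 respectively, so the two totals cannot be compared directly.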

Numbers of US person queries on metadata, 2013-2016.

FBI searches

Besides NSA and CIA, the FBI is also allowed to conduct backdoor or US person searches on data that NSA collected under the PRISM program - something that is considered even more problematic, given the risk of parallel construction. The FBI doesn't need individual warrants for these searches either, but its agents should "design their queries in such a way that they will return evidence of a crime."

The FBI stores data from 702 FAA collection in the same repositories as data from its own traditional FISA monitoring and physical searches. This means that these data are searched and queried many times for other than national security purposes too, but the section 702 data can only be viewed by agents or analysts with the proper training and access rights.

Given the fact that the initial collection under section 702 FAA is aimed at foreign targets, it is "extremely unlikely" that this collection contains data that are of interest to FBI agents who are investigating criminal cases. Even though, inevitably, a relatively large amount of unrelated American communications is pulled in, the chance that they are useful for a particular criminal case is very small.

Besides that, by far most FBI searches on section 702 data are for national security investigations, meaning those into foreign espionage, terrorism and Weapons of Mass Destruction (WMD). It's not clear whether the FBI has similar restrictions for content queries as NSA.


On January 11, 2018, the House of Representatives voted to extend section 702 FAA for another six years, which is until the end of 2023.

This means that the US Person or backdoor searches can continue without individualized warrants, except for a "narrow warrant requirement that applies only for searches in some later-stage criminal investigations, a circumstance which the FBI itself has said almost never happens."

The renewal of section 702 also allows the restart of the "about" collection under the Upstream program, which was ended by NSA in April 2017, after being criticized by the FISA Court.

The bill went to the Senate, which voted to invoke so-called cloture on January 16. This means there will be no further debate or amendments - a disappointing end for liberal Democrats and libertarian Republicans who tried to limit the scope of intelligence collection under section 702.

By a vote of 65-34, the Senate passed the bill to renew section 702 FAA on January 18, 2018. The next day, president Trump signed the bill into law.

Links and sources
- Bruce Schneier: After Section 702 Reauthorization
- Politico: Five years after Snowden, security hawks notch landmark win
- Lawfare: FISA Section 702 Reauthorization Resource Page
- Congress is Debating Warrantless Surveillance in the Dark
- New York Times: Warrantless Surveillance Can Continue Even if Law Expires, Officials Say
- The Problems with Rosemary Collyer’s Shitty Upstream 702 Opinion
- The Washington Post: In NSA-intercepted data, those not targeted far outnumber the foreigners who are + The Debrief - An occasional series offering a reporter’s insights
- B. Hanssen: Why the NSA’s Incidental Collection under Its Section 702 Upstream Internet Program May Well Be Bulk Collection, Even If The Program Engages In Targeted Surveillance
- NSA Director of Civil Liberties and Privacy Office Report: NSA's Implementation of Foreign Intelligence Surveillance Act Section 702
- Privacy and Civil Liberties Oversight Board: Surveillance Program Operated Pursuant to Section 702 FISA