Sources and Uses of Data at the Bureau of Consumer Financial Protection

1. Introduction

The Bureau of Consumer Financial Protection (Bureau) was created by the Dodd-Frank Wall Street Reform and Consumer Protection Act (Dodd-Frank Act). The Dodd-Frank Act states that the Bureau "shall seek to implement and … enforce Federal consumer financial law consistently for the purpose of ensuring that all consumers have access to markets for consumer financial products and services and that [those markets] are fair, transparent, and competitive." Data inform much of this work.

Outside of the Bureau's Operations Division, three Divisions of the Bureau conduct most of the Bureau's data-driven work: the Division of Supervision, Enforcement and Fair Lending (SEFL) is responsible for conducting supervisory, enforcement and fair lending activities; the Division of Research, Markets, and Regulations (RMR) is responsible for conducting research and monitoring the consumer financial products and services markets as well as developing, implementing, and assessing regulations; and the Division of Consumer Education and Engagement (CEE) is responsible for providing financial education to consumers and collecting, monitoring, and responding to consumer complaints regarding consumer financial products or services.

To fulfill its statutory functions and obligations, the Bureau obtains data to inform its decisions. The activities these data support include:

  • Writing rules, supervising companies, and enforcing the law
  • Taking consumer complaints
  • Providing financial education
  • Researching the consumer experience of using financial products, and
  • Monitoring financial markets for new risks to consumers

The Bureau has data governance processes for each stage of the data lifecycle, including intake, management, publication, and disposition of data. This report describes those processes as well as what data the Bureau collects, where data come from, how data are used, and how data are accessed and "reused" within the Bureau. Appendix A provides copies of the governance documents, Appendix B lists the Bureau's data assets subject to the limitations described below, and Appendix C lists the Bureau's Memoranda of Understanding with other governmental and quasi-governmental agencies that address the sharing of data. The Bureau intends to supplement this report with the text of its MOUs subject to obtaining the necessary approval of affected state and federal agencies.

Appendix B does not cover Bureau data collections from consumers on a voluntary basis through focus groups, one-on-one interviews, user testing or small-scale informal surveys, except where such data collection took place in the context of developing disclosures in a rulemaking or potential rulemaking context. The Bureau is compiling information with respect to the data collections excluded from this report and will supplement the report by adding it to Appendix B. The Bureau also is compiling and will add to Appendix B information on the third-party costs associated with the purchase and/or collection of information by or on behalf of the Bureau to the extent possible.

2. Data governance at the Bureau

Today, data intake, management, and publication are governed by the Bureau's Policy on Information Governance and related directives and operational charters, as part of the Bureau's data governance program. The Bureau's Chief Information Officer (CIO) has delegated responsibility for this program to the Chief Data Officer (CDO), who generally oversees the Bureau's data management. In this capacity, the CDO makes decisions regarding intake, management, disclosure, and disposition of Bureau data. The CDO reports to the Bureau's CIO.

The CDO's Office includes a Data Policy team that develops policies, standards, and guidance to help the Bureau manage its data assets throughout the data lifecycle. This team leads the Bureau's data governance program, managing how Bureau employees collect, use, access, and disclose data. The Bureau has established several internal advisory bodies to review general data governance issues as well as specific data intakes and public disclosures of data, as described below.

The Bureau has data governance policies in place that govern how information and data are: brought into the Bureau; shared internally across Divisions; released to the public; securely stored, classified, and used; and ultimately disposed of. In addition, some data collection authorities related to the day-to-day activities are delegated to specific divisions or offices, as noted below.

2.1 Bureau data governance policies

The CIO signed the Bureau's Policy on Information Governance in June 2014. It establishes the overall framework for the Bureau's data governance program. This policy sets in place guidelines that:

  • Address what information the Bureau can and should take in, and how that information intake shall occur in order to ensure compliance with applicable laws, contractual obligations, and Bureau policy requirements.
  • Set forth standards for assigning information a sensitivity level, which may trigger additional guidelines and policies on the information's access, use, and overall management.
  • Ensure information is adequately secured and responsibly used in accordance with applicable laws, contractual obligations, and Bureau policy requirements.
  • Set forth standards for what information can and should be disclosed by the Bureau and its program offices, subject matter experts, and data owners, either to the general public or to other government entities.
  • Describe the rules, roles, and responsibilities related to the retention, archiving, and destruction of electronic and physical information and related assets.

The Policy on Information Governance establishes the Bureau's Data Governance Board (DGB). It also describes how information governance oversight responsibilities may be delegated and outlines how the Bureau will take in, manage, and disclose information.

2.2 Bureau data governance bodies

The Bureau manages its data through a centralized data governance program. In 2011, the Bureau stood up the Data Coordination Council (DCC) — chaired by the CIO with representation from every office in which data played an important role. A predecessor to the current Data Governance Board (DGB), the DCC served three primary objectives: 1) Coordinate internal data projects and policies; 2) Coordinate external data acquisition and sharing; and 3) Coordinate analytical resources. Subsequently, it was replaced by other data governance bodies, described below.

As the Bureau has grown, it has refined, documented, and improved its practices in response to changing technological and operational demands. This report is intended to describe the Bureau's current data practices.

The Bureau currently has chartered several internal bodies that work in coordination with each other as part of the data governance program to review the Bureau's data practices, review data intake and reuse requests, make recommendations regarding potential data disclosures to the public, and otherwise govern data. The diagram below illustrates their functions in the data lifecycle:

2.2.1 Data Governance Board

The Bureau chartered the DGB in 2014 to replace the DCC as an advisory body on data policy. It advises the CDO on decisions regarding data intake, management, disclosure, and disposition, in accordance with Bureau policies. It also advises on creating and revising Bureau policies and procedures related to data. It is chaired by the CDO, and its members are senior staff, with cross-Bureau representatives including representatives from the CDO Data Policy team, Privacy team, Cybersecurity team, Office of Consumer Response, CEE, SEFL, RMR, Legal Division, External Affairs Division, Office of the Director, and the Operations Division Front Office.

The DGB reviews data governance standards and directives and other Bureau data governance bodies' charters. The DGB ensures that the policies stay current with the Bureau's data needs and the rapidly evolving data security environment.

The Bureau reviews data governance policies at least every five to seven years and updates those policies as needed. For example, the Bureau is redrafting the sensitivity-leveling standard in 2018 to be more user-friendly and to align with the new government-wide guidance on Controlled Unclassified Information pursuant to Executive Order 13556.

2.2.2 Data Intake Group

The Policy on Information Governance directs that the decision to acquire information be governed by eight guiding principles. The Bureau should: (1) ensure proper authority; (2) adhere to applicable law; (3) demonstrate due diligence; (4) avoid undue burden; (5) validate the reasonableness of an intake; (6) avoid redundancy; (7) align intakes with Bureau goals and objectives; and (8) standardize intakes where reasonably possible.

The CIO formed the Bureau's Data Intake Group (DIG) in October 2012 to coordinate and review potential data intakes in accordance with the Policy on Information Governance, and to advise the CIO on related issues. The CIO signed a written charter for the DIG in 2015. The DIG is an operational committee coordinated by a representative of the Data Policy team, known as the DIG Coordinator, and is composed of staff-level technical experts representing the Bureau's Cybersecurity Office, Paperwork Reduction Act (PRA) Office, Records Office, Freedom of Information Act Office, Privacy Office, and the Legal Division. For data intake requests that require the CDO's approval, the DIG members provide recommendations before the CDO approves or denies the request.

The CIO has determined that certain public data and certain low-sensitivity non-public data, such as subscription-based website content, do not warrant review and approval by the CIO or his designee. Accordingly, the CIO has established exceptions to the general rule requiring approval before any data are stored on a Bureau computer or network. These exceptions are included in Appendix A.

Additionally, where routine data intakes are a normal part of the Bureau's work, the CDO has granted the relevant operational office or division delegated authority for these intakes within the Bureau's governance framework. For example, the CDO delegated data governance responsibilities for consumer complaints to Consumer Response and delegated data governance responsibilities for data specific to supervisory and enforcement activities to Supervision and Enforcement, respectively. Each such office submits an annual report to the CDO and the DGB that describes the major data-related decisions and activities completed under the delegated authority. The CDO and DGB review the annual reports and information provided by each delegated authority to determine whether that delegation should be continued as drafted, amended, or rescinded.

The Bureau generally obtains data, consistent with its authorities, from five main sources: public sources, government agencies, commercial vendors, financial institutions (FIs), and consumers. As governed by the Policy on Information Governance, DIG charter, and DIG procedures — except as noted above — the DIG generally reviews proposed intakes of data for intended use, intended access, and compliance with applicable law and regulations and Bureau policies, including privacy and data security requirements.

The Bureau obtains many data assets for a single purpose involving a single office or division, such as a usability study pertaining to the Bureau's website or a supervisory examination of an FI. Data intakes can also be intended for multiple uses, such as Call Reports, noted in the Core Data Assets section, or for one purpose that involves the participation of several Bureau offices, such as research pertaining to financial education. The Bureau is in the process of centralizing through the DIG the process of reviewing requests by one part of the Bureau to use information that was brought in for a different purpose (referred to here as "reuse"). The preexisting policy and practice regarding reuse are discussed in the Data Reuse section of this report.

As detailed in the Policy on Information Governance, specific rules govern how the Bureau shares information internally across business areas based on the sensitivity level of the information and the authority under which the information was received. Before the Bureau adopted this policy in 2014, individual offices or divisions within the Bureau developed protocols for sharing data with other offices or divisions. See Section below, Data collected for supervision or enforcement used for research, monitoring, assessments or rulemaking.

The Bureau's Records Management Office has primary responsibility for developing and implementing the Bureau's records management program, including records retention schedules, in accordance with applicable laws, as described in the Information Governance Policy. The Records Management Office also defines the rules for identifying, classifying, and scheduling official records. The DIG includes a member of the Records Office to advise on retention (archival and disposal) requirements for records and non-records at the time of information intake or creation. Annually, the Records Officer notifies the DIG of data assets eligible for potential removal and destruction. The DIG evaluates each eligible data asset and recommends to the CDO whether the data should be removed or, based on a written justification, the retention period extended for one year. Once the CDO makes a determination, the data catalog is updated to reflect the new retention period. Where information is governed by a legal agreement, that agreement may impose additional disposition requirements.
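The annual retention review described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DataAsset class, its retention_end field, and both function names are hypothetical and do not describe the Bureau's actual data catalog or systems. The sketch captures two rules from the text: only assets whose retention period has ended are put forward for review, and an extension is for one year and requires a written justification.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DataAsset:
    """Hypothetical catalog entry; field names are assumptions."""
    name: str
    retention_end: date


def eligible_for_review(catalog: List[DataAsset], today: date) -> List[DataAsset]:
    """Assets whose retention period has ended as of `today`."""
    return [asset for asset in catalog if asset.retention_end <= today]


def apply_determination(asset: DataAsset, remove: bool,
                        justification: Optional[str] = None) -> Optional[DataAsset]:
    """Record a determination on one eligible asset.

    Returns None when the asset is removed. Otherwise returns the asset
    with its retention period extended by one year, which requires a
    written justification.
    """
    if remove:
        return None
    if not justification:
        raise ValueError("an extension requires a written justification")
    end = asset.retention_end
    try:
        new_end = end.replace(year=end.year + 1)
    except ValueError:
        # February 29 has no counterpart in the following year.
        new_end = end.replace(year=end.year + 1, day=28)
    return DataAsset(asset.name, new_end)
```

In this sketch the catalog update is modeled by returning a new entry rather than modifying one in place, so the previous retention date remains available for audit purposes.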

2.2.3 Data Release Group

The CDO chartered the Bureau's Data Release Group (DRG) in 2017 as an operational body composed of staff-level technical experts. Members include representatives from the CDO's Data Policy team, Office of Research, Privacy team, Legal Division, and the Division of External Affairs (EA). If an office or division seeks to release a Bureau data asset, the DRG works to review and refine the proposed release. The DRG has recommended the CDO approve five proposed data disclosures since 2016.

2.3 Data security and privacy

The Bureau's Privacy Policy establishes the Bureau's privacy principles, including safeguarding the data it acquires, acquiring only the information it needs to execute each task, and minimizing intake of direct personal identifiers or personally identifiable information (PII). Where direct personal identifiers are necessary to perform Bureau work, such as responding to and monitoring consumer complaints, conducting enforcement investigations and litigation, or conducting supervisory activities, the Privacy Policy provides that access to the data assets that contain identifiers is to be limited to staff for whom access is relevant to their assigned duties.

The Bureau issues publicly available System of Records Notices (SORNs) pursuant to the Privacy Act of 1974. The SORNs describe personal information that the Bureau receives, why the Bureau receives such information, how the Bureau uses the information, and why and how the information may be shared. As shown in the table below, the Bureau has issued 26 SORNs, two of which have been rescinded. Although individual consumers are not identified in most Bureau data, where applicable, individuals can submit requests to the Bureau under the Privacy Act to access, correct, or amend information that the Bureau may have about them.

The Policy on Information Governance sets forth the principles governing who may be granted access to what data, based on the sensitivity level of the data and the user's assigned duties. The Bureau manages access to data at the level of each individual data asset for all network users, including contractors. In addition, all users are subject to the same training requirements and background checks. The Bureau grants access to information consistent with the information's sensitivity level (as outlined in the Bureau's Information Sensitivity Leveling Standard), the authority under which the Bureau collected the information, the Bureau's information sharing standards, cybersecurity policies and procedures, and applicable law or contractual obligations. The Bureau's Office of Cybersecurity uses the NIST Risk Management Framework to prioritize data according to its sensitivity. The same office continuously monitors systems for indications of a potential system compromise, and routinely identifies and blocks a number of potential exploit attempts. To date, the Bureau is not aware of any attacks from outsiders that resulted in third parties gaining access to non-public data without appropriate authorization. The Bureau also has not experienced a "major incident" as that term is defined by OMB and FISMA.
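The per-asset access rule described at the start of this section can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the three sensitivity labels, the User and Asset classes, and the may_access function are hypothetical and are not taken from the Bureau's Information Sensitivity Leveling Standard or its systems. The sketch shows only the general shape of the rule, in which access depends on current training, on whether the asset is relevant to the user's assigned duties, and on the asset's sensitivity level.

```python
from dataclasses import dataclass

# Hypothetical sensitivity labels, ordered from least to most sensitive.
LEVELS = ("low", "moderate", "high")


@dataclass(frozen=True)
class Asset:
    name: str
    level: str  # one of LEVELS


@dataclass(frozen=True)
class User:
    name: str
    max_level: str           # highest level this user may see (assumption)
    duty_assets: frozenset   # asset names relevant to assigned duties
    training_current: bool   # required training completed


def may_access(user: User, asset: Asset) -> bool:
    """Grant access only if every condition holds for this specific asset."""
    if not user.training_current:
        return False
    if asset.name not in user.duty_assets:
        return False
    return LEVELS.index(asset.level) <= LEVELS.index(user.max_level)
```

Because the check is made per asset, a user cleared for highly sensitive data still has no access to an asset that is unrelated to his or her assigned duties.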

OMB defines "breach" broadly to include instances in which a person other than an authorized user accesses or potentially accesses PII. The Bureau has experienced 371 such breaches through June 2018. More than half of the Bureau's discrete breaches of PII occurred in connection with the Bureau's consumer response function, through which the Bureau has handled more than 1.5 million complaints. Those confirmed breaches generally occurred in one of three ways:

  1. The Bureau fails to follow internal processes and provides an update to a consumer about his or her complaint prior to receiving three pieces of information that would validate the consumer's identity;
  2. The Bureau attaches an incorrect document to a consumer's complaint; or
  3. The Bureau sends an unencrypted email to the wrong consumer.

The breaches that occurred outside of the consumer response function typically were instances when a Bureau employee sent an email including PII to the wrong individual, either inside or outside the Bureau.

Almost all breaches (approximately 90 percent) involved one or more of the following data elements: first name, last name, email address, phone number, or account number. For almost all of these breaches, the number of individuals potentially affected by each breach was most likely one. This means that each of those breaches involved separate pieces of information and that no breach involved multiple data lapses.

It is also important to note that, as stated above, the Bureau uses the broad definition of PII promulgated by OMB when referring to confirmed breaches (meaning that PII encompasses any information that can be used to distinguish or trace an individual's identity, either alone or when combined with other information that is linked or linkable to a specific individual).

2.4 External auditing

The Bureau's data governance, privacy programs, and information security have been subject to a number of recent audits or other third-party analyses that are relevant to this report.

2.4.1 U.S. Government Accountability Office

The U.S. Government Accountability Office (GAO) audited the Bureau's information and data practices and published a report on its review on September 22, 2014. The GAO audit stated that the "CFPB has taken steps to protect the privacy of consumers and comply with requirements, restrictions, and recommended practices in the Dodd-Frank Act, [PRA], Privacy Act, E-Government Act, and NIST guidelines." The GAO's report contained 11 recommendations for the Bureau, focused primarily on formalizing and documenting existing privacy and security practices. The GAO "closed" each recommendation by April 17, 2017, meaning that the GAO determined that the Bureau took actions that satisfy the intent of the recommendation, as described below and reflected on the GAO's website. To help ensure consistent implementation of the Bureau's current processes and practices, the GAO recommended that the Director of the Bureau should "…establish or enhance written procedures including..."

  1. The data intake process, including reviews of proposed data collections for compliance with applicable legal requirements and restrictions and documentation requirements about PRA applicability and OMB review under the PRA;
  2. Anonymizing data, including how staff should assess data sensitivity, which steps to take to anonymize data fields, and responsibilities for reviews of anonymized data collections;
  3. Assessing and managing privacy risks, including documentation requirements to support statements about potential privacy risks in PIAs and for determinations that PIAs are not required;
  4. Monitoring and auditing privacy controls; and
  5. Documenting information security risk-assessment results consistently and comprehensively to include all NIST-recommended elements.

These recommendations were resolved by documenting the existing processes and practices in written policies and procedures.

To enhance the protection of collected consumer financial data, the GAO also recommended that the Director of the Bureau should fully implement the following five privacy and security steps:

  1. Develop a comprehensive written privacy plan that brings together the existing privacy policies and guidance;
  2. Obtain periodic reviews of the privacy program's practices as part of the independent audit of Bureau's operations and budget;
  3. Develop, implement, and provide role-based privacy training;
  4. Update remedial plans for the information system that maintains consumer financial data and related components to include all identified weaknesses and realistic scheduled completion dates that reflect current priorities and available resources; and
  5. Include an evaluation of the plans related to priorities and resources. Evaluate compliance with contract provisions relating to information security in the Bureau's review of the service provider that processes consumer financial data for the Bureau.

These recommendations were resolved by:

  1. Creating a comprehensive written privacy plan;
  2. Including the privacy program in the independent audit (the results of the 2017 audit are summarized below);
  3. Conducting role-based privacy and cybersecurity training annually;
  4. Reviewing and enhancing internal documentation related to the plan of actions and milestones program; and
  5. Analyzing and updating the internal risk management process and updating the procurement language.

Finally, the GAO recommended that, to provide greater assurance of compliance with PRA, the Director of the Bureau should also consult further with the OMB about whether PRA requirements apply to its credit card data collection and information-sharing agreement with the OCC, and document the result of this consultation.

This recommendation was resolved by re-confirming with OMB that the credit card collection was compliant with the PRA and by documenting the conversation and approval.

2.4.2 Office of the Inspector General

The Bureau's Office of the Inspector General (OIG) reviews the Bureau's privacy program and information security on an annual basis. In its September 27, 2017, memorandum to the Bureau Director, the OIG noted:

  • Information security continues to be a key risk in the federal government, and as is the case for most federal agencies, the CFPB faces challenges due to the advanced persistent threat to information technology (IT) infrastructures. Although the CFPB has assumed responsibility for its IT infrastructure (the U.S. Department of the Treasury was previously responsible for the CFPB's IT infrastructure) and continues to mature its information security program, the agency faces challenges in fully implementing its information security continuous monitoring program. Specifically, the CFPB should implement a data loss prevention program and ensure that automated feeds from all systems, including contractor-operated systems, feed into the CFPB's security information event management tool.
  • The CFPB has taken several steps to develop and implement an information security continuous monitoring program that is generally consistent with federal requirements. For example, the CFPB has implemented a centralized logging information tool for CFPB systems. CFPB management continues to face challenges, however, associated with maturing its information security continuous monitoring program across the agency; such challenges include establishing alerting capabilities and continuous monitoring metrics and further automating tools for several of its manual information security continuous monitoring processes.

The Bureau is in the process of enhancing the continuous monitoring program, consistent with OIG's recommendations.

In its latest Independent Audit of the Consumer Financial Protection Bureau's Privacy Program in February 2018, the OIG stated:

  • Overall, we found that the CFPB has substantially developed, documented, and implemented a privacy program that addresses applicable federal privacy requirements and security risks related to collecting, processing, handling, storing, and disseminating sensitive privacy data. Further, we noted that the CFPB has documented privacy policies and procedures covering a wide range of topics, including privacy roles and responsibilities, privacy impact assessment (PIA) and system of records notice (SORN) management, training, breach notification and response, and monitoring and auditing.
  • Although the CFPB has substantially developed, documented, and implemented a privacy program with related policies and procedures, we identified two areas that require improvement: identification and maintenance of a comprehensive inventory of PII and physical controls over the CFPB's portable media.

The Bureau has accepted the OIG's findings. To resolve these recommendations, the Bureau agreed to include administrative data systems (such as the Bureau's own financial data) in the PII inventory (which currently contains public, vendor, other agency, FI, and consumer data); provide locks for laptops stored in the secure building after work hours; and monitor the locking of laptops to desks within the secure building.

2.4.3 2018 White Hat Hacking Exercise

In January of 2018, the Bureau signed an Interagency Agreement with the Department of Defense to leverage Risk and Vulnerability Assessment (RVA) services as a mechanism to identify potential gaps in cybersecurity controls. This "white-hat hacking" effort is the same service the Department of Homeland Security (DHS) provides to other federal agencies to assess technical vulnerabilities beyond those identified in their Cyber Hygiene program, in which the Bureau also participates. RVA testing was completed in the spring of 2018 with no "Critical" findings identified by the assessors, and three technical recommendations. The Bureau has completed remediation of all three recommendations made by the assessors. The review concluded that overall the Bureau's security posture is well-organized and maintained.

3. Sources of data

The Bureau has five main external sources of data: public sources, government agencies, commercial vendors, FIs, and consumers. The Policy on Information Governance directs the Bureau to, wherever reasonably possible, avoid requesting or receiving duplicative information (whether from the same or different sources). Therefore, the first approach to acquiring data involves identifying whether the data the Bureau needs, or a reasonable approximation thereof, are already available within the Bureau, in the public domain, or — to the Bureau's knowledge — from another agency. If so, the Bureau explores using those data rather than creating a new collection. When existing sources are not sufficient, the Bureau seeks to collect the data from an external source.
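The order of preference described above amounts to a simple lookup, sketched here in Python. The source labels and the choose_source function are hypothetical names introduced for illustration and do not describe any Bureau system. The sketch assumes that someone has already determined where the needed data, or a reasonable approximation of them, can be found.

```python
# Existing sources checked in the order the text gives them (labels are assumptions).
EXISTING_SOURCES = ("within_bureau", "public_domain", "other_agency")


def choose_source(available_in: set) -> str:
    """Return the first existing source that holds the needed data.

    `available_in` is the set of existing sources known to hold the data
    or a reasonable approximation. A new external collection is the
    fallback only when no existing source is sufficient.
    """
    for source in EXISTING_SOURCES:
        if source in available_in:
            return source
    return "new_external_collection"
```

For example, data that are available both publicly and from another agency would be taken from the public domain in this sketch, and an empty set leads to a new collection.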

3.1 Obtaining public data

When available, the Bureau uses public data. Data may have been collected and made public by a government agency or by a private party such as a researcher. Appendix B lists 32 public data assets the Bureau obtained or downloaded, three additional data assets that combine public data with non-public data obtained from another agency, and one additional data asset that combines public data with data procured from a commercial vendor. This is a best effort to provide a comprehensive listing of public data assets the Bureau has obtained, although this listing may be incomplete because (as previously discussed) Bureau policy permits employees to obtain certain public data without approval from the CDO. Therefore, for example, this listing does not include instances in which a Bureau employee may have downloaded or otherwise obtained a copy of a public report, including statistical appendices to such a report. Approval is required, however, for public microdata (consumer-level or account-level data) and any large data assets intended for Bureau-wide use, and those data assets are included in Appendix B.

Many of the Bureau's public data assets are data assets released by government agencies, including the Census Bureau's decennial census, the American Community Survey, American Housing Survey, Current Population Survey, Quarterly Census of Employment and Wages, and others. Public data from other sources include consumer finance and housing data published on the Board of Governors of the Federal Reserve System's public website, and records from public websites of state and municipal agencies.

3.2 Data sharing from and with other government agencies and regulators

In some cases, the Bureau may be aware that other government agencies collect data that meet the Bureau's analytical, supervisory, enforcement, research, complaint handling or other needs. In these cases, the Bureau works with the other agencies to obtain the data. This method can help to reduce burden on industry and minimize overall government costs, while generally providing high quality data. In some cases, such as data collected under the Home Mortgage Disclosure Act (HMDA), some of the data are public while a subset of the data is restricted.

Appendix B lists 20 non-public data assets that the Bureau has obtained from other government agencies, most of which are financial regulators; three data assets that combine public data with non-public data obtained from another agency, as noted above; and three data assets that combine data obtained from another agency with either data obtained from a vendor or data obtained from financial institutions. This listing does not include instances in which the Bureau may have obtained a non-public report or document from another agency that was of low sensitivity (as defined in the Bureau's policy) because (as noted above) the Bureau does not centrally track instances in which employees obtain reports. This is a best effort to provide a comprehensive listing of data assets obtained by the Bureau from other agencies. However, the list may be incomplete because the determination of whether particular information is of low sensitivity as defined in the Bureau's policy requires the exercise of judgment by the individual employees who obtain such material. A further limitation in providing a comprehensive list is that some data intakes, including data intakes by Supervision and Enforcement, are not centrally recorded but are managed under a delegated authority pursuant to the Information Governance Policy, as explained in the Data Governance at the Bureau section of the report. Finally, especially during the early years of the Bureau's existence, some data assets may have been obtained from a government agency and not recorded.

For three data assets – specifically, Call Reports collected by the Federal Financial Institutions Examination Council (FFIEC) and by the National Credit Union Administration (NCUA), and data collected by the FFIEC pursuant to the Home Mortgage Disclosure Act – a subset of the data collected by an agency is public while other parts of the data collection are non-public and have been provided to the Bureau. In these instances, these data assets are listed as both public and agency data assets in Appendix B.

The Bureau's government data assets consist largely of nonbank, housing, credit card, consumer complaint, and education data.

Data-sharing agreements, often referred to as Memoranda of Understanding (MOUs), are used between agencies to ensure the secure handling and use of data. A list of the Bureau's data-sharing MOUs is available in Appendix C. The Bureau intends to supplement this report with the text of its MOUs subject to obtaining the necessary approval of affected state and federal agencies.

An Interagency Access Request (IAR) is a request from another agency for confidential information collected by the Bureau (e.g., confidential investigative information (CII) or confidential supervisory information (CSI), typically collected by the Bureau during the course of an investigation or an examination and requested by another governmental agency as part of investigative or enforcement activity by that agency). The IAR process implements the Bureau's regulations on sharing the Bureau's confidential information with federal and state agencies. See 12 C.F.R. § 1070.43(b). Between Q3 of FY 2012 and Q3 of FY 2018, the Bureau processed 501 requests from 117 state, local, and federal agencies.

In addition, agencies that sign data-sharing agreements with Consumer Response can access consumer complaint data via the Bureau's Government Portal. As of July 2018, 78 state agencies, nine federal agencies, one local agency, and one organization acting on behalf of state regulators had access to the Government Portal. Also, Congressional offices that sign data-sharing agreements with Consumer Response can access information about the consumer complaints they submit on behalf of their constituents via a secure Congressional Portal. Each office has access only to those complaints for which it also submits signed privacy waivers. As of August 2018, 149 Congressional offices had access to the Congressional Portal.

The Bureau refers consumer complaints that are outside of the Bureau's statutory authority to the prudential regulator responsible for the type of complaint.

The Bureau shares the restricted (i.e., non-public) portion of HMDA data with the Federal Financial Institutions Examination Council agencies, the U.S. Department of Housing and Urban Development, and the Federal Housing Finance Agency (FHFA).

3.3 Purchasing data from commercial vendors

If the data needed are not available within the Bureau and cannot be readily obtained from another government agency, they can sometimes be obtained through a commercial vendor. The Bureau's Policy on Information Governance directs that the Bureau should seek to ensure that it does not place unnecessary burdens (technical, financial, etc.) on external parties in the course of requesting or receiving information. Therefore, if the Bureau is aware of commercially available data that potentially can meet the Bureau's needs in a timely and cost-effective manner, the Bureau makes reasonable efforts to purchase the data. This approach can also help to reduce industry burden and is generally more cost-effective than conducting a new survey or collection.

Appendix B lists 31 data assets which the Bureau has purchased from vendors, two additional data assets which combine data collected from a financial institution with data purchased from a commercial vendor, and one data asset which combines data obtained from other agencies with purchased data. This listing does not include instances in which the Bureau may have purchased a research report or other low-sensitivity information from a third party, as such purchases are not centrally tracked by the Bureau's data team. This is a best effort to provide a comprehensive listing of purchased data assets. However, as noted above, the listing may be incomplete because the determination of whether particular information is of low sensitivity as defined in the Bureau's policy requires the exercise of judgment by individual employees. Further, especially during the early years of the Bureau's existence, some small data assets may have been purchased from a third party and not centrally recorded with the Bureau's data team.

Purchased data consist of: data aggregated at a geographic or industry level; data aggregated at the level of individual financial institutions; or de-identified (whenever appropriate) data at the account or consumer level. Some purchased data are off-the-shelf products that the vendor sells to any willing purchaser. Other procured data may be customized products compiled from data that the vendor has collected in the normal course of its business from financial institutions, consumers, or public records.

On several occasions, the Bureau or a Bureau vendor arranged to have data obtained from a vendor (specifically one of the three nationwide consumer reporting agencies) appended to data obtained from one or more financial institutions. Those instances are included in Appendix B. In all but one instance, the financial institution provided the nationwide consumer reporting agency with information that enabled the agency to identify the individuals whose (de-identified) records had been provided to the Bureau or a Bureau vendor, as well as a match key. This enabled the consumer reporting agency to provide a file containing the relevant (de-identified) credit records and the match key to the Bureau so it could append the credit data to the financial institution data. In this way, the financial institution's data remained with the Bureau or its vendor without direct identifiers. In one instance, the vendor that the Bureau used to collect the data on its behalf already had a de-identified set of credit records from a nationwide consumer reporting agency. The financial institutions provided the vendor with the match keys that they used in furnishing data to the consumer reporting agency, which enabled the vendor to append the credit records without obtaining any direct identifiers.
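The match-key approach described above can be illustrated with a minimal sketch. The code below is not the Bureau's or any vendor's actual procedure; the function name, the field names (match_key, balance, credit_score), and the sample values are all hypothetical and chosen only for illustration. It assumes that both files have already been de-identified and that each carries the same opaque match key.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def append_credit_data(
    fi_records: List[Record],
    credit_records: List[Record],
    key: str = "match_key",  # hypothetical field name for the opaque match key
) -> List[Record]:
    """Append de-identified credit fields to de-identified FI records.

    Records are joined only on the opaque match key, so neither file
    needs to carry direct identifiers such as names or account numbers.
    """
    # Index the credit file by match key, dropping the key from the payload.
    credit_by_key = {
        rec[key]: {k: v for k, v in rec.items() if k != key}
        for rec in credit_records
    }

    merged = []
    for rec in fi_records:
        combined = dict(rec)  # copy so the input records are not modified
        # FI records with no matching credit record are kept unchanged.
        combined.update(credit_by_key.get(rec[key], {}))
        merged.append(combined)
    return merged


# Hypothetical sample data: no direct identifiers appear in either file.
fi_file = [
    {"match_key": "k1", "balance": 100},
    {"match_key": "k2", "balance": 250},
]
credit_file = [{"match_key": "k1", "credit_score": 700}]

linked = append_credit_data(fi_file, credit_file)
```

In this sketch the party performing the append sees only the match key and the de-identified fields, which mirrors the point made above that the financial institution's data can be enriched with credit data without direct identifiers being exchanged.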

The Bureau also contracts with vendors from time to time to assist the Bureau in collecting data, or to collect data on behalf of the Bureau from FIs. These data collections are listed in Appendix B as data obtained from FIs and are discussed in the section of this report addressed to data collected from FIs. In two of these data collections, the vendor assisted in a mandatory data collection. As discussed in the section of this report on Core Data Assets, from 2012 through 2016 the Bureau collected credit card data pursuant to a supervisory request and contracted with a vendor who already collected credit card data from FIs. More recently, the Bureau has used a contractor to assist with collecting account-level data from a number of mortgage servicers pursuant to an order requiring the production of such data in connection with an assessment of a mortgage servicing rule.

The Bureau likewise has used vendors from time to time to assist the Bureau in collecting data from consumers or to collect data from consumers on behalf of the Bureau. This information is collected through surveys, focus groups, or one-on-one interviews. The contractor may provide the raw data for the Bureau to analyze – de-identified where appropriate and requested by the Bureau – or the contractor may analyze the data and provide a summary report. Subject to the limitation previously noted, these data collections are listed in Appendix B as data obtained from consumers, and they are discussed in the section of this report that addresses data collected from consumers.

Some institutions, such as certain consumer reporting agencies, are both vendors of data and also financial institutions in their own right. To the extent that the Bureau obtains data from such an entity as an FI subject to the Bureau's regulatory authority, this report treats such information as coming from an FI.

3.4 Collecting data from financial institutions

The Dodd-Frank Act expressly authorizes the Bureau to require FIs to provide data to the Bureau in response to consumer complaints; supervisory requests; civil investigative demands made in the context of enforcement investigations; and orders seeking information for monitoring market developments and risks to consumers, or for conducting assessments of significant rules issued by the Bureau. In addition, the Bureau collects data pursuant to four federal laws requiring FIs to provide certain specified data to the Bureau: the Home Mortgage Disclosure Act (HMDA) (application-level mortgage data); the Credit Card Accountability Responsibility and Disclosure Act (CARD Act) (consumer credit card agreements and college credit card marketing agreements); the Truth in Lending Act (TILA) (terms of credit card plans and related information); and the Interstate Land Sales Full Disclosure Act (ILSA) (land sales agreements). The Equal Credit Opportunity Act, as amended by the Dodd-Frank Act, also requires financial institutions to provide certain specified data to the Bureau with respect to applications for loans by small businesses and minority-owned and women-owned businesses, but that requirement will not become effective until the Bureau issues an implementing regulation. Regulation E likewise has a mandatory disclosure requirement with respect to prepaid card agreements that has not yet taken effect.

In addition, the Dodd-Frank Act expressly authorizes the Bureau to obtain information voluntarily from FIs.

Appendix B lists 58 data assets that the Bureau has obtained from FIs, two data assets that combine data from an FI with data from a commercial vendor, and two data assets that combine data from FIs with data from consumers. In addition, Appendix B includes each of the following as a single data asset: all data collected by Supervision, which includes data from financial institutions and other sources; all data collected by Enforcement, which also contains data collected from financial institutions as well as from other sources; and all data collected by Consumer Response, which likewise includes data from FIs and from other sources. This listing does not include low-sensitivity information such as a presentation or handout provided to the Bureau voluntarily by FIs in the course of meetings, conversations, or other communications in the normal course of business. The listing likewise does not include information provided by FIs voluntarily in response to Requests for Information, an Advance Notice of Proposed Rulemaking, or a Notice of Proposed Rulemaking.

This is a best effort to provide a comprehensive listing of data assets obtained from FIs. The Bureau is confident that all mandatory data collections are included in Appendix B. However, the listing may be incomplete because, as previously noted, the determination of whether particular information is of low sensitivity as defined in the Bureau's policy requires the exercise of judgment by individual employees. Further, especially during the early years of the Bureau's existence, some small data assets may have been obtained from an FI and not centrally recorded.

In some instances, financial institutions have collected data from consumers and provided that data to the Bureau. These are treated as data provided by an FI for purposes of Appendix B.

3.5 Collecting data from consumers

The Bureau obtains data directly from consumers through its Consumer Response function; through interviews with potential witnesses conducted by Enforcement; and through research conducted by RMR and CEE. Research can take the form of a survey (conducted by phone or mail, in-person, or online), focus groups, one-on-one interviews, or laboratory or user tests. The Bureau also collects general feedback from consumers via the Bureau's "Tell Your Story" webpage. When the Bureau distributes money to victims of unlawful conduct through its Civil Penalty Fund, the Bureau may obtain information from individuals claiming to be or believed to be victims, to enable the Bureau to validate their entitlement to relief. Such information may include a claims form completed by the consumer or other documentation regarding the consumer's relationship with a particular FI.

Appendix B lists 31 data assets collected from consumers, two data assets which include data collected from consumers and data collected from financial institutions, plus the Enforcement and Consumer Response data assets which were discussed previously. Appendix B does not currently include any data assets collected through focus groups, one-on-one interviews, user testing or small-scale, informal surveys; as noted in the Introduction, the Bureau plans to supplement Appendix B to include those data assets. The listing in Appendix B is a comprehensive listing of formal surveys of consumers conducted by the Bureau.

4. Uses of data

As noted above, data informs much of the Bureau's work. The following is a discussion of how the various sources of data are used. Often, data are obtained for only one purpose. The office or division requesting authorization to obtain data identifies how it will use such data when it acquires them. Some data assets, such as website usability results, are not useful for more than one purpose. Other data assets, such as Call Reports (discussed in the Core Data Assets section) or Census data, are brought in as general reference tools for use as the need arises or as an authoritative source for analyzing historical performance, following trends in a particular market, and answering questions that may arise from time to time. Similarly, more than one Bureau office may participate in a single data collection, such as the Financial Well-Being in America Survey conducted by CEE with support from the Office of Research. Data assets foundational to the Bureau's work are described in more detail in the Description of core data assets and their uses section of this report.

Subject to the limitations previously stated, Appendix B describes each of the Bureau's data assets. Where data were obtained for a specific purpose, the description explains that purpose. Where data were obtained as a resource to be used as needed, the description states the nature of the data without specifying a particular, single use. This section elaborates on the Bureau's uses of its data assets.

4.1 Public domain data

The data the Bureau collects from public sources can be placed into a number of categories with different use cases for each category. Some data assets consist of information that was collected from consumers either by government agencies or private researchers and made public. The following data assets fall within this category:

  • United States Census
  • American Housing Survey (Census Dep't)
  • American Community Survey (Census Dep't)
  • Quarterly Census of Employment and Wages (Bureau of Labor Stat.)
  • Current Population Survey (Census Dep't)
  • Survey of Income and Program Participation (Census Dep't)
  • Consumer Expenditure Survey (Bureau of Labor Stat.)
  • Longitudinal Employer-Household Dynamics (Census Dep't)
  • Panel Study of Income Dynamics (University of Michigan)
  • General Social Survey (NORC – University of Chicago)

These data assets are basic research tools that Bureau researchers use when relevant to address research questions of interest. For example, for research into credit invisibles, the Bureau combined Census data with data contained in one of its core data assets to estimate the incidence of consumers without a credit record across different demographic groups. To evaluate the definition of "rural" for purposes of special regulatory provisions available only to creditors operating predominantly in rural areas, the Bureau combined Census data with HMDA data to estimate the effects that alternative definitions of rural would have on the number of creditors that could take advantage of these special provisions. For its payday loan rulemaking, the Bureau combined data from the Consumer Expenditure Survey with data from the Current Population Survey to estimate the percentage of payday borrowers who would be able to satisfy an ability-to-repay requirement.

Two of the Bureau's public data assets – the Survey of Consumer Finances and the Survey of Household Economics and Decisionmaking (SHED) – consist of consumer surveys conducted by the Federal Reserve Board focused specifically on consumers' use of financial products and services. Bureau researchers use these data assets in addressing research questions relating to consumers' decisions regarding, and use of, such products. For example, the Bureau used public data from the SHED in preparing a report on student loans held by older Americans.

Another category of public data assets consists of data another agency collects from financial institutions and makes public. The following data assets fall in this category:

  • Federal Financial Institutions Examination Council (FFIEC) Call Reports
  • National Credit Union Administration Call Reports
  • Home Mortgage Disclosure Act (HMDA) public data
  • FDIC Summary of Deposits

These data assets are also basic research tools used for a variety of research purposes as well as to monitor trends in the markets for consumer financial products and services, define the scope of the Bureau's supervisory and enforcement jurisdiction with respect to depository institutions, and prioritize supervisory examinations. For example, the Bureau's supervision program uses a risk-based approach to prioritize which entities should be examined in a given time period with respect to which product lines, and that analysis may differ depending on estimates as to the size of FIs' businesses. The Bureau arrives at such estimates using public FFIEC and NCUA Call Reports, non-public data from other government agencies, and supervisory data obtained from FIs; the Bureau has also used a data asset culled from public records with respect to auto loans and purchased from a vendor. These data assets are listed in Appendix B.

In addition to these data assets, from time to time the Bureau has obtained more specialized public data assets for use for a particular rulemaking, study, or Bureau program. For these data assets, the description in Appendix B explains the purpose for which the data were obtained.

4.2 Other agency data

Appendix B lists the data assets obtained from other agencies, including data assets that combine public and non-public data obtained from another agency. The Bureau obtained some of those data assets to use on an ongoing basis, as needed, to support Bureau supervision, enforcement, market monitoring, research, and rulemaking functions. This is true, for example, of FFIEC and NCUA Call Reports restricted data, and NMLS Mortgage Call Reports.

Other agency data assets listed in Appendix B were obtained for a specific rulemaking, research report, or other project. Appendix B lists these data assets and the purpose for which the data were obtained.

For example, in developing a definition of "qualified mortgage" under the Dodd-Frank Act, the Bureau obtained an extract from the Historical Loan Performance data asset from the Federal Housing Finance Agency and a separate loan file from the Federal Housing Administration to inform its understanding of the significance of debt-to-income ratios. In conjunction with its work on a rule governing payday and auto title lending, the Bureau obtained data on the number and location of licensed storefront payday lenders from various states in order to estimate the impacts of the proposed rule. For its report to Congress on the use of arbitration agreements in consumer finance contracts, the Bureau obtained data from several states regarding class action settlements and also collected public data from court records regarding class action litigation and individual litigation.

4.3 Commercial vendor data

The Bureau has purchased a wide variety of data assets to meet a wide range of needs.

The Bureau's Consumer Credit Panel, described in the Description of core data assets and their uses section of this report, consists of purchased credit reporting data to which other purchased and public data have been appended. The National Mortgage Database, also a core data asset described in the same section of the report, consists of purchased credit reporting data to which other purchased data and agency data are appended.

A number of the Bureau's purchased data assets are off-the-shelf products that are widely viewed as industry-standard sources for studying historical trends and following current trends in particular markets. The following data assets fall within this category:

  • Black Knight Home Price Index
  • Mortgage Bankers Association National Delinquency Survey
  • Mintel Comperemedia mailout survey data
  • CoreLogic loan-level origination and performance data
  • Blackbox Logic private label mortgage performance data
  • Informa mortgage rates and fees
  • Experian AutoCount (auto loan origination data from motor vehicle records)
  • HSH mortgage rate data
  • Black Knight mortgage loan-level origination and performance data
  • Informa checking account rates and fees
  • Measure One lender-level private student loan performance data
  • Informa credit card rates and fees

The Bureau also has purchased Strategic Business Insights MacroMonitor, a biannual survey covering consumers' use of financial products, for Bureau researchers to use as needed to address research questions that may arise. For example, in planning Bureau research with respect to overdraft, Bureau researchers used the MacroMonitor data to inform their strategy in recruiting participants for the Bureau's research.

Similarly, the Bureau subscribes to Moody's Analytics creditforecast.com and to S&P Global to obtain information as needed regarding financial institutions and credit trends.

In addition to these data assets, the Bureau has purchased a number of data assets to meet specific needs in a particular rulemaking, research project, or other Bureau program. The description in Appendix B explains the purpose for which these data were acquired.

4.4 Financial institution data

When the Bureau obtains data from FIs, the use of the data depends on the means through which and purpose for which the data were obtained. Data that the Bureau obtains from FIs (including data obtained for market monitoring, research, and financial education as well as for supervisory and enforcement purposes) are usually "confidential information" pursuant to 12 C.F.R. part 1070. Such information includes "confidential supervisory information" (CSI) and "confidential investigative information" (CII), and it is protected in accordance with Bureau regulation.

Supervision collects data from FIs to: (1) assess compliance with federal consumer financial law; (2) obtain information about the activities and compliance systems or procedures of the FI; and (3) detect and assess risks to consumers and to markets for consumer financial products and services. Similar to other federal and state bank supervisors, Bureau examiners collect and/or review policies and procedures, written responses to examiner questions, advertising materials, template form letters, individual loan files, and data assets related to supervised activities. Examiners review these data to check for compliance with federal consumer financial law and to determine trends or prevalence of various company practices. These data are also used in the Bureau's exam prioritization process.

After an enforcement matter is opened, Enforcement uses data obtained from FIs (among other sources) in investigating and, if necessary, litigating the matter. It may obtain such information during the investigatory stage through CIDs and via voluntary presentations or disclosures by FIs, or after the start of litigation through the discovery process. In the course of investigations, Enforcement investigators typically collect some combination of written answers to interrogatories, written reports, documents, and testimony.

The Office of Research and the various Markets Offices within RMR use data obtained from FIs for research and market monitoring, including research and monitoring that informs rulemakings and other policymaking. These data may be submitted either voluntarily or in response to orders issued by the Bureau. The following are general descriptions of data collections for research, market monitoring, and rulemaking. See Appendix B for more details.

RMR conducts surveys of financial institutions to provide evidence for the Bureau's assessments of significant rules. The Bureau also uses surveys to better estimate the costs of rules under consideration. Appendix B lists all the data assets collected from FIs for these purposes, subject to the limitations previously noted. Several illustrative examples are discussed below. Except where noted, all of these data collections were voluntary.

Before deciding whether to propose an extension of an exemption under its remittance rule, the Bureau conducted a survey of depository institutions to understand the extent to which these institutions were using the exemption. Before issuing an Outline of Proposals Under Consideration with respect to debt collection, the Bureau conducted a survey of debt collectors to better understand their operations and costs.

RMR also obtains data from FIs to prepare research reports and to support the Bureau's policymaking. Appendix B lists all the data assets collected from FIs for these purposes, subject to the limitations previously noted. Several illustrative examples are discussed below.

To understand how the consumer financial markets, consumers, financial entities, or other economic factors might have changed as the result of Bureau rules on the manufactured housing market, the Bureau obtained application-level data from creditors engaged in manufactured housing lending. It used these data to prepare a white paper on this subject.

In connection with a report that the Bureau was required to submit to Congress on the private student loan market, a number of FIs provided the Bureau de-identified account-level data through a third-party analytics firm the FIs selected and with which the FIs contracted. The firm also de-identified the individual lenders. For another required report, a remittance transfer provider provided de-identified consumer-level transactional data to the Bureau and provided to one of the nationwide consumer reporting agencies information which enabled the agency to provide the Bureau with credit reporting data and a match key the Bureau could use to append the credit data to the transactional data. This enabled Bureau researchers to assess whether the reporting of remittance transfers could potentially broaden access to credit. For a third required report, the three nationwide credit reporting agencies each provided de-identified consumer-level data containing a number of different credit scores to enable Bureau researchers to study differences in scores sold to consumers and to creditors.

To help the Bureau evaluate how consumers and consumer financial markets might change as the result of potential regulation of certain short-term loan products, several FIs provided the Bureau with aggregate-level data on overdraft usage following the elimination of deposit advance products. The Bureau used these data for a report it issued in conjunction with its payday proposal.

In addition to these data collections initiated by RMR, in response to requests by the Private Student Loan Ombudsman several servicers provided summary information about business practices related to student borrowers' use of income driven repayment options and student loan performance. The Ombudsman analyzed these data in a special report.

As part of its program to engage with financial service innovators and promote consumer-friendly innovation, the Bureau from time to time has entered into research partnerships in which FIs shared de-identified data enabling the Bureau to study the effects of particular innovations. Appendix B lists all the data assets collected from FIs for these purposes, subject to the limitations previously noted. Several illustrative examples are discussed below.

One FI worked with the Bureau to conduct a randomized controlled trial of alternative methods of stimulating savings by prepaid card customers. The FI shared the test data, including de-identified data from surveying consumers, with the Bureau for analysis. The Bureau currently is analyzing de-identified data supplied by another FI from a test of alternative methods to stimulate savings in connection with tax refunds.

The Bureau has also required FIs to provide various types of data, as discussed in Appendix B. Appendix B lists all the data assets collected from FIs for these purposes, subject to the limitations previously noted. Several illustrative examples are discussed below:

  • For its arbitration report, the Bureau required a number of FIs to provide samples of their standard customer agreements to enable the Bureau to study the prevalence and terms of arbitration agreements.
  • In connection with the mandated 2015 and 2017 credit card industry reports, the Bureau required a number of credit card issuers to provide aggregate-level data regarding various metrics such as application and approval rates. The Bureau also required those issuers to provide additional information (including policies, sample marketing materials, sample disclosures) it used for deeper analyses conducted as part of these reports. The Bureau required several issuers that specialized in offering subprime credit cards to provide aggregate-level data as well.
  • In connection with its consideration of potential rulemaking relating to overdraft services, the Bureau required a number of core processors to produce, on an anonymized basis, certain information regarding overdraft policies and outcomes for the depository institutions supported by these processors to understand the market.
  • To understand developments with respect to person-to-person payments, the Bureau required a service provider to provide certain information to the Bureau regarding its offering.

4.5 Consumer data

Consumer Response handles complaints and inquiries about financial products and services submitted by consumers about companies. Consumer Response conducts analyses of consumer complaint data with respect to FIs, financial products, and issues described in the complaint, and shares such analysis within the Bureau. The Bureau uses this analysis for supervisory purposes; for investigative or other enforcement purposes; for rulemaking and market monitoring purposes; and to inform work to meet the financial education needs of the public and needs or challenges faced by certain special populations identified in the Dodd-Frank Act, such as students, servicemembers, and older Americans.

The Bureau also collects data from consumers and a variety of professionals and entities who serve them (such as financial educators, credit counselors, social workers) through focus groups, interviews, user testing, and informal surveys. As previously explained, except with respect to data collections that informed disclosure testing, Appendix B does not currently include these data collections. The Bureau intends to supplement Appendix B to do so. The discussion below summarizes the purposes for which data is collected from consumers with some illustrative examples.

The Bureau has collected data from consumers in connection with a number of rulemakings or potential rulemakings. For example, the Bureau conducted focus groups among prepaid card users prior to initiating work on the prepaid card rulemaking, and in conjunction with a potential rulemaking involving debt collection. The Bureau likewise has conducted user testing of model disclosure language in connection with rulemakings or potential rulemakings, including the prepaid rulemaking and a potential debt collection rulemaking. Generally, this testing takes the form of iterative rounds of one-on-one interviews, conducted by a vendor, in which consumers are exposed to model forms under development and asked questions designed to elicit their understanding of the forms.

Other data collections from consumers support the Bureau's work to design and improve financial education initiatives and to measure the reach of its financial education programs or how the consumer financial markets, consumer behavior, financial entity behavior, or other economic factors might change as a result of those programs. For example, the Bureau gathers metrics related to materials visited on the Bureau's website to help understand what information consumers access. In addition, the Bureau has conducted a series of focus groups and one-on-one interviews to understand consumers' knowledge and experiences with respect to various consumer financial products and services. These data collections resulted in a series of "Consumer Voices" reports. Other data collections include user testing of contemplated consumer education materials and tools.

The Bureau also has conducted a number of formal quantitative surveys. Appendix B lists all such surveys. Several illustrative examples are discussed below.

In connection with a potential debt collection rulemaking, the Bureau conducted the Survey of Consumer Experiences with Debts in Collection. This was a mail survey sent to a nationally representative sample of consumers selected from the Bureau's Consumer Credit Panel, which is discussed in the Core Data Assets section of this report. Before issuing the TILA-RESPA Integrated Disclosure Rule, the Bureau conducted a quantitative test comparing consumer comprehension using the contemplated new disclosure forms with consumer comprehension using the preexisting forms.

The 2017 National Financial Well-Being in America Survey, conducted for the Offices of Financial Education and Financial Protection for Older Americans, was an online survey conducted to measure the financial well-being of adults in the United States. These data were created as a foundation for internal and external research into financial well-being and are relevant to work being done by researchers in the Office of Research who have access to the (deidentified) data.

5. Data reused within the Bureau

This section discusses instances in which data has been obtained by one part of the Bureau for one purpose and then used by another part of the Bureau for a different purpose. The discussion below first clarifies what is covered in this section.

The analysis or insights derived from the data one Bureau office collects can be useful to help inform the work of other offices in the Bureau. Therefore, after analyzing the data that it has collected, a Bureau office may share its analysis within the Bureau. The office conducting the analysis may share it proactively or in response to a question from another office. Additionally, one Bureau office may ask another office to analyze data that the other office has collected and to share the findings or insights from that analysis. The Bureau does not centrally track these types of sharing, and they are not considered a "reuse" for purposes of this report.

For example, to prepare its annual calendar of supervisory examinations, Supervision regularly consults with Markets to obtain information to help inform Supervision's risk-based prioritization and to define the scope of contemplated examinations. On an ad hoc basis, Supervision may also consult the relevant Markets Office prior to conducting a particular examination to discuss market trends that may be relevant to the examination.

In addition, periodically data is obtained for multiple purposes. The Call Reports, whose uses are noted in the Core Data Assets section of this report, are an example of data assets that are brought in for multiple purposes. The Bureau does not define these as reuses. The Bureau also distinguishes between "reuse" and instances in which two or more Offices within the Bureau work jointly on a particular project and jointly use data in that context. For example, the Office of Research regularly supports Supervision with respect to fair lending examinations requiring complex econometric analyses of supervisory data; the Office of Research also supports Enforcement from time to time, on request, by providing analyses of complex data assets obtained in the course of an investigation. CEE and the Office of Research collaborate on a number of research studies by exchanging ideas with respect to a research topic, research design, or a data collection instrument, and they share data in such endeavors. For example, the Office of Research collaborated with Financial Education in connection with the National Financial Well-Being in America Survey.

In contrast to the above examples, and as noted above, data obtained for a particular purpose may prove relevant to work being conducted by another division for a different purpose. This has occurred most frequently with respect to data collected by Supervision for purposes of supervisory exams and later considered potentially relevant for research, market monitoring, rulemaking, or the assessment of significant rules, all of which are led by Research, Markets, and Regulations. Consequently, in 2013 Supervision and RMR developed an information sharing framework, a copy of which is included in Appendix A, pursuant to which RMR has from time to time been able to reuse Supervision data as discussed below.

Enforcement data has been much less commonly shared, but on two occasions, discussed below, Enforcement and RMR entered into agreements to share specific data assets. These agreements also are contained in Appendix A.

The Bureau is currently working to centralize the process of authorizing and cataloging reuse requests through the Data Intake Group and the Chief Data Officer.

5.1 Public domain data

Public data are, by definition, available to any member of the public and any governmental agency. The Bureau does not track the reuse of these data because, as public data, they are available for and intended for a variety of analyses by anyone within government or the general public and they generally do not contain sensitive data.

5.2 Other agency data

Reuse of other agency data depends upon the nature of that data. Certain agency data, while non-public, is intended to be used on an ongoing basis by employees in multiple parts of the Bureau who have a need to know information contained in the data asset. This is true, for example, of the restricted version of the FFIEC and NCUA Call Reports, NMLS Mortgage Call Reports, or the non-public elements of the HMDA data asset, which may be used by certain employees in SEFL and RMR.

Where the data constitutes another agency's confidential information, such as confidential supervisory or investigative information, the Bureau treats it in accordance with its own and the other agency's applicable regulations regarding the treatment of confidential information. Generally, the Bureau and the agency providing information have an MOU or another form of agreement (such as a FFIEC interagency agreement or letter authorizing usage) which governs how the Bureau can use and reuse the data it obtains from the agency or its representatives. Access to, and reuse of, these data are restricted in accordance with the MOU or agreements.

5.3 Commercial vendor data

As previously explained, the Bureau purchases data from vendors. Some of these purchases are "off the shelf" products that are available for purchase by anyone in the private sector or government. Other products, described further in the section of this report on Core Data Assets, are customized for the Bureau from data that the vendor collects for sale. All of those data assets are listed in Appendix B, subject to the limitations previously discussed.

Most of the data assets the Bureau has procured from vendors were procured to be used generally as relevant to understand historical patterns or current trends in particular markets or in consumers' preferences and uses regarding consumer financial products and services. The uses of these data assets thus fall outside the definition of "reuse." The Bureau is not aware of any instance in which it purchased data for a specific and limited purpose and then reused the data for another purpose.

Additionally, access to, and reuse of, these data are restricted in accordance with the applicable license agreements.

5.4 Financial institution data

As previously noted, the Bureau collects data from FIs to support different aspects of the Bureau's work. To avoid re-collecting the same or similar data, the Bureau can use data collected for a particular purpose for a different purpose if authorized and consistent with governing law. The sub-sections that follow describe the types of data the Bureau has obtained from FIs and how it reuses that data.

5.5 Data collected for supervision, used for enforcement

Enforcement and the Office of Fair Lending's Enforcement team access CSI to (1) ensure consistent application of the enforcement and supervisory tools, and (2) consider, through the ARC process, whether Enforcement or Supervision should address a potential violation. Supervision shares information, including CSI, with Enforcement if that information is relevant to the duties of the Enforcement employees who will receive and use that information, including: to support a matter that has been referred to Enforcement through the ARC process, to support an ongoing investigation or litigation, or when sharing Supervision prioritization information. The information Supervision shares with Enforcement includes any information that would support the finding of a violation of federal consumer financial law, including reports of examination, supervisory letters, and relevant materials from an institution (e.g., marketing materials, disclosures, account statements).

The process by which information is shared is outlined in SEFL governance documents and is part of the routine operation of the division. Current SEFL practice limits Enforcement access to Supervision work-papers to those Enforcement employees who require the information for specific tasks related to their job functions.

5.5.1 Data collected for supervision or enforcement used for research, monitoring, rulemaking, or assessments of significant rules

Data that the Bureau obtains for its supervision and enforcement work is generally considered CSI or CII, respectively, and is protected in accordance with Bureau regulation. Bureau requirements regarding CSI and CII govern how the Bureau reuses this information internally.

In a number of instances, Supervision and RMR collaborated to develop data collections intended to support supervisory activities, which also could potentially be of value for market monitoring and/or research. These data collections are discussed below:

  • Credit card database – As discussed in the Core Data Assets section of this report, from 2012 through 2016, the Bureau used its supervisory authority to collect, on a monthly basis, de-identified account-level data from credit card issuers. The data specifications for this data collection paralleled a preexisting data request by the OCC. The Bureau used these data for research and market monitoring as well as supervisory activities.
  • Payday – In 2012, the Bureau collected account-level data on payday loans in connection with a series of supervisory examinations. De-identified versions of these data were also used to study payday loan products and consumers' use of them and resulted in a number of research reports and papers.
  • Deposit advance products (DAP) – The Bureau collected a random sample of checking account data and transaction-level data from a sample of FIs offering DAP in connection with a series of supervisory examinations. De-identified versions of these data were also used to understand consumers' experiences with the product; in addition, these data provided insights into consumers' experiences with related products. These data were used in two research publications that were relied on in the Bureau's rulemaking, and several research papers.
  • Overdraft – The Bureau collected aggregated, de-identified data regarding overdraft usage at a number of large FIs and de-identified transaction-level data for a random sample of de-identified checking accounts from these banks. It used these data to analyze the market for overdraft services and consumer choices and outcomes and this resulted in several research reports. The Bureau subsequently contracted with one of the three nationwide consumer reporting agencies to provide credit data along with a match key that enabled the Bureau to match the credit data to the transaction data. Bureau researchers used these de-identified data to produce a research report.
  • Credit card – In 2013, Supervision obtained aggregate-level data from nine card issuers that were used for credit card industry reports that the Bureau was required to produce. In 2015, Supervision obtained match keys from nine credit card issuers for accounts in the Credit Card Database, discussed in the Core Data Assets section of this report, which enabled the Bureau to identify accounts in the database with a rewards feature; Supervision also obtained certain aggregate-level data from five issuers in connection with supervisory prioritization efforts. RMR developed these data requests in collaboration with Supervision. In addition, these five issuers and four other issuers received orders pursuant to the Bureau's market monitoring authority seeking aggregate-level information comparable to the information obtained in 2013, and four other issuers received orders seeking a different set of aggregate-level information. For the 2017 report, all requests were made using the market monitoring authority.

As discussed above, data that Supervision collected in the normal course of its examination work has proven to be relevant for work undertaken by RMR and has been shared pursuant to the previously-referenced data-sharing framework established in 2013 by SEFL and RMR. See Appendix A. The framework was established to enable RMR to use CSI (where appropriate) to support its work, to minimize burden on industry, and to ensure the confidential nature of the information.

Of note, the framework allows RMR to access CSI when such access would further RMR's mission of informing the public, policymakers, and the Bureau's own policy-making with data-driven analysis of consumer finance markets and consumer behavior. The framework also contains access restrictions and approval requirements to protect the confidential nature of such information.

From time to time, data Supervision has collected in the normal course of conducting examinations has proven relevant for particular research or market monitoring projects of the Bureau. Specifically:

  • Rulemaking and potential rulemakings – The Bureau has used supervisory information to inform a number of rulemakings. To enhance their understanding of the payday market in connection with the payday rulemaking, members of the rulemaking team reviewed certain data collected through supervisory examinations of payday lenders. Similarly, in connection with a potential debt collection rulemaking, members of that team reviewed certain data collected through supervisory examinations of debt collectors. In preparation for a potential rulemaking regarding the collection and reporting of small business lending data, a rule required by the Dodd-Frank Act, members of the team reviewed certain data collected from small business fair lending examinations. In the HMDA rulemaking, RMR used supervisory data from fair lending examinations to evaluate the incremental benefits that certain potential data points would have in furthering the statutory objectives. In connection with a potential overdraft rulemaking, members of the team examined supervisory data regarding opt-in rates for overdraft services.
  • Assessments of significant rules – Bureau researchers who are preparing statutorily mandated assessments of the Bureau's significant rules use data collected in the course of supervisory examinations of market participants for assessment purposes. Reviewing the data also has enabled the researchers to identify data gaps that are being filled through other means.
  • Credit card industry report – In preparing the 2017 credit card industry report, Markets staff examined information collected by Supervision relating to deferred interest offers.
  • Paperwork Reduction Act estimates – To inform the Bureau's estimate of the burden associated with paperwork requirements under the Fair Credit Reporting Act, Research reviewed related information collected in supervisory examinations.
  • Proxy methodology – Bureau researchers used data Supervision collected from mortgage lenders through fair-lending examinations to prepare a report assessing the effectiveness of a methodology developed by the Bureau to proxy the race and ethnicity of consumers when that information is unknown.
  • Mortgage-related data – Bureau researchers have used supervisory data from mortgage examinations and the restricted (non-public) HMDA data for several research projects in conjunction with Bureau policymaking work. On one occasion, researchers examined supervisory mortgage data to determine whether the data could be used to evaluate potential effects of the Bureau's mortgage rules on manufactured housing lending, but the data proved not to be useful for that purpose. In a second instance, researchers analyzed supervisory mortgage data to explore variations in origination points and fees but again the data proved not to be useful. Researchers likewise were unsuccessful in seeking to use supervisory data to develop an alternative means of calculating the Average Prime Offer Rate, which is embedded in certain regulatory provisions. On one occasion, researchers were able to use certain information collected through Supervision as part of a study on the returns to consumers from mortgage shopping.
  • Student lending – To develop data specifications for a data request to student loan servicers, Research and Markets staff examined certain examination information regarding data collected by servicers. The Bureau has not made the then-contemplated data request to servicers to date.
  • Consumer-permissioned access to data – To inform policy-making relating to consumers' ability to access transactional data about their accounts through third-party aggregators, the Markets team working on this issue reviewed certain agreements that Supervision had obtained through supervisory requests.
  • Auto finance research – Bureau researchers are using supervisory data from auto finance exams to conduct three separate research projects on issues in the auto finance market. None of these projects has reached the point of a published research paper.

The sharing of enforcement data for research, market monitoring, and rulemaking has been rare. There have been some instances in which data collected by Enforcement in the normal course of an investigation have proven relevant to other work at the Bureau, and the data were reused for these purposes. Specifically:

  • Installment and automobile title lender data – In the course of an investigation, Enforcement obtained data from certain companies which shed light on the size of the market for vehicle title and installment loans and the market share of certain participants in that market. These data were used by researchers and analysts who were working on a potential rulemaking to define larger participants in the market for personal loans. The Bureau has subsequently suspended work on that rulemaking.
  • Payday – The Bureau used standard-form agreements (i.e., the text of the contracts), some of which were secured from payday lenders as part of Enforcement investigations, for a report that the Bureau was required to submit to Congress regarding mandatory pre-dispute arbitration. The Bureau used the agreements in conducting its analysis of the prevalence and terms of arbitration agreements in the payday market.
  • Small dollar lending – Through Enforcement investigations of a number of different types of liquidity lenders, as well as through an order directed to certain lenders for risk-assessment purposes, the Bureau obtained loan-level data that the Bureau used to better understand risks to consumers. The Bureau published two reports based in part upon these data, which were used in the Bureau's rulemaking regarding vehicle title loans.

5.5.2 Data collected by the Division of Research, Markets, and Regulations and used for other purposes

Although insights developed by RMR from data obtained by RMR and analyses of those data can be relevant to Supervision and Enforcement, the sharing of such insights and analyses falls outside the scope of "reuse" as defined for this report. Supervision and Enforcement do not generally use, in exams or cases, raw data obtained from RMR. The Bureau is aware of two instances, however, in which such data has been reused:

  • RMR shared with Enforcement a subset of arbitration agreements – originally obtained for the arbitration study the Bureau was required to conduct – to facilitate intra-Bureau feedback and input on the arbitration rule, and to help build institutional knowledge about trends in contract provisions.
  • On another occasion, a subset of data relating to debt collection, originally obtained for the 2015 Credit Card Report, was shared with Supervision for the purpose of educating staff on market trends.

5.6 Consumer data

As previously explained, this Report covers certain types of data collections from consumers. Appendix B will be supplemented with information with respect to data collections from consumers on a voluntary basis through focus groups, one-on-one interviews, user testing, and small-scale informal surveys. The Bureau is not aware of any instances in which it has reused such data collections, which tend to be of a very small scale and for a limited purpose.

Appendix B does list data collected from consumers through surveys as well as through disclosure research. As previously noted, RMR and CEE have jointly worked on a number of these surveys and have shared the data from those surveys. The Bureau is not aware of any instance in which data from any of these surveys has been reused.

The Bureau also is not aware of any instance in which data collected from consumers through investigative interviews by Enforcement have been reused. The same is true with respect to the one instance in which Supervision collected data from consumers.

The Bureau views data collected through Consumer Response as collected for multiple purposes beyond simply resolving individual complaints, and therefore does not consider multiple uses of this data to be reuse. With that said, this section describes how consumer complaint data is accessed and used within the Bureau.

Bureau staff who do not need access to direct personal identifiers use a de-identified version of consumer complaint data for the following purposes:

  • Supervisory activities including examination prioritization and scoping of exams;
  • Enforcement investigations;
  • Market monitoring; and
  • Research to support rulemaking and other policymaking by RMR, to inform the financial education work of the Office of Financial Education and the work of other offices within CEE serving specific consumer populations, such as servicemembers, older Americans, students, and traditionally underserved consumers.

The Private Student Loan Ombudsman uses the complaint data involving students to monitor the resolution of individual complaints, to obtain additional information by speaking directly to complainants, to publish reports based upon the complaint data, and to review and attempt to resolve informally complaints related to student loans — all as required by statute.

The Office of Servicemember Affairs uses the complaint data involving current and former servicemembers to support its work monitoring complaints by servicemembers and their families and responses to those complaints as required by statute.

In addition, in some cases Enforcement needs to contact consumers who have submitted complaints to obtain evidence for an investigation or witnesses for a judicial proceeding. Designated employees within Enforcement may access the complaint data asset that includes direct identifiers.

6. Description of core data assets and their uses

There are several data assets that the Bureau considers to be foundational to its work, and these core data assets are summarized below. The Bureau obtains these assets from a variety of sources, as discussed above, including the public domain, vendors, FIs, or consumers. They are considered foundational because they provide insight into consumer financial markets or support regulatory activities and are persistent rather than a one-time collection. Access to these data assets is restricted pursuant to Bureau policy as described above and in accordance with applicable law, including 12 C.F.R. part 1070 and the Privacy Act of 1974, 5 U.S.C. § 552a.

Consumer complaints – As noted above, Consumer Response receives consumer complaint data from consumers describing their issues with companies providing financial products or services. Consumer Response also receives complaint data in the form of company responses to those consumers. It also receives inquiries and feedback from consumers.

As discussed above, the Bureau primarily uses complaint data for the purpose of responding to consumer complaints, as well as market monitoring, supporting the supervision of FIs, Enforcement activities, and for trends analysis for consumer financial education and engagement (e.g., servicemembers, older Americans). Where appropriate, Bureau staff use a de-identified version of the data asset that does not contain direct personal identifiers.

Enforcement activities data – Data collected in the course of Enforcement activities. This is not one data asset, but rather a number of data assets maintained separately for each Enforcement matter, each with access restrictions. These data are considered CII, and they are governed by regulation and Bureau requirements regarding CII. If any of these data are shared with RMR as discussed above, access restrictions apply to such data.

Supervisory activities data – Data collected in the course of supervisory activities. This is not one data asset, but rather a number of data assets maintained separately for each supervisory matter, each with access restrictions. These data are considered CSI, and they are governed by regulation and Bureau requirements regarding CSI. If any of these data are shared with RMR as discussed above, access restrictions apply to such data.

Consumer Credit Panel (CCP) – The CCP is a nationally representative panel (1 in 48 sample) of approximately five million de-identified consumer credit records that is updated monthly and dates back to 2001. The Bureau procured these data through a competitive procurement process from one of the three nationwide credit reporting agencies, each of which is in the business of selling such data. The CCP data also includes marketing data (such as estimated income) which the Bureau's vendor sells for marketing purposes. The CCP excludes direct identifiers and the vendor does not provide the name of the lender, or other furnisher of data contained in the vendor's records, but instead provides a unique identifier for each data furnisher.
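
As a rough illustration of the sampling arithmetic, a 1-in-48 sample of approximately five million records implies an underlying population on the order of 240 million records, provided every record has the same chance of selection. The sketch below assumes that equal-probability design, which this report does not confirm. The constant and function names are hypothetical and do not correspond to any Bureau system.

```python
# Illustrative only: assumes an equal-probability 1-in-48 sample.
# All names here are hypothetical, not part of any Bureau system.
SAMPLING_INTERVAL = 48      # "1 in 48 sample" as stated in the report
PANEL_RECORDS = 5_000_000   # "approximately five million" as stated in the report


def scale_to_population(sample_count: int, interval: int = SAMPLING_INTERVAL) -> int:
    """Scale a count observed in the panel up to an estimated population count."""
    return sample_count * interval


implied_population = scale_to_population(PANEL_RECORDS)
print(f"Implied population: about {implied_population:,} records")
```

Under the same assumption, any count observed in the panel (for example, the number of records with a particular type of account) could be scaled by 48 to approximate the corresponding national count.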

The Bureau's vendor matches the records in the CCP to a public database of servicemembers maintained by the Department of Defense so that records pertaining to servicemembers are flagged. The Bureau also procured, from a separate vendor, income data at the nine-digit (zip +4) zip code level and arranged to have the CCP vendor match that data to records in the CCP.

The Bureau uses these data primarily for research and market monitoring. The data have been used for a number of research reports and the Bureau's biennial reports to Congress on the credit card industry, and to inform rulemakings. These data are also used for the monthly publication of Consumer Credit Trends.

The Bureau also uses the CCP as a "sampling frame" for conducting certain surveys. The Bureau provides sampling criteria to the nationwide consumer reporting agency that provides the CCP data, and the agency selects individuals to receive such surveys and mails the surveys to them. The agency in turn strips the responses of any direct identifiers and provides the responses to the Bureau with a match key that the Bureau can use to match each response to the CCP without any direct personal identifiers. This enables the Bureau's researchers to use the CCP data to weight responses and adjust for non-response bias and to study results for discrete segments such as segments defined by credit tier. The Bureau used this approach to conduct the first nationally representative survey of consumers with debts in collection.
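
The weighting step described above can be sketched in a few lines. This report does not state which weighting method the Bureau's researchers use; the example below substitutes simple post-stratification by credit tier, a common technique, purely for illustration. The field names (`match_key`, `credit_tier`, `contacted_by_collector`) and the toy data are hypothetical.

```python
from collections import Counter


def poststratification_weights(frame, responses, stratum="credit_tier"):
    """Return {match_key: weight} so that weighted respondents mirror the frame.

    frame:     {match_key: {"credit_tier": ...}} for every consumer sent a survey
    responses: de-identified survey responses, each carrying a "match_key"
    """
    frame_counts = Counter(rec[stratum] for rec in frame.values())
    resp_counts = Counter(frame[r["match_key"]][stratum] for r in responses)
    weights = {}
    for r in responses:
        tier = frame[r["match_key"]][stratum]
        frame_share = frame_counts[tier] / len(frame)
        resp_share = resp_counts[tier] / len(responses)
        # Tiers that respond less often than their frame share get weights above 1.
        weights[r["match_key"]] = frame_share / resp_share
    return weights


def weighted_mean(responses, weights, field):
    """Weighted average of a numeric survey field."""
    total = sum(weights[r["match_key"]] for r in responses)
    return sum(weights[r["match_key"]] * r[field] for r in responses) / total


# Toy data (hypothetical): one of the two subprime consumers did not respond.
frame = {
    "k1": {"credit_tier": "prime"},
    "k2": {"credit_tier": "prime"},
    "k3": {"credit_tier": "subprime"},
    "k4": {"credit_tier": "subprime"},
}
responses = [
    {"match_key": "k1", "contacted_by_collector": 0},
    {"match_key": "k2", "contacted_by_collector": 0},
    {"match_key": "k3", "contacted_by_collector": 1},
]
weights = poststratification_weights(frame, responses)
estimate = weighted_mean(responses, weights, "contacted_by_collector")
```

Because the match key links each response to panel attributes without any direct identifier, the adjustment can be computed entirely on de-identified data.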

National Mortgage Database (NMDB) – The NMDB is a joint project with the Federal Housing Finance Agency and is a source of information about the U.S. mortgage market based on a 5 percent sample of residential mortgages. It consists of three primary components: (1) account-level origination and loan performance data along with credit information associated with the accounts; (2) the quarterly National Survey of Mortgage Originations (NSMO); and (3) the annual American Survey of Mortgage Borrowers (ASMB). To construct the NMDB, one of the three nationwide consumer reporting agencies provides quarterly mortgage account-level data and associated credit data for a nationally representative sample (5 percent) of mortgages active at any time since January 1998. The credit records start prior to the time the mortgage was first reported (but no earlier than 1998). The consumer reporting agency updates the account and credit data quarterly and adds a 5 percent sample of new mortgages to the database. The records are de-identified, and the identities of creditors, servicers, or other furnishers of data to the consumer reporting agency are masked as well before the data is provided to the FHFA or the Bureau. The NMDB includes approximately 12 million residential mortgages.

For loans represented in the NMDB that were purchased or guaranteed by Fannie Mae, Freddie Mac, the Federal Housing Administration (FHA), the Department of Veterans Affairs (VA), or the Department of Agriculture, the nationwide consumer reporting agency appends to the NMDB records administrative data reported to the purchaser or guarantor. That consumer reporting agency also appends HMDA data and additional servicing and property records procured from a vendor. All matching and appending is conducted by the nationwide consumer reporting agency behind a firewall. The details of the processes used to append the data while protecting consumer privacy are explained in a technical paper on the construction of the NMDB.

On a quarterly basis, the consumer reporting agency responsible for providing the data in the NMDB selects a random sample of NMDB borrowers who recently obtained new mortgages and mails a survey to those selected regarding their origination experiences. Once a year, the consumer reporting agency selects a random sample of existing borrowers, using criteria for selecting the sample provided by the FHFA and the Bureau, and mails a survey seeking information regarding the borrowers' mortgage servicing experience. When it receives the responses, the consumer reporting agency provides the FHFA and the Bureau with the de-identified responses and a match key that can be used to match back to the data in the NMDB.
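
The general idea of a match key can be illustrated with a short sketch. This report does not describe the mechanism the consumer reporting agency actually uses; the example below simply assumes that the data holder replaces direct identifiers with a random key and keeps the crosswalk on its side of the firewall. All field names, including `person_id` and the list of direct identifiers, are hypothetical.

```python
import secrets

# Hypothetical field names; the report does not specify the actual record layout.
DIRECT_IDENTIFIERS = ("name", "address", "ssn")


def deidentify(records, crosswalk=None):
    """Strip direct identifiers and attach a random match key to each record.

    The crosswalk (person_id -> match_key) stays with the data holder, so the
    recipient can join data sets on match_key without learning anyone's identity.
    """
    crosswalk = {} if crosswalk is None else crosswalk
    released = []
    for rec in records:
        person = rec["person_id"]
        if person not in crosswalk:
            crosswalk[person] = secrets.token_hex(8)  # random, carries no meaning
        clean = {
            k: v for k, v in rec.items()
            if k not in DIRECT_IDENTIFIERS and k != "person_id"
        }
        clean["match_key"] = crosswalk[person]
        released.append(clean)
    return released, crosswalk


# Toy example: the same crosswalk is reused so survey responses match the panel.
panel_raw = [{"person_id": 1, "name": "A. Borrower", "loan_balance": 200_000}]
survey_raw = [{"person_id": 1, "name": "A. Borrower", "shopped_lenders": 2}]
panel, xwalk = deidentify(panel_raw)
survey, _ = deidentify(survey_raw, xwalk)
```

In this sketch the recipient sees only `match_key` and the non-identifying fields, which is enough to link a survey response to the corresponding database record.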

The Bureau is using the NMDB for market monitoring and assessments of significant rules, and it makes data publicly available on a periodic basis regarding mortgage performance trends. There have also been four "NMDB Technical Reports" issued with results from the originations surveys, and Bureau researchers used the survey data to prepare a report on the extent to which consumers shop for mortgages. In addition, two "NMDB Staff Working Papers" jointly prepared by staff of the FHFA and the Bureau have been published.

Home Mortgage Disclosure Act (HMDA) – HMDA requires many FIs to maintain, report, and publicly disclose information about applications for and originations of mortgage loans.149 Beginning with HMDA data collected in 2017 and submitted in 2018, responsibility to collect and process HMDA data transferred from the Board of Governors of the Federal Reserve System to the Bureau. For prior years, the Bureau has obtained the data collected by the Board including the small number of fields that were excluded from the public data asset. Appendix B contains an entry for the public dataset and a separate entry for the restricted data asset.

HMDA's purposes are to provide the public and public officials with sufficient information to enable them to determine whether institutions are serving the housing needs of the communities and neighborhoods in which they are located, to assist public officials in distributing public sector investments in a manner designed to improve the private investment environment, and to assist in identifying possible discriminatory lending patterns and enforcing antidiscrimination statutes. In the context of home mortgage lending, the Bureau (like other financial regulators) uses HMDA data to identify possible discriminatory lending patterns and to enforce anti-discrimination statutes like the Equal Credit Opportunity Act. As part of supervising very large banks and nonbank mortgage lenders, the Bureau reviews the accuracy of HMDA data and the adequacy of HMDA compliance programs.

The Bureau also uses HMDA data for market monitoring and research purposes, including research to inform rulemakings.

Credit Card Database (CCDB) – The CCDB is a sample of de-identified account-level (such as account balance) credit card data. The CCDB does not contain transaction level data pertaining to consumer purchases.

In 2012, the Bureau began collecting account-level data on credit card accounts maintained by nine credit card issuers covering 25 million to 75 million accounts. The collection's specifications mirrored a collection that the Office of the Comptroller of the Currency (OCC) had been conducting since 2008 from 16 large national banks covering approximately 520 million accounts. The Bureau's initial collection included data back to 2008. The combined collections covered approximately 85 percent of the market. Both collections involved credit card accounts of the FIs involved and the collections were performed by a contractor who already collected and maintained credit card data from FIs. Both collections took place using the respective agencies' supervisory authority. Neither collection captured data about individual purchases. The FIs also provided the contractor with a match key that enabled the contractor to match the records to a de-identified set of records from a nationwide consumer reporting agency.

Neither the Bureau nor the OCC nor the contractor received data containing any direct personal identifiers. Pursuant to the MOU between the OCC and the Bureau, each agency shared its data with the other agency through their common vendor.

In early 2016, the OCC ceased its collection of account-level credit card data, and at the end of 2016, the Bureau did the same. In 2017, the Bureau arranged to obtain similar de-identified account-level data (Reporting Form FR Y-14M) from the Board of Governors of the Federal Reserve System. The Y-14M data do not include any personal identifiers, nor do they include the linkages to the credit reporting data. The data covers the period starting in 2012. The Board's data covers approximately 500 million accounts representing 75 percent of the market. For some institutions, the Bureau currently retains the ability to run aggregate reports off the full data asset through the Board's vendor, and the Bureau retains a 40 percent sample of the de-identified account-level data (i.e., approximately 200 million accounts).

The Bureau has used the CCDB for market monitoring, including the preparation of a biennial report to Congress on the credit card market, and to inform decisions about priorities for supervisory examinations and the scope of such examinations. It has also used the CCDB for a number of working papers prepared by Bureau researchers, as well as for research that has informed rulemakings and supervision.

Call Reports – These include the Reports of Condition and Income from FIs in various financial markets. The data are collected by prudential regulators (by the FFIEC or the NCUA) and state regulators (through the Conference of State Bank Supervisors (CSBS)). The Bureau uses these data in a variety of research and analysis contexts. For example, these data provide the authoritative basis for determining which depository institutions have assets over $10 billion and thus fall within the Bureau's supervisory and enforcement jurisdiction. The Bureau likewise has used Call Report data to monitor the market for overdraft services and to report on the amount of overdraft fees paid by consumers and the fees' contribution to bank earnings. The Bureau used Call Report data in combination with HMDA data (as well as Census data and data from the Bureau's Consumer Credit Panel) to estimate the effect that alternative definitions of "small creditor" and "small servicer" would have in conjunction with several of the Bureau's mortgage rulemakings under title XIV of the Dodd-Frank Act.
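As a simplified illustration of the asset-size determination mentioned above, the sketch below flags institutions whose reported total assets exceed $10 billion. The field names, the institution names, and the single-period comparison are assumptions made for illustration; the actual determination may rest on additional rules that this report does not describe.

```python
from typing import Dict, List, Union

# Threshold stated in the text: total assets over $10 billion.
ASSET_THRESHOLD_USD = 10_000_000_000

Institution = Dict[str, Union[str, int]]


def institutions_over_threshold(institutions: List[Institution]) -> List[str]:
    """Return the names of institutions whose reported total assets
    exceed the threshold (strictly greater than $10 billion)."""
    return [
        str(inst["name"])
        for inst in institutions
        if int(inst["total_assets_usd"]) > ASSET_THRESHOLD_USD
    ]


# Made-up figures standing in for Call Report data.
call_report_rows = [
    {"name": "Example Bank A", "total_assets_usd": 25_000_000_000},
    {"name": "Example Bank B", "total_assets_usd": 10_000_000_000},
    {"name": "Example Credit Union C", "total_assets_usd": 4_500_000_000},
]

covered = institutions_over_threshold(call_report_rows)
```

An institution reporting exactly $10 billion is not flagged here, because the text describes the cutoff as assets "over" that amount.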

7. Conclusion

The Bureau is issuing this report in order to provide transparency with respect to the Bureau's data governance program and its data collections. The Bureau is issuing concurrently with this report a Request for Information in which it seeks public comment on the program and collections, including ways to improve their efficiency and effectiveness.