What Is a Data Broker and How Does It Work?

When data-privacy scandals make headlines, Facebook and Google tend to absorb most of the public anger. They are the familiar scapegoats whose practices get scrutinized by regulators and reported widely. But there is a parallel ecosystem of companies that operate largely out of the spotlight, unknown to most internet users — companies that in many cases hold more personal information about individuals than any single social network does.

In the United States, where federal data-privacy law remains comparatively permissive, data brokers can hold up to 1,500 pieces of information about a single person. In the EU, these companies operate close to the boundaries of GDPR, navigating its restrictions by leaning on contested interpretations of "legitimate interest" or by relying on consent mechanisms that most users never scrutinize carefully.

It is hardly an exaggeration to say that such companies hold more information about citizens than state authorities do. This article explains what data brokers are, how they collect and monetize data, what types exist, and what legal and ethical controversies surround them.

What Are Data Brokers?

Data brokers (also called information brokers, data providers, or data suppliers) are companies that collect personal data themselves, purchase it from other organizations (such as credit card companies), or crawl publicly accessible internet sources, and then aggregate that information with data from offline sources. Most people are unaware these companies exist, yet the global data brokerage industry generates an estimated $200 billion in revenue annually, and it continues to grow.

What Types of Data Do Data Brokers Collect?

Data brokers pull information from a wide range of online and offline sources.

Common sources include:

  • Social media platforms
  • Web browsing history
  • Online and offline purchase history, and warranty registration information
  • Credit card records
  • Government records (driver's licences, motor-vehicle records, census data, birth certificates, marriage licences, voter-registration data, etc.)

The types of data collected and sold typically include:

  • Full name
  • Current and previous addresses
  • Phone numbers
  • Email addresses
  • Age and gender
  • Social Security number
  • Real estate ownership information
  • Income
  • Education level
  • Occupation

Brokers combine these data points to construct audience segments — sometimes called user segments or simply audiences — which are then sold to other companies.

When the purpose is online advertising, most AdTech platforms (demand-side platforms, data management platforms, and similar) are not interested in raw identifiers like names and addresses. What they want is behavioural context: web history, purchase patterns, and demographic signals like age, gender, and income bracket that improve targeting accuracy.
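The way behavioural signals turn into sellable audiences can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Profile` layout, the `SEGMENT_RULES` thresholds, and the sample IDs are hypothetical and do not describe any particular broker's system. It only shows the general idea of grouping pseudonymous IDs under behavioural labels without using names or addresses.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Profile:
    # Hypothetical record layout; real broker schemas are not public.
    pseudonymous_id: str  # e.g. a cookie or device ID, not a name
    age: int
    income_bracket: str
    purchases: List[str] = field(default_factory=list)


# Hypothetical rules mapping a segment label to a test on one profile.
SEGMENT_RULES: Dict[str, Callable[[Profile], bool]] = {
    "sports enthusiast": lambda p: p.purchases.count("sports") >= 2,
    "music lover": lambda p: "concert tickets" in p.purchases,
}


def build_segments(profiles: List[Profile]) -> Dict[str, List[str]]:
    """Group pseudonymous IDs under every segment label whose rule matches."""
    segments: Dict[str, List[str]] = {label: [] for label in SEGMENT_RULES}
    for profile in profiles:
        for label, rule in SEGMENT_RULES.items():
            if rule(profile):
                segments[label].append(profile.pseudonymous_id)
    return segments


# Made-up sample data.
profiles = [
    Profile("id-001", 34, "50-75k", ["sports", "sports", "books"]),
    Profile("id-002", 27, "25-50k", ["concert tickets", "sports"]),
]
segments = build_segments(profiles)
```

Here `segments` maps each label to the IDs that matched it. Real systems would likely use statistical models rather than hand-written rules, but the output has the same shape: a label plus a list of identifiers an advertiser can target.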

How Do Data Brokers Make Money?

At the most fundamental level, the data brokerage business involves sourcing and aggregating personal data, then reselling curated audience categories to third parties. One of the more notorious documented cases involved a data broker that sold contact lists of rape victims, people with alcohol dependencies, and erectile dysfunction sufferers to advertisers — at $79 per 1,000 contacts.

When audience segments are sold to AdTech companies, they are typically priced on a cost per mille (CPM) basis, or as a percentage of media spend.
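The two pricing models reduce to simple arithmetic, sketched below. The function names, the CPM of 1.50, and the 15% rate are illustrative assumptions, not figures from any real rate card.

```python
def cpm_cost(impressions: int, cpm: float) -> float:
    """Data cost when a segment is priced per 1,000 impressions (CPM)."""
    return impressions / 1000 * cpm


def percent_of_media_cost(media_spend: float, rate: float) -> float:
    """Data cost when a segment is priced as a share of media spend."""
    return media_spend * rate


# Hypothetical campaign: 250,000 impressions against a segment priced at 1.50 CPM.
data_cost_cpm = cpm_cost(250_000, 1.50)

# Hypothetical alternative: 15% of a 2,000 media budget.
data_cost_share = percent_of_media_cost(2_000, 0.15)
```

Under CPM pricing the data cost scales with the number of impressions served against the segment; under percentage-of-media pricing it scales with the advertiser's budget, regardless of how many impressions that budget buys.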

Despite the sensational cases that receive media coverage, most data brokers selling to mainstream advertising buyers avoid such extreme categories. The more common segments they deliver are things like "sports enthusiast," "music lover," or "impulse buyer" — behavioural labels rather than sensitive personal identifiers.

What Types of Data Brokers Are There?

Several thousand companies worldwide collect consumer information from public and non-public sources for the purpose of reselling it. They can be grouped into four broad categories based on their primary use case.

Type 1: Marketing and Advertising Data Brokers

These are the most widely known data brokers. Companies like Acxiom and Datalogix (acquired by Oracle) fall into this category, as do data-broker divisions within larger firms like Experian and Equifax.

Their core function is to build comprehensive databases of individuals — including age, location, education, income, web history, purchase history, and interests — and make those audiences available to advertising companies for targeted campaigns.

In this model, the broker sits between data providers, which supply the raw information, and data consumers, which buy the resulting audiences.

Type 2: Fraud Detection Data Brokers

Some brokers specialize in fraud detection services used primarily by banks and mobile operators. Before approving a loan, for example, a bank might query a data broker to verify whether an applicant's stated information is accurate and to reduce the risk of extending credit to someone misrepresenting themselves.
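A verification query of this kind can be pictured as a field-by-field comparison between what the applicant stated and what the broker has on file. The sketch below is a simplified assumption of how such a check might look; the field names, the `mismatched_fields` helper, and the sample records are hypothetical, and real services would use far more robust matching.

```python
from typing import Dict, List


def normalise(value: str) -> str:
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(str(value).lower().split())


def mismatched_fields(stated: Dict[str, str], on_file: Dict[str, str]) -> List[str]:
    """Return the fields present in both records whose values disagree."""
    return sorted(
        name
        for name in stated
        if name in on_file and normalise(stated[name]) != normalise(on_file[name])
    )


# Made-up records for illustration only.
application = {"name": "Jane  Doe", "address": "1 Main St", "employer": "Acme"}
broker_record = {"name": "jane doe", "address": "9 Oak Ave", "employer": "Acme"}
flags = mismatched_fields(application, broker_record)
```

In this example only the address is flagged, because the name differs only in case and spacing. A mismatch does not prove fraud; it may simply mean the broker's record is out of date.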

Type 3: Risk-Mitigation Data Brokers

This category involves brokers that use behavioural data to influence financial and insurance decisions. A history of frequent online credit card purchases of luxury goods might lead to a classification as a high-risk borrower, resulting in higher loan interest rates. Conversely, holding an active gym membership could place someone in a lower cardiac-risk group, qualifying them for reduced life insurance premiums.

As a practical example, users of Yanosik — a dashcam and road-information app — can opt into sharing their driving behaviour data in exchange for lower car insurance premiums. Careful drivers are rewarded; reckless ones pay more.

The serious concern here is accuracy. These risk classifications are often built on incomplete or imprecise data, and because most people are unaware the profiling is happening, there is rarely a clear mechanism for them to access, correct, or remove the information being used against them.

Type 4: People-Search Sites

People-search sites such as PeekYou and Spokeo allow individuals or organizations to look up information about a person using their name, phone number, address, email, or Social Security number.

Information available through these sites can include:

  • Aliases
  • Current and past addresses
  • Birthdates
  • Interests and affiliations
  • Education and employment history
  • Marital status
  • Financial information (e.g., bankruptcy filings)
  • Social media profiles

Because this information is readily accessible, it creates a straightforward path to doxxing — the non-consensual public exposure of private personal details.

How Do Data Brokers Actually Operate?

Despite appearances, data brokers don't necessarily source their data illegally. They may scrape publicly available information from social media profiles, purchase data from companies that collected it directly from users, or acquire it from government record systems.

Data brokers often operate either right at the boundary of the law or fully within it — particularly in jurisdictions where data protection policies are weak or inconsistently enforced.

Consent to share your data with third-party brokers can be buried in a multi-item registration checkbox, or disclosed in fine print when you sign up for something like a store loyalty card (alongside the advertised 10% discount). Many users check boxes without reading what they are agreeing to, and brokers know this.

It's also worth noting that some people voluntarily participate in data-sharing programs — such as those run by market research firms like Luth Research — where individuals are paid to share granular details about themselves and explicitly consent to that data being resold. In theory, this is a transparent exchange, though it remains a niche behaviour.

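The tightening of data-protection laws globally has created new compliance pressures for the data-brokerage model. Three areas are especially contested: reliance on legitimate interest, the status of derived data, and opt-out procedures.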

Legitimate Interest

Under the EU's General Data Protection Regulation (GDPR), data processing requires one of six defined legal bases. Data brokers frequently rely on legitimate interest — the most ambiguous of the six and arguably the most abused.

The important clarification here is that legitimate interest is not a blanket justification for advertising. For advertising built on tracking and profiling, data brokers and AdTech companies are generally expected to obtain clear, specific, and unambiguous consent from the individual, a much higher standard that the industry has been reluctant to meet uniformly.

Derived, Inferred, and Predicted Data

Some data brokers take the position that derived, inferred, or predicted data — information computed or extrapolated from observed behaviour rather than directly collected — doesn't qualify as personal data under privacy law. This interpretation is contested by regulators and privacy advocates, particularly given that such derived profiles can still be used to uniquely target individuals.

Vague Opt-Out Procedures

Opt-out options exist with many data brokers, but they are often poorly publicized and difficult to find. Some brokers do offer limited transparency: About the Data, a site operated by Acxiom, allows individuals to see what data points the company holds about them and to update inaccurate information.

However, even where opt-out mechanisms exist, they often don't address inferences that have already been drawn from the original data. And because digital copies replicate across storage systems, fully erasing personal data from the internet is effectively impossible — some copy will persist somewhere.

Regulatory Enforcement and What It Means

Data brokerage has faced considerably more scrutiny since the GDPR became applicable in May 2018. The stakes are real: in January 2019, Google was fined €50 million (roughly $57 million at the time) by France's data regulator CNIL for failing to comply with EU data-protection rules.

Privacy International, a UK-based non-profit, has filed formal GDPR complaints against major AdTech companies including Criteo, Quantcast, and Tapad, as well as the credit agencies Equifax and Experian.

Regulatory action has raised the cost of non-compliance, but it has not fundamentally dismantled the data-brokerage model. Users in most jurisdictions still have limited visibility into what is collected about them and limited practical ability to stop it.

The most actionable individual response is straightforward: be selective about the services and forms you engage with, read consent notices when feasible, and treat free services with appropriate scepticism. Data brokerage persists because it is profitable — and it is profitable because data about users is continuously generated and rarely fully protected.