
Data Clean Rooms: A Deep Dive With Juan Baron of Decentriq


Not all data clean rooms are built the same way. Some prioritize flexibility; others put data security and privacy at the centre of their architecture. This interview explores both dimensions — the technical underpinnings of confidential computing, the advertising and non-advertising use cases for clean rooms, and the longer-term question of whether independent clean room vendors can hold their ground against Walled Garden alternatives.

Juan Baron, Director of Business Development and Strategy (Media & Advertising) at Decentriq, covers the full spectrum: from fuzzy matching and look-alike models to GDPR's legitimate business interest doctrine and the future of clean room interoperability.


Q&A: Data Clean Rooms

Question: Tell us about Decentriq and what differentiates it in the market.

Juan Baron: I've spent many years in AdTech on both the buy side and the publisher side, mostly in the US, before moving to Switzerland about eight years ago. Since GDPR came into force, privacy-first advertising has shifted from a talking point to a structural requirement, and data clean rooms have emerged as one of the key tools for meeting it.

Decentriq is a Swiss-based data clean room provider. Our differentiating feature is a hardware-based technology called confidential computing — more on that shortly. We provide clean rooms across a range of industries, not just advertising.


Question: What industries beyond advertising are using Decentriq, and for what purposes?

Juan Baron: At its core, Decentriq is a data science collaboration platform. That's really the nuts and bolts of it.

A lot of what we see is pure data scientists using the technology to collaborate with counterparts at another organization. We have a company in Asia using clean rooms in the trade finance sector — tracking and monitoring cargo shipping data in collaboration with logistics partners. We work with numerous pharmaceutical companies on market share analysis inside the clean room.

In media and advertising, we enable banks and insurance companies to activate their first-party data within premium publisher inventory. We also have publishers collaborating with insurance companies on what are called attribute prediction models — running machine learning models to predict data characteristics without ever leaking individual profile information. The only thing that exits the clean room is the model itself, not the underlying data. That's probably the most privacy-preserving form of data collaboration we've seen.

Beyond commercial use cases, we work with defence organizations on protecting critical infrastructure against cyber attacks.


Question: In programmatic advertising, matching two datasets usually requires some kind of linking ID. How does that work for non-programmatic or non-advertising use cases?

Juan Baron: Not every use case requires linking datasets at all. In our clean rooms, we have strict user permissions governing who can access and upload data. A data scientist might need to compute across multiple data sources to reach a result — without those datasets ever being joined at an individual level. That's fundamentally different from AdTech's model, which is largely about following an individual across media properties.


Question: What programming capabilities does Decentriq support inside the clean room?

Juan Baron: We use confidential computing hardware designed by Intel and AMD. We support essentially any programming language inside the clean room — not just SQL, but also R and Python. This is what data scientists already use in their day-to-day work; we're enabling them to work that way with sensitive data in a compliant environment.

We've built interactive data workflows and access permissions directly into the platform, which allows for a genuinely collaborative, iterative experience. That's quite different from what a lot of traditional competitors offer, where the focus is narrowly on finding a specific number of users for retargeting in a safe, compliant way.


Question: Since the analysis happens inside the clean room, can you explain how results are shared without exposing the underlying data?

Juan Baron: Right — you agree on the computation, run it, and then allow the collaborating party to access specific results.

We have what are called K-anonymity filters — essentially privacy filters hardwired into the platform — which ensure end results are always aggregated, never individual-level. Beyond that, because of confidential computing, there's something called remote attestation that provides cryptographic proof of what is actually being done with the data. We surface this as an audit log in the platform.
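To make the K-anonymity idea concrete, here is a minimal sketch of such a filter. This is an illustration of the general technique, not Decentriq's implementation; the threshold and field names are assumptions.

```python
from collections import Counter

K = 5  # assumed minimum group size; aggregates over fewer people are suppressed

def k_anonymous_counts(records, key, k=K):
    """Aggregate records by `key` and drop any group smaller than k,
    so no output row can be traced back to a small set of individuals."""
    counts = Counter(r[key] for r in records)
    return {group: n for group, n in counts.items() if n >= k}

# Toy audience data: only segments with at least k members survive the filter.
audience = [{"segment": "sports"}] * 7 + [{"segment": "adrenaline"}] * 3
print(k_anonymous_counts(audience, "segment"))  # {'sports': 7}
```

The small "adrenaline" group is suppressed entirely rather than reported with a low count, which is what makes the output safe to release.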

Every time someone views results or triggers a computation, the audit log records what was run, who accessed what, and what happened inside the platform. From a data protection officer's perspective, that level of transparency is genuinely valuable. It gives them the assurance they need.


Question: What are the most common use cases you see in advertising and digital marketing?

Juan Baron: The three main areas are media planning, activation, and measurement.

Media planning typically starts with an overlap analysis — an advertiser brings their customer dataset and intersects it with a publisher's network to see the audience overlap. You can bring in your own identity graph to expand the match rate.
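The overlap analysis described above can be sketched as a set intersection over normalized, hashed identifiers. This is a toy illustration under the assumption that email is the linking key; real clean rooms keep this computation inside the enclave.

```python
import hashlib

def normalize_and_hash(email: str) -> str:
    # Hashing a normalized email is a common way to compare
    # two lists without exchanging raw addresses.
    return hashlib.sha256(email.strip().lower().encode()).hexdigest()

def overlap(advertiser_emails, publisher_emails):
    """Return (matched count, match rate relative to the advertiser list)."""
    a = {normalize_and_hash(e) for e in advertiser_emails}
    p = {normalize_and_hash(e) for e in publisher_emails}
    matched = a & p
    return len(matched), (len(matched) / len(a) if a else 0.0)

crm = ["Ana@example.com", "bob@example.com", "carol@example.com"]
pub = ["ana@example.com", "dave@example.com"]
print(overlap(crm, pub))  # 1 shared user, ~33% match rate
```

Bringing in an identity graph, as Juan mentions, amounts to adding more candidate keys per person before the intersection, which is what raises the match rate.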

Activation comes in a few flavours. The first is precise activation, which requires explicit marketing consent from the brand — the traditional retargeting model everyone knows.

The second is what we call top affinity segments. Based on the intersection of the datasets, the platform identifies the top affinity segments for that particular publisher, then creates audiences or deal IDs around those segments. No explicit marketing consent is required for this.

The third and most sophisticated is look-alike modelling. The publisher runs their look-alike model inside the clean room against the data intersection, creating a larger segment. The only thing that exits the clean room is the model itself — not the raw data.
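The "only the model exits" principle can be illustrated with a deliberately tiny scoring model: fit weights on the matched seed audience inside the clean room, export only the weights, and score the wider pool outside. The feature scheme and weighting rule here are assumptions for illustration, not the publisher's actual look-alike model.

```python
def fit_affinity_weights(seed_rows, pool_rows, features):
    """Weight = how much more common a binary feature is in the seed
    audience than in the general pool. The returned dict is the only
    artifact that would exit the clean room, never the rows themselves."""
    weights = {}
    for f in features:
        seed_rate = sum(r[f] for r in seed_rows) / len(seed_rows)
        pool_rate = sum(r[f] for r in pool_rows) / len(pool_rows)
        weights[f] = seed_rate - pool_rate
    return weights

def score(row, weights):
    """Rank a candidate user by similarity to the seed audience."""
    return sum(w * row[f] for f, w in weights.items())
```

A user who shares the seed audience's over-represented features scores higher and would be added to the expanded segment; no individual from the seed data is ever exposed.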

We can make these claims because of the way confidential computing works: Decentriq has no physical way to access the data, because the encryption keys remain with the data owner. Not even the cloud provider can access it.

We back all of this with legal memorandums from prominent European law firms confirming that neither the top affinity nor the look-alike model use cases require marketing consent under GDPR's legitimate business interest doctrine. On top of that — and this is fairly groundbreaking — several major European publishers have confirmed they don't even require a joint control agreement with brands when using Decentriq, given how the technology is built.

On the measurement side: publishers have historically been limited in how much data they can share to demonstrate campaign results. Inside a clean room, for the first time, they can provide ad exposure data combined with rich audience data, and deliver genuinely predictive analytics on behalf of the brand. Premium publishers are reclaiming some of the value that programmatic advertising had taken from them.


Question: Is Decentriq channel-agnostic, or does it focus on specific channels like display, CTV, or in-app?

Juan Baron: The clean room itself is agnostic to the channel. What matters is the publishing partner. In Switzerland, for instance, we work with publishers selling standard programmatic display, native, in-feed video, and CTV. It depends on the inventory the publisher controls and what DMP or CDP data they make available in the clean room. The advertiser then chooses which channels and audiences are most relevant based on the data intersection.


Question: Is a linking ID always required to match advertiser and publisher datasets?

Juan Baron: Yes, for identity-based matching you need some kind of link. The most common linking identifier is the email address. But because you can write any type of code inside the clean room, you can ingest one or multiple identity graphs, and you can do what we call fuzzy matching — combining email, phone number, first name, last name, and even home address to increase the match rate. It's flexible.
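A simple way to picture fuzzy matching is to emit several candidate keys per record — normalized email, normalized phone, name plus postal code — and call two records a match if any key collides. This sketch uses hypothetical field names and is only one of many possible matching schemes.

```python
import hashlib

def fingerprint(*parts):
    """Stable hash of normalized identifier parts."""
    norm = "|".join(p.strip().lower() for p in parts if p)
    return hashlib.sha256(norm.encode()).hexdigest()

def match_keys(record):
    """Emit candidate keys for one record; a hit on any key is a match.
    The field names here are illustrative, not a real schema."""
    keys = set()
    if record.get("email"):
        keys.add(fingerprint("email", record["email"]))
    if record.get("phone"):
        digits = "".join(c for c in record["phone"] if c.isdigit())
        keys.add(fingerprint("phone", digits))
    if record.get("first") and record.get("last") and record.get("zip"):
        keys.add(fingerprint("name+zip", record["first"], record["last"], record["zip"]))
    return keys

def fuzzy_match(a, b):
    return bool(match_keys(a) & match_keys(b))

a = {"email": "juan@bank.example", "phone": "079 123 45 67"}
b = {"email": "j.b@mail.example", "phone": "079-123-4567"}
print(fuzzy_match(a, b))  # True — phones normalize to the same digits
```

Each extra identifier widens the net, which is exactly why combining email, phone, name, and address raises the match rate compared with email alone.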


Question: On the consent question — does collecting an email address for use in a data clean room require explicit consent if that data won't be shared with any other party?

Juan Baron: The data clean room itself doesn't solve the consent question. The law around consent is not about the technology — it's about the lawful basis for processing the data.

What we and our legal advisors have established is that certain use cases fall under legitimate business interest rather than requiring explicit marketing consent.

Here's a practical example. Say a large bank wants to advertise on a premium news publisher. The bank has its own CRM data, survey data, and web analytics, and has built a basic understanding of its customers — let's say they're predominantly male, aged 25–45, interested in sports.

In traditional programmatic, the bank would simply ask the publisher to build an audience matching those parameters. With a data clean room — specifically with the top affinity approach — you intersect the two datasets and discover that the actual overlap audience is somewhat different: the age range is actually 28–35, and the interest isn't general sports but adrenaline sports specifically.

What's happened is that both parties have used their own data to derive business insights. No individual profile information has been transferred from one entity to the other. The publisher has not been given access to the bank's customer database, and the bank has not received any user-level data from the publisher. Under the legal opinions we hold, that means no marketing consent or additional data processing agreement is required to extract those insights.
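The bank example above boils down to computing aggregate statistics over the intersection only. The sketch below shows that shape under assumed toy fields; only summary statistics leave the computation, never a publisher row or a bank customer record.

```python
def overlap_profile(bank_ids, publisher_rows):
    """Aggregate profile of the intersection only: summary stats out, no rows."""
    hit = [r for r in publisher_rows if r["id"] in bank_ids]
    ages = sorted(r["age"] for r in hit)
    # Interquartile-style range, so the headline isn't driven by outliers.
    lo, hi = ages[len(ages) // 4], ages[(3 * len(ages)) // 4]
    interests = {}
    for r in hit:
        interests[r["interest"]] = interests.get(r["interest"], 0) + 1
    top = max(interests, key=interests.get)
    return {"age_range": (lo, hi), "top_interest": top}

rows = [
    {"id": i, "age": a, "interest": t}
    for i, (a, t) in enumerate(zip(
        [25, 28, 30, 31, 32, 33, 35, 40],
        ["adrenaline"] * 5 + ["sports"] * 3))
]
print(overlap_profile(set(range(8)), rows))
# {'age_range': (30, 35), 'top_interest': 'adrenaline'}
```

This is how the bank's assumed "male, 25–45, sports" profile could narrow to "28–35, adrenaline sports" without either party ever seeing the other's individual-level data.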


Question: Would publishers need to list Decentriq as a data processing partner in their CMP?

Juan Baron: No, they would not. And that's a significant change, because of how Decentriq is built: we genuinely have no way to see what's happening inside the clean room. We don't know what code is being run, what data is being uploaded, or who has access to what. It's locked inside a hardware enclave protected by confidential computing.

Those hardware chips — the microprocessors — are built so they can only execute code that was agreed upon between the collaborating parties in advance. It's the code and the rules that govern the collaboration, not any commercial agreement with Decentriq.


Question: How does Decentriq differ from other independent data clean rooms?

Juan Baron: The key differentiator is that we sit firmly in the privacy-enhancing technologies (PETs) space. We use a combination of trusted execution environments and confidential computing hardware. As far as we're aware, Decentriq is the only data clean room using this specific combination, and it's by far the most secure architecture available.

Looking at other independent clean rooms in the market: some function more like a CMO dashboard — they pull in data from existing sources and surface collaboration insights, but there's no sophisticated data science involved. Others are built around data storage, which brings legal implications and limits the computational flexibility available.

The way to think about Decentriq is as a trusted computational layer. You agree on the computation, upload the data, run the analysis, and you're done. We're not in the storage business — we're in the business of computational flexibility. Once the computation is complete, results can feed into whatever internal analytics infrastructure the brand or publisher already has.


Question: What about Walled Garden data clean rooms from Google, Meta, and Amazon?

Juan Baron: Walled Gardens have their own business interests, and their clean room solutions reflect that. Google's Ads Data Hub is fundamentally a Google-centred view of your advertising. It's designed to show you how your data performs within the Google ecosystem — not across the entire media landscape.

If you're a large advertiser, your media spend doesn't live only on Google. It's across Meta, programmatic exchanges, direct publisher deals, and more. You need full visibility across all of it to properly allocate budgets. A Walled Garden clean room can't give you that by design.

The same logic applies to AWS — it's oriented toward driving spend within Amazon advertising, and it requires you to work within Amazon's or Snowflake's infrastructure, which limits flexibility.

Do I think Google will ever open up to the point where Meta uploads ad exposure data into a Google clean room? I genuinely don't. That's exactly why independent clean rooms are here to stay. They provide the trust and neutrality that no single Walled Garden can.

We do see a future where large brands will demand that Walled Gardens contribute their data into an independent, highly secure clean room in order to get proper cross-channel measurement. And the Walled Gardens themselves need the most secure environments available to protect their users' data — so there's actually an argument that they benefit from this infrastructure too.


Question: What are your thoughts on interoperability between different data clean room vendors?

Juan Baron: Interoperability is the key question for the industry right now. Decentriq is among the co-authors of the IAB paper on clean room standards, and what that paper set out to do was establish a baseline — both on what constitutes a genuinely private and secure clean room, and on how interoperability could work in practice.

Interoperability isn't just about technical compatibility. It's also about data normalization. Decentriq is agnostic to data inputs and outputs, but if we're ingesting data from a different clean room vendor because a brand or publisher uses their platform, the data needs to be normalized and validated at the point of entry. That's the foundational work the industry needs to do together before cross-clean-room collaboration becomes seamless.


The conversation covers a lot of ground, but a few themes stand out consistently: the importance of hardware-level security guarantees, the underappreciated breadth of clean room applications beyond advertising, and the structural reason why independent clean rooms will occupy a distinct and durable position in the market — one that no Walled Garden offering can fully replicate.