04 February 2026

Duck Tales: How DuckDuckGo blocks trackers on third-party websites (Ep.17)

Inside DuckDuckGo

About

This episode, Beah (Chief Product Officer) and Dave (Privacy team) discuss our tracker blocklist, how it works, and why most of it is open source.

Disclaimers: (1) The audio, video (above), and transcript (below) have been lightly edited for clarity. However, they may still contain some minor inaccuracies or transcription errors. (2) This website is operated by Substack. This is their privacy policy.

Beah: Hello and welcome to Duck Tales, where we go behind the scenes at DuckDuckGo and discuss the stories, technology, and people that help build privacy tools for everyone. In each episode we talk about something going on with our product, our company, our company vision, or how we operate. Today our guest is Dave, who’s here to talk about the tracker block list. Hi, Dave.

Dave: Hey, Beah. Thanks for having me on.

Beah: Do you wanna introduce yourself briefly?

Dave: Yeah, sure thing. My name is Dave Harbage. I’m a privacy engineer at DuckDuckGo. I primarily work on identifying privacy threats, building features to protect our users from these threats, and ensuring that we deliver a great web browsing experience.

Beah: That sounds very important, Dave.

Dave: Yeah, yeah, it’s a constant battle. It’s trying to keep up with the developing environment and give our users a good experience.

Beah: Thanks for fighting a good fight. If you’ve seen other episodes of Duck Tales, you might have met me, but if you haven’t, I’m Beah and I am on the product team here. So yeah, let’s talk a bit about the tracker block list. First of all, what even is it?

Dave: Yeah, so our tracker block list is a list of domains and URLs that we found to exhibit what we call cross-site tracking behaviors. We use it in our browsers and browser extensions to block tracking requests and enhance the privacy of our users.
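To make the idea of "a list of domains used to block tracking requests" concrete, here is a minimal sketch of how such a lookup could work. It is not DuckDuckGo's implementation: the blocklist entries, the function names, and the two-label "same site" shortcut are all assumptions made for illustration (a real browser would use the Public Suffix List to decide what counts as the same site).

```python
from urllib.parse import urlsplit

# Hypothetical blocklist entries; these domains are made up for illustration.
BLOCKLIST = {"tracker.example", "ads.example.net"}


def hostname_candidates(hostname):
    """Yield the hostname and each parent domain, longest first.

    "cdn.tracker.example" yields "cdn.tracker.example", then "tracker.example".
    The bare top-level label ("example") is never yielded.
    """
    labels = hostname.lower().rstrip(".").split(".")
    for i in range(len(labels) - 1):
        yield ".".join(labels[i:])


def naive_site(hostname):
    """Simplification: treat the last two labels as the site.

    This is wrong for suffixes such as "co.uk"; it is only a stand-in
    for a proper Public Suffix List lookup.
    """
    return ".".join(hostname.lower().split(".")[-2:])


def should_block(request_url, page_url, blocklist=BLOCKLIST):
    """Return True if a third-party request matches a blocklisted domain."""
    request_host = urlsplit(request_url).hostname or ""
    page_host = urlsplit(page_url).hostname or ""
    if naive_site(request_host) == naive_site(page_host):
        return False  # first-party request, so not cross-site tracking
    return any(c in blocklist for c in hostname_candidates(request_host))
```

Calling `should_block("https://cdn.tracker.example/pixel.gif", "https://news.example.org/story")` checks the request host and its parent domains against the list, so a subdomain of a listed tracker is caught without listing every subdomain separately.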

Beah: Sweet. So what’s the point of all that? Why did we even make this thing?

Dave: Yeah, so at DuckDuckGo, we believe that privacy is a fundamental human right. And we believe that people should have the option to live their lives without third parties recording their every move. We realized that protecting users’ privacy on our search engine was only half of the battle. As soon as users leave our search engine page to visit other websites, they’re subject to these third parties tracking their activities anywhere they go on the web. Pretty much. I think you might be shocked at just how many companies are involved in tracking your activity and also the granularity of the data that they’re collecting. For example, we’ve seen individual websites load hundreds of different tracking requests on a single page load. And it’s all hidden to the users. So we decided that we wanted to build a product that protects our users’ privacy not just when they’re searching, but when they’re browsing the web.

Beah: Yeah, do you know approximately what portion of websites, or maybe of web traffic, involves a page that has trackers on it?

Dave: I don’t have the exact number on the top of my head. It is...

Beah: I feel like the last I looked at the data, which was a little while ago, was like something on the order of 90%.

Dave: Yeah, it’s up there. It’s apps that you install on your phone. It’s websites that you visit in a browser. It’s incredibly prevalent everywhere.

Beah: Yeah, so basic premise here, like I mean I’m sure there’s a lot of people listening who like know this in their sleep, but then for those who don’t, it’s like as you move around the web, there may be these hidden trackers that have no explicit connection to the site itself. Google specifically is on just a ton of pages on the internet. So you’re on some random website, like a community, maybe the school your child goes to or something like that. And Google is actually there watching what you do in some sense, collecting data on your behaviors.

Dave: Yeah, that’s exactly right. There are many different reasons that websites add these third parties to their pages. Sometimes it’s for analytics, sometimes it’s for advertising. But they all are collecting this information about what you’re looking at, how you’re interacting with the page. And it’s all being sent back to these third parties. It’s not even the site that you’re browsing. It’s not clear that they’re getting this information.

Beah: Yeah, got it. So how does that relate to ad blocking?

Dave: So in the general sense, what we offer is not an ad blocker. A lot of the open web is supported by ad revenue, and we’re really not out to destroy that business model. It does, however, block ads that track you. So as I mentioned earlier, a lot of these ads are actually phoning home about your activities. They’re either saying, this user lingered on this ad or they had their mouse over it or they’re on this page, it might be kind of personal. And anytime we detect that kind of behavior, we block that. I think a lot of people just don’t really realize that these ads that they’re seeing aren’t just static images or videos. They’re also data collection apparatuses.

Beah: Yeah, yeah, so we didn’t set out to build an ad blocker, but because a lot of the code that generates ads or that serves ads on a website has tracking in it, we block it as a consequence of blocking that tracking code. And I notice this personally, like if I use a different browser, I’m often surprised at a lot of sites that I go to on the regular. I’m like, in other browsers, these start halfway down the page because there’s this huge ad at the top.

Dave: Yep. Yep.

Beah: So, okay, this tracker block list, we built this, are we using somebody else’s data or did we build this in-house?

Dave: Yes, we built this entirely in-house. When we first started going down this path, we looked at the existing lists, and they didn’t quite meet our needs. So there are a lot of different open source lists out there. But what we found was that it wasn’t always clear why certain domains were on these lists and why other domains weren’t on these lists. That leaves some room for bias potentially, whether intentionally or unintentionally. In order to offer a good product to our users, we really wanted to build a fully objective tracker list. It’s built on real-time activity observed across the web so that any time there’s a tracker in our block list, if someone were to ask, why is that in there, we can tell them exactly why that’s in there.

Beah: Yeah, so actually, do you want to say what the criteria are? What would be the answer to that? How do we decide if something’s a tracker?

Dave: Sure, yeah. So every month, the way that we do this is we crawl hundreds of thousands of websites from all over the world. And we look at the behaviors that are exhibited by the third party requests or third party scripts that are on the page. When we’re trying to determine if something is a cross-site tracker, we focus on really a few key criteria. So the first one is, is it setting cookies or is it storing something locally that then could be accessed to track your activity across websites? The second one is, is it accessing browser APIs that are commonly used to create what’s called a fingerprint of your browser or device? So that might be checking to see how much memory your computer has or what kind of CPU you have or the width of your screen or the pixel density of your screen. A lot of tracking happens that way where they gather all the entropy from all of these different signals and they create what’s called a fingerprint of your device. And then they can uniquely identify you just by comparing that fingerprint across different sites.
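As a rough illustration of the fingerprinting idea Dave describes, the sketch below combines a few device signals into one identifier. The signal names and values are invented, and hashing a canonical JSON string is just one simple way to combine them; it is an assumption for this example, not a description of how any particular tracker works.

```python
import hashlib
import json


def fingerprint(signals):
    """Combine device signals into a short, stable identifier.

    `signals` is a plain dict of hypothetical values a script might read,
    such as memory size, CPU core count, screen width, and pixel density.
    Sorting the keys makes the result independent of collection order.
    """
    canonical = json.dumps(signals, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# The same device seen on two unrelated sites (values are made up).
seen_on_site_a = {"memory_gb": 8, "cpu_cores": 8, "screen_width": 1512, "pixel_ratio": 2}
seen_on_site_b = {"pixel_ratio": 2, "screen_width": 1512, "cpu_cores": 8, "memory_gb": 8}

# A different device.
other_device = {"memory_gb": 4, "cpu_cores": 4, "screen_width": 1366, "pixel_ratio": 1}

same_visitor = fingerprint(seen_on_site_a) == fingerprint(seen_on_site_b)
different_visitor = fingerprint(seen_on_site_a) != fingerprint(other_device)
```

No cookie is involved here: the identifier is recomputed from the device itself on every site, which is why the same visitor can be recognized across sites just by comparing fingerprints.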

Beah: Mm-hmm.

Dave: The third criterion is we look for things that are present on many different independent sites so that we have a lower threshold for what we consider to be a cross-site tracker.
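The three criteria above (storing data such as cookies, using fingerprinting-style browser APIs, and appearing on many independent sites) could be summarized from crawl data along these lines. This is only a sketch under stated assumptions: the `Observation` record, the API names in `FINGERPRINTING_APIS`, the sample crawl, and the thresholds are hypothetical, and the way the criteria are combined is a guess for illustration rather than DuckDuckGo's actual rule.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical set of browser properties treated as fingerprinting signals.
FINGERPRINTING_APIS = {
    "navigator.deviceMemory",
    "navigator.hardwareConcurrency",
    "screen.width",
    "window.devicePixelRatio",
}


@dataclass(frozen=True)
class Observation:
    site: str                          # crawled first-party site
    third_party: str                   # third-party domain it requested
    sets_cookies: bool = False         # criterion 1: stores something locally
    apis_used: frozenset = frozenset() # criterion 2: browser APIs accessed


def summarize(observations, total_sites):
    """Aggregate per-third-party statistics across the whole crawl."""
    sites = defaultdict(set)
    cookies = defaultdict(bool)
    apis = defaultdict(set)
    for obs in observations:
        sites[obs.third_party].add(obs.site)
        cookies[obs.third_party] |= obs.sets_cookies
        apis[obs.third_party] |= obs.apis_used & FINGERPRINTING_APIS
    return {
        domain: {
            "prevalence": len(sites[domain]) / total_sites,  # criterion 3
            "sets_cookies": cookies[domain],
            "fingerprinting_apis": len(apis[domain]),
        }
        for domain in sites
    }


def looks_like_cross_site_tracker(stats, min_prevalence=0.5, min_apis=2):
    """Made-up rule: tracking-like behavior plus presence on enough sites."""
    tracking_behavior = stats["sets_cookies"] or stats["fingerprinting_apis"] >= min_apis
    return tracking_behavior and stats["prevalence"] >= min_prevalence


# Tiny invented crawl of four sites.
CRAWL = [
    Observation("a.example", "tracker.example", sets_cookies=True),
    Observation("b.example", "tracker.example", sets_cookies=True),
    Observation("c.example", "tracker.example"),
    Observation("a.example", "fp.example",
                apis_used=frozenset({"screen.width", "navigator.deviceMemory"})),
    Observation("d.example", "fp.example",
                apis_used=frozenset({"window.devicePixelRatio"})),
    Observation("b.example", "widget.example"),
]
SUMMARY = summarize(CRAWL, total_sites=4)
FLAGGED = sorted(d for d, s in SUMMARY.items() if looks_like_cross_site_tracker(s))
```

Keeping the per-domain statistics separate from the final decision means each flagged domain carries the evidence behind it, which fits the goal Dave mentions of being able to say exactly why something is on the list.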

Beah: Gotcha. Do you want to talk about are there any interesting challenges, like either technical or user-facing challenges that we’ve encountered in building out this block list?

Dave: Yeah, absolutely. So the first one is tracking techniques are evolving. So as we develop a better tracker identification method, these tracking companies see that we’re doing that, or they see that others are doing that. And they devise very clever ways to evade that and make it look like they’re not tracking so they don’t get blocked. So we have to continuously update our detection techniques to stay ahead of them. And then I think the most important issue that we run into is making sure that the web works. Because a lot of websites, what they’ve done is they’ve integrated these tracking companies in a way such that if you block those tracking companies from loading, the site often doesn’t work. We’ve developed a very efficient process for reviewing these breakage reports that we get from users. So in our browsers, anytime you hit a site and it’s not working right, you can report that to us. And then we take all those reports, we look at them, we figure out what’s going on. Is this real breakage? And we fix it. And we do that, I think, pretty efficiently at this point. Most of the time we can get things working within a few days.

Beah: Nice. So if any listeners encounter breakage, what exactly should they do to report a broken site to us?

Dave: Yeah, so there’s two different ways to do it for our browsers. The first way is you can open the privacy dashboard. There’s a little green, we call it a duck foot icon in the address bar. It’s on the left side. Many people might think it’s a shield. It’s actually a duck foot. If you click that, it’ll open up and it’ll give you like an overview of the privacy of the website.

Beah: It’s both. It’s a shield and a duck foot.

Dave: And there’s a little link there that you can click to submit a broken site report. You can also just submit a broken site report from the primary browser menu in all of our browsers. Yeah, and those come straight to us. And our team reviews them and makes sure that everything is working as expected.

Beah: Got it. So if you want the protection of our block list, you have to get that. Just going to DuckDuckGo and searching isn’t going to give you the... We can’t use our block list to intervene if you’re in somebody else’s browser, unless you’re in a DuckDuckGo browser or you’ve installed our extension, right? Okay. So if you’re listening and you want this...

Dave: That’s right. That’s right.

Beah: all the benefits of this block list that Dave works hard on. Go install our browsers or extension and then report it when you run into a broken site, if you even do, because again, Dave is working hard to make sure that you don’t. Nice. Okay. So most of this tracking kind of happens behind the scenes. You can’t actually see it happening. Is there a way that users can understand what’s actually going on?

Dave: Yeah, absolutely. We show in our browsers, when you visit a web page, we’ll show a little animation in the address bar that shows the trackers that are being blocked. And then if you click into the privacy dashboard, the duck foot or shield icon, you can see a full list of every tracker that we’ve identified along with the company that it belongs to and a lot more information about the status of the web page, like the security of the site, the privacy practices of the site.

Beah: Yeah, it’s pretty wild. If you haven’t done this already, go to your favorite news site and click around and then click on the duck foot shield and you can just see sometimes dozens of companies. Sometimes Google’s often in there, but sometimes there are companies you’ve never even heard of that are on the site. It’s pretty wild.

Dave: Yeah, it’s crazy, especially the ones you’ve never heard of, because it’s not always clear what they’re doing with the data. I think some of the big advertising companies, they’re obviously using it to better target you with ads or different content. But some of these lesser known ones, they actually bundle up this data, create a profile of you, and then they sell it to the highest bidder, which is pretty scary.

Beah: Okay. Yeah. Okay, so maybe just before we wrap up, is there anything that we haven’t touched on that you want to mention, Dave?

Dave: Most of this stuff is open source, so all of the tools that we use to build our tracker block list, they’re all open source. You can find them and you can use them. They exist on github.com. That’s G-I-T-H-U-B.com slash DuckDuckGo slash tracker hyphen radar. We actually have a few different. Yeah.

Beah: Maybe we can put that in the show. I was like, they can spell GitHub. And I was like, okay, this is getting complicated.

Dave: Yeah. So we have a few different places in our GitHub where we have open sourced all of this. And we’ve actually found that some cutting edge researchers have been using a crawler to find different risks online. It’s pretty good. We try to be as responsive as possible when somebody is trying to use it and has an issue or has a question about why something works the way it does. Yeah. Hit us up.

Beah: Yeah, directly contributing to DuckDuckGo’s mission of raising the standard of trust online.

Dave: That’s right.

Beah: That seems like a good note to end on, so thank you very much, Dave. Appreciate it. And see you around the hood.

Dave: Yeah. Yeah, thank you, Beah. Thanks.

Beah: Later.

This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit insideduckduckgo.substack.com