
Duck Tales: How we've made it easier to generate images privately in Duck.ai (Ep.33)
Inside DuckDuckGo
In this episode, Beah (Chief Product Officer) and Matej (Engineering) discuss the evolution of image generation in Duck.ai, UX improvements, and the challenge of balancing output quality and speed.
Disclaimers: (1) The audio, video (above), and transcript (below) are unedited and may contain minor inaccuracies or transcription errors. (2) This website is operated by Substack. This is their privacy policy.
If you have feedback on Duck Tales, or episode ideas, email us at podcast@duckduckgo.com
Beah: Hi, welcome to Duck Tales, where we go behind the scenes at DuckDuckGo and discuss the stories, technology, and people that help build privacy tools for everyone. In each episode, you’ll hear from employees about our vision, product updates, engineering, or approach to AI. Today I’m here with Matej, and we’re going to talk about image generation in Duck.ai. I will let Matej introduce himself, but quickly I’ll just say that if you haven’t met me yet, I’m Beah, I’m on the product team here at DuckDuckGo. And yeah, Matej, take it away. Tell us a little bit about yourself.
Matej: All right, I’m Matej. I’ve been at DuckDuckGo for a year and a half now, working on Duck.ai, and I worked primarily on image generation last year, from September until Christmas.
Beah: Awesome. Well, okay, so just to kind of get started, tell me a little bit about what that even means. What is image generation in Duck.ai? And if you’re able, I think a demo would be great.
Matej: Yep, definitely. Let me share my screen here. All right. So if you go to Duck.ai, you will see basically this screen, and you’ll probably notice this option, New Image. When you click on it and type any prompt in here, Duck.ai will try to generate an image for you. So let’s try “show me an image of a cat on the windows.” Hopefully at the end of this we’ll get an image. This is going to take a little while because image generation is quite a heavy process. Yeah, this is the simplest form of it.
Beah: Okay. I see it. It’s coming together.
Matej: Yes, it’s coming together, and here’s our little cat. We can click on it, zoom in on it, download the image, and copy it if we need it. Or we can keep writing and ask for adjustments to the image.
Beah: That’s pretty good. I mean, it has the right number of limbs. There’s nothing too crazy going on here. I’m pretty impressed. Do you wanna demo any adjustments, or is there anything else you wanna point out while your screen’s being shared?
Matej: Yes, that’s correct. Sure, we can try it. Add another cat to the image. Let’s see what it comes up with.
Beah: Okay. Another cat, yeah. Cool. And what model is operating here?
Matej: So, we’re currently using the GPT Image 2 model. We’ve been through multiple iterations of this, and it’s always been a trade-off between cost, output quality, and performance, meaning how quick the model is. We went through GPT Image 1, 1.5, 1 Mini, and 2. The current GPT Image 2 model is slower; however, the output quality is much, much better than previously. You probably cannot see it in images like this. However, if you prompted it to generate text, you would see a clear difference between the models: GPT Image 2 follows instructions really well and generates text really well.
Beah: Hmm. Got it. So a bad job, if you tell it to put some copy in an image, is that it uses different copy or mangles the letters. Is that right? Yeah. I’ve definitely been there. But hey, look, this cat also has the right number of limbs. I don’t see anything problematic or fishy about this. I guess it’s kind of a softball query, because there’s plenty of cat images on
Matej: Exactly, exactly. That’s the case. Yeah.
Beah: In windows online, huh? Yeah. A demo query. Nice. Cool. Okay, awesome. Well, stepping back again, tell me a little bit about why we even built image generation.
Matej: That’s correct, yes. So I think from the outset, we knew that we would have to at some point. The reason is that most competitors and similar products already provide this feature. We knew that we had the capability, and we also had user feedback that this was a highly requested feature. It doesn’t match. At this point in time, this is a baseline feature that users expect in any sort of AI product, I would say.
Beah: Yeah, yeah, got it. And all right, so did you run into any challenges when you were building this feature?
Matej: So I think the challenges were not technical. Surprisingly, most of the challenges came out of, for example, content moderation. How do we prevent people from generating, for example, self-harm images? And then the second part, which I’ve already mentioned, is how do we balance quality, performance, and cost at the same time? So those are the two challenges that come to mind from when we were building this.
Beah: Yeah. Gotcha. And what was the runtime on this work? Like, how long have you been working on this?
Matej: So I started working on the initial prototype in September last year, and we shipped it to production just before Christmas. So that was it.
Beah: Got it. And was it mostly you or was there a team of people working on it?
Matej: In the beginning, it was mostly me, and then another engineer joined me for UI work. So I did mostly the back end and then a little bit of wiring on the front end. Specifically, you can also upload your own image and then ask the model to make adjustments to it. So this is the part that I was working on.
Beah: Yeah, yeah. Nice. Have we gotten any interesting or surprising feedback since it’s been live?
Matej: So I wouldn’t say any interesting, but what was surprising to me was that the reception was quite positive. I was a bit worried when we built this feature because, in general, image generation is perceived as a fairly harmful thing. Depending on your intent, you can also do harmful things. However, I was surprised that most people were positive and appreciated having a privacy-focused way of generating images.
Beah: Yeah. Mm-hmm. Sounds like the hard work that you put into, you know, putting the right content generation guardrails on it and so forth must have paid off.
Matej: I hope so.
Beah: Nice. So going forward, are there improvements that you guys are making on this feature?
Matej: So we’ve already made a bunch of improvements. The current iteration that you saw in the demo is an evolution of the original implementation. Previously, we had a separate mode just for image generation. It was not integrated into your normal conversations. That was one piece of feedback we received, and we pivoted away from the separate mode to having it inline in your regular chats.
Beah: Mm.
Matej: Another piece of feedback we’ve heard, and I already mentioned this: we’ve gone through multiple models because people are asking for better output. Going forward, I don’t think we have anything in particular. If you have any feedback, I’m happy to read through it and put something on the roadmap. There are many ways we could take this. I can imagine we could add a freeform canvas where you draw something freehand and ask a model to make it pretty, so you get a nice image. Or advanced editing, where you mask out something and ask a model to replace that bit of the image with something else. So there are many, many different ways we can take it.
Beah: Yeah, yeah. Nice. Going back for a second to the inline versus separate mode thing: just in case it’s not obvious, it used to be that you had to start your chat in image mode. And if you started with any other model, or you were mid-chat, you couldn’t jump into image generation, right? So now if I’m mid-chat on something, can I just ask it to generate an image, and it can properly jump between those modalities?
Matej: That’s correct, yes.
Beah: Nice. And then, since we’re using one model for image generation, if I start chatting again, does it just go back to the previous model that I was chatting with?
Matej: Yes, that’s correct. So if you’re in a conversation, in the background there are essentially two models. One is for your conversation, which will be GPT-5 mini or Opus. And then there’s another model specifically for image generation, which is called whenever you ask for an image, and that is GPT Image 2.
Beah: Got it. And I can just jump back and forth willy-nilly, yeah, to edit my image or have a chat. Okay. Gotcha. And then how does Duck.ai know whether to interpret my prompt as an image generation command or as a continuation of the discussion?
Matej: Correct. Yes. That’s an excellent question. So we have an image generation tool on our back end, and basically it analyzes all the prompts that are coming in. If any of those is categorized as one that is requesting an image, then it will trigger the tool, and that tool will generate an image for you. So we have some.
Beah: How good are we at that? Are we perfect at classifying, or do you think there are times right now when somebody wants to be editing the image and they get a text response, or vice versa?
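The routing Matej describes can be pictured roughly as follows. This is only a minimal sketch under loud assumptions: the keyword heuristic is a stand-in for whatever classification Duck.ai actually performs, and the names `CHAT_MODEL`, `IMAGE_MODEL`, `wants_image`, and `route` are hypothetical labels invented for illustration, not real identifiers from the product or any API.

```python
import re
from dataclasses import dataclass

# Hypothetical labels for the two background models mentioned in the episode.
CHAT_MODEL = "conversation-model"      # whichever chat model the user picked
IMAGE_MODEL = "image-generation-model"  # the dedicated image model

# Stand-in classifier (assumption): a keyword heuristic, NOT the real logic.
_NOUN = r"(image|picture|photo|logo|drawing|illustration)"
_CREATE = re.compile(
    rf"\b(generate|draw|create|make|show|design)\b.*\b{_NOUN}\b", re.IGNORECASE
)
_EDIT = re.compile(
    rf"\b(add|remove|change|replace)\b.*\b{_NOUN}\b", re.IGNORECASE
)


@dataclass
class Route:
    model: str  # which background model handles the prompt
    kind: str   # "image" or "text"


def wants_image(prompt: str) -> bool:
    """Return True if the prompt looks like a request to create or edit an image."""
    return bool(_CREATE.search(prompt) or _EDIT.search(prompt))


def route(prompt: str) -> Route:
    """Send image requests to the image model and everything else to the chat model."""
    if wants_image(prompt):
        return Route(IMAGE_MODEL, "image")
    return Route(CHAT_MODEL, "text")


if __name__ == "__main__":
    for p in ["Show me an image of a cat on the windows",
              "Add another cat to the image",
              "What is the capital of France?"]:
        print(p, "->", route(p).kind)
```

Because each prompt is routed on its own, a chat can switch between text and image turns freely, which matches the inline behavior described earlier in the conversation.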
Matej: I don’t think we’re perfect, but I think we’re pretty good, I would say. I would say about 90%. We have a set of evaluations that we run on these tools and on various prompts, and we keep adding to the dataset that we have. So we keep improving this and it’s already pretty good.
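As a rough illustration of running evaluations over a growing set of prompts, the sketch below scores a classifier against labeled examples and collects the misses. Everything here is assumed for illustration: `toy_classifier`, `evaluate`, and the five example prompts are made up, and the resulting score says nothing about Duck.ai's real accuracy.

```python
from typing import Callable, List, Tuple

# (prompt, expected) pairs; expected is True when the prompt asks for an image.
Example = Tuple[str, bool]


def toy_classifier(prompt: str) -> bool:
    """Deliberately naive stand-in (assumption): flags prompts naming an image or logo."""
    text = prompt.lower()
    return "image" in text or "logo" in text


def evaluate(
    classifier: Callable[[str], bool], dataset: List[Example]
) -> Tuple[float, List[Example]]:
    """Return (accuracy, failures) for a classifier over a labeled dataset."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    failures = [(p, expected) for p, expected in dataset if classifier(p) != expected]
    correct = len(dataset) - len(failures)
    return correct / len(dataset), failures


# Made-up evaluation set; new prompts would be appended as misses are found.
EVAL_SET: List[Example] = [
    ("Show me an image of a cat on the windows", True),
    ("Add another cat to the image", True),
    ("Design a logo for my card game", True),
    ("Draw a llama in a coat", True),  # the toy classifier misses this one
    ("What is the capital of France?", False),
]

if __name__ == "__main__":
    accuracy, failures = evaluate(toy_classifier, EVAL_SET)
    print(f"accuracy: {accuracy:.0%}")
    for prompt, expected in failures:
        print("missed:", prompt, "| expected image request:", expected)
```

Keeping the failures alongside the score makes it easy to see which kinds of prompts to add to the dataset next.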
Beah: Yeah. Nice. Okay, one more question for you. What is your favorite thing about image generation in Duck.ai?
Matej: My favorite thing, I think, is that I had a good time with my wife when we were just goofing around. We generated, I think, a demon llama in a coat or something. I don’t know, things that you would never think about, but it was just funny to keep working on the same image and see what it would come up with. Another use case that I really love is generating logos. If you have an event and you want to put a logo on your invite, then it’s...
Beah: Mm.
Matej: It’s pretty easy. It generally gives you a good enough result to use.
Beah: Yeah, yeah. I actually had a good logo experience with Duck.ai not too long ago. I was talking to a child of a friend who, with my daughter, had made this amazing role-playing card game and named it, and they wanted to create a logo for it. And I was like, yeah, let’s just see how it does. We were literally on a hike, and I just picked up my phone and told Duck.ai what the game was called and a few, you know, suggestions for what this person had in mind for the logo, and what came out was pretty great. I was impressed. It was neat that you could do that, you know, literally while hiking through a park somewhere, in a matter of moments.
Matej: Mm-hmm.
Beah: So I think he was impressed too. Nice. All right. Well, anything else that you want to add, Matej, before we close?
Matej: Try using this feature. Give us your feedback so that we can improve it and keep working on it.
Beah: Yeah, for sure. We actually read the feedback. All right. Well, thanks for joining today, and thanks everyone for listening. See you around.
Matej: Yes. Thanks for having me, see ya.