DALLE has surprising guardrails. Your image is not filtered based on your prompt. "Dead cookies" may be generated ...sometimes

Interesting findings on DALLE's content filtering mechanisms.

TL;DR ⏱️

DALLE-3 filters your content AFTER image creation
With prompt “dead cookies” you can reproduce inconsistent filtering over OpenAI API
40% of cases with same “dead cookies” prompt stop through content filter and 60% reach us over API

What is DALLE-3 🖼️

DALLE 3 is a generative text-to-image model by OpenAI also available as API
You pay per image
Images are created based on your prompt like “dead cookies”.
You can also add details like: “Dead Cookies in cute Pixar style” or “Dead cookies with dramatic situation in cute Pixar style”
Open-Source image GenAI models alternatives are available e.g., Stable Diffusion
Image GenAI are under discussion because of misuse like deepfakes or reproducing intellectual property.

DALLE-3 has a content filter to reduce misuse
If you hit the content filter you do not get a resulting image for your prompt.
The content filter is not applied based on the prompt, it is applied AFTER DALLE-3 generated the image, and the API decides in an extra step if the image should be sent to you. Likely some Image classifier.
Same prompt sometimes results in an image and sometimes in a content-filter response. For the prompt “dead cookies” you get in 60% of requests an image and in 40% a content filter issue

We @Comma Soft AG develop tools and pipelines with OS GenAI but also with API requests.
For good API-response handling, we also had to consider content filter scenarios. So we combined trigger words like "dead" with something like "cookies"
We had inconsistent content filter and still the finding that in case of content filter the response time was roughly as long as in the case of created image.

Who should pay for “dead cookies” if the resulting image was created but not sent due to content filter?
Have you known that the content filter for DALLE-3 is applied after image generation?
Do you also encounter content filter although your prompts were in principle ok?
Do you think content filters are a reasonable image GenAI misuse countermeasure?
How would you reduce Image GenAI misuse?
And most interesting (and we might never know OpenAI/DALL-E Open Ai), how do “Dead Cookies” images look like which are filtered out? 😅

The image was created by the prompt "Dead cookies in cute pixar style" If you like more of such content, reach out to me 😊

#artificalintelligence #genai #aiethics #dalle #openai #texttoimage #deepfakes