OpenAI offers early look at DALL-E API, showcases text-to-image use case

The DALL-E API won’t be officially announced until later this fall, according to OpenAI, but today the company shared details about a customer that is already using the API for a specific enterprise use case.

New York City-based Cala, a startup that bills itself as the “world’s first operating system for fashion,” offers a digital platform (including a mobile app launched in March) that allows creators to design and produce clothing lines, unifying the process from product ideation through order fulfillment. With the addition of DALL-E-powered text-to-image tools, users can generate new visual design ideas from natural-language text descriptions or uploaded reference images, capabilities the company says are the first of their kind in the fashion industry.

“From the moment we saw DALL-E come into the wild, we knew that this was a really great fit for our business and how we work,” said Dylan Pyle, CTO of Cala, who added that the implementation of the DALL-E API happened just over the past few weeks. “We really see this as augmenting human designers – we’re helping you turn these ideas into fairly detailed explorations of what you’re trying to make…streamlining and making that whole process faster, more effective and more efficient.” 

Luke Miller, product manager for the DALL-E API at OpenAI, said the research company already has a large developer base using its APIs, so it has reached out to offer the DALL-E API to specific companies.


“It’s been a little bit opportunistic, because we find creative and interesting use cases as we’re testing the product,” he said. “Our team was super excited to work with Cala on this very specific use case, sort of super-powering their creative process and building it into a real business application.” 

How Cala uses the DALL-E API

To use the DALL-E-powered tools, a user selects from dozens of product templates, such as a hoodie, a dress or a jacket, and adds terms like “dark, delicate and velvet” into an adjectives section and phrases like “sewn logo patches” into a section for trims and features. 

Cala then generates six example product designs. The user can keep regenerating designs from the original prompt or further modify a particular design. Creators can also upload their own designs, and DALL-E will return six slightly different variations.
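
The flow described above, in which a product template is combined with adjectives and trims to produce a batch of six designs, can be sketched roughly as follows. This is a minimal illustration, not Cala's implementation: the template wording, the `DesignRequest` class, and the `prompt` and `n` payload fields are all assumptions made for the example, and no request is actually sent.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical per-category templates; Cala's real prompts are not public.
CATEGORY_TEMPLATES: Dict[str, str] = {
    "hoodie": "a {adjectives} hoodie{features}, studio product photo on a plain background",
    "dress": "a {adjectives} dress{features}, studio product photo on a plain background",
    "jacket": "a {adjectives} jacket{features}, studio product photo on a plain background",
}

NUM_DESIGNS = 6  # the article says six example designs come back per request


@dataclass
class DesignRequest:
    category: str
    adjectives: List[str]
    trims: List[str] = field(default_factory=list)


def join_terms(terms: List[str]) -> str:
    """Join user terms into readable text: 'a, b and c'."""
    cleaned = [t.strip() for t in terms if t.strip()]
    if len(cleaned) <= 1:
        return "".join(cleaned)
    return ", ".join(cleaned[:-1]) + " and " + cleaned[-1]


def build_prompt(request: DesignRequest) -> str:
    """Fill the category template with the user's adjectives and trims."""
    template = CATEGORY_TEMPLATES.get(request.category)
    if template is None:
        raise ValueError(f"unknown product category: {request.category!r}")
    trims = join_terms(request.trims)
    features = f" with {trims}" if trims else ""
    prompt = template.format(
        adjectives=join_terms(request.adjectives), features=features
    )
    return " ".join(prompt.split())  # collapse gaps left by empty fields


def build_payload(request: DesignRequest) -> dict:
    """Assumed request shape: one prompt, six images."""
    return {"prompt": build_prompt(request), "n": NUM_DESIGNS}


if __name__ == "__main__":
    example = DesignRequest(
        "hoodie", ["dark", "delicate", "velvet"], ["sewn logo patches"]
    )
    print(build_payload(example))
```

Keeping the template on the application side means the user only ever supplies short terms, while the wording that steers the image style stays under the developer's control.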

Pyle pointed out that Cala sees the DALL-E API as a way to help boost the creative inspiration process, whether or not the creator is an experienced designer. “We’re really in the business of taking the design and making it a reality, and if we can make it easier to get to that moment of inspiration, that’s great for us.” 

Miller added that the DALL-E API empowers developers to take the DALL-E technology and build custom solutions specific to their applications. 

“We want to build a tool that is flexible enough for them to build specific to their customer’s needs,” he said. “So in this case, [it’s about] empowering end users to come up with ideas and variations, taking an image and generating a bunch of different versions of it – to tweak and adjust things for their specific needs and let their creativity run with it.” 

Guardrails around prompt results

The Cala developers don’t allow completely open-ended DALL-E results. Instead, they tuned where the inputs can take users in each product category, said Pyle.

“Obviously the model generating the images at the end of the day is DALL-E … but we are transforming those into prompts that we’ve developed for each product category to steer the DALL-E results in the way we feel makes the most sense,” he said. “We were blown away by how easy it is…you still have to have that kind of creative steering over the inputs and interpret the outputs in a sensible way. But with just a little bit of direction, you can get some really great results. That clicked for our team almost immediately.” 

When asked whether Cala’s users can use prompts such as popular designer names or logos, Miller responded that there are guardrails around the way users can input DALL-E prompts and that the API follows the OpenAI content policy, which prohibits content in a variety of categories, including hate, harassment, violence, sexual content and political content.
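
Neither company described how these guardrails are implemented. As a rough illustration only, an application could screen user-entered terms before they reach a prompt, along the lines of the sketch below. The blocklist entries, the length limit and every function name here are hypothetical, and a check like this would sit alongside the API's own policy enforcement rather than replace it.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical blocklist; a real one would be maintained by the application team.
BLOCKED_TERMS = {"examplebrand", "famous designer"}
MAX_TERM_LENGTH = 40  # assumed limit to keep inputs short and focused


def normalize(term: str) -> str:
    """Lowercase, trim and collapse whitespace."""
    return re.sub(r"\s+", " ", term.strip().lower())


def is_blocked(term: str) -> bool:
    """True if any blocked phrase appears as a whole word or phrase."""
    return any(
        re.search(rf"\b{re.escape(blocked)}\b", term) for blocked in BLOCKED_TERMS
    )


def check_terms(terms: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split user terms into (allowed, rejected), both normalized."""
    allowed: List[str] = []
    rejected: List[str] = []
    for raw in terms:
        term = normalize(raw)
        if not term:
            continue  # ignore empty input
        if len(term) > MAX_TERM_LENGTH or is_blocked(term):
            rejected.append(term)
        else:
            allowed.append(term)
    return allowed, rejected


if __name__ == "__main__":
    print(check_terms(["Dark  velvet", "ExampleBrand logo", "   "]))
```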

“That’s a valuable asset to work through all of these questions on the moderation and safety side and pass it on to our developers so that it’s built into the experience,” he said. The Cala implementation is “a very specific use case that constrains the space and is specifically on the creative process.” 

“We’re certainly not interested in encouraging or enabling any approaches like that,” Pyle added. “We’re trying to keep focus on the kind of design elements that make your ideas unique.” 

Originally appeared on: TheSpuzz