Moonvalley wants to build more ethical video models


The wide availability of tools to build generative AI has led to a Cambrian explosion of startups in the space. Plentiful capital hasn’t hurt, either — nor has the declining cost of the requisite technical infrastructure.

In fact, one of the flashiest applications of generative AI, generative video, risks becoming oversaturated. Labs such as Genmo, Haiper and Rhymes AI are releasing models at a fast clip, and in some cases, little distinguishes them from the previous state-of-the-art.

Naeem Talukdar thinks that trust — not a model’s capabilities, necessarily — is what will set some generative video ventures apart from the rest. That’s why he’s founding Moonvalley, a Los Angeles-based startup that’s developing ostensibly more “transparent” generative video tools.

Talukdar led product growth at Zapier before founding a Y Combinator-backed company, Draft, that hosted a marketplace for enterprise AI content. He recruited Mateusz Malinowski and Mikolaj Binkowski to launch Moonvalley — both former scientists at DeepMind, where they studied video generation techniques.

“We shared a belief that video generation was going to transform media and entertainment, but the startups we saw operating in the space didn’t have the necessary attributes to be successful,” Talukdar told TechCrunch. “Existing companies were deeply antagonistic toward artists, creators and the broader industry.”

To Talukdar’s point, most generative AI companies train models on public data, some of which is invariably copyrighted. These companies argue that fair-use doctrine shields the practice. For instance, OpenAI has insisted that it can’t properly train models without copyrighted material, and Suno has argued that indiscriminate training is no different from a “kid writing their own rock songs after listening to the genre.”

Some of Moonvalley's founding team (L-R): Mateusz Malinowski, Bryn Mooser, Mikolaj Binkowski, and John Thomas. Image Credits: Moonvalley

But that hasn’t stopped rights owners from lodging complaints or filing cease-and-desist letters.

Vendors have become quite brazen even as lawsuits against them pile up. Earlier this year, ex-OpenAI CTO Mira Murati didn’t outright deny that OpenAI’s video model, Sora, was trained on YouTube clips — in seeming violation of YouTube’s usage policy. Elsewhere, a 404 Media report suggests Runway, a generative video startup, scraped YouTube footage from channels belonging to Disney and creators like MKBHD without permission.

Canadian AI startup Viggle outright admits that it uses YouTube videos to fuel its video models. And, like most of its rivals, it offers no recourse for creators whose works might’ve been swept up in its training.

“Generative models need to respect copyrights, trademarks, and likeness rights,” Talukdar said. “That’s why we’re partnering closely with creators on our models.”

Moonvalley, which doesn’t have a fully trained video model yet, claims it’s one of the few companies using exclusively licensed data from content owners who’ve “opted in.” To cover its bases, Moonvalley intends to let creators request their content be removed from its models, allow customers to delete their data at any time, and offer an indemnity policy to protect users from copyright challenges.

The approach parallels Adobe’s, which is training its Firefly video models on licensed content from its Adobe Stock platform. Talukdar wouldn’t say how much Moonvalley is paying contributors for clips, but it could be quite a lot. Bloomberg reported that Adobe was offering around $120 for every 40-45 minutes of video.

To be clear, Moonvalley isn’t procuring content itself. It’s working with unnamed partners who handle the licensing arrangements and package videos into data sets that Moonvalley purchases.

These partners — so-called “data brokers” — are in high demand these days, thanks to the generative AI boom. The market for AI training data is expected to grow from roughly $2.5 billion now to nearly $30 billion within a decade.

“We’re licensing high-quality data from multiple sources that work directly with creators and compensate them well for the use of their content,” Talukdar added. “We’re ensuring that we’re using a high-quality, diverse data set.”

Unlike some “unfiltered” video models that readily insert a person’s likeness into clips, Moonvalley is also committing to building guardrails around its creative tooling. Like OpenAI’s Sora, Moonvalley’s models will block certain content, like NSFW phrases, and won’t allow people to prompt them to generate videos of specific people or celebrities.

Of course, no filter’s perfect, but Talukdar says that this “red-teaming” will be a core part of Moonvalley’s release strategy.

“As the relationship between media and AI continues to evolve rapidly, and not without skepticism, Moonvalley aims to establish itself as the most trusted partner for media organizations,” he said.

But can Moonvalley really compete?

As alluded to earlier, Google, Meta, and countless others are pursuing generative video — with varying degrees of ethical consideration. Tech giants are changing their terms of use to gain a data advantage: Google is training its Veo video model on YouTube videos, while Meta is training its models on Instagram and Facebook content.

Moonvalley hopes to appeal to brands and creative houses, but some vendors have already made meaningful headway there. Runway recently signed a deal with Lionsgate to train a custom model on the studio’s movie catalog; Stability AI recruited “Avatar” director James Cameron to its board of directors; and OpenAI teamed up with brands and independent directors to showcase Sora’s potential.

Then there’s Adobe, which is going after Moonvalley’s target market: artists and content creators who want “safer” (from a legal perspective, at least) generative video tools.

Moonvalley’s challenge is threefold. It’ll have to convince customers its tools are competitive with what’s already out there. It’ll need to build up enough runway to be able to train and serve follow-up models. And it’ll have to secure a loyal base of customers who won’t switch to another provider at a moment’s notice.

Many artists and creators are understandably wary of generative AI, since it threatens to upend the film and television industry. A 2024 study commissioned by the Animation Guild, a union representing Hollywood animators and cartoonists, estimates that more than 100,000 U.S.-based film, television, and animation jobs will be disrupted by AI by 2026.

“Our focus is on building tools to help creators create ever grander and more immersive content,” Talukdar said when I asked him about the risk of creatives losing their jobs from generative AI.

On the runway front, Moonvalley’s made some progress: The company recently raised $70 million in a seed funding round co-led by General Catalyst and Khosla Ventures, with participation from Bessemer Venture Partners. That’ll fund Moonvalley’s R&D and hiring.

Currently, the company has about 30 employees who previously worked at DeepMind, Meta, Microsoft, and TikTok, Talukdar says.

“What differentiates us from other companies is a product focus,” he added. “While the core of our company is in training state-of-the-art generative models, our focus is on building deeply capable creative tools to turn these models into powerhouse equipment for professional creators, studios, and brands.”

Talukdar says the plan is to release Moonvalley’s first model later this year. The company will have to hurry if it hopes to beat upcoming releases from Black Forest Labs, Luma Labs, Midjourney, and the elephant in the room.
