Creative Tools

FOBI Editorial

Creative AI's real battle is for workflow leverage, not model ownership.
07.05.26

Multiple bets

Long-term harmonic and narrative coherence in music

AI music generators excel at generating realistic-sounding bars but fail at composing coherent multi-minute pieces. Structural elements—overarching harmonic progressions, motif development, emotional arc, and form (verse-chorus-bridge)—require reasoning across hundreds of bars. Current models generate note-by-note without global planning, leading to repetitive or incoherent output.

Approaches in flight


▸ Hierarchical composition with structural priors

Suno and Udio use two-stage pipelines: first generate a high-level composition plan (chord progression, tempo map, formal structure), then infill notes and timings. This mirrors how human composers work. Transformer models, which dominate recent work, outperform LSTMs and GANs at capturing long-range harmonic dependencies, achieving 79% harmonic consistency in controlled benchmarks (Nature, 2025). But the approach requires training on symbolic music (MIDI) with annotated structure, which is scarce. Models such as MuseNet (OpenAI), which generates symbolic MIDI, and MusicLM (Google), which generates audio from text, show promise, but they struggle with rare genres and cross-cultural styles because of dataset bias. Krea and Leonardo.Ai are experimenting with LLM guidance: a description of the composition is fed to a language model, which generates structural metadata that then guides generation.
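The plan-then-infill idea can be sketched in a few lines. This is a toy illustration only, not how any named product works: the chord vocabulary, the `Section` type, and the functions `make_plan` and `infill` are all hypothetical, and a hand-written plan stands in for a learned planner.

```python
import random
from dataclasses import dataclass

# Hypothetical chord vocabulary in C major: chord name -> MIDI pitches of its triad.
CHORD_TONES = {
    "C": [60, 64, 67],
    "F": [65, 69, 72],
    "G": [67, 71, 74],
    "Am": [57, 60, 64],
}


@dataclass
class Section:
    name: str          # formal label, e.g. "verse" or "chorus"
    chords: list[str]  # one chord per bar


def make_plan() -> list[Section]:
    """Stage 1: a fixed plan standing in for a learned structure planner."""
    verse = Section("verse", ["C", "Am", "F", "G"])
    chorus = Section("chorus", ["F", "G", "C", "C"])
    bridge = Section("bridge", ["Am", "F", "G", "G"])
    return [verse, chorus, verse, chorus, bridge, chorus]


def infill(plan, notes_per_bar=4, seed=0):
    """Stage 2: fill each bar with notes drawn only from the planned chord."""
    rng = random.Random(seed)
    bars = []
    for section in plan:
        for chord in section.chords:
            pitches = [rng.choice(CHORD_TONES[chord]) for _ in range(notes_per_bar)]
            bars.append((section.name, chord, pitches))
    return bars


plan = make_plan()
bars = infill(plan)
```

Because every note is sampled under a chord fixed in advance, the global form survives however weak the local generator is. That is the property the two-stage approach is after.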

▸ Multimodal audio-visual generation with temporal alignment

Invideo AI and Runway emphasize the coupling of music generation with video content: compose music that matches visual pacing, emotional tone, and scene structure. This provides external structure that constrains generation toward coherence. Google DeepMind's work on synchronized audio-visual diffusion shows this can improve both modalities. The challenge is that most music datasets lack fine-grained emotional or motion annotations, limiting what the model can learn about mapping narrative to music. Hybrid approaches (human composes high-level structure, AI fills details) show practical promise but sacrifice autonomy.
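One simple way to picture "matching visual pacing" is to choose a tempo whose beat grid lands close to the video's scene cuts. The sketch below is an illustrative assumption, not a method attributed to any of the companies above. The functions `beat_misalignment` and `best_tempo` are hypothetical, and the cut times are assumed to come from an upstream scene detector.

```python
def beat_misalignment(cut_times, bpm):
    """Mean distance in seconds from each scene cut to its nearest beat."""
    beat = 60.0 / bpm
    total = 0.0
    for t in cut_times:
        offset = t % beat
        total += min(offset, beat - offset)
    return total / len(cut_times)


def best_tempo(cut_times, candidates=range(60, 181)):
    """Pick the candidate BPM whose beat grid best matches the cuts."""
    return min(candidates, key=lambda bpm: beat_misalignment(cut_times, bpm))
```

Faster tempos have denser beat grids and so fit any set of cuts more easily. A practical version would restrict `candidates` to a stylistically plausible range rather than searching every tempo.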

▸ Reinforcement learning with musicological evaluation

Recent work (2024–2025) explores training music models with rewards based on harmonic consistency, phrase balance, and stylistic appropriateness rather than just reconstruction loss. This requires musicological experts to define and implement reward signals, which is labor-intensive and subjective. AIVA and Udio have experimented with this but report slow training and difficulty balancing multiple competing objectives. The broader issue: music creativity itself is undefined—is a model generating novel, coherent music 'creative' or merely statistical? This philosophical tension makes evaluation hard. CHI 2025 research on prompt-based music generation found that users (even experts) struggle to specify temporal edits via language, suggesting interfaces and representations may need to change alongside models.
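To make the balancing problem concrete, here is a minimal sketch of a composite reward with two hand-written terms. Every definition in it is an assumption for illustration: the function names, the pitch-class test for harmonic consistency, the shortest-to-longest ratio for phrase balance, and the weights are not taken from AIVA, Udio, or the cited research.

```python
def harmonic_consistency(bars):
    """Fraction of notes whose pitch class belongs to the bar's chord.

    `bars` is a list of (chord_pitch_classes, midi_pitches) pairs.
    """
    total = 0
    hits = 0
    for chord_pcs, pitches in bars:
        for p in pitches:
            total += 1
            hits += (p % 12) in chord_pcs
    return hits / total if total else 0.0


def phrase_balance(phrase_lengths):
    """1.0 when all phrases are equally long, lower as lengths diverge.

    Assumes phrase lengths are positive numbers of bars.
    """
    if not phrase_lengths:
        return 0.0
    return min(phrase_lengths) / max(phrase_lengths)


def reward(bars, phrase_lengths, weights=(0.7, 0.3)):
    """Weighted sum of the two terms; the weights are arbitrary placeholders."""
    w_harmony, w_balance = weights
    return (w_harmony * harmonic_consistency(bars)
            + w_balance * phrase_balance(phrase_lengths))
```

The subjectivity sits in `weights` and in the metric definitions themselves. Two experts could reasonably disagree on both, and changing either one changes what the model is rewarded for.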


