Battle of the AI Tools: Pika Labs vs. Runway in Text-to-Video Generation


In my search for the best AI tools for video creation, I'm starting with my favorite application of AI: text-to-video generation. If you've read my earlier posts, you're well aware of my enthusiasm for Runway's Gen-2 "Text to Video" tool. Since its launch, I've been consistently astounded by the videos it produces from nothing more than a basic text prompt. But to judge whether it really is the best, I need to compare it against other offerings on the market.


To put my preferred video generation tool to the test, I experimented with Pika Labs' Text-to-Video tool. Unlike Runway, which requires a monthly subscription once you exhaust the free credits, Pika's tool is entirely free. It runs through Discord and is currently in beta. The company describes its tool as follows:


“Say goodbye to intricate video editing programs and drawn-out production methods. With the platform, turn your text into compelling and visually striking videos with ease. Let your imagination soar and witness your text effortlessly evolve into vibrant video content, engaging and mesmerizing your viewers.”¹ 


Upon joining the Pika Discord server, I was directed to the "getting started" channel, which explains how to generate videos within the app. Users enter one of 10 generation channels, type "/create," and describe the video they want in the prompt field, optionally adding parameters such as the aspect ratio (the "-ar 16:9" in my prompts below). Once the prompt is submitted, the request joins the channel's shared queue and is generated in queue order, typically within a minute or two.



To test this tool, I started with the prompt "a close-up of a lava lamp on a table, -ar 16:9." The resulting output, displayed below, was an almost motionless video with an object vaguely resembling a lava lamp on the right side of the frame.



Unsatisfied with this generation, I altered my prompt in hopes of achieving a more accurate and animated representation. My next prompt, "a lava lamp with lava flowing inside, -ar 16:9," resulted in a moving but inaccurate depiction of a lava lamp, as seen below.



Since the tool seemed unfamiliar with what a lava lamp looks like, I tried a prompt depicting a person observing a lava lamp, hoping for better results. The prompt, "person sitting at a desk watching lava lamp in motion, -ar 16:9," confirmed that the tool could not recognize a lava lamp: it generated an ordinary lamp with a lava-like base. Despite the odd hands, it did produce a somewhat realistic-looking human.



Expanding my testing, I focused on a scenario that had challenged Runway's tool previously: a happy person lying in a field of flowers. To give Pika’s tool the best chance, I employed the exact prompt that had been successful with Runway: “shot of happy adult lying face up in a meadow of flowers.” Unfortunately, the resulting generation portrayed an unsettling head that could easily linger in one's dreams.



After this generation, considering Runway’s initial imperfections with similar prompts, I decided to give Pika's tool two more chances. I slightly modified the prompt to "happy woman lying in a field of flowers, -ar 16:9," hoping to get more than just a creepy face. However, this resulted in a woman happily walking amidst flowers, not lying down.



To counter this, I adjusted the prompt to “happy woman lying down in a field of flowers, -ar 16:9.” This prompt got me the closest to my desired outcome, yet it still missed the mark: the video showed two women instead of one, and they were sitting in the field of flowers rather than lying down.



I acknowledge that Runway holds an advantage because it's no longer in beta, but Runway’s Gen-2 “Text to Video” tool clearly outperforms Pika’s. Getting Pika’s tool to match my envisioned output even slightly felt like pulling teeth, and it ignored key words in my text input in almost every generation. As the tool moves out of beta, I suggest the developers add a “preview” button like Runway's, allowing users to select the best-fitting generation before finalizing. However, if you have time and patience, the free tool permits as many generations as you need to achieve your vision. For now, I’ll stick with Runway’s tool for AI video generation, appreciating its seamlessness, preview options, and the AI’s proficiency in understanding and generating from text.


Sources:

¹ https://pikalabs.org/about/
