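What is text-to-video?
Text-to-video is an AI technique that generates video content from a text description. As with image generation, the user enters a short text description and the model turns it into a moving video. The technique is used in creative work, advertising, education, and other fields, giving content creators a new way to produce material and find inspiration.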
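Tongyi Wanxiang 2.1 text-to-video
Alibaba's Tongyi Wanxiang has announced an upgrade to its version 2.1 model, with notable improvements in both video generation and image generation.
For video generation, Tongyi Wanxiang 2.1 uses a self-developed, efficient VAE and DiT architecture to strengthen spatio-temporal context modeling. According to the announcement, it supports efficient encoding and decoding of 1080P video of unlimited length, is the first to generate Chinese text inside video, and ranks first on the VBench leaderboard.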
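Open-source repository
Developers can download the code directly from GitHub and try it out.
For users who cannot easily reach GitHub, or who would rather not download and set up the model themselves, the easiest way to try it is a platform that offers one-click deployment.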
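Deployment and usage guide
Platform registration
First, register an account with the cloud service.
After registering, you land on the platform's home page.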
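Deploying Tongyi Wanxiang 2.1 text-to-video
To deploy Tongyi Wanxiang 2.1 text-to-video, open the platform's application marketplace.
Then find the corresponding text-to-video application.
The application's page shows the deployment details; read through them carefully before continuing.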
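Using Tongyi Wanxiang 2.1 text-to-video
Click the deploy button in the upper-right corner.
Select the configuration you need and click Buy Now.
Once the purchase succeeds, the instance is shown as being created; wait a moment for it to finish.
When creation is complete, click Quick Start to open the generation interface. Its overall layout is similar to that of a text-to-image tool.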
Performance testing
Below, we run one test on an RTX3090 and one on an RTX4090.
RTX3090:
Prompt:'Create a short video of a peaceful park scene during the golden hour. The sun is setting behind large, lush trees. The camera slowly pans through the park, capturing people walking, jogging, and sitting on benches. Birds are chirping, and there's a gentle breeze rustling through the leaves. The atmosphere is calm, serene, and warm, with soft golden light filtering through the branches.'
Negative Prompt:'Avoid any dark or eerie elements, such as stormy weather, gloomy skies, or ominous shadows. Do not include any loud or chaotic activities, like running or aggressive movements. The scene should remain calm and pleasant without any distractions, such as animals or people involved in unsettling behavior.'
Parameters: defaults
RTX4090:
Prompt:'Create a lively street market scene during the daytime. The market is busy with people walking around, vendors selling fresh produce, flowers, and handmade goods. There's colorful signage, and the air is filled with the sounds of lively chatter, distant music, and the rustle of fabric. The sunlight is bright and warm, creating a vibrant atmosphere. People are smiling, interacting, and enjoying the lively energy of the market.'
Negative Prompt:'Do not include any empty spaces or desolate areas. Avoid gloomy or rainy weather, and keep the environment full of life and color. There should be no dark or deserted streets, and no aggressive or unsettling behavior. The scene should remain friendly and welcoming, with no negative or chaotic energy.'
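The two test runs above can be organized as a small timing harness. The following is only a minimal sketch: the platform described here is driven through a web interface, and this article does not document any programmatic API, so `fake_generate` is a hypothetical placeholder rather than a real call. The names `GenerationCase` and `benchmark` are likewise made up for illustration, and the prompts are abbreviated from the full ones listed above.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class GenerationCase:
    """One test run: the GPU it targets plus the prompt pair."""
    gpu: str
    prompt: str
    negative_prompt: str


def benchmark(
    cases: List[GenerationCase],
    generate: Callable[[str, str], object],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Run each case once and return elapsed seconds keyed by GPU name."""
    results: Dict[str, float] = {}
    for case in cases:
        start = clock()
        generate(case.prompt, case.negative_prompt)
        results[case.gpu] = clock() - start
    return results


def fake_generate(prompt: str, negative_prompt: str) -> bytes:
    """Hypothetical stand-in: a real run would invoke the deployed model here."""
    return b""


# Prompts abbreviated from the article's test cases; all other
# parameters are assumed to stay at their defaults.
cases = [
    GenerationCase(
        gpu="RTX3090",
        prompt="Create a short video of a peaceful park scene during the golden hour.",
        negative_prompt="Avoid any dark or eerie elements.",
    ),
    GenerationCase(
        gpu="RTX4090",
        prompt="Create a lively street market scene during the daytime.",
        negative_prompt="Do not include any empty spaces or desolate areas.",
    ),
]

timings = benchmark(cases, fake_generate)
for gpu, seconds in timings.items():
    print(f"{gpu}: {seconds:.3f} s")
```

Because the two runs use different prompts, their timings are not a like-for-like GPU comparison. For that, the same prompt pair would need to be run on both cards.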


