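What is text-to-video?
Text-to-video is an AI technique that generates video content from a text description. As with image generation, the user enters a short text description and the model turns it into a moving video. The technique is used in creative work, advertising, and education, and gives content creators a new way to produce material and find ideas.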
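Tongyi Wanxiang 2.1 text-to-video
Alibaba's Tongyi Wanxiang (Wan) has announced an upgrade to version 2.1 of its model, with significant improvements to both video generation and image generation.
For video generation, the announcement states that Tongyi Wanxiang 2.1 uses a self-developed, efficient VAE and a DiT architecture to strengthen spatio-temporal context modeling, supports efficient encoding and decoding of 1080P video of unlimited length, is the first to generate Chinese text inside videos, and ranks first on the VBench leaderboard.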

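Open-source repository
Developers can download the model from GitHub (https://github.com/Wan-Video/Wan2.1) or HuggingFace (https://huggingface.co/Wan-AI) and try it out directly.
For users who cannot easily reach these sites, or who would rather not download and set up the model themselves, a cloud service platform with one-click deployment is an alternative.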
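For readers who do want the weights locally, the download can be sketched with the `huggingface_hub` package. This is a minimal sketch, not the project's official procedure: the repository id `Wan-AI/Wan2.1-T2V-1.3B` and the `models` folder are assumptions, so check https://huggingface.co/Wan-AI for the model names that are actually published.

```python
from pathlib import Path

# Assumption: one of the repositories published under the Wan-AI organization.
ASSUMED_REPO_ID = "Wan-AI/Wan2.1-T2V-1.3B"


def local_dir_for(repo_id: str, root: str = "models") -> Path:
    """Map an 'org/name' repository id to a local folder such as models/name."""
    return Path(root) / repo_id.split("/")[-1]


def download_weights(repo_id: str = ASSUMED_REPO_ID, root: str = "models") -> Path:
    """Download every file of the repository into a local folder and return its path."""
    # Imported lazily so local_dir_for() works even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(repo_id, root)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target

# Usage (requires network access and `pip install huggingface_hub`):
#   path = download_weights()
```

The download is wrapped in a function so that nothing is fetched until `download_weights()` is called explicitly.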
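Deployment and usage
Platform deployment
Open the application marketplace, find the text-to-video service, and view its deployment details.
Launching the application
Select the configuration you need and create an instance. Once the instance has been created, launch the application.
The overall interface layout is similar to that of the text-to-image application.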
Performance test
The tests were run on an RTX3090 and an RTX4090.
RTX3090:
Prompt:'Create a short video of a peaceful park scene during the golden hour. The sun is setting behind large, lush trees. The camera slowly pans through the park, capturing people walking, jogging, and sitting on benches. Birds are chirping, and there's a gentle breeze rustling through the leaves. The atmosphere is calm, serene, and warm, with soft golden light filtering through the branches.'
Negative Prompt:'Avoid any dark or eerie elements, such as stormy weather, gloomy skies, or ominous shadows. Do not include any loud or chaotic activities, like running or aggressive movements. The scene should remain calm and pleasant without any distractions, such as animals or people involved in unsettling behavior.'
Parameters: defaults

RTX4090:
Prompt:'Create a lively street market scene during the daytime. The market is busy with people walking around, vendors selling fresh produce, flowers, and handmade goods. There's colorful signage, and the air is filled with the sounds of lively chatter, distant music, and the rustle of fabric. The sunlight is bright and warm, creating a vibrant atmosphere. People are smiling, interacting, and enjoying the lively energy of the market.'
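To make the two test cases above easier to rerun, their inputs can be recorded as structured data. The sketch below is only an illustration: the class name `GenerationCase`, its fields, and the `to_json` helper are hypothetical, and the prompts are shortened to their first sentence. It stores inputs only, with no timings or results.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class GenerationCase:
    """Inputs for one text-to-video test run (hypothetical structure)."""
    gpu: str
    prompt: str
    negative_prompt: Optional[str] = None  # None: no negative prompt was given


# Prompts shortened to their first sentence; see the full text above.
CASES = [
    GenerationCase(
        gpu="RTX3090",
        prompt="Create a short video of a peaceful park scene during the golden hour.",
        negative_prompt=(
            "Avoid any dark or eerie elements, such as stormy weather, "
            "gloomy skies, or ominous shadows."
        ),
    ),
    GenerationCase(
        gpu="RTX4090",
        prompt="Create a lively street market scene during the daytime.",
    ),
]


def to_json(cases: List[GenerationCase]) -> str:
    """Serialize the cases so they can be saved next to the generated videos."""
    return json.dumps([asdict(case) for case in cases], indent=2)
```

Saving the output of `to_json(CASES)` alongside each generated video keeps a record of which prompt produced which result.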



