EPO Patent Application: Digital Human Video Generation using AI

ChangeBridge: EPO Bulletin - AI & Computing (G06N)

Published March 18th, 2026

Detected March 24th, 2026

Summary

The European Patent Office has published a patent application by Beijing Baidu Netcom Science Technology Co., Ltd. for a method of generating digital human videos using large AI models. The technology relates to artificial intelligence, deep learning, and computer vision, with potential applications in video livestreaming, advertisement production, and e-commerce.

What changed

This document is a patent bulletin notice for EP4632742A3, an application by Beijing Baidu Netcom Science Technology Co., Ltd. The kind code A3 indicates a separately published European search report rather than the first publication of the application. The application covers a method and apparatus for generating digital human videos based on a large model, together with an intelligent agent, an electronic device, and a storage medium. The method acquires requirement information, processes it with a first large model to generate a target script, and then uses a second large model to process that script together with an action video segment, producing a video of a digital human delivering the scripted speech while performing the specified action.

This notice concerns a patent application and imposes no new regulatory obligations or compliance requirements on regulated entities. It is informational only. Compliance officers who track the AI and computer vision landscape may wish to note it as a development in that field.

Source document (simplified)

METHOD AND APPARATUS FOR GENERATING DIGITAL HUMAN VIDEO BASED ON LARGE MODEL, INTELLIGENT AGENT, ELECTRONIC DEVICE, AND STORAGE MEDIUM

Search Report: EP4632742A3
Kind: A3
Date: Mar 18, 2026

Applicants

Beijing Baidu Netcom Science Technology Co., Ltd.

Inventors

WU, Tian; WANG, Haifeng; TIAN, Hao; WU, Wenquan; DAI, Dai; LIU, Simei; WANG, Li; ZHOU, Hang; GAO, Cong; XIE, Qunyi; HAO, Qingchang

Abstract

The present disclosure provides a method and apparatus for generating a digital human video based on a large model, an intelligent agent, an electronic device, and a storage medium, which relate to a field of artificial intelligence technologies, in particular to technical fields such as deep learning, large models, computer vision and so on, and may be applied to scenarios such as video livestreaming, advertisement production, and e-commerce sales. The method for generating a digital human video based on a large model includes: acquiring a requirement information, where the requirement information includes an action description information for describing a specified action video segment, and the action video segment represents a specified action of a target object; processing the requirement information using a first large model to obtain a target script, where the target script includes a target speech segment text matching the action description information; and processing the target script and the action video segment using a second large model to obtain a target video for displaying a target digital human performing a speech delivery based on the target speech segment text while performing the specified action.
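
The two-stage flow described in the abstract can be sketched as follows. This is only an illustrative outline under stated assumptions, not the applicant's implementation: every name here (`Requirement`, `TargetScript`, `TargetVideo`, `generate_digital_human_video`, and the two stub models) is hypothetical, and the stubs merely stand in for the first and second large models, whose actual interfaces the notice does not describe.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Requirement:
    """Requirement information; carries the action description (hypothetical type)."""
    action_description: str


@dataclass
class TargetScript:
    """Script produced by the first model (hypothetical type)."""
    speech_segment_text: str


@dataclass
class TargetVideo:
    """Stand-in for the generated digital human video (hypothetical type)."""
    speech_segment_text: str
    action_video_segment: bytes


# Assumed call shapes for the two models; the notice does not define them.
ScriptModel = Callable[[Requirement], TargetScript]
VideoModel = Callable[[TargetScript, bytes], TargetVideo]


def generate_digital_human_video(
    requirement: Requirement,
    action_video_segment: bytes,
    script_model: ScriptModel,
    video_model: VideoModel,
) -> TargetVideo:
    # Stage 1: the first model turns the requirement information into a
    # target script whose speech text matches the action description.
    script = script_model(requirement)
    # Stage 2: the second model combines the script with the action video
    # segment to produce the target video.
    return video_model(script, action_video_segment)


def stub_script_model(requirement: Requirement) -> TargetScript:
    # Placeholder: a real system would call a text-generation model here.
    return TargetScript(f"[speech matching: {requirement.action_description}]")


def stub_video_model(script: TargetScript, segment: bytes) -> TargetVideo:
    # Placeholder: a real system would synthesize video frames and audio here.
    return TargetVideo(script.speech_segment_text, segment)


if __name__ == "__main__":
    video = generate_digital_human_video(
        Requirement("holding up a product"),
        b"raw-segment-bytes",
        stub_script_model,
        stub_video_model,
    )
    print(video.speech_segment_text)
```

Passing the models in as callables keeps the two stages separable, which mirrors the abstract's distinction between a first and a second large model without assuming anything about how either is built.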

IPC Classifications

G11B 27/031 20060101AFI20260212BHEP
G06N 3/0475 20230101ALI20260212BHEP
G06V 10/82 20220101ALI20260212BHEP
G10L 13/08 20130101ALI20260212BHEP
H04N 5/222 20060101ALI20260212BHEP
H04N 21/00 20110101ALI20260212BHEP

Designated States

AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LI, LT, LU, LV, MC, ME, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TR
