
OpenAI Commentates the LoL S13 Pro Tournament

AndroidRain | 2023-11-13 22:52

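With OpenAI's first developer conference now held, OpenAI has added or updated the following capabilities for developers:

  1. Vision: GPT-4 can now understand images;

  2. Text to speech: converting text into spoken audio.

Combined with GPT's existing chat Completion capability, this is enough to build a video commentary feature.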
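The three stages chain together as a small pipeline. The sketch below is only an outline under assumptions: `commentate` and its three callable parameters are hypothetical names standing in for the frame-reading, script-generation, and text-to-speech steps covered in the numbered sections of this post.

```python
from typing import Callable, List


def commentate(
    video_path: str,
    extract_frames: Callable[[str], List[str]],
    write_script: Callable[[List[str]], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run the three stages in order and return the encoded audio."""
    frames = extract_frames(video_path)  # step 1: video -> base64 JPEG frames
    script = write_script(frames)        # step 2: frames -> commentary text
    return synthesize(script)            # step 3: text -> speech audio


# Stub stages so the outline runs without any API calls.
audio = commentate(
    "data/lol.mp4",
    extract_frames=lambda path: ["frame0", "frame1"],
    write_script=lambda frames: f"{len(frames)} frames described",
    synthesize=lambda text: text.encode("utf-8"),
)
```

Passing the stages in as functions keeps each one replaceable, for example swapping the stubs for the real OpenCV and OpenAI calls.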

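As the example, we take Game 1 of the T1 vs LNG quarterfinal from the League of Legends S13 pro tournament, cut a 30 s clip from it, and use GPT to add commentary. Let's look at the finished product first. The original clip is 30 s long and the generated commentary audio is 128 s, so when compositing we sped the commentary audio up to 1.2x and slowed the video down.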
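To make the timing concrete: 128 s of audio played at 1.2x lasts about 106.7 s, so a 30 s clip has to play at roughly 0.28x speed to cover it. The sketch below works that out and builds (without running) an ffmpeg command for the retiming. Using ffmpeg is an assumption here, since the post does not say which tool was used for compositing, and the function names and the `commentary.mp4` output path are hypothetical.

```python
from typing import List


def video_speed_for_sync(video_s: float, audio_s: float, audio_speed: float) -> float:
    """Playback-speed factor that makes the video last as long as the sped-up audio."""
    if min(video_s, audio_s, audio_speed) <= 0:
        raise ValueError("durations and speed must be positive")
    return video_s * audio_speed / audio_s


def ffmpeg_args(video: str, audio: str, out: str,
                video_speed: float, audio_speed: float) -> List[str]:
    """Build an ffmpeg command that retimes both streams and muxes them."""
    filters = (
        f"[0:v]setpts=PTS/{video_speed:.5f}[v];"  # speed < 1 stretches the video
        f"[1:a]atempo={audio_speed}[a]"           # speed > 1 shortens the audio
    )
    return ["ffmpeg", "-i", video, "-i", audio,
            "-filter_complex", filters,
            "-map", "[v]", "-map", "[a]", out]


speed = video_speed_for_sync(video_s=30, audio_s=128, audio_speed=1.2)
cmd = ffmpeg_args("data/lol.mp4", "nova.mp3", "commentary.mp4", speed, 1.2)
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` on a machine with ffmpeg installed.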

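The commentary is still far from professional casting, but judging by the final result, most of what GPT says is reliable. Next, let's look at the code needed to implement this.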

0. Preparation

The code is written in Python. Install the OpenAI and OpenCV libraries beforehand:

pip install openai
pip install opencv-python

1. Reading video frames

This step uses the OpenCV library to read the video frames and store them.

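import base64
import cv2

video = cv2.VideoCapture("data/lol.mp4")

base64Frames = []

while video.isOpened():
    success, frame = video.read()
    if not success:
        break  # no more frames to read
    _, buffer = cv2.imencode(".jpg", frame)  # encode the frame as JPEG
    base64Frames.append(base64.b64encode(buffer).decode("utf-8"))

video.release()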



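2. Generating the commentary script

This step uses OpenAI's chat Completion capability, passing the prompt and the model as parameters.

The model is gpt-4-vision-preview, an image-understanding model based on GPT-4.

The prompt includes the images extracted in the previous step. There is no need to pass every frame: sending fewer frames helps stay within OpenAI's API rate limits and also reduces token usage. Here we sample 1 frame per second, which means taking every 60th frame (this assumes a 60 fps source video). The corresponding code is base64Frames[0::60].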
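The step of 60 is tied to the clip's frame rate. As a minimal sketch, the step can be derived from the frame rate instead of being hardcoded. The helper name `sample_frames` is hypothetical, and with OpenCV the `fps` value could be read via `video.get(cv2.CAP_PROP_FPS)`.

```python
from typing import List, Sequence


def sample_frames(frames: Sequence[str], fps: float, every_s: float = 1.0) -> List[str]:
    """Keep roughly one frame per `every_s` seconds of video."""
    step = max(1, round(fps * every_s))  # never let the slice step drop to 0
    return list(frames[::step])


frames = [f"frame{i}" for i in range(1800)]  # stand-in for 30 s at 60 fps
picked = sample_frames(frames, fps=60)
```

For a 60 fps clip this gives the same result as `base64Frames[0::60]`, while a 30 fps clip would automatically use a step of 30.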

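In addition, we pass in a brief introduction to the match, including the two teams and their players, as well as which aspects the commentary should cover, so that GPT's recognition is more accurate.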
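For other matches, the context text can be assembled from structured data rather than written by hand. This is a sketch only: `describe_team`, `build_prompt`, and the wording they produce are hypothetical, not the prompt used in this post.

```python
from typing import Dict

ROLES = ("top", "jungle", "mid", "bot", "support")


def describe_team(name: str, roster: Dict[str, str]) -> str:
    """Render one team's lineup as a single sentence."""
    lineup = ", ".join(f"{roster[role]} ({role})" for role in ROLES)
    return f"{name}: {lineup}."


def build_prompt(teams: Dict[str, Dict[str, str]], interval_s: int, focus: str) -> str:
    """Combine match context, lineups, and the requested commentary focus."""
    names = " and ".join(teams)
    lines = [
        f"These are frames from a League of Legends match between {names}, "
        f"spaced {interval_s} second(s) apart.",
        *(describe_team(name, roster) for name, roster in teams.items()),
        f"As a game commentator, write a live broadcast script covering {focus}.",
    ]
    return "\n".join(lines)


teams = {
    "T1": {"top": "Zeus", "jungle": "Oner", "mid": "Faker",
           "bot": "Gumayusi", "support": "Keria"},
    "LNG": {"top": "Zika", "jungle": "Tarzan", "mid": "Scout",
            "bot": "Gala", "support": "Hang"},
}
prompt = build_prompt(teams, 1, "what the players are doing and who initiates fights")
```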

import os
from openai import OpenAI

os.environ["OPENAI_API_KEY"] = "your openai key"
client = OpenAI()

PROMPT_MESSAGES = [
    {
        "role": "user",
        "content": [
            # Chinese version of the same prompt (unused):
            # "这是英雄联盟游戏的直播比赛视频,每个图片间隔1s,这是两支队伍T1和LNG的比赛。T1上单宙斯,打野oner,中单faker,下路gumayusi,辅助keria。LNG上单zika,打野tarzan,中单scout,下路gala,辅助hang。tarzan在这波团战中表现很差。作为游戏解说,写出游戏直播脚本,描述游戏正在进行的事情,选手正在干什么,谁开启团战以及技能释放情况,以及队伍优劣势的分析",
            "This is a live broadcast video of a League of Legends game, with each image spaced 1 second apart. It features a match between two teams, T1 and LNG. For T1, Zeus is in the top lane, oner as the jungler, Faker in the mid lane, Gumayusi in the bot lane, and Keria as the support. For LNG, Zika is in the top lane, Tarzan as the jungler, Scout in the mid lane, Gala in the bot lane, and Hang as the support. Tarzan's performance in this team fight was poor. As a game commentator, write a live broadcast script describing what is happening in the game, what the players are doing, who initiates the team fight and the skill release situation, as well as an analysis of the team's advantages and disadvantages.",
            # One frame per second, each passed as a base64 image
            *map(lambda x: {"image": x, "resize": 768}, base64Frames[0::60]),
        ],
    },
]
params = {
    "model": "gpt-4-vision-preview",
    "messages": PROMPT_MESSAGES,
    "max_tokens": 500,
}

result = client.chat.completions.create(**params)
print(result.choices[0].message.content)

3. Generating the audio


Welcome back to an intense match between T1 and LNG, and the tension on the Rift is palpable! Both teams are neck and neck with the gold at 12.8k, but wait—T1 is making a bold move at the Rift Herald.
Zeus is holding steady in the top lane, and it looks like they're trusting oner to secure the objective with Faker supporting just a few paces back. Gumayusi and Keria are not yet at the scene, but they could join swiftly if things get heated.
Hold on—Tarzan is looking to contest! But there seems to be hesitation; the rest of LNG are not fully positioned to back him up. Zika, Scout, Gala, and Hang are scattered, and Tarzan needs to be careful not to overcommit.
And there it is, T1 initiates the skirmish on their terms, collaring Tarzan who seems to be caught out. T1 pounces with impeccable timing and skill coordination. Faker, living up to the legend, orchestrates a masterful play and it looks like—yes—LNG's Tarzan is down!
This is a decisive moment for T1; they've secured the advantage, racking up not just the kill but also gaining control over the Herald. LNG must now regroup and reassess their positioning and communication. Tarzan's performance in that engage was indeed suboptimal, possibly due to miscommunication or a misread on the enemy's positioning.
T1 is showing the power of teamwork and presence on the map. As the dust settles, T1 emerges with not just the Rift Herald but a clear message: they are in it to win it, and any slip-up from LNG will be exploited to its fullest!
Stay tuned as we continue to break down this match and see if LNG can bounce back from this unfavorable exchange. It's all about the macro play, vision control, and those split-second decision-making skills that separate the good from the great in League of Legends.




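We feed the text above into the following program to generate the corresponding speech. Because OpenAI's TTS handles English better, and the Chinese speech it generates is rather lacking in emotion, we use the English text here.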

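from pathlib import Path
from openai import OpenAI

client = OpenAI()

script = result.choices[0].message.content  # the commentary text shown above
speech_file_path = Path("./nova.mp3")
response = client.audio.speech.create(
    model="tts-1",  # assumption: OpenAI's standard TTS model
    voice="nova",   # assumption: voice inferred from the output file name
    input=script,
)
response.stream_to_file(speech_file_path)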
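The script in this post is short enough for one request, but a longer one may exceed what a single text-to-speech call accepts. The sketch below splits text at sentence boundaries. The helper name `split_for_tts` is hypothetical, and the 4096-character default is an assumption about the API's input limit that should be checked against the current documentation.

```python
import re
from typing import List


def split_for_tts(text: str, limit: int = 4096) -> List[str]:
    """Greedily pack whole sentences into chunks of at most `limit` characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # A single over-long sentence is cut hard so no chunk exceeds the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


parts = split_for_tts("T1 moves to the Herald. Tarzan contests! LNG is late.", limit=40)
```

Each chunk could then be sent to the speech endpoint separately and the resulting audio files concatenated.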



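The challenge here was a 30 s clip of a competitive game. Judging by the result, on a 10-point scale I would give it a 7: a pass that slightly exceeded my expectations.

Recognizing the content of a single image is even less of a problem for OpenAI. We can take a photo and learn about everything in it. Combined with AR, that would be a scene straight out of science fiction. I look forward to the next step in AI's development.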
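For the single-image case, a request payload might look like the sketch below. It builds a chat message holding one question and one JPEG encoded as a base64 data URL. The helper name `image_question` is hypothetical, and the three bytes used in the demo are just a JPEG header stand-in, not a real photo.

```python
import base64
from typing import Dict, List


def image_question(image_bytes: bytes, question: str) -> List[Dict]:
    """Build a one-message chat payload asking `question` about one JPEG image."""
    b64 = base64.b64encode(image_bytes).decode("utf-8")
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }]


messages = image_question(b"\xff\xd8\xff", "What is in this photo?")
```

The returned list is meant to be passed as the `messages` argument of `client.chat.completions.create`, as in step 2.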
