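Series Contents
In this series the author walks you through building a small level-clearing game driven by reinforcement learning, and considers several approaches for getting the character through the level.
[Reinforcement Learning] Building a Game-Clearing AI Step by Step (1): Implementing the Game Interface
Game Interface Overview
In this game the interface is abstracted into a grid of colored cells. The goal is for the AI to learn an optimal path from the start cell to the target cell.
- Red: the player
- Green: the target
- Purple: obstacles (monsters)
- Gray: walls
Interface Code
game.py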
import pygame
import sys
import time

# Board layout. Cells are numbered row by row: cell = 10 * row + col.
monster = [33, 37]       # monster cells
start_position = 97      # player's starting cell
step = 50                # side length of one cell, in pixels
target = 4               # goal cell
# Wall cells. 107 lies below the visible 10x10 board, directly under the start cell.
wall = [107, 0, 1, 2, 3, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 19, 29, 39, 49, 59, 69, 79, 89, 99, 91, 92, 93, 94, 95, 96, 98]
kill_wait_time = 2       # seconds the player must stand still before a kill
is_killed = False        # module-level flag; nothing in this file sets it to True


class Mygame():
    def __init__(self):
        super(Mygame, self).__init__()
        # Labels follow the integer actions handled in step():
        # 0 = up (-10), 1 = down (+10), 2 = right (+1), 3 = left (-1).
        self.action_space = ['u', 'd', 'r', 'l']
        self.n_actions = len(self.action_space)
        self.size = width, height = 500, 500
        self.screen = pygame.display.set_mode(self.size)
        self.background_color = (255, 255, 255)
        self.person = start_position
        self.monster = [33, 37]
        self.draw_map()

    def draw_map(self):
        """
        Draw the board from the current player position (self.person)
        and the module-level target, wall and monster cells.
        """
        rect = [0] * 110
        self.screen.fill(self.background_color)
        for i in range(10):
            for j in range(10):
                curr_rect = 10 * i + j
                # border 0 fills the cell, border 1 draws only its outline
                if target == curr_rect:
                    color, border = (0, 228, 0), 0        # green: target
                elif curr_rect in wall:
                    color, border = (192, 192, 192), 0    # gray: wall
                elif curr_rect == self.person:
                    color, border = (192, 0, 0), 0        # red: player
                elif curr_rect in monster:
                    color, border = (138, 43, 226), 0     # purple: monster
                else:
                    color, border = (255, 228, 181), 1    # empty cell
                # `step` here is the module-level cell size, not the step() method
                rect[curr_rect] = pygame.draw.rect(self.screen, color, ((j * step, i * step), (step, step)), width=border)
        pygame.display.update()

    def judge_win(self):
        """
        The player wins on reaching the target while is_killed is True.
        As listed, is_killed is never set, so this always returns False.
        :return: True or False
        """
        return self.person == target and is_killed

    def reset(self):
        self.person = start_position
        self.monster = [33, 37]
        print(self.person)
        return self.person, self.monster

    def judge_collision(self, temp_person):
        """Return True if the candidate cell is a wall, a monster, or off the board."""
        return temp_person in wall or temp_person in monster or temp_person < 0

    @staticmethod
    def kill_monster(person, monster, time):
        """
        The player can kill the monsters only after standing still for
        kill_wait_time seconds with the monsters inside the attack range.
        The attack range spans 7 columns (3 to each side of the player)
        over 4 rows (two rows above through one row below).
        The original signature has no `self`, so it is marked as a static method here.
        :param person: current cell of the player
        :param monster: current cells of the two monsters
        :param time: seconds the player has been standing still
        :return: the monster cells, or [-1, -1] if the monsters were killed
        """
        if time < kill_wait_time:
            return monster
        # Same cells as the original hand-written list, generated by row and column offset
        attack_area = [person + 10 * dr + dc for dr in (-2, -1, 0, 1) for dc in range(-3, 4)]
        # Both monsters have to be in range at the same time
        if monster[0] in attack_area and monster[1] in attack_area:
            print("kill monster!")
            return [-1, -1]
        else:
            return monster

    def step(self, action):
        s = self.person
        # Cell offset for each integer action: 0 up, 1 down, 2 right, 3 left
        moves = {0: -10, 1: 10, 2: 1, 3: -1}
        if action not in moves:
            raise ValueError("action must be 0, 1, 2 or 3")
        s_ = s + moves[action]
        if self.judge_collision(s_):
            s_ = s  # blocked: stay in the current cell
        # Note: self.person is not updated in this method.
        if s_ == target:
            reward = 1
            done = True
            s_ = 'terminal'
            print("Level cleared, well done!")
        elif s_ in monster:
            # Not reached as listed: judge_collision already blocks monster cells
            reward = -1
            done = True
            s_ = 'terminal'
            print("Fight with monster!")
        else:
            reward = 0
            done = False
        return s_, reward, done
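Code Walkthrough
The code defines a game class that holds the action set, the window size, the cell colors, and related parameters.
The step function
- step function: updates the state and returns the reward. After an action is chosen, it first checks whether the move would cause a collision; the player moves only if it would not. If the next state s_ is the target (green cell), reward = 1. If the next state s_ is an obstacle, reward = -1. In every other state reward = 0. In the code as listed, judge_collision also treats monster cells as blocked, so the player never enters one and the -1 branch is not triggered.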
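To make the transition rule easier to try out without opening a pygame window, here is a minimal sketch of the same logic as a plain function. The name `transition` and the upper-case constants are hypothetical helpers introduced for illustration; the cell numbers and offsets are copied from game.py.

```python
# Layout constants copied from game.py (upper-case names are illustrative)
WALL = {107, 0, 1, 2, 3, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90,
        19, 29, 39, 49, 59, 69, 79, 89, 99, 91, 92, 93, 94, 95, 96, 98}
MONSTER = {33, 37}
TARGET = 4

# Cell offsets used by step(): action 0 -> -10, 1 -> +10, 2 -> +1, 3 -> -1
DELTAS = {0: -10, 1: 10, 2: 1, 3: -1}


def transition(state, action):
    """Hypothetical pygame-free helper that mirrors Mygame.step."""
    candidate = state + DELTAS[action]
    blocked = candidate in WALL or candidate in MONSTER or candidate < 0
    next_state = state if blocked else candidate
    if next_state == TARGET:
        return "terminal", 1, True
    if next_state in MONSTER:
        # Kept for parity with step(); not reached because monsters block movement
        return "terminal", -1, True
    return next_state, 0, False


if __name__ == "__main__":
    print(transition(97, 0))  # move up from the start cell
    print(transition(97, 1))  # move down: blocked by wall cell 107
```

Because the function is pure, a training loop can call it many times without redrawing the board.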
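The draw_map function
draw_map draws the interface from the current positions of the player, the walls, the monsters, and the target. Each cell on the board corresponds to a number, as shown in the figure. The rectangle list is initialized with 110 entries, but because the board is 10*10, cells numbered 100 and above are never displayed. Cell 107 is marked as a wall so that the player cannot move down from the starting position (97 + 10 = 107).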
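The numbering can be checked with a few lines of arithmetic. The sketch below assumes the same layout as draw_map (10 cells per row, 50-pixel cells); the helper names `cell_to_row_col` and `cell_to_rect` are hypothetical and do not appear in game.py.

```python
STEP = 50   # cell size in pixels, same value as `step` in game.py
COLS = 10   # cells per row


def cell_to_row_col(cell):
    """Split a cell number into (row, col), the inverse of 10 * row + col."""
    return divmod(cell, COLS)


def cell_to_rect(cell):
    """Return ((left, top), (width, height)) in the form passed to pygame.draw.rect."""
    row, col = cell_to_row_col(cell)
    return ((col * STEP, row * STEP), (STEP, STEP))


if __name__ == "__main__":
    for cell in (4, 97, 107):
        print(cell, cell_to_row_col(cell), cell_to_rect(cell))
```

Cell 107 maps to row 10, whose top edge sits at y = 500, just past the bottom of the 500-pixel window. That is why it can act as a wall without ever being drawn.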
The code above defines one list that records the wall positions and one list that records the monster positions:
wall = [107, 0, 1, 2, 3, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 19, 29, 39, 49, 59, 69, 79, 89, 99, 91, 92, 93, 94, 95, 96, 98]
monster = [33, 37]
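A quick way to sanity-check these two lists is to print the board as text. This is only an illustrative sketch: `render_ascii` and the symbols are hypothetical, while the precedence of the checks (target, wall, player, monster) follows draw_map.

```python
WALL = {107, 0, 1, 2, 3, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90,
        19, 29, 39, 49, 59, 69, 79, 89, 99, 91, 92, 93, 94, 95, 96, 98}
MONSTER = {33, 37}
TARGET = 4
START = 97


def render_ascii(person=START):
    """Return the 10x10 board as text: G target, # wall, P player, M monster, . empty."""
    rows = []
    for i in range(10):
        row = ""
        for j in range(10):
            cell = 10 * i + j
            if cell == TARGET:
                row += "G"
            elif cell in WALL:
                row += "#"
            elif cell == person:
                row += "P"
            elif cell in MONSTER:
                row += "M"
            else:
                row += "."
        rows.append(row)
    return "\n".join(rows)


if __name__ == "__main__":
    print(render_ascii())
```

With the player on the start cell, the top row prints as `####G#####`, the monster row as `#..M...M.#`, and the bottom row as `#######P##`. The walls form a closed border whose only openings are the target and the start cell.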
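Summary
This article covered the pygame-based implementation of the game interface. The next article introduces the Q-learning reinforcement learning algorithm and uses it to let the AI clear the level automatically.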