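This time the exploration uses a heatmap. Unlike the usual heatmaps of event distribution over a whole match, this one is built from danger passes: here, every pass made in the 15 seconds before a shot counts as a danger pass.
The data-loading part at the start is largely the same as in the previous post.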
import pandas as pd
import matplotlib.pyplot as plt
from mplsoccer import Pitch, Sbopen, VerticalPitch

parser = Sbopen()
matches = parser.match(competition_id=2, season_id=44)
matches_events = []
for match_id in matches['match_id']:
    data = parser.event(match_id)
    # parser.event returns several dataframes; the first one holds the events
    matches_events.append(data[0])
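Collect Arsenal's shots and completed passes from every match, excluding passes taken from set pieces, then find the passes made in the 15 seconds before a shot. The timing has to account for the two halves and for stoppage time, so each half is processed separately and the start of a window is never allowed to fall before the start of that half.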
danger_passes = pd.DataFrame()
for events in matches_events:
    # Process the first and second half separately
    for period in [1, 2]:
        # Keep only completed Arsenal passes that are not set pieces
        pass_filter = (
            (events.team_name == 'Arsenal')
            & (events.type_name == "Pass")
            & (events.outcome_name.isnull())
            & (events.period == period)
            & (events.sub_type_name.isnull())
        )
        passes = events.loc[
            pass_filter,
            [
                "x", "y", "end_x", "end_y",
                "minute", "second", "player_name",
            ],
        ]
        # Keep only Arsenal's shots
        shot_filter = (
            (events.team_name == 'Arsenal')
            & (events.type_name == "Shot")
            & (events.period == period)
        )
        shots = events.loc[shot_filter, ["minute", "second"]]
        # Shot times in seconds
        shot_times = shots['minute'] * 60 + shots['second']
        # The window opens 15 seconds before each shot
        shot_anchor = 15
        shot_start = shot_times - shot_anchor
        # Clamp the window start to the start of the half (in seconds)
        period_start = (period - 1) * 45 * 60
        shot_start = shot_start.clip(lower=period_start)
        # Pass times in seconds
        pass_times = passes['minute'] * 60 + passes['second']
        # A pass qualifies if it falls inside the window of any shot
        pass_to_shot = pass_times.apply(
            lambda x: ((shot_start < x) & (x < shot_times)).any())
        # Keep only the danger passes
        danger_passes_period = passes.loc[pass_to_shot]
        danger_passes = pd.concat(
            [danger_passes, danger_passes_period],
            ignore_index=True
        )
danger_passes.head()
| | x | y | end_x | end_y | minute | second | player_name |
|---|---|---|---|---|---|---|---|
| 0 | 52.6 | 54.3 | 54.6 | 45.0 | 8 | 4 | Patrick Vieira |
| 1 | 69.9 | 28.5 | 102.0 | 26.9 | 8 | 8 | Eduardo César Daude Gaspar |
| 2 | 102.0 | 26.9 | 104.7 | 37.6 | 8 | 10 | Ashley Cole |
| 3 | 60.0 | 45.6 | 68.9 | 33.9 | 12 | 7 | Robert Pirès |
| 4 | 68.5 | 33.9 | 79.1 | 31.1 | 12 | 9 | Eduardo César Daude Gaspar |
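To make the window logic easier to follow, here is a minimal sketch on toy data. The times are made up for illustration (seconds from kick-off, first half only), and `window` and `period_start` are names introduced just for this example.

```python
import pandas as pd

# Hypothetical event times in seconds from kick-off (first half only)
pass_times = pd.Series([4, 100, 110, 118, 300, 390])
shot_times = pd.Series([10, 120, 400])

window = 15       # seconds before each shot
period_start = 0  # would be 45 * 60 for the second half

# Window start per shot, never earlier than the start of the half
shot_start = (shot_times - window).clip(lower=period_start)

# A pass is a danger pass if it falls strictly inside any shot's window
is_danger = pass_times.apply(
    lambda t: ((shot_start < t) & (t < shot_times)).any()
)
danger = pass_times[is_danger].tolist()
print(danger)
```

The shot at 10 seconds shows why the clamp matters: its window would otherwise start at -5 seconds.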
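This time the pitch is divided into a 6 × 5 grid. The five zones across the width correspond to the center, the two half-spaces, and the two wings, and the six zones along the length are a division often seen on Guardiola's tactics board.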
pitch = Pitch(line_zorder=2, line_color='grey')
fig, ax = pitch.grid(
    grid_height=0.9, title_height=0.06, axis=False,
    endnote_height=0.04, title_space=0, endnote_space=0
)
# Count danger passes by starting location in a 6 x 5 grid
bin_statistic = pitch.bin_statistic(
    danger_passes.x, danger_passes.y,
    statistic='count', bins=(6, 5), normalize=False,
)
# Draw the heatmap
pcm = pitch.heatmap(
    bin_statistic, cmap='Reds', edgecolor='black', ax=ax['pitch']
)
# Draw the colorbar legend
ax_cbar = fig.add_axes((1, 0.093, 0.03, 0.786))
cbar = plt.colorbar(pcm, cax=ax_cbar)
fig.suptitle('Danger passes by Arsenal', fontsize=30)
plt.show()
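The counting behind the heatmap can be illustrated with plain NumPy. This is only a sketch of the idea, not how mplsoccer implements `bin_statistic`: the points are hypothetical, the 120 × 80 pitch size assumes StatsBomb coordinates, and the array layout mplsoccer returns may differ from the one here.

```python
import numpy as np

# Hypothetical pass start locations on an assumed 120 x 80 pitch
x = np.array([10.0, 30.0, 65.0, 65.0, 110.0, 119.0])
y = np.array([5.0, 40.0, 40.0, 41.0, 70.0, 79.0])

# 6 equal zones along the length, 5 across the width
x_edges = np.linspace(0, 120, 7)
y_edges = np.linspace(0, 80, 6)

# counts[i, j] is the number of passes starting in
# length zone i and width zone j
counts, _, _ = np.histogram2d(x, y, bins=(x_edges, y_edges))
print(counts)
```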
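For a clearer picture, plot how many danger passes each player made.
Taking a shortcut this time: the player names are not combined with nicknames.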
passes_count = danger_passes.groupby(['player_name']).x.count()
ax = passes_count.plot.barh()
ax.set_xlabel('Number of danger passes')
ax.set_ylabel('')
plt.show()
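If shorter labels are wanted after all, one possible approach is to rename the index of the counts before plotting. The sketch below uses made-up rows and a hand-written `nicknames` mapping purely for illustration; in practice the mapping would have to come from the lineup data.

```python
import pandas as pd

# Hypothetical danger passes; only player_name matters here
toy = pd.DataFrame({
    "player_name": [
        "Patrick Vieira", "Eduardo César Daude Gaspar", "Ashley Cole",
        "Robert Pirès", "Eduardo César Daude Gaspar", "Patrick Vieira",
        "Patrick Vieira",
    ],
    "x": [52.6, 69.9, 102.0, 60.0, 68.5, 50.0, 55.0],
})

# Hand-written mapping for illustration only
nicknames = {"Eduardo César Daude Gaspar": "Edu"}

counts = toy.groupby("player_name").x.count()
# Replace full names with nicknames where known, then sort ascending
# so the most involved player ends up at the top of a barh chart
counts = counts.rename(index=nicknames).sort_values()
print(counts)
```

The resulting series can be passed to `plot.barh()` in the same way as `passes_count`.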