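Overview
No Measurement, No Improvement
This is not the standard set of quantitative-trading metrics. Partly that is because I am not very familiar with the standard set, and partly because I suspect a set I work out myself will be more practical. I mainly look at the following aspects: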
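- 1 Number of trades
- 2 Profitability: mean margin per order
- 3 Holding period of a trade
- 4 Maximum number of stocks held
- 5 Winning and losing: win rate and real win rate
- 6 Profit/loss ratio
- 7 Maximum drawdown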
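In general:
- I want the algorithm to produce enough trades, but not too many: enough to be statistically meaningful without drifting toward high-frequency trading;
- The holding period affects how much capital is tied up and how efficiently it is used;
- The maximum number of stocks held can make the portfolio as a whole more stable, but it also affects how much capital is tied up;
- Given how the signals are generated, a lot of white noise is bound to be mixed in, so I distinguish the win rate from the real win rate;
- The profit/loss ratio tests whether the algorithm can survive;
- The maximum drawdown measures investment risk. Drawdowns are unavoidable; what matters is whether they are acceptable and worth it.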
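To make the difference between the win rate, the real win rate and the profit/loss ratio concrete, here is a minimal sketch on a few made-up closed trades. The data are invented for illustration, and the 10% cutoff for a "real" win is an assumption that mirrors the name of the `target_gt0.1` column used in the appendix.

```python
import pandas as pd

# Hypothetical closed trades; `margin` is net profit divided by the buy amount.
trades = pd.DataFrame({
    "net_profit": [120.0, -40.0, 15.0, -60.0, 300.0],
    "margin": [0.12, -0.04, 0.015, -0.06, 0.30],
})

REAL_WIN_THRESHOLD = 0.10  # assumed cutoff for a "real" win

is_win = trades["net_profit"] > 0                    # any positive net profit
is_real_win = trades["margin"] >= REAL_WIN_THRESHOLD  # profit large enough not to be noise

win_rate = is_win.mean()
real_win_rate = is_real_win.mean()

# Profit/loss ratio: total profit of winning trades over total loss of the rest.
profit_loss_ratio = (
    trades.loc[is_win, "net_profit"].sum()
    / abs(trades.loc[~is_win, "net_profit"].sum())
)

print(win_rate, real_win_rate, profit_loss_ratio)
```

The small 1.5% winner counts toward the win rate but not toward the real win rate, which is the kind of noise-level result the second measure is meant to exclude.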
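Content
A signal mechanism produces candidate trades, a model identifies the high-quality ones, and a strategy makes the final adjustments.
Suppose a signal mechanism (buy and sell signals) can generate a number of candidate trades for each stock. We want to evaluate these candidates, with the ability to:
- 1 filter trades by signal
- 2 filter by stock
- 3 filter by trading period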
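The three filters above can be sketched as independent boolean masks that are combined with a logical AND. This is only an illustration: the toy rows and the signal labels `s1` and `s2` are made up, `filter_candidates` is a hypothetical helper, and the timestamps are assumed to be `YYYY-MM-DD HH:MM:SS` strings so that string comparison orders them correctly.

```python
import pandas as pd

# Hypothetical candidate trades; column names mirror the ones used in this post.
candidates = pd.DataFrame({
    "signal_tuple": ["s1", "s1", "s2", "s1"],
    "code": ["600010", "600031", "600010", "600010"],
    "buy_dt": ["2018-03-01 10:00:00", "2019-05-06 13:30:00",
               "2020-01-02 09:45:00", "2023-02-01 09:31:00"],
})


def filter_candidates(df, signal=None, codes=None, start=None, end=None):
    """Keep rows matching every filter that is supplied (None means no filter)."""
    mask = pd.Series(True, index=df.index)
    if signal is not None:
        mask &= df["signal_tuple"] == signal
    if codes is not None:
        mask &= df["code"].isin(codes)
    if start is not None:
        mask &= df["buy_dt"] >= start
    if end is not None:
        mask &= df["buy_dt"] < end
    return df[mask]


selected = filter_candidates(
    candidates, signal="s1", codes=["600010"],
    start="2017-01-01 09:30:00", end="2022-01-01 09:30:00",
)
print(selected)
```

Only the first row survives: the second has a different stock, the third a different signal, and the fourth falls outside the time window.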
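1 Candidate trades
From the sets of buy and sell signals we obtain a sufficiently large pool of candidate trades, hs300_df.
2 Filtering by signal set
The signal set has a selector field, signal_tuple (whose value I have hidden). The first step is to split out the candidates of interest by signal. The other optional filters are the stock list and the time range.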
tem_df1 = process_filter_signal_df(hs300_df)
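3 Computing the basic metrics on the filtered signal set
Picking a stock at random, the results look decent: 466 candidate orders in total, a real win rate of 29%, a mean margin per order of 7%, and a profit/loss ratio of 3.4.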
some_code = '600010'
tem_df2 = tem_df1[tem_df1['code'] == some_code]
process_kpi001_base(tem_df2)
{'total_orders': 466,
'win_rate': 0.4291845493562232,
'real_win_rate': 0.2918454935622318,
'win_v_loss': 3.4094531520791054,
'hold_stocks': 1,
'mean_margin': 0.07076018194400928}
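Note: candidate orders are not the orders of a simulated trade; there may be more of them. The purpose of candidate orders is to generate as many possible trades as we can for the model to learn from.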
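4 Performance during the holding period
To see more directly how orders behave while they are open, we need to build the full data for the period in which each trade is active: take the start and end times of each signal pair and merge them with the full minute-level data to build the trade's activity (detail) data.
With the activity data we can observe:
- 1 the maximum gain, maximum loss and maximum drawdown of each order
- 2 the total capital exposure and drawdown across a set of orders
We need to compute two kinds of data:
- 1 The detail data between a trade's opening and closing, used to examine capital exposure.
- 2 Using the detail data, the capital value updated at the closing point of each order, from which the drawdown is computed.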
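The two kinds of data can be sketched on a toy example. This is a simplified illustration, not the function from the appendix: the bars and orders are made up, fees are left out, timestamps are shortened to `HH:MM` strings, and the starting capital is assumed to equal the maximum exposure. The drawdown here uses the running maximum of the capital curve, a common compact formulation.

```python
import pandas as pd

# Hypothetical minute bars for one stock.
minute_df = pd.DataFrame({
    "dt": ["09:31", "09:32", "09:33", "09:34", "09:35"],
    "LOW": [10.0, 10.5, 11.0, 9.0, 10.0],
})

# Hypothetical candidate orders with entry and exit timestamps.
orders = pd.DataFrame({
    "order_num": ["a", "b"],
    "buy_dt": ["09:31", "09:32"],
    "sell_dt": ["09:33", "09:35"],
    "buy_price": [10.0, 10.5],
    "buy_vol": [100, 200],
})
orders["buy_amt"] = orders["buy_price"] * orders["buy_vol"]

# 1) Detail data: every bar that falls inside each order's holding window.
detail = orders.merge(minute_df, how="cross")
detail = detail[(detail["dt"] >= detail["buy_dt"]) & (detail["dt"] <= detail["sell_dt"])].copy()
detail["net_profit"] = detail["LOW"] * detail["buy_vol"] - detail["buy_amt"]

# Exposure: money tied up in open orders at each minute.
exposure = detail.groupby("dt")["buy_amt"].sum()
max_exposure = exposure.max()

# 2) Capital curve: floating profit of open orders plus profit already realized
#    by orders that closed at an earlier minute.
floating = detail.groupby("dt")["net_profit"].sum()
closes = detail[detail["dt"] == detail["sell_dt"]].groupby("dt")["net_profit"].sum()
realized_before = (
    closes.reindex(floating.index, fill_value=0.0).cumsum().shift(1, fill_value=0.0)
)
init_cap = max_exposure  # assumed starting capital
capital = init_cap + realized_before + floating

# Drawdown relative to the running peak of the capital curve.
max_drawdown = (capital / capital.cummax() - 1).min()
print(max_exposure, list(capital), max_drawdown)
```

In this toy case both orders overlap from 09:32 to 09:33, which is where exposure peaks, and the worst drawdown occurs at 09:34 when the second order is deep under water after the first has already closed with a gain.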
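The single-stock example shows:
- 1 The maximum drawdown is fairly high. The main cause is that positions were not controlled promptly after prices peaked (although the trades still made money).
- 2 The capital exposure is very high, because the number of candidate trades was not limited; each stock could be given a cap.
- 3 The floating loss on winning orders generally stays within 10%, so a stop-loss can be set on that basis.
- 4 The drawdown on losing orders is in fact smaller than on winning orders, so drawdown control serves to lock in larger gains.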
res_600010_dict ,detail_600010_dict= process_kpi002_performance_of_a_code(tem_df1,'600010')
res_600010_dict
{'max_drawback': -0.3254396180829447,
'max_exposure': 112383526.72755373,
'median_hold_days': 27.0,
'median_order_drawback': -0.13348810214741724,
'median_margin': -0.00986446020488569,
'real_win_min_interest': count mean std min 25% 50% 75% \
real_win
0 330.0 -0.060642 0.044292 -0.179422 -0.088277 -0.054769 -0.020774
1 136.0 -0.033090 0.014999 -0.084479 -0.044288 -0.036120 -0.027810
max
real_win
0 -0.002000
1 0.006458 ,
'real_win_max_draw_back': count mean std min 25% 50% 75% \
real_win
0 330.0 -0.095822 0.058116 -0.195652 -0.133609 -0.094828 -0.037787
1 136.0 -0.288716 0.054608 -0.313901 -0.313901 -0.313901 -0.313901
max
real_win
0 0.000000
1 -0.171184 ,
'real_lose_max_interest': count mean std min 25% 50% 75% \
real_lose
0 434.0 0.286946 0.374751 -0.008085 0.009760 0.079744 0.870202
1 32.0 0.031577 0.027086 -0.002000 0.005677 0.029256 0.059415
max
real_lose
0 0.952228
1 0.084441 ,
'real_lose_draw_back': count mean std min 25% 50% 75% \
real_lose
0 434.0 -0.150239 0.108157 -0.313901 -0.313901 -0.130222 -0.057851
1 32.0 -0.177587 0.017701 -0.195652 -0.195652 -0.177778 -0.166667
max
real_lose
0 0.000000
1 -0.132231 }
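The observation that winning orders rarely dip more than 10% below their entry suggests a simple stop-loss rule. The sketch below is only an illustration of that idea: `apply_stop_loss` is a hypothetical helper, the margin paths are made up, and the -10% threshold is an assumption taken from the observation above rather than a tuned value.

```python
import pandas as pd

STOP_LOSS = -0.10  # assumed threshold, taken from the 10% observation


def apply_stop_loss(margin_path, stop=STOP_LOSS):
    """Return (exit_position, exit_margin) for one order's floating-margin path."""
    breached = margin_path <= stop
    if breached.any():
        pos = int(breached.to_numpy().argmax())  # first bar at or below the stop
    else:
        pos = len(margin_path) - 1  # no breach: exit at the original close
    return pos, float(margin_path.iloc[pos])


# Hypothetical floating margins, one value per bar.
winner = pd.Series([0.00, -0.03, 0.05, 0.12])
loser = pd.Series([0.00, -0.06, -0.11, -0.15])

print(apply_stop_loss(winner))
print(apply_stop_loss(loser))
```

On these toy paths the winner is left untouched, while the loser is closed at the first bar below the threshold instead of riding down to its final loss.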
Merging the results of multiple stocks
code_detail_dict_list = [detail_600010_dict,detail_600031_dict,detail_300677_dict]
res_dict1 = process_kpi003_merge_codes(code_detail_dict_list)
{'median_hold_days': 13.0,
'median_order_drawback': -0.10698824984539267,
'median_margin': -0.00986446020488569,
'real_win_min_interest': count mean std min 25% 50% 75% \
real_win
0 583.0 -0.053453 0.044081 -0.200425 -0.076391 -0.043757 -0.017635
1 173.0 -0.035720 0.023631 -0.148880 -0.044288 -0.036120 -0.025843
max
real_win
0 -0.000747
1 0.006458 ,
'real_win_max_draw_back': count mean std min 25% 50% 75% \
real_win
0 583.0 -0.084104 0.058984 -0.375151 -0.130222 -0.075816 -0.030368
1 173.0 -0.267715 0.072453 -0.359838 -0.313901 -0.313901 -0.171184
max
real_win
0 0.000000
1 -0.074689 ,
'real_lose_max_interest': count mean std min 25% 50% 75% \
real_lose
0 705.0 0.259113 0.625734 -0.008085 0.007892 0.043737 0.201414
1 51.0 0.038536 0.053529 -0.002000 0.005958 0.015402 0.059415
max
real_lose
0 6.809396
1 0.318473 ,
'real_lose_draw_back': count mean std min 25% 50% 75% \
real_lose
0 705.0 -0.122146 0.100985 -0.359838 -0.171184 -0.094828 -0.040484
1 51.0 -0.181062 0.040239 -0.375151 -0.195652 -0.177778 -0.164822
max
real_lose
0 0.00000
1 -0.10279 ,
'max_exposure': 121383774.86651154,
'max_drawback': -0.25227245169791146}
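The key step in the merge is aligning each stock's capital curve on a shared timeline before summing. Here is a minimal sketch of that step with made-up curves and placeholder timestamps `t1` to `t4`; it assumes that a stock's capital stays at its last known value between updates, which is what forward-filling expresses.

```python
import pandas as pd

# Hypothetical per-stock capital curves with different timestamps.
cap_a = pd.DataFrame({"dt": ["t1", "t2", "t4"], "updated_capital": [100.0, 105.0, 98.0]})
cap_b = pd.DataFrame({"dt": ["t2", "t3"], "updated_capital": [200.0, 210.0]})

# Shared timeline: the sorted union of all timestamps.
timeline = pd.DataFrame({"dt": sorted(set(cap_a["dt"]) | set(cap_b["dt"]))})

# Align each curve on the timeline and carry the last known value forward.
aligned = [timeline.merge(cap, on="dt", how="left").ffill() for cap in (cap_a, cap_b)]

# Sum the aligned curves to get the combined capital at each timestamp.
combined = pd.concat(aligned, ignore_index=True).groupby("dt")["updated_capital"].sum()
print(combined)
```

Note that before a stock's first timestamp there is nothing to carry forward, so it contributes zero to the sum (only `cap_a` counts at `t1` here). If that understates the starting capital, back-filling the first value may be a reasonable adjustment.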
Appendix: functions
Filtering the signal set
def process_filter_signal_df(raw_singal_df, signal_pair='xxxx', code_group=None, dt_var='buy_dt',
                             dt_start='2017-01-01 09:30:00', dt_end='2022-01-01 09:30:00',
                             target_var='target_gt0.1'):
    # Filter candidate trades by signal pair, optional stock list and time window.
    tem_df = raw_singal_df
    singal_sel = tem_df['signal_tuple'] == signal_pair
    if code_group is None:
        # No stock list given: keep every row with a non-empty code.
        code_sel = tem_df['code'] != ''
    else:
        code_sel = tem_df['code'].isin(code_group)
    # Timestamps are compared directly against dt_start and dt_end (start inclusive, end exclusive).
    dt_sel = (tem_df[dt_var] >= dt_start) & (tem_df[dt_var] < dt_end)
    the_sel = singal_sel & code_sel & dt_sel
    tem_df1 = tem_df[the_sel]
    return tem_df1
Drawdown of a series
import numpy as np
import pandas as pd

def single_draw_back(dim1_array):
    # Relative drop from the first element (a peak) to the lowest value that follows.
    nume = dim1_array.min()
    denom = dim1_array[0]
    return (nume - denom) / denom

def cal_draw_back1(dim1_list, init_cap=None):
    # Maximum drawdown of a capital series: the worst drop from any local peak, as a fraction of that peak.
    cap_list = list(dim1_list)
    max_cap = max(cap_list) + 1
    first_cap = cap_list[0]
    before = init_cap or first_cap
    after = max_cap
    # Pad the front so the starting value can register as a peak,
    # and the end with a value above the maximum so the last difference is defined.
    cap_list.insert(0, before)
    cap_list.insert(0, before - 1)
    cap_list.append(after)
    cap_s = pd.Series(cap_list)
    # 1 where the series rises, 0 where it is flat or falls.
    cap_s1 = cap_s.diff().dropna().apply(lambda x: 1 if x > 0 else 0)
    # A peak is a rise immediately followed by a non-rise.
    peak_list = np.argwhere((np.array(cap_s1.iloc[:-1] == 1)) & (np.array(cap_s1.iloc[1:] == 0)))
    cap_list2 = cap_list[1:-1]
    peak_pos_array = peak_list.ravel()
    cap_array = np.array(cap_list2)
    res_list = [0]
    for pos in peak_pos_array:
        res_list.append(single_draw_back(cap_array[pos:]))
    return min(res_list)
Computing the basic KPIs
def process_kpi001_base(tem_df1, target_var='target_gt0.1'):
    res_dict = {}
    res_dict['total_orders'] = len(tem_df1)
    # Win rate: share of orders with is_win == 1.
    res_dict['win_rate'] = tem_df1['is_win'].mean()
    # Real win rate: share of orders that reach the target flagged in target_var.
    res_dict['real_win_rate'] = tem_df1[target_var].mean()
    # Profit/loss ratio: total profit of winning orders over total loss of losing orders.
    win_amount = tem_df1['net_profit'][tem_df1['is_win'] == 1].sum()
    loss_amount = abs(tem_df1['net_profit'][tem_df1['is_win'] == 0].sum())
    res_dict['win_v_loss'] = win_amount / loss_amount
    res_dict['hold_stocks'] = len(set(tem_df1['code']))
    res_dict['mean_margin'] = tem_df1['margin'].mean()
    return res_dict
Computing the holding-period KPIs
def process_kpi002_performance_of_a_code(tem_df1, some_code, minute_df_name='S12_ddx_df',
                                         minute_df_path1='./hs300_data_v3/%s/', init_cap=None,
                                         real_win_thres=0.1, real_lose_thres=-0.1):
    # Needs pandas imported as pd and cal_draw_back1 from above.
    # fs is a helper module that is not shown in this post.
    res_dict = {}
    detail_dict = {}
    tem_df2 = tem_df1[tem_df1['code'] == some_code]
    minute_df = fs.from_pickle(minute_df_name, minute_df_path1 % some_code)
    minute_df['dt'] = minute_df['date'] + ' ' + minute_df['minutes']
    minute_df_keep_cols = ['dt', 'LOW']
    minute_df1 = minute_df[minute_df_keep_cols]
    all_dt_minutes = sorted(list(set(minute_df1['dt'])))
    order_res_df_list = []
    order_performance_rec_list = []
    order_close_df_list = []
    for i in range(len(tem_df2)):
        # Build the minute-by-minute detail of one order between its buy and sell times.
        buy_keep_cols = ['order_num', 'code', 'buy_dt', 'buy_price', 'buy_vol', 'buy_amt', 'sell_dt', 'margin']
        some_order_dict = dict(tem_df2[buy_keep_cols].iloc[i])
        pos_a = all_dt_minutes.index(some_order_dict['buy_dt'])
        pos_b = all_dt_minutes.index(some_order_dict['sell_dt'])
        some_order_df = pd.DataFrame()
        some_order_df['dt'] = all_dt_minutes[pos_a: pos_b + 1]
        some_order_df['order_num'] = some_order_dict['order_num']
        some_order_df['code'] = some_order_dict['code']
        some_order_df['buy_price'] = some_order_dict['buy_price']
        some_order_df['buy_vol'] = some_order_dict['buy_vol']
        some_order_df['buy_amt'] = some_order_dict['buy_amt']
        some_order_df1 = pd.merge(some_order_df, minute_df1, how='left', on='dt')
        # Value the position at each minute's LOW price, net of a 0.2% fee.
        some_order_df1['sell_amt'] = some_order_df1['LOW'] * some_order_df1['buy_vol']
        some_order_df1['fee'] = some_order_df1['sell_amt'] * 0.002
        some_order_df1['gross_profit'] = some_order_df1['sell_amt'] - some_order_df1['buy_amt']
        some_order_df1['net_profit'] = some_order_df1['gross_profit'] - some_order_df1['fee']
        some_order_df1['is_win'] = some_order_df1['net_profit'].apply(lambda x: 1 if x > 0 else 0)
        some_order_df1['margin'] = some_order_df1['net_profit'] / some_order_df1['buy_amt']
        some_order_df1['capital'] = some_order_df1['buy_amt'] + some_order_df1['net_profit']
        # Per-order performance summary.
        order_performance_dict = {}
        order_performance_dict['order_id'] = some_order_dict['code'] + '_' + some_order_dict['order_num']
        order_performance_dict['max_interest'] = some_order_df1['margin'].max()
        order_performance_dict['min_interest'] = some_order_df1['margin'].min()
        order_performance_dict['max_draw_back'] = cal_draw_back1(some_order_df1['capital'])
        order_performance_dict['hold_minutes'] = len(some_order_df1)
        # Assumes 240 trading minutes per day.
        order_performance_dict['hold_days'] = int(len(some_order_df1) / 240) + 1
        order_performance_dict['close_margin'] = some_order_dict['margin']
        order_performance_rec_list.append(order_performance_dict)
        res_keep_cols = ['dt', 'order_num', 'code', 'buy_amt', 'net_profit']
        order_res_df_list.append(some_order_df1[res_keep_cols])
        order_close_df_list.append(some_order_df1[res_keep_cols].iloc[-1:])
    order_performance_df = pd.DataFrame(order_performance_rec_list)
    order_res_df = pd.concat(order_res_df_list, ignore_index=True)
    # Sum only the numeric columns: exposure and floating profit per minute across open orders.
    order_res_df1 = order_res_df.groupby(['dt'])[['buy_amt', 'net_profit']].sum()
    order_close_df = pd.concat(order_close_df_list, ignore_index=True)
    # Profit realized at each closing time.
    order_close_s = order_close_df.groupby(['dt'])['net_profit'].sum()
    # Default starting capital: the maximum exposure.
    init_cap = init_cap or order_res_df1['buy_amt'].max()
    new_cap_df_list = []
    clost_dt_list = sorted(list(order_close_s.index))
    for i in range(len(clost_dt_list)):
        # Take the minutes between the previous close (exclusive) and this close (inclusive).
        if i == 0:
            sub_df = order_res_df1[order_res_df1.index <= clost_dt_list[i]].copy()
        else:
            sub_df = order_res_df1[(order_res_df1.index <= clost_dt_list[i]) & (order_res_df1.index > clost_dt_list[i - 1])].copy()
        sub_df['updated_capital'] = sub_df['net_profit'] + init_cap
        new_cap_df_list.append(sub_df)
        # Fold the profit realized at this close into the capital base.
        init_cap += order_close_s.loc[clost_dt_list[i]]
    new_cap_df = pd.concat(new_cap_df_list)
    order_performance_df['real_win'] = order_performance_df['close_margin'].apply(lambda x: 1 if x >= real_win_thres else 0)
    order_performance_df['real_lose'] = order_performance_df['close_margin'].apply(lambda x: 1 if x < real_lose_thres else 0)
    res_dict['max_drawback'] = cal_draw_back1(new_cap_df['updated_capital'])
    res_dict['max_exposure'] = order_res_df1['buy_amt'].max()
    res_dict['median_hold_days'] = order_performance_df['hold_days'].median()
    res_dict['median_order_drawback'] = order_performance_df['max_draw_back'].median()
    res_dict['median_margin'] = order_performance_df['close_margin'].median()
    res_dict['real_win_min_interest'] = order_performance_df.groupby(['real_win'])['min_interest'].describe()
    res_dict['real_win_max_draw_back'] = order_performance_df.groupby(['real_win'])['max_draw_back'].describe()
    res_dict['real_lose_max_interest'] = order_performance_df.groupby(['real_lose'])['max_interest'].describe()
    res_dict['real_lose_draw_back'] = order_performance_df.groupby(['real_lose'])['max_draw_back'].describe()
    detail_dict['order_performance_df'] = order_performance_df
    detail_dict['order_res_df1'] = order_res_df1
    detail_dict['new_cap_df'] = new_cap_df
    return res_dict, detail_dict
Merging the KPIs of multiple stocks
def process_kpi003_merge_codes(code_detail_dict_list):
    # Needs pandas imported as pd and cal_draw_back1 from above.
    perfomance_df_list = []
    order_res_df_list = []
    capital_df_list = []
    res_dict = {}
    for the_dict in code_detail_dict_list:
        tem_performance = the_dict['order_performance_df']
        perfomance_df_list.append(tem_performance)
        tem_order_df = the_dict['order_res_df1']
        order_res_df_list.append(tem_order_df)
        tem_cap_df = the_dict['new_cap_df'].reset_index()
        capital_df_list.append(tem_cap_df)
    concat_performance_df = pd.concat(perfomance_df_list, ignore_index=True)
    concat_order_res_df = pd.concat(order_res_df_list).reset_index()
    # Total exposure and floating profit per minute across all stocks.
    concat_order_res_df1 = concat_order_res_df.groupby(['dt'])[['buy_amt', 'net_profit']].sum()
    # Shared timeline: the sorted union of every stock's timestamps.
    merge_capital_df = pd.DataFrame()
    tem_set = set([])
    for tem_cap_df in capital_df_list:
        tem_set = tem_set | set(tem_cap_df['dt'])
    merge_capital_df['dt'] = sorted(list(tem_set))
    capital_df_list1 = []
    for tem_cap_df in capital_df_list:
        # Align on the shared timeline and carry the last known capital forward.
        tem_cap_df1 = pd.merge(merge_capital_df, tem_cap_df, how='left', on='dt').ffill()
        capital_df_list1.append(tem_cap_df1)
    concat_capital_df = pd.concat(capital_df_list1, ignore_index=True)
    concat_capital_df1 = concat_capital_df.groupby(['dt'])['updated_capital'].sum().reset_index()
    res_dict['median_hold_days'] = concat_performance_df['hold_days'].median()
    res_dict['median_order_drawback'] = concat_performance_df['max_draw_back'].median()
    res_dict['median_margin'] = concat_performance_df['close_margin'].median()
    res_dict['real_win_min_interest'] = concat_performance_df.groupby(['real_win'])['min_interest'].describe()
    res_dict['real_win_max_draw_back'] = concat_performance_df.groupby(['real_win'])['max_draw_back'].describe()
    res_dict['real_lose_max_interest'] = concat_performance_df.groupby(['real_lose'])['max_interest'].describe()
    res_dict['real_lose_draw_back'] = concat_performance_df.groupby(['real_lose'])['max_draw_back'].describe()
    res_dict['max_exposure'] = concat_order_res_df1['buy_amt'].max()
    res_dict['max_drawback'] = cal_draw_back1(concat_capital_df1['updated_capital'])
    return res_dict