前言

不管是quantile regression，还是在其他方法上把loss function改为quantile loss function，都是根据设定的quantile，返回一个值。并不是直接返回一个范围。即一个分位数quantile，对应一个模型。

分位数回归可调用的库

1. scikit-learn

官网：
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.QuantileRegressor.html#s%20%20klearn.linear_model.QuantileRegressor

实例code：

from sklearn.linear_model import QuantileRegressor
import numpy as np
n_samples, n_features = 10, 1
rng = np.random.RandomState(0)
y = rng.randn(n_samples)
x = rng.randn(n_samples, n_features)
reg = QuantileRegressor(quantile=0.8).fit(x, y)

2. statsmodels

官网：
https://www.statsmodels.org/stable/examples/notebooks/generated/quantile_regression.html

实例code：

import statsmodels.api as sm
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt

data = sm.datasets.engel.load_pandas().data
# StatsModel是在初始化对象时传入数据，而fit方法只传递一些可以调试的超参数。
mod = smf.quantreg("foodexp ~ income", data)
res = mod.fit(q=0.5)
print(res.summary())

quantile loss function - python实现

这个要注意，不同的基础函数所对应的loss function的输入输出格式是不同的。

1. 在neural network 中添加分位数损失函数

代码

def tilted_loss(quantile, y, f):
    """
    the final overall loss is the mean of the sum.
    y: y_true
    f: y_pred
    """
    e = (y - f)
    return K.mean(K.maximum(quantile * e, (quantile - 1) * e), axis=-1)


def mcycle_model(length):
    model = Sequential()
    model.add(LSTM(units=128, input_shape=(1, length)))  # , input_dim=length
    model.add(Dense(units=10, input_dim=1, activation='relu'))
    model.add(Dense(1))
    return model


if __name__ == '__main__':
	# X ,Y分别是特征数据、标签数据。
	x_train, x_test, y_train, y_test = train_test_split(X, Y, random_state=2)
	x_train = np.array(x_train)[:, np.newaxis, :]
    x_test = np.array(x_test)[:, np.newaxis, :]

    qs = [0.1, 0.5, 0.9]
    plt.plot(y_test.values, 'h', label='Origin')  
    len_x = len(cols_x)  # 输入的维度，即特征的个数
    model = mcycle_model(len_x)
    for q in qs:
        model.compile(loss=lambda y, f: tilted_loss(q, y, f), optimizer='adadelta')
        model.fit(x_train, y_train, epochs=20, batch_size=32, verbose=0)
        # Predict the quantile
        y_pred = model.predict(x_test)
        plt.plot(y_pred, '.--', label=q, alpha=0.3)  # plot out this quantile

    plt.legend()
    plt.show()

2. 在xgboost上添加分位数损失函数

我们使用构造函数构造 XGBRegressor 的时候，里边的 objective 参数才是真正的损失函数（loss function）。xgb 使用 sklearn api 的时候需要用到的损失函数，其返回值是一阶导和二阶导。
这里首先要参考xgboost官网（https://xgboost.readthedocs.io/en/latest/tutorials/custom_metric_obj.html）来看他的loss function的格式，输入输出是什么。
如果直接把上面的loss function带入的话，会报错。

 File "E:/ml_regression.py", line 77, in xgb_reg
    reg.fit(x_train, y_train, eval_set=[(x_test, y_test)])
  File "C:\Users\AppData\Local\Programs\Python\Python37\lib\site-packages\xgboost\core.py", line 575, in inner_f
    return f(**kwargs)
  File "C:\Users\AppData\Local\Programs\Python\Python37\lib\site-packages\xgboost\sklearn.py", line 972, in fit
    callbacks=callbacks,
  File "C:\Users\AppData\Local\Programs\Python\Python37\lib\site-packages\xgboost\core.py", line 575, in inner_f
    return f(**kwargs)
  File "C:\Users\AppData\Local\Programs\Python\Python37\lib\site-packages\xgboost\training.py", line 181, in train
    bst.update(dtrain, i, obj)
  File "C:\Users\AppData\Local\Programs\Python\Python37\lib\site-packages\xgboost\core.py", line 1783, in update
    grad, hess = fobj(pred, dtrain)
  File "C:\Users\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\framework\ops.py", line 511, in __iter__
    raise TypeError("Cannot iterate over a scalar tensor.")
TypeError: Cannot iterate over a scalar tensor.

代码

def quantile_loss(y_true, y_pred, alpha, delta, threshold, var):
    """
    loss function:
    x is the error between y_true and y_pred
    if 0<x<alpha: grad1 = -x, hess1 = 1; grad2 = grad1, hess2 = hess1
    elif x >= max(alpha, threshold): grad1 = 0, hess1 = 0; grad2 = (2 * np.random.randint(2, size=len(y_true)) - 1.0) * var, hess2 = 1
    :param y_true:
    :param y_pred:
    :param alpha:  quantile
    :param delta:
    :param threshold:
    :param var:  i think this is variance
    :return: grad: i think that this is negtive gradient
    """
    x = y_true - y_pred
    grad = (x < (alpha - 1.0) * delta) * (1.0 - alpha) - (
            (x >= (alpha - 1.0) * delta) & (x < alpha * delta)) * x / delta - alpha * (x > alpha * delta)
    hess = ((x >= (alpha - 1.0) * delta) & (x < alpha * delta)) / delta

    grad = (np.abs(x) < threshold) * grad - (np.abs(x) >= threshold) * (
            2 * np.random.randint(2, size=len(y_true)) - 1.0) * var
    hess = (np.abs(x) < threshold) * hess + (np.abs(x) >= threshold)
    return grad, hess  # grad一阶导 hess二阶导， sklearn的loss function格式

# loss function using method
q=0.1
reg = xgb.XGBRegressor(
    objective=lambda y,f:quantile_loss(y,f,q))

补充

上面我们提到，xgboost的loss function要求return grad, hess。其中grad即一阶梯度，hess为二阶梯度，即hessian。在数学中, 海森矩阵(Hessian matrix或Hessian)是一个自变量为向量的实值函数的二阶偏导数组成的方块矩阵。

reference

1.杜克大学的材料，包括公式、实例。
https://www2.stat.duke.edu/~fl35/teaching/440-19F/decks/cs01_5_deck.html#/example-engel-curves
2. University of Minho大学课件
http://www.econ.uiuc.edu/~roger/courses/NIPE/lectures/L1.pdf