   Transforming Macroeconomic Variables with an LSTM (work in progress)



Deep learning in asset pricing

Goal: transform many non-stationary macroeconomic variables into a small number of state processes h_t.


1、Recurrent Neural Networks

In a recurrent neural network we store the output activations from one or more of the layers of the network. Often these are hidden layer activations. Then, the next time we feed an input example to the network, we include the previously stored outputs as additional inputs.


You can think of the additional inputs as being concatenated to the end of the “normal” inputs to the layer. For example, if a hidden layer had 10 regular input nodes and 128 hidden nodes, then it would actually have 138 total inputs (assuming you are feeding the layer’s outputs back into itself, à la Elman, rather than into another layer).

In short: 10 input nodes plus 128 hidden nodes gives 138 inputs in total.

Of course, the very first time you compute the output of the network there are no stored activations yet, so you need to fill in those extra 128 inputs with 0s (or some other initial value).
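The concatenation described above can be sketched in a few lines of plain Python. This is only an illustration under assumptions of my own: the tanh activation, the random weight range, the missing bias term, and the toy input sequence are not from the text; only the sizes 10 and 128 and the zero initialization are.

```python
import math
import random

N_INPUT = 10    # regular input nodes (from the example above)
N_HIDDEN = 128  # hidden nodes in the recurrent layer


def init_weights(n_in, n_hidden, seed=0):
    """Small random weights: one row of n_in + n_hidden weights per hidden node."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hidden)]
            for _ in range(n_hidden)]


def rnn_step(x, h_prev, weights):
    """One Elman-style step: concatenate x with the stored activations, apply tanh."""
    combined = x + h_prev  # list concatenation: 10 + 128 = 138 values
    return [math.tanh(sum(w * v for w, v in zip(row, combined)))
            for row in weights]


weights = init_weights(N_INPUT, N_HIDDEN)
h = [0.0] * N_HIDDEN  # first step: the 128 extra inputs are filled with 0s
sequence = [[0.5] * N_INPUT for _ in range(3)]  # toy input sequence (made up)
for x in sequence:
    h = rnn_step(x, h, weights)  # h is stored and fed back in at the next step
```

Each row of `weights` has 138 entries, which is exactly the "138 total inputs" count from the example.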

2、Shortcoming: Short-Term Memory

RNNs are quite powerful, but they suffer from the vanishing gradient problem, which hinders them from using long-term information. They are good at retaining memory of the past 3-4 steps, but results degrade when information from further back is needed, so we don't just use regular RNNs. Instead, we use a better variant of RNNs: Long Short-Term Memory networks (LSTM).
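To get a feel for why the gradient vanishes, here is a minimal sketch using a one-dimensional RNN, h_t = tanh(w * h_{t-1} + x_t). The scalar model, the weight w = 0.5, and the constant inputs are assumptions chosen purely for illustration. By the chain rule, the sensitivity of h_t to the initial state h_0 is a product of per-step factors w * (1 - h_t^2), each smaller than 1 in absolute value here, so the product shrinks geometrically with the number of steps.

```python
import math


def gradient_through_time(w, h0, inputs):
    """Return |d h_t / d h_0| after each step of h_t = tanh(w * h_{t-1} + x_t)."""
    h, grad, history = h0, 1.0, []
    for x in inputs:
        h = math.tanh(w * h + x)
        grad *= w * (1.0 - h * h)  # chain rule: local derivative of this step
        history.append(abs(grad))
    return history


grads = gradient_through_time(w=0.5, h0=0.1, inputs=[0.3] * 30)
for step in (1, 4, 10, 30):
    print(f"after {step:2d} steps: {grads[step - 1]:.3e}")
```

With a per-step factor of at most 0.5, the influence of a state 30 steps back is bounded by 0.5^30, which is why the network effectively "sees" only the last few steps during training.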

3、Long Short-Term Memory (LSTM)

Long short-term memory (LSTM) units (or blocks) are a building unit for layers of a recurrent neural network (RNN). An RNN composed of LSTM units is often called an LSTM network.

A common LSTM unit is composed of a cell, an input gate, an output gate and a forget gate. The cell is responsible for "remembering" values over arbitrary time intervals; hence the word "memory" in LSTM. Each of the three gates can be thought of as a "conventional" artificial neuron, as in a multi-layer (or feedforward) neural network: that is, they compute an activation (using an activation function) of a weighted sum. Intuitively, they can be thought of as regulators of the flow of values that goes through the connections of the LSTM; hence the name "gate". There are connections between these gates and the cell.
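Written out, "an activation of a weighted sum" for the three gates looks as follows in the standard formulation. The symbols are my notation, not the text's: X_t is the current input, H_{t-1} the previous hidden state, [H_{t-1}, X_t] their concatenation, W and b the weights and biases of each gate, and σ the sigmoid function.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[H_{t-1}, X_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i\,[H_{t-1}, X_t] + b_i\right) && \text{(input gate)} \\
o_t &= \sigma\left(W_o\,[H_{t-1}, X_t] + b_o\right) && \text{(output gate)}
\end{aligned}
```

Because the sigmoid maps into (0, 1), each gate acts as a soft switch: values near 0 block the flow, values near 1 let it through.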

The expression "long short-term" refers to the fact that LSTM is a model of short-term memory that can last for a long period of time. An LSTM is well suited to classifying, processing and predicting time series with time lags of unknown size and duration between important events. LSTMs were developed to deal with the exploding and vanishing gradient problems that arise when training traditional RNNs.

4、Components of LSTMs

The LSTM cell contains the following components:

  • Forget gate “f” (a neural network with sigmoid)
  • Candidate layer “C̃” (a NN with tanh; written with a tilde to distinguish it from the memory state)
  • Input gate “I” (a NN with sigmoid)
  • Output gate “O” (a NN with sigmoid)
  • Hidden state “H” (a vector)
  • Memory state “C” (a vector)

  • Inputs to the LSTM cell at any step are X_t (current input), H_{t-1} (previous hidden state) and C_{t-1} (previous memory state).

  • Outputs from the LSTM cell are H_t (current hidden state) and C_t (current memory state).
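Putting the components together, one step of an LSTM cell can be sketched in plain Python as below. This follows the standard LSTM update rather than any particular paper's implementation. The helper names (`dense`, `init_params`, `lstm_step`), the untrained random weights, the sizes (6 macro variables, 2 state processes) and the Gaussian toy data are all hypothetical, chosen only to show how a sequence of macro variables would be turned into a sequence of hidden states h_t.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(weights, bias, vec, activation):
    """One small layer: activation(W @ vec + b), written with plain lists."""
    return [activation(sum(w * v for w, v in zip(row, vec)) + b)
            for row, b in zip(weights, bias)]


def init_params(n_in, n_hidden, seed=0):
    """Random (untrained) weights for the four layers f, i, c (candidate) and o."""
    rng = random.Random(seed)

    def layer():
        w = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden + n_in)]
             for _ in range(n_hidden)]
        return w, [0.0] * n_hidden

    return {name: layer() for name in ("f", "i", "c", "o")}


def lstm_step(x_t, h_prev, c_prev, params):
    """Map (X_t, H_{t-1}, C_{t-1}) to (H_t, C_t)."""
    z = h_prev + x_t                             # concatenated input
    f = dense(*params["f"], z, sigmoid)          # forget gate
    i = dense(*params["i"], z, sigmoid)          # input gate
    c_tilde = dense(*params["c"], z, math.tanh)  # candidate layer
    o = dense(*params["o"], z, sigmoid)          # output gate
    # New memory: keep part of the old memory, add part of the candidate.
    c_t = [fv * cp + iv * cv for fv, cp, iv, cv in zip(f, c_prev, i, c_tilde)]
    # New hidden state: gated, squashed view of the memory.
    h_t = [ov * math.tanh(cv) for ov, cv in zip(o, c_t)]
    return h_t, c_t


n_macro, n_states = 6, 2  # hypothetical sizes
params = init_params(n_macro, n_states)
h, c = [0.0] * n_states, [0.0] * n_states
rng = random.Random(1)
macro_series = [[rng.gauss(0.0, 1.0) for _ in range(n_macro)]
                for _ in range(12)]  # made-up data, 12 time steps
states = []
for x_t in macro_series:
    h, c = lstm_step(x_t, h, c, params)
    states.append(h)  # h plays the role of the state process at time t
```

In a real application the weights would be learned by training, and the macro series would typically be transformed first; neither step is shown here.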

Added: 2021-08-28 09:00:22  Updated: 2021-08-28 09:22:44
