Gated Recurrent Units (GRUs) are a type of recurrent neural network (RNN) architecture introduced by Cho et al. in 2014. Like LSTMs, GRUs were designed to capture long-term dependencies in sequential data, but with a simpler structure that is faster to train.
Like LSTMs, GRUs can selectively forget or retain information over time, but they use a simpler architecture with only two gates: a reset gate and an update gate. The reset gate controls how much of the previous hidden state is forgotten when computing the candidate state, while the update gate controls how much the hidden state is overwritten by that candidate versus carried over from the previous step. Unlike an LSTM, a GRU has no separate cell state or output gate.
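The gating described above can be sketched as a single GRU step in NumPy. This is a minimal illustration, not a library implementation: the parameter layout (`W_*` for input weights, `U_*` for recurrent weights, `b_*` for biases) and the layer sizes are assumptions chosen for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h_prev, params):
    """One GRU step. `params` holds input weights W_*, recurrent
    weights U_*, and biases b_* for the update (z), reset (r),
    and candidate (h) computations."""
    W_z, U_z, b_z = params["z"]
    W_r, U_r, b_r = params["r"]
    W_h, U_h, b_h = params["h"]

    z = sigmoid(W_z @ x + U_z @ h_prev + b_z)              # update gate
    r = sigmoid(W_r @ x + U_r @ h_prev + b_r)              # reset gate
    h_tilde = np.tanh(W_h @ x + U_h @ (r * h_prev) + b_h)  # candidate state
    return (1.0 - z) * h_prev + z * h_tilde                # blend old and new

# Usage with random weights and hypothetical sizes (4 inputs, 3 hidden units).
rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
params = {k: (rng.normal(size=(n_hid, n_in)) * 0.1,
              rng.normal(size=(n_hid, n_hid)) * 0.1,
              np.zeros(n_hid)) for k in ("z", "r", "h")}
h = np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):  # run over a length-5 sequence
    h = gru_cell(x, h, params)
```

Because the update gate interpolates between the previous state and a bounded tanh candidate, each component of the hidden state stays in (-1, 1) here.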
During training, a GRU's weights are updated using backpropagation through time (BPTT): the network is unrolled across the time steps of the sequence, the gradients of the loss function with respect to the weights are computed, and the weights are then adjusted by an optimization algorithm such as stochastic gradient descent.
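The training loop can be sketched on a toy problem. For brevity this uses a single-unit GRU with scalar weights and, instead of the analytic BPTT gradients a framework would compute, a finite-difference approximation of the same loss gradient; the task (driving the final hidden state toward 0.5) and all names here are invented for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def run_gru(theta, xs):
    """Unroll a 1-unit GRU over sequence xs; theta packs the six
    scalar weights (input + recurrent for each of z, r, candidate)."""
    wz, uz, wr, ur, wh, uh = theta
    h = 0.0
    for x in xs:
        z = sigmoid(wz * x + uz * h)
        r = sigmoid(wr * x + ur * h)
        h_tilde = np.tanh(wh * x + uh * (r * h))
        h = (1 - z) * h + z * h_tilde
    return h

def loss(theta, xs, target):
    return (run_gru(theta, xs) - target) ** 2

# Toy task: make the final hidden state match a target value.
rng = np.random.default_rng(1)
xs, target = rng.normal(size=8), 0.5
theta = rng.normal(size=6) * 0.1

lr, eps = 0.2, 1e-5
before = loss(theta, xs, target)
for _ in range(200):                      # plain gradient descent
    grad = np.zeros_like(theta)
    for i in range(6):                    # finite-difference gradient
        tp, tm = theta.copy(), theta.copy()
        tp[i] += eps
        tm[i] -= eps
        grad[i] = (loss(tp, xs, target) - loss(tm, xs, target)) / (2 * eps)
    theta -= lr * grad                    # SGD-style weight update
after = loss(theta, xs, target)
```

In practice the gradient is computed analytically by backpropagating through the unrolled computation graph rather than by finite differences, but the update rule is the same: step the weights opposite the gradient of the loss.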
GRUs have been applied to a wide range of tasks, including machine translation, speech recognition, and image captioning. On many tasks they achieve performance comparable to LSTMs while training faster and requiring fewer parameters, which makes them particularly useful when computational resources are limited or real-time processing is required.