Generative Adversarial Networks (GANs):
- Type of deep neural network architecture that uses unsupervised machine learning
- Made up of a generator and a discriminator network. The two networks train each other while simultaneously trying to outwit each other.
Generator network
- Generates new data from a vector of random numbers drawn from a latent space.
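A minimal PyTorch sketch of this idea (not from the original notes); the 100-dimensional latent vector and flattened 28x28 output are illustrative assumptions:

```python
# Illustrative generator: maps a latent vector z to a flattened 28x28 sample.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, latent_dim=100, out_dim=28 * 28):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.ReLU(),
            nn.Linear(256, out_dim),
            nn.Tanh(),               # outputs scaled to [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

z = torch.randn(16, 100)             # 16 latent vectors sampled from N(0, I)
fake_samples = Generator()(z)        # shape: (16, 784)
```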
Discriminator network
- Tries to differentiate between real data and generated data.
- It can perform either binary or multi-class classification.
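A matching sketch of a binary discriminator (again an illustrative assumption, sized for the flattened 28x28 samples above):

```python
# Illustrative discriminator: outputs the probability that the input is real.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, in_dim=28 * 28):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid(),            # probability of "real"
        )

    def forward(self, x):
        return self.net(x)

x = torch.randn(16, 28 * 28)         # stand-in for a batch of real or fake samples
p_real = Discriminator()(x)          # shape: (16, 1), values in (0, 1)
```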
Important concepts related to GANs
- Divergence measures (KL divergence, JS divergence, ...) are important for quantifying how far the generated distribution is from the real data distribution, and hence the quality of the model.
- Nash equilibrium: the state that training tries to reach, in which neither the generator nor the discriminator can improve by changing only its own parameters.
- Objective functions: used to measure the similarity between the real data distribution and the generated distribution.
- Scoring algorithms: used to assess the quality of a trained GAN; examples include the Inception Score and the Mode Score.
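A small NumPy sketch of the divergence measures mentioned above, for two discrete distributions (illustrative only, not from the notes):

```python
# KL and JS divergence between two discrete probability distributions.
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sum(p * np.log((p + eps) / (q + eps)))

def js_divergence(p, q):
    m = 0.5 * (np.asarray(p, float) + np.asarray(q, float))
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

p = np.array([0.4, 0.6])
q = np.array([0.5, 0.5])
print(kl_divergence(p, q), js_divergence(p, q))  # JS is symmetric and bounded
```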
Problems with training GANs:
- Mode collapse: the generator produces samples with little variety, or starts generating (nearly) the same output for different latent vectors.
- Vanishing gradients: the gradients become so small that the initial layers learn very slowly or stop learning completely.
- Internal covariate shift: a change in the distribution of inputs to a layer during training; it slows training down and makes convergence to a good minimum take longer. Batch normalization and other normalization techniques can mitigate this problem.
Solving stability problems when training GANs
- Feature matching: changes the generator's objective so that, instead of directly maximizing the discriminator's output, the generator matches the statistics of an intermediate discriminator layer's features on real and generated data; this improves the convergence of GANs (see the feature-matching sketch after this list).
- Mini-batch discrimination: when the discriminator processes each input independently, coordination between the gradients can be missing and the generator can collapse to a few modes; mini-batch discrimination lets the discriminator look at samples in combination by appending per-batch similarity statistics to each sample's features (see the sketch after this list).
- Historical averaging: adds a term to the respective cost functions of the generator and the discriminator that penalizes the parameters for deviating from their average value over past training steps (see the sketch after this list).
- One-sided label smoothing: provides smoothed labels to the discriminator, such as 0.9 or 0.8 for real examples, instead of labeling every example as exactly 1 (real) or 0 (fake); in the one-sided variant only the real labels are smoothed, while the fake labels are left at 0 (see the sketch after this list).
- Batch normalization: normalizes the feature vectors to have zero mean and unit variance; used to stabilize learning and to deal with poor weight initialization problems.
- Instance normalization: normalizes each feature map by utilizing information from that feature map only (contrasted with batch normalization in the last sketch after this list).
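A sketch of a feature-matching generator loss. `disc_features(x)` is a hypothetical helper (not defined in the notes) that returns an intermediate-layer activation of the discriminator:

```python
# Feature matching: the generator minimizes the gap between the mean
# intermediate-layer features of real and generated batches.
import torch

def feature_matching_loss(disc_features, real_batch, fake_batch):
    real_feats = disc_features(real_batch).mean(dim=0)   # mean feature of real data
    fake_feats = disc_features(fake_batch).mean(dim=0)   # mean feature of generated data
    return torch.mean((real_feats - fake_feats) ** 2)    # generator minimizes this gap
```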
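A simplified sketch of mini-batch discrimination, assuming a flat feature vector per sample; the kernel sizes are illustrative choices:

```python
# Each sample gets extra features summarizing how similar it is to the other
# samples in the batch, letting the discriminator detect low-variety batches.
import torch
import torch.nn as nn

class MinibatchDiscrimination(nn.Module):
    def __init__(self, in_features, num_kernels=32, kernel_dim=8):
        super().__init__()
        self.T = nn.Parameter(torch.randn(in_features, num_kernels * kernel_dim) * 0.1)
        self.num_kernels, self.kernel_dim = num_kernels, kernel_dim

    def forward(self, x):                                  # x: (batch, in_features)
        m = (x @ self.T).view(-1, self.num_kernels, self.kernel_dim)
        diffs = m.unsqueeze(0) - m.unsqueeze(1)            # pairwise differences
        closeness = torch.exp(-diffs.abs().sum(dim=3))     # (batch, batch, kernels)
        o = closeness.sum(dim=1) - 1                       # drop the self-comparison
        return torch.cat([x, o], dim=1)                    # append batch statistics
```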
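A sketch of the historical-averaging penalty; the helper names and the simple incremental mean are assumptions for illustration:

```python
# Penalize parameters for drifting away from their running average over past steps.
import torch

def historical_averaging_penalty(params, running_avgs, weight=1.0):
    penalty = sum(((p - avg) ** 2).sum() for p, avg in zip(params, running_avgs))
    return weight * penalty

def update_running_avgs(params, running_avgs, step):
    # incremental mean of all parameter values seen so far
    for p, avg in zip(params, running_avgs):
        avg += (p.detach() - avg) / (step + 1)
```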
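A sketch of one-sided label smoothing inside a standard discriminator loss, assuming the discriminator outputs probabilities (as in the sigmoid discriminator above):

```python
# Real labels are softened to 0.9; fake labels are left at 0.
import torch
import torch.nn.functional as F

def discriminator_loss(d_real, d_fake, real_label=0.9):
    real_targets = torch.full_like(d_real, real_label)    # smoothed "real" labels
    fake_targets = torch.zeros_like(d_fake)                # fake labels stay at 0
    return F.binary_cross_entropy(d_real, real_targets) + \
           F.binary_cross_entropy(d_fake, fake_targets)
```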
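Finally, a short sketch contrasting batch and instance normalization on image-shaped feature maps, using the PyTorch built-ins; the tensor sizes are arbitrary:

```python
# BatchNorm2d normalizes each channel over the whole batch;
# InstanceNorm2d normalizes each channel within each sample only.
import torch
import torch.nn as nn

x = torch.randn(16, 64, 32, 32)        # (batch, channels, height, width)
bn = nn.BatchNorm2d(64)
inorm = nn.InstanceNorm2d(64)
print(bn(x).shape, inorm(x).shape)     # shapes are unchanged: (16, 64, 32, 32)
```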