Using GANs for High-Resolution Image Synthesis
Generative Adversarial Networks (GANs) are a class of machine learning models designed to generate synthetic data. In this guide, we will build a simple GAN for high-resolution image synthesis using PyTorch.
Step 1: Setup and Import Necessary Libraries
First, we install the required packages and import them.
# Install necessary libraries
!pip install torch torchvision matplotlib

# Import libraries
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torchvision.utils import save_image  # used later to save generated images
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
import numpy as np
We use pip to install PyTorch, torchvision, and matplotlib.
We import PyTorch along with its modules for neural networks, optimizers, and data loading.
We import torchvision for datasets and image transforms, and its save_image utility for writing generated images to disk.
We import matplotlib to display images.
We import numpy for numerical operations.
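As an optional sanity check (not part of the original steps), you can confirm the installed PyTorch version and whether a CUDA-capable GPU is visible, since the training code later moves tensors to a GPU when one is available:

# Optional: verify the installation and GPU availability
import torch

print(torch.__version__)            # installed PyTorch version
print(torch.cuda.is_available())    # True if a GPU can be used for training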
Step 2: Define the Generator and Discriminator
Next, we define the architectures of the generator and discriminator networks.
# Define the Generator
class Generator(nn.Module):
    def __init__(self, input_dim, output_dim, hidden_dim):
        super(Generator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(True),
            nn.Linear(hidden_dim, hidden_dim * 2),
            nn.ReLU(True),
            nn.Linear(hidden_dim * 2, hidden_dim * 4),
            nn.ReLU(True),
            nn.Linear(hidden_dim * 4, output_dim),
            nn.Tanh()
        )

    def forward(self, x):
        return self.model(x)

# Define the Discriminator
class Discriminator(nn.Module):
    def __init__(self, input_dim, hidden_dim):
        super(Discriminator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(input_dim, hidden_dim * 4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(hidden_dim * 4, hidden_dim * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(hidden_dim * 2, hidden_dim),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.model(x)
The Generator class takes an input (latent) dimension, an output dimension, and a hidden dimension.
The generator consists of linear layers followed by ReLU activations, except for the output layer, which uses a Tanh activation.
The Discriminator class takes an input dimension and a hidden dimension.
The discriminator consists of linear layers followed by LeakyReLU activations, with a Sigmoid activation on the output layer.
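To see how the two networks fit together, here is a small shape check. It is an illustrative sketch only: the dimensions are example values matching Step 3, and it assumes the Generator and Discriminator classes above are already defined.

import torch

latent_dim = 100                 # size of the random noise vector
image_dim = 1 * 64 * 64          # flattened 64x64 grayscale image

gen = Generator(latent_dim, image_dim, hidden_dim=256)
disc = Discriminator(image_dim, hidden_dim=256)

z = torch.randn(8, latent_dim)   # a batch of 8 noise vectors
fake = gen(z)                    # shape (8, 4096), values in [-1, 1] from Tanh
score = disc(fake)               # shape (8, 1), values in (0, 1) from Sigmoid
print(fake.shape, score.shape)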
Step 3: Set Up Training Parameters and Data Loader
Next, we set the training parameters and define how the dataset will be loaded.
# Training parameters
batch_size = 128
learning_rate = 0.0002
num_epochs = 100
latent_dim = 100
image_size = 64
image_channels = 1
hidden_dim = 256

# Transformations for the dataset
transform = transforms.Compose([
    transforms.Resize(image_size),
    transforms.ToTensor(),
    transforms.Normalize([0.5], [0.5])
])

# Load the dataset
dataset = datasets.MNIST(root='data', train=True, transform=transform, download=True)
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
For training, we configure batch_size, learning_rate, num_epochs, latent_dim, image_size, image_channels, and hidden_dim.
We define transformations that resize the images, convert them to tensors, and normalize them.
We load the MNIST dataset and create a data loader that batches and shuffles the data.
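Since matplotlib was imported for visualization, a quick way to verify the data pipeline is to plot a few samples from the first batch. This is an optional check, assuming the dataloader above has been created:

import matplotlib.pyplot as plt

# Grab one batch and undo the [-1, 1] normalization for display
images, labels = next(iter(dataloader))
images = images * 0.5 + 0.5

fig, axes = plt.subplots(1, 8, figsize=(12, 2))
for ax, img in zip(axes, images[:8]):
    ax.imshow(img.squeeze(), cmap='gray')
    ax.axis('off')
plt.show()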
Step 4: Initialize the Networks and Optimizers
We initialize the generator and discriminator networks along with their optimizers.
# Device: use a GPU if one is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Initialize the networks
generator = Generator(latent_dim, image_channels * image_size * image_size, hidden_dim).to(device)
discriminator = Discriminator(image_channels * image_size * image_size, hidden_dim).to(device)

# Optimizers
optimizer_g = optim.Adam(generator.parameters(), lr=learning_rate)
optimizer_d = optim.Adam(discriminator.parameters(), lr=learning_rate)

# Loss function
criterion = nn.BCELoss()
We instantiate the generator and discriminator networks and move them to the device (a GPU if one is available).
We create Adam optimizers for both networks with the specified learning rate.
We use binary cross-entropy loss for training.
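To make the loss concrete, here is a minimal illustration (example values only, not part of the tutorial's pipeline) of how nn.BCELoss scores the discriminator's sigmoid outputs against real and fake labels:

import torch
import torch.nn as nn

criterion = nn.BCELoss()

# Hypothetical discriminator outputs (probabilities) for a batch of 3 images
outputs = torch.tensor([[0.9], [0.6], [0.2]])

real_labels = torch.ones(3, 1)   # target 1.0 when the images are real
fake_labels = torch.zeros(3, 1)  # target 0.0 when the images are generated

print(criterion(outputs, real_labels).item())  # low if outputs are close to 1
print(criterion(outputs, fake_labels).item())  # low if outputs are close to 0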
Step 5: Train the GAN
We train the GAN by alternately updating the discriminator and the generator.
# Function to create real and fake labels
def create_labels(size, is_real):
    if is_real:
        return torch.ones(size, 1).to(device)
    else:
        return torch.zeros(size, 1).to(device)

# Training the GAN
for epoch in range(num_epochs):
    for i, (images, _) in enumerate(dataloader):
        # Flatten the images
        images = images.view(images.size(0), -1).to(device)

        # Train the Discriminator
        optimizer_d.zero_grad()
        real_labels = create_labels(images.size(0), True)
        fake_labels = create_labels(images.size(0), False)

        outputs = discriminator(images)
        d_loss_real = criterion(outputs, real_labels)
        d_loss_real.backward()

        z = torch.randn(images.size(0), latent_dim).to(device)
        fake_images = generator(z)
        outputs = discriminator(fake_images.detach())
        d_loss_fake = criterion(outputs, fake_labels)
        d_loss_fake.backward()
        optimizer_d.step()
        d_loss = d_loss_real + d_loss_fake

        # Train the Generator
        optimizer_g.zero_grad()
        outputs = discriminator(fake_images)
        g_loss = criterion(outputs, real_labels)
        g_loss.backward()
        optimizer_g.step()

        if (i+1) % 200 == 0:
            print(f"Epoch [{epoch+1}/{num_epochs}], Step [{i+1}/{len(dataloader)}], D Loss: {d_loss.item()}, G Loss: {g_loss.item()}")

    # Save generated images after each epoch
    # normalize=True rescales the Tanh output from [-1, 1] to [0, 1] for saving
    fake_images = fake_images.view(fake_images.size(0), 1, image_size, image_size)
    save_image(fake_images, f"images_{epoch+1}.png", normalize=True)
We define a helper function, create_labels, to create real and fake label tensors.
We iterate over epochs and over batches from the data loader.
For each batch, we flatten the images and move them to the device.
To train the discriminator, we update its weights based on both real and generated images.
To train the generator, we update its weights based on the discriminator's feedback.
We print the loss values periodically and save the generated images after each epoch.
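After training, you can also sample new images directly from the generator. This is a minimal sketch assuming the trained generator and the variables from the earlier steps (latent_dim, image_size, device) are still in scope:

import torch
import matplotlib.pyplot as plt

generator.eval()
with torch.no_grad():
    z = torch.randn(16, latent_dim).to(device)
    samples = generator(z).view(-1, 1, image_size, image_size).cpu()
    samples = samples * 0.5 + 0.5    # map Tanh output from [-1, 1] to [0, 1]

fig, axes = plt.subplots(4, 4, figsize=(6, 6))
for ax, img in zip(axes.flat, samples):
    ax.imshow(img.squeeze(), cmap='gray')
    ax.axis('off')
plt.show()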