Using GANs for High-Resolution Image Synthesis

Generative Adversarial Networks (GANs) are a class of machine learning models designed to generate synthetic data. In this guide, we will build a simple GAN for high-resolution image synthesis using PyTorch.

Step 1: Setup and Import Necessary Libraries

First, we install the required packages and import them.

# Install necessary libraries
!pip install torch torchvision matplotlib

# Import libraries
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torchvision.utils import save_image   # used later to save generated images
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
import numpy as np

We use pip to install PyTorch, torchvision, and matplotlib.

We import PyTorch and its modules for neural networks (torch.nn), optimizers (torch.optim), and data loading.

From torchvision we import the datasets and transforms utilities, plus save_image for writing generated images to disk during training.

We import matplotlib so that images can be displayed.

We import numpy for numerical operations.
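
Before moving on, you may want to confirm the PyTorch installation and whether a GPU is visible (an optional check; the training code in Step 4 falls back to the CPU when no GPU is found):

# Optional: check the installed PyTorch version and GPU availability
print(torch.__version__)
print(torch.cuda.is_available())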

Step 2: Define the Generator and Discriminator

Next, we define the architectures of the generator and discriminator networks.

# Define the Generator
class Generator(nn.Module):
    def __init__(self, input_dim, output_dim, hidden_dim):
        super(Generator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(True),
            nn.Linear(hidden_dim, hidden_dim * 2),
            nn.ReLU(True),
            nn.Linear(hidden_dim * 2, hidden_dim * 4),
            nn.ReLU(True),
            nn.Linear(hidden_dim * 4, output_dim),
            nn.Tanh()
        )

    def forward(self, x):
        return self.model(x)

# Define the Discriminator
class Discriminator(nn.Module):
    def __init__(self, input_dim, hidden_dim):
        super(Discriminator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(input_dim, hidden_dim * 4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(hidden_dim * 4, hidden_dim * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(hidden_dim * 2, hidden_dim),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.model(x)

The Generator class takes an input dimension (the size of the latent vector), an output dimension (the size of the flattened image), and a hidden dimension.

The generator consists of linear layers followed by ReLU activations, except for the output layer, which uses a Tanh activation to produce pixel values in [-1, 1].

The Discriminator class takes an input dimension and a hidden dimension.

The discriminator consists of linear layers followed by LeakyReLU activations; the output layer uses a Sigmoid activation to produce the probability that its input is a real image.
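
As a quick sanity check (an optional example, not part of the original code; the dimensions below match the values set in Step 3), you can instantiate both networks and pass random noise through them to confirm that the tensor shapes line up:

# Optional sanity check: confirm generator and discriminator shapes are compatible
latent_dim, image_size, hidden_dim = 100, 64, 256   # same values as in Step 3

g = Generator(latent_dim, image_size * image_size, hidden_dim)
d = Discriminator(image_size * image_size, hidden_dim)

z = torch.randn(16, latent_dim)   # a batch of 16 random latent vectors
fake = g(z)                       # shape (16, 4096), values in [-1, 1] from Tanh
score = d(fake)                   # shape (16, 1), values in (0, 1) from Sigmoid
print(fake.shape, score.shape)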

Step 3: Set Up Training Parameters and Data Loader

Next, we set the training parameters and configure how the dataset will be loaded.

# Training parameters
batch_size = 128
learning_rate = 0.0002
num_epochs = 100
latent_dim = 100
image_size = 64
image_channels = 1
hidden_dim = 256

# Transformations for the dataset
transform = transforms.Compose([
    transforms.Resize(image_size),
    transforms.ToTensor(),
    transforms.Normalize([0.5], [0.5])
])

# Load the dataset
dataset = datasets.MNIST(root='data', train=True, transform=transform, download=True)
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

For training we configure batch_size, learning_rate, num_epochs, latent_dim, image_size, image_channels, and hidden_dim.

We compose transformations that resize the images, convert them to tensors, and normalize them to the range [-1, 1], matching the generator's Tanh output.

We load the MNIST dataset and create a data loader that batches and shuffles the data.
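
As an optional check (not part of the original code), you can pull a single batch from the loader and verify its shape and value range before training:

# Optional: inspect one batch from the data loader
images, labels = next(iter(dataloader))
print(images.shape)                              # expected: torch.Size([128, 1, 64, 64])
print(images.min().item(), images.max().item())  # roughly -1.0 and 1.0 after normalization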

Step 4: Initialize the Networks and Optimizers

Next, we initialize the generator and discriminator networks along with their optimizers.

# Use a GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Initialize the networks
generator = Generator(latent_dim, image_channels * image_size * image_size, hidden_dim).to(device)
discriminator = Discriminator(image_channels * image_size * image_size, hidden_dim).to(device)

# Optimizers
optimizer_g = optim.Adam(generator.parameters(), lr=learning_rate)
optimizer_d = optim.Adam(discriminator.parameters(), lr=learning_rate)

# Loss function
criterion = nn.BCELoss()

We initialize the generator and discriminator networks and move them to the device (GPU if one is available, otherwise CPU).

We create Adam optimizers for both networks with the specified learning rate.

We use binary cross-entropy loss (BCELoss) for training.
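
To see what the BCE criterion computes, here is a small worked example with made-up values (not part of the training code): for a predicted probability p and a target label y, the per-sample loss is -[y*log(p) + (1-y)*log(1-p)], averaged over the batch.

# Illustrative example of BCELoss on made-up predictions and labels
pred = torch.tensor([[0.9], [0.2]])    # discriminator outputs (probabilities)
target = torch.tensor([[1.0], [0.0]])  # labels: 1 = real, 0 = fake
print(nn.BCELoss()(pred, target))      # low loss, since the predictions agree with the labels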

Step 5: Train the GAN

We now train the GAN by alternately updating the discriminator and the generator.

# Function to create real and fake labels
def create_labels(size, is_real):
    if is_real:
        return torch.ones(size, 1).to(device)
    else:
        return torch.zeros(size, 1).to(device)

# Training the GAN
for epoch in range(num_epochs):
    for i, (images, _) in enumerate(dataloader):
        # Flatten the images
        images = images.view(images.size(0), -1).to(device)
        
        # Train the Discriminator
        optimizer_d.zero_grad()
        
        real_labels = create_labels(images.size(0), True)
        fake_labels = create_labels(images.size(0), False)
        
        outputs = discriminator(images)
        d_loss_real = criterion(outputs, real_labels)
        d_loss_real.backward()
        
        z = torch.randn(images.size(0), latent_dim).to(device)
        fake_images = generator(z)
        outputs = discriminator(fake_images.detach())
        d_loss_fake = criterion(outputs, fake_labels)
        d_loss_fake.backward()
        
        optimizer_d.step()
        
        d_loss = d_loss_real + d_loss_fake
        
        # Train the Generator
        optimizer_g.zero_grad()
        
        outputs = discriminator(fake_images)
        g_loss = criterion(outputs, real_labels)
        g_loss.backward()
        
        optimizer_g.step()
        
        if (i+1) % 200 == 0:
            print(f"Epoch [{epoch+1}/{num_epochs}], Step [{i+1}/{len(dataloader)}], D Loss: {d_loss.item()}, G Loss: {g_loss.item()}")

    # Save generated images after each epoch (normalize=True rescales Tanh output from [-1, 1] to [0, 1])
    fake_images = fake_images.view(fake_images.size(0), 1, image_size, image_size)
    save_image(fake_images, f"images_{epoch+1}.png", normalize=True)

We define a helper function, create_labels, that produces tensors of real (1) and fake (0) labels.

We loop over epochs and, within each epoch, over batches from the data loader.

For each batch, we flatten the images and move them to the device.

To train the discriminator, we update its weights using both real images (with real labels) and generated images (with fake labels).

To train the generator, we update its weights using the discriminator's output on the generated images, with real labels as the target, so the generator learns to fool the discriminator.

Every 200 steps we print the loss values, and after each epoch we save the most recent batch of generated images with save_image.
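
Since matplotlib was imported in Step 1 but not used above, here is an optional sketch (assuming training has completed and the variables from the earlier steps are still in scope) that samples new images from the trained generator and displays them in a grid:

# Optional: sample and display images from the trained generator
generator.eval()
with torch.no_grad():
    z = torch.randn(16, latent_dim).to(device)
    samples = generator(z).view(-1, image_size, image_size).cpu()
    samples = (samples + 1) / 2    # map Tanh output from [-1, 1] back to [0, 1] for display

fig, axes = plt.subplots(4, 4, figsize=(6, 6))
for ax, img in zip(axes.flatten(), samples):
    ax.imshow(img, cmap='gray')
    ax.axis('off')
plt.show()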
