Implementing a CycleGAN for Image Translation
1. Import Libraries
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras import layers, models
import numpy as np
import matplotlib.pyplot as plt
We begin by importing TensorFlow and Keras to build and train the models, TensorFlow Datasets (tfds) to load the training data, NumPy for array manipulation, and Matplotlib to visualize the generated images.
2. Load and Preprocess the Data
dataset, metadata = tfds.load('cycle_gan/horse2zebra',
                              with_info=True, as_supervised=True)
train_horses, train_zebras = dataset['trainA'], dataset['trainB']

def preprocess_image(image, label):
    # Resize to 286x286, then randomly crop back to 256x256 for augmentation.
    image = tf.image.resize(image, [286, 286])
    image = tf.image.random_crop(image, size=[256, 256, 3])
    image = tf.image.random_flip_left_right(image)
    # tf.image.resize returns float32; scale pixel values to [-1, 1].
    image = (image / 127.5) - 1
    return image

train_horses = train_horses.map(preprocess_image).batch(1)
train_zebras = train_zebras.map(preprocess_image).batch(1)
We use TensorFlow Datasets to load the horse2zebra dataset. Each image is resized to 286x286, randomly cropped back to 256x256 pixels, and normalized to the [-1, 1] range. Random horizontal flips provide additional data augmentation.
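Before training, it can help to sanity-check the pipeline by pulling one batch and displaying it. This is a minimal sketch, assuming the imports and datasets defined above; the image is rescaled from [-1, 1] back to [0, 1] for display:

sample_horse = next(iter(train_horses))  # shape (1, 256, 256, 3), values in [-1, 1]
plt.imshow(sample_horse[0] * 0.5 + 0.5)  # rescale to [0, 1] for plotting
plt.title('Preprocessed training sample')
plt.axis('off')
plt.show()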
3. Define the Generators
def build_generator():
    inputs = layers.Input(shape=[256, 256, 3])
    # Downsampling path: two strided convolutions, 256 -> 128 -> 64.
    down1 = layers.Conv2D(64, 4, strides=2, padding='same')(inputs)
    down1 = layers.LeakyReLU()(down1)
    down2 = layers.Conv2D(128, 4, strides=2, padding='same')(down1)
    # GroupNormalization with groups=-1 normalizes each channel independently,
    # which is equivalent to instance normalization (tf.keras has no
    # dedicated InstanceNormalization layer).
    down2 = layers.GroupNormalization(groups=-1)(down2)
    down2 = layers.LeakyReLU()(down2)
    # Upsampling path: two transposed convolutions, 64 -> 128 -> 256.
    up1 = layers.Conv2DTranspose(128, 4, strides=2, padding='same')(down2)
    up1 = layers.GroupNormalization(groups=-1)(up1)
    up1 = layers.ReLU()(up1)
    up2 = layers.Conv2DTranspose(64, 4, strides=2, padding='same')(up1)
    up2 = layers.GroupNormalization(groups=-1)(up2)
    up2 = layers.ReLU()(up2)
    # tanh keeps outputs in [-1, 1], matching the input normalization.
    outputs = layers.Conv2D(3, 7, padding='same', activation='tanh')(up2)
    return models.Model(inputs, outputs)
The generator is built from strided convolutional layers that downsample the input, followed by transposed convolutional layers that upsample it back to full resolution. Instance normalization and ReLU activations help keep training stable. The final layer uses a tanh activation so that outputs lie in the [-1, 1] range, matching the preprocessed inputs.
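To verify the architecture, you can instantiate the generator and pass a dummy tensor through it; the output should keep the input resolution and, thanks to the tanh, stay inside [-1, 1]. A quick sketch, where the random tensor simply stands in for a real image:

gen = build_generator()
dummy = tf.random.uniform([1, 256, 256, 3], minval=-1, maxval=1)
out = gen(dummy)
print(out.shape)  # expected: (1, 256, 256, 3)
print(f'output range: [{tf.reduce_min(out):.2f}, {tf.reduce_max(out):.2f}]')  # within [-1, 1]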
4. Define the Discriminators
def build_discriminator():
    inputs = layers.Input(shape=[256, 256, 3])
    down1 = layers.Conv2D(64, 4, strides=2, padding='same')(inputs)
    down1 = layers.LeakyReLU()(down1)
    down2 = layers.Conv2D(128, 4, strides=2, padding='same')(down1)
    # Instance normalization via GroupNormalization(groups=-1), as in the generator.
    down2 = layers.GroupNormalization(groups=-1)(down2)
    down2 = layers.LeakyReLU()(down2)
    # One logit per spatial patch (a PatchGAN-style output) rather than a single scalar.
    outputs = layers.Conv2D(1, 4, padding='same')(down2)
    return models.Model(inputs, outputs)
The discriminator is a convolutional network that distinguishes real images from generated ones. It uses LeakyReLU activations and instance normalization to stabilize adversarial training, and it outputs a grid of per-patch predictions rather than a single scalar.
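Because the final convolution has no global pooling, the discriminator emits a map of logits, each judging one receptive-field patch of the input. You can confirm the output shape directly (a quick sketch, reusing a random dummy input):

disc = build_discriminator()
dummy = tf.random.uniform([1, 256, 256, 3], minval=-1, maxval=1)
print(disc(dummy).shape)  # expected: (1, 64, 64, 1), one logit per patch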
5. Compile the Models
generator_g = build_generator()
generator_f = build_generator()
discriminator_x = build_discriminator()
discriminator_y = build_discriminator()

# compile() is used here mainly to attach an Adam optimizer to each model;
# the losses are computed manually in the custom training loop below.
generator_g.compile(optimizer=tf.keras.optimizers.Adam(2e-4, beta_1=0.5), loss='mse')
generator_f.compile(optimizer=tf.keras.optimizers.Adam(2e-4, beta_1=0.5), loss='mse')
discriminator_x.compile(optimizer=tf.keras.optimizers.Adam(2e-4, beta_1=0.5), loss='mse')
discriminator_y.compile(optimizer=tf.keras.optimizers.Adam(2e-4, beta_1=0.5), loss='mse')
We create two generators (G: X -> Y and F: Y -> X) and one discriminator per domain. Each model is compiled with the Adam optimizer using a learning rate of 2e-4 and beta_1 = 0.5, a configuration commonly used for GAN training. Since the training loop below computes the least-squares GAN losses by hand, compile() mainly serves to attach an optimizer to each model.
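Equivalently, you could skip compile() altogether and hold the optimizers in plain variables, since the training loop only ever touches model.optimizer. A sketch of that alternative; the *_optimizer names here are hypothetical, not part of the original code:

gen_g_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
gen_f_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
disc_x_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
disc_y_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
# The training loop would then call, for example:
# gen_g_optimizer.apply_gradients(zip(gradients_g, generator_g.trainable_variables))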
6. Training the CycleGAN
def train_cyclegan(epochs):
    for epoch in range(epochs):
        for image_x, image_y in tf.data.Dataset.zip((train_horses, train_zebras)):
            # A persistent tape lets us take gradients of several losses.
            with tf.GradientTape(persistent=True) as tape:
                # Forward translations and cycle reconstructions.
                fake_y = generator_g(image_x, training=True)
                fake_x = generator_f(image_y, training=True)
                cycle_x = generator_f(fake_y, training=True)
                cycle_y = generator_g(fake_x, training=True)

                disc_real_x = discriminator_x(image_x, training=True)
                disc_real_y = discriminator_y(image_y, training=True)
                disc_fake_x = discriminator_x(fake_x, training=True)
                disc_fake_y = discriminator_y(fake_y, training=True)

                # Least-squares adversarial losses for the generators.
                gen_g_loss = tf.reduce_mean(tf.square(disc_fake_y - 1.0))
                gen_f_loss = tf.reduce_mean(tf.square(disc_fake_x - 1.0))

                # Cycle-consistency loss: x -> G(x) -> F(G(x)) should recover x.
                cycle_loss = (tf.reduce_mean(tf.abs(image_x - cycle_x))
                              + tf.reduce_mean(tf.abs(image_y - cycle_y)))

                total_gen_g_loss = gen_g_loss + cycle_loss
                total_gen_f_loss = gen_f_loss + cycle_loss

                # Least-squares losses for the discriminators.
                disc_x_loss = (tf.reduce_mean(tf.square(disc_real_x - 1.0))
                               + tf.reduce_mean(tf.square(disc_fake_x))) / 2.0
                disc_y_loss = (tf.reduce_mean(tf.square(disc_real_y - 1.0))
                               + tf.reduce_mean(tf.square(disc_fake_y))) / 2.0

            gradients_g = tape.gradient(total_gen_g_loss, generator_g.trainable_variables)
            gradients_f = tape.gradient(total_gen_f_loss, generator_f.trainable_variables)
            gradients_disc_x = tape.gradient(disc_x_loss, discriminator_x.trainable_variables)
            gradients_disc_y = tape.gradient(disc_y_loss, discriminator_y.trainable_variables)

            generator_g.optimizer.apply_gradients(zip(gradients_g, generator_g.trainable_variables))
            generator_f.optimizer.apply_gradients(zip(gradients_f, generator_f.trainable_variables))
            discriminator_x.optimizer.apply_gradients(zip(gradients_disc_x, discriminator_x.trainable_variables))
            discriminator_y.optimizer.apply_gradients(zip(gradients_disc_y, discriminator_y.trainable_variables))

            # Release the persistent tape's resources.
            del tape

        print(f'Epoch: {epoch}, Generator G Loss: {total_gen_g_loss.numpy()}, Generator F Loss: {total_gen_f_loss.numpy()}')

train_cyclegan(epochs=100)
The train_cyclegan function drives the training loop: for each batch it computes all losses inside a persistent gradient tape, then applies a gradient update to each generator and discriminator. The cycle-consistency loss keeps translations faithful to the source image, while the adversarial losses push generated images to look realistic. The generators learn to fool the discriminators, and the discriminators learn to tell real images from generated ones.
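Once training finishes, translating an image is a single forward pass through a generator. The sketch below assumes the trained generator_g and the train_horses pipeline from above, rescaling outputs from [-1, 1] to [0, 1] for display:

sample_horse = next(iter(train_horses))
fake_zebra = generator_g(sample_horse, training=False)

plt.figure(figsize=(8, 4))
for i, (title, img) in enumerate([('Input horse', sample_horse[0]),
                                  ('Translated zebra', fake_zebra[0])]):
    plt.subplot(1, 2, i + 1)
    plt.title(title)
    plt.imshow(img * 0.5 + 0.5)  # back to [0, 1] for display
    plt.axis('off')
plt.show()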
Following these steps, you can set up a CycleGAN for unpaired image-to-image translation. This approach enables flexible image transformation without the need for paired datasets, opening up many possibilities in computer vision and image processing.