CycleGAN Image Translation

Complete Documentation & Project Details for Unpaired Image Translation & Cycle-Consistent GAN

Project Description

This project implements CycleGAN for unpaired image-to-image translation using cycle-consistent adversarial networks. The architecture uses two generators (G_A2B and G_B2A) and two discriminators (D_A and D_B) to learn bidirectional mappings between two image domains without paired training examples. It is well suited for learning CycleGAN fundamentals and for practical image translation applications. The system includes adversarial training, cycle consistency loss, style transfer capabilities, TensorBoard integration, and comprehensive training tools.

CycleGAN uses cycle-consistent adversarial training where two generators learn to translate images from domain A to domain B and vice versa, while two discriminators distinguish between real and translated images. The cycle consistency loss ensures that images translated from A to B and back to A match the original, maintaining content while changing style. The implementation provides complete PyTorch support, comprehensive training pipeline, evaluation metrics (FID, IS, LPIPS), and deployment tools for image-to-image translation applications.

Project Screenshots

[Screenshot gallery: CycleGAN Image Translation (4 images)]

Core Features

CycleGAN Architecture

  • Dual generator setup
  • Dual discriminator setup
  • Cycle consistency loss
  • Unpaired training
  • Bidirectional translation

Unpaired Image Translation

  • No paired training data
  • Bidirectional translation
  • Cycle consistency
  • Domain adaptation
  • Style transfer

Cycle Consistency Loss

  • Forward cycle (A→B→A)
  • Backward cycle (B→A→B)
  • Identity mapping loss
  • Content preservation
  • Meaningful translations

Adversarial Training

  • Adversarial loss (GAN)
  • Cycle consistency loss
  • Identity mapping loss
  • Dual discriminator training
  • Training stability

TensorBoard Integration

  • Real-time loss visualization
  • Translated image tracking
  • Training progress monitoring
  • Interactive dashboard
  • Comprehensive logging

Web Interface

  • Flask-based web app
  • Interactive image translation
  • Real-time translation
  • Download translated images
  • User-friendly interface

Advanced Features

Style Transfer Applications

  • Photo to painting
  • Season transfer
  • Day to night conversion
  • Object transfiguration
  • Multiple applications

Data Augmentation

  • Adaptive augmentation
  • Mixup augmentation
  • Cutout augmentation
  • Multiple augmentation levels

Multiple Dataset Support

  • Custom unpaired datasets
  • Horse2Zebra dataset
  • Apple2Orange dataset
  • Season transfer datasets

Resume Training

  • Checkpoint resuming
  • Automatic checkpoint detection
  • Training continuation
  • Progress preservation
  • Early stopping support
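Resuming is handled by the training scripts themselves (see train_advanced.py --continue_train later in this document), but for reference, here is a minimal sketch of how saving and resuming the four networks might look in plain PyTorch. The function names and checkpoint layout are illustrative, not the project's actual API.

# Illustrative checkpoint save/resume sketch (the project's own scripts
# handle this via train_advanced.py --continue_train).
import os
import torch

def save_checkpoint(path, epoch, G_A2B, G_B2A, D_A, D_B, opt_G, opt_D):
    torch.save({
        "epoch": epoch,
        "G_A2B": G_A2B.state_dict(),
        "G_B2A": G_B2A.state_dict(),
        "D_A": D_A.state_dict(),
        "D_B": D_B.state_dict(),
        "opt_G": opt_G.state_dict(),
        "opt_D": opt_D.state_dict(),
    }, path)

def load_checkpoint(path, G_A2B, G_B2A, D_A, D_B, opt_G, opt_D):
    if not os.path.exists(path):
        return 0                       # nothing to resume, start from epoch 0
    ckpt = torch.load(path, map_location="cpu")
    G_A2B.load_state_dict(ckpt["G_A2B"])
    G_B2A.load_state_dict(ckpt["G_B2A"])
    D_A.load_state_dict(ckpt["D_A"])
    D_B.load_state_dict(ckpt["D_B"])
    opt_G.load_state_dict(ckpt["opt_G"])
    opt_D.load_state_dict(ckpt["opt_D"])
    return ckpt["epoch"] + 1           # resume from the next epoch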

Web Interface Features

Feature | Description | Usage
Image Translation | Translate images between domains | Upload image and select translation direction
Real-time Translation | Translate images in real-time | Translated images appear immediately
Download Images | Download translated images | Click download button for each translated image
Model Selection | Choose different trained models | Select from available checkpoints

Technologies Used

This CycleGAN Image Translation project is built using modern deep learning and computer vision technologies. The core implementation uses Python as the primary programming language and PyTorch for deep learning operations. The project provides a CycleGAN architecture with dual generator-discriminator networks for unpaired image-to-image translation, Jupyter Notebook support for interactive development and demonstrations, and comprehensive adversarial training with cycle consistency loss for meaningful image translations.

CycleGAN uses cycle-consistent adversarial training where two generators learn to translate images from domain A to domain B and vice versa, while two discriminators distinguish between real and translated images. The system supports unpaired training without requiring paired examples, cycle consistency loss for maintaining content while changing style, and identity mapping loss for domain preservation, making it suitable for various image-to-image translation applications.

Python 3.8+ · PyTorch 2.0+ · CycleGAN · Cycle Consistency · Image Translation · TensorBoard · Computer Vision · Jupyter Notebook · Style Transfer · Unpaired Training

Installation & Usage

Installation

Install all required dependencies for the CycleGAN Image Translation project:

# Install all requirements
pip install -r requirements.txt

# The CycleGAN model will be trained on your unpaired data
# Prepare your dataset in datasets/your_dataset/ directory
# Organize as trainA/ and trainB/ folders
# Images should be preprocessed and resized

PyTorch Installation

Install PyTorch (CPU or GPU version):

# For CPU only
pip install torch torchvision torchaudio

# For CUDA (GPU support) - CUDA 11.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# For CUDA 12.1
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Verify installation
python -c "import torch; print(torch.__version__); print(torch.cuda.is_available())"

Verify Installation

Test the model and verify all components work:

# Test model architecture
python test_model.py

# This will verify:
# - Model can be instantiated
# - Forward pass works
# - All components function correctly
# - Device compatibility (CPU/CUDA)

Training the Model

Train the CycleGAN model on your unpaired image dataset:

# Prepare your dataset
# Organize images in datasets/your_dataset/ directory
# Create trainA/ and trainB/ folders for two domains
# Preprocess images to consistent size

# Basic training with default parameters
python train.py --dataroot ./datasets/your_dataset --name experiment_name --epochs 200

# Configure in config.py:
# - BATCH_SIZE = 1
# - EPOCHS = 200
# - LEARNING_RATE = 0.0002
# - LAMBDA_A = 10.0 (cycle loss weight)
# - LAMBDA_B = 10.0 (cycle loss weight)
# - LAMBDA_IDENTITY = 0.5 (identity loss weight)

# Or use Jupyter notebook
jupyter notebook CycleGAN_Demo.ipynb

# Training will:
# - Load and preprocess unpaired images
# - Initialize two generators and two discriminators
# - Train with adversarial loss + cycle consistency loss
# - Bidirectional translation (A→B and B→A)
# - Save checkpoints and translated samples
# - Log to TensorBoard

Training Parameters (config.py):

  • BATCH_SIZE: Training batch size (default: 1)
  • EPOCHS: Number of training epochs (default: 200)
  • LEARNING_RATE: Learning rate (default: 0.0002)
  • LAMBDA_A: Cycle loss weight for A→B→A (default: 10.0)
  • LAMBDA_B: Cycle loss weight for B→A→B (default: 10.0)
  • LAMBDA_IDENTITY: Identity loss weight (default: 0.5)
  • CROP_SIZE: Image crop size (default: 256)
  • LOAD_SIZE: Image load size (default: 286)
  • DEVICE: Device to use - 'cuda' or 'cpu' (default: 'cuda')

Image Translation

Translate images using the trained CycleGAN model:

# Translate images from trained model
python test.py --dataroot ./datasets/your_dataset --name experiment_name --model test --direction AtoB

# Translate in the other direction
python test.py --dataroot ./datasets/your_dataset --name experiment_name --model test --direction BtoA

# Or use Jupyter notebook
jupyter notebook CycleGAN_Demo.ipynb

# Using Python API
from models.cyclegan_model import CycleGANModel
import torch

model = CycleGANModel()
model.load_networks('latest', './checkpoints/experiment_name/')
model.set_requires_grad([], False)
model.eval()

# Translate image from domain A to B
translated = model.translate_image(input_image, direction='AtoB')
print(f"Translated image shape: {translated.shape}")

Model Evaluation

Evaluate the trained CycleGAN model performance:

# Evaluate trained model
python evaluate.py --dataroot ./datasets/your_dataset --name experiment_name

# The evaluation includes:
# - FID (Fréchet Inception Distance)
# - Inception Score (IS)
# - LPIPS (Learned Perceptual Image Patch Similarity)
# - Cycle consistency metrics
# - Translation quality assessment
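As a quick standalone sanity check outside evaluate.py, FID can also be computed with the torchmetrics package. This is an assumption on our side: torchmetrics is not listed in requirements.txt, and its FrechetInceptionDistance metric additionally needs torch-fidelity installed. The dummy tensors below only exercise the API; in practice, use several hundred real and translated test images.

# Standalone FID sketch using torchmetrics (extra dependency, not part of this project)
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=2048)

# real_images / translated_images: uint8 tensors of shape [N, 3, H, W] in [0, 255]
real_images = torch.randint(0, 256, (16, 3, 256, 256), dtype=torch.uint8)
translated_images = torch.randint(0, 256, (16, 3, 256, 256), dtype=torch.uint8)

fid.update(real_images, real=True)
fid.update(translated_images, real=False)
print(f"FID: {fid.compute().item():.2f}")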

Translation Visualization

Visualize image translations and compare domains:

# Translation visualization
python test.py --dataroot ./datasets/your_dataset --name experiment_name --model test --num_test 50

# The visualization includes:
# - Original images from domain A
# - Translated images to domain B
# - Cycle reconstructed images (A→B→A)
# - Side-by-side comparisons
# - Domain comparison grids

Jupyter Notebook

Open the interactive Jupyter notebook for demonstrations:

# CycleGAN demonstration notebook
jupyter notebook CycleGAN_Demo.ipynb

# The notebook includes:
# - Model architecture visualization
# - Generator and discriminator explanation
# - Training setup examples
# - Image translation examples
# - Cycle consistency demonstrations
# - Style transfer applications

# Or use JupyterLab
jupyter lab CycleGAN_Demo.ipynb

Project Structure

cyclegan-translation/
├── README.md # Main documentation
├── requirements.txt # Python dependencies
├── LICENSE # License file
├── CHANGELOG.md # Changelog
├── PROJECT_STRUCTURE.md # Project structure
│
├── Core Modules
│ ├── train.py # Training script
│ ├── train_advanced.py # Advanced training
│ ├── test.py # Testing script
│ ├── evaluate.py # Model evaluation
│ ├── example_usage.py # Example usage
│ └── config.py # Configuration settings
│
├── Models (models/)
│ ├── cyclegan_model.py # Main CycleGAN model
│ ├── networks.py # Generator & Discriminator
│ └── base_model.py # Base model class
│
├── Data (data/)
│ └── dataset.py # Dataset loader
│
├── Utils
│ ├── utils/ # Image pool, visualization
│ └── util/ # General utilities
│
├── Metrics (metrics/)
│ ├── metrics.py # Metrics calculator
│ ├── fid_score.py # FID score
│ ├── inception_score.py # Inception Score
│ └── lpips_score.py # LPIPS score
│
├── Logger (logger/)
│ ├── tensorboard_logger.py # TensorBoard logger
│ └── file_logger.py # File logger
│
├── Web (web/)
│ ├── app.py # Flask application
│ └── templates/ # Web templates
│
├── Scripts (scripts/)
│ └── download_pretrained.py # Download pre-trained models
│
└── CycleGAN_Demo.ipynb # Jupyter notebook demo

Configuration Options

Model Configuration

Customize model and training parameters in config.py:

# Model Architecture (config.py)
INPUT_NC = 3              # Number of input image channels
OUTPUT_NC = 3             # Number of output image channels
NGF = 64                  # Number of generator filters
NDF = 64                  # Number of discriminator filters
NETG = 'resnet_9blocks'   # Generator architecture
NETD = 'basic'            # Discriminator architecture
NORM = 'instance'         # Normalization type

# Training Parameters (config.py)
BATCH_SIZE = 1            # Training batch size
LEARNING_RATE = 0.0002    # Learning rate
EPOCHS = 200              # Number of training epochs
BETA1 = 0.5               # Adam optimizer beta1
BETA2 = 0.999             # Adam optimizer beta2
LAMBDA_A = 10.0           # Cycle loss weight (A→B→A)
LAMBDA_B = 10.0           # Cycle loss weight (B→A→B)
LAMBDA_IDENTITY = 0.5     # Identity loss weight
CROP_SIZE = 256           # Image crop size
LOAD_SIZE = 286           # Image load size
DEVICE = 'cuda'           # Device: 'cuda' or 'cpu'

# Testing Configuration
DIRECTION = 'AtoB'        # Translation direction
NUM_TEST = 50             # Number of test images

Configuration Tips:

  • LAMBDA_A/LAMBDA_B: Cycle loss weights. Standard is 10.0. Higher = stronger cycle consistency
  • CROP_SIZE: Image size after cropping. Common: 256, 512. Higher = more memory
  • BATCH_SIZE: Usually 1 for CycleGAN. Larger batches may cause memory issues
  • LEARNING_RATE: Start with 0.0002. CycleGAN uses lower learning rates
  • LAMBDA_IDENTITY: Identity loss weight. 0.5 is standard. Use 0.0 if not needed
  • NETG: Generator architecture. 'resnet_9blocks' is standard, 'unet_256' for simpler tasks
  • NORM: Normalization type. 'instance' is standard, 'batch' for some cases

Training Progress Logging

The training script automatically logs progress to TensorBoard and saves checkpoints:

# Training logs are saved to:
# checkpoints/experiment_name/logs/ - TensorBoard logs
# checkpoints/experiment_name/ - Model checkpoints
# results/experiment_name/ - Translated sample images

# View TensorBoard logs
tensorboard --logdir checkpoints/experiment_name/logs

# TensorBoard shows:
# - Generator loss per epoch
# - Discriminator loss per epoch
# - Cycle consistency loss per epoch
# - Identity loss per epoch
# - Translated image samples
# - Cycle reconstructed images
# - Model parameters

# Checkpoints are saved as:
# checkpoints/experiment_name/netG_A2B_latest.pth
# checkpoints/experiment_name/netG_B2A_latest.pth
# checkpoints/experiment_name/netD_A_latest.pth
# checkpoints/experiment_name/netD_B_latest.pth

Advanced Training Options

Use the advanced training script with additional features:

# Advanced training with all features
python train_advanced.py \
    --dataroot ./datasets/your_dataset \
    --name experiment_name \
    --epochs 200 \
    --batch-size 1 \
    --tensorboard \
    --early-stopping 10 \
    --lr-scheduler \
    --augment \
    --grad-clip 1.0 \
    --mixed_precision

# Features enabled:
# --tensorboard: Enable TensorBoard logging
# --early-stopping N: Stop if no improvement for N epochs
# --lr-scheduler: Use learning rate scheduling
# --augment: Enable data augmentation
# --grad-clip: Gradient clipping value
# --mixed_precision: Mixed precision training (faster, less memory)

# Resume training from checkpoint
python train_advanced.py \
    --dataroot ./datasets/your_dataset \
    --name experiment_name \
    --continue_train \
    --epochs 200

Training on Different Datasets

Examples for training on different datasets:

# Training on Horse2Zebra dataset
python train.py --dataroot ./datasets/horse2zebra --name horse2zebra --epochs 200

# Training on Apple2Orange dataset
python train.py --dataroot ./datasets/apple2orange --name apple2orange --epochs 200

# Training on custom unpaired dataset
python train.py \
    --dataroot ./datasets/custom_dataset \
    --name custom_experiment \
    --epochs 200 \
    --crop_size 256 \
    --load_size 286

# For high-resolution images (512x512 or larger)
python train.py \
    --dataroot ./datasets/custom_dataset \
    --name high_res_experiment \
    --crop_size 512 \
    --load_size 572 \
    --batch_size 1

Detailed Architecture

CycleGAN Components

1. Generator G_A2B:

  • Translates images from domain A to domain B
  • Uses ResNet-based architecture (9 blocks standard)
  • Instance normalization for style transfer
  • Learns domain A to domain B mapping
  • Outputs translated images in domain B

2. Generator G_B2A:

  • Translates images from domain B to domain A
  • Mirror architecture of G_A2B
  • Enables bidirectional translation
  • Learns domain B to domain A mapping
  • Outputs translated images in domain A

3. Discriminators D_A and D_B:

  • D_A distinguishes real A from translated A (B→A)
  • D_B distinguishes real B from translated B (A→B)
  • Provides adversarial feedback during training
  • Uses PatchGAN architecture (70x70 patches)
  • Enables realistic image translations

Loss Function

CycleGAN uses adversarial training with cycle consistency loss:

Adversarial Loss:
L_GAN(G_A2B, D_B) = E[log D_B(B)] + E[log(1 - D_B(G_A2B(A)))]
L_GAN(G_B2A, D_A) = E[log D_A(A)] + E[log(1 - D_A(G_B2A(B)))]

Cycle Consistency Loss:
L_cyc(G_A2B, G_B2A) = E[||G_B2A(G_A2B(A)) - A||₁] + E[||G_A2B(G_B2A(B)) - B||₁]

Identity Loss (optional):
L_identity(G_A2B, G_B2A) = E[||G_B2A(A) - A||₁] + E[||G_A2B(B) - B||₁]

Total Generator Loss:
L_G = L_GAN(G_A2B, D_B) + L_GAN(G_B2A, D_A) + λ_cyc × L_cyc + λ_id × L_identity

Where:
- G_A2B: Generator A→B
- G_B2A: Generator B→A
- D_A, D_B: Discriminators
- A, B: Images from domains A and B
- λ_cyc: Cycle loss weight (default: 10.0)
- λ_id: Identity loss weight (default: 0.5)

The cycle consistency loss ensures meaningful translations.
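In PyTorch terms, the generator-side objective can be written as the following sketch. It assumes the least-squares GAN loss used by the original CycleGAN implementation and uses tiny stand-in networks so the snippet runs on its own; the real project wires this up inside models/cyclegan_model.py.

import torch
import torch.nn as nn

# Stand-in networks and inputs so the sketch runs; the project uses the ResNet
# generators and PatchGAN discriminators from models/networks.py.
G_A2B, G_B2A = nn.Conv2d(3, 3, 3, padding=1), nn.Conv2d(3, 3, 3, padding=1)
D_A, D_B = nn.Conv2d(3, 1, 3, padding=1), nn.Conv2d(3, 1, 3, padding=1)
real_A, real_B = torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256)

gan_loss = nn.MSELoss()          # least-squares GAN loss (LSGAN), as in the original CycleGAN code
l1 = nn.L1Loss()
lambda_cyc, lambda_id = 10.0, 0.5

fake_B = G_A2B(real_A)           # A -> B
fake_A = G_B2A(real_B)           # B -> A
rec_A = G_B2A(fake_B)            # A -> B -> A
rec_B = G_A2B(fake_A)            # B -> A -> B

# Adversarial terms: generators try to make the discriminators predict "real" (1)
pred_B, pred_A = D_B(fake_B), D_A(fake_A)
loss_gan = gan_loss(pred_B, torch.ones_like(pred_B)) + gan_loss(pred_A, torch.ones_like(pred_A))

# Cycle consistency (L1) and optional identity terms
loss_cyc = l1(rec_A, real_A) + l1(rec_B, real_B)
loss_id = l1(G_B2A(real_A), real_A) + l1(G_A2B(real_B), real_B)

loss_G = loss_gan + lambda_cyc * loss_cyc + lambda_id * loss_id
print(f"Generator loss: {loss_G.item():.4f}")

The discriminators are updated with the mirrored objective (real images toward 1, translated images toward 0), typically sampling past fakes from an image pool (see utils/) to stabilize training.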

Cycle Consistency

Ensures meaningful bidirectional translations:

Forward Cycle:
A → G_A2B(A) → G_B2A(G_A2B(A)) ≈ A

Backward Cycle:
B → G_B2A(B) → G_A2B(G_B2A(B)) ≈ B

Where:
- A: Image from domain A
- B: Image from domain B
- G_A2B: Generator translating A to B
- G_B2A: Generator translating B to A

This ensures:
- Content preservation during translation
- Meaningful bidirectional mappings
- No mode collapse
- Realistic translations

Generator Architecture Details

ResNet Generator (9 blocks):

  • Input Layer: Receives images of size [batch, 3, 256, 256]
  • Initial Convolution: 7x7 conv → InstanceNorm → ReLU
  • Downsampling: Two 3x3 conv layers with stride=2
  • Residual Blocks: 9 ResNet blocks with InstanceNorm
  • Upsampling: Two 3x3 transposed conv layers with stride=2
  • Output Layer: 7x7 conv → Tanh activation
  • Output: Translated image [batch, 3, 256, 256]
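For reference, a compact PyTorch sketch of that layer layout follows. The project's actual generator lives in models/networks.py and may differ in padding, dropout, and initialization details.

import torch
import torch.nn as nn

class ResnetBlock(nn.Module):
    """One residual block: 3x3 conv -> InstanceNorm -> ReLU -> 3x3 conv -> InstanceNorm."""
    def __init__(self, dim):
        super().__init__()
        self.block = nn.Sequential(
            nn.ReflectionPad2d(1), nn.Conv2d(dim, dim, 3), nn.InstanceNorm2d(dim), nn.ReLU(True),
            nn.ReflectionPad2d(1), nn.Conv2d(dim, dim, 3), nn.InstanceNorm2d(dim),
        )
    def forward(self, x):
        return x + self.block(x)

class ResnetGenerator(nn.Module):
    """Sketch of the 9-block ResNet generator layout described above."""
    def __init__(self, in_ch=3, out_ch=3, ngf=64, n_blocks=9):
        super().__init__()
        layers = [nn.ReflectionPad2d(3), nn.Conv2d(in_ch, ngf, 7), nn.InstanceNorm2d(ngf), nn.ReLU(True)]
        # Downsampling: 64 -> 128 -> 256 channels, spatial size / 4
        for mult in (1, 2):
            layers += [nn.Conv2d(ngf * mult, ngf * mult * 2, 3, stride=2, padding=1),
                       nn.InstanceNorm2d(ngf * mult * 2), nn.ReLU(True)]
        layers += [ResnetBlock(ngf * 4) for _ in range(n_blocks)]
        # Upsampling back to the input resolution
        for mult in (4, 2):
            layers += [nn.ConvTranspose2d(ngf * mult, ngf * mult // 2, 3, stride=2,
                                          padding=1, output_padding=1),
                       nn.InstanceNorm2d(ngf * mult // 2), nn.ReLU(True)]
        layers += [nn.ReflectionPad2d(3), nn.Conv2d(ngf, out_ch, 7), nn.Tanh()]
        self.model = nn.Sequential(*layers)
    def forward(self, x):
        return self.model(x)

# Quick shape check: [1, 3, 256, 256] in -> [1, 3, 256, 256] out
g = ResnetGenerator()
print(g(torch.randn(1, 3, 256, 256)).shape)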

Discriminator Architecture (PatchGAN):

  • Input Layer: Receives images of size [batch, 3, 256, 256]
  • Convolutional Blocks: 4 conv layers with LeakyReLU
  • Downsampling: Uses stride=2 convolutions
  • Patch Output: Each output value scores a 70x70 patch (receptive field) of the input
  • Output: Real/fake prediction map, e.g. [batch, 1, 30, 30] for a 256x256 input
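A corresponding sketch of the PatchGAN layout is shown below. Again this is illustrative; the project's 'basic' discriminator is defined in models/networks.py.

import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Sketch of the 70x70 PatchGAN: each output value scores one input patch."""
    def __init__(self, in_ch=3, ndf=64):
        super().__init__()
        self.model = nn.Sequential(
            nn.Conv2d(in_ch, ndf, 4, stride=2, padding=1), nn.LeakyReLU(0.2, True),
            nn.Conv2d(ndf, ndf * 2, 4, stride=2, padding=1), nn.InstanceNorm2d(ndf * 2), nn.LeakyReLU(0.2, True),
            nn.Conv2d(ndf * 2, ndf * 4, 4, stride=2, padding=1), nn.InstanceNorm2d(ndf * 4), nn.LeakyReLU(0.2, True),
            nn.Conv2d(ndf * 4, ndf * 8, 4, stride=1, padding=1), nn.InstanceNorm2d(ndf * 8), nn.LeakyReLU(0.2, True),
            nn.Conv2d(ndf * 8, 1, 4, stride=1, padding=1),  # one score per patch
        )
    def forward(self, x):
        return self.model(x)

# For a 256x256 input the prediction map is 30x30 (each value covers a 70x70 patch)
d = PatchDiscriminator()
print(d(torch.randn(1, 3, 256, 256)).shape)  # torch.Size([1, 1, 30, 30])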

Mathematical Formulation

Complete mathematical description of CycleGAN:

# Generator A→B:
G_A2B: A → B
B_fake = G_A2B(A)
where A ∈ domain A, B_fake ∈ domain B

# Generator B→A:
G_B2A: B → A
A_fake = G_B2A(B)
where B ∈ domain B, A_fake ∈ domain A

# Discriminator A:
D_A: A → [0, 1]
D_A(A) ∈ [0, 1]        # Real A probability
D_A(A_fake) ∈ [0, 1]   # Fake A probability

# Discriminator B:
D_B: B → [0, 1]
D_B(B) ∈ [0, 1]        # Real B probability
D_B(B_fake) ∈ [0, 1]   # Fake B probability

# Cycle Consistency:
A_recon = G_B2A(G_A2B(A)) ≈ A
B_recon = G_A2B(G_B2A(B)) ≈ B

# Loss Function:
L_total = L_GAN + λ_cyc × L_cyc + λ_id × L_identity

# Where:
# - G_A2B, G_B2A: Generators
# - D_A, D_B: Discriminators
# - λ_cyc: Cycle loss weight (10.0)
# - λ_id: Identity loss weight (0.5)

Identity Loss

Understanding the identity mapping loss in CycleGAN:

  • Identity Loss = 0.0: No identity mapping, standard CycleGAN
  • Identity Loss = 0.5: Balanced identity preservation (recommended)
  • Identity Loss > 0.5: Stronger identity preservation, less style change
  • Identity Loss < 0.5: More style change, less identity preservation
  • Formula: L_id = ||G_B2A(A) - A||₁ + ||G_A2B(B) - B||₁
  • Use Case: Useful when domains are similar (e.g., photo enhancement)

Advanced Features Usage

Image Translation Usage

Translate images using the trained CycleGAN model:

# Translate images using trained CycleGAN model
from models.cyclegan_model import CycleGANModel
from PIL import Image
import torch

# Load model
model = CycleGANModel()
model.load_networks('latest', './checkpoints/experiment_name/')
model.set_requires_grad([], False)
model.eval()

# Load input image
input_image = Image.open("input_image.jpg")
input_tensor = model.preprocess_image(input_image)

# Translate from domain A to B
translated = model.translate_image(input_tensor, direction='AtoB')

# Save translated image
from utils.visualization import save_image
save_image(translated, "translated_image.png")

# Translate from domain B to A
translated_back = model.translate_image(input_tensor, direction='BtoA')
save_image(translated_back, "translated_back.png")

Cycle Consistency Visualization

Visualize cycle consistency and bidirectional translations:

from models.cyclegan_model import CycleGANModel
from utils.visualization import visualize_cycle
from PIL import Image

# Load model
model = CycleGANModel()
model.load_networks('latest', './checkpoints/experiment_name/')
model.eval()

# Load input image from domain A
input_image = Image.open("input_A.jpg")
input_tensor = model.preprocess_image(input_image)

# Forward cycle: A → B → A
translated_B = model.translate_image(input_tensor, direction='AtoB')
reconstructed_A = model.translate_image(translated_B, direction='BtoA')

# Visualize cycle consistency
visualize_cycle(input_tensor, translated_B, reconstructed_A, "cycle_consistency.png")

# Backward cycle: B → A → B
input_B = Image.open("input_B.jpg")
input_B_tensor = model.preprocess_image(input_B)
translated_A = model.translate_image(input_B_tensor, direction='BtoA')
reconstructed_B = model.translate_image(translated_A, direction='AtoB')

# Save results
visualize_cycle(input_B_tensor, translated_A, reconstructed_B, "cycle_consistency_B.png")

Model Evaluation

Evaluate model performance with SSIM and diversity metrics:

from evaluate import evaluate_model

# Evaluate on test dataset
results = evaluate_model(
    checkpoint_path="outputs/checkpoints/generator.pth",
    dataset_path="./data/train",
    device="cuda"
)

# Results include:
# - SSIM (Structural Similarity Index)
# - Diversity metrics
# - Generated image quality
# - Model performance statistics

print(f"SSIM: {results['ssim']:.4f}")
print(f"Diversity: {results['diversity']:.4f}")
print(f"Quality Score: {results['quality']:.4f}")

Dataset Preparation

Prepare your custom dataset for training:

# Prepare custom dataset
# Place images in data/custom/ directory
# Supported formats: .jpg, .png, .jpeg

# Directory structure:
# data/custom/
# ├── image1.jpg
# ├── image2.png
# └── ...

# The training script will automatically:
# - Load images from directory
# - Apply data augmentation
# - Resize to specified dimensions
# - Normalize pixel values
# - Create data loaders for training

# Use custom dataset for training
python train.py --dataset custom --data-dir data/custom --epochs 50

Latent Space Exploration

Explore the learned latent space:

# Style mixing visualization is available in the Jupyter notebooks
jupyter notebook notebooks/04_style_mixing.ipynb

# The notebooks include:
# - Style mixing at different layers
# - Interpolation examples in W space
# - Progressive growing visualization
# - Generated image samples

# Visualize style mixing
from visualize import visualize_style_mixing
visualize_style_mixing(generator, num_samples=16, mix_layers=[4, 5, 6])

Advanced Visualization Techniques

Use the visualization script for comprehensive analysis:

# Style mixing visualization
python visualize.py --checkpoint outputs/checkpoints/generator.pth --mode style_mix --output style_mix.png

# Progressive growing visualization
python visualize.py --checkpoint outputs/checkpoints/generator.pth --mode progressive --output progressive.png

# Interpolation visualization
python visualize.py --checkpoint outputs/checkpoints/generator.pth --mode interpolate --num-steps 10 --output interpolation.png

# Generated samples grid
python visualize.py --checkpoint outputs/checkpoints/generator.pth --mode samples --num-samples 16 --output samples.png

# All visualizations at once
python visualize.py --checkpoint outputs/checkpoints/generator.pth --mode all --output_dir visualizations/

Model Export and Deployment

Export trained models for deployment:

# Export to ONNX format (for production deployment)
python scripts/convert_checkpoint.py \
    --checkpoint outputs/checkpoints/generator.pth \
    --format onnx \
    --output stylegan_generator.onnx \
    --latent-dim 512

# Export to TorchScript (PyTorch mobile/edge)
python scripts/convert_checkpoint.py \
    --checkpoint outputs/checkpoints/generator.pth \
    --format torchscript \
    --output stylegan_generator.pt

# Export generator and discriminator separately
python scripts/convert_checkpoint.py \
    --checkpoint outputs/checkpoints/generator.pth \
    --format onnx \
    --components generator \
    --output-dir exported_models/

# Verify exported model
python scripts/convert_checkpoint.py \
    --checkpoint outputs/checkpoints/generator.pth \
    --format onnx \
    --output stylegan_generator.onnx \
    --verify

Model Comparison and Analysis

Compare different trained models:

# Compare multiple models
python evaluate.py \
    --checkpoints outputs/checkpoints/generator_epoch_50.pth \
        outputs/checkpoints/generator_epoch_100.pth \
    --dataset ./data/train \
    --output comparison_report.html

# Compare models with different truncation values
python generate.py \
    --checkpoint outputs/checkpoints/generator.pth \
    --num_images 16 \
    --truncation 0.5 \
    --output_dir comparisons/trunc_0.5

python generate.py \
    --checkpoint outputs/checkpoints/generator.pth \
    --num_images 16 \
    --truncation 0.7 \
    --output_dir comparisons/trunc_0.7

# Generate side-by-side comparisons
python visualize.py \
    --checkpoint outputs/checkpoints/generator.pth \
    --mode compare \
    --num-samples 16 \
    --output-dir model_comparisons/

Complete Training Workflow

Step-by-Step Training Process

Step 1: Prepare Data

# Prepare custom dataset in data/train/ directory
# For custom dataset:
# Place images in data/train/
# Supported formats: .jpg, .png, .jpeg

# Preprocess images first:
python preprocess.py --input_dir ./raw_images --output_dir ./data/train --size 256

# The training script will automatically:
# - Load images from directory
# - Resize and normalize images
# - Create data loaders for progressive training

Step 2: Train Model

# Start training
python train.py --dataset ./data/train --output_dir ./outputs --epochs 100

# Training will:
# 1. Load and preprocess images
# 2. Initialize generator and discriminator
# 3. Train with adversarial loss + gradient penalty
# 4. Progressive growing from low to high resolution
# 5. Save checkpoints and best model
# 6. Log training history to TensorBoard
# 7. Generate sample images during training

Step 3: Monitor Training

  • Watch console output for epoch progress
  • Check TensorBoard: tensorboard --logdir outputs/logs
  • View generated samples in outputs/samples/
  • Best model saved as outputs/checkpoints/best_generator.pth

Step 4: Evaluate Model

# Evaluate on test set
python evaluate.py --checkpoint outputs/checkpoints/generator.pth --dataset ./data/train

# Calculate SSIM and diversity metrics

Step 5: Generate Images

# Generate images
python generate.py --checkpoint outputs/checkpoints/generator.pth --num_images 16 --output_dir ./generated

# Generate style interpolation
python generate.py --checkpoint outputs/checkpoints/generator.pth --interpolate --num_steps 10 --output_dir ./generated

API Usage Examples

Image Generation Endpoint (cURL)

Generate images using the REST API:

curl -X POST http://localhost:5000/generate \
    -H "Content-Type: application/json" \
    -d '{ "num_images": 10 }'

# Response:
# {
#   "num_images": 10,
#   "images": ["base64_encoded_image1", ...],
#   "status": "success"
# }

Latent Interpolation Endpoint (cURL)

Generate interpolation between latent points:

curl -X POST http://localhost:5000/interpolate \
    -H "Content-Type: application/json" \
    -d '{ "num_steps": 10 }'

# Response:
# {
#   "num_steps": 10,
#   "images": ["base64_encoded_image1", ...],
#   "status": "success"
# }

Health Check (cURL)

Check API server health and model status:

curl -X GET http://localhost:5000/health

# Response:
# {
#   "status": "healthy",
#   "model_loaded": true,
#   "device": "cuda"
# }

Python Requests Example

Use the API with Python requests library:

import requests
import base64
from PIL import Image
from io import BytesIO

# Image generation endpoint
response = requests.post(
    'http://localhost:5000/generate',
    json={'num_images': 10}
)
data = response.json()

# Decode and save images
for i, img_base64 in enumerate(data['images']):
    img_data = base64.b64decode(img_base64)
    img = Image.open(BytesIO(img_data))
    img.save(f'generated_{i}.png')

# Interpolation endpoint
interp_response = requests.post(
    'http://localhost:5000/interpolate',
    json={'num_steps': 10}
)
print(interp_response.json())

# Health check
health = requests.get('http://localhost:5000/health')
print(health.json())

JavaScript/Fetch Example

Use the API with JavaScript fetch API:

// Image generation
fetch('http://localhost:5000/generate', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({ num_images: 10 })
})
  .then(res => res.json())
  .then(data => {
    console.log('Generated', data.num_images, 'images');
    // Display images from base64 data
    data.images.forEach((imgBase64, i) => {
      const img = document.createElement('img');
      img.src = 'data:image/png;base64,' + imgBase64;
      document.body.appendChild(img);
    });
  });

// Interpolation
fetch('http://localhost:5000/interpolate', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({ num_steps: 10 })
})
  .then(res => res.json())
  .then(data => {
    console.log('Interpolation steps:', data.num_steps);
  });

// Health check
fetch('http://localhost:5000/health')
  .then(res => res.json())
  .then(data => console.log('Status:', data));

CycleGAN Model Variants

Model | Max Resolution | Generator | Use Case | Quality
Basic CycleGAN | 256x256 | ResNet-6 | Fast training, basic tasks | Good
Standard CycleGAN | 256x256 | ResNet-9 | Balanced quality/speed | Better
High-Res CycleGAN | 512x512 | ResNet-9 | Higher quality translation | Best
UNet CycleGAN | 256x256 | UNet-256 | Simpler architecture | Good

Dataset Information

Dataset Formats

The project supports multiple dataset formats for image training:

  • Built-in datasets: MNIST, CIFAR-10 (automatically downloaded)
  • Custom dataset: Directory of images (JPG, PNG, JPEG)
  • Automatic image loading and preprocessing
  • Data augmentation support
  • Train/validation split support
  • Multiple image format support

Custom Dataset Format

Training data is stored as image files in a directory:

# Custom dataset directory structure
data/custom/
├── image1.jpg
├── image2.png
├── image3.jpeg
└── ...

# The training script automatically:
# - Loads images from directory
# - Applies data augmentation
# - Resizes to specified dimensions
# - Normalizes pixel values
# - Creates data loaders for training
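Note that the CycleGAN training described elsewhere in this documentation expects unpaired data organized into trainA/ and trainB/ folders rather than a single flat directory. A minimal sketch of an unpaired dataset class for that layout follows; the class name and transform choices are illustrative, not the project's data/dataset.py API.

# Minimal unpaired dataset sketch (illustrative; the project's loader lives in data/dataset.py).
import os
import random
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class UnpairedImageDataset(Dataset):
    def __init__(self, root, phase="train", load_size=286, crop_size=256):
        self.paths_A = sorted(os.path.join(root, f"{phase}A", f)
                              for f in os.listdir(os.path.join(root, f"{phase}A")))
        self.paths_B = sorted(os.path.join(root, f"{phase}B", f)
                              for f in os.listdir(os.path.join(root, f"{phase}B")))
        self.transform = transforms.Compose([
            transforms.Resize(load_size),
            transforms.RandomCrop(crop_size),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
        ])

    def __len__(self):
        return max(len(self.paths_A), len(self.paths_B))

    def __getitem__(self, index):
        # Images are unpaired: pick A by index, B at random
        path_A = self.paths_A[index % len(self.paths_A)]
        path_B = random.choice(self.paths_B)
        A = self.transform(Image.open(path_A).convert("RGB"))
        B = self.transform(Image.open(path_B).convert("RGB"))
        return {"A": A, "B": B}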

Adding Custom Training Data

Add your own image dataset for training:

# Place images in data/custom/ directory
# Supported formats: .jpg, .png, .jpeg

# Example:
mkdir -p data/custom
cp your_images/*.jpg data/custom/

# Use in training
python train.py --dataset custom --data-dir data/custom --epochs 50

# The script will automatically:
# - Load all images from directory
# - Apply augmentation if enabled
# - Resize and normalize images
# - Create train/validation splits

Troubleshooting & Best Practices

Common Issues

  • CUDA Out of Memory: Reduce BATCH_SIZE, lower CROP_SIZE (e.g. 128 instead of 256), reduce NGF/NDF, or fall back to CPU mode
  • Model Not Found: Ensure model is trained first by running train.py or loading from models/ directory. Check model path is correct
  • Vocabulary Not Found: Ensure vocabularies are saved during training. Check vocab_dir path matches training save_dir
  • Slow Generation: Use smaller latent_dim (64 instead of 128), reduce hidden_dims, or use CPU mode
  • API Connection Error: Check if api.py is running on port 5000. Verify model path is correct
  • Import Errors: Verify all dependencies installed: pip install -r requirements.txt. Check Python version (3.8+)
  • Image Size Mismatch: Ensure all images are same size or use data augmentation to resize. Check IMAGE_SIZE in config
  • Poor Generation Quality: Train for more epochs, use larger latent_dim, increase training data, or adjust beta (KL weight)
  • Training Loss Not Decreasing: Check learning rate (may be too high/low), verify data format, check for data issues
  • Validation Loss Increasing: Model may be overfitting. Increase beta (KL weight), use more data, or reduce model size
  • Blurry Generated Images: Increase latent_dim, train longer, or adjust beta to balance reconstruction and KL divergence
  • Mode Collapse: Increase beta value, use more diverse training data, or try different architectures
  • KL Divergence Too High: Reduce beta value or increase model capacity
  • KL Divergence Too Low: Increase beta value to encourage better latent structure
  • Training Instability: Reduce learning rate, use gradient clipping, or try different optimizer
  • Memory Issues: Reduce batch size, use smaller image size, or enable mixed precision training

Performance Optimization Tips

  • GPU Memory: Use gradient accumulation for effective larger batch sizes: accumulate gradients over N batches before updating
  • Mixed Precision: Enable AMP (Automatic Mixed Precision) for 2x speedup and 50% memory reduction
  • Data Loading: Use multiple workers (num_workers=4-8) and pin_memory=True for faster data loading
  • Model Pruning: Remove unnecessary layers or reduce hidden dimensions for faster inference
  • Quantization: Use INT8 quantization for 4x speedup in production (with slight quality loss)
  • Batch Inference: Generate multiple images in batches rather than one at a time
  • Model Caching: Load model once and reuse for multiple generations
  • ONNX Runtime: Use exported ONNX models with ONNX Runtime for faster inference
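As an illustration of the mixed-precision and gradient-clipping tips above, here is a self-contained PyTorch sketch; a stand-in generator, a placeholder loss, and random tensors replace the real CycleGAN networks and dataloader, so this is not the project's train_advanced.py code.

import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
G = nn.Conv2d(3, 3, 3, padding=1).to(device)        # stand-in generator
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

for step in range(3):                                # stand-in for the real dataloader loop
    real_A = torch.randn(1, 3, 256, 256, device=device)
    opt_G.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast(enabled=(device == "cuda")):
        fake_B = G(real_A)
        loss_G = fake_B.abs().mean()                 # placeholder for the CycleGAN losses
    scaler.scale(loss_G).backward()
    scaler.unscale_(opt_G)                           # so clipping sees true gradient magnitudes
    torch.nn.utils.clip_grad_norm_(G.parameters(), max_norm=1.0)
    scaler.step(opt_G)
    scaler.update()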

Best Practices

  • Training Data: Use diverse, high-quality image datasets. More data = better results. Aim for 1K+ images minimum
  • Data Format: Ensure images are in supported formats (JPG, PNG, JPEG). Consistent image sizes recommended
  • Data Preprocessing: Normalize pixel values, resize images to consistent dimensions, apply augmentation if needed
  • Batch Size: Use smaller batches (32-64) for limited GPU memory. Larger batches (128+) for faster training if memory allows
  • Learning Rate: Start with 0.001 and adjust based on training loss. Use learning rate scheduling for better convergence
  • Gradient Clipping: Default is 1.0. Increase if training is unstable, decrease if gradients are too small
  • Beta (KL Weight): Start with 1.0. Higher = more regularization, lower = better reconstruction. Adjust based on results
  • Model Selection: Start with latent_dim=64 for speed/testing. Use 128+ for production quality. Higher = more expressive
  • Evaluation: Regularly evaluate reconstruction and KL divergence on validation set. Monitor for overfitting
  • Latent Dimension: Higher latent_dim = more expressive but slower. Common values: 64, 128, 256. Start with 128
  • Checkpointing: Model saves checkpoints automatically. Can resume training from checkpoint if needed
  • API Rate Limiting: Implement rate limiting for production deployments. Consider using nginx or similar
  • Logging: Monitor TensorBoard logs for debugging and optimization. View in outputs/logs/
  • Device Selection: Use CUDA if available for faster training. CPU works but much slower
  • Beta Tuning: Start with β=1.0, increase for better latent structure, decrease for better reconstruction
  • Latent Dimension: Start with 128, increase for more expressive models, decrease for faster training
  • Early Stopping: Monitor validation loss, stop if no improvement for 10-15 epochs
  • Checkpointing: Save checkpoints every 5-10 epochs to avoid losing progress
  • Data Augmentation: Use for small datasets to improve generalization
  • Learning Rate: Use learning rate scheduling (ReduceLROnPlateau) for better convergence

Use Cases and Applications

  • Image Generation: Generate new images from random latent vectors
  • Image Reconstruction: Reconstruct and denoise images
  • Image Interpolation: Create smooth transitions between images
  • Anomaly Detection: Detect outliers by high reconstruction error
  • Data Augmentation: Generate synthetic training data
  • Image Editing: Manipulate images in latent space
  • Feature Learning: Learn meaningful image representations
  • Dimensionality Reduction: Compress images to low-dimensional latent space
  • Style Transfer: Transfer styles by manipulating latent vectors
  • Image Completion: Complete missing parts of images

Performance Optimization

  • GPU Usage: Set CUDA_VISIBLE_DEVICES for multi-GPU systems. Use GPU for training and inference when available
  • Model Selection: Use latent_dim=64 for fastest inference. Larger models (128+, 256+) for better quality
  • Batch Processing: Generate multiple images in batches for efficient processing. Reduces overhead
  • Caching: API server caches model in memory. Model loads once on first request, then reused
  • Image Size: Use smaller image sizes (64x64) for faster generation. Larger images (128x128+) for better quality
  • Latent Sampling: Sample from standard normal distribution N(0, I) for generation. Use interpolation for smooth transitions
  • Model Quantization: Consider model quantization for production to reduce memory and speed up inference
  • Async Processing: For high-throughput, consider async API or queue system for batch image generation
  • Memory Management: Clear GPU cache between batches if running out of memory: torch.cuda.empty_cache()
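A small sketch of batched, no-grad inference reflecting the batch-processing and memory-management tips above; the generator and tensor names are illustrative.

import torch

@torch.no_grad()
def translate_batch(G_A2B, images, device="cuda", batch_size=8):
    """Translate a stack of images [N, 3, H, W] in chunks to bound GPU memory."""
    G_A2B.eval().to(device)
    outputs = []
    for i in range(0, images.size(0), batch_size):
        chunk = images[i:i + batch_size].to(device)
        outputs.append(G_A2B(chunk).cpu())
    torch.cuda.empty_cache()   # release cached GPU memory after the run if needed
    return torch.cat(outputs)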

Expected Training Times

Approximate training times for different configurations:

Dataset | Image Size | Latent Dim | Batch Size | Epochs | GPU | Time
MNIST | 28×28 | 64 | 128 | 50 | GTX 1080 | ~15 min
MNIST | 28×28 | 128 | 128 | 50 | GTX 1080 | ~20 min
CIFAR-10 | 32×32 | 128 | 128 | 100 | GTX 1080 | ~2 hours
Custom | 64×64 | 256 | 64 | 50 | RTX 3090 | ~3 hours
Custom | 128×128 | 512 | 32 | 50 | RTX 3090 | ~8 hours

Note: Times are approximate and depend on hardware, dataset size, and other factors. CPU training is typically 10-20x slower.

Model Size and Memory Requirements

Approximate model sizes and memory usage:

Latent Dim | Hidden Dims | Model Size | GPU Memory (Training) | GPU Memory (Inference)
64 | [32, 64, 128] | ~5 MB | ~500 MB | ~200 MB
128 | [32, 64, 128, 256] | ~15 MB | ~1.5 GB | ~500 MB
256 | [64, 128, 256, 512] | ~50 MB | ~4 GB | ~1.5 GB
512 | [128, 256, 512, 1024] | ~200 MB | ~12 GB | ~4 GB

Note: Memory usage depends on batch size and image size. Larger batches and images require more memory.

Real-World Examples & Use Cases

Example 1: Training on Custom Face Dataset

Complete workflow for training on a custom face dataset:

# 1. Prepare dataset
mkdir -p datasets/faces
# Copy face images into datasets/faces/ (trainA/ and trainB/ subfolders)

# 2. Train CycleGAN model
python train.py \
    --dataroot ./datasets/faces \
    --name faces_experiment \
    --epochs 200 \
    --batch_size 1 \
    --crop_size 256 \
    --tensorboard

# 3. Translate faces between domains
python test.py \
    --dataroot ./datasets/faces \
    --name faces_experiment \
    --model test \
    --direction AtoB \
    --num_test 50

# 4. Visualize cycle consistency
python test.py \
    --dataroot ./datasets/faces \
    --name faces_experiment \
    --model test \
    --direction AtoB \
    --results_dir ./face_translations

Example 2: Photo to Painting

Use CycleGAN for photo to painting style transfer:

# Train on photo and painting datasets
python train.py --dataroot ./datasets/photo2painting --name photo2painting --epochs 200

# Translate photos to paintings
python test.py \
    --dataroot ./datasets/photo2painting \
    --name photo2painting \
    --model test \
    --direction AtoB

# Translated images saved in results/photo2painting/test_latest/images/

Example 3: Season Transfer

Transfer images between seasons (summer to winter):

# Train on summer and winter datasets
python train.py --dataroot ./datasets/summer2winter --name summer2winter --epochs 200

# Translate summer images to winter
python test.py \
    --dataroot ./datasets/summer2winter \
    --name summer2winter \
    --model test \
    --direction AtoB

# Translate winter images to summer
python test.py \
    --dataroot ./datasets/summer2winter \
    --name summer2winter \
    --model test \
    --direction BtoA

Example 4: Day to Night Conversion

Convert day images to night and vice versa:

# Train on day and night datasets
python train.py --dataroot ./datasets/day2night --name day2night --epochs 200

# Convert day images to night
python test.py \
    --dataroot ./datasets/day2night \
    --name day2night \
    --model test \
    --direction AtoB

# Convert night images to day
python test.py \
    --dataroot ./datasets/day2night \
    --name day2night \
    --model test \
    --direction BtoA

Example 5: Object Transfiguration

Transform objects between different types (e.g., horse to zebra):

# Train on horse and zebra datasets
python train.py --dataroot ./datasets/horse2zebra --name horse2zebra --epochs 200

# Transform horses to zebras
python test.py \
    --dataroot ./datasets/horse2zebra \
    --name horse2zebra \
    --model test \
    --direction AtoB

# Transform zebras to horses
python test.py \
    --dataroot ./datasets/horse2zebra \
    --name horse2zebra \
    --model test \
    --direction BtoA

Integration Examples

Integration with Flask Web Application

Integrate CycleGAN into a Flask web application:

from flask import Flask, request, jsonify, send_file
from models.cyclegan_model import CycleGANModel
from PIL import Image
import io
import base64

app = Flask(__name__)

model = CycleGANModel()
model.load_networks('latest', './checkpoints/experiment_name/')
model.set_requires_grad([], False)
model.eval()

@app.route('/translate', methods=['POST'])
def translate():
    # Receive image file
    file = request.files['image']
    direction = request.form.get('direction', 'AtoB')
    image = Image.open(file.stream)

    # Translate image
    translated = model.translate_image(image, direction=direction)

    # Convert to base64 for JSON response
    buffered = io.BytesIO()
    translated.save(buffered, format="PNG")
    img_str = base64.b64encode(buffered.getvalue()).decode()

    return jsonify({'image': img_str, 'direction': direction})

@app.route('/translate_file', methods=['POST'])
def translate_file():
    # Receive image file
    file = request.files['image']
    direction = request.form.get('direction', 'AtoB')
    image = Image.open(file.stream)

    # Translate image
    translated = model.translate_image(image, direction=direction)

    # Return translated image
    img_io = io.BytesIO()
    translated.save(img_io, 'PNG')
    img_io.seek(0)
    return send_file(img_io, mimetype='image/png')

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

Integration with FastAPI

Create a FastAPI service for CycleGAN image translation:

from fastapi import FastAPI, File, UploadFile, Form
from fastapi.responses import StreamingResponse
from models.cyclegan_model import CycleGANModel
from PIL import Image
import io

app = FastAPI()

model = CycleGANModel()
model.load_networks('latest', './checkpoints/experiment_name/')
model.set_requires_grad([], False)
model.eval()

@app.post("/translate")
async def translate_image_endpoint(
    file: UploadFile = File(...),
    direction: str = Form('AtoB')
):
    """Translate uploaded image between domains."""
    image_data = await file.read()
    image = Image.open(io.BytesIO(image_data))
    translated = model.translate_image(image, direction=direction)

    # Stream the translated image back (StreamingResponse works with an in-memory buffer)
    img_io = io.BytesIO()
    translated.save(img_io, 'PNG')
    img_io.seek(0)
    return StreamingResponse(img_io, media_type='image/png')

@app.get("/health")
async def health_check():
    """Health check endpoint."""
    return {"status": "healthy", "model_loaded": True}

Integration with Streamlit

Create an interactive Streamlit application for image translation:

import streamlit as st
from models.cyclegan_model import CycleGANModel
from PIL import Image

st.title("CycleGAN Image Translation")

# Load model
@st.cache_resource
def load_cyclegan_model():
    model = CycleGANModel()
    model.load_networks('latest', './checkpoints/experiment_name/')
    model.set_requires_grad([], False)
    model.eval()
    return model

model = load_cyclegan_model()

# Sidebar controls
direction = st.sidebar.selectbox("Translation Direction", ["AtoB", "BtoA"])

# Image upload
uploaded_file = st.file_uploader("Upload Image", type=["jpg", "jpeg", "png"])

# Translate button
if uploaded_file is not None and st.button("Translate Image"):
    with st.spinner("Translating image..."):
        image = Image.open(uploaded_file)
        translated = model.translate_image(image, direction=direction)

    col1, col2 = st.columns(2)
    with col1:
        st.image(image, caption="Original Image", use_container_width=True)
    with col2:
        st.image(translated, caption="Translated Image", use_container_width=True)

# Cycle consistency visualization
st.header("Cycle Consistency")
if uploaded_file is not None and st.button("Show Cycle Consistency"):
    with st.spinner("Computing cycle consistency..."):
        image = Image.open(uploaded_file)
        # Forward cycle
        translated = model.translate_image(image, direction='AtoB')
        reconstructed = model.translate_image(translated, direction='BtoA')

    col1, col2, col3 = st.columns(3)
    with col1:
        st.image(image, caption="Original", use_container_width=True)
    with col2:
        st.image(translated, caption="Translated", use_container_width=True)
    with col3:
        st.image(reconstructed, caption="Reconstructed", use_container_width=True)

Contact Information

Get in Touch

Developer: Molla Samser
Designer & Tester: Rima Khatun

rskworld.in
help@rskworld.in support@rskworld.in
+91 93305 39277

License

This project is for educational purposes only. See LICENSE file for more details.

About RSK World

Founded by Molla Samser, with Designer & Tester Rima Khatun, RSK World is your one-stop destination for free programming resources, source code, and development tools.

Founder: Molla Samser
Designer & Tester: Rima Khatun

Contact Info

Nutanhat, Mongolkote
Purba Burdwan, West Bengal
India, 713147

+91 93305 39277

hello@rskworld.in
support@rskworld.in

© 2025 RSK World. All rights reserved.

Content used for educational purposes only.