CycleGAN Image Translation

Complete Documentation & Project Details for Unpaired Image Translation & Cycle-Consistent GAN

Project Description

This project implements CycleGAN for unpaired image-to-image translation using cycle-consistent adversarial networks. The architecture uses two generators (G_A2B and G_B2A) and two discriminators (D_A and D_B) to learn bidirectional mappings between two image domains without paired training examples. It is well suited for learning CycleGAN fundamentals and for practical image translation applications. The system includes adversarial training, cycle consistency loss, style transfer capabilities, TensorBoard integration, and comprehensive training tools.

CycleGAN uses cycle-consistent adversarial training where two generators learn to translate images from domain A to domain B and vice versa, while two discriminators distinguish between real and translated images. The cycle consistency loss ensures that images translated from A to B and back to A match the original, maintaining content while changing style. The implementation provides complete PyTorch support, comprehensive training pipeline, evaluation metrics (FID, IS, LPIPS), and deployment tools for image-to-image translation applications.

Project Screenshots

[Screenshot gallery: CycleGAN Image Translation (4 images)]

Core Features

CycleGAN Architecture

  • Dual generator setup
  • Dual discriminator setup
  • Cycle consistency loss
  • Unpaired training
  • Bidirectional translation

Unpaired Image Translation

  • No paired training data
  • Bidirectional translation
  • Cycle consistency
  • Domain adaptation
  • Style transfer

Cycle Consistency Loss

  • Forward cycle (A→B→A)
  • Backward cycle (B→A→B)
  • Identity mapping loss
  • Content preservation
  • Meaningful translations

Adversarial Training

  • Adversarial loss (GAN)
  • Cycle consistency loss
  • Identity mapping loss
  • Dual discriminator training
  • Training stability

TensorBoard Integration

  • Real-time loss visualization
  • Translated image tracking
  • Training progress monitoring
  • Interactive dashboard
  • Comprehensive logging

Web Interface

  • Flask-based web app
  • Interactive image translation
  • Real-time translation
  • Download translated images
  • User-friendly interface

Advanced Features

Style Transfer Applications

  • Photo to painting
  • Season transfer
  • Day to night conversion
  • Object transfiguration
  • Multiple applications

Data Augmentation

  • Adaptive augmentation
  • Mixup augmentation
  • Cutout augmentation
  • Multiple augmentation levels

Multiple Dataset Support

  • Custom unpaired datasets
  • Horse2Zebra dataset
  • Apple2Orange dataset
  • Season transfer datasets

Resume Training

  • Checkpoint resuming
  • Automatic checkpoint detection
  • Training continuation
  • Progress preservation
  • Early stopping support
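Resuming is handled by the training scripts themselves (see train_advanced.py --continue_train later in this document), but for reference, here is a minimal sketch of how saving and resuming the four networks might look in plain PyTorch. The function names and checkpoint layout are illustrative, not the project's actual API.

# Illustrative checkpoint save/resume sketch (the project's own scripts
# handle this via train_advanced.py --continue_train).
import os
import torch

def save_checkpoint(path, epoch, G_A2B, G_B2A, D_A, D_B, opt_G, opt_D):
    torch.save({
        "epoch": epoch,
        "G_A2B": G_A2B.state_dict(),
        "G_B2A": G_B2A.state_dict(),
        "D_A": D_A.state_dict(),
        "D_B": D_B.state_dict(),
        "opt_G": opt_G.state_dict(),
        "opt_D": opt_D.state_dict(),
    }, path)

def load_checkpoint(path, G_A2B, G_B2A, D_A, D_B, opt_G, opt_D):
    if not os.path.exists(path):
        return 0                       # nothing to resume, start from epoch 0
    ckpt = torch.load(path, map_location="cpu")
    G_A2B.load_state_dict(ckpt["G_A2B"])
    G_B2A.load_state_dict(ckpt["G_B2A"])
    D_A.load_state_dict(ckpt["D_A"])
    D_B.load_state_dict(ckpt["D_B"])
    opt_G.load_state_dict(ckpt["opt_G"])
    opt_D.load_state_dict(ckpt["opt_D"])
    return ckpt["epoch"] + 1           # resume from the next epoch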

Web Interface Features

Feature | Description | Usage
Image Translation | Translate images between domains | Upload image and select translation direction
Real-time Translation | Translate images in real-time | Translated images appear immediately
Download Images | Download translated images | Click download button for each translated image
Model Selection | Choose different trained models | Select from available checkpoints

Technologies Used

This CycleGAN Image Translation project is built using modern deep learning and computer vision technologies. The core implementation uses Python as the primary programming language and PyTorch for deep learning operations. The project provides a CycleGAN architecture with dual generator-discriminator networks for unpaired image-to-image translation, Jupyter Notebook support for interactive development and demonstrations, and comprehensive adversarial training with cycle consistency loss for meaningful image translations.

CycleGAN uses cycle-consistent adversarial training where two generators learn to translate images from domain A to domain B and vice versa, while two discriminators distinguish between real and translated images. The system supports unpaired training without requiring paired examples, cycle consistency loss for maintaining content while changing style, and identity mapping loss for domain preservation, making it suitable for various image-to-image translation applications.

Python 3.8+ · PyTorch 2.0+ · CycleGAN · Cycle Consistency · Image Translation · TensorBoard · Computer Vision · Jupyter Notebook · Style Transfer · Unpaired Training

Installation & Usage

Installation

Install all required dependencies for the CycleGAN Image Translation project:

# Install all requirements
pip install -r requirements.txt

# The CycleGAN model will be trained on your unpaired data
# Prepare your dataset in datasets/your_dataset/ directory
# Organize as trainA/ and trainB/ folders
# Images should be preprocessed and resized

PyTorch Installation

Install PyTorch (CPU or GPU version):

# For CPU only
pip install torch torchvision torchaudio

# For CUDA (GPU support) - CUDA 11.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# For CUDA 12.1
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Verify installation
python -c "import torch; print(torch.__version__); print(torch.cuda.is_available())"

Verify Installation

Test the model and verify all components work:

# Test model architecture
python test_model.py

# This will verify:
# - Model can be instantiated
# - Forward pass works
# - All components function correctly
# - Device compatibility (CPU/CUDA)

Training the Model

Train the CycleGAN model on your unpaired image dataset:

# Prepare your dataset
# Organize images in datasets/your_dataset/ directory
# Create trainA/ and trainB/ folders for two domains
# Preprocess images to consistent size

# Basic training with default parameters
python train.py --dataroot ./datasets/your_dataset --name experiment_name --epochs 200

# Configure in config.py:
# - BATCH_SIZE = 1
# - EPOCHS = 200
# - LEARNING_RATE = 0.0002
# - LAMBDA_A = 10.0 (cycle loss weight)
# - LAMBDA_B = 10.0 (cycle loss weight)
# - LAMBDA_IDENTITY = 0.5 (identity loss weight)

# Or use Jupyter notebook
jupyter notebook CycleGAN_Demo.ipynb

# Training will:
# - Load and preprocess unpaired images
# - Initialize two generators and two discriminators
# - Train with adversarial loss + cycle consistency loss
# - Bidirectional translation (A→B and B→A)
# - Save checkpoints and translated samples
# - Log to TensorBoard

Training Parameters (config.py):

  • BATCH_SIZE: Training batch size (default: 1)
  • EPOCHS: Number of training epochs (default: 200)
  • LEARNING_RATE: Learning rate (default: 0.0002)
  • LAMBDA_A: Cycle loss weight for A→B→A (default: 10.0)
  • LAMBDA_B: Cycle loss weight for B→A→B (default: 10.0)
  • LAMBDA_IDENTITY: Identity loss weight (default: 0.5)
  • CROP_SIZE: Image crop size (default: 256)
  • LOAD_SIZE: Image load size (default: 286)
  • DEVICE: Device to use - 'cuda' or 'cpu' (default: 'cuda')

Image Translation

Translate images using the trained CycleGAN model:

# Translate images from trained model
python test.py --dataroot ./datasets/your_dataset --name experiment_name --model test --direction AtoB

# Translate in the other direction
python test.py --dataroot ./datasets/your_dataset --name experiment_name --model test --direction BtoA

# Or use Jupyter notebook
jupyter notebook CycleGAN_Demo.ipynb

# Using Python API
from models.cyclegan_model import CycleGANModel
import torch

model = CycleGANModel()
model.load_networks('latest', './checkpoints/experiment_name/')
model.set_requires_grad([], False)
model.eval()

# Translate image from domain A to B
translated = model.translate_image(input_image, direction='AtoB')
print(f"Translated image shape: {translated.shape}")

Model Evaluation

Evaluate the trained CycleGAN model performance:

# Evaluate trained model
python evaluate.py --dataroot ./datasets/your_dataset --name experiment_name

# The evaluation includes:
# - FID (Fréchet Inception Distance)
# - Inception Score (IS)
# - LPIPS (Learned Perceptual Image Patch Similarity)
# - Cycle consistency metrics
# - Translation quality assessment
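As a quick standalone sanity check outside evaluate.py, FID can also be computed with the torchmetrics package. This is an assumption on our side: torchmetrics is not listed in requirements.txt, and its FrechetInceptionDistance metric additionally needs torch-fidelity installed. The dummy tensors below only exercise the API; in practice, use several hundred real and translated test images.

# Standalone FID sketch using torchmetrics (extra dependency, not part of this project)
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=2048)

# real_images / translated_images: uint8 tensors of shape [N, 3, H, W] in [0, 255]
real_images = torch.randint(0, 256, (16, 3, 256, 256), dtype=torch.uint8)
translated_images = torch.randint(0, 256, (16, 3, 256, 256), dtype=torch.uint8)

fid.update(real_images, real=True)
fid.update(translated_images, real=False)
print(f"FID: {fid.compute().item():.2f}")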

Translation Visualization

Visualize image translations and compare domains:

# Translation visualization
python test.py --dataroot ./datasets/your_dataset --name experiment_name --model test --num_test 50

# The visualization includes:
# - Original images from domain A
# - Translated images to domain B
# - Cycle reconstructed images (A→B→A)
# - Side-by-side comparisons
# - Domain comparison grids

Jupyter Notebook

Open the interactive Jupyter notebook for demonstrations:

# CycleGAN demonstration notebook
jupyter notebook CycleGAN_Demo.ipynb

# The notebook includes:
# - Model architecture visualization
# - Generator and discriminator explanation
# - Training setup examples
# - Image translation examples
# - Cycle consistency demonstrations
# - Style transfer applications

# Or use JupyterLab
jupyter lab CycleGAN_Demo.ipynb

Project Structure

cyclegan-translation/
├── README.md # Main documentation
├── requirements.txt # Python dependencies
├── LICENSE # License file
├── CHANGELOG.md # Changelog
├── PROJECT_STRUCTURE.md # Project structure
│
├── Core Modules
│ ├── train.py # Training script
│ ├── train_advanced.py # Advanced training
│ ├── test.py # Testing script
│ ├── evaluate.py # Model evaluation
│ ├── example_usage.py # Example usage
│ └── config.py # Configuration settings
│
├── Models (models/)
│ ├── cyclegan_model.py # Main CycleGAN model
│ ├── networks.py # Generator & Discriminator
│ └── base_model.py # Base model class
│
├── Data (data/)
│ └── dataset.py # Dataset loader
│
├── Utils
│ ├── utils/ # Image pool, visualization
│ └── util/ # General utilities
│
├── Metrics (metrics/)
│ ├── metrics.py # Metrics calculator
│ ├── fid_score.py # FID score
│ ├── inception_score.py # Inception Score
│ └── lpips_score.py # LPIPS score
│
├── Logger (logger/)
│ ├── tensorboard_logger.py # TensorBoard logger
│ └── file_logger.py # File logger
│
├── Web (web/)
│ ├── app.py # Flask application
│ └── templates/ # Web templates
│
├── Scripts (scripts/)
│ └── download_pretrained.py # Download pre-trained models
│
└── CycleGAN_Demo.ipynb # Jupyter notebook demo

Configuration Options

Model Configuration

Customize model and training parameters in config.py:

# Model Architecture (config.py)
INPUT_NC = 3              # Number of input image channels
OUTPUT_NC = 3             # Number of output image channels
NGF = 64                  # Number of generator filters
NDF = 64                  # Number of discriminator filters
NETG = 'resnet_9blocks'   # Generator architecture
NETD = 'basic'            # Discriminator architecture
NORM = 'instance'         # Normalization type

# Training Parameters (config.py)
BATCH_SIZE = 1            # Training batch size
LEARNING_RATE = 0.0002    # Learning rate
EPOCHS = 200              # Number of training epochs
BETA1 = 0.5               # Adam optimizer beta1
BETA2 = 0.999             # Adam optimizer beta2
LAMBDA_A = 10.0           # Cycle loss weight (A→B→A)
LAMBDA_B = 10.0           # Cycle loss weight (B→A→B)
LAMBDA_IDENTITY = 0.5     # Identity loss weight
CROP_SIZE = 256           # Image crop size
LOAD_SIZE = 286           # Image load size
DEVICE = 'cuda'           # Device: 'cuda' or 'cpu'

# Testing Configuration
DIRECTION = 'AtoB'        # Translation direction
NUM_TEST = 50             # Number of test images

Configuration Tips:

  • LAMBDA_A/LAMBDA_B: Cycle loss weights. Standard is 10.0. Higher = stronger cycle consistency
  • CROP_SIZE: Image size after cropping. Common: 256, 512. Higher = more memory
  • BATCH_SIZE: Usually 1 for CycleGAN. Larger batches may cause memory issues
  • LEARNING_RATE: Start with 0.0002. CycleGAN uses lower learning rates
  • LAMBDA_IDENTITY: Identity loss weight. 0.5 is standard. Use 0.0 if not needed
  • NETG: Generator architecture. 'resnet_9blocks' is standard, 'unet_256' for simpler tasks
  • NORM: Normalization type. 'instance' is standard, 'batch' for some cases

Training Progress Logging

The training script automatically logs progress to TensorBoard and saves checkpoints:

# Training logs are saved to:
# checkpoints/experiment_name/logs/ - TensorBoard logs
# checkpoints/experiment_name/ - Model checkpoints
# results/experiment_name/ - Translated sample images

# View TensorBoard logs
tensorboard --logdir checkpoints/experiment_name/logs

# TensorBoard shows:
# - Generator loss per epoch
# - Discriminator loss per epoch
# - Cycle consistency loss per epoch
# - Identity loss per epoch
# - Translated image samples
# - Cycle reconstructed images
# - Model parameters

# Checkpoints are saved as:
# checkpoints/experiment_name/netG_A2B_latest.pth
# checkpoints/experiment_name/netG_B2A_latest.pth
# checkpoints/experiment_name/netD_A_latest.pth
# checkpoints/experiment_name/netD_B_latest.pth

Advanced Training Options

Use the advanced training script with additional features:

# Advanced training with all features
python train_advanced.py \
    --dataroot ./datasets/your_dataset \
    --name experiment_name \
    --epochs 200 \
    --batch-size 1 \
    --tensorboard \
    --early-stopping 10 \
    --lr-scheduler \
    --augment \
    --grad-clip 1.0 \
    --mixed_precision

# Features enabled:
# --tensorboard: Enable TensorBoard logging
# --early-stopping N: Stop if no improvement for N epochs
# --lr-scheduler: Use learning rate scheduling
# --augment: Enable data augmentation
# --grad-clip: Gradient clipping value
# --mixed_precision: Mixed precision training (faster, less memory)

# Resume training from checkpoint
python train_advanced.py \
    --dataroot ./datasets/your_dataset \
    --name experiment_name \
    --continue_train \
    --epochs 200

Training on Different Datasets

Examples for training on different datasets:

# Training on Horse2Zebra dataset
python train.py --dataroot ./datasets/horse2zebra --name horse2zebra --epochs 200

# Training on Apple2Orange dataset
python train.py --dataroot ./datasets/apple2orange --name apple2orange --epochs 200

# Training on custom unpaired dataset
python train.py \
    --dataroot ./datasets/custom_dataset \
    --name custom_experiment \
    --epochs 200 \
    --crop_size 256 \
    --load_size 286

# For high-resolution images (512x512 or larger)
python train.py \
    --dataroot ./datasets/custom_dataset \
    --name high_res_experiment \
    --crop_size 512 \
    --load_size 572 \
    --batch_size 1

Detailed Architecture

CycleGAN Components

1. Generator G_A2B:

  • Translates images from domain A to domain B
  • Uses ResNet-based architecture (9 blocks standard)
  • Instance normalization for style transfer
  • Learns domain A to domain B mapping
  • Outputs translated images in domain B

2. Generator G_B2A:

  • Translates images from domain B to domain A
  • Mirror architecture of G_A2B
  • Enables bidirectional translation
  • Learns domain B to domain A mapping
  • Outputs translated images in domain A

3. Discriminators D_A and D_B:

  • D_A distinguishes real A from translated A (B→A)
  • D_B distinguishes real B from translated B (A→B)
  • Provides adversarial feedback during training
  • Uses PatchGAN architecture (70x70 patches)
  • Enables realistic image translations

Loss Function

CycleGAN uses adversarial training with cycle consistency loss:

Adversarial Loss:
L_GAN(G_A2B, D_B) = E[log D_B(B)] + E[log(1 - D_B(G_A2B(A)))]
L_GAN(G_B2A, D_A) = E[log D_A(A)] + E[log(1 - D_A(G_B2A(B)))]

Cycle Consistency Loss:
L_cyc(G_A2B, G_B2A) = E[||G_B2A(G_A2B(A)) - A||₁] + E[||G_A2B(G_B2A(B)) - B||₁]

Identity Loss (optional):
L_identity(G_A2B, G_B2A) = E[||G_B2A(A) - A||₁] + E[||G_A2B(B) - B||₁]

Total Generator Loss:
L_G = L_GAN(G_A2B, D_B) + L_GAN(G_B2A, D_A) + λ_cyc × L_cyc + λ_id × L_identity

Where:
- G_A2B: Generator A→B
- G_B2A: Generator B→A
- D_A, D_B: Discriminators
- A, B: Images from domains A and B
- λ_cyc: Cycle loss weight (default: 10.0)
- λ_id: Identity loss weight (default: 0.5)

The cycle consistency loss ensures meaningful translations.
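In PyTorch terms, the generator-side objective can be written as the following sketch. It assumes the least-squares GAN loss used by the original CycleGAN implementation and uses tiny stand-in networks so the snippet runs on its own; the real project wires this up inside models/cyclegan_model.py.

import torch
import torch.nn as nn

# Stand-in networks and inputs so the sketch runs; the project uses the ResNet
# generators and PatchGAN discriminators from models/networks.py.
G_A2B, G_B2A = nn.Conv2d(3, 3, 3, padding=1), nn.Conv2d(3, 3, 3, padding=1)
D_A, D_B = nn.Conv2d(3, 1, 3, padding=1), nn.Conv2d(3, 1, 3, padding=1)
real_A, real_B = torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256)

gan_loss = nn.MSELoss()          # least-squares GAN loss (LSGAN), as in the original CycleGAN code
l1 = nn.L1Loss()
lambda_cyc, lambda_id = 10.0, 0.5

fake_B = G_A2B(real_A)           # A -> B
fake_A = G_B2A(real_B)           # B -> A
rec_A = G_B2A(fake_B)            # A -> B -> A
rec_B = G_A2B(fake_A)            # B -> A -> B

# Adversarial terms: generators try to make the discriminators predict "real" (1)
pred_B, pred_A = D_B(fake_B), D_A(fake_A)
loss_gan = gan_loss(pred_B, torch.ones_like(pred_B)) + gan_loss(pred_A, torch.ones_like(pred_A))

# Cycle consistency (L1) and optional identity terms
loss_cyc = l1(rec_A, real_A) + l1(rec_B, real_B)
loss_id = l1(G_B2A(real_A), real_A) + l1(G_A2B(real_B), real_B)

loss_G = loss_gan + lambda_cyc * loss_cyc + lambda_id * loss_id
print(f"Generator loss: {loss_G.item():.4f}")

The discriminators are updated with the mirrored objective (real images toward 1, translated images toward 0), typically sampling past fakes from an image pool (see utils/) to stabilize training.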

Cycle Consistency

Ensures meaningful bidirectional translations:

Forward Cycle:
A → G_A2B(A) → G_B2A(G_A2B(A)) ≈ A

Backward Cycle:
B → G_B2A(B) → G_A2B(G_B2A(B)) ≈ B

Where:
- A: Image from domain A
- B: Image from domain B
- G_A2B: Generator translating A to B
- G_B2A: Generator translating B to A

This ensures:
- Content preservation during translation
- Meaningful bidirectional mappings
- No mode collapse
- Realistic translations

Generator Architecture Details

ResNet Generator (9 blocks):

  • Input Layer: Receives images of size [batch, 3, 256, 256]
  • Initial Convolution: 7x7 conv → InstanceNorm → ReLU
  • Downsampling: Two 3x3 conv layers with stride=2
  • Residual Blocks: 9 ResNet blocks with InstanceNorm
  • Upsampling: Two 3x3 transposed conv layers with stride=2
  • Output Layer: 7x7 conv → Tanh activation
  • Output: Translated image [batch, 3, 256, 256]
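For reference, a compact PyTorch sketch of that layer layout follows. The project's actual generator lives in models/networks.py and may differ in padding, dropout, and initialization details.

import torch
import torch.nn as nn

class ResnetBlock(nn.Module):
    """One residual block: 3x3 conv -> InstanceNorm -> ReLU -> 3x3 conv -> InstanceNorm."""
    def __init__(self, dim):
        super().__init__()
        self.block = nn.Sequential(
            nn.ReflectionPad2d(1), nn.Conv2d(dim, dim, 3), nn.InstanceNorm2d(dim), nn.ReLU(True),
            nn.ReflectionPad2d(1), nn.Conv2d(dim, dim, 3), nn.InstanceNorm2d(dim),
        )
    def forward(self, x):
        return x + self.block(x)

class ResnetGenerator(nn.Module):
    """Sketch of the 9-block ResNet generator layout described above."""
    def __init__(self, in_ch=3, out_ch=3, ngf=64, n_blocks=9):
        super().__init__()
        layers = [nn.ReflectionPad2d(3), nn.Conv2d(in_ch, ngf, 7), nn.InstanceNorm2d(ngf), nn.ReLU(True)]
        # Downsampling: 64 -> 128 -> 256 channels, spatial size / 4
        for mult in (1, 2):
            layers += [nn.Conv2d(ngf * mult, ngf * mult * 2, 3, stride=2, padding=1),
                       nn.InstanceNorm2d(ngf * mult * 2), nn.ReLU(True)]
        layers += [ResnetBlock(ngf * 4) for _ in range(n_blocks)]
        # Upsampling back to the input resolution
        for mult in (4, 2):
            layers += [nn.ConvTranspose2d(ngf * mult, ngf * mult // 2, 3, stride=2,
                                          padding=1, output_padding=1),
                       nn.InstanceNorm2d(ngf * mult // 2), nn.ReLU(True)]
        layers += [nn.ReflectionPad2d(3), nn.Conv2d(ngf, out_ch, 7), nn.Tanh()]
        self.model = nn.Sequential(*layers)
    def forward(self, x):
        return self.model(x)

# Quick shape check: [1, 3, 256, 256] in -> [1, 3, 256, 256] out
g = ResnetGenerator()
print(g(torch.randn(1, 3, 256, 256)).shape)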

Discriminator Architecture (PatchGAN):

  • Input Layer: Receives images of size [batch, 3, 256, 256]
  • Convolutional Blocks: 4 conv layers with LeakyReLU
  • Downsampling: Uses stride=2 convolutions
  • Patch Output: Each output value scores a 70x70 patch (receptive field) of the input
  • Output: Real/fake prediction map, e.g. [batch, 1, 30, 30] for a 256x256 input
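A corresponding sketch of the PatchGAN layout is shown below. Again this is illustrative; the project's 'basic' discriminator is defined in models/networks.py.

import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Sketch of the 70x70 PatchGAN: each output value scores one input patch."""
    def __init__(self, in_ch=3, ndf=64):
        super().__init__()
        self.model = nn.Sequential(
            nn.Conv2d(in_ch, ndf, 4, stride=2, padding=1), nn.LeakyReLU(0.2, True),
            nn.Conv2d(ndf, ndf * 2, 4, stride=2, padding=1), nn.InstanceNorm2d(ndf * 2), nn.LeakyReLU(0.2, True),
            nn.Conv2d(ndf * 2, ndf * 4, 4, stride=2, padding=1), nn.InstanceNorm2d(ndf * 4), nn.LeakyReLU(0.2, True),
            nn.Conv2d(ndf * 4, ndf * 8, 4, stride=1, padding=1), nn.InstanceNorm2d(ndf * 8), nn.LeakyReLU(0.2, True),
            nn.Conv2d(ndf * 8, 1, 4, stride=1, padding=1),  # one score per patch
        )
    def forward(self, x):
        return self.model(x)

# For a 256x256 input the prediction map is 30x30 (each value covers a 70x70 patch)
d = PatchDiscriminator()
print(d(torch.randn(1, 3, 256, 256)).shape)  # torch.Size([1, 1, 30, 30])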

Mathematical Formulation

Complete mathematical description of CycleGAN:

# Generator A→B:
G_A2B: A → B
B_fake = G_A2B(A)
where A ∈ domain A, B_fake ∈ domain B

# Generator B→A:
G_B2A: B → A
A_fake = G_B2A(B)
where B ∈ domain B, A_fake ∈ domain A

# Discriminator A:
D_A: A → [0, 1]
D_A(A) ∈ [0, 1]        # Real A probability
D_A(A_fake) ∈ [0, 1]   # Fake A probability

# Discriminator B:
D_B: B → [0, 1]
D_B(B) ∈ [0, 1]        # Real B probability
D_B(B_fake) ∈ [0, 1]   # Fake B probability

# Cycle Consistency:
A_recon = G_B2A(G_A2B(A)) ≈ A
B_recon = G_A2B(G_B2A(B)) ≈ B

# Loss Function:
L_total = L_GAN + λ_cyc × L_cyc + λ_id × L_identity

# Where:
# - G_A2B, G_B2A: Generators
# - D_A, D_B: Discriminators
# - λ_cyc: Cycle loss weight (10.0)
# - λ_id: Identity loss weight (0.5)

Identity Loss

Understanding the identity mapping loss in CycleGAN:

  • Identity Loss = 0.0: No identity mapping, standard CycleGAN
  • Identity Loss = 0.5: Balanced identity preservation (recommended)
  • Identity Loss > 0.5: Stronger identity preservation, less style change
  • Identity Loss < 0.5: More style change, less identity preservation
  • Formula: L_id = ||G_B2A(A) - A||₁ + ||G_A2B(B) - B||₁
  • Use Case: Useful when domains are similar (e.g., photo enhancement)

Advanced Features Usage

Image Translation Usage

Translate images using the trained CycleGAN model:

# Translate images using trained CycleGAN model
from models.cyclegan_model import CycleGANModel
from PIL import Image
import torch

# Load model
model = CycleGANModel()
model.load_networks('latest', './checkpoints/experiment_name/')
model.set_requires_grad([], False)
model.eval()

# Load input image
input_image = Image.open("input_image.jpg")
input_tensor = model.preprocess_image(input_image)

# Translate from domain A to B
translated = model.translate_image(input_tensor, direction='AtoB')

# Save translated image
from utils.visualization import save_image
save_image(translated, "translated_image.png")

# Translate from domain B to A
translated_back = model.translate_image(input_tensor, direction='BtoA')
save_image(translated_back, "translated_back.png")

Cycle Consistency Visualization

Visualize cycle consistency and bidirectional translations:

from models.cyclegan_model import CycleGANModel
from utils.visualization import visualize_cycle
from PIL import Image

# Load model
model = CycleGANModel()
model.load_networks('latest', './checkpoints/experiment_name/')
model.eval()

# Load input image from domain A
input_image = Image.open("input_A.jpg")
input_tensor = model.preprocess_image(input_image)

# Forward cycle: A → B → A
translated_B = model.translate_image(input_tensor, direction='AtoB')
reconstructed_A = model.translate_image(translated_B, direction='BtoA')

# Visualize cycle consistency
visualize_cycle(input_tensor, translated_B, reconstructed_A, "cycle_consistency.png")

# Backward cycle: B → A → B
input_B = Image.open("input_B.jpg")
input_B_tensor = model.preprocess_image(input_B)
translated_A = model.translate_image(input_B_tensor, direction='BtoA')
reconstructed_B = model.translate_image(translated_A, direction='AtoB')

# Save results
visualize_cycle(input_B_tensor, translated_A, reconstructed_B, "cycle_consistency_B.png")

Model Evaluation

Evaluate model performance with SSIM and diversity metrics:

from evaluate import evaluate_model

# Evaluate on test dataset
results = evaluate_model(
    checkpoint_path="outputs/checkpoints/generator.pth",
    dataset_path="./data/train",
    device="cuda"
)

# Results include:
# - SSIM (Structural Similarity Index)
# - Diversity metrics
# - Generated image quality
# - Model performance statistics

print(f"SSIM: {results['ssim']:.4f}")
print(f"Diversity: {results['diversity']:.4f}")
print(f"Quality Score: {results['quality']:.4f}")

Dataset Preparation

Prepare your custom dataset for training:

# Prepare custom dataset
# Place images in data/custom/ directory
# Supported formats: .jpg, .png, .jpeg

# Directory structure:
# data/custom/
# ├── image1.jpg
# ├── image2.png
# └── ...

# The training script will automatically:
# - Load images from directory
# - Apply data augmentation
# - Resize to specified dimensions
# - Normalize pixel values
# - Create data loaders for training

# Use custom dataset for training
python train.py --dataset custom --data-dir data/custom --epochs 50

Latent Space Exploration

Explore the learned latent space:

# Style mixing visualization is available in the Jupyter notebooks
jupyter notebook notebooks/04_style_mixing.ipynb

# The notebooks include:
# - Style mixing at different layers
# - Interpolation examples in W space
# - Progressive growing visualization
# - Generated image samples

# Visualize style mixing
from visualize import visualize_style_mixing
visualize_style_mixing(generator, num_samples=16, mix_layers=[4, 5, 6])

Advanced Visualization Techniques

Use the visualization script for comprehensive analysis:

# Style mixing visualization
python visualize.py --checkpoint outputs/checkpoints/generator.pth --mode style_mix --output style_mix.png

# Progressive growing visualization
python visualize.py --checkpoint outputs/checkpoints/generator.pth --mode progressive --output progressive.png

# Interpolation visualization
python visualize.py --checkpoint outputs/checkpoints/generator.pth --mode interpolate --num-steps 10 --output interpolation.png

# Generated samples grid
python visualize.py --checkpoint outputs/checkpoints/generator.pth --mode samples --num-samples 16 --output samples.png

# All visualizations at once
python visualize.py --checkpoint outputs/checkpoints/generator.pth --mode all --output_dir visualizations/

Model Export and Deployment

Export trained models for deployment:

# Export to ONNX format (for production deployment)
python scripts/convert_checkpoint.py \
    --checkpoint outputs/checkpoints/generator.pth \
    --format onnx \
    --output stylegan_generator.onnx \
    --latent-dim 512

# Export to TorchScript (PyTorch mobile/edge)
python scripts/convert_checkpoint.py \
    --checkpoint outputs/checkpoints/generator.pth \
    --format torchscript \
    --output stylegan_generator.pt

# Export generator and discriminator separately
python scripts/convert_checkpoint.py \
    --checkpoint outputs/checkpoints/generator.pth \
    --format onnx \
    --components generator \
    --output-dir exported_models/

# Verify exported model
python scripts/convert_checkpoint.py \
    --checkpoint outputs/checkpoints/generator.pth \
    --format onnx \
    --output stylegan_generator.onnx \
    --verify

Model Comparison and Analysis

Compare different trained models:

# Compare multiple models
python evaluate.py \
    --checkpoints outputs/checkpoints/generator_epoch_50.pth \
        outputs/checkpoints/generator_epoch_100.pth \
    --dataset ./data/train \
    --output comparison_report.html

# Compare models with different truncation values
python generate.py \
    --checkpoint outputs/checkpoints/generator.pth \
    --num_images 16 \
    --truncation 0.5 \
    --output_dir comparisons/trunc_0.5

python generate.py \
    --checkpoint outputs/checkpoints/generator.pth \
    --num_images 16 \
    --truncation 0.7 \
    --output_dir comparisons/trunc_0.7

# Generate side-by-side comparisons
python visualize.py \
    --checkpoint outputs/checkpoints/generator.pth \
    --mode compare \
    --num-samples 16 \
    --output-dir model_comparisons/

Complete Training Workflow

Step-by-Step Training Process

Step 1: Prepare Data

# Prepare custom dataset in data/train/ directory
# For custom dataset:
# Place images in data/train/
# Supported formats: .jpg, .png, .jpeg

# Preprocess images first:
python preprocess.py --input_dir ./raw_images --output_dir ./data/train --size 256

# The training script will automatically:
# - Load images from directory
# - Resize and normalize images
# - Create data loaders for progressive training

Step 2: Train Model

# Start training
python train.py --dataset ./data/train --output_dir ./outputs --epochs 100

# Training will:
# 1. Load and preprocess images
# 2. Initialize generator and discriminator
# 3. Train with adversarial loss + gradient penalty
# 4. Progressive growing from low to high resolution
# 5. Save checkpoints and best model
# 6. Log training history to TensorBoard
# 7. Generate sample images during training

Step 3: Monitor Training

  • Watch console output for epoch progress
  • Check TensorBoard: tensorboard --logdir outputs/logs
  • View generated samples in outputs/samples/
  • Best model saved as outputs/checkpoints/best_generator.pth

Step 4: Evaluate Model

# Evaluate on test set
python evaluate.py --checkpoint outputs/checkpoints/generator.pth --dataset ./data/train

# Calculate SSIM and diversity metrics

Step 5: Generate Images

# Generate images
python generate.py --checkpoint outputs/checkpoints/generator.pth --num_images 16 --output_dir ./generated

# Generate style interpolation
python generate.py --checkpoint outputs/checkpoints/generator.pth --interpolate --num_steps 10 --output_dir ./generated

API Usage Examples

Image Generation Endpoint (cURL)

Generate images using the REST API:

curl -X POST http://localhost:5000/generate \
    -H "Content-Type: application/json" \
    -d '{ "num_images": 10 }'

# Response:
# {
#   "num_images": 10,
#   "images": ["base64_encoded_image1", ...],
#   "status": "success"
# }

Latent Interpolation Endpoint (cURL)

Generate interpolation between latent points:

curl -X POST http://localhost:5000/interpolate \
    -H "Content-Type: application/json" \
    -d '{ "num_steps": 10 }'

# Response:
# {
#   "num_steps": 10,
#   "images": ["base64_encoded_image1", ...],
#   "status": "success"
# }

Health Check (cURL)

Check API server health and model status:

curl -X GET http://localhost:5000/health

# Response:
# {
#   "status": "healthy",
#   "model_loaded": true,
#   "device": "cuda"
# }

Python Requests Example

Use the API with Python requests library:

import requests
import base64
from PIL import Image
from io import BytesIO

# Image generation endpoint
response = requests.post(
    'http://localhost:5000/generate',
    json={'num_images': 10}
)
data = response.json()

# Decode and save images
for i, img_base64 in enumerate(data['images']):
    img_data = base64.b64decode(img_base64)
    img = Image.open(BytesIO(img_data))
    img.save(f'generated_{i}.png')

# Interpolation endpoint
interp_response = requests.post(
    'http://localhost:5000/interpolate',
    json={'num_steps': 10}
)
print(interp_response.json())

# Health check
health = requests.get('http://localhost:5000/health')
print(health.json())

JavaScript/Fetch Example

Use the API with JavaScript fetch API:

// Image generation
fetch('http://localhost:5000/generate', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({ num_images: 10 })
})
  .then(res => res.json())
  .then(data => {
    console.log('Generated', data.num_images, 'images');
    // Display images from base64 data
    data.images.forEach((imgBase64, i) => {
      const img = document.createElement('img');
      img.src = 'data:image/png;base64,' + imgBase64;
      document.body.appendChild(img);
    });
  });

// Interpolation
fetch('http://localhost:5000/interpolate', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({ num_steps: 10 })
})
  .then(res => res.json())
  .then(data => {
    console.log('Interpolation steps:', data.num_steps);
  });

// Health check
fetch('http://localhost:5000/health')
  .then(res => res.json())
  .then(data => console.log('Status:', data));

CycleGAN Model Variants

Model | Max Resolution | Generator | Use Case | Quality
Basic CycleGAN | 256x256 | ResNet-6 | Fast training, basic tasks | Good
Standard CycleGAN | 256x256 | ResNet-9 | Balanced quality/speed | Better
High-Res CycleGAN | 512x512 | ResNet-9 | Higher quality translation | Best
UNet CycleGAN | 256x256 | UNet-256 | Simpler architecture | Good

Dataset Information

Dataset Formats

The project supports multiple dataset formats for image training:

  • Built-in datasets: MNIST, CIFAR-10 (automatically downloaded)
  • Custom dataset: Directory of images (JPG, PNG, JPEG)
  • Automatic image loading and preprocessing
  • Data augmentation support
  • Train/validation split support
  • Multiple image format support

Custom Dataset Format

Training data is stored as image files in a directory:

# Custom dataset directory structure
data/custom/
├── image1.jpg
├── image2.png
├── image3.jpeg
└── ...

# The training script automatically:
# - Loads images from directory
# - Applies data augmentation
# - Resizes to specified dimensions
# - Normalizes pixel values
# - Creates data loaders for training
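Note that the CycleGAN training described elsewhere in this documentation expects unpaired data organized into trainA/ and trainB/ folders rather than a single flat directory. A minimal sketch of an unpaired dataset class for that layout follows; the class name and transform choices are illustrative, not the project's data/dataset.py API.

# Minimal unpaired dataset sketch (illustrative; the project's loader lives in data/dataset.py).
import os
import random
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class UnpairedImageDataset(Dataset):
    def __init__(self, root, phase="train", load_size=286, crop_size=256):
        self.paths_A = sorted(os.path.join(root, f"{phase}A", f)
                              for f in os.listdir(os.path.join(root, f"{phase}A")))
        self.paths_B = sorted(os.path.join(root, f"{phase}B", f)
                              for f in os.listdir(os.path.join(root, f"{phase}B")))
        self.transform = transforms.Compose([
            transforms.Resize(load_size),
            transforms.RandomCrop(crop_size),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
        ])

    def __len__(self):
        return max(len(self.paths_A), len(self.paths_B))

    def __getitem__(self, index):
        # Images are unpaired: pick A by index, B at random
        path_A = self.paths_A[index % len(self.paths_A)]
        path_B = random.choice(self.paths_B)
        A = self.transform(Image.open(path_A).convert("RGB"))
        B = self.transform(Image.open(path_B).convert("RGB"))
        return {"A": A, "B": B}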

Adding Custom Training Data

Add your own image dataset for training:

# Place images in data/custom/ directory
# Supported formats: .jpg, .png, .jpeg

# Example:
mkdir -p data/custom
cp your_images/*.jpg data/custom/

# Use in training
python train.py --dataset custom --data-dir data/custom --epochs 50

# The script will automatically:
# - Load all images from directory
# - Apply augmentation if enabled
# - Resize and normalize images
# - Create train/validation splits

Troubleshooting & Best Practices

Common Issues

  • CUDA Out of Memory: Reduce BATCH_SIZE, lower CROP_SIZE (e.g. 128 instead of 256), reduce NGF/NDF, or fall back to CPU mode
  • Model Not Found: Ensure model is trained first by running train.py or loading from models/ directory. Check model path is correct
  • Vocabulary Not Found: Ensure vocabularies are saved during training. Check vocab_dir path matches training save_dir
  • Slow Generation: Use smaller latent_dim (64 instead of 128), reduce hidden_dims, or use CPU mode
  • API Connection Error: Check if api.py is running on port 5000. Verify model path is correct
  • Import Errors: Verify all dependencies installed: pip install -r requirements.txt. Check Python version (3.8+)
  • Image Size Mismatch: Ensure all images are same size or use data augmentation to resize. Check IMAGE_SIZE in config
  • Poor Generation Quality: Train for more epochs, use larger latent_dim, increase training data, or adjust beta (KL weight)
  • Training Loss Not Decreasing: Check learning rate (may be too high/low), verify data format, check for data issues
  • Validation Loss Increasing: Model may be overfitting. Increase beta (KL weight), use more data, or reduce model size
  • Blurry Generated Images: Increase latent_dim, train longer, or adjust beta to balance reconstruction and KL divergence
  • Mode Collapse: Increase beta value, use more diverse training data, or try different architectures
  • KL Divergence Too High: Reduce beta value or increase model capacity
  • KL Divergence Too Low: Increase beta value to encourage better latent structure
  • Training Instability: Reduce learning rate, use gradient clipping, or try different optimizer
  • Memory Issues: Reduce batch size, use smaller image size, or enable mixed precision training

Performance Optimization Tips

  • GPU Memory: Use gradient accumulation for effective larger batch sizes: accumulate gradients over N batches before updating
  • Mixed Precision: Enable AMP (Automatic Mixed Precision) for 2x speedup and 50% memory reduction
  • Data Loading: Use multiple workers (num_workers=4-8) and pin_memory=True for faster data loading
  • Model Pruning: Remove unnecessary layers or reduce hidden dimensions for faster inference
  • Quantization: Use INT8 quantization for 4x speedup in production (with slight quality loss)
  • Batch Inference: Generate multiple images in batches rather than one at a time
  • Model Caching: Load model once and reuse for multiple generations
  • ONNX Runtime: Use exported ONNX models with ONNX Runtime for faster inference
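As an illustration of the mixed-precision and gradient-clipping tips above, here is a self-contained PyTorch sketch; a stand-in generator, a placeholder loss, and random tensors replace the real CycleGAN networks and dataloader, so this is not the project's train_advanced.py code.

import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
G = nn.Conv2d(3, 3, 3, padding=1).to(device)        # stand-in generator
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

for step in range(3):                                # stand-in for the real dataloader loop
    real_A = torch.randn(1, 3, 256, 256, device=device)
    opt_G.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast(enabled=(device == "cuda")):
        fake_B = G(real_A)
        loss_G = fake_B.abs().mean()                 # placeholder for the CycleGAN losses
    scaler.scale(loss_G).backward()
    scaler.unscale_(opt_G)                           # so clipping sees true gradient magnitudes
    torch.nn.utils.clip_grad_norm_(G.parameters(), max_norm=1.0)
    scaler.step(opt_G)
    scaler.update()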

Best Practices

  • Training Data: Use diverse, high-quality image datasets. More data = better results. Aim for 1K+ images minimum
  • Data Format: Ensure images are in supported formats (JPG, PNG, JPEG). Consistent image sizes recommended
  • Data Preprocessing: Normalize pixel values, resize images to consistent dimensions, apply augmentation if needed
  • Batch Size: Use smaller batches (32-64) for limited GPU memory. Larger batches (128+) for faster training if memory allows
  • Learning Rate: Start with 0.001 and adjust based on training loss. Use learning rate scheduling for better convergence
  • Gradient Clipping: Default is 1.0. Increase if training is unstable, decrease if gradients are too small
  • Beta (KL Weight): Start with 1.0. Higher = more regularization, lower = better reconstruction. Adjust based on results
  • Model Selection: Start with latent_dim=64 for speed/testing. Use 128+ for production quality. Higher = more expressive
  • Evaluation: Regularly evaluate reconstruction and KL divergence on validation set. Monitor for overfitting
  • Latent Dimension: Higher latent_dim = more expressive but slower. Common values: 64, 128, 256. Start with 128
  • Checkpointing: Model saves checkpoints automatically. Can resume training from checkpoint if needed
  • API Rate Limiting: Implement rate limiting for production deployments. Consider using nginx or similar
  • Logging: Monitor TensorBoard logs for debugging and optimization. View in outputs/logs/
  • Device Selection: Use CUDA if available for faster training. CPU works but much slower
  • Beta Tuning: Start with β=1.0, increase for better latent structure, decrease for better reconstruction
  • Latent Dimension: Start with 128, increase for more expressive models, decrease for faster training
  • Early Stopping: Monitor validation loss, stop if no improvement for 10-15 epochs
  • Checkpointing: Save checkpoints every 5-10 epochs to avoid losing progress
  • Data Augmentation: Use for small datasets to improve generalization
  • Learning Rate: Use learning rate scheduling (ReduceLROnPlateau) for better convergence

Use Cases and Applications

  • Image Generation: Generate new images from random latent vectors
  • Image Reconstruction: Reconstruct and denoise images
  • Image Interpolation: Create smooth transitions between images
  • Anomaly Detection: Detect outliers by high reconstruction error
  • Data Augmentation: Generate synthetic training data
  • Image Editing: Manipulate images in latent space
  • Feature Learning: Learn meaningful image representations
  • Dimensionality Reduction: Compress images to low-dimensional latent space
  • Style Transfer: Transfer styles by manipulating latent vectors
  • Image Completion: Complete missing parts of images

Performance Optimization

  • GPU Usage: Set CUDA_VISIBLE_DEVICES for multi-GPU systems. Use GPU for training and inference when available
  • Model Selection: Use latent_dim=64 for fastest inference. Larger models (128+, 256+) for better quality
  • Batch Processing: Generate multiple images in batches for efficient processing. Reduces overhead
  • Caching: API server caches model in memory. Model loads once on first request, then reused
  • Image Size: Use smaller image sizes (64x64) for faster generation. Larger images (128x128+) for better quality
  • Latent Sampling: Sample from standard normal distribution N(0, I) for generation. Use interpolation for smooth transitions
  • Model Quantization: Consider model quantization for production to reduce memory and speed up inference
  • Async Processing: For high-throughput, consider async API or queue system for batch image generation
  • Memory Management: Clear GPU cache between batches if running out of memory: torch.cuda.empty_cache()
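A small sketch of batched, no-grad inference reflecting the batch-processing and memory-management tips above; the generator and tensor names are illustrative.

import torch

@torch.no_grad()
def translate_batch(G_A2B, images, device="cuda", batch_size=8):
    """Translate a stack of images [N, 3, H, W] in chunks to bound GPU memory."""
    G_A2B.eval().to(device)
    outputs = []
    for i in range(0, images.size(0), batch_size):
        chunk = images[i:i + batch_size].to(device)
        outputs.append(G_A2B(chunk).cpu())
    torch.cuda.empty_cache()   # release cached GPU memory after the run if needed
    return torch.cat(outputs)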

Expected Training Times

Approximate training times for different configurations:

Dataset | Image Size | Latent Dim | Batch Size | Epochs | GPU | Time
MNIST | 28×28 | 64 | 128 | 50 | GTX 1080 | ~15 min
MNIST | 28×28 | 128 | 128 | 50 | GTX 1080 | ~20 min
CIFAR-10 | 32×32 | 128 | 128 | 100 | GTX 1080 | ~2 hours
Custom | 64×64 | 256 | 64 | 50 | RTX 3090 | ~3 hours
Custom | 128×128 | 512 | 32 | 50 | RTX 3090 | ~8 hours

Note: Times are approximate and depend on hardware, dataset size, and other factors. CPU training is typically 10-20x slower.

Model Size and Memory Requirements

Approximate model sizes and memory usage:

Latent Dim | Hidden Dims | Model Size | GPU Memory (Training) | GPU Memory (Inference)
64 | [32, 64, 128] | ~5 MB | ~500 MB | ~200 MB
128 | [32, 64, 128, 256] | ~15 MB | ~1.5 GB | ~500 MB
256 | [64, 128, 256, 512] | ~50 MB | ~4 GB | ~1.5 GB
512 | [128, 256, 512, 1024] | ~200 MB | ~12 GB | ~4 GB

Note: Memory usage depends on batch size and image size. Larger batches and images require more memory.

Real-World Examples & Use Cases

Example 1: Training on Custom Face Dataset

Complete workflow for training on a custom face dataset:

# 1. Prepare dataset
mkdir -p datasets/faces
# Copy face images into datasets/faces/ (trainA/ and trainB/ subfolders)

# 2. Train CycleGAN model
python train.py \
    --dataroot ./datasets/faces \
    --name faces_experiment \
    --epochs 200 \
    --batch_size 1 \
    --crop_size 256 \
    --tensorboard

# 3. Translate faces between domains
python test.py \
    --dataroot ./datasets/faces \
    --name faces_experiment \
    --model test \
    --direction AtoB \
    --num_test 50

# 4. Visualize cycle consistency
python test.py \
    --dataroot ./datasets/faces \
    --name faces_experiment \
    --model test \
    --direction AtoB \
    --results_dir ./face_translations

Example 2: Photo to Painting

Use CycleGAN for photo to painting style transfer:

# Train on photo and painting datasets
python train.py --dataroot ./datasets/photo2painting --name photo2painting --epochs 200

# Translate photos to paintings
python test.py \
    --dataroot ./datasets/photo2painting \
    --name photo2painting \
    --model test \
    --direction AtoB

# Translated images saved in results/photo2painting/test_latest/images/

Example 3: Season Transfer

Transfer images between seasons (summer to winter):

# Train on summer and winter datasets
python train.py --dataroot ./datasets/summer2winter --name summer2winter --epochs 200

# Translate summer images to winter
python test.py \
    --dataroot ./datasets/summer2winter \
    --name summer2winter \
    --model test \
    --direction AtoB

# Translate winter images to summer
python test.py \
    --dataroot ./datasets/summer2winter \
    --name summer2winter \
    --model test \
    --direction BtoA

Example 4: Day to Night Conversion

Convert day images to night and vice versa:

# Train on day and night datasets
python train.py --dataroot ./datasets/day2night --name day2night --epochs 200

# Convert day images to night
python test.py \
    --dataroot ./datasets/day2night \
    --name day2night \
    --model test \
    --direction AtoB

# Convert night images to day
python test.py \
    --dataroot ./datasets/day2night \
    --name day2night \
    --model test \
    --direction BtoA

Example 5: Object Transfiguration

Transform objects between different types (e.g., horse to zebra):

# Train on horse and zebra datasets
python train.py --dataroot ./datasets/horse2zebra --name horse2zebra --epochs 200

# Transform horses to zebras
python test.py \
    --dataroot ./datasets/horse2zebra \
    --name horse2zebra \
    --model test \
    --direction AtoB

# Transform zebras to horses
python test.py \
    --dataroot ./datasets/horse2zebra \
    --name horse2zebra \
    --model test \
    --direction BtoA

Integration Examples

Integration with Flask Web Application

Integrate CycleGAN into a Flask web application:

from flask import Flask, request, jsonify, send_file
from models.cyclegan_model import CycleGANModel
from PIL import Image
import io
import base64

app = Flask(__name__)

model = CycleGANModel()
model.load_networks('latest', './checkpoints/experiment_name/')
model.set_requires_grad([], False)
model.eval()

@app.route('/translate', methods=['POST'])
def translate():
    # Receive image file
    file = request.files['image']
    direction = request.form.get('direction', 'AtoB')
    image = Image.open(file.stream)

    # Translate image
    translated = model.translate_image(image, direction=direction)

    # Convert to base64 for JSON response
    buffered = io.BytesIO()
    translated.save(buffered, format="PNG")
    img_str = base64.b64encode(buffered.getvalue()).decode()

    return jsonify({'image': img_str, 'direction': direction})

@app.route('/translate_file', methods=['POST'])
def translate_file():
    # Receive image file
    file = request.files['image']
    direction = request.form.get('direction', 'AtoB')
    image = Image.open(file.stream)

    # Translate image
    translated = model.translate_image(image, direction=direction)

    # Return translated image
    img_io = io.BytesIO()
    translated.save(img_io, 'PNG')
    img_io.seek(0)
    return send_file(img_io, mimetype='image/png')

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

Integration with FastAPI

Create a FastAPI service for CycleGAN image translation:

from fastapi import FastAPI, File, UploadFile, Form
from fastapi.responses import StreamingResponse
from models.cyclegan_model import CycleGANModel
from PIL import Image
import io

app = FastAPI()

model = CycleGANModel()
model.load_networks('latest', './checkpoints/experiment_name/')
model.set_requires_grad([], False)
model.eval()

@app.post("/translate")
async def translate_image_endpoint(
    file: UploadFile = File(...),
    direction: str = Form('AtoB')
):
    """Translate uploaded image between domains."""
    image_data = await file.read()
    image = Image.open(io.BytesIO(image_data))
    translated = model.translate_image(image, direction=direction)

    # Stream the translated image back (StreamingResponse works with an in-memory buffer)
    img_io = io.BytesIO()
    translated.save(img_io, 'PNG')
    img_io.seek(0)
    return StreamingResponse(img_io, media_type='image/png')

@app.get("/health")
async def health_check():
    """Health check endpoint."""
    return {"status": "healthy", "model_loaded": True}

Integration with Streamlit

Create an interactive Streamlit application for image translation:

import streamlit as st
from models.cyclegan_model import CycleGANModel
from PIL import Image

st.title("CycleGAN Image Translation")

# Load model
@st.cache_resource
def load_cyclegan_model():
    model = CycleGANModel()
    model.load_networks('latest', './checkpoints/experiment_name/')
    model.set_requires_grad([], False)
    model.eval()
    return model

model = load_cyclegan_model()

# Sidebar controls
direction = st.sidebar.selectbox("Translation Direction", ["AtoB", "BtoA"])

# Image upload
uploaded_file = st.file_uploader("Upload Image", type=["jpg", "jpeg", "png"])

# Translate button
if uploaded_file is not None and st.button("Translate Image"):
    with st.spinner("Translating image..."):
        image = Image.open(uploaded_file)
        translated = model.translate_image(image, direction=direction)

    col1, col2 = st.columns(2)
    with col1:
        st.image(image, caption="Original Image", use_container_width=True)
    with col2:
        st.image(translated, caption="Translated Image", use_container_width=True)

# Cycle consistency visualization
st.header("Cycle Consistency")
if uploaded_file is not None and st.button("Show Cycle Consistency"):
    with st.spinner("Computing cycle consistency..."):
        image = Image.open(uploaded_file)
        # Forward cycle
        translated = model.translate_image(image, direction='AtoB')
        reconstructed = model.translate_image(translated, direction='BtoA')

    col1, col2, col3 = st.columns(3)
    with col1:
        st.image(image, caption="Original", use_container_width=True)
    with col2:
        st.image(translated, caption="Translated", use_container_width=True)
    with col3:
        st.image(reconstructed, caption="Reconstructed", use_container_width=True)

Contact Information

Get in Touch

Developer: Molla Samser
Designer & Tester: Rima Khatun

rskworld.in
help@rskworld.in support@rskworld.in
+91 93305 39277

License

This project is for educational purposes only. See LICENSE file for more details.

About RSK World

Founded by Molla Samser, with Designer & Tester Rima Khatun, RSK World is your one-stop destination for free programming resources, source code, and development tools.

Founder: Molla Samser
Designer & Tester: Rima Khatun

Contact Info

Nutanhat, Mongolkote
Purba Burdwan, West Bengal
India, 713147

+91 93305 39277

hello@rskworld.in
support@rskworld.in

© 2025 RSK World. All rights reserved.

Content used for educational purposes only.