DiT-MoE

A diffusion transformer model that scales to large parameter counts

  • Provides a PyTorch model definition
  • Includes pre-trained weights
  • Includes training and sampling code
  • Supports scaling to large parameter counts
  • Optimized inference
  • Provides expert routing analysis tools
  • Includes a synthetic data generation script

Product Details

DiT-MoE is a diffusion transformer model implemented in PyTorch. As its name indicates, it uses a mixture-of-experts (MoE) design, which allows it to scale up to 16 billion parameters while keeping inference optimized and remaining competitive with dense networks. It targets deep learning workloads on large-scale datasets and is of value for both research and practical applications.
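
To illustrate what expert routing and routing analysis mean in a mixture-of-experts model, here is a minimal, framework-free sketch in plain Python. It is not taken from the DiT-MoE code: the function names (`route_top_k`, `expert_usage`), the choice of k = 2, the renormalization of the selected weights, and the toy router scores are all assumptions made for illustration only.

```python
import math
from collections import Counter


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=2):
    """Pick the k highest-scoring experts for one token.

    Returns a list of (expert_index, weight) pairs. The weights are
    renormalized to sum to 1 (an assumption; real models differ here).
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def expert_usage(batch_logits, k=2):
    """Count how many tokens were routed to each expert."""
    counts = Counter()
    for logits in batch_logits:
        for idx, _weight in route_top_k(logits, k):
            counts[idx] += 1
    return counts


# Hypothetical router scores: 4 tokens, 4 experts.
batch = [
    [2.0, 0.5, -1.0, 0.1],
    [0.0, 3.0, 1.0, -2.0],
    [1.5, 1.4, -0.3, 0.2],
    [-1.0, 0.2, 0.1, 2.5],
]
usage = expert_usage(batch, k=2)
for expert in sorted(usage):
    print(f"expert {expert}: {usage[expert]} tokens")
```

Each token activates only k experts rather than all of them, which is how an MoE model can hold many parameters while keeping per-token compute low. A per-expert usage count like the one above is one simple kind of routing analysis, since it shows whether the load is spread evenly or concentrated on a few experts.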