
DiT-MoE
A diffusion transformer model that scales to large parameter counts via mixture-of-experts layers
- Provides a PyTorch model definition
- Includes pre-trained weights
- Includes training and sampling code
- Supports scaling to large parameter counts
- Optimized for inference
- Provides expert routing analysis tools (see the sketch after this list)
- Includes a synthetic data generation script
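
To make the routing idea behind these features concrete, below is a minimal, illustrative PyTorch sketch of a top-k mixture-of-experts feed-forward block of the kind DiT-MoE builds on. The class name `MoEFeedForward` and every hyperparameter here are assumptions for illustration; the repository's actual layer definitions and routing analysis tools may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    """Sparse MoE FFN: every token is processed only by its top-k experts."""

    def __init__(self, dim, hidden_dim, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, num_experts, bias=False)  # the router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden_dim), nn.GELU(), nn.Linear(hidden_dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):
        # x: (tokens, dim); flatten batch and sequence dims before calling.
        weights, idx = torch.topk(self.gate(x), self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # normalize over the chosen experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            rows, slots = (idx == e).nonzero(as_tuple=True)  # tokens routed to expert e
            if rows.numel():
                out[rows] += weights[rows, slots, None] * expert(x[rows])
        return out

tokens = torch.randn(16, 64)                   # 16 tokens, model width 64
layer = MoEFeedForward(dim=64, hidden_dim=256)
print(layer(tokens).shape)                     # torch.Size([16, 64])

# Toy routing analysis: how many tokens each expert received.
with torch.no_grad():
    _, idx = torch.topk(layer.gate(tokens), layer.top_k, dim=-1)
print(torch.bincount(idx.flatten(), minlength=8))
```

The last two lines double as a toy version of routing analysis: counting how often each expert is selected is the basic statistic behind expert-routing diagnostics.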
Product Details
DiT-MoE is a diffusion transformer model implemented in PyTorch. It scales up to 16 billion parameters while its inference is highly optimized and remains competitive with dense networks. It represents state-of-the-art work in deep learning on large-scale datasets and is of significant research and practical value.
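
Why a sparse MoE can reach billions of stored parameters while keeping per-token inference cost close to a dense model comes down to simple arithmetic: only the top-k experts run for each token. A back-of-the-envelope sketch, with purely illustrative dimensions that are not the repository's actual configuration:

```python
dim, hidden, num_experts, top_k = 1152, 4608, 8, 2  # illustrative values

ffn = 2 * dim * hidden      # one expert: two linear layers (biases ignored)
total = num_experts * ffn   # parameters stored in the MoE layer
active = top_k * ffn        # parameters actually used per token
print(f"total {total/1e6:.1f}M, active per token {active/1e6:.1f}M")
# -> total 84.9M, active per token 21.2M: 8x a dense FFN's capacity
#    at only 2x its per-token compute
```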