Official implementation of Watermark Anything with Localized Messages
Qwen2.5-VL is the multimodal large language model series
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
DeepSeek Coder: Let the Code Write Itself
FAIR Sequence Modeling Toolkit 2
PyTorch code and models for the DINOv2 self-supervised learning method
GLM-4-Voice | End-to-End Chinese-English Conversational Model
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
The Clay Foundation Model - An open source AI model and interface
Open-source framework for intelligent speech interaction
Generate Any 3D Scene in Seconds
Qwen3-ASR is an open-source series of ASR models
VMZ: Model Zoo for Video Modeling
Video understanding codebase from FAIR for reproducing video models
Tool for exploring and debugging transformer model behaviors
CLIP, Predict the most relevant text snippet given an image
A Unified Framework for Text-to-3D and Image-to-3D Generation
Advancing Open-source World Models
Controllable & emotion-expressive zero-shot TTS
Easy Docker setup for Stable Diffusion with user-friendly UI
Inference script for Oasis 500M
HY-Motion model for 3D character animation generation
A Production-ready Reinforcement Learning AI Agent Library
Official implementation of DreamCraft3D