Flax is a neural network library for JAX
An open source library for GPU-accelerated robot learning
4M: Massively Multimodal Masked Modeling
ICLR 2024 Spotlight: curation/training code, metadata, distribution
[CVPR 2025 Best Paper Award] VGGT
Official implementation of DreamCraft3D
TGMC: TerraGov Marine Corps, an SS13 mod
Python framework for adversarial attacks and data augmentation
ChemCrow
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
High-Fidelity and Controllable Generation of Textured 3D Assets
Unifying 3D Mesh Generation with Language Models
A personal context-agent that learns how you work
Tools for merging pretrained large language models
Controllable and fast Text-to-Speech for over 7000 languages
Unified Multimodal Understanding and Generation Models
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
PyTorch code and models for VJEPA2 self-supervised learning from video
Educational framework exploring multi-agent orchestration
A lightweight vision library for performing large object detection
This repo contains the code for a 1D tokenizer and generator
Flexible Photo Recrafting While Preserving Your Identity
A SOTA open-source image editing model