Tools for merging pretrained large language models
Controllable and fast Text-to-Speech for over 7000 languages
Unified Multimodal Understanding and Generation Models
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
PyTorch code and models for VJEPA2 self-supervised learning from video
Educational framework exploring multi-agent orchestration
Chemcrow
A lightweight vision library for performing large object detection
Python framework for adversarial attacks and data augmentation
This repo contains the code for a 1D tokenizer and generator
Flexible Photo Recrafting While Preserving Your Identity
A SOTA open-source image editing model
Multi-Agent daTa geneRation Infra and eXperimentation framework
Build cross-modal and multimodal applications on the cloud
GUI Exploration Lab. One of the best GUI agent solutions
Large-language-model & vision-language-model based on Linear Attention
Deep and online learning with spiking neural networks in Python
Chat & pretrained large audio language model proposed by Alibaba Cloud
Did you say you like data?
An Efficient and Easy-to-use Federated Learning Framework
Run LLMs locally on Cloud Workstations
Real-time behaviour synthesis with MuJoCo, using Predictive Control
Tool-integrated Reasoning LLM Agents
NOTICE OF CONSOLIDATION & PARTNERSHIP PENDING As of April 2026, the 20