Official repo for "Sa2VA: Marrying SAM2 with LLaVA"
LLM-based agent for general purpose software engineering tasks
Large Multimodal Models for Video Understanding and Editing
Open multimodal web agent built by Ai2
Machine learning on FPGAs using HLS
Advanced NLP with spaCy: A free online course
This repository is a curated collection of links to various courses
Towards Efficient Self-Evolving Agent System
Driving with Graph Visual Question Answering
Constrained Value Alignment via Safe Reinforcement Learning
An Efficient Web-enhanced Question Answering System
Unleashing 10,000+ Word Generation from Long Context LLMs
Autoregressive Model Beats Diffusion
Empowering Code Generation with OSS-Instruct
A Pioneering Open-Source Alternative to GPT-4o
Pretrained time-series foundation model developed by Google Research
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
PyTorch code and models for V-JEPA self-supervised learning from video
An implementation of a deep learning recommendation model (DLRM)
Self-supervised visual learning using momentum contrast in PyTorch
Code to accompany "A Method for Animating Children's Drawings"
Easy-to-use, modular, and extendible package of deep-learning models
The Unified Machine Learning Framework
OCR expert VLM powered by Hunyuan's native multimodal architecture