Showing 6 open source projects for "python module"

View related business solutions
  • Create a personalized AI chatbot for each team in minutes Icon
    Create a personalized AI chatbot for each team in minutes

    Get better, faster answers for your whole team with an AI chatbot trained on your company documents.

    QueryPal is the lifeline your team needs. Our AI chatbot integrates seamlessly with your communication channels, using advanced language understanding to identify and auto-answer repetitive questions — in seconds.
    Learn More
  • The CI/CD Platform built for Mobile DevOps Icon
    The CI/CD Platform built for Mobile DevOps

    For mobile app developers interested in a powerful CI/CD platform for mobile app development and mobile DevOps

    Save time, money, and developer frustration with fast, flexible, and scalable mobile CI/CD that just works. Whether you swear by native or would rather go cross-platform, we have you covered. From Swift to Objective-C, Java to Kotlin, as well as Xamarin, Cordova, Ionic, React Native, and Flutter: Whatever you choose, we will automatically configure your initial workflows and have you building in minutes.
    Learn More
  • 1
    HunyuanVideo-Avatar

    HunyuanVideo-Avatar

    Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model

    HunyuanVideo-Avatar is a multimodal diffusion transformer (MM-DiT) model by Tencent Hunyuan for animating static avatar images into dynamic, emotion-controllable, and multi-character dialogue videos, conditioned on audio. It addresses challenges of motion realism, identity consistency, and emotional alignment. Innovations include a character image injection module, an Audio Emotion Module for transferring emotion cues, and a Face-Aware Audio Adapter to isolate audio effects on faces,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    HunyuanCustom

    HunyuanCustom

    Multimodal-Driven Architecture for Customized Video Generation

    HunyuanCustom is a multimodal video customization framework by Tencent Hunyuan, aimed at generating customized videos featuring particular subjects (people, characters) under flexible conditions, while maintaining subject/identity consistency. It supports conditioning via image, audio, video, and text, and can perform subject replacement in videos, generate avatars speaking given audio, or combine multiple subject images. The architecture builds on HunyuanVideo, with added modules for...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    VisualGLM-6B

    VisualGLM-6B

    Chinese and English multimodal conversational language model

    VisualGLM-6B is an open-source multimodal conversational language model developed by ZhipuAI that supports both images and text in Chinese and English. It builds on the ChatGLM-6B backbone, with 6.2 billion language parameters, and incorporates a BLIP2-Qformer visual module to connect vision and language. In total, the model has 7.8 billion parameters. Trained on a large bilingual dataset — including 30 million high-quality Chinese image-text pairs from CogView and 300 million English pairs...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Perception Models

    Perception Models

    State-of-the-art Image & Video CLIP, Multimodal Large Language Models

    Perception Models is a state-of-the-art framework developed by Facebook Research for advanced image and video perception tasks. It introduces two primary components: the Perception Encoder (PE) for visual feature extraction and the Perception Language Model (PLM) for multimodal decoding and reasoning. The PE module is a family of vision encoders designed to excel in image and video understanding, surpassing models like SigLIP2, InternVideo2, and DINOv2 across multiple benchmarks. Meanwhile,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Arryved POS System Icon
    Arryved POS System

    Drive contagious loyalty with your guests and staff with a POS and Brewery Management system that helps run your craft brewery better.

    Arryved was built to help craft beverage makers thrive.
    Learn More
  • 5
    Step1X-3D

    Step1X-3D

    High-Fidelity and Controllable Generation of Textured 3D Assets

    Step1X-3D is an open-source framework for generating high-fidelity textured 3D assets from scratch — both their geometry and surface textures — using modern generative AI techniques. It combines a hybrid architecture: a geometry generation stage using a VAE-DiT model to output a watertight 3D representation (e.g. TSDF surface), and a texture synthesis stage that conditions on geometry and optionally reference input (or prompts) to produce view-consistent textures using a diffusion-based...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    Video Pre-Training

    Video Pre-Training

    Learning to Act by Watching Unlabeled Online Videos

    The Video PreTraining (VPT) repository provides code and model artifacts for a project where agents learn to act by watching human gameplay videos—specifically, gameplay of Minecraft—using behavioral cloning. The idea is to learn general priors of control from large-scale, unlabeled video data, and then optionally fine-tune those priors for more goal-directed behavior via environment interaction. The repository contains demonstration models of different widths, fine-tuned variants (e.g. for...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB