Showing 1058 open source projects for "python data analysis"

View related business solutions
  • 1
    MemU

    MemU

    MemU is an open-source memory framework for AI companions

    MemU is an agentic memory layer for LLM applications, specifically designed for AI companions. Transform your memory into an intelligent file system that automatically organizes, connects, and evolves with your memories. Simple, fast, and reliable memory infrastructure for AI applications. Powerful tools and dedicated support to scale your AI applications with confidence. Full proprietary features, commercial usage rights, and white-labeling options for your enterprise needs. SSO/RBAC...
    Downloads: 20 This Week
    Last Update:
    See Project
  • 2
    nesa

    nesa

    Run AI models end-to-end encrypted

    nesa is an open-source initiative focused on building decentralized AI infrastructure that enables secure, verifiable, and privacy-preserving machine learning and inference across distributed environments. The project aims to address key challenges in modern AI systems, such as data privacy, trust, and centralization, by leveraging cryptographic techniques and decentralized architectures. NESA allows developers to run AI computations in a way that ensures data integrity and confidentiality,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    DB-GPT

    DB-GPT

    Revolutionizing Database Interactions with Private LLM Technology

    DB-GPT is an experimental open-source project that uses localized GPT large models to interact with your data and environment. With this solution, you can be assured that there is no risk of data leakage, and your data is 100% private and secure.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 4
    MathModelAgent

    MathModelAgent

    An Agent Designed for Mathematical Modeling

    MathModelAgent is an AI agent system designed specifically for assisting with mathematical modeling tasks and academic problem solving. The platform automates the process of analyzing mathematical problems, constructing models, generating code for simulations or computations, and producing a complete research-style report. The project uses a multi-agent architecture where different specialized agents handle tasks such as problem interpretation, modeling design, programming implementation,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 5
    Ragas

    Ragas

    Supercharge Your LLM Application Evaluations

    Objective metrics, intelligent test generation, and data-driven insights for LLM apps. Ragas is your ultimate toolkit for evaluating and optimizing Large Language Model (LLM) applications. Say goodbye to time-consuming, subjective assessments and hello to data-driven, efficient evaluation workflows. Don't have a test dataset ready? We also do production-aligned test set generation.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 6
    SageMaker Training Toolkit

    SageMaker Training Toolkit

    Train machine learning models within Docker containers

    Train machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. To train a model, you can include your training script and dependencies in a Docker container that runs your training code. A container provides an effectively isolated environment, ensuring a consistent runtime and...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 7
    WhatsApp MCP Server

    WhatsApp MCP Server

    WhatsApp MCP server enabling AI access to chats and messaging

    ...It acts as a bridge between WhatsApp and large language models, allowing controlled access to messages, chats, and contacts. whatsapp-mcp is composed of two main components: a Go-based bridge that connects to the WhatsApp Web API and stores data locally, and a Python-based MCP server that exposes tools for AI interaction. All message data is stored in a local SQLite database and is only accessed when explicitly requested through defined tools, giving users control over how their data is used. It supports both sending and receiving messages, including various media types such as images, audio, videos, and documents. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    A.I.G

    A.I.G

    Full-stack AI Red Teaming platform

    AI-Infra-Guard is a powerful open-source security platform from Tencent’s Zhuque Lab designed to assess the safety and resilience of AI infrastructures, codebases, and components through automated scanning and evaluation tools. It brings together AI infrastructure vulnerability scanning, MCP server risk analysis, and jailbreak evaluation into a unified workflow so that enterprises and individuals can identify critical security issues without relying on external services. Users can deploy it...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 9
    plexe

    plexe

    Build a machine learning model from a prompt

    ...The project supports both a Python library and a managed cloud option, meeting teams wherever they prefer to run workloads. The overall goal is to compress the path from idea to usable model while keeping humans in the loop for review and adjustment.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 10
    Deepchecks

    Deepchecks

    Test Suites for validating ML models & data

    Deepchecks is the leading tool for testing and for validating your machine learning models and data, and it enables doing so with minimal effort. Deepchecks accompany you through various validation and testing needs such as verifying your data’s integrity, inspecting its distributions, validating data splits, evaluating your model and comparing between different models. While you’re in the research phase, and want to validate your data, find potential methodological problems, and/or validate...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11
    Wan2.2

    Wan2.2

    Wan2.2: Open and Advanced Large-Scale Video Generative Model

    Wan2.2 is a major upgrade to the Wan series of open and advanced large-scale video generative models, incorporating cutting-edge innovations to boost video generation quality and efficiency. It introduces a Mixture-of-Experts (MoE) architecture that splits the denoising process across specialized expert models, increasing total model capacity without raising computational costs. Wan2.2 integrates meticulously curated cinematic aesthetic data, enabling precise control over lighting,...
    Downloads: 151 This Week
    Last Update:
    See Project
  • 12
    PaddleX

    PaddleX

    PaddlePaddle End-to-End Development Toolkit

    PaddleX is a deep learning full-process development tool based on the core framework, development kit, and tool components of Paddle. It has three characteristics opening up the whole process, integrating industrial practice, and being easy to use and integrate. Image classification and labeling is the most basic and simplest labeling task. Users only need to put pictures belonging to the same category in the same folder. When the model is trained, we need to divide the training set, the...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 13
    JSON_REPAIR

    JSON_REPAIR

    A python module to repair invalid JSON from LLMs

    json_repair is an open-source Python library designed to automatically fix malformed JSON data and convert it into valid, parseable structures. The tool is particularly useful in scenarios where JSON output is generated by large language models or external services that may produce syntactically invalid responses. Instead of failing when encountering errors such as missing quotes, trailing commas, or incomplete objects, the library analyzes the malformed data and reconstructs it into valid JSON. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    NeMo Curator

    NeMo Curator

    Scalable data pre processing and curation toolkit for LLMs

    NeMo Curator is a Python library specifically designed for fast and scalable dataset preparation and curation for large language model (LLM) use-cases such as foundation model pretraining, domain-adaptive pretraining (DAPT), supervised fine-tuning (SFT) and paramter-efficient fine-tuning (PEFT). It greatly accelerates data curation by leveraging GPUs with Dask and RAPIDS, resulting in significant time savings.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    ToolUniverse

    ToolUniverse

    Democratizing AI scientists with ToolUniverse

    ...Instead of requiring custom pipelines or fine-tuning, ToolUniverse wraps around existing models and enables them to reason, experiment, and iterate on complex workflows such as drug discovery, data analysis, and hypothesis testing. The platform abstracts tool usage behind a consistent interface, allowing AI agents to compose multi-step workflows, refine tool definitions automatically, and even generate new tools from natural language descriptions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Lightly

    Lightly

    A python library for self-supervised learning on images

    A python library for self-supervised learning on images. We, at Lightly, are passionate engineers who want to make deep learning more efficient. That's why - together with our community - we want to popularize the use of self-supervised methods to understand and curate raw image data. Our solution can be applied before any data annotation step and the learned representations can be used to visualize and analyze datasets.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    uqlm

    uqlm

    Uncertainty Quantification for Language Models, is a Python package

    UQLM is a Python library developed to detect hallucinations and quantify uncertainty in the outputs of large language models. The system implements a variety of uncertainty quantification techniques that assign confidence scores to model responses. These scores help developers determine how likely a generated answer is to contain errors or fabricated information.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 18
    High-Level Training Utilities Pytorch

    High-Level Training Utilities Pytorch

    High-level training, data augmentation, and utilities for Pytorch

    Contains significant improvements, bug fixes, and additional support. Get it from the releases, or pull the master branch. This package provides a few things. A high-level module for Keras-like training with callbacks, constraints, and regularizers. Comprehensive data augmentation, transforms, sampling, and loading. Utility tensor and variable functions so you don't need numpy as often. Have any feature requests? Submit an issue! I'll make it happen. Specifically, any data augmentation, data...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 19
    PySINDy

    PySINDy

    A package for the sparse identification of nonlinear dynamical systems

    PySINDy is a Python library that implements the Sparse Identification of Nonlinear Dynamics (SINDy) method for discovering mathematical models of dynamical systems from data. The framework focuses on identifying governing equations that describe the behavior of complex physical systems by selecting sparse combinations of candidate functions. Instead of fitting a purely predictive machine learning model, PySINDy attempts to recover interpretable differential equations that explain how a system evolves over time. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Sage Chat

    Sage Chat

    Chat with any codebase in under two minutes | Fully local

    Sage is an open-source AI developer assistant designed to help engineers understand and work with complex codebases more effectively. The tool functions similarly to an intelligent research agent that can analyze a repository and answer questions about how the software works. Instead of focusing solely on code generation, Sage emphasizes code comprehension, system architecture analysis, and integration guidance. Developers can ask natural language questions about a project, and the system...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Trail of Bits Skills Marketplace

    Trail of Bits Skills Marketplace

    Trail of Bits Claude Code skills for security research, vulnerability

    Trail of Bits Skills Marketplace is a specialized Claude Code skills marketplace built by the security research firm Trail of Bits that focuses on enhancing AI-assisted workflows for vulnerability discovery, testing, and secure development. The repository groups a set of plug-in skills tailored toward static analysis, code auditing, secure defaults detection, and other practices that matter in software security. Users can easily add the marketplace to a Claude Code environment, browse...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    LitterBox

    LitterBox

    A secure sandbox environment for malware developers and red teamers

    LitterBox is a controlled malware-analysis and payload-testing sandbox aimed at red teams who need to validate evasions and behaviors before deployment. It provides an isolated environment to exercise payloads against modern detection stacks, verify signatures and heuristics, and observe runtime characteristics without leaking binaries to third-party vendors. The README frames typical use cases: testing evasion, validating detections, analyzing behavior, and keeping sensitive tooling...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    spaCy

    spaCy

    Industrial-strength Natural Language Processing (NLP)

    spaCy is a library built on the very latest research for advanced Natural Language Processing (NLP) in Python and Cython. Since its inception it was designed to be used for real world applications-- for building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and so much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks, with...
    Downloads: 53 This Week
    Last Update:
    See Project
  • 24
    Torch Pruning

    Torch Pruning

    DepGraph: Towards Any Structural Pruning

    Torch-Pruning is an open-source toolkit designed to optimize deep neural networks by performing structural pruning directly within PyTorch models. The library focuses on reducing the size and computational cost of neural networks by removing redundant parameters and channels while maintaining model performance. It introduces a graph-based algorithm called DepGraph that automatically identifies dependencies between layers, allowing parameters to be pruned safely across complex architectures....
    Downloads: 3 This Week
    Last Update:
    See Project
  • 25
    Alibi Detect

    Alibi Detect

    Algorithms for outlier, adversarial and drift detection

    Alibi Detect is an open source Python library focused on outlier, adversarial and drift detection. The package aims to cover both online and offline detectors for tabular data, text, images and time series. Both TensorFlow and PyTorch backends are supported for drift detection.
    Downloads: 5 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB