Showing 155 open source projects for "python data analysis"

  • 1
    BruteForceAI

    Advanced LLM-powered brute-force tool combining AI intelligence

    BruteForceAI is an open-source security testing tool that applies large language models to the analysis of login forms and authentication flows in web applications. At a high level, the project uses AI to inspect HTML content, identify the relevant form elements, and automate selector discovery so that a tester does not need to hand-map every field before evaluation. It combines that analysis layer with automated credential testing workflows, framing itself as a more adaptive alternative to...
    Downloads: 0 This Week
  • 2
    MetaGPT

    The Multi-Agent Framework

    ...Assign different roles to GPTs to form a collaborative software entity for complex tasks. MetaGPT takes a one-line requirement as input and outputs user stories, competitive analysis, requirements, data structures, APIs, documents, etc. Internally, MetaGPT includes product managers, architects, project managers, and engineers. It provides the entire process of a software company along with carefully orchestrated SOPs.
    Downloads: 5 This Week
  • 3
    Bespoke Curator

    Synthetic data curation for post-training and data extraction

    Curator is an open-source Python library designed to build synthetic data pipelines for training and evaluating machine learning models, particularly large language models. The system helps developers generate, transform, and curate high-quality datasets by combining automated generation with structured validation and filtering. It supports workflows where models are used to produce synthetic examples that can later be refined into reliable training datasets for reasoning, question answering, or structured information extraction tasks. ...
    Downloads: 0 This Week
  • 4
    LLM Vision

    Visual intelligence for your home.

    LLM Vision is an open-source integration for Home Assistant that adds multimodal large language model capabilities to smart home environments. The project enables Home Assistant to analyze images, video files, and live camera feeds using vision-capable AI models. Instead of relying only on traditional object detection pipelines, it allows users to send prompts about visual content and receive contextual descriptions or answers about what is happening in camera footage. The system can process...
    Downloads: 2 This Week
  • 5
    GraphRAG

    A modular graph-based Retrieval-Augmented Generation (RAG) system

    The GraphRAG project is a data pipeline and transformation suite that is designed to extract meaningful, structured data from unstructured text using the power of LLMs.
    Downloads: 3 This Week
  • 6
    JSON_REPAIR

    A Python module to repair invalid JSON from LLMs

    json_repair is an open-source Python library designed to automatically fix malformed JSON data and convert it into valid, parseable structures. The tool is particularly useful in scenarios where JSON output is generated by large language models or external services that may produce syntactically invalid responses. Instead of failing when encountering errors such as missing quotes, trailing commas, or incomplete objects, the library analyzes the malformed data and reconstructs it into valid JSON. ...
    Downloads: 1 This Week
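The repair idea described above can be illustrated with a minimal standard-library sketch. This is not json_repair's algorithm or API: the function name `naive_repair` is hypothetical, and it handles only two of the error classes mentioned (trailing commas and incomplete objects or arrays), not missing quotes.

```python
import json
import re


def naive_repair(text: str):
    """Parse JSON, retrying after two simple repairs if the first attempt fails."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # Repair 1: drop trailing commas that sit right before a closing brace or bracket.
    # (Naive: this would also touch such a sequence inside a string value.)
    fixed = re.sub(r",\s*([}\]])", r"\1", text)

    # Repair 2: track unclosed braces and brackets, ignoring those inside strings.
    stack = []
    in_string = False
    escaped = False
    for ch in fixed:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()

    # Remove a dangling comma at the end, then close open containers innermost first.
    fixed = fixed.rstrip().rstrip(",")
    fixed += "".join(reversed(stack))
    return json.loads(fixed)


print(naive_repair('{"a": 1, "b": [1, 2,], }'))
print(naive_repair('{"a": [1, 2'))
```

Input that needs other kinds of repair still raises `json.JSONDecodeError`, which is the failure mode a dedicated library is meant to avoid.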
  • 7
    OpenPlanter

    Language-model investigation agent with a terminal UI

    OpenPlanter is an open-source Python project focused on building an intelligent automated planting or gardening system powered by software control and data processing. The repository is designed to help developers and hobbyists create programmable plant management workflows that can monitor, schedule, and optimize growing conditions. It emphasizes automation and extensibility, allowing integration with sensors, environmental data, and control logic for smart cultivation setups. ...
    Downloads: 0 This Week
  • 8
    Sage Chat

    Chat with any codebase in under two minutes | Fully local

    Sage is an open-source AI developer assistant designed to help engineers understand and work with complex codebases more effectively. The tool functions similarly to an intelligent research agent that can analyze a repository and answer questions about how the software works. Instead of focusing solely on code generation, Sage emphasizes code comprehension, system architecture analysis, and integration guidance. Developers can ask natural language questions about a project, and the system...
    Downloads: 0 This Week
  • 9
    Vanna 2.0

    Chat with your SQL database

    ...Vanna can be integrated into many environments, including notebooks, web applications, messaging platforms, and data dashboards, making it flexible for analytics and data exploration workflows. The system streams query results, visualizations, and summaries directly to user interfaces, allowing non-technical users to interact with complex data systems through conversational queries. It also includes enterprise-grade features such as user-aware security, permission enforcement, and query auditing for production deployments.
    Downloads: 1 This Week
  • 10
    Pixeltable

    Data Infrastructure providing an approach to multimodal AI workloads

    Pixeltable is an open-source Python data infrastructure framework designed to support the development of multimodal AI applications. The system provides a declarative interface for managing the entire lifecycle of AI data pipelines, including storage, transformation, indexing, retrieval, and orchestration of datasets. Unlike traditional architectures that require multiple tools such as databases, vector stores, and workflow orchestrators, Pixeltable unifies these functions within a table-based abstraction. ...
    Downloads: 0 This Week
  • 11
    NeMo Curator

    Scalable data preprocessing and curation toolkit for LLMs

    NeMo Curator is a Python library designed for fast and scalable dataset preparation and curation for large language model (LLM) use cases such as foundation model pretraining, domain-adaptive pretraining (DAPT), supervised fine-tuning (SFT), and parameter-efficient fine-tuning (PEFT). It accelerates data curation by leveraging GPUs with Dask and RAPIDS, resulting in significant time savings.
    Downloads: 0 This Week
  • 12
    LLMs-from-scratch

    Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

    ...By the end, you have a grounded sense of how data pipelines, optimization, and inference interact to produce fluent text.
    Downloads: 4 This Week
  • 13
    MathModelAgent

    An Agent Designed for Mathematical Modeling

    MathModelAgent is an AI agent system designed specifically for assisting with mathematical modeling tasks and academic problem solving. The platform automates the process of analyzing mathematical problems, constructing models, generating code for simulations or computations, and producing a complete research-style report. The project uses a multi-agent architecture where different specialized agents handle tasks such as problem interpretation, modeling design, programming implementation,...
    Downloads: 0 This Week
  • 14
    HolmesGPT

    CNCF Sandbox Project

    HolmesGPT is an open-source AI agent designed to help DevOps and site reliability engineering teams diagnose and resolve production incidents. The system aggregates signals from observability tools such as logs, metrics, alerts, and distributed traces, then analyzes them using large language models to identify potential root causes. Rather than requiring engineers to manually correlate large volumes of monitoring data, HolmesGPT automatically synthesizes evidence and presents explanations in...
    Downloads: 0 This Week
  • 15
    BertViz

    BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

    BertViz is an interactive tool for visualizing attention in Transformer language models such as BERT, GPT2, or T5. It can be run inside a Jupyter or Colab notebook through a simple Python API that supports most Huggingface models. BertViz extends the Tensor2Tensor visualization tool by Llion Jones, providing multiple views that each offer a unique lens into the attention mechanism. The head view visualizes attention for one or more attention heads in the same layer. It is based on the...
    Downloads: 0 This Week
  • 16
    LangChain

    ⚡ Building applications with LLMs through composability ⚡

    Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge. This library is aimed at assisting in the development of those types of applications.
    Downloads: 13 This Week
  • 17
    Qwen2-Audio

    Repo of Qwen2-Audio chat & pretrained large audio language model

    Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio signal inputs (including speech, sounds, etc.) and perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive, voice-only input) and Audio Analysis (audio + text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound...
    Downloads: 0 This Week
  • 18
    Kor

    LLM

    This is a half-baked prototype that “helps” you extract structured data from text using LLMs. Specify the schema of what should be extracted and provide some examples. Kor will generate a prompt, send it to the specified LLM and parse out the output. You might even get results back.
    Downloads: 0 This Week
  • 19
    Unstructured.IO

    Open source libraries and APIs to build custom preprocessing pipelines

    The unstructured library provides open-source components for ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more. The use cases of unstructured revolve around streamlining and optimizing the data processing workflow for LLMs. Its modular bricks and connectors form a cohesive system that simplifies data ingestion and pre-processing, adapts to different platforms, and efficiently transforms unstructured data into...
    Downloads: 1 This Week
  • 20
    ai-cookbook

    Examples and tutorials to help developers build AI systems

    ...The repository contains examples that demonstrate how to build AI workflows using modern tools such as large language models, autonomous agents, and external APIs. Developers can learn how to construct applications like intelligent assistants, automation pipelines, and AI-powered data analysis tools through step-by-step tutorials and ready-to-run scripts. The code examples are designed to emphasize practical architecture patterns that are commonly used in production environments, helping developers understand how to integrate AI services into software products.
    Downloads: 0 This Week
  • 21
    LlamaIndex

    Central interface to connect your LLMs with external data

    LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLMs with external data. LlamaIndex is a simple, flexible interface between your external data and LLMs. It provides the following tools in an easy-to-use fashion. Provides indices over your unstructured and structured data for use with LLMs. These indices help to abstract away common boilerplate and pain points for in-context learning. Dealing with prompt limitations (e.g. 4096 tokens for Davinci) when...
    Downloads: 0 This Week
  • 22
    Strix

    Open-source AI hackers to find and fix your app’s vulnerabilities

    Strix is an open source agent-driven security platform that uses autonomous AI agents to identify, investigate, and validate vulnerabilities in software applications. The system is designed to mimic the behavior of real attackers by executing dynamic testing and verifying findings through proof-of-concept exploitation. Unlike traditional vulnerability scanners that rely heavily on static analysis, Strix agents actively run code, probe systems, and attempt exploitation to confirm whether...
    Downloads: 5 This Week
  • 23
    LongBench

    LongBench v2 and LongBench (ACL '25 & '24)

    LongBench is a comprehensive benchmark designed to evaluate the ability of large language models to understand and reason over very long textual contexts. Traditional language model benchmarks typically evaluate tasks involving relatively short inputs, which does not reflect many real-world applications such as analyzing large documents or entire code repositories. LongBench addresses this gap by providing datasets that require models to process and reason over long sequences of text across...
    Downloads: 0 This Week
  • 24
    Torch Pruning

    DepGraph: Towards Any Structural Pruning

    Torch-Pruning is an open-source toolkit designed to optimize deep neural networks by performing structural pruning directly within PyTorch models. The library focuses on reducing the size and computational cost of neural networks by removing redundant parameters and channels while maintaining model performance. It introduces a graph-based algorithm called DepGraph that automatically identifies dependencies between layers, allowing parameters to be pruned safely across complex architectures....
    Downloads: 0 This Week
  • 25
    uqlm

    Uncertainty Quantification for Language Models, a Python package

    UQLM is a Python library developed to detect hallucinations and quantify uncertainty in the outputs of large language models. The system implements a variety of uncertainty quantification techniques that assign confidence scores to model responses. These scores help developers determine how likely a generated answer is to contain errors or fabricated information.
    Downloads: 0 This Week
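To make the idea of a confidence score concrete, here is a minimal sketch of one simple approach: sample several responses to the same prompt and measure how often they agree. This is an illustration only, not UQLM's API or necessarily one of its scorers; the function name `consistency_confidence` and the exact-match normalization are assumptions made for the example.

```python
from collections import Counter


def consistency_confidence(responses):
    """Return the most common answer and the fraction of samples that agree with it."""
    if not responses:
        raise ValueError("need at least one sampled response")
    # Assumption: answers are short, so case- and whitespace-insensitive exact match is enough.
    normalized = [r.strip().lower() for r in responses]
    answer, count = Counter(normalized).most_common(1)[0]
    return answer, count / len(normalized)


# Hypothetical samples, as if the same question had been asked four times.
samples = ["Paris", "paris", "Paris ", "Lyon"]
answer, score = consistency_confidence(samples)
print(answer, score)
```

A low agreement fraction suggests the model is unsure and the answer deserves a closer look. Free-form answers would need a semantic similarity measure instead of exact matching.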