Showing 1058 open source projects for "python data analysis"

View related business solutions
  • Network Discovery Software | JDisc Discovery Icon
    Network Discovery Software | JDisc Discovery

    JDisc Discovery supports the IT organizationss of medium-sized businesses and large-scale enterprises.

    JDisc Discovery is a comprehensive network inventory and IT asset management solution designed to help organizations gain clear, up-to-date visibility into their IT environment. It automatically scans and maps devices across the network, including servers, workstations, virtual machines, and network hardware, to create a detailed inventory of all connected assets. This includes critical information such as hardware configurations, software installations, patch levels, and relationshipots between devices.
    Learn More
  • Infor M3 ERP Icon
    Infor M3 ERP

    Enterprise manufacturers and distributors requiring a solution to manage and execute complex processes

    Efficiently executing the complex processes of enterprise manufacturers and distributors. Infor M3 is a cloud-based, manufacturing and distribution ERP system that leverages the latest technologies to provide an exceptional user experience and powerful analytics in a multicompany, multicountry, and multisite platform. Infor M3 and related CloudSuite™ industry solutions include industry-leading functionality for the chemical, distribution, equipment, fashion, food and beverage, and industrial manufacturing industries. Staying ahead of the competition means staying agile. Our new capabilities bring improved data-driven insights and streamlined workflows to help you make informed decisions and take quick action.
    Learn More
  • 1
    NBA Sports Betting Machine Learning

    NBA Sports Betting Machine Learning

    NBA sports betting using machine learning

    NBA-Machine-Learning-Sports-Betting is an open-source Python project that applies machine learning techniques to predict outcomes of National Basketball Association games for analytical and betting-related research. The system gathers historical team statistics and game data spanning multiple seasons, beginning with the 2007–2008 NBA season and continuing through the present. Using this dataset, the project constructs matchup features that represent team performance trends and contextual information about each game. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 2
    Anomaly Detection Learning Resources

    Anomaly Detection Learning Resources

    Anomaly detection related books, papers, videos, and toolboxes

    ...It includes materials covering a wide range of anomaly detection domains, including time series data, graph data, tabular datasets, and real-time monitoring systems. By compiling resources from multiple programming ecosystems such as Python, R, and other machine learning platforms, the repository allows users to discover both research papers and practical implementations.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    DeepVariant

    DeepVariant

    DeepVariant is an analysis pipeline that uses a deep neural networks

    DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data. DeepVariant is a deep learning-based variant caller that takes aligned reads (in BAM or CRAM format), produces pileup image tensors from them, classifies each tensor using a convolutional neural network, and finally reports the results in a standard VCF or gVCF file.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    Fli

    Fli

    Google Flights MCP and Python Library

    Fli is a powerful Python library and command-line tool that provides direct programmatic access to Google Flights data through reverse-engineered API interactions rather than traditional web scraping. This approach enables faster, more reliable, and more stable access to flight information, avoiding the fragility associated with HTML parsing and UI changes.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Transforming NetOps Through No-Code Network Automation - NetBrain Icon
    Transforming NetOps Through No-Code Network Automation - NetBrain

    For anyone searching for a complete no-code automation platform for hybrid network observability and AIOps

    NetBrain, founded in 2004, provides a powerful no-code automation platform for hybrid network observability, allowing organizations to enhance their operational efficiency through automated workflows. The platform applies automation across three key workflows: troubleshooting, change management, and assessment.
    Learn More
  • 5
    Memvid

    Memvid

    Video-based AI memory library. Store millions of text chunks in MP4

    Memvid encodes text chunks as QR codes within MP4 frames to build a portable “video memory” for AI systems. This innovative approach uses standard video containers and offers millisecond-level semantic search across large corpora with dramatically less storage than vector DBs. It's self-contained—no DB needed—and supports features like PDF indexing, chat integration, and cloud dashboards.
    Downloads: 42 This Week
    Last Update:
    See Project
  • 6
    E2B Cookbook

    E2B Cookbook

    Examples of using E2B

    ...The repository acts as a practical learning resource for developers who want to integrate AI agents with secure cloud execution environments that allow large language models to run code and interact with tools. The examples illustrate how developers can build AI workflows capable of performing tasks such as data analysis, code execution, and application generation inside isolated sandbox environments. E2B itself provides secure Linux-based sandboxes that enable AI systems to safely run generated code and interact with real computing resources without compromising the host environment. The cookbook organizes examples across multiple frameworks and model providers, allowing developers to experiment with integrations involving models from OpenAI, Anthropic, and other ecosystems.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Apache Hamilton

    Apache Hamilton

    Helps data scientists define testable self-documenting dataflows

    Apache Hamilton is an open-source Python framework designed to simplify the creation and management of dataflows used in analytics, machine learning pipelines, and data engineering workflows. The framework enables developers to define data transformations as simple Python functions, where each function represents a node in a dataflow graph and its parameters define dependencies on other nodes.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    Finance

    Finance

    150+ quantitative finance Python programs

    Finance is a repository that compiles structured notes and educational material related to financial analysis, markets, and quantitative finance concepts. The project focuses on explaining key principles used in finance and investment analysis, including topics such as financial statements, valuation models, portfolio theory, and financial markets. The repository is designed as a study reference for students and professionals who want to understand financial systems and the analytical...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    LangKit

    LangKit

    An open-source toolkit for monitoring Language Learning Models (LLMs)

    LangKit is an open-source text metrics toolkit for monitoring language models. It offers an array of methods for extracting relevant signals from the input and/or output text, which are compatible with the open-source data logging library whylogs. Productionizing language models, including LLMs, comes with a range of risks due to the infinite amount of input combinations, which can elicit an infinite amount of outputs. The unstructured nature of text poses a challenge in the ML observability...
    Downloads: 2 This Week
    Last Update:
    See Project
  • White Labeled Fintech Software Solutions | Centrex Icon
    White Labeled Fintech Software Solutions | Centrex

    Centrex is a full suite of white labeled fintech solutions built and designed for brokers, lenders, banks, investors, fintechs

    The Centrex products include: CRM, loan origination, loan and advance servicing software, syndication management, white labeled mobile app, money manager, underwriting, Esign, and website smart app builder. The Centrex services include: fintech software consulting, admin retainer services, and managed data cloud.
    Learn More
  • 10
    DoWhy

    DoWhy

    DoWhy is a Python library for causal inference

    DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks. Much like machine learning libraries have done for prediction, DoWhy is a Python library that aims to spark causal thinking and analysis.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11
    NeMo Retriever Library

    NeMo Retriever Library

    Document content and metadata extraction microservice

    NeMo Retriever Library is a scalable microservice framework designed for extracting, structuring, and enriching content from documents to support downstream generative AI applications. It processes various document types by splitting them into components such as text, tables, charts, and images, and then applies OCR and contextual analysis to convert them into structured data formats. The system is built on NVIDIA NIM microservices, enabling high-performance parallel processing and efficient handling of large datasets. It supports multiple extraction strategies for different document formats, balancing accuracy and throughput depending on the use case. Additionally, it can generate embeddings for extracted content and integrate with vector databases like Milvus, making it well-suited for retrieval-augmented generation pipelines.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    sktime

    sktime

    A unified framework for machine learning with time series

    sktime is a library for time series analysis in Python. It provides a unified interface for multiple time series learning tasks. Currently, this includes time series classification, regression, clustering, annotation, and forecasting. It comes with time series algorithms and scikit-learn compatible tools to build, tune and validate time series models. Our objective is to enhance the interoperability and usability of the time series analysis ecosystem in its entirety. sktime provides a unified interface for distinct but related time series learning tasks. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 13
    Evidently

    Evidently

    Evaluate and monitor ML models from validation to production

    Evidently is an open-source Python library for data scientists and ML engineers. It helps evaluate, test, and monitor ML models from validation to production. It works with tabular, text data and embeddings.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 14
    Integuru v0

    Integuru v0

    The first AI agent that builds permissionless integrations

    Integuru is an open-source AI agent designed to automatically create integrations between software platforms by reverse-engineering their internal APIs. Instead of relying on official developer documentation or publicly available APIs, the system analyzes network traffic generated by user interactions within a web application. Developers capture browser requests and authentication data, which the agent then uses to infer the structure of the platform’s internal API endpoints. Based on this...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    AI Powered Knowledge Graph Generator

    AI Powered Knowledge Graph Generator

    AI Powered Knowledge Graph Generator

    AI-Powered Knowledge Graph is an open-source project focused on building knowledge graph systems that integrate artificial intelligence and machine learning to represent complex relationships between data entities. Knowledge graphs organize information as networks of nodes and relationships, allowing applications to analyze connections between concepts, datasets, or real-world entities. By incorporating AI techniques such as natural language processing and semantic reasoning, the project...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Director

    Director

    AI video agents framework for next-gen video interactions

    Director is a video database management system designed to organize, search, and retrieve large collections of video content efficiently.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    FinRobot

    FinRobot

    An Open-Source AI Agent Platform for Financial Analysis using LLMs

    FinRobot is an open-source AI framework focused on automating financial data workflows by combining data ingestion, feature engineering, model training, and automated decision-making pipelines tailored for quantitative finance applications. It provides developers and quants with structured modules to fetch market data, process time series, generate technical indicators, and construct features appropriate for machine learning models, while also supporting backtesting and evaluation metrics to...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    OCRmyPDF

    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 105 This Week
    Last Update:
    See Project
  • 19
    Materials Discovery: GNoME

    Materials Discovery: GNoME

    AI discovers 520000 stable inorganic crystal structures for research

    Materials Discovery (GNoME) is a large-scale research initiative by Google DeepMind focused on applying graph neural networks to accelerate the discovery of stable inorganic crystal materials. The project centers on Graph Networks for Materials Exploration (GNoME), a message-passing neural network architecture trained on density functional theory (DFT) data to predict material stability and energy formation. Using GNoME, DeepMind identified 381,000 new stable materials, later expanding the...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 20
    chatd

    chatd

    Chat with your documents using local AI

    chatd is an open-source desktop application that allows users to interact with their documents through a locally running large language model. The software focuses on privacy and security by ensuring that all document processing and inference occur entirely on the user’s computer without sending data to external cloud services. It includes a built-in integration with the Ollama runtime, which provides a cross-platform environment for running large language models locally. The application...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 21
    WeClone

    WeClone

    One-stop solution for creating your digital avatar from chat history

    WeClone is an open source AI project designed to replicate a person’s conversational style and personality by training models on chat history data. The system analyzes message patterns, linguistic style, and contextual behavior in order to generate responses that resemble the original user’s communication style. It is intended primarily as an experimental exploration of digital personality modeling and conversational AI personalization. By processing large volumes of conversation data,...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 22
    HDBSCAN

    HDBSCAN

    A high performance implementation of HDBSCAN clustering

    ...In practice this means that HDBSCAN returns a good clustering straight away with little or no parameter tuning -- and the primary parameter, minimum cluster size, is intuitive and easy to select. HDBSCAN is ideal for exploratory data analysis; it's a fast and robust algorithm that you can trust to return meaningful clusters (if there are any).
    Downloads: 2 This Week
    Last Update:
    See Project
  • 23
    Vulnhuntr

    Vulnhuntr

    AI tool for detecting complex vulnerabilities in Python codebases

    Vulnhuntr is an open source security tool that uses large language models to analyze codebases and identify remotely exploitable vulnerabilities. It focuses on Python projects and applies static code analysis combined with LLM reasoning to trace how user input flows through an application. Instead of scanning entire repositories at once, it builds call chains step by step, allowing deeper inspection of complex, multi-stage issues that traditional tools may miss. Vulnhuntr can generate detailed findings, including vulnerability explanations and potential exploit paths, helping developers and security teams understand risks faster. ...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 24
    DeepAudit

    DeepAudit

    AI multi-agent platform for automated code security auditing system

    DeepAudit is an open source code security auditing platform that uses a multi-agent architecture to analyze and identify vulnerabilities in software projects. Instead of relying solely on traditional static analysis, it simulates the reasoning process of security experts through coordinated agents responsible for orchestration, reconnaissance, analysis, and verification. DeepAudit performs deep semantic understanding of code, enabling it to detect complex vulnerabilities that span multiple...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    Ploomber

    Ploomber

    The fastest way to build data pipelines

    Ploomber is an open-source framework designed to simplify the development and deployment of data science and machine learning pipelines. It allows developers to transform exploratory data analysis workflows into production-ready pipelines without rewriting large portions of code. The system integrates with common development environments such as Jupyter Notebook, VS Code, and PyCharm, enabling data scientists to continue working with familiar tools while building scalable workflows. ...
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB