Showing 1058 open source projects for "python data analysis"

View related business solutions
  • Simplify Purchasing For Your Business Icon
    Simplify Purchasing For Your Business

    Manage what you buy and how you buy it with Order.co, so you have control over your time and money spent.

    Simplify every aspect of buying for your business in Order.co. From sourcing products to scaling purchasing across locations to automating your AP and approvals workstreams, Order.co is the platform of choice for growing businesses.
    Learn More
  • Premier Construction Software Icon
    Premier Construction Software

    Premier is a global leader in financial construction ERP software.

    Rated #1 Construction Accounting Software by Forbes Advisor in 2022 & 2023. Our modern SAAS solution is designed to meet the needs of General Contractors, Developers/Owners, Homebuilders & Specialty Contractors.
    Learn More
  • 1
    LLMs-from-scratch

    LLMs-from-scratch

    Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

    ...By the end, you have a grounded sense of how data pipelines, optimization, and inference interact to produce fluent text.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    Oumi

    Oumi

    Everything you need to build state-of-the-art foundation models

    Oumi is an open-source framework that provides everything needed to build state-of-the-art foundation models, end-to-end. It aims to simplify the development of large-scale machine-learning models.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 3
    VikingDB MCP Server

    VikingDB MCP Server

    A mcp server for vikingdb store and search

    An MCP server that interfaces with VikingDB, a high-performance vector database developed by ByteDance, enabling efficient vector storage and search capabilities. ​
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    LangCheck

    LangCheck

    Simple, Pythonic building blocks to evaluate LLM applications

    Simple, Pythonic building blocks to evaluate LLM applications.
    Downloads: 3 This Week
    Last Update:
    See Project
  • Inventory and Order Management Software for Multichannel Sellers Icon
    Inventory and Order Management Software for Multichannel Sellers

    Avoid stockouts, overselling, and losing control as your business grows.

    We are the most powerful inventory and order management platform for Amazon, Walmart, and multichannel product sellers. Centralize orders, product information, and fulfillment operations to run more efficiently, sell more products, and stay compliant with marketplace requirements so you can grow profitably.
    Learn More
  • 5
    Qlib

    Qlib

    Qlib is an AI-oriented quantitative investment platform

    Qlib is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment. With Qlib, you can easily try your ideas to create better Quant investment strategies. An increasing number of SOTA Quant research works/papers are released in Qlib. With Qlib, users can easily try their ideas to create better Quant investment strategies. At the module level, Qlib is a platform that consists of...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 6
    Darts

    Darts

    A python library for easy manipulation and forecasting of time series

    darts is a Python library for easy manipulation and forecasting of time series. It contains a variety of models, from classics such as ARIMA to deep neural networks. The models can all be used in the same way, using fit() and predict() functions, similar to scikit-learn. The library also makes it easy to backtest models, combine the predictions of several models, and take external data into account.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    MLE-bench

    MLE-bench

    AI multi-agent framework for automating data-driven R&D workflows

    RD-Agent is an open source AI framework designed to automate research and development workflows in data-driven domains. It uses large language models and multiple collaborating agents to simulate the typical cycle of research, experimentation, and improvement that human data scientists follow. It separates the process into two core phases: a research stage that proposes hypotheses and ideas, and a development stage that implements and evaluates them through code execution and experiments. By...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    refinery

    refinery

    Open-source choice to scale, assess and maintain natural language data

    The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact. You are one of the people we've built refinery for. refinery helps you to build better NLP models in a data-centric approach. Semi-automate your labeling, find low-quality subsets in your training data, and monitor your data in one place. refinery doesn't get rid of manual labeling, but it makes sure that your valuable time is spent well. Also, the makers...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    MindsDB

    MindsDB

    Making Enterprise Data Intelligent and Responsive for AI

    MindsDB is an AI data solution that enables humans, AI, agents, and applications to query data in natural language and SQL, and get highly accurate answers across disparate data sources and types. MindsDB connects to diverse data sources and applications, and unifies petabyte-scale structured and unstructured data. Powered by an industry-first cognitive engine that can operate anywhere (on-prem, VPC, serverless), it empowers both humans and AI with highly informed decision-making...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Skillfully - The future of skills based hiring Icon
    Skillfully - The future of skills based hiring

    Realistic Workplace Simulations that Show Applicant Skills in Action

    Skillfully transforms hiring through AI-powered skill simulations that show you how candidates actually perform before you hire them. Our platform helps companies cut through AI-generated resumes and rehearsed interviews by validating real capabilities in action. Through dynamic job specific simulations and skill-based assessments, companies like Bloomberg and McKinsey have cut screening time by 50% while dramatically improving hire quality.
    Learn More
  • 10
    Kor

    Kor

    LLM

    This is a half-baked prototype that “helps” you extract structured data from text using LLMs. Specify the schema of what should be extracted and provide some examples. Kor will generate a prompt, send it to the specified LLM and parse out the output. You might even get results back.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    AutoKeras

    AutoKeras

    AutoML library for deep learning

    AutoKeras: An AutoML system based on Keras. It is developed by DATA Lab at Texas A&M University. The goal of AutoKeras is to make machine learning accessible to everyone. AutoKeras only support Python 3. If you followed previous steps to use virtualenv to install tensorflow, you can just activate the virtualenv. Currently, AutoKeras is only compatible with Python >= 3.7 and TensorFlow >= 2.8.0. AutoKeras supports several tasks with extremely simple interface. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Video-subtitle-remover (VSR)

    Video-subtitle-remover (VSR)

    AI tool that removes hardcoded subtitles and text from videos locally

    Video Subtitle Remover is an AI-based application designed to remove hardcoded subtitles from videos and generate new files without the embedded text. Video Subtitle Remover analyzes video frames and detects subtitle regions, then replaces the removed areas using an AI algorithm that fills the space with reconstructed visual content. This process aims to maintain the original resolution and visual continuity of the video after subtitle removal. It allows users to define a specific subtitle...
    Downloads: 121 This Week
    Last Update:
    See Project
  • 13
    Robyn

    Robyn

    Experimental, AI/ML-powered and open sourced Marketing Mix Modeling

    Robyn is an open-source, AI/ML-powered Marketing Mix Modeling (MMM) toolkit developed by Meta Marketing Science under the “facebookexperimental” GitHub umbrella. Its goal is to democratize rigorous MMM: what traditionally required expert statisticians and expensive consulting becomes accessible to any company with data. Robyn takes in historical data (spends on different marketing channels, conversions, or revenue, and optional context or organic-media variables) and uses a combination of...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 14
    notebooklm-py

    notebooklm-py

    Unofficial Python API and agentic skill for Google NotebookLM

    ...The project covers notebook management, source ingestion, conversational querying, research workflows, and sharing controls, while also enabling the generation of a wide range of study and media artifacts. These outputs include audio overviews, videos, slide decks, infographics, quizzes, flashcards, reports, data tables, and mind maps, with configurable formats and export options.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 15
    River ML

    River ML

    Online machine learning in Python

    River is a Python library for online machine learning. It aims to be the most user-friendly library for doing machine learning on streaming data. River is the result of a merger between creme and scikit-multiflow.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Dynamiq

    Dynamiq

    An orchestration framework for agentic AI and LLM applications

    Dynamiq is an open-source orchestration framework designed to streamline the development of generative AI applications that rely on large language models and autonomous agents. The framework focuses on simplifying the creation of complex AI workflows that involve multiple agents, retrieval systems, and reasoning steps. Instead of building each component manually, developers can use Dynamiq’s structured APIs and modular architecture to connect language models, vector databases, and external...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 17
    Cube Studio

    Cube Studio

    Cube Studio open source cloud native one-stop machine learning

    Cube Studio is an open-source, cloud-native end-to-end machine learning and AI platform designed to support the full lifecycle of AI development — from data preparation and interactive notebook coding to distributed training, model tuning, and deployment in production-ready environments. It provides a unified interface where teams can manage data sources, track datasets, and build pipelines using drag-and-drop workflow orchestration, making it accessible for both engineers and data...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 18
    OpenViking

    OpenViking

    Context database designed specifically for AI Agents

    OpenViking is an open-source context database engineered for efficient indexing and retrieval of large amounts of unstructured or semi-structured context data used by AI applications. It’s primarily designed to serve as a high-performance, scalable backend for storing app context, embeddings, conversational histories, and other textual artifacts that need rapid lookup and semantic search, which makes it especially useful for systems like chatbots or memory-augmented agents. The project is...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 19
    ChatDev

    ChatDev

    Create Customized Software using Natural Language Idea

    ChatDev is an AI-powered development tool designed to simulate the software development lifecycle using multi-agent collaboration. It allows multiple AI agents to take on roles such as product managers, developers, and testers to collaboratively generate, refine, and evaluate software code. This project explores how AI can be leveraged to automate and optimize development workflows.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 20
    EverMemOS

    EverMemOS

    Long-term memory OS for AI with structured recall and context awarenes

    EverMemOS is an open-source memory operating system built to give AI agents long-term, structured memory. It captures conversations, transforms them into organized memory units, and enables agents to recall past interactions with context and meaning. Instead of treating each prompt independently, it builds evolving user profiles, tracks preferences, and connects related events into coherent narratives. Its architecture combines memory storage, indexing, and retrieval with agent-level...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 21
    Vanna

    Vanna

    Chat with your SQL database

    Vanna.AI is an AI-powered tool for natural language database querying, enabling users to interact with databases using simple English queries. It converts natural language questions into SQL queries, making data access more intuitive for non-technical users.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 22
    TorchAudio

    TorchAudio

    Data manipulation and transformation for audio signal processing

    The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the same philosophy of providing strong GPU acceleration, having a focus on trainable features through the autograd system, and having consistent style (tensor names and dimension names). Therefore, it is primarily a machine learning library and not a general signal processing library. The benefits of PyTorch can be seen in torchaudio through having all the computations be through PyTorch...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 23
    Unstructured.IO

    Unstructured.IO

    Open source libraries and APIs to build custom preprocessing pipelines

    The unstructured library provides open-source components for ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more. The use cases of unstructured revolve around streamlining and optimizing the data processing workflow for LLMs. unstructured modular bricks and connectors form a cohesive system that simplifies data ingestion and pre-processing, making it adaptable to different platforms and is efficient in transforming unstructured data into...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    Weights and Biases

    Weights and Biases

    Tool for visualizing and tracking your machine learning experiments

    Use W&B to build better models faster. Track and visualize all the pieces of your machine learning pipeline, from datasets to production models. Quickly identify model regressions. Use W&B to visualize results in real time, all in a central dashboard. Focus on the interesting ML. Spend less time manually tracking results in spreadsheets and text files. Capture dataset versions with W&B Artifacts to identify how changing data affects your resulting models. Reproduce any model, with saved...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 25
    Guardrails

    Guardrails

    Framework for validating and controlling LLM outputs in AI apps

    ...These guards can detect and mitigate specific issues by applying validators that analyze content, enforce rules, or ensure structured output formats. Guardrails also supports generating structured data from language models, allowing developers to enforce schemas or type constraints on responses. A companion ecosystem known as a hub provides reusable validators that can be combined into input and output guards to address different reliability and safety concerns.
    Downloads: 5 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB