Showing 55 open source projects for "website using python"

  • 1
    Python API for JMComic

    Python crawler and API for downloading JMComic albums and images

    JMComic-Crawler-Python is a Python library and crawler framework designed to programmatically access and download comic content from the JMComic platform. It provides a structured API that allows developers to retrieve albums, chapters, and images using simple Python code while handling the necessary network requests and data processing behind the scenes.
    Downloads: 5 This Week
    See Project
  • 2
    ScrapeGraphAI

    Python scraper based on AI

    ScrapeGraphAI extracts content from websites and local documents using LLMs. It is a Python web scraping library that uses LLMs and direct graph logic to create scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.). Just say which information you want to extract and the library will do it for you.
    Downloads: 14 This Week
    See Project
  • 3
    FinalRecon

    All-in-one Python web reconnaissance tool for fast target analysis

    FinalRecon is an all-in-one web reconnaissance tool written in Python that helps security professionals gather information about a target website quickly and efficiently. It combines multiple reconnaissance techniques into a single command-line utility so users do not need to run several separate tools to collect similar data. FinalRecon focuses on providing a fast overview of a web target while maintaining accuracy in the collected results.
    Downloads: 3 This Week
    See Project
  • 4
    changedetection.io

    The best free open source website change detection and restock service

    Using the browser steps configuration, add basic steps before performing change detection, such as logging into websites, adding a product to a cart, accepting cookie logins, entering dates, and refining searches. Monitor and track PDF file changes, and know when a PDF file has text changes. Know when your favourite product is on sale, or other special deals are announced before anyone else.
    Downloads: 13 This Week
    See Project
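One common way to detect that a page or an extracted PDF text has changed is to compare content fingerprints between checks. The sketch below illustrates that general idea only; it is an assumption for illustration, not changedetection.io's actual implementation, and the function names are hypothetical.

```python
import hashlib

def fingerprint(text):
    """Hash the text after collapsing whitespace runs to single spaces."""
    # Normalizing whitespace keeps pure formatting changes from counting as changes.
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def has_changed(previous_fingerprint, current_text):
    """Return True when the current text no longer matches the stored fingerprint."""
    return fingerprint(current_text) != previous_fingerprint

stored = fingerprint("Price: 20 EUR\nIn stock")
print(has_changed(stored, "Price:  20 EUR   In stock"))  # whitespace-only edit
print(has_changed(stored, "Price: 15 EUR\nIn stock"))    # real text edit
```

Storing only the fingerprint keeps the per-page state small, at the cost of not being able to show what changed.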
  • 5
    watercrawl

    AI-ready web crawler that extracts and structures website content

    WaterCrawl is an open source web crawling and data extraction platform designed to transform website content into structured data suitable for machine learning and AI workflows. It enables developers and researchers to crawl web pages, extract meaningful information, and convert it into formats that are easier to process and analyze. It provides a modern crawling system that can automatically navigate links, control crawl depth, and collect content from targeted sections of a website....
    Downloads: 5 This Week
    See Project
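Following links while controlling crawl depth can be sketched as a breadth-first traversal with a depth budget. This is a minimal illustration over a hypothetical in-memory link graph (`SITE`), not WaterCrawl's own code or API; a real crawler would fetch pages and parse links instead of reading a dictionary.

```python
from collections import deque

# Hypothetical in-memory site: page URL -> links found on that page.
SITE = {
    "/": ["/docs", "/blog"],
    "/docs": ["/docs/api", "/"],
    "/blog": ["/blog/post-1"],
    "/docs/api": ["/docs/api/v2"],
}

def crawl(start, max_depth, get_links=lambda url: SITE.get(url, [])):
    """Breadth-first crawl that never follows links deeper than max_depth."""
    seen = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # depth budget spent: record the page but do not expand it
        for link in get_links(url):
            if link not in seen:  # skip pages that are already queued or visited
                seen.add(link)
                queue.append((link, depth + 1))
    return order

print(crawl("/", max_depth=1))
```

The `seen` set is what prevents loops such as `/docs` linking back to `/`.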
  • 6
    finvizfinance

    Finviz analysis Python library

    finvizfinance is a package that collects financial information from the FinViz website: stock charts, fundamental and technical information, insider information, and stock news, plus forex and crypto charts and performance. Screener and Group provide data frames for comparing stocks according to different filters and trading signals. It can also fetch information (fundament, description, outer rating, stock news, inside trader) for an individual stock.
    Downloads: 14 This Week
    See Project
  • 7
    Pydoll

    Async Python library for automating Chromium browsers without WebDriver

    Pydoll is a Python library designed for automating Chromium-based web browsers such as Chrome and Edge without relying on a traditional WebDriver layer. Instead of using external drivers, it connects directly to the Chrome DevTools Protocol through WebSocket, allowing scripts to control browser behavior more efficiently and with fewer compatibility issues.
    Downloads: 8 This Week
    See Project
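To make "connects directly to the Chrome DevTools Protocol through WebSocket" concrete: each protocol command is a small JSON message with an `id`, a `method`, and `params`. The sketch below only builds such a message and does not open a connection; the helper `cdp_command` is hypothetical and is not part of Pydoll's API. `Page.navigate` with a `url` parameter is a standard protocol command.

```python
import itertools
import json

# Increasing ids let a client pair each response with the command that caused it.
_ids = itertools.count(1)

def cdp_command(method, **params):
    """Serialize one Chrome DevTools Protocol command as a JSON text frame."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})

frame = cdp_command("Page.navigate", url="https://example.com")
print(frame)
```

A client would send this text over the browser's debugging WebSocket and wait for the reply carrying the same `id`.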
  • 8
    Scrapling

    An adaptive Web Scraping framework

    Scrapling is an adaptive web scraping framework designed to handle everything from a single HTTP request to large-scale, concurrent crawls. Built for modern websites, it intelligently adapts to structural changes by automatically relocating elements when page layouts update. The framework includes advanced fetchers capable of bypassing anti-bot protections such as Cloudflare Turnstile using stealth and browser automation techniques. Its powerful spider system supports multi-session crawling,...
    Downloads: 4 This Week
    See Project
  • 9
    Scrapy

    A fast, high-level web crawling and web scraping framework

    Scrapy is a fast, open source, high-level framework for crawling websites and extracting structured data from these websites. Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications. It can be used for data mining, monitoring...
    Downloads: 16 This Week
    See Project
  • 10
    JobFunnel

    Scrape job websites into a single spreadsheet with no duplicates.

    Scrape job websites into a single spreadsheet with no duplicates. Automated tool for scraping job postings into a .csv file. You can search for jobs with YAML configuration files or by passing command arguments. By performing regular scraping and reviewing, you can cut through the noise of even the busiest job markets. Run funnel with your settings YAML to populate your master CSV file with jobs from available providers. JobFunnel can be easily automated to run nightly with crontab. If you...
    Downloads: 1 This Week
    See Project
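Keeping a master CSV free of duplicates across repeated scrapes can be sketched as a merge keyed on a stable identifier. The column names and the `(provider, job_id)` key below are assumptions chosen for illustration; they are not JobFunnel's actual schema or duplicate-detection logic.

```python
import csv
import io

FIELDS = ["provider", "job_id", "title"]  # hypothetical column names

def merge_jobs(master_csv_text, scraped):
    """Return CSV text with scraped jobs appended, skipping ones already present."""
    rows = list(csv.DictReader(io.StringIO(master_csv_text))) if master_csv_text else []
    seen = {(r["provider"], r["job_id"]) for r in rows}
    for job in scraped:
        key = (job["provider"], job["job_id"])
        if key not in seen:  # same provider and id means the posting is a duplicate
            seen.add(key)
            rows.append(job)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

first_run = merge_jobs("", [{"provider": "a", "job_id": "1", "title": "Dev"}])
print(merge_jobs(first_run, [{"provider": "a", "job_id": "1", "title": "Dev"}]))
```

Running the merge nightly with the previous output as input leaves already-known postings untouched and appends only new ones.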
  • 11
    spider_collection

    Collection of Python web scraping scripts for data extraction tasks

    spider_collection is a collection of Python web crawler scripts created primarily for experimentation, learning, and practical scraping tasks. spider_collection gathers multiple independent spiders designed to collect data from different platforms and services, demonstrating a variety of scraping techniques and workflows. These crawlers make use of common Python scraping tools such as requests, parsel, BeautifulSoup, and the Scrapy framework to extract structured information from web pages. ...
    Downloads: 2 This Week
    See Project
  • 12
    Selectolax

    Python binding to Modest and Lexbor engines

    A fast HTML5 parser with CSS selectors. Selectolax supports two backends, Modest and Lexbor, and all examples use the Modest backend by default. The two backends are almost identical in features, but some differences remain: the Lexbor backend is currently in beta and is missing some features. To use Lexbor, just import its parser and use it in a similar way to HTMLParser.
    Downloads: 6 This Week
    See Project
  • 13
    Grab Framework Project

    Web Scraping Framework

    Grab is a Python framework for building web scrapers. With Grab you can build web scrapers of various complexity, from simple 5-line scripts to complex asynchronous website crawlers processing millions of web pages. Grab provides an API for performing network requests and for handling the received content, e.g. interacting with the DOM tree of the HTML document.
    Downloads: 0 This Week
    See Project
  • 14
    Scrapy-Redis

    Redis-based components for Scrapy

    ...Scheduler + Duplication Filter, Item Pipeline, Base Spiders. The default requests serializer is pickle, but it can be changed to any module with loads and dumps functions. Note that pickle is not compatible between Python versions. Version 0.3 changed the requests serialization from marshal to cPickle, so requests persisted with version 0.2 will not work on 0.3. The class scrapy_redis.spiders.RedisSpider enables a spider to read URLs from Redis. The URLs in the Redis queue are processed one after another; if the first request yields more requests, the spider processes those requests before fetching another URL from Redis.
    Downloads: 2 This Week
    See Project
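Two ideas from the description can be sketched without Redis or Scrapy installed. First, a serializer only needs `loads` and `dumps`; a real replacement would be a module exposing them as top-level functions, and the class `JsonCompat` here is a hypothetical stand-in used to keep the sketch in one file. Second, `process` is a simplified model of the stated ordering (follow-up requests are drained before the next queued URL is taken), not Scrapy-Redis's scheduler code.

```python
import json
from collections import deque

class JsonCompat:
    """Stand-in serializer exposing the same two functions as pickle."""

    @staticmethod
    def dumps(obj):
        return json.dumps(obj, sort_keys=True).encode("utf-8")

    @staticmethod
    def loads(data):
        return json.loads(data.decode("utf-8"))

def process(queued_urls, follow_ups):
    """Handle follow-up requests for one URL before taking the next queued URL."""
    queue = deque(queued_urls)  # stands in for the URL list held in Redis
    handled = []
    while queue:
        pending = deque([queue.popleft()])
        while pending:
            url = pending.popleft()
            handled.append(url)
            pending.extend(follow_ups.get(url, []))  # requests yielded by this URL
    return handled

request = {"url": "https://example.com/page", "meta": {"depth": 1}}
print(JsonCompat.loads(JsonCompat.dumps(request)) == request)
print(process(["a", "b"], {"a": ["a1", "a2"]}))
```

Unlike pickle, a JSON-based serializer produces the same bytes across Python versions, but it can only carry JSON-compatible data.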
  • 15
    autocrawler

    Multiprocess Selenium crawler for downloading images by keywords

    AutoCrawler is a Python-based image crawling tool designed to automatically download large numbers of images from search engines using automated browser interaction. It uses Selenium and a Chrome browser driver to navigate image search pages and collect image sources based on keywords provided by the user. AutoCrawler supports multiprocess and multithreaded downloading, which allows it to retrieve images faster by running several tasks simultaneously.
    Downloads: 1 This Week
    See Project
  • 16
    bilibili-manga-downloader

    Download and manage Bilibili Manga chapters with GUI downloader

    BiliBili-Manga-Downloader is an open source desktop application designed to download manga chapters from the Bilibili Manga platform for offline reading and local management. It was created to address limitations of the web reading experience, such as intrusive advertisements, inconvenient image zooming, and inconsistent navigation during reading sessions. It provides a graphical user interface that allows users to search for manga titles using keywords, view detailed information about...
    Downloads: 16 This Week
    See Project
  • 17
    Crawl4AI

    Open-source LLM Friendly Web Crawler & Scraper

    Crawl4AI is a high-performance, AI‑ready web crawler tailored for LLM data ingestion and RAG pipelines. It supports adaptive crawling heuristics (stopping when enough info is gathered), structured markdown output, and high-speed parallel execution. Designed to operate at scale with optional Docker deployment and framework integrations.
    Downloads: 0 This Week
    See Project
  • 18
    douyin

    Open source Douyin crawler for collecting and downloading public data

    DouyinCrawler is an open source data collection tool designed to gather publicly available information from the Douyin platform. It demonstrates how to build a Python-based web crawler combined with a graphical interface and command line functionality. It allows users to collect data from various types of Douyin content, including user profiles, videos, hashtags, and music pages. DouyinCrawler supports both automated scraping and batch operations to process multiple targets efficiently. It...
    Downloads: 5 This Week
    See Project
  • 19
    newspaper4k

    Python library for scraping and analyzing online news articles easily

    Newspaper4k is a Python library designed for extracting, processing, and analyzing news articles from websites. It is a continuation and active fork of the original newspaper3k library, which had stopped receiving updates, with the goal of keeping the ecosystem maintained while adding improvements and bug fixes. It provides developers with tools to automatically download web pages, extract the main article content, and collect associated metadata such as titles, authors, images, and...
    Downloads: 0 This Week
    See Project
  • 20
    Scweet

    Scrape tweets, profiles, followers and following from Twitter/X

    Scweet is a Python-based Twitter/X scraping library and CLI designed to collect tweets, profile timelines, followers, following lists, and user profile data without requiring the official Twitter/X API or a developer account. Instead of depending on deprecated unauthenticated scraping methods, it works by using X’s web GraphQL API together with authenticated browser cookies, which gives it a more current and practical approach for data extraction.
    Downloads: 2 This Week
    See Project
  • 21
    owllook

    Vertical novel search engine with unified reading and tracking tools

    Owllook is an open source vertical search engine designed for discovering and reading online novels from multiple sources. Instead of redirecting users to different sites, the system parses content from many novel platforms and presents it in a unified reading interface. It focuses on providing a simple and comfortable reading experience with features such as searching for books, following updates, bookmarking chapters, and maintaining a personal bookshelf. It aggregates results from...
    Downloads: 0 This Week
    See Project
  • 22
    diskover-community

    Open source file indexing & storage analytics powered by Elasticsearch

    ...Diskover also helps identify outdated or unused files, duplicate data, and inefficient storage usage that can waste resources or increase operational costs. A Python-based indexing engine performs the scanning and indexing tasks.
    Downloads: 0 This Week
    See Project
  • 23
    Letterboxd Recommendations

    Scraping publicly-accessible Letterboxd data for movie recommendations

    Scraping publicly-accessible Letterboxd data and creating a movie recommendation model with it that can generate recommendations when provided with a Letterboxd username. A user's "star" ratings are scraped from their Letterboxd profile and assigned numerical ratings from 1 to 10 (accounting for half stars). Their ratings are then combined with a sample of ratings from the top 4000 most active users on the site to create a collaborative filtering recommender model using singular value...
    Downloads: 0 This Week
    See Project
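The mapping from star ratings to a 1 to 10 scale, with half stars counted, can be sketched as follows. It assumes the rating is available as a string of `★` and `½` characters; how the project actually reads ratings from a profile page is not stated in the description, and `stars_to_score` is a hypothetical helper.

```python
def stars_to_score(stars):
    """Map a star string such as '★★★½' to an integer from 1 to 10."""
    # Each full star is worth 2 points and a half star is worth 1.
    score = 2 * stars.count("★") + stars.count("½")
    if not 1 <= score <= 10:
        raise ValueError(f"not a valid star rating: {stars!r}")
    return score

for rating in ["½", "★★★½", "★★★★★"]:
    print(rating, stars_to_score(rating))
```

Doubling the scale turns every half-star step into a whole number, which is convenient input for a rating matrix.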
  • 24
    Kemono Downloader

    Kemono Downloader - A cross-platform Python app built with PyQt6

    Welcome to Kemono Downloader, a versatile Python-based desktop application built with PyQt6, designed to download content from Kemono.su. This tool enables users to archive individual posts or entire creator profiles from services like Patreon, Fanbox, and more, supporting a wide range of file types with customizable settings and advanced features.
    Downloads: 394 This Week
    See Project
  • 25
    dude uncomplicated data extraction

    dude uncomplicated data extraction: A simple framework

    Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, aims to make it easy to build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha; expect breaking changes. You can run your scraper from the terminal by supplying URLs, the output filename of your choice, and the paths to your Python scripts to the dude scrape command.
    Downloads: 0 This Week
    See Project
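The Flask-inspired decorator style can be illustrated with a small registry: decorating a function registers it as the handler for a pattern, and a runner applies every registered handler to a page. This is a stdlib-only sketch of the pattern, with regular expressions standing in for real selectors; the names `select` and `scrape` here are hypothetical and should not be read as Dude's actual API.

```python
import re

_RULES = []  # (compiled pattern, handler) pairs, in registration order

def select(pattern):
    """Register the decorated function as the handler for one regex pattern."""
    def decorator(func):
        _RULES.append((re.compile(pattern), func))
        return func
    return decorator

@select(r"<h1>(.*?)</h1>")
def title(match):
    return {"title": match.group(1)}

@select(r'<a href="(.*?)">')
def link(match):
    return {"url": match.group(1)}

def scrape(html):
    """Run every registered rule against the page and collect the results."""
    return [func(m) for rx, func in _RULES for m in rx.finditer(html)]

print(scrape('<h1>Hi</h1><a href="/x">x</a>'))
```

The appeal of the design is that adding a new extraction rule means adding one decorated function, with no changes to the runner.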