A Repo For Document AI
Data and tools for generating and inspecting OLMo pre-training data
A curated list of data mining papers about fraud detection
A Heterogeneous Benchmark for Information Retrieval
Extract schema, statistics and entities from datasets
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models
Fast and customizable framework for automatic ML model creation
The library to build & auto-optimize LLM applications
Efficient Retrieval Augmentation and Generation Framework
A full spaCy pipeline and models for scientific/biomedical documents
Libraries for applying sparsification recipes to neural networks
Data processing for and with foundation models
Haystack is an open source NLP framework to interact with your data
Neural Network Compression Framework for enhanced OpenVINO inference
Pretrained model hub for Keras 3
Efficient few-shot learning with Sentence Transformers
Easy-to-use and powerful NLP library with an awesome model zoo
Training data (data labeling, annotation, workflow) for all data types
A coding-free framework built on PyTorch
Data loaders and abstractions for text and NLP
Making large AI models cheaper, faster and more accessible
A Unified Library for Parameter-Efficient Learning
Build AI-powered semantic search applications
Bring the notion of Model-as-a-Service to life
Hub of ready-to-use datasets for ML models