Best AI Datasets 2026

The top-ranked entry among AI datasets in 2026 is csv, with a Nerq Trust Score of 83/100 (A-), based on Nerq's independent analysis of 50 AI datasets across 5 trust dimensions. Rankings update daily; last updated: 2026-05-06.

According to Nerq's analysis, the top 5 AI datasets by trust score are: 1. csv (83/100), 2. @wordpress/fields (82/100), 3. @wordpress/dataviews (82/100), 4. datasets (81/100), 5. huggingface-hub (81/100). Nerq Trust Scores range from 65 to 83 among the top 50. Scores are based on 5 independent trust dimensions, including security, maintenance, and community adoption, and are updated daily.

Top 10 AI Datasets by Nerq Trust Score (2026)
#   Name                              Trust  Grade
1   csv                               83     A-
2   @wordpress/fields                 82     A-
3   @wordpress/dataviews              82     A-
4   datasets                          81     A-
5   huggingface-hub                   81     A-
6   @sqlrooms/data-table              76     B+
7   @humanspeak/svelte-virtual-list   74     B
8   @llm-tools/embedjs                74     B
9   @edgeandnode/amp                  74     B
10  @friendliai/ai-provider           72     B

Top 50 AI Datasets by Nerq Trust Score

#   Name                                    Trust  Grade  Stars     Description
1   csv                                     83     A-     1552.1k   A mature CSV toolset with simple API, full of options and tested against large datasets.
2   @wordpress/fields                       82     A-     27.3k     DataViews is a component that provides an API to render datasets using different types of layouts (t...
3   @wordpress/dataviews                    82     A-     47.6k     DataViews is a component that provides an API to render datasets using different types of layouts (t...
4   datasets                                81     A-     16324.3k  HuggingFace community-driven open-source library of datasets
5   huggingface-hub                         81     A-     47675.5k  Client library to download and publish models, datasets and other repos on the huggingface.co hub
6   @sqlrooms/data-table                    76     B+     4.1k      A high-performance data table component library for SQLRooms applications. This package provides fle...
7   @humanspeak/svelte-virtual-list         74     B      2.6k      A lightweight, high-performance virtual list component for Svelte 5 that renders large datasets with...
8   @llm-tools/embedjs                      74     B      174       A NodeJS RAG framework to easily work with LLMs and custom datasets
9   @edgeandnode/amp                        74     B      176       Build and manage blockchain datasets.
10  @friendliai/ai-provider                 72     B      94        -
11  autoviz                                 72     B      3.4k      Automatically Visualize any dataset, any size with a single line of code
12  @datawheel/vizbuilder                   72     B      11        A React component that generates multiple kinds of charts from a tesseract-olap dataset.
13  vaex                                    72     B      4.9k      Out-of-Core DataFrames to visualize and explore big tabular datasets
14  hapi-csv                                72     B      441       Hapi plugin for converting a Joi response schema and dataset to csv
15  @lovrabet/dataset-mcp-server            72     B      309       MCP server for Lovrabet Dataset access
16  vue-dataset                             71     B      584       A vue component to display datasets with filtering, paging and sorting capabilities!
17  process-versions                        71     B      130       A dataset showing the compiled process version dependencies of different Node.js versions
18  abses                                   70     B      116       ABSESpy makes it easier to build artificial Social-ecological systems with real GeoSpatial datasets ...
19  @donedeal0/superdiff                    70     B      8.4k      Superdiff provides a rich and readable diff for arrays, objects, texts and coordinates. It supports ...
20  cellxgene-schema                        70     B-     267       Tool for applying and validating cellxgene integration schema to single cell datasets
21  azureml-opendatasets                    70     B-     8.4k      Provides a set of APIs to consume Azure Open Datasets.
22  ancpbids                                69     B-     3.7k      Read/write/validate/query BIDS datasets
23  @ldo/jsonld-dataset-proxy               69     B-     754       Edit RDFJS Dataset just like regular JavaScript Object Literals.
24  @muze-nl/simplystore                    69     B-     1         SimplyStore is a radically simpler backend storage server. It does not have a database, certainly no...
25  node-dataset                            69     B-     100       A Node.js module for working with data sets created in code, loaded from files, or retrieved from a ...
26  data_magic                              68     B-     14364.0k  Provides datasets to application stored in YAML files
27  cellxgene                               68     B-     868       Web application for exploration of large scale scRNA-seq datasets
28  cemba-data                              68     B-     4         Pipelines for single nucleus methylome and multi-omic dataset.
29  act-atmos                               68     B-     1.2k      Package for working with atmospheric time series datasets
30  @vespermcp/mcp-server                   67     B-     244       AI-powered dataset discovery, quality analysis, and preparation MCP server with multimodal support (...
31  arcana                                  67     B-     618       Abstraction of Repository-Centric ANAlysis (Arcana): A framework for analysing on file-based dataset...
32  azureml-contrib-dataset                 67     B-     996       Contains experimental Dataset features for the azureml-core package.
33  azureml-datadrift                       67     B-     220       Contains functionality for data drift detection for various datasets used in machine learning.
34  mnemospark                              67     B-     544       mnemospark is an OpenClaw plugin that gives agentic systems instant, secure access to cloud storage,...
35  @cherrystudio/embedjs                   67     B-     541       A NodeJS RAG framework to easily work with LLMs and custom datasets
36  baran                                   67     B-     1189.5k   Text Splitter for Large Language Model Datasets.
37  devise-pwned_password                   67     B-     2970.8k   Devise extension that checks user passwords against the PwnedPasswords dataset https://haveibeenpwne...
38  sequel_pg                               67     B-     6766.5k   sequel_pg overwrites the inner loop of the Sequel postgres adapter row fetching code with a C versio...
39  gruff                                   67     B-     3776.5k   Beautiful graphs for one or multiple datasets. Can be used on websites or in documents.
40  rgeo-shapefile                          67     B-     3620.0k   RGeo is a geospatial data library for Ruby. RGeo::Shapefile is an optional RGeo module for reading t...
41  vesper-wizard                           66     B-     966       Zero-friction setup wizard for Vesper: local MCP server, unified dataset API, and agent auto-config...
42  @data_wise/hyper-markdown               66     B-     3         A powerful Vue 3 Markdown editor with rich features including ECharts, D3.js, Mermaid, KaTeX, and da...
43  anemoi-datasets                         66     B-     -         A package to hold various functions to support training of ML models on ECMWF data.
44  @quicknode/hypercore-cli                66     B-     30        Developer-friendly CLI for streaming and backfilling HyperCore datasets from Quicknode
45  @opengis/mapdataset                     66     B-     5         A Map Dataset Component displays geospatial vector data with the ability to filter features and colo...
46  @stdlib/datasets-standard-card-deck     65     B-     2         A list of two or three letter abbreviations for each card in a standard 52-card deck.
47  @stdlib/datasets-female-first-names-en  65     B-     87        A list of common female first names in English speaking countries.
48  @stdlib/datasets-spache-revised         65     B-     4         A list of simple American-English words (revised Spache).
49  @stdlib/datasets-male-first-names-en    65     B-     99        A list of common male first names in English speaking countries.
50  ANAC XML Bandi di Gara                  65     B-     600       Software for managing calls for tender and generating XML datasets for ANAC (formerly AVCP - Law 190/20...

How We Rank AI Datasets

These AI datasets are ranked by Nerq Trust Score, which evaluates security, maintenance, community adoption, and transparency across multiple data points. Only entities with a trust score of 30 or above are included. Scores are updated continuously as new data becomes available.
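The inclusion rule and ordering described above can be sketched in a few lines of Python. This is only an illustration, not Nerq's implementation: the `Entry` class, the `rank_entries` function, and the low-scoring sample entry are hypothetical names and data; only the threshold of 30 and the sample scores for csv, datasets, and huggingface-hub come from this page.

```python
from dataclasses import dataclass

MIN_TRUST_SCORE = 30  # inclusion threshold stated in the ranking description


@dataclass(frozen=True)
class Entry:
    """Hypothetical record for one ranked item (not a Nerq data type)."""
    name: str
    trust_score: int  # overall score on the 0-100 scale


def rank_entries(entries):
    """Drop entries below the threshold, then sort by score, highest first."""
    eligible = [e for e in entries if e.trust_score >= MIN_TRUST_SCORE]
    # sorted() is stable, so entries with equal scores keep their input order.
    return sorted(eligible, key=lambda e: e.trust_score, reverse=True)


if __name__ == "__main__":
    sample = [
        Entry("datasets", 81),
        Entry("csv", 83),
        Entry("hypothetical-low-scorer", 12),  # made-up entry to show filtering
        Entry("huggingface-hub", 81),
    ]
    for position, entry in enumerate(rank_entries(sample), start=1):
        print(position, entry.name, entry.trust_score)
```

How ties are broken on the real ranking is not stated; this sketch simply preserves input order for equal scores.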

FAQ

What are the best AI datasets in 2026?

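Based on Nerq Trust Scores, the top-ranked AI datasets in 2026 are csv (83/100), @wordpress/fields (82/100), and @wordpress/dataviews (82/100); the full ranking is listed above. Scores reflect security, activity, documentation, and community metrics.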

How are AI datasets ranked?

Nerq ranks tools using Trust Score v2, which combines security analysis, maintenance activity, documentation quality, and community adoption signals.

Are these AI datasets safe to use?

Each tool has an individual safety report. Click any tool name to see its detailed trust analysis.

What does a Nerq Trust Score of A mean?

An A grade (80-89) means the entity has strong signals across security, maintenance, documentation, and community adoption. A+ (90-100) is the highest possible rating.

How does Nerq evaluate AI datasets?

Nerq analyzes AI datasets across multiple dimensions, including security vulnerabilities, license compliance, maintenance activity, documentation quality, and community adoption. Each dimension is scored independently and combined into an overall Trust Score (0-100).
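The page does not say how the per-dimension scores are combined. As a rough illustration only, the sketch below assumes each dimension is scored 0-100 and that the overall score is a weighted mean with equal weights by default. The dimension keys, the `overall_trust_score` function, and the weighting scheme are assumptions made for this example, not Nerq's published method.

```python
# Dimension names paraphrased from the answer above; the keys are hypothetical.
DIMENSIONS = ("security", "license", "maintenance", "documentation", "community")


def overall_trust_score(dimension_scores, weights=None):
    """Combine per-dimension scores (0-100) into one 0-100 score.

    Assumption: a weighted arithmetic mean, equal weights unless overridden.
    """
    merged = {name: 1.0 for name in DIMENSIONS}
    merged.update(weights or {})

    missing = [name for name in DIMENSIONS if name not in dimension_scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")
    for name in DIMENSIONS:
        if not 0 <= dimension_scores[name] <= 100:
            raise ValueError(f"{name} score must be between 0 and 100")

    total_weight = sum(merged[name] for name in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")

    weighted_sum = sum(dimension_scores[name] * merged[name] for name in DIMENSIONS)
    return round(weighted_sum / total_weight)


if __name__ == "__main__":
    # Made-up dimension scores, chosen only to exercise the function.
    example = {
        "security": 90,
        "license": 80,
        "maintenance": 85,
        "documentation": 75,
        "community": 85,
    }
    print(overall_trust_score(example))
    print(overall_trust_score(example, weights={"security": 2.0}))
```

A weighted mean keeps the result on the same 0-100 scale as its inputs, which is why it is used here; a real scoring system could just as well use caps, penalties, or non-linear terms.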
