Raw data used to construct the vendor-purchased-2019 dataset.
Datasets associated with projects and papers from the Observatory on Social Media.
Use the official Botometer API to retrieve bot scores from Botometer X or to calculate new bot scores from user-provided data.
An archive of posts from Mastodon starting in August 2023.
A snapshot of Bluesky collected at the end of April 2024.
An archive of the Bluesky firehose from August 2023 to the present.
NielsenIQ provides access to multiple marketing datasets, including retail scanner data, consumer panel data, and Ad Intel data.
Access to statistical information produced by U.S. federal agencies, states, private organizations, and major intergovernmental organizations.
Includes access to 65 billion U.S. and international datasets from over 90 sources.
The Linguistic Data Consortium (LDC) is an open consortium of universities, libraries, corporations, and government research laboratories that provides access to data for language technology research and development.