Access benchmark-quality RL, multimodal, vision, and STEM datasets to accelerate your post-training research. Choose from pre-defined packs or create custom datasets tailored to your experiments.
Choose from our curated data collections optimized for post-training research and ready to request:
Samples are delivered via email, typically within 48 hours of your request, so you can begin integration and evaluation without delay.
Yes, you can select any combination of pre-defined packs or custom datasets in a single request form, and we’ll bundle them in one delivery.
We provide samples in ML-ready formats (e.g., image folders, CSV/JSON for tabular and text, WAV for audio). All modalities listed in the catalog, including vision, audio, STEM, coding, and more, are available.
Sample datasets are provided under a research-only license. For full-pack access or commercial use, please ask about terms and pricing.
Yes, select the "custom" option in your request and provide additional details. Our research team will work with you to assemble the right dataset.
You’ll receive curated sample files and metadata, followed by outreach from our research team to discuss full-pack access, volume, pricing, and any custom adjustments.
Request your data packs today and accelerate your research.