AI training data glossary for enterprise model teams.
A concise reference for the terms buyers, model teams, and governance reviewers use when discussing training data, alignment, medical AI, speech AI, and provenance.
Definitions
Browse by AI workflow
Data Operations
Data Annotation
Data annotation is the process of adding labels, metadata, or structured judgments to raw data so machine learning models can learn from it. For AI teams, annotation quality determines whether a model learns the right signal or simply memorizes noisy patterns.
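As a rough illustration of how label quality can be checked before training, the sketch below merges several annotators' labels for one item into a majority label and an agreement score. The function name `consolidate`, the item IDs, and the label values are hypothetical, and majority voting is only one of several possible consolidation approaches.

```python
from collections import Counter

def consolidate(labels):
    """Return (majority_label, agreement) for one item's annotator labels.

    agreement is the share of annotators who chose the majority label.
    Ties go to the label that appears first in the input list.
    """
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)

# Hypothetical items, each labeled by three annotators.
items = {
    "ticket-001": ["billing", "billing", "billing"],
    "ticket-002": ["billing", "technical", "billing"],
}

for item_id, labels in items.items():
    label, agreement = consolidate(labels)
    print(item_id, label, round(agreement, 2))
```

Items with low agreement are natural candidates for review or relabeling, since they are where noisy patterns are most likely to enter the training set.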
Alignment
RLHF
RLHF, or reinforcement learning from human feedback, is a model-alignment method that uses human preference judgments to teach AI systems which outputs are more helpful, accurate, safe, or appropriate. The quality of the human feedback strongly affects the quality of the aligned model.
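To make "human preference judgments" concrete, here is a minimal sketch of a preference record together with a pairwise loss that is commonly used when training a reward model on such records. The `PreferencePair` fields and the example rewards are assumptions for illustration; real pipelines differ in schema and in how the reward scores are produced.

```python
import math
from dataclasses import dataclass

@dataclass
class PreferencePair:
    """One human judgment: for this prompt, `chosen` was preferred over `rejected`."""
    prompt: str
    chosen: str
    rejected: str

def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Compute -log(sigmoid(reward_chosen - reward_rejected)).

    The loss is small when the reward model scores the preferred
    response higher, and large when it scores the rejected one higher.
    """
    margin = reward_chosen - reward_rejected
    return math.log1p(math.exp(-margin))

pair = PreferencePair(
    prompt="Summarize the refund policy.",
    chosen="Refunds are available within 30 days of purchase.",
    rejected="Refunds exist.",
)

# Hypothetical reward scores for the two responses in `pair`.
print(pairwise_loss(2.0, 0.5))   # preferred response scored higher: lower loss
print(pairwise_loss(0.5, 2.0))   # preferred response scored lower: higher loss
```

Because the loss is driven entirely by which response the annotator preferred, inconsistent or careless judgments translate directly into a misleading training signal.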
Fine-Tuning
SFT
SFT, or supervised fine-tuning, trains a model on curated input-output examples so it learns a desired task, style, domain, or reasoning pattern. Good SFT data is explicit, consistent, and aligned to the behavior the model should show in production.
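A curated input-output example is often stored as one JSON object per line. The sketch below shows one such record and a small check for the "explicit and consistent" property described above. The field names `prompt` and `response` are assumptions, not a required schema.

```python
import json

# Hypothetical schema: every example needs a non-empty prompt and response.
REQUIRED_FIELDS = ("prompt", "response")

def validate_example(example: dict) -> list:
    """Return a list of problems found in one SFT example (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = example.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    return problems

example = {
    "prompt": "Classify the sentiment of: 'The delivery was late again.'",
    "response": "negative",
}

# One line of a JSONL training file, then a round trip through the validator.
line = json.dumps(example)
print(line)
print(validate_example(json.loads(line)))
```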
Speech AI
Dual-Channel Audio
Dual-channel audio records two speakers or sides of a conversation on separate tracks, such as an agent and a customer in a call center. Because each speaker is isolated on their own track, it is easier to train automatic speech recognition (ASR), speaker diarization, conversation analytics, and voice AI systems.
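The separation can be sketched in a few lines. The example below assumes 16-bit PCM audio with two interleaved channels in the machine's native byte order; which channel holds the agent and which holds the customer is a recording convention, so the names used here are hypothetical.

```python
import array

def split_channels(frames: bytes):
    """Split interleaved 16-bit stereo PCM into two per-channel sample arrays.

    Assumes samples alternate channel 0, channel 1, channel 0, ...
    and are stored in native byte order.
    """
    samples = array.array("h")
    samples.frombytes(frames)
    return samples[0::2], samples[1::2]

# Three stereo frames built in memory for illustration.
frames = array.array("h", [100, -100, 200, -200, 300, -300]).tobytes()
agent_track, customer_track = split_channels(frames)

print(list(agent_track))
print(list(customer_track))
```

With a real recording, the raw frames could come from the standard library's `wave` module via `readframes`, after confirming that `getnchannels()` returns 2 and `getsampwidth()` returns 2.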
Medical AI
DICOM
DICOM, or Digital Imaging and Communications in Medicine, is the standard used to store and exchange medical imaging data such as CT, MRI, X-ray, and ultrasound studies. For medical AI, DICOM data is valuable when it is de-identified, paired with reports or findings, and organized by modality and clinical context.
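As a small illustration of the file format, a DICOM file normally begins with a 128-byte preamble followed by the four bytes `DICM`. The sketch below checks only that marker. The function name is hypothetical, and the check does not parse the file, validate its contents, or say anything about de-identification.

```python
def has_dicom_marker(data: bytes) -> bool:
    """Return True if the bytes carry the 'DICM' marker after a 128-byte preamble."""
    return len(data) >= 132 and data[128:132] == b"DICM"

# Synthetic bytes for illustration only, not a real imaging study.
fake_header = b"\x00" * 128 + b"DICM"
print(has_dicom_marker(fake_header))
print(has_dicom_marker(b"not an imaging file"))
```

A check like this is useful only as a first filter when ingesting files; reading modality, study metadata, or pixel data requires a full DICOM parser.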
Governance
Data Provenance
Data provenance is the documentation of where training data came from, how it was collected or structured, and what metadata supports review. In AI, provenance helps teams evaluate dataset quality, licensing, compliance, and suitability for a model’s intended use.
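One way to make provenance reviewable is to keep a structured record per dataset. The sketch below is an assumption about what such a record might contain, based on the elements named above (origin, collection method, licensing, supporting metadata); the class, field names, and values are hypothetical.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class ProvenanceRecord:
    """Hypothetical per-dataset provenance record."""
    dataset_id: str
    source: str             # where the data came from
    collection_method: str  # how it was collected or structured
    license: str            # terms under which it may be used
    metadata: dict = field(default_factory=dict)  # anything else that supports review

def missing_fields(record: ProvenanceRecord) -> list:
    """Return the names of text fields that were left blank."""
    return [
        name
        for name, value in asdict(record).items()
        if isinstance(value, str) and not value.strip()
    ]

record = ProvenanceRecord(
    dataset_id="calls-2024-q1",
    source="contact-center recordings",
    collection_method="recorded with participant consent",
    license="",
    metadata={"language": "en"},
)

print(missing_fields(record))
```

A reviewer can run a check like this across a dataset catalog to find entries whose documentation is too thin to evaluate.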
Looking for governance language?
The glossary explains terminology; the FAQ and compliance pages explain how provenance and documentation fit enterprise review.