RAVDESS
Description of the Data
RAVDESS (the Ryerson Audio-Visual Database of Emotional Speech and Song) is a dataset of acted speech and song recordings labeled by emotion. It is commonly used for emotion classification and other speech and audio processing tasks.
The dataset has the following directory structure:
ravdess_data/
├── Actor_01/
│   ├── 03-01-01-01-01-01-01.wav (neutral)
│   ├── 03-01-02-01-01-01-01.wav (calm)
│   ├── 03-01-03-01-01-01-01.wav (happy)
│   ├── 03-01-04-01-01-01-01.wav (sad)
│   ├── 03-01-05-01-01-01-01.wav (angry)
│   ├── 03-01-06-01-01-01-01.wav (fearful)
│   ├── 03-01-07-01-01-01-01.wav (disgust)
│   ├── 03-01-08-01-01-01-01.wav (surprised)
│   ├── 03-01-02-02-01-01-01.wav (calm, strong intensity)
│   └── ... (more files with different statements/repetitions)
├── Actor_02/
│   ├── 03-01-01-01-01-01-02.wav (neutral)
│   ├── 03-01-03-01-01-01-02.wav (happy)
│   └── ... (similar pattern for all emotions)
├── Actor_03/
│   └── ... (continues for all 24 actors)
└── ...
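As a rough illustration of working with this layout, the sketch below groups the .wav files by actor folder. The helper name `list_wavs_by_actor` is hypothetical, and the code assumes only the `Actor_XX/*.wav` structure shown above.

```python
from collections import defaultdict
from pathlib import Path


def list_wavs_by_actor(root):
    """Map each Actor_XX folder name to its sorted list of .wav paths.

    Hypothetical helper: assumes `root` contains Actor_XX subfolders
    holding .wav files, as in the directory tree above.
    """
    files = defaultdict(list)
    # Match only .wav files that sit directly inside an Actor_* folder.
    for wav in sorted(Path(root).glob("Actor_*/*.wav")):
        files[wav.parent.name].append(wav)
    return dict(files)
```

Calling `list_wavs_by_actor("ravdess_data")` would return a dictionary keyed by folder name (for example `"Actor_01"`), which makes it easy to check that every actor folder was found before any audio is loaded.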
File naming convention: modality-vocal_channel-emotion-intensity-statement-repetition-actor.wav
- Modality: 01 = audio+video, 02 = video-only, 03 = audio-only (we’ll use these)
- Vocal channel: 01 = speech, 02 = song
- Emotion: 01=neutral, 02=calm, 03=happy, 04=sad, 05=angry, 06=fearful, 07=disgust, 08=surprised
- Intensity: 01 = normal, 02 = strong (the neutral emotion has no strong intensity)
- Statement: 01 = "Kids are talking by the door", 02 = "Dogs are sitting by the door"
- Repetition: 01 = first repetition, 02 = second repetition
- Actor: 01-24 = actor ID
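The naming convention can be decoded mechanically by splitting the file name on hyphens. The following is a minimal sketch, not part of any official RAVDESS tooling: the names `RavdessLabel` and `parse_ravdess_filename` are hypothetical, and the lookup tables simply restate the codes listed above. The modality field is kept as its raw two-digit code.

```python
from dataclasses import dataclass
from pathlib import Path

# Code tables copied from the naming convention described above.
VOCAL_CHANNELS = {"01": "speech", "02": "song"}
EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}
INTENSITIES = {"01": "normal", "02": "strong"}


@dataclass(frozen=True)
class RavdessLabel:
    """Decoded fields of one file name (hypothetical container)."""
    modality: str       # raw two-digit code, e.g. "03"
    vocal_channel: str
    emotion: str
    intensity: str
    statement: int
    repetition: int
    actor: int


def parse_ravdess_filename(path):
    """Decode the seven hyphen-separated fields of a RAVDESS file name."""
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {path}")
    modality, channel, emotion, intensity, statement, repetition, actor = fields
    return RavdessLabel(
        modality=modality,
        vocal_channel=VOCAL_CHANNELS[channel],
        emotion=EMOTIONS[emotion],
        intensity=INTENSITIES[intensity],
        statement=int(statement),
        repetition=int(repetition),
        actor=int(actor),
    )


label = parse_ravdess_filename("ravdess_data/Actor_01/03-01-05-02-01-01-01.wav")
print(label.emotion, label.intensity, label.actor)  # angry strong 1
```

Unknown codes raise a `KeyError` and malformed names raise a `ValueError`, so unexpected files surface early instead of being silently mislabeled.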