

A Data Collection Concept, Dataset, and Benchmark for Machine Analysis of Free-Standing Social Interactions in the Wild

About ConfLab

We proposed ConfLab (Conference Living Lab) as a new concept for in-the-wild recording of real-life social human behavior, and provided a dataset from the first edition of ConfLab at ACM Multimedia 2019.

The Dataset

The interaction space with 48 subjects was captured via overhead videos, in which F-formations (conversation groups) were annotated. Each person in an F-formation is associated with their body pose tracks, wearable sensor data, and speaking status labels.

Overhead video

10 overhead cameras

∼ 45 min; 1920×1080 @ 60 fps

Wearable data

Recorded by a badge wearable:

  • Low-freq. audio (1250 Hz)
  • BT proximity (5 Hz)
  • 9-axis IMU (56 Hz)
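Because the three wearable streams are sampled at different rates, a common preprocessing step is to resample them onto a shared clock before joint modeling. Below is a minimal sketch of that idea using linear interpolation; the array names and signal values are hypothetical, and this is not part of the released ConfLab tooling.

```python
import numpy as np

def resample_to_rate(timestamps, values, target_rate, duration):
    """Linearly interpolate a 1-D signal onto a uniform target clock."""
    target_t = np.arange(0.0, duration, 1.0 / target_rate)
    return target_t, np.interp(target_t, timestamps, values)

# Synthetic streams at the ConfLab sensor rates (illustrative, not real recordings)
duration = 2.0                                   # seconds
imu_t = np.arange(0.0, duration, 1.0 / 56)       # 9-axis IMU channel, 56 Hz
imu_x = np.sin(2 * np.pi * imu_t)
prox_t = np.arange(0.0, duration, 1.0 / 5)       # BT proximity, 5 Hz
prox = np.linspace(0.0, 1.0, prox_t.size)

# Bring both streams onto a common 50 Hz clock
t50, imu_50 = resample_to_rate(imu_t, imu_x, 50, duration)
_, prox_50 = resample_to_rate(prox_t, prox, 50, duration)
assert imu_50.shape == prox_50.shape  # aligned sample-for-sample
```

Linear interpolation is only one option; for the low-frequency audio, windowed feature extraction at the target rate may be more appropriate.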

F-formation annotations (16 min)

Annotated at 1 Hz for 16 min of interaction.

Full-body pose tracks (16 min)

Full-body pose tracks (17 body joints) annotated separately per camera (5 cameras) for all participants in the scene.

Action annotations (16 min)

Speaking status (binary) annotated continuously (60 Hz) for all participants in the scene.
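The annotation streams run at different rates: F-formations at 1 Hz, speaking status at 60 Hz matching the 60 fps video. A per-frame view can be built by repeating each 1 Hz group label across its 60 video frames. A sketch with hypothetical label arrays (not actual dataset values):

```python
import numpy as np

FPS = 60  # video frame rate, also the speaking-status rate

# Hypothetical labels for one participant over 3 seconds
fform_1hz = np.array([2, 2, 3])            # group id per second, 1 Hz
speaking_60hz = np.zeros(3 * FPS, dtype=int)
speaking_60hz[30:90] = 1                   # speaking from t = 0.5 s to t = 1.5 s

# Repeat each 1 Hz group label across its 60 video frames
fform_per_frame = np.repeat(fform_1hz, FPS)
assert fform_per_frame.shape == speaking_60hz.shape

# Per-frame pairing: (frame index, group id, speaking flag)
frame_view = np.stack(
    [np.arange(3 * FPS), fform_per_frame, speaking_60hz], axis=1
)
```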

Survey measures

Data subjects reported their research interests and level of experience within the MM community.

The ConfLab Template

ConfLab aims to be a template for future data collection in real-life, in-the-wild events through contributions such as the following.

Mingle MIDGE
The Midge is a custom wearable device developed primarily for collecting high-resolution signals in in-the-wild mingling settings with multiple participants. The badge is a 55×35 mm wearable PCB featuring two microphones, a 9-axis IMU, micro SD card storage, a 300 mAh battery, and a BMD-300 processor with Bluetooth Low Energy.
Time synchronization at acquisition
Chirag Raman, Stephanie Tan, and Hayley Hung. "A Modular Approach for Synchronized Wireless Multimodal Multisensor Data Acquisition in Highly Dynamic Social Settings." Proceedings of the 28th ACM International Conference on Multimedia, 2020.
Continuous annotation of keypoints and actions

ConfLab is the first large-scale mingling dataset to be annotated for full-body poses. This was made possible by using continuous annotation techniques for both keypoints and actions. In pilot studies, we measured a 3× speed-up in keypoint annotation using our continuous method compared to the traditional technique of annotating individual frames followed by interpolation.
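For reference, the traditional baseline mentioned above fills in poses between sparsely hand-annotated frames by interpolating each joint independently. A minimal sketch of that keyframe-plus-interpolation scheme, with synthetic keypoints (not dataset values):

```python
import numpy as np

def interpolate_keypoints(keyframe_idx, keyframe_pts, n_frames):
    """Linearly interpolate (x, y) joint positions between annotated keyframes.

    keyframe_idx: sorted frame indices that were annotated by hand
    keyframe_pts: array of shape (n_keyframes, n_joints, 2)
    """
    n_joints = keyframe_pts.shape[1]
    out = np.empty((n_frames, n_joints, 2))
    frames = np.arange(n_frames)
    for j in range(n_joints):
        for c in range(2):  # x and y coordinates
            out[:, j, c] = np.interp(frames, keyframe_idx, keyframe_pts[:, j, c])
    return out

# Two hand-annotated keyframes for a 17-joint skeleton (synthetic motion)
keyframes = np.zeros((2, 17, 2))
keyframes[1] = 10.0                        # every joint moves from (0, 0) to (10, 10)
poses = interpolate_keypoints([0, 10], keyframes, 11)
assert np.allclose(poses[5], 5.0)          # frame 5 lies halfway between keyframes
```

Continuous annotation instead records the annotator's cursor trajectory while the video plays, avoiding the per-frame stop-and-click loop this baseline requires.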

The following video shows some of the dataset and annotation interface:

Our continuous methods are implemented and made available as part of the Covfee framework.

Jose Vargas-Quiros, Stephanie Tan, Chirag Raman, Laura Cabrera-Quiros, and Hayley Hung. "Covfee: an extensible web framework for continuous-time annotation of human behavior." Understanding Social Behavior in Dyadic and Small Group Interactions, Proceedings of Machine Learning Research, 2022.