Data collection & upload

What gets logged, when it's batched, where it ends up, and how to define what 'a row of data' means for your study.

LUIDA records four kinds of data per session, three of which are automatic and one of which you have to define yourself. This page covers all four, and walks through the lifecycle of a custom-data record from "the participant clicked something" to "you download a CSV."

The four data streams

Stream	Source	What's recorded	When uploaded
Participant info	`ParticipantManager.js`	sID, pID, IDFC, environment (platform, VR, OS)	Once, at session start
Between-subjects assignment	`ConditionManager.js`	The assigned values per session	Once, after eligibility check
Questionnaire answers	`Questionnaire` prefabs	One row per (participant, question)	When participant submits
Custom data	`LUIDA-DataCollector` + your script	Whatever you define	Batched, on demand

The first three are handled for you — you don't need to write code. The fourth is where most experiment-specific data lives.

What "custom data" means

Custom data is everything that's specific to your study and not a questionnaire answer. Reaction times, button presses, trajectories, decision events — anything that's a per-trial or per-event measurement.

The lifecycle is:

state-listening action
  → SendDataToCollector(label, value)        ← writes to in-memory $.groupState.collectedData
  → ... possibly more SendDataToCollector calls ...
  → ProcessAndSaveCollectedData()            ← runs your data-calculator script,
                                               appends one record to the upload queue
  → ... possibly many trials' worth of records accumulating ...
  → UploadCollectedData()                    ← batches and uploads everything in queue

Think of SendDataToCollector as appending to a scratchpad, ProcessAndSaveCollectedData as snapshotting the scratchpad into a queued record, and UploadCollectedData as flushing the queue to the backend.
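The scratchpad/queue/flush analogy can be sketched as a toy in-memory model. This is not LUIDA's implementation: createCollectorModel, calculateRow, and uploadFn are hypothetical names invented for illustration, and clearing the scratchpad after each snapshot is an assumption based on COLLECTED_DATA holding only values written "since the last ProcessAndSaveCollectedData".

```javascript
// Toy model of the three lifecycle steps. All names are hypothetical stand-ins,
// not LUIDA's actual code.
function createCollectorModel(calculateRow, uploadFn) {
  let scratchpad = {}; // stands in for $.groupState.collectedData
  const queue = [];    // records waiting to be uploaded

  return {
    // Step 1: write one labelled value to the scratchpad.
    sendDataToCollector(label, value) {
      scratchpad[label] = value;
    },
    // Step 2: snapshot the scratchpad into one queued record.
    // Assumption: the scratchpad is reset afterwards.
    processAndSaveCollectedData() {
      queue.push(calculateRow(scratchpad));
      scratchpad = {};
    },
    // Step 3: flush every queued record to the backend stand-in.
    uploadCollectedData() {
      const flushed = queue.splice(0, queue.length);
      uploadFn(flushed);
      return flushed.length;
    },
    queueLength() {
      return queue.length;
    },
  };
}

// Usage: one trial's worth of data, then a flush.
const uploaded = [];
const model = createCollectorModel(
  (data) => ({ correct: data.correct, rt_ms: data.rt_ms }),
  (records) => uploaded.push(...records)
);
model.sendDataToCollector("rt_ms", 512);
model.sendDataToCollector("correct", true);
model.processAndSaveCollectedData();
model.uploadCollectedData();
```

After the final call, uploaded holds a single record and the queue is empty, which mirrors one trial flowing through all three steps.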

The data calculator script

When you choose GameObject › LUIDA › Data Collector, LUIDA spawns a LUIDA-DataCollector prefab and creates a JavaScript file (a "data calculator") at Assets/_Experiment_/Scripts/CustomDataCollection/<scene>_DataCollector.js.

You open and edit this file. It must export one function: calculateData(). That function reads three globals — CONDITION, PARTICIPANTS, COLLECTED_DATA — and returns a single JavaScript object representing one row of data.

A minimal Stroop-task data calculator:

function calculateData() {
  return {
    sID: $.groupState.sessionID,
    pID: 1,
    trialID: $.getStateCompat("global", "exp_trialID", "integer"),
    depth: CONDITION["depth"],
    fontColor: CONDITION["fontColor"],
    textMeaning: CONDITION["textMeaning"],
    responseTarget: CONDITION["responseTarget"],
    correct: COLLECTED_DATA["correct"],
    rt_ms: COLLECTED_DATA["rt_ms"],
  };
}

What's available inside calculateData:

CONDITION — the active variable values (within + between) for the current trial.
PARTICIPANTS — the player handles, 1-indexed.
COLLECTED_DATA — every key/value written via SendDataToCollector(key, value) since the last ProcessAndSaveCollectedData.
$.groupState.sessionID, $.getStateCompat(...), etc. — the full ClusterScript API.

The function returns one object. That object becomes one row of custom-data CSV, with the keys becoming column headers.
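To make the "keys become column headers" rule concrete, here is a minimal sketch of how a list of returned objects could map onto CSV text. The real conversion happens in the Web Console and is not shown in this page; rowsToCsv and its escaping rules are assumptions for illustration only.

```javascript
// Hypothetical helper: turns an array of row objects into CSV text.
// Headers are the union of all keys, in first-seen order.
function rowsToCsv(rows) {
  const headers = [];
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      if (!headers.includes(key)) headers.push(key);
    }
  }

  // Quote cells containing commas, quotes, or newlines; missing values become empty cells.
  const escapeCell = (value) => {
    if (value === undefined || value === null) return "";
    const text = String(value);
    return /[",\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
  };

  const lines = [headers.join(",")];
  for (const row of rows) {
    lines.push(headers.map((header) => escapeCell(row[header])).join(","));
  }
  return lines.join("\n");
}
```

A practical consequence under this sketch: if calculateData returns the same keys on every trial, every row fills every column, so it is worth keeping the key set stable across trials.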

When does each lifecycle step run?

Typically you wire up the lifecycle through state-listening actions:

State: Stimulus
  On State Start:    (start a timer, show the stimulus)
State: Response
  On State Exit:     SendDataToCollector("rt_ms", Date.now() - stimulusTime)
                     SendDataToCollector("correct", responseMatchesTarget)
                     ProcessAndSaveCollectedData()
State: Trial - Rest
  On State Start:    UploadCollectedData()       ← batches & uploads the trial's data
State: End
  On State Start:    UploadCollectedData()       ← final flush in case anything is in the queue

Two design principles:

ProcessAndSaveCollectedData runs once per trial, at end of the trial-defining state. Each call generates one row.
UploadCollectedData runs as often as you want, but each call costs at least one network round-trip (more when the queue is split into several chunks, as described under Batching). Common pattern: upload at the end of each trial, so partial data isn't lost if a participant disconnects. Cheaper pattern: upload only at End, so everything goes out in a single upload call.

Batching — what LUIDA actually does on upload

UploadCollectedData sends the queue via callExternal with type: "uploadCustomData". But the queue can be larger than Cluster's per-message payload limit (a few hundred KB), so CustomDataUploader.js batches:

Splits the queue into chunks where each chunk is at most 100 records OR ~100,000 UTF-8 bytes, whichever fills first.
Sends each chunk in sequence with a 1-second pause between chunks.
Tracks uploadIndex and steps so partial uploads can resume.
On onExternalCallEnd with meta === "customDataUploaded", advances to the next chunk.
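The splitting rule in the first step can be sketched as follows. This is an illustration of the "100 records or ~100,000 UTF-8 bytes" rule, not the code in CustomDataUploader.js; splitIntoChunks and utf8ByteLength are hypothetical names, and measuring each record by its JSON.stringify output is an assumption.

```javascript
// Limits taken from the description above.
const MAX_RECORDS_PER_CHUNK = 100;
const MAX_BYTES_PER_CHUNK = 100000;

// Size of a string in UTF-8 bytes (not characters).
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// Hypothetical sketch: greedily fill a chunk until either limit would be exceeded.
function splitIntoChunks(records) {
  const chunks = [];
  let current = [];
  let currentBytes = 0;

  for (const record of records) {
    const size = utf8ByteLength(JSON.stringify(record));
    const wouldOverflow =
      current.length >= MAX_RECORDS_PER_CHUNK ||
      currentBytes + size > MAX_BYTES_PER_CHUNK;

    // Close the current chunk before adding a record that does not fit.
    // A single oversized record still gets a chunk of its own.
    if (current.length > 0 && wouldOverflow) {
      chunks.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(record);
    currentBytes += size;
  }

  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

With small records the record-count limit is the one that bites (250 records become chunks of 100, 100, and 50); with large records the byte limit closes chunks earlier.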

The batching is invisible to you — from the experiment's perspective, UploadCollectedData() returns immediately and the batches dribble out over the next few seconds. You should not call UploadCollectedData() so frequently that batches overlap — give each upload at least a few seconds before starting the next.
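If your upload calls come from your own script logic rather than fixed state transitions, one way to keep them from overlapping is a minimum-interval gate. The sketch below is an assumption, not a LUIDA feature: createUploadGate is a hypothetical helper, and the interval you choose should reflect how many chunks you expect per upload.

```javascript
// Hypothetical helper: only lets an upload through if enough time has passed
// since the previous one. `now` is injectable so the gate can be tested.
function createUploadGate(minIntervalMs, now = Date.now) {
  let lastUploadAt = -Infinity;

  return function tryUpload(upload) {
    const currentTime = now();
    if (currentTime - lastUploadAt < minIntervalMs) {
      return false; // too soon: skip this call
    }
    lastUploadAt = currentTime;
    upload();
    return true;
  };
}
```

In this sketch a skipped call uploads nothing, so its records stay queued until a later call goes through. That makes the gate unsuitable for the final flush at End, which should always be allowed to run.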

Where the data ends up

The Web Console stores custom data as JSON files in S3, organized by sID. The Web Console's "Custom data" view parses these into a single CSV per experiment for download.

Filename pattern in S3:

<experiment-folder-id>/<worldId>/<sID>-<chunk-index>-customData.json

You'll never look at this directly. Use the Web Console's data download UI instead.

Patterns for common DV types

Reaction time: record Date.now() at stimulus onset and again at response, then subtract the first from the second:

// On State Start of "Stimulus":
SendDataToCollector("stimulusOnsetMs", Date.now());

// On State Exit of "Response":
const onset = COLLECTED_DATA["stimulusOnsetMs"];
SendDataToCollector("rt_ms", Date.now() - onset);
ProcessAndSaveCollectedData();

Multiple-choice answer: record which option was clicked. Usually wired via a state-listening item that writes the choice with SendDataToCollector when its trigger fires, so that it is available as COLLECTED_DATA["choice"].

Trajectory / continuous data: don't record per-frame data via SendDataToCollector. The collector produces one record per trial, so per-frame samples don't fit its granularity. Either downsample to a few key timepoints, or write to a separate per-trial JSON via a Customized Action that you upload at trial end.
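The downsampling option can be sketched as picking a fixed number of evenly spaced samples from whatever you buffered during the trial. The downsample helper and the shape of the sample array are assumptions for illustration; how you buffer per-frame samples is up to your own script.

```javascript
// Hypothetical helper: keep `count` evenly spaced samples, always including
// the first and last one when count >= 2.
function downsample(samples, count) {
  if (count <= 0 || samples.length === 0) return [];
  if (count === 1) return [samples[0]];
  if (samples.length <= count) return samples.slice();

  const picked = [];
  for (let i = 0; i < count; i++) {
    const index = Math.round((i * (samples.length - 1)) / (count - 1));
    picked.push(samples[index]);
  }
  return picked;
}

// Example: 101 buffered samples reduced to 5 key timepoints.
const trajectory = Array.from({ length: 101 }, (_, i) => ({ t: i, x: i * 0.1 }));
const keyPoints = downsample(trajectory, 5);
// keyPoints could then be serialized (for example with JSON.stringify) and
// written as a single value for the trial; whether that suits your analysis
// is a design choice, not something LUIDA prescribes.
```

Five points per trial keeps each record small enough that the per-trial row stays readable in the exported CSV.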

Text input from participant: use requestTextInput and store the result via SendDataToCollector("text", input).

Where to go next

Reference → Variables in scripts — the CONDITION, PARTICIPANTS, COLLECTED_DATA globals in detail.
Reference → Actions: data logging — SendDataToCollector, ProcessAndSaveCollectedData, UploadCollectedData API.
Web Console → Downloading data — what the data looks like once exported.
