Restrict dataVals vowel onset to times that can have formant values by cwnaber · Pull Request #165 · carrien/free-speech

cwnaber · 2026-03-10T15:59:50Z

Background about time sampling

When generating dataVals.mat with gen_dataVals_from_wave_viewer, we use two related time scales and sets of data:

Those related to the raw audio signal, which normally have a sampling rate of 16 kHz, for example sigmat.ampl and sigmat.ampl_taxis.
Those related to the formant tracking, which normally have a sampling rate of 333.3 Hz, for example sigmat.ftrack_taxis or, eventually, dataVals.f1.

One difference between the two time scales, even ignoring sampling rate differences, is the first and last sample timepoint. sigmat.ampl_taxis starts at zero, but sigmat.ftrack_taxis starts at around 72 milliseconds, because a certain amount of signal data has to be collected before formant values can be determined.
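As a rough illustration of the two time axes, here is a minimal sketch with made-up values. The variable names mirror the sigmat fields, but the trial length and the exact start time are assumptions for illustration, not values read from a real trial.

```matlab
% Hypothetical values for illustration only; real axes come from sigmat.
fsAmpl = 16000;        % raw audio sampling rate (Hz)
fsFtrack = 1000/3;     % formant track sampling rate, about 333.3 Hz
ftrackStart = 0.072;   % approximate first formant-track time (s)
trialDur = 0.5;        % assumed trial length (s)

ampl_taxis = 0:1/fsAmpl:trialDur;                 % starts at 0
ftrack_taxis = ftrackStart:1/fsFtrack:trialDur;   % starts at about 72 ms

% Count the raw-audio samples that fall before the first formant-track sample.
nBefore = sum(ampl_taxis < ftrack_taxis(1));
fprintf('%d amplitude samples precede the first formant sample\n', nBefore);
```

Any amplitude event that happens within those first samples has no corresponding formant-track time point.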

Background about determining onset and offset

To determine the vowel onset and offset in a trial that has no user events, gen_dataVals_from_wave_viewer follows this process:

To find the onset, find the first sample in sigmat.ampl where the amplitude is above the amplitude threshold. This happens in old line 306, in the embedded function get_onset_from_ampl.
Finding the offset is more complicated, but in short: look for the first time after the onset that the amplitude drops below the amplitude threshold, and only pick a value that could exist in sigmat.ftrack_taxis.
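The two steps above can be sketched on a toy signal. This is not the code in gen_dataVals_from_wave_viewer: the signal, the threshold, and the choice to snap the offset to the nearest formant-track sample are all assumptions made for the example.

```matlab
% Toy signal; not the repository's implementation.
ampl_taxis = 0:0.01:0.5;           % assumed coarse amplitude time axis (s)
ampl = zeros(size(ampl_taxis));
ampl(11:31) = 1;                   % toy "vowel" spanning samples 11 to 31
ftrack_taxis = 0.07:0.03:0.49;     % assumed formant time axis (s)
amplThresh = 0.5;                  % assumed amplitude threshold

% Onset: first amplitude sample above the threshold.
onsetIndAmp = find(ampl > amplThresh, 1);

% Offset: first sample after the onset that drops below the threshold.
offsetRel = find(ampl(onsetIndAmp:end) < amplThresh, 1);
offsetIndAmp = onsetIndAmp + offsetRel - 1;

% Assumption: snap the offset to the nearest formant-track time point.
[~, offsetIndFtrack] = min(abs(ftrack_taxis - ampl_taxis(offsetIndAmp)));
```

Note that only the offset is tied to ftrack_taxis here; the onset is free to land on any amplitude sample, including ones before ftrack_taxis begins.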

The problem

I encountered an edge case pictured below, where the amplitude started above the threshold at the very beginning of the trial and then went below the threshold around 25 milliseconds in, before the formant tracking time scale started counting. Given our current code, the onset was set at time point 0, and the offset was set at time point 0.072, the first allowable value for the formant time axis. The code to determine the offset couldn't accommodate the offset being on the first sample of sigmat.ftrack_taxis and errored out.

Code error of the problem

In old L348, onsetIndFtrack was set to 1. Then in L355, offsetIndFtrack was set to 0, and finally in L359 offsetIndAmp couldn't be computed because sigmat.ftrack_taxis(offsetIndFtrack) tried to index with a value of 0, which is not a valid MATLAB index.
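The failure mode can be reproduced in isolation. The sketch below uses a toy three-sample axis (an assumption) and only demonstrates that indexing with 0 throws; it does not reproduce the surrounding logic from the function.

```matlab
ftrack_taxis = [0.072 0.075 0.078];   % toy formant time axis (s)
offsetIndFtrack = 0;                  % the value reached in the edge case
errored = false;
try
    % MATLAB indices start at 1, so indexing with 0 throws an error.
    t = ftrack_taxis(offsetIndFtrack);
catch err
    errored = true;
    disp(err.message);
end
```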

Proposed solution

I think the root issue here is that we restrict the offset to allowable values for ftrack_taxis, but we don't do the same for the onset. This pull request restricts the onset to values that are allowable on ftrack_taxis. The onset is still determined at the high sampling rate of sigmat.ampl, but only time points that also exist in ftrack_taxis are considered.
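One possible reading of this restriction, sketched on a toy signal that mimics the edge case: skip amplitude samples that fall before the first formant-track time point, then look for the first above-threshold sample. The signal, sampling rates, and threshold are assumptions, and the actual diff may implement the restriction differently.

```matlab
% Toy edge case: amplitude is high for the first 25 ms, then the real vowel starts later.
ampl_taxis = (0:499) / 1000;         % assumed 1 kHz toy axis (s)
ampl = zeros(size(ampl_taxis));
ampl(1:25) = 1;                      % early blip before formant tracking starts
ampl(151:300) = 1;                   % toy "vowel"
ftrack_taxis = 0.072:0.003:0.499;    % assumed formant time axis (s)
amplThresh = 0.5;                    % assumed amplitude threshold

% Unrestricted search: picks up the early blip at the first sample.
oldOnsetIndAmp = find(ampl > amplThresh, 1);

% Restricted search: only consider samples at or after the first formant time.
firstAllowed = find(ampl_taxis >= ftrack_taxis(1), 1);
newOnsetIndAmp = firstAllowed - 1 + find(ampl(firstAllowed:end) > amplThresh, 1);
```

With the restriction, the early blip is ignored and the onset lands on the toy vowel, so the later offset search has a valid formant-track index to work from.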

Alternate solution

If we want to keep the onset-determining code the same, we could add a check to the offset-finding code to look for this edge case and error out helpfully, such as telling the user they need to put in user events.
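A minimal sketch of what such a check might look like. The error identifier and message text are hypothetical, and the index values are the ones from the edge case described above.

```matlab
onsetIndFtrack = 1;
offsetIndFtrack = 0;   % edge-case values from the description
msg = '';
errId = '';
try
    % Guard before offsetIndFtrack is used as an index.
    if offsetIndFtrack < 1
        % Hypothetical identifier and wording, for illustration only.
        error('dataVals:noValidOffset', ...
            'Could not find a vowel offset on the formant time axis. Add user events to this trial.');
    end
catch err
    msg = err.message;
    errId = err.identifier;
end
disp(msg);
```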

Testing done

Since this is important code that we shouldn't change lightly, I made a script to compare dataVals.mat files created before and after this change. Ideally this code change should only fix edge cases and not affect how the onset and offset are determined in "normal" trials. My simplistic measure was whether the duration of a trial in dataVals.mat changed between the old and new gen_dataVals_from_wave_viewer. I ran the script on 5 sample participants; the change did not affect the duration of any trial except the one edge case that alerted me to the problem in the first place (pictured in The problem).
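A comparison along these lines could look like the sketch below. It is not the script that was actually run: the structs are toy stand-ins for dataVals loaded from the old and new files, the durations are invented, and the field name dur is an assumption.

```matlab
% Toy stand-ins for dataVals from the old and new .mat files.
% The field name 'dur' and all values are assumptions for this sketch.
oldVals = struct('dur', {0.210, 0.185, 0.072});
newVals = struct('dur', {0.210, 0.185, 0.160});
tol = 1e-9;   % tolerance for floating-point comparison (s)

% Flag trials whose duration differs between the two versions.
durDiff = abs([oldVals.dur] - [newVals.dur]);
changedTrials = find(durDiff > tol);
fprintf('%d of %d trials changed duration\n', numel(changedTrials), numel(oldVals));
```

In this toy data only the third trial differs, which is the pattern one would hope to see if the change touches edge cases only.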

cwnaber · 2026-03-10T16:52:30Z

speech/gen_dataVals_from_wave_viewer.m


 % find the index of ftrack_taxis that's closest to and greater than ampl_taxis's onset index
 [~, onsetIndFtrack] = find(sigmat.ftrack_taxis - sigmat.ampl_taxis(onsetIndAmp)>0, 1); 
 if ~isempty(onsetIndFtrack)


This change is already on the master branch, from when I used this dataVals-vowel-timing-fix branch before. Kind of confusing, sorry.

cwnaber added 2 commits January 27, 2026 09:37

fix typo in variable name

ef83744

restrict onset index to within ftrack_taxis

bf9589d

cwnaber requested a review from carrien March 10, 2026 16:48

cwnaber marked this pull request as ready for review March 10, 2026 16:48

cwnaber wants to merge 2 commits into master from dataVals-vowel-timing-fix
