Describe the bug
There are two logic bugs in the function validate_and_prepare_single_dict_task, which can lead to incorrect covariate alignment and wrong task_n_future_covariates counting.
- Incorrect task_n_future_covariates counting
The function currently sets:
task_n_future_covariates = len(task_future_covariates_list)
However, task_future_covariates_list is built by iterating through all past_covariates keys (both past-only and known-future).
This makes task_n_future_covariates incorrectly equal to the total number of covariates, not the number of known-future covariates.
The returned value then breaks the downstream function _construct_slice and its task_n_past_only_covariates, both of which are used for batch building during training.
- Covariate row order not guaranteed
Even though the keys are sorted, the current logic does not ensure that “known-future” covariates appear as the last rows in task_future_covariates_tensor. This also misleads _construct_slice, which takes the last task_n_future_covariates rows:
task_future_covariates = task_past_tensor[
    -task_n_future_covariates:, slice_idx : slice_idx + self.prediction_length
]
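To make the effect on that slice concrete, here is a minimal sketch. It uses plain nested lists in place of the real tensors, and the helper names stack and last_rows are hypothetical, introduced only for illustration. It assumes covariate rows are stacked in sorted key order, as described above.

```python
# Plain nested lists stand in for the real tensors (one row per covariate).
# stack() and last_rows() are hypothetical helpers, not part of the library.
prediction_length = 4
slice_idx = 6

rows = {
    "temp": [2.0] * 10,     # past-only covariate
    "holiday": [1.0] * 10,  # known-future covariate
}


def stack(keys):
    """Stack covariate rows in the given key order."""
    return [rows[k] for k in keys]


def last_rows(tensor, n_future):
    """Mimic tensor[-n_future:, slice_idx : slice_idx + prediction_length]."""
    window = slice(slice_idx, slice_idx + prediction_length)
    return [row[window] for row in tensor[-n_future:]]


# Sorted key order puts "holiday" before "temp".
sorted_tensor = stack(sorted(rows))

# Over-counting: n = 2 selects both rows, including past-only "temp".
overcounted = last_rows(sorted_tensor, 2)

# Correct count but sorted order: the last row is "temp", not "holiday".
misordered = last_rows(sorted_tensor, 1)

# Past-only first, known-future last: the last row is "holiday".
fixed = last_rows(stack(["temp", "holiday"]), 1)
```

With the correct count alone, the slice still returns the temp row, so both the count and the row order need to be fixed together.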
Expected behavior
- task_n_future_covariates should reflect only the number of known-future covariates (len(task_future_covariates_keys)).
- Both tensors (task_context_tensor, task_future_covariates_tensor) should have consistent row ordering:
  - Past-only covariates first
  - Known-future covariates last
To reproduce
You can reproduce the issue using the following minimal example:
import numpy as np

# validate_and_prepare_single_dict_task must be imported from the module that defines it.
task = {
    "target": np.arange(10),
    "past_covariates": {
        "temp": np.arange(10),
        "holiday": np.array(["yes", "no", "no", "yes", "no", "yes", "no", "no", "no", "yes"]),
    },
    # Only "holiday" has known future values
    "future_covariates": {
        "holiday": np.array(["no", "yes", "yes", "no"]),
    },
}
out = validate_and_prepare_single_dict_task(task, idx=0, prediction_length=4)
print(out[-1])  # task_n_future_covariates
Expected result: 1 (only holiday has future values)
Current result: 2 (counts both temp and holiday)
Proposed fix
This PR (#344) separates past-only and known-future covariates explicitly and fixes the counting logic:
Code diff
# Separate keys to ensure correct ordering
past_only_keys = [k for k in task_past_covariates_keys_all if k not in task_future_covariates_keys]
ordered_covariate_keys = past_only_keys + task_future_covariates_keys
# Build tensors in the same order
for key in ordered_covariate_keys:
    ...
# Correct counting logic
task_n_future_covariates = len(task_future_covariates_keys)
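The key-ordering idea in the diff can be checked in isolation. The following is a sketch only: order_covariate_keys is a hypothetical helper written for this illustration and does not exist in the codebase, and it assumes both key groups are sorted independently before being concatenated.

```python
def order_covariate_keys(past_covariates, future_covariates):
    """Return (ordered_keys, n_future) with known-future keys placed last.

    Hypothetical helper for illustration; not part of the library.
    """
    # Known-future covariates are those that also have future values.
    future_keys = sorted(future_covariates)
    # Past-only covariates are everything else in past_covariates.
    past_only_keys = sorted(k for k in past_covariates if k not in future_covariates)
    # Past-only rows first, known-future rows last, so that a slice of the
    # last n_future rows selects exactly the known-future covariates.
    return past_only_keys + future_keys, len(future_keys)


ordered_keys, n_future = order_covariate_keys(
    past_covariates={"temp": [0] * 10, "holiday": [0] * 10},
    future_covariates={"holiday": [0] * 4},
)
print(ordered_keys, n_future)
```

For the reproduction task above, this ordering places temp first and holiday last, with a count of 1, so the last n_future keys are always the known-future ones.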