[BUG] Lags can only be applied to target variables #1587

@manmeet3591

My data is as follows:
[image]

I want to train a model that takes the previous 10 time steps of all 4 variables (rainfall_, dmi_, nino12_ and nino34_) and predicts the 11th step of rainfall_ as the target. My training dataset is defined as follows:

from pytorch_forecasting import TimeSeriesDataSet

# Create time_idx and group_id
df = train_df.copy()
df['time_idx'] = df.index
df['group_id'] = 0  # Only one time series

# Define the TimeSeriesDataSet
max_encoder_length = 10  # Number of time steps for the encoder
max_prediction_length = 1  # Number of time steps for the prediction

training_dataset = TimeSeriesDataSet(
    data=df,
    time_idx="time_idx",
    target="rainfall_",
    group_ids=["group_id"],
    max_encoder_length=max_encoder_length,
    min_encoder_length=max_encoder_length,
    max_prediction_length=max_prediction_length,
    # time_varying_known_reals=["dmi_", "nino34_", "nino12_"],
    time_varying_unknown_reals=["dmi_", "nino34_", "nino12_", "rainfall_"],
    lags={"rainfall_": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
          "nino12_":  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]},  # Include previous 10 time steps of rainfall_ and nino12_
    # lags={
    #     "rainfall_": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    #     "dmi_": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    #     "nino34_": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    #     "nino12_": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    # }
)

# Convert dataset to a dataloader for model training
train_dataloader = training_dataset.to_dataloader(train=True, batch_size=64)

This raises the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-22-0a8da5b66b64> in <cell line: 10>()
      8 max_prediction_length = 1  # Number of time steps for the prediction
      9 
---> 10 training_dataset = TimeSeriesDataSet(
     11     data=df,
     12     time_idx="time_idx",

6 frames
/usr/local/lib/python3.10/dist-packages/sklearn/base.py in _check_feature_names(self, X, reset)
    479                 )
    480 
--> 481             raise ValueError(message)
    482 
    483     def _validate_data(

ValueError: The feature names should match those that were passed during fit.
Feature names unseen at fit time:
- nino12__lagged_by_1
Feature names seen at fit time, yet now missing:
- nino12_

Labels: bug (Something isn't working)
Status: Reproduced/confirmed