
Commit 77966c4

fix: refine the prompt (microsoft#286)
1 parent f2831e7 commit 77966c4

File tree

1 file changed: +2 -1 lines changed


rdagent/scenarios/kaggle/experiment/prompts.yaml

Lines changed: 2 additions & 1 deletion
@@ -114,6 +114,7 @@ kg_feature_interface: |-
   3. Ensure consistency in column count across train, validation, and test sets post-feature engineering. For example, fit PCA on the training set and apply the same transformation to the validation and test sets to keep the number of columns aligned; using OneHotEncoder may also cause a mismatch in the number of columns.
   4. Ensure that the generation of new features does not drastically increase the number of columns, which can slow down data processing. For example, avoid creating pairwise interactions for all features, as this would lead to a quadratic increase in the number of columns.
   5. Avoid raising a `ValueError` or any other exception that could interrupt the main program's flow. The code should not include checks that could potentially lead to a `ValueError`. Instead, focus on writing robust and fault-tolerant feature engineering functions that handle edge cases and missing data gracefully, without stopping the program.
+  6. Specific categories of features can be filtered, and processing can be applied to those categories. For example, normalization can be applied to float-type features, but such processing should not be applied to one-hot encoded features.
 
 kg_model_interface: |-
   Your code should contain several parts:
@@ -312,4 +313,4 @@ kg_model_output_format: |-
 
 kg_model_simulator: |-
   The models will be trained on the competition dataset and evaluated on their ability to predict the target. Metrics like accuracy and AUC-ROC are used to evaluate model performance.
-  Model performance will be iteratively improved based on feedback from evaluation results.
+  Model performance will be iteratively improved based on feedback from evaluation results.
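The column-consistency rule and the new rule 6 describe a common pattern for generated feature code; the sketch below is not part of this commit, only an illustration of what compliant code might look like, assuming a pandas/scikit-learn setup. The function names, the 8-component PCA, and the float vs. one-hot column split are illustrative assumptions.

```python
# Illustrative sketch (not from the commit): fit transforms on the training set
# only so train/validation/test keep the same column count, and apply
# normalization only to float-typed columns, leaving one-hot columns untouched.
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def fit_feature_transforms(X_train: pd.DataFrame) -> dict:
    """Fit scaler and PCA on the training split only (assumes >= 1 float column)."""
    float_cols = X_train.select_dtypes(include="float").columns.tolist()
    # One-hot / integer columns are passed through unchanged, per rule 6.
    passthrough_cols = [c for c in X_train.columns if c not in float_cols]

    scaler = StandardScaler().fit(X_train[float_cols])
    pca = PCA(n_components=min(8, len(float_cols))).fit(
        scaler.transform(X_train[float_cols])
    )
    return {
        "float_cols": float_cols,
        "passthrough_cols": passthrough_cols,
        "scaler": scaler,
        "pca": pca,
    }


def apply_feature_transforms(X: pd.DataFrame, fitted: dict) -> pd.DataFrame:
    """Apply train-fitted transforms to any split; never re-fit here, so the
    resulting column count is identical across train, validation, and test."""
    scaled = fitted["scaler"].transform(X[fitted["float_cols"]])
    reduced = fitted["pca"].transform(scaled)
    pca_df = pd.DataFrame(
        reduced,
        index=X.index,
        columns=[f"pca_{i}" for i in range(reduced.shape[1])],
    )
    return pd.concat([X[fitted["passthrough_cols"]], pca_df], axis=1)
```

Fitting once on the training split and reusing the fitted objects everywhere is what keeps the number of columns aligned across splits, and restricting the scaler/PCA to float columns keeps one-hot features out of the normalization step, matching the constraints the prompt adds.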
