@etiennekintzler
Contributor

Importance weight is part of the VW specification for the simple label (see https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Input-format#simple), but it was never implemented in DFtoVW.SimpleLabel.

For instance, the previous SimpleLabel class could not handle this user's needs: https://stackoverflow.com/questions/65385962/how-to-convert-csv-columns-into-vowpal-wabbit-txt-input-file/65387264#65387264

I named the attribute weight, but I can rename it to importance if necessary.

The usage is the following:

import pandas as pd
from vowpalwabbit.DFtoVW import DFtoVW, Feature, SimpleLabel

df = pd.DataFrame({
    "y": [1, 2, -1],
    "w": [2.5, 1.2, 3.75],
    "x": ["a", "b", "c"]
})

sl = SimpleLabel(label="y", weight="w")
conv = DFtoVW(df=df, label=sl, features=Feature("x"))
conv.convert_df()

# ['1 2.5 | x=a', '2 1.2 | x=b', '-1 3.75 | x=c']
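
For readers without the library installed, the line layout above can be sketched in plain Python. This only illustrates the expected output format (label, then an optional importance weight, then the features); it is not the DFtoVW implementation, and `format_simple_label` is a hypothetical helper name introduced here.

```python
def format_simple_label(label, weight=None):
    """Hypothetical helper: build the label part of a VW line.

    Returns "label" when no weight is given, "label weight" otherwise.
    """
    out = str(label)
    if weight is not None:
        out += " " + str(weight)
    return out


# Same data as the DataFrame above, as (y, w, x) tuples.
rows = [(1, 2.5, "a"), (2, 1.2, "b"), (-1, 3.75, "c")]

lines = [f"{format_simple_label(y, w)} | x={x}" for y, w, x in rows]
print(lines)
# ['1 2.5 | x=a', '2 1.2 | x=b', '-1 3.75 | x=c']
```

Omitting the weight leaves only the label, e.g. `format_simple_label(1)` gives `'1'`.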

@etiennekintzler etiennekintzler changed the title [py] DFtoVW -> Add weight attribute to SimpleLabel feat: [py] DFtoVW -> Add weight attribute to SimpleLabel Jun 2, 2021
else:
out = label_col
out += " " + self.weight.get_col(df)
return out
Contributor Author

This change is not directly related to this PR, but it makes the code shorter.

Collaborator

@olgavrou olgavrou left a comment

lgtm :)

@jackgerrits jackgerrits enabled auto-merge (squash) June 3, 2021 14:44
@jackgerrits jackgerrits merged commit 110a3e4 into VowpalWabbit:master Jun 3, 2021
@etiennekintzler etiennekintzler deleted the add_weight_to_simplelabel branch June 3, 2021 15:43