feat: Privacy Preserving Learning #3485
…vsinghal157/vowpal_wabbit into RLOS_21_Privacy_Preserving_Learning
uint32_t _stride_shift;
bool _seeded;  // whether the instance is sharing model state with others
size_t _privacy_activation_threshold;
std::unordered_map<uint64_t, std::bitset<32>> _feature_bitset;  // define the bitset for each feature
This could be a unique_ptr set to nullptr unless privacy mode is on, to be a bit more explicit and avoid extra memory allocations.
I like that suggestion. Making it a shared_ptr because of the shallow copy function.
done
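A minimal sketch of what this thread settles on, assuming a simplified stand-in class: the bitset map lives behind a `std::shared_ptr` that stays `nullptr` until privacy mode is enabled, and a shallow copy shares the same map. The class name `sketch_weights` and the methods `enable_privacy`, `privacy_enabled`, `mark`, and `users_seen` are hypothetical and are not taken from the PR; only the member names `_feature_bitset` and `_privacy_activation_threshold` come from the diff.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>

// Hypothetical stand-in for the weights class discussed in this thread.
class sketch_weights
{
public:
  using feature_bitset_map = std::unordered_map<uint64_t, std::bitset<32>>;

  // Allocate the map only when privacy mode is switched on.
  void enable_privacy(size_t threshold)
  {
    _privacy_activation_threshold = threshold;
    if (_feature_bitset == nullptr) { _feature_bitset = std::make_shared<feature_bitset_map>(); }
  }

  bool privacy_enabled() const { return _feature_bitset != nullptr; }

  // Shallow copy: both instances point at the same map afterwards,
  // which is why a shared_ptr is used rather than a unique_ptr.
  void shallow_copy(const sketch_weights& other)
  {
    _privacy_activation_threshold = other._privacy_activation_threshold;
    _feature_bitset = other._feature_bitset;
  }

  // Record that the user bucket tag_bit touched feature_index.
  void mark(uint64_t feature_index, size_t tag_bit)
  {
    assert(privacy_enabled());
    (*_feature_bitset)[feature_index].set(tag_bit % 32);
  }

  // Number of distinct user buckets recorded for feature_index.
  size_t users_seen(uint64_t feature_index) const
  {
    if (!privacy_enabled()) { return 0; }
    auto it = _feature_bitset->find(feature_index);
    return it == _feature_bitset->end() ? 0 : it->second.count();
  }

private:
  size_t _privacy_activation_threshold = 10;
  std::shared_ptr<feature_bitset_map> _feature_bitset;  // nullptr unless privacy mode is on
};
```

With this layout, an instance that never enables privacy mode pays only for one null pointer, and instances created through `shallow_copy` observe each other's marks.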
{
private:
// struct to store the tag hash and if it is set or not
struct tag_hash_info
I wish it could be an optional
…vsinghal157/vowpal_wabbit into RLOS_21_Privacy_Preserving_Learning
{
INTERACTIONS::generate_interactions<audit_regressor_data, const uint64_t, audit_regressor_feature, true,
audit_regressor_interaction, sparse_parameters>(rd.all->interactions, rd.all->extent_interactions,
audit_regressor_interaction, sparse_parameters, true>(rd.all->interactions, rd.all->extent_interactions,
audit_regressor_interaction, sparse_parameters, true>(rd.all->interactions, rd.all->extent_interactions,
audit_regressor_interaction, sparse_parameters, true /*privacy_activation*/>(rd.all->interactions, rd.all->extent_interactions,
GD::foreach_feature<std::pair<float, float>, float, vec_add_with_norm, LazyGaussian>(w, all->ignore_some_linear,
all->ignore_linear, all->interactions, all->extent_interactions, all->permutations, *ec, dotwithnorm,
all->_generate_interactions_object_cache);
if (all->privacy_activation)
It seems not great that this is new global state
float weight = 1.f;  // a relative importance weight for the example, default = 1
v_array<char> tag;  // An identifier for the example.
size_t example_counter = 0;
uint64_t tag_hash;  // Storing the hash of the tag for privacy preservation learning
Initialize
if (b.all->weights.sparse && privacy_activation)
{
  b.all->weights.sparse_weights.set_tag(
      hashall(ec.tag.begin(), ec.tag.size(), b.all->hash_seed) % b.all->feature_bitset_size);
  GD::foreach_feature<ftrl_update_data, inner_update_proximal>(*b.all, ec, b.data);
  b.all->weights.sparse_weights.unset_tag();
}
else if (!b.all->weights.sparse && privacy_activation)
{
  b.all->weights.dense_weights.set_tag(
      hashall(ec.tag.begin(), ec.tag.size(), b.all->hash_seed) % b.all->feature_bitset_size);
  GD::foreach_feature<ftrl_update_data, inner_update_proximal>(*b.all, ec, b.data);
  b.all->weights.dense_weights.unset_tag();
}
else
{
  GD::foreach_feature<ftrl_update_data, inner_update_proximal>(*b.all, ec, b.data);
}
This is a massive uptick in complexity when just doing an update. Can this be abstracted?
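One possible way to abstract the three branches, offered only as a sketch: a small RAII guard calls `set_tag` on construction and `unset_tag` on destruction when privacy is active, and a helper runs the update inside that scope. Only `set_tag`, `unset_tag`, and `privacy_activation` come from the diff; `sketch_tagged_weights`, `scoped_tag`, and `update_with_optional_tag` are hypothetical names, and the real weights classes are replaced by a stub.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stub exposing the set_tag/unset_tag pair seen in the diff.
struct sketch_tagged_weights
{
  uint64_t tag = 0;
  bool tag_set = false;

  void set_tag(uint64_t t)
  {
    assert(!tag_set);  // nested tags are not supported in this sketch
    tag = t;
    tag_set = true;
  }
  void unset_tag() { tag_set = false; }
};

// RAII guard: sets the tag on entry and clears it on exit, but only when active.
template <typename WeightsT>
class scoped_tag
{
public:
  scoped_tag(WeightsT& weights, uint64_t tag, bool active) : _weights(weights), _active(active)
  {
    if (_active) { _weights.set_tag(tag); }
  }
  ~scoped_tag()
  {
    if (_active) { _weights.unset_tag(); }
  }
  scoped_tag(const scoped_tag&) = delete;
  scoped_tag& operator=(const scoped_tag&) = delete;

private:
  WeightsT& _weights;
  bool _active;
};

// Runs update() once; the tag is visible to it only if privacy_activation is true.
template <typename WeightsT, typename UpdateFn>
void update_with_optional_tag(WeightsT& weights, bool privacy_activation, uint64_t tag, UpdateFn&& update)
{
  scoped_tag<WeightsT> guard(weights, tag, privacy_activation);
  update();
}
```

Under these assumptions, each reduction would make a single call per weights type, and the tag is cleared even if the update exits early.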
{
// iterate through one namespace (or its part), callback function FuncT(some_data_R, feature_value_x, feature_index)
template <class DataT, void (*FuncT)(DataT&, float feature_value, uint64_t feature_index), class WeightsT>
template <class DataT, void (*FuncT)(DataT&, float feature_value, uint64_t feature_index), bool privacy_activation>
privacy_activation seems unused?
Replaced by #3334
Part of the Empirical Analysis of Privacy Preserving Learning Project.
This PR introduces a command-line argument that enables aggregated learning: only features that have been seen by at least a minimum threshold of users are saved, which protects the privacy of individual users.
Methodology:
(vowpalwabbit/array_parameters.h and vowpalwabbit/array_parameters_dense.h)
(vowpalwabbit/parser.cc)
(vowpalwabbit/gd_predict.h -> (vowpalwabbit/array_parameters.h and vowpalwabbit/array_parameters_dense.h))
(vowpalwabbit/gd.cc -> (vowpalwabbit/array_parameters.h and vowpalwabbit/array_parameters_dense.h))
(The default value of the threshold is 10)
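The thresholding idea can be illustrated with a short sketch, assuming that each example's tag hash is reduced modulo the bitset size (32 in the diff) to pick a bit, and that a feature counts as activated once at least the threshold number of bits are set. The class `feature_user_tracker` and its methods are hypothetical and do not reproduce the PR's code.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical tracker: one 32-bit bitset per feature index.
class feature_user_tracker
{
public:
  static constexpr size_t bitset_size = 32;

  explicit feature_user_tracker(size_t threshold = 10) : _threshold(threshold)
  {
    // A threshold above the bitset size could never be reached.
    assert(threshold <= bitset_size);
  }

  // Record that an example with this tag hash used the feature.
  void observe(uint64_t feature_index, uint64_t tag_hash)
  {
    _seen[feature_index].set(tag_hash % bitset_size);
  }

  // A feature is activated once enough distinct buckets have been set.
  bool is_activated(uint64_t feature_index) const
  {
    auto it = _seen.find(feature_index);
    return it != _seen.end() && it->second.count() >= _threshold;
  }

private:
  size_t _threshold;
  std::unordered_map<uint64_t, std::bitset<bitset_size>> _seen;
};
```

Because different tags can fall into the same bucket, the number of set bits never exceeds the number of distinct tags seen, so in this sketch a feature cannot be activated by fewer users than the threshold.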
This PR includes:
(vowpalwabbit/parse_args.cc)
(test/core.vwtest.json)
(test/unit_test/weights_test.cc)
(test/benchmarks/standalone/benchmark_text_input.cc)
Implementation details:
--privacy_activation: To activate the feature
--privacy_activation_threshold arg (=10): To set the threshold
Future Work:
Wiki page for this feature: https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Privacy-Preserving-Learning