Description
Code that doesn't work
julia> using DataFrames, Lasso
julia> df = DataFrame(x=randn(100), y=3randn(100) .+ 1);
julia> fit(LassoModel, @formula(x ~ 1 + y), df)
ERROR: ArgumentError: Model type LassoModel doesn't support intercept specified in formula x ~ 1 + y
Stacktrace:
[1] apply_schema(t::FormulaTerm{Term, Tuple{ConstantTerm{Int64}, Term}}, schema::StatsModels.Schema, Mod::Type{LassoModel})
@ StatsModels ~/.julia/packages/StatsModels/fK0P3/src/schema.jl:288
[2] ModelFrame(f::FormulaTerm{Term, Tuple{ConstantTerm{Int64}, Term}}, data::NamedTuple{(:x, :y), Tuple{Vector{Float64}, Vector{Float64}}}; model::Type{LassoModel}, contrasts::Dict{Symbol, Any})
@ StatsModels ~/.julia/packages/StatsModels/fK0P3/src/modelframe.jl:84
[3] kwcall(::NamedTuple{(:model, :contrasts), Tuple{UnionAll, Dict{Symbol, Any}}}, ::Type{ModelFrame}, f::FormulaTerm{Term, Tuple{ConstantTerm{Int64}, Term}}, data::NamedTuple{(:x, :y), Tuple{Vector{Float64}, Vector{Float64}}})
@ StatsModels ~/.julia/packages/StatsModels/fK0P3/src/modelframe.jl:73
[4] fit(::Type{LassoModel}, ::FormulaTerm{Term, Tuple{ConstantTerm{Int64}, Term}}, ::DataFrame; contrasts::Dict{Symbol, Any}, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ StatsModels ~/.julia/packages/StatsModels/fK0P3/src/statsmodel.jl:85
[5] fit(::Type{LassoModel}, ::FormulaTerm{Term, Tuple{ConstantTerm{Int64}, Term}}, ::DataFrame)
@ StatsModels ~/.julia/packages/StatsModels/fK0P3/src/statsmodel.jl:78
[6] top-level scope
@ REPL[7]:1
Why can I not manually specify an intercept like @formula(x ~ 1 + y)? The documentation ?@formula says:
1, 0, and -1 indicate the presence (for 1) or absence (for 0 and -1) of an intercept column.
So 1 is a valid intercept specification, like in R. This @formula also works in GLM.lm.
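To make the expectation concrete, here is a minimal sketch in plain Julia of what the two intercept specifications mean for the model matrix. It uses ordinary least squares rather than the lasso and does not touch Lasso.jl or StatsModels; the names X_with and X_without are only illustrative.

```julia
using Random

Random.seed!(1)
n = 100
x = randn(n)            # response, as in the report
y = 3 .* randn(n) .+ 1  # single predictor

# x ~ 1 + y : the model matrix has a column of ones (the intercept) plus y
X_with = hcat(ones(n), y)
# x ~ 0 + y : the model matrix has only the y column
X_without = reshape(y, n, 1)

# Ordinary least squares via the backslash operator
beta_with = X_with \ x        # [intercept, slope]
beta_without = X_without \ x  # [slope]

println("with intercept:    ", beta_with)
println("without intercept: ", beta_without)
```

The first fit returns two coefficients and the second only one, so a model fitted from x ~ 0 + y should not report an intercept at all.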
Code that works
If I write @formula(x ~ y), Lasso.jl will automatically fit a model with an intercept:
julia> fit(LassoModel, @formula(x ~ y), df)
StatsModels.TableRegressionModel{LassoModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, MinAICc}, Matrix{Float64}}
x ~ y
Coefficients:
LassoModel using MinAICc(2) segment of the regularization path.
Coefficients:
──────────────
Estimate
──────────────
x1 -0.132743
x2 0.0497596
──────────────
I assume the first coefficient is the intercept and the second one is multiplied by y, so the model is:
x = -0.132743 + 0.0497596 * y
So, intercepts are supported, but I can't manually specify that I want an intercept.
More code that doesn't work
Let's fit a model without an intercept. I specify this with the 0 in @formula(x ~ 0 + y).
julia> fit(LassoModel, @formula(x ~ 0 + y), df)
StatsModels.TableRegressionModel{LassoModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, MinAICc}, Matrix{Float64}}
x ~ 0 + y
Coefficients:
LassoModel using MinAICc(2) segment of the regularization path.
Coefficients:
──────────────
Estimate
──────────────
x1 -0.132743
x2 0.0497596
──────────────
It seems the package ignored the 0 in the formula, fitted an intercept of -0.132743 anyway, and produced the same model as above, even though the @formula is different. R's glmnet has supported fitting without an intercept since 2013.
It would be nice if it were possible to specify the intercept in the formula.
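A possible workaround, sketched under assumptions: the Lasso.jl documentation lists an intercept keyword for path fitting, and the sketch below assumes that fit(LassoModel, ...) forwards that keyword when called with the matrix interface instead of a formula. This has not been checked against Lasso v0.7.0, and the name Y for the predictor matrix is only illustrative.

```julia
using Lasso, Random

Random.seed!(1)
x = randn(100)            # response, as in the report
y = 3 .* randn(100) .+ 1  # single predictor

# Matrix interface: predictors as a 100×1 matrix, response as a vector.
Y = reshape(y, :, 1)

# Assumption: `intercept=false` is passed through to the underlying path fit.
m = fit(LassoModel, Y, x; intercept=false)

# If the keyword is honored, this holds only the slope for y.
println(coef(m))
```

This sidesteps the formula handling entirely, so it says nothing about whether 0 + y in a @formula should be respected.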
Versions
- Julia v1.9-beta2
- Lasso v0.7.0