Logistic regression - QN
Binary classification via logistic regression using a quasi-Newton optimizer in Julia.
This page comes from a single Julia file: logistic2.jl.
Setup
Add the Julia packages used in this demo. Change false to true in the following code block if you are using any of the following packages for the first time.
if false
import Pkg
Pkg.add([
"ADTypes"
"InteractiveUtils"
"LaTeXStrings"
"LinearAlgebra"
"MIRTjim"
"Optim"
"Plots"
"Random"
"Statistics"
])
end
Tell Julia to use the following packages. Run Pkg.add() in the preceding code block first, if needed.
using ADTypes: AutoForwardDiff
using InteractiveUtils: versioninfo
using LaTeXStrings
using LinearAlgebra: dot, eigvals
using MIRTjim: prompt
using Optim: optimize
import Optim # Options
using Plots: default, gui, savefig
using Plots: histogram!, plot, plot!, scatter, scatter!
using Random: seed!
using Statistics: mean
default(); default(markersize=6, linewidth=2, markerstrokecolor=:auto, label="",
tickfontsize=12, labelfontsize=18, legendfontsize=18, titlefontsize=18)
The following line is helpful when running this file as a script; it prompts the user to hit a key after each figure is displayed.
isinteractive() ? prompt(:prompt) : prompt(:draw);
Data
Generate synthetic data from two classes
if !@isdefined(yy)
seed!(0)
n0 = 60
n1 = 50
mu0 = [-1, 1]
mu1 = [1, -1]
v0 = mu0 .+ randn(2,n0) # class -1
v1 = mu1 .+ randn(2,n1) # class 1
nex = 0
if false
nex = 4 # extra dim (beyond the 2 shown) to make "larger scale"
v0 = [v0; rand(nex,n0)] # (2+nex, n0)
v1 = [v1; rand(nex,n1)] # (2+nex, n1)
end
M = n0 + n1 # how many samples
yy = [-ones(Int, n0); ones(Int, n1)] # (M) labels
vv = [[v0 v1]; ones(1,M)] # (npar, M) training data - with intercept
npar = 3 + nex # unknown parameters
end;
Scatter plot and initial decision boundary
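The boundary line drawn in this section is the set of points where the classifier's inner product is zero. The following is a minimal sketch of that relationship, assuming only the two plotted features plus an intercept stored last (that is, `nex = 0`); `boundary_v2` is a hypothetical helper name and is not part of the demo.

```julia
using LinearAlgebra: dot

# For x = [x1, x2, b], the boundary satisfies x1*v1 + x2*v2 + b = 0,
# so v2 = (-b - x1*v1) / x2 (this requires x2 != 0).
boundary_v2(x, v1) = (-x[end] - x[1] * v1) / x[2]

x = [-1.0, 3.0, 5.0] # same values as the initial guess when nex = 0
v1 = range(-4, 4, 5)
v2 = boundary_v2.(Ref(x), v1)

# Each point on the line has zero inner product with x (intercept feature = 1).
residuals = [dot([a, b, 1.0], x) for (a, b) in zip(v1, v2)]
```

Any point above or below this line gives a nonzero inner product, and its sign determines the predicted class.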
if !@isdefined(ps)
x0 = [-1; 3; rand(nex); 5]
v1p = range(-1,1,101) * 4
v2p_fun(x) = @. (-x[end] - x[1] * v1p) / x[2]
ps = plot(aspect_ratio = 1, size = (550, 500), legend = :topright,
xaxis = (L"v_1", (-4, 4), [-4 -1 0 1 4]),
yaxis = (L"v_2", (-4, 4), [-4 -1 0 1 4]),
)
plot!(v1p, v2p_fun(x0), color=:red, label="initial")
plot!(v1p, v1p, color=:yellow, label="ideal")
alpha = 0.7
scatter!(v0[1,:], v0[2,:], color = :green; alpha)
scatter!(v1[1,:], v1[2,:], color = :blue, marker = :square; alpha)
end
ps
prompt()
Cost function
Logistic regression with Tikhonov regularization involves minimizing the following cost function:
\[f(x) = 1_M' h.(A x) + (β/2) ‖ x ‖_2^2\]
where $h(z) = \log(1 + e^{-z})$ is the logistic loss function.
Here $A$ is an $M × N$ matrix whose $m$th row holds the $N$ features of the $m$th sample (typically including the intercept $1$), already multiplied by the $m$th binary class label, which is ±1.
The cost function gradient is $∇ f(x) = A' \dot{h}.(A x) + β x$, and a Lipschitz constant of that gradient is $‖A‖_2^2 / 4 + β$.
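As a brief justification of these two expressions (a sketch using standard calculus, not taken from the source file), the first two derivatives of the logistic loss are
\[\dot{h}(z) = \frac{-1}{1 + e^{z}}, \qquad \ddot{h}(z) = \frac{e^{z}}{(1 + e^{z})^2} ∈ (0, 1/4],\]
where the value $1/4$ is attained at $z = 0$. The chain rule then gives the gradient $A' \dot{h}.(A x) + β x$, and the Hessian satisfies
\[∇^2 f(x) = A' \, \text{diag}(\ddot{h}.(A x)) \, A + β I ⪯ (‖A‖_2^2 / 4 + β) I.\]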
After optimizing $x$, the classifier is simply $\text{sign}(⟨v,x⟩)$ where the feature vector $v$ typically includes the intercept $1$.
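One practical detail when coding $h$: evaluating `log(1 + exp(-t))` directly overflows to `Inf` in `Float64` once `-t` exceeds roughly 709. The data in this demo are far from that regime, so the following is only a hedged sketch of a common rearrangement; `pot_stable` is a hypothetical name that does not appear in the source file.

```julia
# Direct form, as used in the demo.
pot(t) = log(1 + exp(-t))

# Equivalent form: log(1 + e^{-t}) = max(-t, 0) + log(1 + e^{-|t|}),
# which never exponentiates a positive number.
pot_stable(t) = max(-t, zero(t)) + log1p(exp(-abs(t)))

ts = [-5.0, -0.3, 0.0, 2.0, 10.0]
max_diff = maximum(abs.(pot.(ts) .- pot_stable.(ts)))

big_direct = pot(-1000.0)        # exp(1000.0) overflows, so this is Inf
big_stable = pot_stable(-1000.0) # stays finite
```

For moderate arguments the two forms agree to rounding error, so the rearrangement matters only when inner products become very large in magnitude.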
if !@isdefined(cost)
pot(t) = log(1 + exp(-t)) # logistic
dpot(t) = -1 / (exp(t) + 1)
tmp = vv * vv' # (npar, npar) covariance
tmp = eigvals(tmp)
@show maximum(tmp) / minimum(tmp)
pLip = maximum(tmp) / 4 # 1/4 comes from logistic curvature
reg = 0 # no regularization because N ≪ M here
Lip = pLip + reg # Lipschitz constant
A = yy .* vv' # M × N matrix of features times labels
gfun(x) = A' * dpot.(A * x) + reg * x # gradient
if false
tmp = gfun(x0)
@show size(tmp)
end
cost(x::AbstractVector) = sum(pot, A * x) + reg/2 * sum(abs2, x)
cost(x::AbstractMatrix) = cost.(eachcol(x)) ## to handle arrays
end;
maximum(tmp) / minimum(tmp) = 3.459135516758462
L-BFGS optimizer
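Before handing a cost function and its gradient to a quasi-Newton solver, it can help to confirm that the two agree. The sketch below does this with central finite differences on a small random problem, and also checks the Lipschitz bound on the gradient. The matrix `A`, the value of `β`, and the helper `fd_gradient` are stand-ins chosen for illustration, not the demo's data or code.

```julia
using LinearAlgebra: opnorm
using Random: seed!

seed!(0)
M, N = 20, 3
A = randn(M, N)   # stand-in for the label-scaled feature matrix
β = 0.1           # stand-in regularization parameter

pot(t) = log(1 + exp(-t))
dpot(t) = -1 / (exp(t) + 1)
cost(x) = sum(pot, A * x) + β/2 * sum(abs2, x)
gfun(x) = A' * dpot.(A * x) + β * x

# Central finite differences, one coordinate at a time.
function fd_gradient(f, x; step = 1e-6)
    g = similar(x)
    for i in eachindex(x)
        e = zeros(length(x)); e[i] = step
        g[i] = (f(x + e) - f(x - e)) / (2step)
    end
    return g
end

x = randn(N)
grad_err = maximum(abs.(gfun(x) - fd_gradient(cost, x)))

# The gradient should not change faster than the Lipschitz bound allows.
Lip = opnorm(A)^2 / 4 + β
y = randn(N)
lip_ratio = sqrt(sum(abs2, gfun(x) - gfun(y))) / sqrt(sum(abs2, x - y))
```

A small `grad_err` and a `lip_ratio` no larger than `Lip` are what the formulas predict; a large discrepancy would usually point to a sign or transpose mistake in the gradient.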
opt = Optim.Options(
store_trace = true,
show_warnings = false,
extended_trace = true, # for trace of x
)
outq = optimize(cost, gfun, x0, opt;
inplace = false, autodiff = AutoForwardDiff())
xqs = hcat(Optim.x_trace(outq)...)
xq = outq.minimizer
xh = xqs[:,end] # final estimate
3-element Vector{Float64}:
1.849573391951605
-2.4828583265875577
-0.7377052753332445
Plot cost
ifun(xs) = 0:(size(xs,2)-1)
pc = plot(xaxis = ("iteration", (0,16), 0:4:16), yaxis = ("Cost function",))
plot!(ifun(xqs), cost(xqs) .- cost(xh), label = "QN", marker = :o)
prompt()
Plot decision boundaries
if true
psh = deepcopy(ps)
v2p = @. (-xh[end] - xh[1] * v1p) / xh[2]
plot!(psh, v1p, v2p, color = :magenta, label = "final")
end
psh
prompt()
Plot iterate convergence
efun1(x) = vec(sqrt.(sum(abs2, x .- xh, dims=1)))
efun(x) = log10.(efun1(x))
pic = plot(
xaxis = ("Iteration", (0, 16), 0:2:16),
yaxis = (L"\log_{10}(‖ \mathbf{x}_k - \mathbf{x}_* ‖)", (-9, 3), -9:3),
legend = :topright,
)
plot!(ifun(xqs), efun(xqs), label = "QN", marker = :o)
pic
prompt()
Plot 1D separation
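This section summarizes the scalar $⟨x,v⟩$ for each training sample; the sign of that scalar is the predicted class. The following small sketch uses hand-picked toy values: the data, `x_toy`, and `classify` are illustrative assumptions and are unrelated to the fitted model in this demo.

```julia
using LinearAlgebra: dot

classify(v, x) = sign(dot(v, x)) # +1 or -1 (0 exactly on the boundary)

x_toy = [2.0, -2.0, 0.0] # weights, with the intercept weight last

# Each column is one sample: v_1, v_2, and the intercept feature 1.
features = [-1.0  1.5 -0.5  1.0
             1.0 -1.0 -1.0  2.0
             1.0  1.0  1.0  1.0]
labels = [-1, 1, -1, 1]

predicted = [classify(v, x_toy) for v in eachcol(features)]
accuracy = count(predicted .== labels) / length(labels) * 100
```

Here the first two samples land on the correct side of the boundary and the last two do not, which is the kind of overlap that the histograms make visible.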
inprod0 = [v0; ones(1,n0)]' * xh
inprod1 = [v1; ones(1,n1)]' * xh
accuracy0 = round(count(<(0), inprod0) / n0 * 100, digits=1)
accuracy1 = round(count(>(0), inprod1) / n1 * 100, digits=1)
plot(xaxis=("⟨x,v⟩",))
bins = -15:15
alpha = 0.5
histogram!(inprod0; alpha, bins, color = :green, linecolor = :green,
label = "class 0: $accuracy0%")
histogram!(inprod1; alpha, bins, color = :blue, linecolor = :blue,
label = "class 1: $accuracy1%")
prompt()
Method
Stand-alone function for (regularized) logistic regression
"""
xh = logistic(data, labels, reg)
Perform regularized logistic regression for binary `labels`
by minimizing
``f(x) = 1_M' h.(A x) + β/2 ‖ x ‖_2^2``
where
``h(z) = log(1 + e^{-z})``
is the logistic loss function.
In:
- `data` `N × M` where `N` is number of features (including offset)
- `labels` vector of `M` labels ±1
- `reg` regularization parameter `β`
Out:
- `xh` minimizer of ``f``
"""
function logistic(data::AbstractMatrix, labels::AbstractVector, reg::Real)
any(x -> ∉(x, (-1,1)), labels) && throw(ArgumentError("labels must be ±1"))
pot(t) = log(1 + exp(-t)) # logistic
dpot(t) = -1 / (exp(t) + 1) # derivative
tmp = data * data' # (N, N) covariance
tmp = eigvals(tmp)
pLip = maximum(tmp) / 4 # 1/4 comes from logistic curvature
Lip = pLip + reg # Lipschitz constant
A = labels .* data'
cost(x) = sum(pot, A * x) + reg/2 * sum(abs2, x)
gfun(x) = A' * dpot.(A * x) + reg * x # gradient
x0 = zeros(size(data,1))
outq = optimize(cost, gfun, x0;
inplace = false, autodiff = AutoForwardDiff())
return outq.minimizer
end;
xl = logistic(vv, yy, reg)
@assert xl ≈ xh
Reproducibility
This page was generated with the following version of Julia:
using InteractiveUtils: versioninfo
io = IOBuffer(); versioninfo(io); split(String(take!(io)), '\n')
12-element Vector{SubString{String}}:
"Julia Version 1.12.5"
"Commit 5fe89b8ddc1 (2026-02-09 16:05 UTC)"
"Build Info:"
" Official https://julialang.org release"
"Platform Info:"
" OS: Linux (x86_64-linux-gnu)"
" CPU: 4 × AMD EPYC 7763 64-Core Processor"
" WORD_SIZE: 64"
" LLVM: libLLVM-18.1.7 (ORCJIT, znver3)"
" GC: Built with stock GC"
"Threads: 1 default, 1 interactive, 1 GC (on 4 virtual cores)"
""
And with the following package versions
import Pkg; Pkg.status()
Status `~/work/book-la-demo/book-la-demo/docs/Project.toml`
[47edcb42] ADTypes v1.21.0
[6e4b80f9] BenchmarkTools v1.6.3
[aaaa29a8] Clustering v0.15.8
[35d6a980] ColorSchemes v3.31.0
[3da002f7] ColorTypes v0.12.1
[c3611d14] ColorVectorSpace v0.11.0
[717857b8] DSP v0.8.4
[72c85766] Demos v0.1.0 `~/work/book-la-demo/book-la-demo`
[e30172f5] Documenter v1.17.0
[4f61f5a4] FFTViews v0.3.2
[7a1cc6ca] FFTW v1.10.0
[587475ba] Flux v0.16.9
[a09fc81d] ImageCore v0.10.5
[9ee76f2b] ImageGeoms v0.11.2
[71a99df6] ImagePhantoms v0.8.1
[b964fa9f] LaTeXStrings v1.4.0
[7031d0ef] LazyGrids v1.1.0
[599c1a8e] LinearMapsAA v0.12.0
[98b081ad] Literate v2.21.0
[7035ae7a] MIRT v0.18.3
[170b2178] MIRTjim v0.26.0
[eb30cadb] MLDatasets v0.7.21
[efe261a4] NFFT v0.14.3
[6ef6ca0d] NMF v1.0.3
[15e1cf62] NPZ v0.4.3
[0b1bfda6] OneHotArrays v0.2.10
[429524aa] Optim v2.0.1
[91a5bcdd] Plots v1.41.6
[f27b6e38] Polynomials v4.1.1
[2913bbd2] StatsBase v0.34.10
[1986cc42] Unitful v1.28.0
[d6d074c3] VideoIO v1.6.1
[b77e0a4c] InteractiveUtils v1.11.0
[37e2e46d] LinearAlgebra v1.12.0
[44cfe95a] Pkg v1.12.1
[9a3f8284] Random v1.11.0
This page was generated using Literate.jl.