Linear regression and SAT scores

This example provides template code for a homework problem on linear regression.

The data comes from the 2007 paper by L. M. Lesser titled "Critical Values and Transforming Data: Teaching Statistics with Social Justice" and is based on data collected by the College Board, the organization that runs the SAT exam for high-school students. It gives average SAT scores (Math, Verbal, and Writing) for 10 family annual income brackets. The homework problem uses this data to explore the relationship between income and SAT scores via linear regression.

This page comes from a single Julia file: sat-regress.jl.


Setup

Add the Julia packages used in this demo. Change false to true in the following code block if you are using any of the following packages for the first time.

if false
    import Pkg
    Pkg.add([
        "LinearAlgebra"
        "MIRTjim"
        "Plots"
    ])
end

Tell Julia to use the following packages. Run Pkg.add() in the preceding code block first, if needed.

using Plots: default, gui, scatter, savefig
default(); default(label="", markerstrokecolor=:auto, widen=true, linewidth=2,
 markersize = 6, tickfontsize=14, labelfontsize = 18, legendfontsize=16)

Data

Normally we would read such data from a file, for example a .csv file read with CSV.jl. This data set is small enough to paste here directly.

data = [
 "Income Bracket (in \$1000s)" "0 – 10" "10 – 20" "20 – 30" "30 – 40" "40 – 50" "50 – 60" "60 – 70" "70 – 80" "80 – 100" "100+"
 "Math" 457 465 474 488 501 509 515 521 534 564
 "Verbal" 429 445 462 478 493 500 505 511 523 549
 "Writing" 427 440 454 470 483 490 496 502 514 543
];
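
For comparison, here is a minimal sketch of reading the same kind of table from a .csv file. It deliberately uses only Base functions rather than CSV.jl, so it runs without extra packages. The file layout (one row per bracket), the column names, and the variable names `csv_text` and `math_csv` are assumptions made for illustration; only the first three brackets of the table are included.

```julia
# Hypothetical file contents: one row per income bracket (first three brackets only).
csv_text = """
bracket,math,verbal,writing
0-10,457,429,427
10-20,465,445,440
20-30,474,462,454
"""

# Write the text to a temporary file so the example really reads from disk.
path, io = mktemp()
write(io, csv_text)
close(io)

lines = readlines(path)                       # one string per row
header = split(lines[1], ',')                 # column names
rows = [split(line, ',') for line in lines[2:end]]

col = findfirst(==("math"), header)           # index of the "math" column
math_csv = [parse(Int, row[col]) for row in rows]
```

A dedicated reader such as CSV.jl would also handle quoting, missing values, and type inference, which this sketch ignores.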

math = Int.(data[2,2:end]) # math scores
10-element Vector{Int64}:
 457
 465
 474
 488
 501
 509
 515
 521
 534
 564
income = [5:10:75; 90; 120] # midpoint of each bracket; 120 is a nominal value for the open-ended "100+" bracket
10-element Vector{Int64}:
   5
  15
  25
  35
  45
  55
  65
  75
  90
 120
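
The midpoints for the nine closed brackets can also be computed from the bracket labels instead of being typed by hand. The following is a small sketch under that assumption; the names `labels` and `midpoints` are introduced here for illustration, and the open-ended "100+" bracket is left out because it has no midpoint.

```julia
# Labels of the nine closed brackets, copied from the first row of the table.
labels = ["0 – 10", "10 – 20", "20 – 30", "30 – 40", "40 – 50",
          "50 – 60", "60 – 70", "70 – 80", "80 – 100"]

# Split each label at the dash, parse both endpoints, and average them.
midpoints = [sum(parse.(Int, split(s, " – "))) ÷ 2 for s in labels]

# These agree with the hand-written values for the closed brackets.
@assert midpoints == [5:10:75; 90]
```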
scatter(income, math; label="Data", legend = :bottomright,
 xaxis = ("Family Annual Income (1000\$)",),
 yaxis = ("SAT Average Math Score", (425,575), 425:50:575),
)
[Figure: scatter plot of SAT average math score versus family annual income (in $1000s).]
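
One possible starting point for the regression itself is an ordinary least-squares fit of a straight line to the plotted data. The sketch below is not necessarily the approach the homework intends. It assumes the affine model math ≈ intercept + slope × income, and the names `A`, `coef`, `fit`, and `residual` are introduced here for illustration.

```julia
# Data repeated from above so this block runs on its own.
income = [5:10:75; 90; 120]   # bracket midpoints (1000$); 120 stands in for "100+"
math = [457, 465, 474, 488, 501, 509, 515, 521, 534, 564]

# Design matrix for the assumed model: math ≈ intercept + slope * income.
A = [ones(length(income)) income]

# For a tall (overdetermined) matrix, backslash returns the least-squares solution.
coef = A \ math
intercept, slope = coef

fit = A * coef            # fitted scores at each income value
residual = math - fit     # data minus fit
```

To overlay the fitted line on the scatter plot, one could add `plot!` to the names imported from Plots (it is not in the `using` list above) and call `plot!(income, fit)`.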

This page was generated using Literate.jl.