STAT 1103 Week 11 Notes: Simple Linear Regression

Summary

Difficulty: ★★★★☆

Covers: Simple linear regression, Regression vs correlation, Regression line and prediction equation, Residuals and prediction error, Explained variance and R-squared, Hypothesis testing of the slope, Regression assumptions, Running and interpreting regression in Stata

What is Regression?

Simple linear regression is a statistical method that predicts a numeric outcome (Y) from a numeric predictor (X).

It answers the question:

How much does Y change when X changes?

Regression is used when we want to predict, not just describe relationships.

Regression vs Correlation

Correlation and regression are closely related but serve different purposes.

Correlation (r)
Describes the strength and direction of the relationship between X and Y.

Regression
Uses that relationship to predict Y from X and explain variation in Y.

Regression can be thought of as an extension of correlation.

In simple linear regression:

R² = r²

This means the proportion of variance in Y explained by the regression equals the squared correlation. For example, r = .50 gives R² = .25.
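The identity R² = r² can be checked by hand. Below is a minimal Python sketch (not part of the Stata workflow in these notes) that uses a small made-up dataset: it computes Pearson's r directly, then fits the least-squares line and computes R² from the residuals. The data and variable names are illustrative assumptions only.

```python
from math import sqrt

# Illustrative (made-up) data, not from the course
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products around the means
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Pearson correlation
r = sxy / sqrt(sxx * syy)

# Least-squares line, then R² as 1 - (residual SS / total SS)
b = sxy / sxx
a = mean_y - b * mean_x
ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - ss_res / syy

print(f"r = {r:.3f}, r squared = {r ** 2:.3f}, R squared = {r_squared:.3f}")
```

The two routes give the same number, which is why a simple regression and a correlation on the same two variables always agree about variance explained.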

Regression does not prove causation

Regression is usually used in correlational (non-experimental) research.

Even if X predicts Y, this does not mean X causes Y.

Causation requires experimental design and converging evidence.

Regression shows prediction, not proof of cause.

When do we use regression?

Common research designs include:

  • cross-sectional surveys (measure variables once)
  • longitudinal studies (earlier measures predict later outcomes)

The regression line

Regression finds the line of best fit through a scatterplot.

This line summarises the relationship between X and Y.

The regression equation is:

ŷ = a + bX

Where:

Symbol | Meaning     | Plain English
ŷ      | predicted Y | predicted outcome value
a      | intercept   | predicted Y when X = 0
b      | slope       | change in predicted Y for a 1-unit increase in X

Interpreting the slope (b)

The slope is the key result.

It tells you how much Y changes when X increases by one unit.

  • Positive b → higher X predicts higher Y
  • Negative b → higher X predicts lower Y

Example:

b = 0.26
Each 1-unit increase in X predicts a 0.26-unit increase in Y.
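To see where a and b come from, here is a rough Python sketch using the standard least-squares formulas (slope = co-variation of X and Y divided by variation in X; intercept = mean of Y minus slope times mean of X). The function name `fit_line` and the data are made up for illustration; in practice Stata's regress does this for you.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line y-hat = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
print(f"y-hat = {a:.2f} + {b:.2f}X")

# Moving X up by one unit changes the prediction by exactly b
change = (a + b * 4) - (a + b * 3)
print(f"predicted change for +1 in X: {change:.2f}")
```

Whatever two X values one unit apart you pick, the difference in predicted Y is the slope, which is exactly what "change in Y for a 1-unit increase in X" means.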

Residuals (prediction errors)

A residual is the difference between the observed and the predicted value of Y:

Residual = Y − ŷ

Residuals measure prediction error.

  • small residuals → good prediction
  • large residuals → poor prediction

Residual sign | Meaning
Positive      | point lies above the regression line
Negative      | point lies below the regression line

Regression uses least squares: it chooses the a and b that minimise the sum of squared residuals.
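A small Python sketch of residuals and the least-squares idea, using made-up data. It lists each residual with its sign, then compares the sum of squared residuals for the fitted line against two slightly different lines. The helper `sse` and the alternative lines are hypothetical choices for illustration.

```python
# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x

# Residual = observed Y minus predicted Y
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
for xi, yi, e in zip(x, y, residuals):
    position = "above" if e > 0 else "below" if e < 0 else "on"
    print(f"X = {xi}, Y = {yi}, residual = {e:+.2f} ({position} the line)")


def sse(a_try, b_try):
    """Sum of squared residuals for the line y-hat = a_try + b_try * X."""
    return sum((yi - (a_try + b_try * xi)) ** 2 for xi, yi in zip(x, y))


best = sse(a, b)
steeper = sse(a, b + 0.1)   # same intercept, steeper slope
higher = sse(a + 0.5, b)    # same slope, higher intercept
print(f"fitted line: {best:.2f}, steeper: {steeper:.2f}, higher: {higher:.2f}")
```

Any line other than the fitted one gives a larger sum of squared residuals. The residuals from the fitted line also sum to zero, so positive and negative errors balance out.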

Explained variance (R²)

R² tells us how much of the variation in Y is accounted for by X.

R² represents:

explained variance ÷ total variance in Y

R² ranges from 0 to 1 and is often reported as a percentage.

Examples:

  • R² = .03 → 3% explained (small)
  • R² = .25 → 25% explained (large)
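The "explained ÷ total" ratio can be sketched in Python with made-up data. Total variation is measured by how far each observed Y is from the mean of Y; explained variation by how far each predicted value is from that same mean. The variable names are illustrative assumptions.

```python
# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x

# Total variation in Y: observed values around the mean
ss_total = sum((yi - mean_y) ** 2 for yi in y)

# Explained variation: predicted values around the mean
ss_explained = sum(((a + b * xi) - mean_y) ** 2 for xi in x)

r_squared = ss_explained / ss_total
print(f"R squared = {r_squared:.2f} ({r_squared:.0%} of variance explained)")
```

With only five invented points the R² here is much higher than is typical for real survey data, so the value itself should not be read as a benchmark.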

Hypothesis testing in regression

We test whether the slope differs from zero.

Hypotheses:

H₀: b = 0 (X does not predict Y)
H₁: b ≠ 0 (X predicts Y)

Decision rule:

  • p < .05 → significant predictor
  • p ≥ .05 → not significant

The test statistic, which has n − 2 degrees of freedom, is:

t = b / SE(b)
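As a rough illustration of t = b / SE(b), the Python sketch below computes the pieces by hand for made-up data. It assumes the usual textbook formula for the standard error of the slope, SE(b) = √(residual variance ÷ variation in X), which these notes do not spell out. It stops at t: the p-value is read from Stata's regress output.

```python
from math import sqrt

# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least-squares slope and intercept
b = sxy / sxx
a = mean_y - b * mean_x

# Residual variance uses n - 2 degrees of freedom (two estimates: a and b)
ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
df = n - 2
residual_variance = ss_res / df

# Standard error of the slope, then the t statistic
se_b = sqrt(residual_variance / sxx)
t = b / se_b

print(f"b = {b:.2f}, SE(b) = {se_b:.3f}, t({df}) = {t:.2f}")
```

With only five points (3 degrees of freedom), |t| has to exceed about 3.18 to reach p < .05 two-tailed, so this toy slope would not count as significant even though b looks sizeable. Small samples give large standard errors.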

Assumptions of regression

Check three key assumptions:

  1. Relationship between X and Y is roughly linear
  2. Residuals are approximately normal
  3. Residual spread is constant across predicted values (no funnel shape or other pattern)

If these are met, the standard error and p-value for the slope can be trusted.

Running regression in Stata

Regression command:

regress y x

Visual check:

graph twoway (scatter y x) (lfit y x)

Residual checks:

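* save the residuals in a new variable called r
predict r, residual
* histogram of residuals: look for a roughly bell-shaped distribution
histogram r
* Shapiro-Wilk normality test: p > .05 suggests no clear departure from normality
swilk r
* residuals vs fitted values: look for an even band around 0 with no pattern
rvfplot, yline(0)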

Reading regression output

Output          | Meaning
b (coefficient) | direction and size of prediction
p-value         | significance of predictor
R² (R-squared)  | proportion of variance explained

Using regression to predict

Once you know a and b:

ŷ = a + bX

You can predict Y for any value of X.
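Plugging values into the equation can be sketched in a few lines of Python. The intercept and slope below are hypothetical numbers chosen for illustration, not results from any dataset in these notes; in practice you would copy a and b from your own regress output.

```python
def predict_y(x_value, a, b):
    """Predicted Y from the regression equation y-hat = a + bX."""
    return a + b * x_value


# Hypothetical intercept and slope, for illustration only
a = 2.2
b = 0.6

for x_value in (0, 2.5, 4):
    print(f"X = {x_value}: predicted Y = {predict_y(x_value, a, b):.2f}")
```

Note that at X = 0 the prediction is just the intercept a, which matches its definition in the regression equation.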

This is used for:

  • forecasting
  • policy decisions
  • real-world prediction

Reporting regression results

Significant result:

X significantly predicted Y, b = __, p = __, explaining __% of variance (R²).

Not significant:

X did not significantly predict Y, b = __, p = __.

🌼 About Daisy

Hi! I’m Daisy, the voice behind The Psych Diaries. I’m a psychology student sharing study notes, templates, and honest rambles about university life.
