Advertising data – model fitting and interpretation

Suppose we model the product sales as a function of the TV, radio and newspaper advertising budgets:

\[y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \epsilon\]

where \(y=\) sales, \(x_1=\) TV, \(x_2=\) radio, \(x_3=\) newspaper.

That is,

\[{\tt sales} = \beta_0 + \beta_1{\tt TV} + \beta_2{\tt radio} + \beta_3{\tt newspaper} + \epsilon.\]

adv <- read.csv("http://www.stats.ox.ac.uk/~laws/LMs/data/advert.csv")
adv.lm <- lm(sales ~ TV + radio + newspaper, data = adv)

Note: R automatically includes an intercept.

To explicitly include an intercept \(\beta_0\), use: lm(sales ~ 1 + TV + radio + newspaper, data = adv)

To explicitly exclude an intercept, use: lm(sales ~ -1 + TV + radio + newspaper, data = adv)

Normally we will want to include an intercept.
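
As a quick check (a minimal sketch, reusing adv and adv.lm from above), the formulas with and without the explicit 1 give identical fits, while the -1 version drops the intercept:

coef(adv.lm)
coef(lm(sales ~ 1 + TV + radio + newspaper, data = adv))   # identical to the above
coef(lm(sales ~ -1 + TV + radio + newspaper, data = adv))  # no (Intercept) term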

options(digits = 3)
summary(adv.lm)
## 
## Call:
## lm(formula = sales ~ TV + radio + newspaper, data = adv)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -8.828 -0.891  0.242  1.189  2.829 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.93889    0.31191    9.42   <2e-16 ***
## TV           0.04576    0.00139   32.81   <2e-16 ***
## radio        0.18853    0.00861   21.89   <2e-16 ***
## newspaper   -0.00104    0.00587   -0.18     0.86    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.69 on 196 degrees of freedom
## Multiple R-squared:  0.897,  Adjusted R-squared:  0.896 
## F-statistic:  570 on 3 and 196 DF,  p-value: <2e-16
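
Beyond summary(), the standard extractor functions can be used to pull individual quantities from the fitted object (a brief sketch using adv.lm from above):

coef(adv.lm)     # estimated coefficients
confint(adv.lm)  # 95% confidence intervals for the coefficients
fitted(adv.lm)   # fitted values
resid(adv.lm)    # residuals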

Interpreting the coefficients in this model: holding the radio and newspaper budgets fixed, an increase of 1 in the TV budget is associated with an increase of about 0.046 in sales; with TV and newspaper fixed, an increase of 1 in the radio budget is associated with an increase of about 0.189 in sales; and with TV and radio fixed, an increase of 1 in the newspaper budget is associated with a change of about \(-0.001\) in sales.

Here an “increase of 1” in a budget means an increase of one thousand dollars, and an “increase of 0.046” in sales means an increase of 0.046 thousand, i.e. 46, units of sales. That is, units of measurement matter. Equally, we could say that the amounts (0.046, 0.189, \(-0.001\)) give the predicted change in sales (in thousands of units) in each of the three cases.
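
We can check this interpretation numerically by comparing predictions at budgets that differ by 1 in a single variable (a minimal sketch; the baseline budgets 100, 20 and 30 are arbitrary illustrative values):

new <- data.frame(TV = c(100, 101), radio = c(20, 20), newspaper = c(30, 30))
diff(predict(adv.lm, newdata = new))  # equals the estimated TV coefficient, about 0.046

Because the model is linear, this difference is exactly \(\hat\beta_1\), whatever baseline values we choose.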

These interpretations correspond to changing one explanatory variable while holding the others constant. However, it is not always possible to do this. E.g. suppose we have a model that includes both \(x\) and \(x^2\) as explanatory variables – it is not possible to change one of these two without changing the other. Some data will have similar (though less extreme) features – some explanatory variables may be highly correlated, so that as one variable changes, another tends to change too.
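
Both situations are easy to explore in R (a short sketch: I() protects the squared term in the formula, and cor() shows how strongly the budgets move together):

adv.lm2 <- lm(sales ~ TV + I(TV^2), data = adv)  # model with x and x^2 terms
cor(adv[, c("TV", "radio", "newspaper")])        # pairwise correlations of the budgets

If two budgets turn out to be highly correlated, then changing one while holding the other fixed describes a situation rarely seen in the data.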

We should not make statements of causality such as “increasing \(x_1\) by 1 causes an increase of xxx in \(y\)”. Rather we prefer to say “increasing \(x_1\) by 1 is associated with an increase of xxx in \(y\)”. For example, some other variable(s) could be the actual cause of increases in both \(x_1\) and \(y\).