cat("An example based on made-up data to demonstrate the difference between the two types of interaction terms that can arise in GLM analyses.\n\nIn this example, we imagine that we have points in categories A and B, one response variable Z and two predictor variables X and Y (it does not matter what they represent).\n\n")

choice1=menu(c("In category B, the Z numbers are random",
               "In category B, the Z numbers follow a similar functional dependency to category A",
               "In category B, the Z numbers follow a different functional dependency to category A"))
#choice1=3

Xuncert=0.5 #sd of the noise added to Z
Yuncert=70  #sd of Y

#Category A
na=7
Xa=rnorm(na,mean=15,sd=5)
Ya=rnorm(na,mean=0,sd=Yuncert)
Za=(5*Xa)+3+rnorm(na,mean=0,sd=Xuncert)
fit1=glm(Za~Xa+Ya)
summary(fit1)
#From model selection, we drop Ya and the fit becomes:
fit1=glm(Za~Xa)
summary(lm(Za~Xa))$r.squared #As expected, r2 is high ...
fit1 #...and the coefficients are approximately 3 for the Intercept and 5 for Xa (call this functional dependence F)

#Category B
nb=30
Xb=rnorm(nb,mean=0,sd=1)
Yb=rnorm(nb,mean=0,sd=Yuncert)
if (choice1==1) {Zb=mean(Za)+rnorm(nb,mean=0,sd=Xuncert)}
if (choice1==2) {Zb=(10*Xb)+45+rnorm(nb,mean=0,sd=Xuncert)}
if (choice1==3) {Zb=(10*Yb)+45+rnorm(nb,mean=0,sd=Xuncert)}
fit2=glm(Zb~Xb+Yb)
summary(fit2)
#For choice1==1: From model selection, we drop Xb and Yb and the fit becomes:
fit2=glm(Zb~1)
summary(lm(Zb~1))$r.squared #For choice1==1: As expected, r2 is low ...
fit2 #For choice1==1: ...and the best fit is the null model, with just an Intercept close to the value of mean(Za)
#For choice1==2: From model selection, we drop Yb and the fit becomes:
fit2=glm(Zb~Xb)
summary(lm(Zb~Xb))$r.squared #For choice1==2: As expected, r2 is high ...
fit2 #For choice1==2: ...and the best fit is the linear fit with coefficients approximately 45 (Intercept) and 10 (Xb)
#For choice1==3: From model selection, we drop Xb and the fit becomes:
fit2=glm(Zb~Yb)
summary(lm(Zb~Yb))$r.squared #For choice1==3: As expected, r2 is high ...
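#The "model selection" referred to in the comments above can be sketched with
#drop1(), which tests the effect of removing each term in turn (this is a
#sketch added for illustration, not part of the original analysis; AIC-based
#step() is an alternative):
drop1(glm(Zb~Xb+Yb),test="F") #terms with high p-values are candidates for dropping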
fit2 #For choice1==3: ...and the best fit is the linear fit with coefficients approximately 45 (Intercept) and 10 (Yb)

#Now analyse all in one go, with interaction terms:
catgry=c(rep("A",times=na),rep("B",times=nb))
X=c(Xa,Xb);Y=c(Ya,Yb);Z=c(Za,Zb)
df=data.frame(catgry,X,Y,Z)
#Y is independent of catgry (the same formula is used to generate it irrespective of the value of catgry), but X isn't.
fit3=glm(Z~X*Y*catgry,data=df)
summary(fit3)
cat("See the script code comments for an explanation of what this all means.\n")

#With choice1==1, the GLM succeeds in returning the right result: during model selection, we drop X:Y:catgry, X:Y, Y:catgry and Y (in that order) and the fit then shows something like:
#Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
#(Intercept)  2.41140    0.45690    5.278 8.13e-06 ***
#X            5.03250    0.03118  161.390  < 2e-16 ***
#catgryB     71.47891    0.46066  155.166  < 2e-16 ***
#X:catgryB   -5.05751    0.07732  -65.414  < 2e-16 ***
#which is interpreted as (approx.) Z=(5*X)+3+ifelse(catgry=="B",mean(Za)-3-(5*X),0), i.e. Z=(5*X)+3 for category A and Z=mean(Za) for category B.

#With choice1==2, the GLM also succeeds in returning the right result: during model selection, we drop X:Y:catgry, X:Y, Y:catgry and Y (in that order) and the fit then shows something like:
#Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
#(Intercept)  2.72072    0.35834    7.593 9.77e-09 ***
#X            5.00045    0.02498  200.147  < 2e-16 ***
#catgryB     42.26303    0.36264  116.541  < 2e-16 ***
#X:catgryB    4.94054    0.07428   66.517  < 2e-16 ***
#which is interpreted as (approx.) Z=(5*X)+3+ifelse(catgry=="B",42+(5*X),0), i.e. Z=(5*X)+3 for category A and Z=(10*X)+45 for category B.

#With choice1==3, the GLM also succeeds in returning the right result: during model selection, we drop X:Y:catgry and X:Y (in that order) and the fit then shows something like:
#Coefficients:
#              Estimate Std. Error  t value Pr(>|t|)
#(Intercept)  2.666115   0.380620    7.005 7.31e-08 ***
#X            5.014154   0.024901  201.364  < 2e-16 ***
#Y           -0.002272   0.001881   -1.208    0.236
#catgryB     42.336590   0.384565  110.090  < 2e-16 ***
#X:catgryB   -5.070247   0.064399  -78.732  < 2e-16 ***
#Y:catgryB   10.002734   0.002032 4921.584  < 2e-16 ***
#(we keep Y, but its effect is negligible). This is interpreted as (approx.) Z=(5*X)+3+ifelse(catgry=="B",42-(5*X)+(10*Y),0), i.e. Z=(5*X)+3 for category A and Z=45+(10*Y) for category B. Note, however, that we only got to this result by allowing Y:catgry as an interaction term.

cat("Essentially, it all works fine in these examples IF we allow that mysterious term Y:catgry. This is the interaction term!\n\n")
cat("This interaction term is not the same as an association or collinearity term: A:B being significant does NOT mean the opposite of \"A and B are independent\" (a concept that does not involve the response variable Z at all). It actually means something defined in terms of Z: \"A and B have an interaction with respect to Z if the dependence of Z on A is modified by a change in the value of B, and/or the dependence of Z on B is modified by a change in the value of A\". This is a completely different statistical concept (e.g. Y here is independent of catgry, but we need the Y:catgry term anyway to get the right GLM fit).\n")
cat("This interaction term is also not the same as C:D where C and D are continuous variables: that is simply the multiplicative quantity C*D. Interaction terms of this kind only arise, obviously, when you include categorical variables. One way around the need for them is to rerun your GLM fits for each subset of your data 'by hand'.\n")
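#To illustrate that last point, here is a sketch of the 'by hand' alternative:
#fit each category separately, so no interaction terms are needed (the subset
#data frames dfA and dfB are names introduced here, not used elsewhere in the
#script):
dfA=subset(df,catgry=="A")
dfB=subset(df,catgry=="B")
summary(glm(Z~X+Y,data=dfA)) #model selection within A alone recovers F, i.e. Z=(5*X)+3
summary(glm(Z~X+Y,data=dfB)) #model selection within B alone recovers the category-B dependence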