Exercise for SIENA workshop at ASNA: Can clustering be completely described by homophily?

In doing this exercise we are going to put on two different hats: either clustering in networks arises because people choose similar others, or it arises as a result of endogenous network processes. The data were collected by Andrea Knecht, were used in the tutorial by Snijders et al. (2010), and are described on the SIENA website (under Data sets). This exercise also borrows heavily from an assignment designed by Christian Steglich.

Network data: 

There are network data for 4 waves and 25 actors: klas03e-friends-waveA.dat, klas03e-friends-waveB.dat, klas03e-friends-waveC.dat and klas03e-friends-waveD.dat. The missing data code is “9”.
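As a minimal sketch of reading one wave, the helper below loads an adjacency matrix and recodes the missing data code to NA. The function name read_wave is hypothetical, and the sketch assumes the files are plain whitespace-separated matrices without a header row.

```r
# Hypothetical helper: read one wave and recode the missing data code to NA.
# Assumes a whitespace-separated square matrix with no header row.
read_wave <- function(path, missing_code = 9) {
  m <- as.matrix(read.table(path))
  dimnames(m) <- NULL                 # drop the V1, V2, ... names from read.table
  m[m %in% missing_code] <- NA        # "9" marks a missing tie variable
  m
}

# Read all four waves if the files are in the working directory.
files <- paste0("klas03e-friends-wave", c("A", "B", "C", "D"), ".dat")
if (all(file.exists(files))) {
  waves <- lapply(files, read_wave)
}
```

Keeping the waves in a list makes it easy to stack them into a single array later.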

Constant covariate:

Sex for each of the 25 actors is the first column of klas03e-demographics.dat (1 = girl, 2 = boy).

Dyadic covariate:

A dyadic dummy indicating prior friendship in primary school is in klas03e-primary.dat.
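The three kinds of data map onto RSiena objects roughly as follows. This is a sketch, not the workshop's official script: the function name build_siena_data and its argument names are hypothetical, and it assumes the waves are already available as a list of square matrices with missing values coded as NA.

```r
library(RSiena)

# Hypothetical helper: combine the network waves, the constant covariate and
# the dyadic covariate into one RSiena data object.
build_siena_data <- function(waves, sex_values, primary_matrix) {
  n <- nrow(waves[[1]])
  # Stack the waves into an n x n x (number of waves) array.
  net_array <- array(unlist(lapply(waves, unname)),
                     dim = c(n, n, length(waves)))
  friendship <- sienaDependent(net_array)   # dependent network
  sex <- coCovar(sex_values)                # constant actor covariate
  primary <- coDyadCovar(primary_matrix)    # constant dyadic covariate
  # The object names used here become the variable names that
  # interaction1 = "sex" or "primary" refers to in includeEffects().
  sienaDataCreate(friendship, sex, primary)
}

# With the exercise files, sex_values could be taken as
# read.table("klas03e-demographics.dat")[, 1] and primary_matrix as
# as.matrix(read.table("klas03e-primary.dat")).
```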

Read in the data and plot the sociograms of wave 1 (file A) and wave 4 (file D), and of waves 2 and 3 if you like, with vertices coloured by “sex”. Does there appear to be clustering with respect to sex?
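One possible way to draw such a sociogram is sketched below. It assumes the igraph package, which the exercise does not prescribe (sna or network would work equally well). The function name sociogram_graph and the colour choices are hypothetical, and missing ties are treated as absent for plotting only.

```r
library(igraph)

# Hypothetical helper: turn one wave into a directed graph whose vertices
# are coloured by sex (1 = girl, 2 = boy).
sociogram_graph <- function(adj, sex) {
  adj[is.na(adj)] <- 0                      # plotting only: missing ties shown as absent
  g <- graph_from_adjacency_matrix(adj, mode = "directed")
  V(g)$color <- ifelse(sex == 1, "orange", "lightblue")
  g
}

# Toy illustration with three actors; replace with a real wave and sex vector.
toy_adj <- matrix(c(0, 1, 0,
                    0, 0, 1,
                    0, 0, 0), nrow = 3, byrow = TRUE)
toy_graph <- sociogram_graph(toy_adj, sex = c(1, 2, 1))
# plot(toy_graph, vertex.label = NA, edge.arrow.size = 0.3)
```

Using the same layout for all waves makes it easier to compare sociograms by eye.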

Model I

Fit a model that contains only density; the sender, receiver and similarity effects of sex; and the main effect of the dyadic covariate. Note that “reciprocity” is included by default, so you have to turn this effect off (by setting “include = FALSE” in “includeEffects( )”). According to this model, actors form ties independently of each other, conditional only on the exogenous attributes: there are no endogenous network effects.
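A hedged sketch of this specification in RSiena is given below. The function names specify_model_1 and fit_model, the project name, and the covariate names "sex" and "primary" are assumptions; the covariate names must match the object names passed to sienaDataCreate( ).

```r
library(RSiena)

# Hypothetical helper: Model I effects (density stays in by default).
specify_model_1 <- function(siena_data, covar = "sex", dyad = "primary") {
  eff <- getEffects(siena_data)
  # Sender (egoX), receiver (altX) and similarity (simX) effects of sex.
  eff <- includeEffects(eff, egoX, altX, simX, interaction1 = covar)
  # Main effect (X) of the dyadic covariate.
  eff <- includeEffects(eff, X, interaction1 = dyad)
  # Reciprocity is included by default, so switch it off.
  eff <- includeEffects(eff, recip, include = FALSE)
  eff
}

# Hypothetical helper: estimate a model with default algorithm settings.
fit_model <- function(siena_data, eff, project = "asna_exercise") {
  alg <- sienaAlgorithmCreate(projname = project)
  siena07(alg, data = siena_data, effects = eff, batch = TRUE)
}
```

Printing the returned effects object before estimating is a quick check that only the intended effects are included.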

Do boys or girls send more ties? Receive more ties? Is the intuition from the sociograms supported by the model, i.e. is there a homophily effect of sex?

Model II

To Model I, add reciprocity, transitive triplets, 3-cycles, indegree popularity (sqrt) and outdegree popularity (sqrt). These all represent endogenous network processes.
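In RSiena these five effects correspond to the short names recip, transTrip, cycle3, inPopSqrt and outPopSqrt. The sketch below adds them to an existing effects object; the function name add_endogenous_effects is hypothetical, and its argument is assumed to be an effects object that already holds the Model I specification.

```r
library(RSiena)

# Hypothetical helper: add the endogenous network effects of Model II
# to an existing effects object.
add_endogenous_effects <- function(eff) {
  includeEffects(eff,
                 recip,       # reciprocity
                 transTrip,   # transitive triplets
                 cycle3,      # 3-cycles
                 inPopSqrt,   # indegree popularity (sqrt)
                 outPopSqrt)  # outdegree popularity (sqrt)
}
```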

Do the effects of “sex” remain?

Model III

From Model II, remove the sender, receiver and similarity effects of sex and the main effect of the dyadic covariate.
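Removing effects uses the same “includeEffects( )” call with “include = FALSE”. The sketch below assumes an effects object holding the Model II specification; the function name drop_covariate_effects and the covariate names "sex" and "primary" are assumptions that must match your own data object.

```r
library(RSiena)

# Hypothetical helper: drop the exogenous covariate effects, leaving only
# the structural effects for Model III.
drop_covariate_effects <- function(eff, covar = "sex", dyad = "primary") {
  eff <- includeEffects(eff, egoX, altX, simX,
                        interaction1 = covar, include = FALSE)
  eff <- includeEffects(eff, X, interaction1 = dyad, include = FALSE)
  eff
}
```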

Do the parameters and standard errors differ much from Model II? Are there any structural effects that were explained by sex?

Looking at the structural effects (in both Model II and III), is there (local) clustering beyond what is explained by sex? Are 3-cycles consistent with (local) hierarchy and if so does the friendship network show signs of being hierarchical? Are there endogenous popularity effects?
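To build intuition for the last questions, it can help to count the two triadic configurations directly on a small network. The sketch below uses base R only; the function names are hypothetical, and it assumes a binary adjacency matrix with a zero diagonal and no missing values. It only counts configurations and is not a substitute for the SIENA estimates.

```r
# Number of transitive triplets: i -> j, j -> h and i -> h.
count_transitive_triplets <- function(a) {
  sum((a %*% a) * a)
}

# Number of 3-cycles: i -> j, j -> h and h -> i (each cycle counted once).
count_three_cycles <- function(a) {
  sum(diag(a %*% a %*% a)) / 3
}

# A strictly hierarchical triad: 1 -> 2, 2 -> 3, 1 -> 3.
hierarchy <- matrix(c(0, 1, 1,
                      0, 0, 1,
                      0, 0, 0), nrow = 3, byrow = TRUE)

# A cyclic triad: 1 -> 2, 2 -> 3, 3 -> 1.
cycle <- matrix(c(0, 1, 0,
                  0, 0, 1,
                  1, 0, 0), nrow = 3, byrow = TRUE)

count_transitive_triplets(hierarchy)  # 1
count_three_cycles(hierarchy)         # 0
count_transitive_triplets(cycle)      # 0
count_three_cycles(cycle)             # 1
```

The hierarchical triad contains a transitive triplet but no 3-cycle, while the cyclic triad shows the opposite pattern.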