Description of the students data set of Evelien Zeggelink

This data set was collected by Evelien Zeggelink in collaboration with Gerhard van de Bunt. It is similar to the van de Bunt data, but collected in a later year.



The data were collected among the freshman cohort majoring in Sociology at the University of Groningen in 1998-1999. Except for a few existing relationships (acquaintances from a former school), the students did not know each other at the first measurement. The data were collected at seven time points, but two of these are missing; the time points described here are t0, t2, t3, t5, and t6. The first time point, t0, was at the start of the freshman year; t2 and t3 were 6 and 9 weeks later, respectively; t5 and t6 were about 21 and 27 weeks after t0. There were 34 students, with varying amounts of nonresponse at the later time points.

Sociology in Groningen is a relatively small discipline. In the 1990s, the number of freshmen each year varied roughly between 30 and 50. In the Netherlands, anyone with the highest-level high school certificate can in principle enter university. Students enter university either right after high school (at the age of about 18), or after first finishing higher vocational training (at the age of about 22). The latter students followed a special two-year program instead of the regular four-year program. This means that they attended different classes, at different times, than the regular students. The classes of the two programs overlapped only partly. As a result, 'program' has aspects of a proximity variable.

All classes took place in the same building, which housed the classrooms, offices, and cafeteria. During class breaks students could drink coffee or tea, or have lunch, in this cafeteria. The cafeteria was divided into a smoking area and a non-smoking area upstairs. This gives us a second proximity variable, smoking behavior, because students who wanted to smoke needed to use a different part of the cafeteria.

Program is more than a proximity variable, because students who followed the same program are in general about the same age. Those who followed the regular program will on average have been 18 years old, and those who followed the short program on average 22 years old. Moreover, many of those who followed the short program had left their parents' home earlier, and had lived in the city of Groningen for a longer period. As such, their starting point differed considerably from that of the "younger" students. The academic aspirations of the students in the short program will in general also be stronger: they are more mature and probably made a more deliberate choice to study sociology. As a result, the program variable captures not only proximity effects but also a number of similarity effects.

The first questionnaire was filled out while most of the freshmen were taking part in the so-called introduction period on the island of Schiermonnikoog (in the Wadden Sea), where they got acquainted with each other. All subsequent questionnaires were handed out during lectures, and the students were allowed to fill them out at home. They were encouraged not to discuss their answers with their fellow students. Where possible, students were reminded during lectures to return the questionnaires.

Friendship was measured in the same way as for the van de Bunt data; see van de Bunt, van Duijn, and Snijders (1999). The students were asked to rate their relationships on a six-point scale, with response categories described as follows.

Label: description of the response categories

  1. Best friendship: Persons whom you would call your 'real' friends
  2. Friendship: Persons with whom you have a good relationship, but whom you do not (yet) consider a 'real' friend
  3. Friendly relationship: Persons with whom you regularly have pleasant contact during classes. The contact could grow into a friendship
  4. Neutral relationship: Persons with whom you have not much in common. In case of an accidental meeting the contact is good. The chance of it growing into a friendship is not large
  0. Unknown person: Persons whom you do not know
  5. Troubled relationship: Persons with whom you can't get on very well, and with whom you definitely do not want to start a relationship. There is a certain risk of getting into a conflict

In addition to the sociometric data, a number of individual characteristics are available: sex, age, education program, and smoking behavior.


The digraph data files are stu98t0.txt to stu98t6.txt. The networks are coded as 0 = unknown, 1 = best friend, 2 = friend, 3 = friendly relation, 4 = neutral, 5 = troubled relation, 6 = item non-response, 9 = actor non-response. For the measurements after t0, the code 8 is used for the diagonal. Codes 6 and 9 are nominally missing data codes. However, the questionnaires make clear that code 6 was also used to express 'unknown', so it is probably best to recode 6 to 0.
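
As a rough illustration of the coding described above, the following Python sketch parses one network matrix and applies the suggested recoding. It rests on assumptions that are not stated in this description: that each file holds a whitespace-separated square matrix of integer codes, and that diagonal cells can simply be treated as missing. The function names parse_network and recode and the toy matrix are hypothetical and are not part of the data set.

```python
# Sketch only. Assumed (not documented above): each network file is a
# whitespace-separated square matrix of integer codes, one row per student.

ITEM_NONRESPONSE = 6   # per the description, often used to mean 'unknown'
DIAGONAL = 8           # diagonal code for measurements after t0
ACTOR_NONRESPONSE = 9  # the respondent did not return the questionnaire


def parse_network(text):
    """Parse whitespace-separated rows of integer codes into a list of lists."""
    rows = [[int(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]
    if any(len(row) != len(rows) for row in rows):
        raise ValueError("expected a square matrix")
    return rows


def recode(matrix):
    """Return a copy with 6 recoded to 0 and with 8, 9, and the diagonal as None."""
    recoded = []
    for i, row in enumerate(matrix):
        new_row = []
        for j, code in enumerate(row):
            if i == j or code in (DIAGONAL, ACTOR_NONRESPONSE):
                new_row.append(None)   # treated as missing
            elif code == ITEM_NONRESPONSE:
                new_row.append(0)      # suggested recoding: 6 -> unknown
            else:
                new_row.append(code)
        recoded.append(new_row)
    return recoded


# Made-up 3 x 3 example, not taken from the real files.
toy_wave = "8 1 6\n3 8 9\n4 2 8\n"
recoded_wave = recode(parse_network(toy_wave))
```

To read a real file under the same assumptions, the contents of, for example, stu98t2.txt could be passed to parse_network as a string. Whether to keep code 6 as missing instead of recoding it to 0 remains a judgment call for the analyst.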

The actor attributes, collected at t0, are in the file stud98.txt. The variables are:

  1. Id number
  2. Gender: 1 = male; 2 = female
  3. Age in years
  4. Program: 1 = regular 4-year program; 2 = 2-year program
  5. Smoking: 1 = no; 2 = yes, at parties (social); 3 = 1-3 cigarettes per day; 4 = 4-10 per day; 5 = more than 10 per day; 99 = missing.
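
The variable list above can be turned into labeled records with a short Python sketch like the one below. It assumes, without confirmation from this description, that stud98.txt contains one line per student with the five variables as whitespace-separated integers in the order listed. The Student class, the parse_attributes function, and the two example rows are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Code-to-label mappings taken from the variable list above.
GENDER = {1: "male", 2: "female"}
PROGRAM = {1: "regular 4-year program", 2: "2-year program"}
SMOKING = {
    1: "no",
    2: "yes, at parties (social)",
    3: "1-3 cigarettes per day",
    4: "4-10 per day",
    5: "more than 10 per day",
}
SMOKING_MISSING = 99


@dataclass
class Student:
    id: int
    gender: str
    age: int
    program: str
    smoking: Optional[str]  # None when the smoking code is 99 (missing)


def parse_attributes(text):
    """Parse lines of 'id gender age program smoking' (assumed layout)."""
    students = []
    for line in text.splitlines():
        if not line.strip():
            continue
        sid, gender, age, program, smoking = (int(tok) for tok in line.split())
        smoking_label = None if smoking == SMOKING_MISSING else SMOKING[smoking]
        students.append(
            Student(sid, GENDER[gender], age, PROGRAM[program], smoking_label)
        )
    return students


# Made-up rows for illustration; not taken from stud98.txt.
toy_attributes = "1 2 18 1 1\n2 1 22 2 99\n"
students = parse_attributes(toy_attributes)
```

Keeping the missing smoking code as None, not as the number 99, avoids accidentally treating it as a sixth smoking category in later analysis.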

