Description students data set Marijtje van Duijn

This data set was collected by Gerhard van de Bunt, Marijtje van Duijn, Frans Stokman, and Evelien Zeggelink, and is discussed extensively in van Duijn, Zeggelink, Huisman, Stokman, and Wasseur (2003). It is rather similar to the van de Bunt data, but collected in a later year.

Download the data set.

Background

The data were collected among the freshman cohort majoring in Sociology at the University of Groningen in 1996-1997. Except for a few existing relationships (acquaintances from a former school), the students did not know each other at the first measurement. The data were collected at five time points. The first time point was at the start of the freshman year; the further time points were 3, 6, 13, and 35 weeks later, respectively. The number of responding students was 38, 25, 28, 18, and 18, respectively. At the first measurement point, these were practically all freshman students. Some of the decrease was due to dropout. The data for the last time point are not quite reliable: some of the questionnaires were handed in before and some after an unexpected event that had a great impact on the students.

Sociology in Groningen is a relatively small discipline. Over the five years before this data collection, the number of freshmen each year varied roughly between 30 and 50. In the Netherlands, in principle everyone with the highest high school certificate can enter university. Students enter college right after high school (at the age of about 18 years), or after having finished higher vocational training first (at the age of about 22). The latter students followed a special program of two years instead of the regular four-year program. This means that these students followed different classes at different points in time than the regular students. The classes of these two programs overlapped only partly. As a result, 'program' has aspects of a proximity variable. All classes took place in the same building, in which the classrooms, offices, and cafeteria were situated. During class breaks students could drink coffee or tea, or have lunch, in this cafeteria. The cafeteria was divided into a smoking area and a non-smoking area upstairs. This provides a second proximity variable, smoking behavior, because students who wanted to smoke needed to use a different part of the cafeteria.

The two programs are more than a proximity variable because students who followed the same program are in general of about the same age. Those who followed the regular program will on average have been 18 years old, and those who followed the short program on average 22 years. Moreover, many of those who followed the short program had left their parents' home earlier, and had lived in the city of Groningen for a longer period. As such, their starting point differed tremendously from that of the "younger" students. The academic aspirations of the students of the short program will also be stronger in general. They are more mature and probably made a more deliberate choice to study sociology. As a result, the program variable captures not only proximity effects but also a number of similarity effects.

Questionnaires were presented to the students seven times during their first year at the university. In five questionnaires the networks were measured, sometimes together with other information; in the other two questionnaires, only individual information related to the invisible similarity variables was asked of the students. The first questionnaire was filled out while most of the freshmen participated in the so-called introduction period on the island of Schiermonnikoog (in the Wadden Sea) to get acquainted with each other. All subsequent questionnaires were handed out during lectures and the students were allowed to fill them out at home. They were encouraged not to discuss their answers with their fellow students. If possible, students were reminded during lecture times to return the questionnaires.

The measurement of friendship was the same as for the van de Bunt data; see van de Bunt, van Duijn, and Snijders (1999). The students were asked to rate their relationships on a six-point scale, with response categories described as follows.

Label: Description of the response categories
1. Best friendship: Persons whom you would call your 'real' friends
2. Friendship: Persons with whom you have a good relationship, but whom you do not (yet) consider a 'real' friend
3. Friendly relationship: Persons with whom you regularly have pleasant contact during classes. The contact could grow into a friendship
4. Neutral relationship: Persons with whom you have not much in common. In case of an accidental meeting the contact is good. The chance of it growing into a friendship is not large
0. Unknown person: Persons whom you do not know
5. Troubled relationship: Persons with whom you can't get on very well, and with whom you definitely do not want to start a relationship. There is a certain risk of getting into a conflict

Next to the sociometric data, a number of individual characteristics are available, including sex, education program, and smoking behavior.

Coding

The digraph data files are t1.dat to t5.dat. The networks are coded as 0 = unknown, 1 = best friend, 2 = friend, 3 = friendly relation, 4 = neutral, 5 = troubled relation, 6 = item non-response, 9 = actor non-response. Note that 6 and 9 are missing data codes.
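The coding above can be illustrated with a minimal Python sketch. It assumes that each network file holds a whitespace-delimited square matrix of integer codes, which this description does not state, and it uses a made-up three-actor matrix rather than real data. The function names are hypothetical, and treating codes 1 and 2 as a friendship tie is one possible analysis choice, not a prescribed one.

```python
# Codes as documented above; 6 and 9 are the missing-data codes.
LABELS = {
    0: "unknown",
    1: "best friend",
    2: "friend",
    3: "friendly relation",
    4: "neutral",
    5: "troubled relation",
}
MISSING_CODES = {6, 9}  # item non-response, actor non-response


def parse_network(text):
    """Parse a whitespace-delimited square matrix of integer codes (assumed layout)."""
    rows = [[int(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]
    n = len(rows)
    if any(len(row) != n for row in rows):
        raise ValueError("expected a square matrix")
    return rows


def recode_missing(matrix):
    """Replace the missing-data codes 6 and 9 with None."""
    return [[None if v in MISSING_CODES else v for v in row] for row in matrix]


def dichotomize(matrix, tie_codes=(1, 2)):
    """Return 1 where the code is in tie_codes, 0 otherwise, None where missing."""
    return [[None if v is None else int(v in tie_codes) for v in row]
            for row in matrix]


# Made-up 3-actor example, not taken from t1.dat.
sample = """\
0 1 4
2 0 6
9 9 9
"""
network = recode_missing(parse_network(sample))
friendship = dichotomize(network)
```

In this sketch the third actor did not respond at all (code 9 throughout), so the whole row becomes missing rather than being read as an absence of ties.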

The actor attributes are in the files cov1.dat and cov4.dat, collected at the same time points as the network data sets t1.dat and t4.dat, respectively. The first variables are:

  1. Id number
  2. Gender: 1= male; 2=female
  3. Program: 1= regular 4-year program; 2=2-year program
  4. Smoking: 0=no; 2=yes, at parties (social); 3=1-3 cigarettes per day, 4=4-10 p.d.; 5= more than 10 p.d.
  5. Using soft-drugs: 1=no, 2= yes, less than once a month; 3=yes, 1-3 times p.m.; 4=yes, 1-3 times per week; 5=yes, more than 3 times p.w.
  6. The next 13 columns consist of scores on an importance of or interest in a (social) activity. The respondent indicated the amount of attention given to the activity on a visual analogue scale (VAS) with anchors 1-10 (under the line) and none-much (above the line). The respondent was instructed to indicate their scores first for the activity of least importance etc. The list at the first measurement consisted of 22 activities, of which 13 were also asked at the fourth measurement. The recorded scores are from 0-42 (the width in mm of the VAS).

  7. Going out
  8. Going to a concert or movie
  9. Having dinner at home
  10. Exercising/doing sports
  11. Watching sports
  12. Having coffee in the university cafeteria
  13. Religious involvement/Church activities
  14. Listening to music
  15. Discussing politics
  16. Discussing personal feelings
  17. Making jokes/having fun
  18. Student association involvement/organizing activities
  19. Discussing the classes or study program (sociology).

  20. Satisfaction with number of "real" friends or friends (categories 1 or 2 in network question): 1=very dissatisfied; 2=rather dissatisfied; 3=not satisfied/not dissatisfied; 4=rather satisfied; 5=very satisfied.
  21. Too few/many friends: 1=far too few; 2=too few; 3=exactly right number; 4=too many; 5=far too many.
  22. Which part of current friends live in the city of Groningen? 1=none; 2=approx. 25%; 3=approx 50%; 4=approx. 75%; 5=100%.
99 indicates missing values (only in cov4.dat). See van Duijn et al. (2003) for further information about this network and the actor attributes.
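As a rough illustration of handling the missing-value code, the following Python sketch names the first five columns and turns 99 into None. It assumes whitespace-delimited rows of integer codes, which this description does not state. The column names and the function name are hypothetical, and the sample rows are made up and truncated to five columns.

```python
MISSING = 99  # documented above as occurring only in cov4.dat

# Hypothetical names for the first five variables listed above.
FIRST_COLUMNS = ("id", "gender", "program", "smoking", "soft_drugs")


def parse_covariates(text):
    """Parse whitespace-delimited rows (assumed layout) into dicts.

    Only the first five columns are named; 99 becomes None in every
    column except the id column.
    """
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        values = [int(tok) for tok in line.split()]
        record = {}
        for name, value in zip(FIRST_COLUMNS, values):
            if name != "id" and value == MISSING:
                value = None
            record[name] = value
        records.append(record)
    return records


# Made-up rows, not taken from cov4.dat.
sample = """\
1 2 1 2 1
2 1 2 99 3
"""
records = parse_covariates(sample)
```

Keeping missing values as None rather than 99 prevents them from being mistaken for a valid score in later summaries.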

References


