Abstract
Geographic proximity is a determinant factor of friendship. Friendship datasets that include detailed geographic information are scarce, and when this information is available, the dependence of friendship on distance is often modelled by pre-specified parametric functions or derived from theory without further empirical assessment. This paper aims to give a detailed representation of the association between distance and the likelihood of friendship existence and friendship dynamics, and how this is modified by a few basic social and individual factors. The data employed is a three-wave network of 336 adolescents living in a small Swedish town, for whom information has been collected on their household locations.
The analysis is a three-step process that combines (1) nonparametric logistic regressions to unravel the overall functional form of the dependence of friendship on distance, without assuming it has a particular strength or shape; (2) parametric logistic regressions to construct suitable transformations of distance that can be employed in (3) stochastic models for longitudinal network data, to assess how distance, individual covariates, and network structure shape adolescent friendship dynamics. It was found that the log-odds of friendship existence and friendship dynamics decrease smoothly with the logarithm of distance.
For adolescents in different schools the dependence is linear, and stronger than for adolescents in the same school. Living nearby accounts, in this dataset, for an aspect of friendship dynamics that is not explicitly modelled by network structure or by individual covariates. In particular, the estimated distance effect is not correlated with reciprocity or transitivity effects.
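The parametric step (2) described above can be illustrated with a minimal sketch on synthetic data: a logistic regression of tie existence on the logarithm of distance. Everything here is an assumption made for illustration (the simulated distances, the coefficient values, and the helper name `fit_logistic`); it is not the authors' analysis or their data.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))      # fitted tie probabilities
        w = p * (1.0 - p)                          # Bernoulli variances
        grad = X.T @ (y - p)                       # score vector
        hess = X.T @ (X * w[:, None])              # observed information matrix
        beta = beta + np.linalg.solve(hess, grad)  # Newton step
    return beta

rng = np.random.default_rng(0)
n_dyads = 20000
distance_m = rng.uniform(50.0, 5000.0, size=n_dyads)  # synthetic distances (metres)
log_dist = np.log(distance_m)

# Hypothetical "true" coefficients, chosen only to generate example data.
true_intercept, true_slope = 2.0, -0.8
prob = 1.0 / (1.0 + np.exp(-(true_intercept + true_slope * log_dist)))
friend = rng.binomial(1, prob)

X = np.column_stack([np.ones(n_dyads), log_dist])
beta_hat = fit_logistic(X, friend)
print("estimated intercept and slope on log-distance:", beta_hat)
```

A negative slope on `log_dist` corresponds to log-odds of friendship that decrease linearly with the logarithm of distance. This sketch treats dyads as independent, which the longitudinal network models in step (3) do not.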
The second edition of an extensive textbook on multilevel analysis.
Material about this book is available
at a separate web page.
Chapters
Abstract.
Studies of peer effects in educational settings confront two main problems.
The first is the presence of endogenous sorting which confounds the effects
of social influence and social selection on individual attainment.
The second is how to account for the local network dependencies through
which peer effects influence individual behavior. We empirically address
these problems using longitudinal data on academic performance, friendship,
and advice seeking relations among students in a full-time graduate academic
program. We specify stochastic agent-based models that permit estimation of
the interdependent contribution of social selection and social influence
to individual performance. We report evidence of peer effects.
Students tend to assimilate the average performance of their friends
and of their advisors. At the same time, students attaining similar
levels of academic performance are more likely to develop friendship and advice ties.
Together, these results imply that processes of social influence and
social selection are sub-components of a more general co-evolutionary process
linking network structure and individual behavior.
We discuss possible points of contact between our findings and
current research in the economics and sociology of education.
Key words: Peer effects; Stochastic actor-oriented models; Social networks; Network dynamics; Education
Abstract.
Statistical models for social networks as dependent variables must represent
the typical network dependencies between tie variables such as
reciprocity, homophily, transitivity, etc. This review first treats models
for single (cross-sectionally observed) networks and then for network
dynamics. For single networks, the older literature concentrated on conditionally
uniform models. Various types of latent space models have
been developed: for discrete, general metric, ultrametric, Euclidean,
and partially ordered spaces. Exponential random graph models were
proposed long ago but now are applied more and more thanks to the
non-Markovian social circuit specifications that were recently proposed.
Modeling network dynamics is less complicated than modeling single
network observations because dependencies are spread out in time. For
modeling network dynamics, continuous-time models are more fruitful.
Actor-oriented models here provide a model that can represent many
dependencies in a flexible way. Strong model development is now going
on to combine the features of these models and to extend them to more
complicated outcome spaces.
Key words: Social networks, Statistical modeling, Inference
Abstract.
This article examines the dynamics of peer relationships across the first
2 grades of Dutch junior high schools (average age
13 - 14). Specifically, we studied how gender and compositional
changes in classrooms structured the changes in peer
relationships between the 2 grades.
Expectations were derived from past research, and we tested whether these held when
methods for data analysis were applied that control appropriately for
the dependence structure of the data (more specifically,
multilevel analysis and a multilevel application of actor-oriented models
for network evolution). Analyses revealed
that the stability of peer acceptance was relatively low and that it
was affected neither by the level of classroom
stability nor by gender. Dyadic relationships were moderately stable.
Tendencies toward reciprocity, network closure, and
gender similarity shaped the changes in networks of peer relationships
within classes. Contrary to past findings, female
newcomers in classrooms were as well accepted as male newcomers
or established class members.
Key words: Social networks, Statistical modeling, Inference
Abstract.
This article examines the use of various research designs in the social sciences as
well as the choices that are made when a quasi-experimental design is used. A content analysis
was carried out on articles published in 18 social science journals with various impact
factors. The presence of quasi-experimental studies was investigated as well as choices in the
design and analysis stage. It was found that quasi-experimental designs are not very often
used in the inspected journals, and when they are applied they are not very well designed
and analyzed. These findings suggest that the literature on how to deal with selection bias
has not yet found its way to the practice of the applied researcher.
Key words: Quasi-experiments, Social science, Selection bias, Research designs, Content analysis.
Abstract.
The issue of the influence of norms on behavior is as old as sociology itself. This paper explores the
effect of normative homophily (i.e. "sharing the same normative choices") on the evolution of the advice
network among lay judges in a courthouse. Blau's (1955, 1964) social exchange theory suggests that
members select advisors based on the status of the advisor. Additional research shows that members of an
organization use similarities with others in ascribed, achieved or inherited characteristics, as well as other
kinds of ties, to mitigate the potentially negative effects of this strong status rule. We elaborate and test
these theories using data on advisor choice in the Commercial Court of Paris. We use a jurisprudential case
about unfair competition (material and "moral" damages), a case that we submitted to all the judges of this
court, to test the effect of normative homophily on the selection of advisors, controlling for status effects.
Normative homophily is measured by the extent to which two judges are equally "punitive" in awarding
damages to plaintiffs. Statistical analyses combine longitudinal advice network data collected among
the judges with their normative dispositions. Contrary to what could be expected from conventional
sociological theories, we find no pure effect of normative homophily on the choice of advisors. In this
case, therefore, sharing the same norms and values does not have, by itself, a mitigating effect and does
not contribute to the evolution of the network. We argue that status effects, conformity and alignments on
positions of opinion leaders in controversies still provide the best insights into the relationship between
norms, structure and behavior.
Key words: Advice networks, Longitudinal analysis, Homophily, Norms, Social selection, Status, Learning.
Abstract.
For exponential random graph models, under quite general conditions, it is
proved that induced subgraphs on node sets disconnected from the other
nodes still have distributions from an exponential random graph model.
This can help in the theoretical interpretation of such models. An
application is that for saturated snowball samples from a potentially larger
graph which is a realization of an exponential random graph model, it is
possible to do the analysis of the observed snowball sample within the
framework of exponential random graph models without any knowledge of
the larger graph.
Key words: Connected component, network delineation, network boundary, random graphs, snowball sample.
Abstract.
A model for network panel data is discussed, based on the assumption
that the observed data are discrete observations of a
continuous-time Markov process on the space of all directed graphs on a given
node set, in which changes in tie variables are independent
conditional on the current graph. The model for tie changes is
parametric
and designed for applications to social network analysis, where the
network dynamics can be interpreted as being generated by choices
made by the social actors represented by the nodes of the graph. An
algorithm for calculating the Maximum Likelihood estimator is
presented, based on data augmentation and stochastic approximation.
An application to an evolving friendship network is given and a small
simulation study is presented which suggests that for small data sets
the Maximum Likelihood estimator is more efficient than the earlier
proposed Method of Moments estimator.
Key words: Graphs, Longitudinal data, Method of moments, Stochastic approximation, Robbins-Monro algorithm.
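The Robbins-Monro stochastic approximation named in the key words can be sketched on a toy problem. The example below is an assumption-laden illustration, not the paper's data-augmentation algorithm: it uses an independent-arcs graph whose tie probability is the logistic function of a single parameter `theta`, for which the likelihood equation reduces to matching the expected tie count to the observed one. All function names are hypothetical.

```python
import numpy as np

def simulate_tie_count(theta, n_dyads, rng):
    """Simulate the number of ties when each tie is present with probability expit(theta)."""
    p = 1.0 / (1.0 + np.exp(-theta))
    return rng.binomial(n_dyads, p)

def robbins_monro(s_obs, n_dyads, theta0=0.0, n_steps=2000, seed=1):
    """Solve E_theta[tie count] = s_obs by Robbins-Monro updates with gain 1/k."""
    rng = np.random.default_rng(seed)
    theta = theta0
    p0 = 1.0 / (1.0 + np.exp(-theta0))
    # Derivative of the expected tie count at theta0, used as a fixed scaling.
    d = n_dyads * p0 * (1.0 - p0)
    for k in range(1, n_steps + 1):
        s_sim = simulate_tie_count(theta, n_dyads, rng)
        theta -= (1.0 / k) * (s_sim - s_obs) / d   # move against the simulated deviation
    return theta

n_dyads = 90    # ordered pairs among 10 actors
s_obs = 18      # hypothetical observed number of ties
theta_hat = robbins_monro(s_obs, n_dyads)
theta_exact = np.log(s_obs / (n_dyads - s_obs))    # closed-form solution for this toy model
print(theta_hat, theta_exact)
```

In this toy case the solution is available in closed form, which makes it easy to check the iteration; in the models discussed in the abstract the expected values are only available by simulation, which is why a stochastic approximation scheme is needed.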
Abstract.
A recurrent problem in the analysis of behavioral dynamics, given a simultaneously evolving
social network, is the difficulty of separating effects of partner selection from effects of social
influence. Because misattribution of selection effects to social influence, or vice versa, suggests
wrong conclusions about the social mechanisms underlying the observed dynamics, special
diligence in data analysis is advisable. While a dependable and valid method would benefit
several research areas, to the best of our knowledge it has been lacking in the extant
literature. In this paper, we present a recently developed family of statistical models that enables
researchers to separate the two effects in a statistically adequate manner. To illustrate our
method, we investigate the roles of homophile selection and peer influence mechanisms in the
joint dynamics of friendship formation and substance use among adolescents. Making use of a
three-wave panel measured in the years 1995-97 at a school in Scotland, we are able to assess the
strength of selection and influence mechanisms and quantify the relative contributions of
homophile selection, assimilation to peers, and control mechanisms to observed similarity of
substance use among friends.
Key words: statistical modeling, social networks, graphs, longitudinal, network dynamics, smoking, alcohol consumption.
The methods proposed in this paper are implemented in the SIENA program.
Key words: statistical modeling, longitudinal, Markov chain, agent-based model, peer selection, peer influence.
Key words: Smoking; Adolescents; Selection; Influence; Friends; Reciprocity; Siena
Key words: Residuals, Hausman test, empirical Bayes, spline functions, deletion residuals, influence diagnostics, non-linear transformations, mixed models, Hierarchical Linear Model.
Abstract.
Multiple regression quadratic assignment procedures (MRQAP)
tests are permutation tests for multiple
linear regression model coefficients for data organized in
square matrices of relatedness among n
objects. Such a data structure is typical in social network
studies, where variables indicate some type
of relation between a given set of actors.
We present a new permutation method (called "double semipartialing",
or DSP) that complements the family of extant approaches to MRQAP tests.
We assess the
statistical bias (type I error rate) and statistical power of the set
of five methods, including DSP, across a
variety of conditions of network autocorrelation,
of spuriousness (size of confounder effect),
and of skewness in the data.
These conditions are explored across three assumed
data distributions: normal, gamma,
and negative binomial.
We find that the Freedman-Lane method and the DSP method are the most robust
against a wide array of these conditions.
We also find that all five methods perform better if the test
statistic is pivotal.
Finally, we find limitations of usefulness for MRQAP tests:
All tests degrade under
simultaneous conditions of extreme skewness and
high spuriousness for gamma and negative binomial
distributions.
Key words: MRQAP, Mantel tests, permutation tests, social networks, network autocorrelation, collinearity, dyadic data.
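The basic idea behind these permutation tests can be sketched with the simplest member of the family, a QAP (Mantel-type) test for the correlation between two square matrices. This is a minimal illustration on synthetic data, not the double semipartialing or Freedman-Lane procedures compared in the paper; the function names and the simulated matrices are assumptions.

```python
import numpy as np

def offdiag(m):
    """Return the off-diagonal entries of a square matrix as a flat vector."""
    n = m.shape[0]
    return m[~np.eye(n, dtype=bool)]

def qap_correlation_test(x, y, n_perm=999, seed=0):
    """Two-sided QAP test: permute the rows and columns of y with the same permutation."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    x_vec = offdiag(x)
    observed = np.corrcoef(x_vec, offdiag(y))[0, 1]
    n_extreme = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        y_perm = y[np.ix_(perm, perm)]     # relabel actors, keeping the dyadic structure
        r = np.corrcoef(x_vec, offdiag(y_perm))[0, 1]
        if abs(r) >= abs(observed):
            n_extreme += 1
    p_value = (n_extreme + 1) / (n_perm + 1)
    return observed, p_value

rng = np.random.default_rng(123)
n_actors = 20
x = rng.normal(size=(n_actors, n_actors))            # synthetic dyadic predictor
y = x + 0.5 * rng.normal(size=(n_actors, n_actors))  # synthetic dyadic outcome
observed_r, p_value = qap_correlation_test(x, y)
print(observed_r, p_value)
```

Permuting rows and columns jointly keeps each actor's row and column together, so the dependence among dyads sharing an actor is preserved under the null distribution. Entry-wise shuffling would not do this.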
Key words: Longitudinal social networks; Data augmentation; Bayesian inference; Random graphs.
Key words: Social networks; ERGM; Dependence structure
Key words: delinquency; friendship networks; interdependence; SIENA
Key words: Exponential random graph models; p* models; Statistical models for social networks
Abstract.
A parametric, continuous-time Markov model for digraph panel data is considered.
The parameter is estimated by the method of moments.
A convenient method for estimating the variance-covariance matrix of the moment estimator relies on the delta method, requiring the Jacobian matrix - that is, the matrix of partial derivatives - of the estimating function.
The Jacobian matrix was estimated hitherto by Monte Carlo methods based on finite differences.
Three new Monte Carlo estimators of the Jacobian matrix are proposed,
which are related to the likelihood ratio / score function method of derivative estimation
and have theoretical and practical advantages compared to the finite differences method.
Some light is shed on the practical performance of the methods
by applying them
in a situation where the true Jacobian matrix is known and in a situation where the true Jacobian matrix is unknown.
Key words: digraphs, continuous-time Markov processes, gradient estimation, likelihood ratio / score function method, variance reduction, control variates.
The methods proposed in this paper are implemented in the SIENA program.
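The contrast between finite differences and the likelihood ratio / score function method can be illustrated on a toy case where the exact derivative is known. The sketch below assumes a Poisson variable with mean exp(theta) rather than the digraph model of the paper, and it does not reproduce any of the three proposed estimators; function names are hypothetical.

```python
import numpy as np

def score_function_derivative(theta, n_samples, rng):
    """Estimate d/dtheta E[X] as the mean of X * score, for X ~ Poisson(exp(theta))."""
    lam = np.exp(theta)
    x = rng.poisson(lam, size=n_samples)
    score = x - lam                 # derivative of the Poisson log-probability w.r.t. theta
    return np.mean(x * score)

def finite_difference_derivative(theta, n_samples, rng, h=0.1):
    """Forward finite difference using independent simulations at theta and theta + h."""
    x_plus = rng.poisson(np.exp(theta + h), size=n_samples)
    x_base = rng.poisson(np.exp(theta), size=n_samples)
    return (np.mean(x_plus) - np.mean(x_base)) / h

rng = np.random.default_rng(42)
theta = 0.5
exact = np.exp(theta)               # d/dtheta E[X] = exp(theta) in this toy model
sf_estimate = score_function_derivative(theta, 200_000, rng)
fd_estimate = finite_difference_derivative(theta, 200_000, rng)
print(exact, sf_estimate, fd_estimate)
```

The score function estimator needs simulations at a single parameter value and is unbiased here, whereas the finite difference estimator requires a step size `h` that trades bias against variance.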
Abstract.
A deeper understanding of the relation
between individual behavior and individual actions on one hand and
the embeddedness of individuals in social structures on the other
hand can be obtained by empirically studying the dynamics of
individual outcomes and network structure, and how these mutually
affect each other. In methodological terms, this means that behavior
of individuals -- indicators of performance and success, attitudes
and other cognitions, behavioral tendencies -- and the ties between
them are studied as a social process evolving over time, where
behavior and network ties mutually influence each other. We propose
a statistical methodology for this type of investigation and
illustrate it by an example.
Key words: statistical modeling, social networks, graphs, longitudinal, network dynamics.
The methods proposed in this paper are implemented in the SIENA program.
Abstract.
We analyse the co-evolution of social networks and substance use behaviour of adolescents
and address the problem of separating the effects of homophily and assimilation. Adolescents
who prefer friends with the same substance-use behaviour exhibit the homophily principle.
Adolescents who adapt their substance use behaviour to match that of their friends display
the assimilation principle. We use the Siena software to illustrate the co-evolution of
friendship networks, smoking, cannabis use and drinking among sport-active teenagers.
Results indicate strong network selection effects occurring with a preference for same sex
reciprocated relationships in closed networks. Assimilation occurs among cannabis and
alcohol but not tobacco users. Homophily prevails among tobacco and alcohol users.
Cannabis use influences smoking behavior positively (i.e., increasing cannabis increases
smoking). Weaker effects include drinkers smoking more and cannabis users drinking more.
Homophily and assimilation are not significant mechanisms with regard to sporting activity
for any substance. There is, however, a significant reduction of sporting activity among
smokers. Also, girls engaged in less sport than boys. Some recommendations for health
promotion programmes are made.
Key words: statistical modeling, social networks, graphs, longitudinal, network dynamics.
The methods proposed in this paper are implemented in the SIENA program.
Abstract.
Social networks can be defined as the patterns of ties between social
actors.
This paper gives a review of recently developed statistical models
and estimation methods for the analysis of social network
panel data.
To represent the feedback processes inherent in network dynamics,
it is helpful to regard such panel data as momentary observations
on a continuous-time process on the space of directed graphs.
Tie-oriented and actor-oriented stochastic models are presented,
which can reflect endogenous network dynamics
as well as effects of exogenous variables.
These models do not allow explicit calculations, but they can be implemented
as computer simulation models.
Stochastic approximation methods can be used to estimate the parameters.
An example is given where the models are applied to an early precursor
of email communication.
Abstract.
The purpose of this study was to examine whether peer relations within classrooms were related
to students' academic progress, and if so, whether this can be explained by students' relatedness
and engagement, in line with Connell and Wellborn's self-system model. We analyzed data of
18,735 students in 796 school classes in Dutch junior high schools, using multilevel analysis.
Academic progress, conceptualized as regular promotion to the next year versus grade retention,
moving upward, and moving downward in the track system, was measured at the time of
transition between Grades 1 and 2 (equivalent to US Grades 7 and 8). The results indicated that
students who were accepted by their peers had lower probabilities of being retained in a grade or of moving
downward in the track system. Although peer acceptance was associated with relatedness and
engagement, these variables did not explain why peer acceptance was associated with academic progress. Furthermore, peer acceptance and relatedness were more strongly related in classes with
more negative class climates.
Abstract.
The chapter in this volume
by Dronkers and Hox presents an interesting multilevel event history analysis of divorce risks.
The sibling design gives excellent opportunities for studying the similarity between
brothers and sisters in the risks of divorce. Various discussion points are raised,
all of which bear in some way upon the choice of predictor variables in the multilevel
logistic regression. Questions are posed about
the level of detail of modeling time trends; about the fact that sampling weights are a
function of number of siblings; and about the inclusion in the fixed part of the model
of the fraction of previously divorced siblings, which is correlated with the family-level
random intercept.
Abstract.
We give a nontechnical introduction to recently developed methods for analyzing the coevolution of social networks and behavior(s) of the network actors. This coevolution is crucial for a variety of research topics that currently receive a lot of attention, such as the role of peer groups in adolescent development. A family of dynamic actor-driven models for the coevolution process is sketched, and it is shown how to use the SIENA software for estimating these models. We illustrate the method by analyzing the coevolution of friendship networks, taste in music, and alcohol consumption of teenagers.
Key words: network dynamics, longitudinal, social networks, stochastic modeling.
The methods proposed in this paper are implemented in the SIENA program.
Abstract.
The most
promising class of statistical models for expressing structural
properties of social networks observed at one moment in time is the
class of Exponential Random Graph Models (ERGMs), also known as p*
models. The strong point of these models is that they can represent
a variety of structural tendencies, such as transitivity, that
define complicated dependence patterns not easily modeled by more
basic probability models. Recently, MCMC algorithms have been
developed which produce approximate Maximum Likelihood estimators.
Applying these models in their traditional specification to observed
network data often has led to problems, however, which can be traced
back to the fact that important parts of the parameter space
correspond to nearly degenerate distributions, which may lead to
convergence problems of estimation algorithms, and a poor fit to
empirical data.
This paper proposes new specifications of
Exponential Random Graph Models. These specifications represent
structural properties such as transitivity and heterogeneity of
degrees by more complicated graph statistics than the traditional
star and triangle counts. Three kinds of statistic are proposed:
geometrically weighted degree distributions, alternating
k-triangles, and alternating independent two-paths. Examples are
presented both of modeling graphs and digraphs, in which the new
specifications lead to much better results than the earlier existing
specifications of the ERGM. It is concluded that the new
specifications increase the range and applicability of the ERGM as a
tool for the statistical analysis of social networks.
Key words: statistical modeling, social networks, graphs, transitivity, clustering, maximum likelihood, MCMC, p* model.
Also see Snijders (2002).
The methods
proposed in this paper are implemented in the SIENA program, part of the StOCNET package.
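As an illustration of the first of the three statistics, the sketch below computes a geometrically weighted degree statistic for an undirected graph, in the form sum over k of exp(-alpha * k) * d_k, where d_k is the number of nodes with degree k. This is one common way of writing the statistic and is given here as an assumption; the exact parameterization used in the paper may differ, and the function names are hypothetical.

```python
import numpy as np

def gw_degree(adj, alpha):
    """Geometrically weighted degree statistic as a sum over nodes of exp(-alpha * degree)."""
    degrees = adj.sum(axis=1)
    return float(np.exp(-alpha * degrees).sum())

def gw_degree_from_counts(adj, alpha):
    """Same statistic written as a weighted sum of the degree counts d_k."""
    degrees = adj.sum(axis=1).astype(int)
    d_k = np.bincount(degrees, minlength=adj.shape[0])   # d_k = number of nodes with degree k
    k = np.arange(len(d_k))
    return float((np.exp(-alpha * k) * d_k).sum())

# Star graph on four nodes: one centre of degree 3 and three leaves of degree 1.
star = np.array([
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
])
alpha = np.log(2.0)
u_nodes = gw_degree(star, alpha)
u_counts = gw_degree_from_counts(star, alpha)
print(u_nodes, u_counts)   # both equal 2**-3 + 3 * 2**-1 = 1.625 up to rounding
```

Because the weights decrease geometrically in the degree, adding a tie to a high-degree node changes the statistic less than adding one to a low-degree node, in contrast to raw star counts.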
Abstract.
This paper proposes a multilevel extension to the p2 model for the analysis of social networks. In the p2 model dichotomous tie observations between actors in a given set can be regressed on explanatory variables. The multilevel p2 model is a model for social networks with a multilevel data structure, e.g., networks observed in multiple schools. It defines an identical model for the independent observations of the same type of social network, where the parameters can be allowed to vary across the social networks using random effects. For the multilevel p2 model a Bayesian MCMC algorithm has been developed, which is briefly described here. The model is applied to investigate reported received practical support among Dutch high school pupils of different ethnic backgrounds.
The methods proposed in this paper are implemented in the StOCNET package.
Abstract.
This chapter treats statistical methods for network evolution. It
is argued that it is most fruitful to consider models where network
evolution is represented as the result of many (usually non-observed)
small changes occurring between the consecutively observed networks.
Accordingly, the focus is on models where a continuous-time network
evolution is assumed although the observations are made at discrete
time points (two or more).
Three models are considered in detail, all based on the assumption
that the observed networks are outcomes of a Markov process
evolving in continuous time. The independent arcs model is a trivial
baseline model. The reciprocity model expresses effects of reciprocity,
but lacks other structural effects. The actor-oriented model is based
on a model of actors changing their outgoing ties as a consequence of
myopic stochastic optimization of an objective function. This framework
offers the flexibility to represent a variety of network effects. An
estimation algorithm is treated, based on a Markov chain Monte Carlo
implementation of the method of moments.
Key words: network evolution, Markov process, stochastic actor-oriented network model.
Also see Snijders (2001).
The methods proposed in this paper are implemented
in the SIENA program.
Abstract.
In research on the social capital of individuals,
there has been little standardisation of measurement instruments.
In this paper we propose two innovations. First, a new
measurement method: the Resource Generator; an instrument with concretely
worded items covering `general' social capital in a population, that
combines advantages of earlier techniques. Construction, use, and first
empirical findings are discussed for a representative sample (N = 1,004)
of the Dutch population in 1999-2000. Second, we propose to
investigate social capital by latent trait analysis, and we
identify separately accessed portions of social capital: prestige and
education related social capital, entrepreneurial social capital, skills
social capital, and personal support social capital. This underlines that
social capital measurement needs multiple measures, and cannot be reduced
to one total measure of indirectly `owned' resources. Constructing a theory-based
Resource Generator can be a challenge for different
contexts of use, but such an instrument can also retrieve meaningful
information for investigating the productivity and goal specificity of social capital.
This paper is part of the
Ph.D. research by Martin van der Gaag on measurement of social capital.
Abstract.
Key words: Complete network, Longitudinal study, Dynamics, Explained variation,
Coefficient of Determination, Entropy.
Also see Snijders (2001).
Abstract.
The methods proposed in this paper are implemented
in the StOCNET package.
Snijders, Tom A.B.,
Explained Variation in Dynamic Network Models.
Mathématiques, Informatique et Sciences Humaines /
Mathematics and Social Sciences, 168, 2004(4), p. 31-41.
A measure for explained variation is proposed for stochastic actor-driven models
for data on social networks. The measure is based on the entropy of the distribution of the choices
made by the actors during the network evolution process. This measure can be helpful in the
specification and interpretation of statistical models for longitudinal network data.
The methods proposed in this paper are implemented
in the SIENA program.
van Duijn, M.A.J., Snijders, T.A.B., & Zijlstra, B.H.
p2: a random effects model with covariates for directed graphs.
Statistica Neerlandica, 58 (2004), 234-254.
A random effects model is proposed for the analysis of binary dyadic data that
represent a social network or directed graph, using nodal and/or dyadic attributes
as covariates. The network structure is reflected by modeling the dependence
between the relations to and from the same actor or node. Parameter estimates
are proposed that are based on an iterated generalized least squares procedure.
An application is presented to a data set on friendship relations between American
lawyers.
Snijders, Tom A.B. (2003). Entries in SAGE Encyclopedia of
Social Science Research Methods.
The following entries in M.
Lewis-Beck, A.E. Bryman, and T.F. Liao (eds.),
The SAGE Encyclopedia of Social Science Research Methods.
Thousand Oaks, CA: Sage, 2003:
Abstract.
A class of statistical models is proposed which aims to recover latent
settings structures in social networks. Settings may be regarded as
clusters of vertices. The measurement model builds on two assumptions.
The observed network is assumed to be generated by hierarchically
nested latent transitive structures, expressed by ultrametrics.
It is assumed that expected tie strength decreases with ultrametric
distance. The approach could be described as model-based clustering
with an ultrametric space as the underlying metric to capture the dependence
in the observations. Maximum likelihood methods as well
as Bayesian methods are applied for statistical inference. Both approaches
are implemented using Markov chain Monte Carlo methods.
The methods proposed in this paper are implemented in the StOCNET package.
Abstract. Markov chains can be used for the modeling of complex longitudinal network data. One class of probability models for the evolution of social networks is the class of stochastic actor-oriented models for network change, proposed by Snijders (1996, 2001). These models are continuous-time Markov chain models that are implemented as simulation models. In this paper an extension of the simulation algorithm of stochastic actor-oriented models is proposed to include networks of changing composition. In empirical research, the composition of networks may change due to actors joining or leaving the network at some points in time. The composition changes are modeled as exogenous events that occur at given time points and are implemented in the simulation algorithm. The estimation of the network effects and the effects of actor and dyadic attributes that influence the evolution of the network is based on the simulation of Markov chains.
Key words: network evolution, Markov process, stochastic actor-oriented network model, changing composition.
Also see Snijders (2001).
The methods proposed in this paper are implemented
in the SIENA program.
Abstract.
This is a chapter in the volume on
the 1999 SCALE conference on social capital (Amsterdam, December 9-11, 1999).
The chapter presents a conceptual approach to the measurement of social
capital as defined on the level of individuals, with the aim to develop a
yardstick for social capital that can be used in prospective studies
investigating its productivity and goal specificity. It discusses several
theoretical choices that should be made before starting
measurements, and introduces an empirical approach to the
construction of domain specific social capital measures.
This paper is part of the
Ph.D. research by Martin van der Gaag on measurement of social capital.
Key words: MANOVA, incomplete data, missing at random, hierarchical linear model, Hotelling's test, Wald test, Lawley - Hotelling trace criterion, trend tests, compound symmetry model.
Abstract.
Degrees (the number of links attached to a given node) play a particular
and important role in empirical network analysis because of their obvious
importance for expressing the position of nodes.
It is argued here that there is no general straightforward relation
between the degree distribution on one hand and structural aspects on
the other hand, as this relation depends on further characteristics of
the presumed model for the network. Therefore empirical inference from
observed network characteristics to the processes that could be responsible
for network genesis and dynamics cannot be based only, or mainly, on the
observed degree distribution.
As an elaboration and practical implementation of this point,
a statistical model for the dynamics of networks, expressed as digraphs
with a fixed vertex set,
is proposed in which the outdegree distribution is governed by parameters
that are not connected to the parameters for the structural dynamics.
The use of such an approach in statistical modeling
minimizes the influence of the observed degrees on the conclusions
about the structural aspects of the network dynamics.
The model is a stochastic actor-oriented model, and deals
with the degrees in a manner resembling Tversky's
Elimination by Aspects approach.
A statistical procedure for parameter estimation in
this model is proposed, and an example is given.
Also see
Snijders (2001).
The methods proposed in this paper are implemented
in the SIENA program.
Abstract.
A multilevel approach is proposed to the study of the evolution
of multiple networks. In this approach, the basic evolution process
is assumed to be the same, while parameter values may differ
between different networks.
For the network evolution process,
stochastic actor-oriented models are used, of which the parameters
are estimated by Markov chain Monte Carlo methods.
This is applied to the study of effects of delinquent behavior
on friendship formation, a question of long standing in criminology.
The evolution of friendship is studied empirically in 19 school classes.
It is concluded that there is evidence for an effect of
similarity in delinquent behavior on friendship evolution.
Similarity of the degree of
delinquent behavior has a positive effect on tie formation
but also on tie dissolution.
The last result seems to contradict criminological theories, and deserves
further study.
Key words: actor-oriented model; longitudinal data; social networks; criminology; adolescents.
Also see Snijders (2001).
Abstract.
Markov graphs and exponential random graph models are an important family
of probability distributions for graphs and digraphs because they allow
the kind of dependency that is often considered in social network
analysis, e.g., transitivity of choice. To estimate parameters in these
statistical models, pseudo-likelihood methods have been proposed, but they
are of doubtful value. Maximum likelihood (ML) estimates would be better
but are hard to calculate.
These can be approximated, however, by MCMC methods that solve the moment
equation. The use of MCMC methods in these models often is hampered by
convergence problems, of which the cause can be traced to steepness of the
moments as functions of the parameters;
moreover, in the region where this steepness
occurs, the distribution can have a bimodal shape, which in itself already
leads to serious convergence problems.
A possible way out of these problems is to model the degrees more
carefully. On one hand, precisely modeling the degrees may confine the
algorithm to a region in the parameter space where the moment function is
well-behaved and where the distribution has a unimodal shape. On the other
hand, modeling the degrees may lead to a better fitting model, which also
can lead to a better-behaving algorithm.
Three types of specification of exponential random digraph models are
considered: (1) conditional on the number of ties; (2) conditional on all
in- and out-degrees; (3) conditional on the number of ties, and including
incidental vertex parameters. In some examples, it is investigated
how well it is possible to achieve convergence in the MCMC parameter
estimation, and how the parameter estimates differ between these
specifications.
Also see Snijders (JoSS, 2002).
The methods proposed in this paper are implemented
in
the StOCNET package.
Abstract.
In this study we try to estimate the size of the homeless population
in Budapest by using
two "non-standard" sampling methods: snowball sampling and the
capture-recapture method. Using
two methods and three different data sets we are able to compare
the methods as well as the results,
and we also suggest some further applications. Apart from the
practical purpose of our study there
is a methodological one as well: to use two relatively unknown
methods for the estimation of the size of this
very peculiar kind of population.
Key words: snowball sampling, capture-recapture, hidden population, homeless.
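As an illustration of the capture-recapture idea mentioned in this abstract, the following minimal sketch computes the classical two-sample Lincoln-Petersen estimate and Chapman's bias-corrected variant. It is not claimed to be the estimator used in the study; the function names and the counts are hypothetical.

```python
def lincoln_petersen(n_first, n_second, n_both):
    """Two-sample estimate of population size: N = n1 * n2 / m."""
    if n_both == 0:
        raise ValueError("no recaptures: the estimate is undefined")
    return n_first * n_second / n_both


def chapman(n_first, n_second, n_both):
    """Chapman's variant, which stays finite even without recaptures."""
    return (n_first + 1) * (n_second + 1) / (n_both + 1) - 1


# Hypothetical counts, not taken from the Budapest data:
# 200 persons in the first list, 150 in the second, 30 in both.
print(lincoln_petersen(200, 150, 30))
print(chapman(200, 150, 30))
```

Both estimators assume a closed population and equal capture probabilities within each sample, assumptions that are hard to defend for hidden populations and that motivate comparing several methods.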
Abstract.
A number of estimation methods of the variance components
in Wing & Kristofferson's model for inter-response times are examined
and compared by means of a simulation study.
The estimation methods studied are the method of moments,
maximum likelihood, and an alternative approach in which
the WK-model is recognized as a moving average model.
Key words: discrete motor responses, moving average model, EM, maximum likelihood, method of moments.
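The method of moments for this model can be sketched as follows, assuming the usual formulation in which each inter-response time is I_j = C_j + D_j - D_(j-1), with independent clock intervals C_j and motor delays D_j. Under that assumption the variance of I_j is var(C) + 2 var(D) and the lag-1 autocovariance is -var(D). The function names and the simulation settings are hypothetical, and the sketch does not reproduce the estimators compared in the paper.

```python
import random


def simulate_wk(n, clock_mean, clock_sd, motor_sd, seed=0):
    """Simulate n inter-response times I_j = C_j + D_j - D_(j-1)."""
    rng = random.Random(seed)
    delays = [rng.gauss(0.0, motor_sd) for _ in range(n + 1)]
    return [rng.gauss(clock_mean, clock_sd) + delays[j + 1] - delays[j]
            for j in range(n)]


def autocovariance(x, lag):
    """Sample autocovariance of the series x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean)
               for t in range(n - lag)) / n


def wk_moment_estimates(intervals):
    """Return (clock variance, motor variance) by the method of moments."""
    g0 = autocovariance(intervals, 0)
    g1 = autocovariance(intervals, 1)
    motor_var = -g1            # lag-1 autocovariance equals -var(D)
    clock_var = g0 + 2.0 * g1  # var(I) = var(C) + 2 var(D)
    return clock_var, motor_var


# Simulated data with clock variance 400 and motor variance 100.
series = simulate_wk(50000, clock_mean=300.0, clock_sd=20.0, motor_sd=10.0)
print(wk_moment_estimates(series))
```

In short series the moment estimates can turn out negative, which is one reason to consider alternatives such as maximum likelihood.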
By clicking here you can run the Java applet that is used in this paper to demonstrate properties of the treated probability model.
The estimation procedure in this publication is available in the program SIENA.
Abstract.
This paper is about estimating the parameters of the
exponential random graph model, also known as the p* model,
using frequentist Markov chain Monte Carlo (MCMC) methods.
The exponential random graph model is simulated using Gibbs
or Metropolis-Hastings sampling.
The estimation procedures considered are based on
the Robbins-Monro algorithm for approximating
a solution to the likelihood equation.
A major problem with exponential random
graph models resides in the fact that such models
can have, for certain parameter values, bimodal
(or multimodal) distributions
for the sufficient statistics such as the number of ties.
The bimodality of the exponential graph distribution
for certain parameter values seems a severe limitation
to its practical usefulness.
The possibility of bi- or multimodality is reflected in the possibility that the
outcome space is divided into two (or more) regions
such that the more usual type of MCMC algorithms,
updating only single relations, dyads, or triplets,
have extremely long sojourn times
within such regions, and a negligible probability to
move from one region to another.
In such situations, convergence to the target distribution
is extremely slow.
To be useful, MCMC algorithms must be able to make transitions
from a given graph to a very different graph.
It is proposed to include transitions to the graph complement
as updating steps
to improve the speed of convergence to the target distribution.
Estimation procedures implementing these ideas work satisfactorily for some
data sets and model specifications, but not for all.
Key words: p* model; Markov graph; digraphs; exponential family; maximum likelihood; method of moments; Robbins-Monro algorithm; Gibbs sampling; Metropolis-Hastings algorithm.
Also see Snijders, Pattison, Robins, and Handcock (2006).
The methods proposed in this paper are implemented in
the SIENA program in
the StOCNET package.
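A minimal sketch of the kind of sampler discussed in this abstract is given below: a Metropolis-Hastings chain for a digraph model with only two statistics (number of ties and number of mutual dyads), in which an occasional proposal replaces the whole graph by its complement. The restriction to two statistics, the function names, and the value of p_complement are assumptions made for illustration; this is not the SIENA implementation.

```python
import math
import random


def graph_statistics(x):
    """Return (number of ties, number of mutual dyads) of digraph x."""
    n = len(x)
    ties = sum(x[i][j] for i in range(n) for j in range(n) if i != j)
    mutual = sum(x[i][j] * x[j][i] for i in range(n) for j in range(i + 1, n))
    return ties, mutual


def complement(x):
    """Flip every off-diagonal entry of the adjacency matrix."""
    n = len(x)
    return [[0 if i == j else 1 - x[i][j] for j in range(n)] for i in range(n)]


def mh_sample(n, theta_ties, theta_mutual, steps, p_complement=0.01, seed=0):
    """Run the chain from the empty digraph and return the final state."""
    rng = random.Random(seed)
    x = [[0] * n for _ in range(n)]
    for _ in range(steps):
        if rng.random() < p_complement:
            # Global move: propose the complement of the current graph.
            proposal = complement(x)
            old, new = graph_statistics(x), graph_statistics(proposal)
            log_ratio = (theta_ties * (new[0] - old[0])
                         + theta_mutual * (new[1] - old[1]))
            if rng.random() < math.exp(min(0.0, log_ratio)):
                x = proposal
        else:
            # Local move: propose toggling a single tie i -> j.
            i, j = rng.sample(range(n), 2)
            sign = 1 - 2 * x[i][j]  # +1 when adding, -1 when deleting
            log_ratio = sign * (theta_ties + theta_mutual * x[j][i])
            if rng.random() < math.exp(min(0.0, log_ratio)):
                x[i][j] = 1 - x[i][j]
    return x


final = mh_sample(10, theta_ties=-1.0, theta_mutual=1.5, steps=20000)
print(graph_statistics(final))
```

Both proposal types are symmetric, so the acceptance probability reduces to the ratio of the unnormalized probabilities of the proposed and the current graph.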
Abstract.
Available variance component tests are reviewed and three new score tests are presented.
In the first score test, the asymptotic normal distribution of the test statistic
is used as a reference distribution.
In the other two score tests, a Satterthwaite approximation is used
for the null distribution of the test statistic.
We evaluate the performance of the score tests and other available tests
by means of a Monte Carlo study.
The new tests are computationally relatively cheap and have
good power properties.
Key words: multilevel models; variance components; random coefficients; score tests; Monte Carlo study.
Abstract.
A class of statistical models is proposed for longitudinal network data.
The dependent variable is the changing (or evolving) relation network,
represented by two or more observations of a directed graph
with a fixed set of nodes.
The nodes are modeled as actors whose choices determine the network.
Individual and dyadic exogenous variables can be used as covariates.
The change in the network is modeled as the stochastic result of
network effects (reciprocity, transitivity, etc.) and these covariates.
The existing network structure is a dynamic constraint for the
evolution of the structure itself.
The models are continuous time Markov chain models that
can be implemented as simulation models.
The network evolution is modeled as the consequence of the actors
making new choices, or withdrawing existing choices, on the basis
of functions, with fixed and random components, that the actors
try to maximize.
The model parameters must be estimated from observed data.
For estimating and testing these models,
statistical procedures are proposed which are based on the method of moments.
The statistical procedures are implemented
using a stochastic approximation algorithm based on
computer simulations of the network evolution process.
Key words: actor-oriented model; longitudinal data; continuous-time Markov process; Robbins-Monro algorithm; simulation models; method of moments; stochastic approximation; simulated moments; random utility; Markov chain Monte Carlo.
This paper is related to various other papers;
these can be found by searching in this publication list for the key word SIENA.
The methods proposed in this paper are implemented
in the program SIENA.
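The actors' choice process described in this abstract can be sketched as a simulation, under simplifying assumptions that are not taken from the paper: the objective function contains only an outdegree and a reciprocity term, all actors get opportunities for change at the same rate, a fixed number of steps replaces continuous time, and the random component leads to multinomial logit choice probabilities. All names are hypothetical.

```python
import math
import random


def objective(x, i, beta_out, beta_rec):
    """Value that actor i attaches to network x (fixed component only)."""
    n = len(x)
    out = sum(x[i][j] for j in range(n) if j != i)
    rec = sum(x[i][j] * x[j][i] for j in range(n) if j != i)
    return beta_out * out + beta_rec * rec


def ministep(x, i, beta_out, beta_rec, rng):
    """Actor i toggles one outgoing tie, or keeps the network (choice i)."""
    n = len(x)
    weights = []
    for j in range(n):
        if j != i:
            x[i][j] = 1 - x[i][j]  # try the change
        weights.append(math.exp(objective(x, i, beta_out, beta_rec)))
        if j != i:
            x[i][j] = 1 - x[i][j]  # undo it
    choice = rng.choices(range(n), weights=weights)[0]
    if choice != i:
        x[i][choice] = 1 - x[i][choice]
    return choice


def simulate(n, steps, beta_out, beta_rec, seed=0):
    """Start from the empty digraph and apply a fixed number of ministeps."""
    rng = random.Random(seed)
    x = [[0] * n for _ in range(n)]
    for _ in range(steps):
        ministep(x, rng.randrange(n), beta_out, beta_rec, rng)
    return x


network = simulate(6, 500, beta_out=-1.0, beta_rec=2.0)
for row in network:
    print(row)
```

Simulated networks of this kind supply the expected statistics that a method of moments procedure compares with the observed ones.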
Abstract.
A statistical approach to a posteriori
blockmodeling for digraphs and
valued digraphs is proposed.
The probability model assumes that the vertices
of the digraph are partitioned
into several unobserved (latent) classes and that the
probability of a relationship between two vertices
depends only on the classes to which they belong.
A Bayesian estimator, based on Gibbs sampling, is proposed.
The basic model is not identified, because class labels are arbitrary.
The resulting identifiability problems are solved by restricting inference to
the posterior distributions of invariant functions of the
parameters and the vertex class membership.
In addition, models are considered where class labels are identified by
prior distributions for the class membership of some of the vertices.
The model is illustrated by an example from the social networks literature
(Kapferer's tailor shop).
Key words: Colored graph; Gibbs sampling; latent class model; social network; cluster analysis; mixture model.
This paper continues earlier work published as
Nowicki and Snijders (1997).
The methods proposed in this paper are implemented
in
the StOCNET package.
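The probability model in this abstract, and the reason why class labels are not identified, can be illustrated with a short sketch. The function names and the example values are hypothetical, and the sketch covers only simulation from the model, not the Gibbs sampler.

```python
import random


def simulate_blockmodel(classes, tie_prob, seed=0):
    """Digraph in which P(i -> j) depends only on the classes of i and j."""
    rng = random.Random(seed)
    n = len(classes)
    return [[int(i != j and rng.random() < tie_prob[classes[i]][classes[j]])
             for j in range(n)] for i in range(n)]


def same_class(classes):
    """Co-membership matrix: an invariant function of the class labels."""
    n = len(classes)
    return [[int(classes[i] == classes[j]) for j in range(n)]
            for i in range(n)]


classes = [0, 0, 0, 1, 1, 1]
tie_prob = [[0.8, 0.1],
            [0.2, 0.7]]
digraph = simulate_blockmodel(classes, tie_prob)

# Swapping the two labels (and permuting tie_prob accordingly) gives the
# same distribution, but the co-membership matrix does not change.
relabelled = [1 - c for c in classes]
print(same_class(classes) == same_class(relabelled))
```

Quantities such as the probability that two vertices belong to the same class are unaffected by relabelling, which is why inference can be based on them.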
Key words: item response theory, person fit, asymptotic approximations.
Abstract.
The relation between multilevel analysis and multistage sampling
is discussed.
After this, much attention is paid to the determination of sample sizes
in multilevel analysis.
Key words: loneliness, item response theory, Rasch model, dimensionality, aging.
Abstract.
This paper considers a design where the
objects to be scaled are the higher level units; nested within each
object are lower level units, called `subjects';
and a set of dichotomous items is administered to each subject.
The subjects are regarded as strictly parallel tests
for the objects.
Examples are the scaling of teachers on the basis of their pupils' responses,
or of neighborhoods on the basis of responses by inhabitants.
A two-level version is elaborated of the non-parametric
scaling method first proposed by Mokken (1971).
The probabilities of positive responses to the items are assumed to be
increasing functions of the value on a latent trait.
The latent trait value for each subject is
composed of an object-dependent value and a
subject-dependent deviation from this value.
The consistency of responses within, but also between objects
is expressed by two-level versions of Loevinger's H coefficients.
The availability of parallel tests is used to calculate
a reliability coefficient.
Key words: Multi-level models, item response theory, reliability, parallel tests, ecometrics.
An extensive textbook on multilevel analysis.
Material about this book is available
at a separate web page.
Chapters
Abstract.
This paper is about social capital as a second-order resource of
individuals.
In spite of its growing popularity, social capital has mostly been
measured in ad hoc fashions.
This paper discusses possible approaches that could be taken to
measure the social capital of individuals.
What kinds of questions should be posed to the individual,
and how should these questions be integrated into a measure
of his or her social capital?
Several domains of well-being should be distinguished,
and social capital should be measured for these domains separately.
It is argued that aggregation over alters is not additive,
because the main distinction is between having no alter, or
at least one alter who could provide a given resource.
Aggregation over resources is necessary but debatable;
it can be based on either a common valuation, or on statistical
associations, or on substitutability in the production of the individual's
well-being.
For studying the statistical association between second-order resources
available to a given individual, a distinction is proposed between,
on one hand, within-alter associations, and on the other,
within-ego associations.
The elaboration of these ideas into a questionnaire and a concrete
measurement instrument is being carried out in the
SCALE research programme and its 1999 survey of the
'social networks of the Dutch'.
Key words: social resources, aggregation.
This is further elaborated in the Ph.D. research by Martin van der Gaag on measurement of social capital.
Abstract.
Multilevel models are proposed to study relational or dyadic
data from multiple persons in families or other groups.
The variable under study is assumed to refer to a
dyadic relation between individuals in the groups.
The proposed models are elaborations of the Social Relations Model.
The different roles of father, mother, and child
are emphasized in these models.
Multilevel models provide researchers with a
method to estimate the variances and correlations of the
Social Relations Model, as well as to incorporate the effects of
covariates and to test specialized models, even for possibly incomplete data.
MLn/MLwiN macros for fitting these models can be obtained from my macro page.
Key words: network dynamics, longitudinal social network data, continuous-time Markov chain.
The methods used in this paper are implemented in the SIENA program.
Key words: rational choice, friendship, Markov processes, random utility models, simulation, empirical test.
Key words: Multilevel analysis, network analysis, longitudinal models, mathematical modeling, gossip.
Abstract.
Actor-oriented models are proposed for the statistical analysis of
longitudinal social network data. These models are implemented as
simulation models, and the statistical evaluation
is based on the method of moments and the Robbins-Monro process
applied to computer simulation outcomes.
In this approach,
the calculations that are required for statistical inference are too
complex to be carried out analytically, and therefore they are replaced
by computer simulation.
The statistical models are continuous-time Markov chains.
It is shown how the reciprocity model of Wasserman
and Leenders can be formulated as a special case of the actor-oriented model.
Key words: Social networks, statistical modeling, actor-oriented model, continuous-time Markov chain, Robbins-Monro process.
Also see Snijders (2001) and the SIENA program.
Key words: Colored graph, EM algorithm, Gibbs sampling, latent class model, social network.
Also see Nowicki and Snijders (2001)
and the associated computer program
BLOCKS.
Key words: Dynamic access model, policy networks, computer simulation, method of moments, Robbins-Monro process.
Abstract.
A class of models is proposed for longitudinal network data. These
models are along the lines of methodological individualism: actors use
heuristics to try to achieve their individual goals, subject to constraints.
The current network structure is among these constraints. The models
are continuous time Markov chain models that can be implemented as
simulation models. They incorporate random change in addition to the
purposeful change that follows from the actors' pursuit of their goals,
and include parameters that must be estimated from observed data.
Statistical methods are proposed for estimating and testing these
models. These methods can also be used for parameter estimation for
other simulation models. The statistical procedures are based on the
method of moments, and use computer simulation to estimate the
theoretical moments. The Robbins-Monro process is used to deal with
the stochastic nature of the estimated theoretical moments. An example
is given for Newcomb's fraternity data, using a model that expresses
reciprocity and balance.
Key words: methodological individualism; Markov process; Newcomb data; balance; Robbins-Monro process; simulation models; method of moments; simulated moments; random utility.
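The combination of simulated moments with the Robbins-Monro process can be sketched on a toy model, chosen here only for illustration: ties arise independently with probability logistic(theta), and theta is sought such that the expected number of ties equals the observed number. The model, the gain sequence gain / k, and all names are assumptions and do not reproduce the procedure of the paper.

```python
import math
import random


def simulate_tie_count(theta, n_dyads, rng):
    """Simulate the statistic: number of ties among n_dyads ordered pairs."""
    p = 1.0 / (1.0 + math.exp(-theta))
    return sum(rng.random() < p for _ in range(n_dyads))


def robbins_monro(observed, n_dyads, iterations=2000, gain=4.0, seed=0):
    """Solve E_theta[statistic] = observed by stochastic approximation."""
    rng = random.Random(seed)
    theta = 0.0
    for k in range(1, iterations + 1):
        simulated = simulate_tie_count(theta, n_dyads, rng)
        # Move against the deviation between simulated and observed
        # statistic, with step sizes that decrease as 1 / k.
        theta -= (gain / k) * (simulated - observed) / n_dyads
    return theta


# Hypothetical observation: 18 ties among 90 ordered pairs (density 0.2).
estimate = robbins_monro(observed=18, n_dyads=90)
print(estimate, math.log(0.2 / 0.8))
```

In this toy model the moment equation has the closed-form solution logit(0.2), which makes it possible to check the recursion; in actor-oriented models no such closed form is available, which is what makes the simulation-based recursion useful.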
Key words: multilevel analysis, hierarchical linear model, random coefficients.
Key words: Personal network, snowball sample, multilevel analysis, hierarchical linear model, random effects, cocaine.
Also see van Duijn, van Busschbach and Snijders (1999).
Key words: R-squared, explained variance, coefficient of determination, multilevel analysis, misspecification.
Key words: Network sampling; random graphs; link-tracing designs.
Key words: Unfolding, item response theory, unimodal response
models, total positivity, unidimensional scaling, measurement theory.
Frank, O. & Snijders, T.A.B., Estimating hidden populations using
snowball sampling.
Journal of Official Statistics, 10 (1994), 53-67.
Abstract.
Snowball sampling is a term used for sampling procedures that allow
the sampled units to provide information not only about themselves but
also about other units. This might be advantageous when rare
properties are of interest. This article illustrates snowball sample
situations and discusses various modelling and estimation problems in
this context. The problem of estimating the size of a population is
discussed for both design-based and model-based approaches. An
application to a study of heroin use is included. Simulation results are
provided for comparing and evaluating various estimators.
Baerveldt, C. & Snijders, T.A.B., Influences on and from the
segmentation of networks: hypotheses and tests.
Social Networks, 16 (1994), 213-232.
Abstract.
This article discusses (a) the influence of network structure on the
diffusion of (new) cultural behavior within the network and (b) the
influence of external events, especially of social programs, on the
diffusion of (new) cultural behavior, and on the network structure.
Hypotheses are formulated and tested on data from a study on the
diffusion of petty crime in pupils' networks in high schools. To test
these hypotheses we propose and use a new measure of network
structure: the segmentation index.
Post, W.J. & Snijders, T.A.B., Nonparametric unfolding models for
dichotomous data.
Methodika 7 (1993), 130-156.
Abstract.
What are essential requirements, formulated in terms of item response
theory, for unidimensional unfolding models for dichotomous data, if
one does not wish to make specific assumptions concerning the form
of the tracelines and of the population distribution of latent trait values?
Tracelines should be unimodal, of course, but this requirement is not
sufficient to derive empirically testable
consequences. Two basic postulates are formulated concerning the
inference about subjects' latent trait values on the basis of observed
responses to items. These postulates are proven to be equivalent to
total positivity of orders 2 and 3 for the traceline family. Given these
postulates, unimodality of the tracelines leads to some empirically
testable results. These are formulated as properties of the conditional
adjacency matrix and of the correlation matrix.
Snijders, T.A.B. & Bosker, R.J., Standard errors and sample sizes for
two-level research.
Journal of Educational Statistics, 18 (1993), 237-259.
Abstract.
The hierarchical linear model approach to a two-level design is
considered, some variables at the lower level having fixed and others
having random regression coefficients. An approximation is derived to
the covariance matrix of the estimators of the fixed regression
coefficients (for variables at the lower and the higher level) under the
assumption that the sample sizes at either level are large enough. This
covariance matrix is expressed as a function of parameters occurring in
the model. If a research planner can make a reasonable guess as to
these parameters, this approximation can be used as a guide to the
choice of sample sizes at either level.
A PC program to carry out the calculations developed in this paper
is available from my
multilevel page.
Key words: hierarchical linear model, multilevel research, sample design.
Key words: Snowball Sampling, Weighting, Parameter Estimations, Social Networks.
Key words: lognormal prior, Dirichlet prior, gamma prior, posterior mode, Rasch's multiplicative Poisson model, empirical Bayes estimation.
Abstract.
Data in the form of zero-one matrices where conditioning on the
marginals is relevant arise in diverse fields such as social networks and
ecology; directed graphs constitute an important special case. An
algorithm is given for the complete enumeration of the family of all
zero-one matrices with given marginals and with a prespecified set of
cells with structural zero entries. Complete enumeration is
computationally feasible only for relatively small matrices. Therefore, a
more useable Monte Carlo simulation method for the uniform
distribution over this family is given, based on unequal probability
sampling and ratio estimation. This method is applied to testing
reciprocity of choices in social networks.
Key words: adjacency matrices, random digraphs, networks, ecology, Monte Carlo methods, unequal probability sampling, reciprocity.
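For very small matrices, the family of zero-one matrices with given marginals and structural zeros can be enumerated by simple backtracking over rows. The sketch below is a naive illustration of what is being enumerated, not the algorithm of the paper; the function name and the examples are hypothetical.

```python
from itertools import combinations


def enumerate_matrices(row_sums, col_sums, structural_zeros=frozenset()):
    """Yield all 0-1 matrices with the given row and column sums.

    structural_zeros is a set of (row, column) cells that must stay 0.
    """
    n_rows, n_cols = len(row_sums), len(col_sums)

    def extend(i, remaining):
        if i == n_rows:
            if all(r == 0 for r in remaining):
                yield []
            return
        # Cells of row i that may still receive a 1.
        free = [j for j in range(n_cols)
                if (i, j) not in structural_zeros and remaining[j] > 0]
        for cols in combinations(free, row_sums[i]):
            row = [1 if j in cols else 0 for j in range(n_cols)]
            left = [remaining[j] - row[j] for j in range(n_cols)]
            for rest in extend(i + 1, left):
                yield [row] + rest

    yield from extend(0, list(col_sums))


# Digraphs on 3 vertices with all in- and out-degrees equal to 1;
# the diagonal cells are structural zeros (no self-choices).
diagonal = {(i, i) for i in range(3)}
for matrix in enumerate_matrices([1, 1, 1], [1, 1, 1], diagonal):
    print(matrix)
```

The number of such matrices grows very quickly with the matrix size, which is why a Monte Carlo method is needed for matrices of realistic size.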
Key words: Consensus, Dice coefficient, Jaccard coefficient, Simple Matching coefficient, Multivariate binary data, Observer agreement, Similarity coefficients, Beta distribution.
Key words: Counts; personal networks; reliability; reliability of change; binomial distribution; random effects; empirical Bayes; regression.
Abstract.
A method is presented for testing change of digraphs (representing
some binary relation) observed at two points in time, labeled I and II.
The test is conditional on the entire digraph at time I, the numbers of
new arcs to and from each actor, and the numbers of disappeared arcs
to and from each actor. A new arc is defined as an arc existing at time
II but not at time I; a disappeared arc is an arc existing at time I but not
at time II. In particular, tests are conditional simultaneously on in-
degrees and out-degrees at times I and II. The elements of the dyad
transition matrix, indicating the numbers of dyads of some particular
type (mutual, asymmetric, or null) at time I, and of some (same or
other) type at time II, are proposed as possible test statistics.
Also see
Snijders (Psychometrika, 1991).
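The dyad transition matrix mentioned in this abstract can be computed as in the following sketch, which counts, for every unordered pair of actors, the dyad type at time I and at time II. The function names and the example digraphs are hypothetical, and the sketch does not distinguish whether an asymmetric dyad keeps or reverses its direction.

```python
def dyad_type(x, i, j):
    """Classify the dyad {i, j} of digraph x as null, asymmetric or mutual."""
    return ("null", "asymmetric", "mutual")[x[i][j] + x[j][i]]


def dyad_transition_counts(x1, x2):
    """Count dyads by (type at time I, type at time II)."""
    types = ("mutual", "asymmetric", "null")
    counts = {(a, b): 0 for a in types for b in types}
    n = len(x1)
    for i in range(n):
        for j in range(i + 1, n):
            counts[(dyad_type(x1, i, j), dyad_type(x2, i, j))] += 1
    return counts


# Hypothetical digraphs on three actors at times I and II.
time1 = [[0, 1, 0],
         [1, 0, 1],
         [0, 0, 0]]
time2 = [[0, 1, 0],
         [0, 0, 1],
         [1, 0, 0]]
for key, value in dyad_transition_counts(time1, time2).items():
    if value:
        print(key, value)
```

The sketch only computes the statistics; the conditional reference distribution used for testing is not part of it.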
Key words: Subsistence agriculture, risk, early warning.
Key words: bipartite graphs, conditionally uniform distribution.
Key words: combination of tests, equality of correlated proportions, incomplete data, asymptotically most powerful test, Monte Carlo study, antithetic variates, power comparison.
Key words: antithetic variates, Monte Carlo, variance reduction, change-point test, Wilcoxon test.
Key words: graph heterogeneity, graph centrality, random graphs, degree variance.
Key words: Nonparametric tests, bivariate symmetry and asymmetry, locally most powerful tests, asymptotic normality.
Key words: communality, internal consistency, Heywood case, positive definite.
Key words: graph heterogeneity, graph centrality, random graphs, degree variance.
Key words: Empirical Bayes classification, complete class, monotone procedures.
Key words: Markov chain, ethology, transition analysis.