We will analyse data collected by Public Whip on the votes each Member of Parliament (MP) has cast during the 2010-2015 parliament in the House of Commons. Each vote corresponds to a particular piece of legislature and is called a *division*. Download these data here and to load use:

In [2]:

```
library(readr)
df <- read_csv("~/data/publicwhip/parliament2010.dat")
votes <- df[,2:1227]
dim(votes)
```

`votes`

, $+1$ indicates an `aye`

on the division, $-1$ a `no`

. $0$ indicates that the MP did not vote in that particular division. In reality, it is a bit more complicated than this as there are also *tellers* as well as the MPs who voted both ways (!), but I cleaned up the original format so that we can treat it as a simple binary dataset. Now let us consider the number of votes each MP has cast. It turns out that there is a huge range in these: there is an MP who voted only once(!), whereas another one voted in 1179 out of 1226 divisions.

In [3]:

```
#find number of votes cast -
nz<-apply(votes,1,function(x) sum(x!=0))
summary(nz)
```

In [4]:

```
df$surname[nz<5]
```

In [5]:

```
df<-df[nz>=5,]
votes<-votes[nz>=5,]
#recompute number of votes cast
nz<-apply(votes,1,function(x) sum(x!=0))
summary(nz)
```

In [6]:

```
#center the data
y <- scale(votes, scale = FALSE)
#summary(colMeans(y))
npc<-4
y.svd <- svd(y,nu=npc,nv=npc)
proj<-y.svd$u %*% diag(y.svd$d[1:npc])
#plot pca projections
pairs(proj)
```

In [7]:

```
#now plot with party membership
party=factor(df$party)
pairs(proj,col=party)
```

In [21]:

```
plot(proj[,1:2],col=party,pch='x',xlab='PC 1',ylab='PC 2')
text(proj[,1:2],labels=party,pos=3,cex=0.6)
```

In [19]:

```
#find lengths of the vectors of projections
ll<-apply(proj[,1:2],1,function(x) sqrt(sum(x^2)))
#the points pulled towards the mean are not indication of rebelliousness but of not voting often!
plot(nz,ll)
print(paste('correlation:',cor(nz,ll)))
```

`Cameron`

, `Clegg`

, `Osborne`

, `Hague`

(which is hardly a rebellious crowd).

In [22]:

```
plot(proj[,1:2],col=party,pch='x',xlab='PC 1',ylab='PC 2')
text(proj[,1:2],labels=df$surname,pos=3,cex=0.6)
```