I have a student this semester who was interested in
determining whether a group of beetles in a confined space (petri dish) avoided
each other, aggregated in groups, or perhaps simply wandered randomly. I don’t know of an existing statistical test
to determine the answer to this question.
However, it seemed that perhaps the mean pairwise distance between all beetles
might be a good measure of where an organism falls along this continuum. For
example if we have a square grid that is 10 units on a side and we put one
beetle roughly in each corner:

The pairwise measures are:

9

9

9

9

12.73

12.73

The average of these measures is 10.24

In contrast if we have the same grid but the beetles have
aggregated into two clumps:

The pairwise measures would now be:

The pairwise measures would now be:

0

0

12.73

12.73

12.73

12.73

The average of these measures is 8.48

Spread out individuals give us a larger number clumping/aggregation leads to a smaller number - this is what we are after. Once she collected her data we needed to create a null distribution of expected pairwise distances assuming that the beetles were randomly distributed through the available space. To do this we simply need to create the same number of random X and Y coordinates (making sure they fall in the circle of our petri dish) as she had in her experiment. Once we have a set of random coordinates we can then measure all the pairwise distances. We repeat this process thousands of times to generate a null distribution of what we expect to see. Here is what the pseudocode for this looks like

Radius of circle =

Number of beetles =

Iterations =

Observed pairwise mean =

Loop for iterations

-Loop for number of beetles

-Loop for number of beetles

--While
coordinate not found

---Generate a
coordinate

---Test coordinate for falling inside circle if true save

---Test coordinate for falling inside circle if true save

--Calculate mean of all pairwise distances and record

Plot null distribution and observed value

Below is the actual R code that I used to accomplish this. I
decided to tweak it a bit and turn it into a shiny app that allows her to run
it through the web thanks the free shiny apps server. The fully implemented version is here

```
# This code produces pairwise measurement for
# randomly placed points in a circle of arbitrary size
# Heath Blackmon & Elizabeth Mallory
# 20 April 2015
RandomDist <- function(radius=10,
individuals=5,
iterations=100,
observed=34){
# this internal function just calculates a pairwise distance
pw.dist <- function(x1, x2, y1, y2){
sqrt(((x1 - x2) ^ 2) + ((y1 - y2) ^ 2))
}
# this function tests whether a point fall inside (or on)
# a circle of given radius
in.circle <- function(x, y, radius){
(x - radius)^2 + (y - radius)^2 < radius^2
}
# we will store our final null dist here
final.results <- vector()
# this loop will repeat till iterations is reached
for(k in 1:iterations){
# this print statement allows us to monitor progress
if(k/10 == round(k/10)) print(paste("iteration", k))
# we will store our coordinates in this dataframe
coords <- data.frame(x=0,y=0)
# this will loop through for the number of individuals
for(m in 1:individuals){
# this flag will be false until we generate a
# valid coordinate
inside <- F
# we generate the coordinates in this while statement
while(inside == F){
# pick coordinates from uniform distribution
t.coords <- runif(n=2, min = 0, max = radius*2)
# now test if they are in circle
if(in.circle(x=t.coords[1], y=t.coords[2], radius)){
inside <- T
}
}
coords[m, 1:2] <- t.coords
}
colnames(coords) <- c("X", "Y")
# this calculates the cumulative sum of the number of
# individuals - 1 (this is equivelant to the number of
# pairwise measures)
pairnum <- tail(cumsum((individuals - 1):1), n=1)
pairmeas <- vector()
counter <- 2
# this loop is going to do the pairwise measures
for(i in 1:(individuals-1)){
for(j in counter:individuals){
# here we do an actual pairwise calculation
x <- pw.dist(coords[i,1],
coords[j,1],
coords[i,2],
coords[j,2])
pairmeas <- c(pairmeas, x)
}
counter <- counter + 1
}
# store result from each iteration
final.results[k] <- mean(pairmeas)
}
# calculate emperical p-value
lower <- sum(final.results>observed)
higher <- sum(final.results<observed)
if(lower < higher){
pval <- lower/iterations
}else{
pval <- higher/iterations
}
# plot the results
plot(density(final.results),
main = "Distribution of Mean Pairwise Measures",
xlab = "mean distance",
ylab = "density",
cex.main=.7)
abline(v=observed, col="red", lwd=2)
# pull in the plot limits
mylims <- par("usr")
text(x=mylims[1],
y=mean(mylims[3:4]),
paste("pvalue =", pval),
col="red",
pos=4,
cex=.6)
return(final.results)
}
```