I have a student this semester who was interested in determining whether a group of beetles in a confined space (petri dish) avoided each other, aggregated in groups, or perhaps simply wandered randomly.  I don’t know of an existing statistical test to determine the answer to this question.   However, it seemed that perhaps the mean pairwise distance between all beetles might be a good measure of where an organism falls along this continuum.  For example if we have a square grid that is 10 units on a side and we put one beetle roughly in each corner:













The pairwise measures are:
9
9
9
9
12.73
12.73
The average of these measures is 10.24

In contrast if we have the same grid but the beetles have aggregated into two clumps:














The pairwise measures would now be:
0
0
12.73
12.73
12.73
12.73
The average of these measures is 8.48

Spread out individuals give us a larger number clumping/aggregation leads to a smaller number - this is what we are after.  Once she collected her data we needed to create a null distribution of expected pairwise distances assuming that the beetles were randomly distributed through the available space.  To do this we simply need to create the same number of random X and Y coordinates (making sure they fall in the circle of our petri dish) as she had in her experiment. Once we have a set of random coordinates we can then measure all the pairwise distances.  We repeat this process thousands of times to generate a null distribution of what we expect to see.  Here is what the pseudocode for this looks like

Radius of circle = 
Number of beetles =
Iterations =
Observed pairwise mean =

Loop for iterations
-Loop for number of beetles
--While coordinate not found
---Generate a coordinate
---Test coordinate for falling inside circle if true save
--Calculate mean of all pairwise distances and record
Plot null distribution and observed value



Below is the actual R code that I used to accomplish this.  I decided to tweak it a bit and turn it into a shiny app that allows her to run it through the web thanks the free shiny apps server.  The fully implemented version is here


 # This code produces pairwise measurement for  
 # randomly placed points in a circle of arbitrary size  
 # Heath Blackmon & Elizabeth Mallory  
 # 20 April 2015  
 RandomDist <- function(radius=10,   
             individuals=5,   
             iterations=100,   
             observed=34){  
  # this internal function just calculates a pairwise distance  
  pw.dist <- function(x1, x2, y1, y2){  
   sqrt(((x1 - x2) ^ 2) + ((y1 - y2) ^ 2))  
  }  
  # this function tests whether a point fall inside (or on)  
  # a circle of given radius  
  in.circle <- function(x, y, radius){  
   (x - radius)^2 + (y - radius)^2 < radius^2  
  }  
  # we will store our final null dist here  
  final.results <- vector()  
  # this loop will repeat till iterations is reached  
  for(k in 1:iterations){  
   # this print statement allows us to monitor progress  
   if(k/10 == round(k/10)) print(paste("iteration", k))  
   # we will store our coordinates in this dataframe  
   coords <- data.frame(x=0,y=0)  
   # this will loop through for the number of individuals  
   for(m in 1:individuals){  
    # this flag will be false until we generate a   
    # valid coordinate  
    inside <- F  
    # we generate the coordinates in this while statement  
    while(inside == F){  
     # pick coordinates from uniform distribution  
     t.coords <- runif(n=2, min = 0, max = radius*2)  
     # now test if they are in circle  
     if(in.circle(x=t.coords[1], y=t.coords[2], radius)){  
      inside <- T  
     }  
    }  
    coords[m, 1:2] <- t.coords  
   }  
   colnames(coords) <- c("X", "Y")  
   # this calculates the cumulative sum of the number of   
   # individuals - 1 (this is equivelant to the number of   
   # pairwise measures)  
   pairnum <- tail(cumsum((individuals - 1):1), n=1)  
   pairmeas <- vector()  
   counter <- 2  
   # this loop is going to do the pairwise measures  
   for(i in 1:(individuals-1)){  
    for(j in counter:individuals){  
     # here we do an actual pairwise calculation  
     x <- pw.dist(coords[i,1],  
            coords[j,1],   
            coords[i,2],   
            coords[j,2])  
     pairmeas <- c(pairmeas, x)  
    }  
    counter <- counter + 1  
   }  
   # store result from each iteration  
   final.results[k] <- mean(pairmeas)  
  }  
  # calculate emperical p-value  
  lower <- sum(final.results>observed)  
  higher <- sum(final.results<observed)  
  if(lower < higher){  
   pval <- lower/iterations  
  }else{  
   pval <- higher/iterations  
  }  
  # plot the results  
  plot(density(final.results),   
     main = "Distribution of Mean Pairwise Measures",   
     xlab = "mean distance",  
     ylab = "density",  
     cex.main=.7)  
  abline(v=observed, col="red", lwd=2)  
  # pull in the plot limits  
  mylims <- par("usr")  
  text(x=mylims[1],   
     y=mean(mylims[3:4]),   
     paste("pvalue =", pval),  
     col="red",  
     pos=4,  
     cex=.6)  
  return(final.results)  
 }   


0

Add a comment

Great Blogs
Great Blogs
About Me
About Me
My Photo
I am broadly interested in the application and development of comparative methods to better understand genome evolution at all scales from nucleotides to chromosomes.
Subscribe
Subscribe
Loading
Dynamic Views theme. Powered by Blogger. Report Abuse.