Stratified Random Sampling in R

The strata function from the sampling package in R selects a stratified random sample from a data set. The code below provides an example in which the stratification variable is the class column of the data frame all_data, and each class contributes at most nSamples rows (classes with fewer rows are kept whole).

# Load sampling package
library(sampling)

# Set number of samples
nSamples <- 10000

# Per-class sample size: nSamples, or the class size if the class has fewer rows
numSamplesClass <- as.data.frame(table(all_data$class))[,2]
numSamplesClass[numSamplesClass > nSamples] <- nSamples

# Sort data by class (required: sizes are matched to strata in the order they appear in the data)
all_data <- all_data[order(all_data$class),]

# Sample
s <- strata(all_data, "class",
            size = numSamplesClass, method = "srswor")
sample_data <- getdata(all_data,s)
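
To see the capping behaviour on something small, here is a minimal sketch that runs the same steps on made-up data. The names toy_data, nToy, toySizes, toy_s and toy_sample, the class sizes and the seed are all assumptions chosen for illustration; only strata and getdata come from the sampling package.

```r
# Toy example: three classes of unequal size (made-up data)
library(sampling)

set.seed(42)  # make the random draw reproducible
toy_data <- data.frame(
  class = rep(c("a", "b", "c"), times = c(50, 8, 20)),
  value = rnorm(78)
)

# Cap each class at nToy rows, keeping smaller classes whole
nToy <- 10
toySizes <- pmin(as.vector(table(toy_data$class)), nToy)

# Sort by class so the sizes line up with the strata
toy_data <- toy_data[order(toy_data$class), ]

# Simple random sampling without replacement within each class
toy_s <- strata(toy_data, "class", size = toySizes, method = "srswor")
toy_sample <- getdata(toy_data, toy_s)

# Rows drawn per class
table(toy_sample$class)
```

With these sizes, classes a and c should each contribute 10 rows, and class b, which has only 8 rows, should be returned in full.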
