Stratified Random Sampling in R

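Using the strata function from the sampling package in R, it is possible to select a stratified random sample from a data set. The code below provides an example in which the data frame is named all_data and the stratification variable is stored in its class column. Up to nSamples rows are drawn from each class; classes with fewer rows are sampled in full.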

# Load sampling package
library(sampling)

# Set the maximum number of samples to draw per class
nSamples <- 10000

# Per-class sample size: nSamples, or the class size if the class has fewer rows.
# table() returns the counts in sorted class order.
numSamplesClass <- pmin(as.vector(table(all_data$class)), nSamples)

# Sort data by class (required, so the strata appear in the same
# order as the sizes in numSamplesClass)
all_data <- all_data[order(all_data$class), ]

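# Sample within each class by simple random sampling
# without replacement ("srswor")
s <- strata(all_data, "class",
            size = numSamplesClass, method = "srswor")

# Extract the sampled rows from the original data
sample_data <- getdata(all_data, s)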
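
To check what the capping logic does, it can help to run the same idea on a small data set. The block below is a minimal sketch in base R only; it does not use strata, and the names toy_data, toy_sample, rows_by_class and picked, as well as the class sizes and the cap of 10, are made up for illustration.

```r
# Hypothetical toy data: three classes of unequal size
set.seed(42)
toy_data <- data.frame(
  class = rep(c("a", "b", "c"), times = c(50, 5, 20)),
  value = rnorm(75)
)

# Cap on the number of samples per class (illustrative value)
nSamples <- 10

# Group row indices by class
rows_by_class <- split(seq_len(nrow(toy_data)), toy_data$class)

# Within each class, draw min(nSamples, class size) rows without replacement.
# sample.int() is used so that a class with a single row is handled correctly.
picked <- unlist(lapply(rows_by_class, function(idx) {
  idx[sample.int(length(idx), min(nSamples, length(idx)))]
}))

toy_sample <- toy_data[picked, ]

# Number of sampled rows per class
table(toy_sample$class)
```

With these toy sizes the sample holds 10 rows from class a, all 5 rows from class b, and 10 rows from class c, which is the same capping behaviour that numSamplesClass encodes in the example above.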