What is Stratified Random Sampling?
Stratified random sampling is a sampling method that involves dividing a population into smaller subgroups called strata. In stratified random sampling, or stratification, the strata are formed based on members' shared attributes or characteristics, such as income or educational background.
Stratified random sampling is also known as proportional random sampling or quota random sampling.
Key points
- Stratified random sampling allows researchers to obtain a sample population that best represents the entire population under study.
- Stratified random sampling divides the entire population into homogeneous groups called strata.
- Stratified random sampling differs from simple random sampling, which selects data at random from the entire population so that every possible sample is equally likely to occur.
How stratified random sampling works
When analyzing or surveying a group of entities with similar characteristics, a researcher may find that the population is too large to study in full. To save time and money, an analyst can take a more feasible approach and select a small group from the population. This small group, called the sample, is a subset of the population used to represent the entire population. A sample can be selected from a population in several ways, one of which is stratified random sampling.
Stratified random sampling divides the entire population into homogeneous groups called strata (singular: stratum). A random sample is then selected from each stratum. For example, consider an academic researcher who wants to know the number of MBA students who received a job offer within three months of graduating in 2007.
He will soon find that there were nearly 200,000 MBA graduates that year. He could take a simple random sample of 50,000 graduates and run a survey. Better still, he could divide the population into strata and take a random sample from each stratum. To do this, he would create population groups based on gender, age range, race, country of nationality, and career background. A random sample is taken from each stratum in proportion to the stratum's size relative to the population. These subsets of the strata are then pooled to form the random sample.
[Important: Stratified sampling is used to highlight differences between groups in a population, as opposed to simple random sampling, which treats all members of a population as equal, with an equal likelihood of being sampled.]
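The procedure described above (divide the population into subgroups, draw at random from each subgroup in proportion to its size, then pool the draws) can be sketched in Python. This is a minimal illustration rather than a definitive implementation: the function name `stratified_sample`, the `stratum_of` callback, and the 1,000-graduate population are all hypothetical and are not taken from the example in the text.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, sample_size, seed=None):
    """Draw a proportionally stratified random sample.

    population  -- list of members (any objects)
    stratum_of  -- function mapping a member to its stratum label
    sample_size -- total number of members to draw
    """
    rng = random.Random(seed)

    # Step 1: divide the population into non-overlapping strata.
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)

    # Step 2: draw from each stratum in proportion to its size.
    sample = []
    for members in strata.values():
        k = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, k))

    # Step 3: the pooled per-stratum draws form the final sample.
    return sample


# Hypothetical population: 600 graduates under 30 and 400 aged 30 or over.
graduates = [{"id": i, "age_band": "under 30" if i < 600 else "30+"}
             for i in range(1000)]
picked = stratified_sample(graduates, lambda g: g["age_band"], 100, seed=1)
```

Because each stratum's share is rounded independently, the pooled sample can differ from `sample_size` by a few members when the shares are not whole numbers; here the shares are exactly 60 and 40.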
Example of stratified random sampling
Suppose a research team wants to determine the GPA of college students across the United States. The team would have difficulty collecting data from all 21 million college students, so it decides to take a random sample of the population using 4,000 students.
Now suppose the team examines the various attributes of the sample participants and wonders whether GPA differs by student major. Suppose it finds that 560 students major in English, 1,135 in science, 800 in computer science, 1,090 in engineering, and 415 in mathematics. The team wants to use a proportionally stratified random sample, in which each stratum's share of the sample matches its share of the population.
Suppose the team researches the demographics of college students in the United States and finds the percentage of students in each major: 12% major in English, 28% in science, 24% in computer science, 21% in engineering, and 15% in mathematics. Five strata are therefore created by the stratified random sampling process.
Next, the team needs to confirm that each stratum's share of the sample matches its share of the population. It finds that the proportions are not equal. The team then needs to resample 4,000 students from the population, randomly selecting 480 English, 1,120 science, 960 computer science, 840 engineering, and 600 mathematics students.
With these, the team has a proportionally stratified random sample of college students that better represents the majors of U.S. students. The researchers can then highlight a specific stratum, observe the varying fields of study of U.S. college students, and observe the different grade point averages.
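The arithmetic behind this example can be checked with a short script. The figures are the ones quoted in the text; the variable names are illustrative only.

```python
# Majors observed in the initial random sample of 4,000 students.
observed = {"English": 560, "Science": 1135, "Computer science": 800,
            "Engineering": 1090, "Mathematics": 415}

# Share of each major in the population, in percent.
population_pct = {"English": 12, "Science": 28, "Computer science": 24,
                  "Engineering": 21, "Mathematics": 15}

sample_size = sum(observed.values())  # 4,000

# Target count per major = population share x sample size.
target = {major: sample_size * pct // 100
          for major, pct in population_pct.items()}

# Positive gap: the major is over-represented in the initial sample.
for major in observed:
    gap = observed[major] - target[major]
    print(f"{major:17} observed {observed[major]:5} "
          f"target {target[major]:5} gap {gap:+5}")
```

The targets come out to 480, 1,120, 960, 840, and 600, matching the resampled counts given above. The gaps show, for instance, that engineering is over-represented by 250 students in the initial sample and mathematics is under-represented by 185.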
Simple random sample and stratified random sample
Both simple and stratified random samples are statistical measurement tools. A simple random sample is used to represent the entire data population. A stratified random sample divides the population into smaller groups, or strata, based on shared characteristics.
A simple random sample is often used when very little information is available about the data population, when the data population has too many differences to be divided into subsets, or when there is only one distinct characteristic among the data population.
For example, a candy company may want to study its customers' buying habits to determine the future of its product line. If it has 10,000 customers, it can select 100 of them as a random sample. It can then apply what it learns from those 100 customers to the rest of its customer base. Unlike stratification, the 100 members are sampled purely at random, without regard to their individual characteristics.
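For comparison, the candy company's simple random sample takes only a few lines. This sketch assumes customers are identified by hypothetical integer IDs from 1 to 10,000; the fixed seed is there only to make the draw repeatable.

```python
import random

rng = random.Random(42)  # fixed seed so the draw is repeatable

# Hypothetical customer IDs standing in for the 10,000 customers.
customers = list(range(1, 10_001))

# Every customer has the same chance of selection; no attributes are used.
sample = rng.sample(customers, 100)
```

`random.sample` draws without replacement, so no customer appears twice in the sample.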
Proportional and disproportionate stratification
Stratified random sampling ensures that each subgroup of a given population is adequately represented within the whole sample population of a study. Stratification can be proportional or disproportionate. In the proportional method, the sample size of each stratum is proportional to the population size of that stratum.
For example, if a researcher wants a sample of 50,000 graduates stratified by age range, the proportionally stratified random sample is obtained with the formula (sample size / population size) x stratum size. The table below assumes a population of 180,000 MBA graduates per year.
Age group | 24-28 | 29-33 | 34-37 | Total
Number of people in stratum | 90,000 | 60,000 | 30,000 | 180,000
Stratum sample size | 25,000 | 16,667 | 8,333 | 50,000
The stratum sample size for MBA graduates aged 24-28 is calculated as (50,000 / 180,000) x 90,000 = 25,000. The same method is used for the other age groups. Now that the stratum sample sizes are known, the researcher can perform simple random sampling within each stratum to select survey participants. That is, 25,000 graduates are randomly selected from the 90,000 in the 24-28 age group, 16,667 from the 60,000 in the 29-33 age group, and so on.
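The formula (sample size / population size) x group size can be applied to every age group at once. The sketch below uses the population figures from the table; the function name `proportional_allocation` is hypothetical.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Apply (sample size / population size) x stratum size to each stratum."""
    population_size = sum(stratum_sizes.values())
    return {label: round(sample_size * size / population_size)
            for label, size in stratum_sizes.items()}


# Population per age group, as given in the table.
stratum_sizes = {"24-28": 90_000, "29-33": 60_000, "34-37": 30_000}
allocation = proportional_allocation(stratum_sizes, 50_000)
print(allocation)  # {'24-28': 25000, '29-33': 16667, '34-37': 8333}
```

Here the rounded values happen to sum to exactly 50,000. In general, rounding each group independently can leave the total off by one or two, in which case the allocation needs a small manual adjustment.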
In a disproportionate stratified sample, the sample size of each stratum is not proportional to its size in the population. The researcher may decide, for example, to sample 1/2 of the graduates aged 34-37 and 1/3 of the graduates aged 29-33.
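To make the contrast concrete, the sketch below applies the sampling fractions of 1/3 and 1/2 mentioned above to the population figures from the table (60,000 graduates aged 29-33 and 30,000 aged 34-37). Pairing those fractions with the table's figures is an assumption made here for illustration.

```python
from fractions import Fraction

# Population per age group, taken from the table above.
stratum_sizes = {"29-33": 60_000, "34-37": 30_000}

# Sampling fractions chosen by the researcher, as in the text.
sampling_fraction = {"29-33": Fraction(1, 3), "34-37": Fraction(1, 2)}

sample_sizes = {label: int(stratum_sizes[label] * sampling_fraction[label])
                for label in stratum_sizes}

# Under the proportional design, every age group shares one fraction.
proportional_fraction = Fraction(50_000, 180_000)
```

This yields 20,000 graduates from the 29-33 group and 15,000 from the 34-37 group, whereas the proportional design samples every group at the same fraction of 5/18.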
It is important to note that one person cannot belong to more than one stratum. Each entity must fit in exactly one stratum. Overlapping subgroups mean that some individuals have a higher chance of being selected for the survey, which completely negates the concept of stratified sampling as a type of probability sampling.
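Before sampling, it can be worth verifying that no member has been assigned to more than one subgroup. The check below is a minimal sketch; the function name `find_overlaps` and the member IDs are hypothetical.

```python
from collections import Counter


def find_overlaps(strata):
    """Return members that appear in more than one stratum.

    strata -- dict mapping a stratum label to a collection of member IDs
    """
    counts = Counter(member
                     for members in strata.values()
                     for member in set(members))
    return sorted(member for member, n in counts.items() if n > 1)


# Hypothetical member IDs; member 3 is wrongly placed in two strata.
strata = {"24-28": [1, 2, 3], "29-33": [3, 4], "34-37": [5]}
overlaps = find_overlaps(strata)
```

An empty result means the subgroups are mutually exclusive. In this example the result is `[3]`, so member 3 would need to be assigned to a single group before sampling.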
Portfolio managers can use stratified random sampling to build portfolios that replicate an index, such as a fixed income index.
Benefits of stratified random sampling
The main advantage of stratified random sampling is that it captures key population characteristics in the sample. Similar to a weighted average, this sampling method produces characteristics in the sample that are proportional to those of the overall population. Stratified random sampling works …