Assessing Defective Electronic Components: Sampling, Confidence Intervals, and Operating Characteristic Curves

When dealing with large production runs, such as a batch of a million electronic components, assessing the quality or defect proportion can be challenging. In this blog post, we'll explore sampling methods, confidence intervals, and the probability framework for tackling this problem. We'll also delve into building operating characteristic (OC) curves to evaluate the performance of our sampling plan. [1]

Sampling Methods

To obtain a representative sample from a large production run, we can use the following sampling methods:

Simple Random Sampling (SRS): Each component has an equal probability of being selected. This method is unbiased and easy to implement.
Stratified Sampling: If the components have distinct subgroups or strata based on certain characteristics, we can divide the population into strata and randomly sample from each stratum proportionally to its size in the population.
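
As a rough illustration of the two methods, here is a minimal NumPy sketch. The 2% defect rate, the three production lines used as strata, and the function names `simple_random_sample` and `stratified_sample` are all assumptions made for the example, not details of any real production run.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical population: True = defective, with an assumed 2% defect rate.
population_size = 1_000_000
is_defective = rng.random(population_size) < 0.02

# Hypothetical strata: three production lines with unequal shares of output.
line = rng.choice(3, size=population_size, p=[0.5, 0.3, 0.2])


def simple_random_sample(n):
    """Draw n distinct component indices, each equally likely."""
    return rng.choice(population_size, size=n, replace=False)


def stratified_sample(n):
    """Sample each stratum in proportion to its share of the population."""
    parts = []
    for stratum in np.unique(line):
        members = np.flatnonzero(line == stratum)
        # Proportional allocation; rounding can shift the total by a unit or two.
        k = round(n * members.size / population_size)
        parts.append(rng.choice(members, size=k, replace=False))
    return np.concatenate(parts)


srs_idx = simple_random_sample(1000)
strat_idx = stratified_sample(1000)
print("SRS defect proportion:       ", is_defective[srs_idx].mean())
print("Stratified defect proportion:", is_defective[strat_idx].mean())
```

Both estimates should land near the assumed 2%. Stratification mainly pays off when the defect rate differs noticeably between strata, which this toy population does not model.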

Confidence Intervals

Confidence intervals provide a range of values likely to contain the true population parameter (proportion of defective components) with a certain level of confidence. The most common confidence level is 95%. The confidence interval for the proportion of defective components is given by:

$$ \hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} $$

Where:

$\hat{p}$ is the sample proportion of defective components (aka relative frequency)
$z^*$ is the critical value from the standard normal distribution (e.g., 1.96 for a 95% confidence level)
$n$ is the sample size
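
To make the three quantities above concrete, here is a small sketch of the normal-approximation interval, computed as the sample proportion plus or minus the critical value times the standard error. The function name `wald_interval` and the example of 5 defects in 100 components are assumptions chosen for illustration.

```python
import math


def wald_interval(defects, n, z_star=1.96):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = defects / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    # Clip to [0, 1], since a proportion cannot fall outside that range.
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Hypothetical sample: 5 defective components out of 100 inspected.
low, high = wald_interval(defects=5, n=100)
print(f"95% CI for the defect proportion: ({low:.4f}, {high:.4f})")
```

With these inputs the interval runs from roughly 0.7% to 9.3%, which shows how wide the uncertainty remains with a sample of only 100 components.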

Probability and Statistical Framework

The Central Limit Theorem (CLT) implies that the sampling distribution of the sample proportion approaches a normal distribution as the sample size increases. For proportions, a common rule of thumb is that the approximation is reasonable when both $n\hat{p}$ and $n(1-\hat{p})$ are at least about 10, so very low defect rates call for larger samples. This allows us to make inferences about the population based on the sample data.

The standard error of the sample proportion, denoted as $SE(\hat{p})$, measures the variability of the sample proportion and is calculated as:

$$ SE(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} $$
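
The effect of sample size on the standard error is easy to check numerically. The sketch below assumes a sample proportion of 5% purely for illustration; `standard_error` is a helper name introduced here.

```python
import math


def standard_error(p_hat, n):
    """Standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


# Assumed sample proportion of 5% defective, for illustration only.
p_hat = 0.05
for n in (100, 400, 1600):
    print(f"n = {n:>4}: SE = {standard_error(p_hat, n):.4f}")
```

Each fourfold increase in $n$ halves the standard error, so precision improves with the square root of the sample size rather than linearly.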

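Larger samples estimate the population proportion more precisely, because the standard error shrinks in proportion to $1/\sqrt{n}$. To determine the minimum sample size required to achieve a desired margin of error ($e_m$) at a given confidence level (CL), we set $z^* \cdot SE(\hat{p}) = e_m$ and solve for $n$:

$$ n = \frac{(z^*)^2 \, \hat{p}(1-\hat{p})}{e_m^2} $$

Here $\hat{p}$ is a planning estimate of the defect proportion, for example from a pilot sample. When no estimate is available, $\hat{p} = 0.5$ gives the most conservative (largest) sample size. Round $n$ up to the next whole number.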
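
The sample-size calculation can be sketched in a few lines. The function name `min_sample_size` and the parameter `p_planning` (a guess at the defect proportion made before sampling) are assumptions introduced for this example. The sketch also ignores the finite population correction, which is negligible when the sample is a tiny fraction of a million components.

```python
import math


def min_sample_size(margin_of_error, z_star=1.96, p_planning=0.5):
    """Smallest n with z_star * sqrt(p * (1 - p) / n) <= margin_of_error.

    p_planning = 0.5 is the most conservative choice when the defect
    proportion is unknown, because p * (1 - p) is largest there.
    """
    numerator = z_star**2 * p_planning * (1 - p_planning)
    return math.ceil(numerator / margin_of_error**2)


# Margin of error of 5 percentage points, no prior estimate of the defect rate.
print(min_sample_size(0.05))

# Margin of error of 1 percentage point, assuming roughly 5% defective.
print(min_sample_size(0.01, p_planning=0.05))
```

The first call gives the familiar figure of 385 components for a 5-point margin at 95% confidence. Tightening the margin is expensive: the required sample size grows with the inverse square of $e_m$.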

Python Simulation

Let's simulate sampling 100 electronic components from a production run of 1 million components, count the defective and good components in each sample, and repeat the process many times using Monte Carlo simulation. We'll then compare the simulated results to the analytical ones.