Kolmogorov-Smirnov test — using R
The Kolmogorov-Smirnov (K-S) test is widely used for univariate distributions to check whether:
- a sample from an unknown distribution follows a known reference distribution, or
- two samples from unknown distributions come from the same distribution.
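The test is based on the distance between cumulative distribution functions: the empirical CDF of a sample is compared either with a reference CDF (scenario 1) or with the empirical CDF of a second sample (scenario 2).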
For scenario 1, suppose we have a random sample X₁, X₂, …, Xₙ from some unknown distribution Z, and we want to test whether Z is equal to some known distribution Z₀.
H₀: Z = Z₀ vs H₁: Z ≠ Z₀
The Kolmogorov-Smirnov statistic Dₙ is the maximum absolute distance between the empirical CDF of the sample and the CDF of Z₀. The empirical CDF (EDF) at a point x is (number of observations ≤ x) / (total number of observations).
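The definition of Dₙ can be checked by hand. The following is a minimal sketch, assuming a hypothetical sample of 50 Gamma draws with shape 2 and the default rate of 1; the seed and sample size are illustrative only. Because the EDF is a step function, the gap to the reference CDF has to be measured just before and just after each jump.

```r
# Hypothetical data: 50 draws from a Gamma distribution with shape 2
set.seed(1)
x <- rgamma(50, shape = 2)
n <- length(x)

# Reference CDF Z0 evaluated at the sorted sample points
xs <- sort(x)
F0 <- pgamma(xs, shape = 2)

# The EDF jumps from (i - 1)/n to i/n at the i-th sorted observation,
# so the largest gap is checked on both sides of each jump
D_plus  <- max(seq_len(n) / n - F0)
D_minus <- max(F0 - (seq_len(n) - 1) / n)
D_n <- max(D_plus, D_minus)

# Cross-check against the statistic reported by the built-in test
res <- ks.test(x, "pgamma", shape = 2)
all.equal(D_n, unname(res$statistic))
```

The hand-computed value should agree with the statistic returned by ks.test up to floating-point tolerance.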
Here is an example comparison for random samples drawn from a Gamma distribution with shape 2.

Once we have Dₙ, we compare it with the critical value D𝒸ᵣᵢₜ,₀.₀₅ at the 0.05 significance level.

If Dₙ < D𝒸ᵣᵢₜ,₀.₀₅, we fail to reject the null hypothesis H₀; otherwise we reject it.
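The decision rule can be sketched in a few lines of R. This assumes the common large-sample approximation D𝒸ᵣᵢₜ,α ≈ sqrt(−ln(α/2)/2)/√n, which gives roughly 1.36/√n at α = 0.05; for small samples, exact tables or the p-value from ks.test are the safer choice. The helper name ks_critical and the sample are hypothetical.

```r
# Hypothetical helper: large-sample approximation to the one-sample
# critical value, D_crit = sqrt(-log(alpha / 2) / 2) / sqrt(n)
ks_critical <- function(n, alpha = 0.05) {
  sqrt(-log(alpha / 2) / 2) / sqrt(n)
}

set.seed(2)
x <- rgamma(200, shape = 2)            # illustrative sample
D_n <- unname(ks.test(x, "pgamma", shape = 2)$statistic)
D_crit <- ks_critical(length(x))       # about 1.36 / sqrt(200)

# Compare the observed statistic with the approximate critical value
if (D_n < D_crit) {
  message("Fail to reject H0 at the 0.05 level")
} else {
  message("Reject H0 at the 0.05 level")
}
```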
For scenario 2, assume we have samples from two distributions K and L, and we test the null hypothesis H₀: K = L (the two distributions are the same) vs H₁: K ≠ L.
Let Kₘ(x) and Lₙ(x) denote the empirical CDFs of the samples from K and L, of sizes m and n respectively. The Kolmogorov-Smirnov statistic is the maximum absolute distance between the two empirical CDFs:

Dₙ = maxₓ |Kₘ(x) − Lₙ(x)|
Dₙ takes a value between 0 and 1 (0 ≤ Dₙ ≤ 1). The closer Dₙ is to 1, the stronger the evidence for the alternative hypothesis (H₁: K ≠ L). A value of Dₙ close to 0 is consistent with both samples following the same distribution.
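The two-sample statistic can also be computed directly from the two EDFs. This is a minimal sketch, assuming two hypothetical normal samples of sizes 80 and 120; the variable names and the mean shift are illustrative. It relies on base R's ecdf, which returns the empirical CDF as a function.

```r
set.seed(3)
k <- rnorm(80)                 # hypothetical sample from K, m = 80
l <- rnorm(120, mean = 0.5)    # hypothetical sample from L, n = 120

K_m <- ecdf(k)   # empirical CDF of the first sample
L_n <- ecdf(l)   # empirical CDF of the second sample

# Both EDFs are step functions that only change at observed values,
# so the maximum gap is attained at one of the pooled data points
pooled <- c(k, l)
D <- max(abs(K_m(pooled) - L_n(pooled)))

# Cross-check against the statistic reported by the built-in test
res <- ks.test(k, l)
all.equal(D, unname(res$statistic))
```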
R offers a built-in function, ks.test, which computes the K-S statistic and the p-value from the empirical CDF. Here is how we can use R to perform a two-sample K-S test.
ks.test(x, y, alternative = "two.sided", simulate.p.value = TRUE, B = 10000)
Details:
x: a numeric vector
y: a character string naming a cumulative distribution function (or the function itself), e.g. "pnorm" or "pgamma", for comparison with a known distribution, or a numeric vector for a comparison between two samples.
alternative: the alternative hypothesis, one of "two.sided", "less", or "greater". "two.sided" tests the null hypothesis that the two distributions are equal against the alternative that they differ.
simulate.p.value: use Monte Carlo simulation to compute p-values for discrete goodness-of-fit tests
B: number of replicates to use for Monte Carlo simulations (only for discrete goodness-of-fit tests)
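The different forms of the y argument can be illustrated with a short sketch, assuming two hypothetical Gamma samples of 100 observations each. The seed and parameters are illustrative, so the printed values will differ from any figures quoted elsewhere in this article.

```r
set.seed(4)
x <- rgamma(100, shape = 2)
y <- rgamma(100, shape = 2)

# One-sample test: y names a CDF; extra arguments are passed on to it
one_a <- ks.test(x, "pgamma", shape = 2)

# One-sample test: y can also be the CDF function itself
one_b <- ks.test(x, pgamma, shape = 2)

# Two-sample test: y is a second numeric vector
two <- ks.test(x, y, alternative = "two.sided")

# The result is an "htest" object; statistic and p-value are list fields
two$statistic
two$p.value
```

Passing the CDF by name or as a function gives the same statistic; the numeric-vector form switches ks.test to the two-sample test.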
Example use

Here the K-S statistic D is 0.12 and the p-value is 0.4676, which is greater than the chosen significance level of 0.05, so we fail to reject H₀, i.e. there is no evidence that sample₁ and sample₂ come from different distributions.
References:
Rizzo, M. L. (2019). KS test. In Statistical Computing with R, second edition. Chapman and Hall/CRC.