Apache Commons Math: Descriptive Statistics

ASHWANI SINGH
3 min read · Oct 9, 2021

1. Overview

We frequently need mathematical tools, and sometimes java.lang.Math is simply not enough. Fortunately, Apache Commons aims to fill the gaps in the standard library, and Apache Commons Math does so for mathematics.

Apache Commons Math is one of the largest open-source libraries of mathematical functions and utilities for Java. It is divided into several packages, but this article focuses on descriptive statistics.

2. Maven Configuration

To start, we need to add the commons-math3 dependency to our pom.xml (the latest version can be found on Maven Central):

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-math3</artifactId>
    <version>3.6.1</version>
</dependency>

3. Descriptive Statistics

The package org.apache.commons.math3.stat provides several tools for statistical computations. Within it, the DescriptiveStatistics class (org.apache.commons.math3.stat.descriptive) maintains a dataset of values of a single variable and computes descriptive statistics from the stored data.

3.1 Window Size:
Each dataset has a window size property that limits the number of values that can be stored in it. The default is infinite, which puts no limit on the size of the dataset. The default should be used with caution, as the backing store will grow without bound in this case.

DescriptiveStatistics ds = new DescriptiveStatistics(windowSize);

For example, if the window size is set to 3 and the values {1,2,3,4,5} have been added in that order, then the available values are {3,4,5} and all reported statistics are based on these values. If the window size is later decreased and the dataset holds more elements than the new size, values from the front of the array are discarded to shrink the dataset.
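The rolling behaviour described above can be checked with a small program. This is a minimal sketch, assuming commons-math3 is on the classpath; the class name WindowSizeDemo and its helper methods are made up for illustration, while getValues() and setWindowSize() are existing DescriptiveStatistics methods.

```java
import java.util.Arrays;

import org.apache.commons.math3.stat.descriptive.DescriptiveStatistics;

public class WindowSizeDemo {

    // Adds 1, 2, 3, 4, 5 in order to a dataset capped at windowSize values.
    static DescriptiveStatistics fill(int windowSize) {
        DescriptiveStatistics ds = new DescriptiveStatistics(windowSize);
        for (int i = 1; i <= 5; i++) {
            ds.addValue(i);
        }
        return ds;
    }

    // Values still held after filling with the given window size.
    static double[] retainedValues(int windowSize) {
        return fill(windowSize).getValues();
    }

    // Values still held after filling and then shrinking the window.
    static double[] retainedAfterShrink(int windowSize, int smallerWindow) {
        DescriptiveStatistics ds = fill(windowSize);
        ds.setWindowSize(smallerWindow); // drops values from the front
        return ds.getValues();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(retainedValues(3)));         // [3.0, 4.0, 5.0]
        System.out.println(fill(3).getMean());                          // 4.0
        System.out.println(Arrays.toString(retainedAfterShrink(3, 2))); // [4.0, 5.0]
    }
}
```

Only the most recent values survive, so the mean reported for a window of 3 is the mean of {3,4,5}, not of all five inputs.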

3.2 Statistical Operations:
The very first step is to add the values to the dataset. Once the values are added to the dataset we can perform different statistical operations on that dataset.

getWindowSize(): Returns the maximum number of values that can be stored in the dataset, or INFINITE_WINDOW (-1) if there is no limit.
addValue(double v): Adds the value to the dataset.
getSum(): Returns the sum of the available values.
getMax(): Returns the maximum of the available values.
getMean(): Returns the arithmetic mean of the available values.
getMin(): Returns the minimum of the available values.
getPercentile(double p): Returns an estimate for the pth percentile of the stored values.
getStandardDeviation(): Returns the standard deviation of the available values, i.e. the square root of the (sample) variance.
getVariance(): Returns the (sample) variance of the available values.
getPopulationVariance(): Returns the population variance of the available values.
getGeometricMean(): Returns the geometric mean of the available values.
getSkewness(): Returns the skewness of the available values.
getKurtosis(): Returns the kurtosis of the available values.
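The two variance methods in the list differ only in the divisor: the sample variance divides the sum of squared deviations by n - 1, the population variance by n. The plain-Java sketch below illustrates that difference; it is not the library's implementation, the class and method names are hypothetical, and it assumes at least two values.

```java
public class VarianceSketch {

    // Sum of squared deviations from the arithmetic mean.
    static double sumOfSquaredDeviations(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / values.length;

        double squares = 0.0;
        for (double v : values) {
            squares += (v - mean) * (v - mean);
        }
        return squares;
    }

    // Sample variance: divide by n - 1 (what getVariance() reports).
    static double sampleVariance(double[] values) {
        return sumOfSquaredDeviations(values) / (values.length - 1);
    }

    // Population variance: divide by n (what getPopulationVariance() reports).
    static double populationVariance(double[] values) {
        return sumOfSquaredDeviations(values) / values.length;
    }

    public static void main(String[] args) {
        double[] values = {2, 4, 4, 4, 5, 5, 7, 9};
        // Mean is 5 and the squared deviations sum to 32.
        System.out.println(populationVariance(values)); // 4.0
        System.out.println(sampleVariance(values));     // 32 / 7, about 4.571
    }
}
```

For large datasets the two results are close; for small ones the choice matters, so it is worth picking the method that matches whether the data is a sample or the whole population.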

4. Code Implementation

Now let's see how to call the statistical methods above using DescriptiveStatistics in Java code.

package statistics;

import java.util.List;

import org.apache.commons.math3.stat.descriptive.DescriptiveStatistics;

public class StatisticsUtil {

    public static void performOperation(List<Double> values) {
        // No window size is given, so the dataset is unbounded
        DescriptiveStatistics ds = new DescriptiveStatistics();
        values.forEach(ds::addValue);

        // Returns INFINITE_WINDOW (-1) for an unbounded dataset
        int windowSize = ds.getWindowSize();

        // Basic summary statistics
        double sum = ds.getSum();
        double max = ds.getMax();
        double min = ds.getMin();
        double mean = ds.getMean();

        // Percentiles; the 50th percentile is the median
        double median = ds.getPercentile(50);
        double sixtyFifthPercentile = ds.getPercentile(65);

        // Spread
        double standardDeviation = ds.getStandardDeviation();
        double variance = ds.getVariance();
        double twoSigma = 2 * standardDeviation;
        double populationVariance = ds.getPopulationVariance();

        // Other means and shape of the distribution
        double geometricMean = ds.getGeometricMean();
        double skewness = ds.getSkewness();
        double kurtosis = ds.getKurtosis();
    }
}
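To see concrete numbers, a few of these calls can be run on a tiny dataset. The following is a minimal sketch, assuming commons-math3 is on the classpath; the class SmallDatasetDemo and its helper datasetOf are hypothetical names used only for this illustration.

```java
import org.apache.commons.math3.stat.descriptive.DescriptiveStatistics;

public class SmallDatasetDemo {

    // Builds an unbounded dataset from the given values.
    static DescriptiveStatistics datasetOf(double... values) {
        DescriptiveStatistics ds = new DescriptiveStatistics();
        for (double v : values) {
            ds.addValue(v);
        }
        return ds;
    }

    public static void main(String[] args) {
        DescriptiveStatistics ds = datasetOf(1, 2, 3, 4, 5);

        System.out.println("window size         = " + ds.getWindowSize());         // -1
        System.out.println("sum                 = " + ds.getSum());                // 15.0
        System.out.println("mean                = " + ds.getMean());               // 3.0
        System.out.println("median              = " + ds.getPercentile(50));       // 3.0
        System.out.println("sample variance     = " + ds.getVariance());           // 2.5
        System.out.println("population variance = " + ds.getPopulationVariance()); // 2.0
    }
}
```

The squared deviations from the mean sum to 10, which gives 10 / 4 = 2.5 for the sample variance and 10 / 5 = 2.0 for the population variance.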

5. Conclusion

In this article, we have covered only the tip of the iceberg; the complete list of statistics can be found in the user guide linked in the reference below. If you want to build your own service to deal with statistical data, trust me, this library will make your life simpler.
Tap the 👏 button and follow if you find this article interesting. Leave your comments if you find any required details are missing.

Reference: https://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics

ASHWANI SINGH

Software Engineer | Backend Developer | NITK Surathkal | ISC BHU