White Paper
Analysis of Erlanger Data Series:
Option Data Smoothing - Medians
By: Philip B. Erlanger, CMT
Simple moving averages are the customary tool for smoothing data such as options trading activity. We have found that medians give a better representation of that activity.
I. Introduction
Simple moving averages take into account all data within a period. For example, a 10-day average factors in every value from a ten-day period. If the data contain errors or outliers, these are reflected in the average. Options data, as reported by the data feeds from the various exchanges, is particularly prone to errors. Illiquid issues in particular are prone to vast swings that can often be viewed as outliers.
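To illustrate the point with made-up numbers (these are not actual exchange data), the short sketch below averages a hypothetical 10-day window of daily option volumes, then averages the same window again after one day is replaced by an erroneous print. The names clean_window, bad_window, and simple_average are assumptions introduced only for this example.

```python
# Hypothetical daily option volumes for one 10-day window (illustrative values only).
clean_window = [100, 102, 98, 101, 99, 103, 97, 100, 102, 98]

# The same window, except the final day carries a bad print or outlier.
bad_window = clean_window[:-1] + [5000]


def simple_average(values):
    """Arithmetic mean: every value in the window contributes equally."""
    return sum(values) / len(values)


clean_avg = simple_average(clean_window)
bad_avg = simple_average(bad_window)

print("average without the bad print:", clean_avg)
print("average with the bad print:   ", bad_avg)
```

A single bad value out of ten moves the average from about 100 to nearly 600, even though nine of the ten days are unchanged.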
Instead of factoring every value into a 10-day average, we have found it advantageous to measure the middle, or center, of the distribution of those values.
II. Medians
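The middle value in an ordered list of numbers is called the median. The specific value for the median depends on whether the data set contains an even or odd number of observations and, in the even case, whether the two middle values are the same or different. To find the median of a data set, arrange the values in order. If the number of observations is odd, the median is the middle value. If the number is even, the median is the mean of the two middle values, which is simply their common value when the two are the same.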
The median has the property that, as nearly as possible, half the data are below and half the data are above the median value.
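The odd and even cases can be sketched in a few lines of Python. This is a minimal illustration of the standard textbook convention (averaging the two middle values when the count is even), not code taken from the Erlanger data series; the function name median_of is an assumption made for this example.

```python
def median_of(values):
    """Return the median of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("median of an empty data set is undefined")

    ordered = sorted(values)   # the median depends on order, so sort first
    n = len(ordered)
    mid = n // 2

    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]

    # Even count: take the mean of the two middle values.
    # If the two are equal, the result is simply that shared value.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([3, 1, 2]))       # odd count
print(median_of([4, 1, 3, 2]))    # even count, different middle values
print(median_of([1, 2, 2, 5]))    # even count, equal middle values
```

Python's standard library provides statistics.median, which follows the same convention and would normally be preferred in practice.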
III. Influence of Extreme Values on the Median
The median uses order information in the data but makes little use of the actual magnitudes. The extremely large numbers that can occur with one day of options data have essentially no effect on the median level of options trading. Similarly, extremely small values have essentially no effect on the median. The median is not influenced by the magnitude of the extreme observations in a data set. This is valuable in a measure of options trading because changes in the median over time reflect overall changes in activity, not outliers or the rare error that can occur (especially with options data).
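As a rough sketch of how this plays out in a smoothed series, the example below applies a trailing 5-day mean and a trailing 5-day median to the same hypothetical volume series, which contains one extreme day. The data, the window length of 5, and the helper name rolling are all assumptions chosen for illustration; they are not the parameters used in the Erlanger data series.

```python
from statistics import mean, median


def rolling(values, window, stat):
    """Apply stat to each full trailing window of the given length."""
    return [
        stat(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


# Hypothetical daily option volumes; the eighth day holds a single extreme print.
volumes = [100, 102, 98, 101, 99, 103, 97, 5000, 100, 102, 98, 101]

rolling_mean = rolling(volumes, 5, mean)
rolling_median = rolling(volumes, 5, median)

for day, (avg, med) in enumerate(zip(rolling_mean, rolling_median), start=5):
    print(f"day {day:2d}  mean={avg:8.1f}  median={med}")
```

Every window that contains the extreme day has a mean above 1,000, while the median stays within the range of the ordinary values throughout.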