The mean is the only measure of central tendency for which the deviations of the values from it always sum to zero. The fundamental drawback of the mean, on the other hand, is its vulnerability to outliers. A single extremely large or small value can change the value of the mean dramatically; for example, it may make the mean positive even though most of the data are negative.
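Both properties can be checked numerically. The sketch below uses a small made-up data set (the values are an assumption chosen for illustration) in which four of five values are negative and one is a large positive outlier.

```python
import math
from statistics import mean

# Hypothetical data: mostly negative values plus one large positive outlier.
values = [-3, -2, -2, -1, 100]

m = mean(values)

# Deviations of each value from the mean; their sum is zero up to rounding.
deviations = [x - m for x in values]
total_deviation = sum(deviations)

# Count how many values are negative even though the mean is positive.
negatives = sum(1 for x in values if x < 0)

print("mean:", m)
print("deviations sum to zero:", math.isclose(total_deviation, 0.0, abs_tol=1e-9))
print("negative values:", negatives, "of", len(values))
```

Removing the outlier from `values` makes the mean negative again, which is the sensitivity described above.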

Other measures of central tendency include the median, which is the value located in the middle of the ordered data, and the mode, which is the most frequently occurring value. These two measures are distinct: they coincide in a symmetric distribution with a single peak, but in general they can differ, and a data set can have more than one mode or, if no value repeats, no meaningful mode at all.
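As a small illustration, Python's standard `statistics` module provides both measures. The two data sets below are made up for the example; `multimode` returns every value tied for most frequent.

```python
from statistics import median, multimode

# Hypothetical data sets: one with a single mode, one with two modes.
one_mode = [1, 2, 2, 3, 9]
two_modes = [1, 1, 2, 5, 5]

median_one = median(one_mode)      # middle value of the sorted data
modes_one = multimode(one_mode)    # list of all most-frequent values

median_two = median(two_modes)
modes_two = multimode(two_modes)

print(median_one, modes_one)
print(median_two, modes_two)
```

In the second data set the median is 2 while the modes are 1 and 5, so the median and mode need not agree.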

It is important to note that while the mean is unique in making the deviations from the center sum to zero, it is not the only way of determining where the center lies. For example, if a dataset contains two values far outside the range of the others, the mean is pulled toward those values and away from the bulk of the data.

This happens far less with **the other measures**, since the median depends only on the rank order of the values and the mode only on their frequencies, not on how extreme the largest or smallest values are. For example, the quartiles divide the ordered data into four groups, each containing an equal number of observations, regardless of how far apart the extreme values lie.

- What are the disadvantages of measures of central tendency?
- What is one concern about using the mean as the measure of central tendency?
- What is the best measure of central tendency and dispersion?
- Which measure of central tendency is most affected by a skewed distribution?
- Why is the mean an appropriate measure of central tendency?
- Which methods are used to measure central tendency and dispersion?
- Which of the 3 measures of central tendency must be equal in a perfectly normal distribution?

The downsides of using the mean as a measure of **central tendency** are that it is particularly vulnerable to outliers (observations that are significantly different from the majority of observations in a data set) and that it can be misleading when the data are skewed rather than symmetric. These problems can be reduced by using another statistical measure instead, for example the median or mode.

When your data distribution is continuous and symmetrical, such as when your data is normally distributed, the mean is typically the best measure of central tendency to use. If you need a measure of the center that is resistant to outliers, use the median instead.

When your data distribution is not continuous or symmetrical, such as when your data is not normally distributed, there are other measures of **central tendency** that can be used instead. For example, if your data has **an exponential distribution**, the median is often reported in place of the mean, because the long right tail pulls the mean above the typical value. The mode is most useful for discrete or categorical data in which values repeat; when every value occurs only once, as with most continuous measurements, the mode is uninformative and the mean or median is used instead.

The standard deviation is commonly used to describe the spread of data around the mean. So if your data follows a normal distribution, you can use the mean and standard deviation to describe its central tendency and variability respectively. If your data does not follow a normal distribution, some other measure of variability may be needed instead. For example, if your data is exponentially distributed, you can describe it with the median and the interquartile range, or with the distribution's scale parameter.
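The contrast between the two kinds of summary can be sketched as follows. The data are made up, and the interquartile range (IQR) is used here as one common outlier-resistant measure of spread; it is an assumption of this example rather than the only possible choice.

```python
from statistics import mean, median, quantiles, stdev

# Hypothetical right-skewed data with one very large value.
data = [1, 2, 2, 3, 3, 4, 5, 8, 40]

# Summary suited to roughly normal data: mean and sample standard deviation.
center_mean = mean(data)
spread_sd = stdev(data)

# Summary that resists the large value: median and interquartile range.
center_median = median(data)
q1, q2, q3 = quantiles(data, n=4)  # three cut points splitting the data into quarters
iqr = q3 - q1

print("mean and standard deviation:", center_mean, spread_sd)
print("median and IQR:", center_median, iqr)
```

For this data set the single large value inflates both the mean and the standard deviation, while the median and IQR stay close to where most of the values lie.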

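Categorical data consists of labels rather than quantities. In the simplest case, binary data, there are **only two possible values**: yes/no, right/wrong, true/false. Categorical data in general can have any number of categories, and the mode is the measure of central tendency that applies to it.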

Because it provides an average of **all the values** in the data set, the mean is the most commonly used measure of central tendency. The median is preferable to the mean when dealing with data from skewed distributions because it is far less affected by exceptionally large or small numbers. The mode (the most frequent value) is a less common measure of central tendency. The standard deviation, by contrast, is a measure of dispersion, not of central tendency.

However, when the data are not strongly skewed, the mean is commonly regarded as **the best measure** of central tendency since it is the only one of the three measures that uses all of the values in the data set to calculate its value, and any change in any of the scores will alter the mean's value. This makes the mean particularly useful when trying to estimate the center of a distribution from a sample of observations.

The mode, median, and mean of a distribution are three often used metrics of central tendency. In a distribution, the mode is the most common value or values. The median is the value in the midpoint of a distribution. The mean is calculated by taking the total of all the values and dividing it by the number of values. Central tendency measures such as the mode, median, and mean can be applied to discrete or continuous distributions.

The standard deviation is a measure of dispersion that indicates how far the data points typically lie from the mean. It is the square root of the variance, which is the average squared deviation from the mean; the two are related but not the same quantity. Measures of dispersion like the standard deviation can be applied to both discrete and continuous distributions.
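The standard deviation is the square root of the variance, so the two are related but are not the same quantity. A minimal check with made-up data, using the population versions from Python's `statistics` module:

```python
import math
from statistics import pstdev, pvariance

# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

var = pvariance(data)  # average squared deviation from the mean
sd = pstdev(data)      # square root of the variance

print("variance:", var)
print("standard deviation:", sd)
print("sd squared equals variance:", math.isclose(sd ** 2, var))
```

The standard deviation is expressed in the same units as the data, while the variance is in squared units, which is one reason the standard deviation is usually the one reported.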

The range is the highest value minus the lowest value. This measurement indicates the total span of a set of numbers; it does not involve the mean, and because it depends only on the two most extreme values it is very sensitive to outliers. The range is expressed in the units of the data, so it reflects the scale on which a variable is measured. For example, the range of **income data** is likely to be much larger than the range of **age data**.
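The calculation is short enough to write directly. The income and age figures below are invented for illustration, and `value_range` is a hypothetical helper name.

```python
def value_range(values):
    """Return the highest value minus the lowest value."""
    return max(values) - min(values)

# Hypothetical data: annual incomes and ages for five people.
incomes = [18_000, 32_000, 45_000, 61_000, 250_000]
ages = [19, 23, 35, 41, 67]

income_range = value_range(incomes)
age_range = value_range(ages)

print("income range:", income_range)
print("age range:", age_range)
```

Only the two extreme values enter the result, so a single unusual observation can change the range substantially.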

The sample mean is simply the average of the individual values in **your sample**. It provides a way to estimate the mean of a population when you don't have access to the whole population. The sample mean is particularly useful when you can observe only a small subset of the population. For example, if you were trying to estimate the mean age of students at a large school and could survey only 20 students, then you would use the sample mean of those 20 rather than asking **every student** their age.
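The idea can be sketched with a simulated population. Everything here is an assumption made for the example: the population size of 2,000, the age range of 11 to 18, and the fixed random seed used to make the run repeatable.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the example is repeatable

# Hypothetical population: ages of 2,000 students, each between 11 and 18.
population = [random.randint(11, 18) for _ in range(2000)]

# Survey a simple random sample of 20 students without replacement.
sample = random.sample(population, 20)

population_mean = mean(population)  # usually unknown in practice
sample_mean = mean(sample)          # the estimate we can actually compute

print("population mean:", population_mean)
print("sample mean:", sample_mean)
```

The sample mean will generally be close to, but not exactly equal to, the population mean, and larger samples tend to give closer estimates.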

The arithmetic mean, median, and mode are three often used measurements of a distribution's central tendency. The mean, median, and mode of **a completely symmetrical, non-skewed distribution** with a single peak are all equal. However, for **asymmetric or skewed distributions** these values may differ.
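A quick comparison on two made-up data sets, one symmetric with a single peak and one right-skewed, shows the three measures agreeing in the first case and separating in the second.

```python
from statistics import mean, median, mode

# Hypothetical data: symmetric around 3 with a single most frequent value.
symmetric = [1, 2, 3, 3, 3, 4, 5]

# Hypothetical data: right-skewed, with one large value in the tail.
skewed = [1, 1, 1, 2, 3, 4, 16]

symmetric_summary = (mean(symmetric), median(symmetric), mode(symmetric))
skewed_summary = (mean(skewed), median(skewed), mode(skewed))

print("symmetric (mean, median, mode):", symmetric_summary)
print("skewed    (mean, median, mode):", skewed_summary)
```

In the skewed data set the large value pulls the mean up to 4, while the median stays at 2 and the mode at 1.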

The mean is simply the sum of the values divided by the number of items: $\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$. The median is the value such that at least half of the observations are less than or equal to it and at least half are greater than or equal to it. For a sample of size $n$ sorted so that $x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$, it can be calculated as follows: if $n$ is odd, $\text{med} = x_{((n+1)/2)}$; if $n$ is even, $\text{med} = \frac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right)$. For example, if five values are ranked from largest to smallest, the median is the same as **the third highest value**. Note that if there is an even number of values, there is no single middle value, and the median is the average of the two middle values.
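The median is found by sorting the values and taking the middle one, or the average of the two middle ones when the count is even. A minimal sketch follows; `median_of` is a hypothetical helper name, and Python's built-in `statistics.median` does the same job.

```python
def median_of(values):
    """Return the median: the middle sorted value, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value.
        return ordered[mid]
    # Even count: average the two values around the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_example = median_of([9, 7, 5, 4, 2])  # five values: third highest is the median
even_example = median_of([1, 2, 3, 4])    # four values: average of 2 and 3

print(odd_example, even_example)
```

With five values the result is the third highest value, and with four values it falls halfway between the second and third.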

For skewed distributions, these statistics will also differ, so it can help to report a third measure of central tendency alongside the mean and median. The mode is the value which occurs most frequently in the data set. For example, in the data set 9, 7, 7, 5, 2 the mode is 7, because it is the only value that appears twice; unlike the median, the mode does not depend on where a value falls in the ranking.