This document discusses various statistical concepts used to analyze and summarize data. It defines key terms like frequency distribution, histogram, measures of central tendency (mean, median, mode), measures of dispersion (range, interquartile range, standard deviation), and the normal distribution. Frequency distribution involves arranging raw data into classes and counting frequencies. A histogram is a graphical representation of a frequency distribution. Measures of central tendency indicate the central or typical value of a data set. Measures of dispersion indicate how spread out the data is around the central value. The normal distribution is a symmetrical and bell-shaped distribution defined by its mean and standard deviation.
2. Frequency distribution Raw data are arranged into classes and frequencies. Each class represents a grouping bounded by a lower limit (LL) and an upper limit (UL). Against each class, you count and record the number of observations that fall in it, called the frequency. Range = Max - Min. As a rule of thumb, the number of classes is about the square root of the number of observations, and there should be no fewer than 5 and no more than 15 classes.
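The grouping described on this slide can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `frequency_distribution`, the use of equal-width classes, and the choice to place the maximum value in the last class are all assumptions made for the example.

```python
import math

def frequency_distribution(data, num_classes=None):
    """Group raw data into equal-width classes and count each class's frequency."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("all observations are equal; the range is zero")
    if num_classes is None:
        # Rule of thumb from the slide: about sqrt(n) classes, kept between 5 and 15.
        num_classes = min(15, max(5, round(math.sqrt(len(data)))))
    width = (hi - lo) / num_classes  # class width = range / number of classes
    counts = [0] * num_classes
    for x in data:
        # Assumption: the maximum value goes into the last class
        # instead of opening an extra one.
        idx = min(int((x - lo) / width), num_classes - 1)
        counts[idx] += 1
    # Each row is (lower limit, upper limit, frequency).
    return [(lo + i * width, lo + (i + 1) * width, c) for i, c in enumerate(counts)]

# Hypothetical raw data: the integers 1 to 25.
table = frequency_distribution(list(range(1, 26)))
for lower, upper, freq in table:
    print(f"{lower:5.1f} to {upper:5.1f}: {freq}")
```

With 25 observations the rule of thumb gives 5 classes, which is also the lower bound suggested on the slide.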
3. Histogram Also called a frequency histogram. It is a graphical representation of a frequency distribution: the x-axis shows the classes and the y-axis shows the frequencies. When the pattern is symmetrical and bell shaped, it suggests a normal distribution. It is used for assessing material strength, estimating process capability, indicating the need for corrective action, and making comparisons.
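A histogram can be approximated in plain text to see the shape of a distribution. The sketch below is only an illustration: the function name `text_histogram` and the class limits and frequencies are hypothetical, and the axes are turned on their side (classes run down the page, bar length shows frequency).

```python
def text_histogram(classes, symbol="#"):
    """Return one line per class: the class interval, then a bar whose length is the frequency."""
    lines = []
    for lower, upper, freq in classes:
        lines.append(f"{lower:6.1f} - {upper:6.1f} | {symbol * freq}")
    return "\n".join(lines)

# Hypothetical class limits and frequencies, chosen to be symmetrical and bell shaped.
example_classes = [(10, 20, 2), (20, 30, 5), (30, 40, 9), (40, 50, 5), (50, 60, 2)]
chart = text_histogram(example_classes)
print(chart)
```

The longest bar sits in the middle class and the bars shorten evenly on both sides, which is the symmetrical pattern the slide associates with a normal distribution.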
4. Central Tendency Whenever we measure a characteristic across a large group, the values tend to cluster around a middle value. The most widely used measures are the mean, median, and mode.
5. Mean The sum of all observations in a set divided by the total number of observations: mean = (x1 + x2 + ... + xn) / n, where n is the total number of observations.
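As a small sketch of this definition, the mean can be computed as follows. The function name `mean` is chosen for the example, and the data are the nine values used in the inter-quartile range example on slide 11.

```python
def mean(observations):
    """Arithmetic mean: the sum of the observations divided by how many there are."""
    return sum(observations) / len(observations)

# Data reused from the inter-quartile range example (slide 11).
values = [12, 14, 11, 18, 10.5, 12, 14, 11, 9]
average = mean(values)  # 111.5 / 9
print(round(average, 2))
```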
6. Median The median is the middle observation when the data are arranged in ascending or descending order. If the sample size n is odd, the median is the ((n+1)/2)th value in the ranked data. If the sample size is even, the median lies between the two middle values; take the average of these two values.
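The odd and even cases can be sketched in a few lines. The function name `median` is chosen for the example; note that Python counts positions from 0, so the (n+1)/2-th ranked value sits at index n // 2.

```python
def median(observations):
    """Middle value of the ranked data; average of the two middle values when n is even."""
    ranked = sorted(observations)
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the (n + 1) / 2-th ranked value, i.e. index n // 2 counting from 0.
        return ranked[mid]
    # Even n: average the two middle values.
    return (ranked[mid - 1] + ranked[mid]) / 2

odd_case = median([12, 14, 11, 18, 10.5, 12, 14, 11, 9])  # nine values from slide 11
even_case = median([1, 2, 3, 4])                          # hypothetical four values
print(odd_case, even_case)
```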
7. Mode The mode is the value that occurs most often, i.e. the value with the maximum frequency.
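A data set can have more than one value tied for the highest frequency, so the sketch below returns a list. The function name `modes` is an assumption for the example, and the data are the nine values from slide 11.

```python
from collections import Counter

def modes(observations):
    """Return every value that shares the highest frequency of occurrence."""
    counts = Counter(observations)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

# Data reused from the inter-quartile range example (slide 11).
values = [12, 14, 11, 18, 10.5, 12, 14, 11, 9]
most_common = modes(values)
print(sorted(most_common))
```

In this data set 11, 12 and 14 each occur twice, so all three are returned.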
8. Measure of Dispersion - Variation It indicates how widely the distribution is spread around the central tendency. Popular measures of dispersion: range, inter-quartile range, mean absolute deviation (MAD), standard deviation, and coefficient of variation.
9. Range It is the simplest of all measures of dispersion. It is calculated as the difference between the maximum and minimum values in the data set. Range is a popular measure of variation in quality control applications.
10. Inter-quartile range The range depends entirely on the maximum and minimum values in the data set and is misleading when one of them is an extreme value. To overcome this, you can use the inter-quartile range. It is computed as the range after eliminating the highest and lowest 25% of observations in a data set arranged in ascending or descending order.
11. Example of Inter-quartile range Calculate for 12, 14, 11, 18, 10.5, 12, 14, 11, 9. Arrange in ascending order: 9, 10.5, 11, 11, 12, 12, 14, 14, 18. Ignore the first two and last two observations (roughly 25% of the nine values at each end). For the remaining 11, 11, 12, 12, 14, calculate the range, i.e. 14 - 11 = 3. The range for this problem is 18 - 9 = 9. The inter-quartile range of 3 is much smaller than the range of 9, which shows that it is less sensitive to extreme values.
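The calculation on this slide can be checked with a short sketch. The function name `trimmed_range` and the rounding rule (drop the whole number of observations in 25% of n from each end) are assumptions that match the slide's choice of two observations per end; library routines that interpolate quartile positions may give a slightly different figure.

```python
def trimmed_range(observations, trim_fraction=0.25):
    """Range of the data after dropping the lowest and highest trim_fraction of observations."""
    ranked = sorted(observations)
    k = int(len(ranked) * trim_fraction)  # observations dropped from each end (2 when n = 9)
    kept = ranked[k:len(ranked) - k]
    return kept[-1] - kept[0]

values = [12, 14, 11, 18, 10.5, 12, 14, 11, 9]
full_range = max(values) - min(values)  # 18 - 9
inter_quartile = trimmed_range(values)  # 14 - 11
print(full_range, inter_quartile)
```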
12. MAD and SD Mean absolute deviation and standard deviation. MAD: the average of the deviations measured from the arithmetic mean, with all deviations treated as positive. SD: the classic measure of dispersion. It is based on all observations and plays a vital role in testing hypotheses and forming confidence intervals. It is the positive square root of the variance. Variance is the average of the squared deviations of each item from the mean in a data set.
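These definitions can be sketched directly. The function names are chosen for the example, and the variance here divides by n (the "average" wording on the slide); the sample form that divides by n - 1 is a common alternative and is not shown. The eight data values are hypothetical.

```python
import math

def mean_absolute_deviation(observations):
    """Average of the absolute deviations from the arithmetic mean."""
    m = sum(observations) / len(observations)
    return sum(abs(x - m) for x in observations) / len(observations)

def variance(observations):
    """Average of the squared deviations from the mean (assumption: divide by n)."""
    m = sum(observations) / len(observations)
    return sum((x - m) ** 2 for x in observations) / len(observations)

def standard_deviation(observations):
    """Positive square root of the variance."""
    return math.sqrt(variance(observations))

# Hypothetical data with mean 5.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
mad = mean_absolute_deviation(sample)
var = variance(sample)
sd = standard_deviation(sample)
print(mad, var, sd)
```

Squaring gives large deviations more weight than taking absolute values does, which is why the standard deviation is at least as large as the MAD for the same data.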
14. Normal distribution It has two parameters, the mean and the standard deviation. If the tails of the normal distribution are extended, they approach the x-axis ever more closely but never touch it.
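The behaviour of the curve can be seen by evaluating the standard formula for the normal density. This is a sketch for illustration: the function name `normal_pdf` and the default values (mean 0, standard deviation 1) are assumptions for the example.

```python
import math

def normal_pdf(x, mean=0.0, sd=1.0):
    """Height of the normal curve at x for the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

# The curve is symmetrical about the mean and highest at the mean.
print(normal_pdf(-1.0), normal_pdf(0.0), normal_pdf(1.0))

# Far from the mean, the height keeps shrinking towards zero but stays positive.
for x in (2.0, 4.0, 6.0):
    print(x, normal_pdf(x))
```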
16. Thank you For any queries, please contact anubhawalia@gmail.com. Principal Trainer – Soft Skills and Quality