Six Sigma Training Material: Interpreting the Fundamentals of Statistics
5 Notes: Descriptive and inferential are the two classic divisions of statistics. Descriptive statistics are used to characterize populations. Inferential statistics is a relatively new way to draw conclusions about populations by using relatively small samples of data from the population. One of the goals of Six Sigma is to derive an inferential statistical model and then translate it into a practical process model. This is known as y = f(x).

6 Notes: Data is collected not to evaluate each individual data point, but to generate statistics. Statistics help to describe the process (or population) and determine its behavior. As a process is delivering a good or service, an analyst can dip into the stream of output, gather a representative sample, calculate the statistics of the sample, and then draw inferences about the population defined by the total process output.

7 Notes: Many questions about a process can be answered using statistics. The customer is interested in receiving product that meets specific characteristics. The process needs to be sampled to determine how well it is performing against the customer's expectations. Statistics provides a quantitative means for this examination.

8 Notes: Four attributes of a process need to be determined in order to fully profile its performance.
LOCATION: where the process output tends to cluster
SPREAD: how much variation exists around the cluster point
SHAPE: what the pattern of frequency is
CONSISTENCY: whether the snapshot analysis is useful for decision-making

9 Notes: The mean is the most common measure of location (central tendency). However, if the data is skewed or not symmetric, the mean can be misleading. For symmetric distributions, the median and the mean are identical. The median is the center value in a list of data that is sorted in rank order. For an odd number of data points, the median is the middle value in the list. For an even number of data points, the median is the average of the two middle data points. The mode is the value that occurs with the most frequency in the sample. The mode is most often used as a measure of the most popular option in an opinion poll.

11 Notes: While the range does provide a measurement of the extreme values of a sample, it does not give information about the variability of the data about the mean. Variance and standard deviation describe how the values in the sample are dispersed around the sample mean.

15 Notes:
Symmetrical: the process is just as likely to output a low value as a high value
Skewed: values tend to cluster at one side of the mean
Bimodal: could signal two overlapping processes
Truncated: could indicate that someone (or something) is sorting out values over (or under) a specific value

16 Notes: Box Plot gives a graphical summary of the values in a single column and helps you identify extreme values. The default boxplot display consists of a rectangular box, representing roughly the middle 50% (the interquartile range, or IQR) of the data, and lines (or "whiskers") extending to either side, indicating the general extent of the data. Minitab marks the median value inside the box. Minitab also marks outliers. A boxplot can also depict a confidence interval (CI) for the population median.

20 Notes: P-value
• The p-value expresses a probability.
• A low p-value (less than 0.05) is considered significant.
[The remaining text of this slide is illegible in the source.]

21 Notes: Test of the normal distribution. The test assumes that the data are symmetric. Example: use MINITAB (Calc > Random Data > Normal) to generate 100 random normal numbers in C1. Normality Test generates a normal probability plot and performs a hypothesis test to examine whether or not the observations follow a normal distribution. Some statistical procedures, such as a Z- or t-test, assume that the samples were drawn from a normal distribution. Use this procedure to test the normality assumption.

22 Notes: Test of the normal distribution. Run the normality test in MINITAB with Stat > Basic Statistics > Normality Test.

23 Notes: Output of the normality test
• The red line represents a perfect normal distribution.
• The black points are the input data.
• The p-value measures how well the input data agree with the normal line.
• A p-value greater than 0.05 indicates that the data can be treated as normal.
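The box plot described in the notes for slide 16 can be sketched numerically. The example below is a minimal sketch using only Python's standard library, not Minitab's own routine: the function name `box_plot_summary`, the sample data, the "inclusive" quartile method, and the common 1.5 × IQR rule for flagging outliers are all assumptions made for illustration, and other quartile conventions give slightly different cut points.

```python
import statistics


def box_plot_summary(values):
    """Return quartiles, IQR, and points outside the 1.5 * IQR fences.

    Hypothetical helper for illustration; the fence rule is a common
    convention, not something stated in the slide notes.
    """
    data = sorted(values)
    # Quartiles split the sorted data into four parts; the middle 50%
    # of the data lies between q1 and q3 (the "box").
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}


# Made-up measurements with one extreme value.
summary = box_plot_summary([2, 4, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

With this data the box runs from 4 to 8, the median is 6, and 30 falls outside the upper fence, so it would be marked as an outlier.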
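Basic Statistical Concepts in Six Sigma

1. Introduction
Six Sigma is a statistics-based quality management method that aims to improve an organization's performance by reducing variation and defects. Basic statistical concepts are essential in Six Sigma: they help us understand and analyze data so that we can make sound decisions and improvements.

2. Population and sample
Six Sigma relies on two key concepts: the population and the sample. The population is the entire data set we are interested in, and a sample is a subset of the data selected at random from the population. By analyzing the sample statistically, we can infer the characteristics of the population.

3. Measures of central tendency
Measures of central tendency are statistics that describe where the center of a data set lies. Common measures are the mean, the median, and the mode.
• Mean: the sum of all observations in a data set divided by the number of observations. The mean reflects the overall level of the data, but it is sensitive to extreme values.
• Median: the observation in the middle position after the data set is sorted by size. The median reflects the center of the data set and, compared with the mean, is much less affected by outliers.
• Mode: the observation that occurs most frequently in the data set. The mode is often used to describe data sets with discrete values.
Choosing an appropriate measure of central tendency helps us understand where the data are concentrated.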
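The three measures above can be compared on a small data set. This is a minimal sketch using Python's standard `statistics` module; the variable name `cycle_times` and its values are made up for illustration and include one deliberate outlier.

```python
import statistics

# Hypothetical cycle times in minutes; 95 is a deliberate outlier.
cycle_times = [12, 15, 15, 16, 18, 19, 95]

mean_value = statistics.mean(cycle_times)      # sum / count = 190 / 7
median_value = statistics.median(cycle_times)  # middle of the sorted list
mode_value = statistics.mode(cycle_times)      # most frequent value

print(f"mean={mean_value:.2f}, median={median_value}, mode={mode_value}")
```

The single outlier pulls the mean up to about 27.14, while the median (16) and the mode (15) stay close to where most of the observations sit.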
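4. Measures of dispersion
Measures of dispersion are statistics that describe how spread out the observations in a data set are. Common measures are the variance, the standard deviation, and the range.
• Variance: the average of the squared differences between each observation and the mean. When the variance is estimated from a sample, the sum of squared differences is divided by n − 1 rather than n. The larger the variance, the more spread out the observations.
• Standard deviation: the positive square root of the variance. The standard deviation is the most commonly used measure of dispersion; it is expressed in the same units as the data and indicates how far observations typically lie from the mean.
• Range: the difference between the largest and the smallest observation in the data set. The range gives the overall extent of the data.
Measures of dispersion show how spread out the observations are, which helps in judging the stability of the data.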
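The definitions above can be checked with a short calculation. This is a minimal sketch using Python's standard `statistics` module; the data values are made up and chosen so that the arithmetic is easy to follow (mean 5, sum of squared deviations 32).

```python
import statistics

# Hypothetical measurements: mean is 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_variance = statistics.pvariance(data)  # 32 / 8, divides by n
sample_variance = statistics.variance(data)       # 32 / 7, divides by n - 1
population_std = statistics.pstdev(data)          # square root of 32 / 8
data_range = max(data) - min(data)                # largest minus smallest

print(population_variance, sample_variance, population_std, data_range)
```

Here the population variance is 4, the population standard deviation is 2, and the range is 7. The sample variance (about 4.57) is slightly larger because it divides by n − 1.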
5. The normal distribution and the Six Sigma principle
The normal distribution plays an important role in Six Sigma.
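Basic Statistical Concepts in Six Sigma Management

Six Sigma is a management method that aims to improve quality and efficiency by reducing variation. It builds on the basic concepts and tools of statistics to help companies improve their business processes and lower their defect rates. This article introduces some of the basic statistical concepts used in Six Sigma management and explains their purpose and application.

Basic concepts of statistics
Before turning to the statistical concepts used in Six Sigma management, we first review some basic concepts of statistics.

Population and sample
In statistics, the full set of objects under study is called the population. Because a population is usually large, it is often impractical to collect and process all of its data, so we select part of the population for study. That part is the sample.

Parameters and statistics
Measures that describe characteristics of the population are called parameters; they are usually unknown. The corresponding measures calculated from a sample are called statistics.

Random variables and probability distributions
A random variable is a numerical value that represents the outcome of a random event; it can be discrete or continuous. A probability distribution describes the possible values of a random variable and their corresponding probabilities. Common probability distributions include the normal distribution and the Poisson distribution.

Sample mean and population mean
The sample mean is the average calculated from the sample. The population mean is the average of the whole population. In Six Sigma management, the sample mean is commonly used to estimate the population mean.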
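The idea of estimating a population mean from a sample mean can be illustrated with a small simulation. This is a minimal sketch using Python's standard library; the population (the integers 1 to 1000), the sample size of 100, and the seed are arbitrary assumptions chosen for illustration.

```python
import random
import statistics

# Fixed seed so the illustration is repeatable; the value is arbitrary.
rng = random.Random(42)

# Hypothetical population: in practice the full population is rarely available.
population = list(range(1, 1001))
population_mean = statistics.mean(population)  # the parameter

# Draw a random sample without replacement and compute the statistic.
sample = rng.sample(population, 100)
sample_mean = statistics.mean(sample)

estimation_error = sample_mean - population_mean
print(f"population mean={population_mean}, sample mean={sample_mean}, "
      f"error={estimation_error:+.2f}")
```

The sample mean will generally differ from the population mean of 500.5, but for a random sample of this size the difference is usually small relative to the spread of the population.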
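Statistical concepts in Six Sigma management
With the basic concepts of statistics in place, we now look at some statistical concepts commonly used in Six Sigma management.

Types of measurement data
Six Sigma management deals with several types of data. The most common are continuous data and discrete data. Continuous data can take any value within a range, for example temperature or length. Discrete data can take only a countable set of specific values, for example the number of products or the number of defective items.

Measurement scales
In statistics, data are measured on different scales. The common scales are:
• Nominal scale: used only for classification, with no magnitude or order.
• Ordinal scale: classifies and also carries an order.
• Interval scale: classifies, carries an order, and supports addition and subtraction.
• Ratio scale: has all the properties of the preceding scales, has a true zero, and supports multiplication and division.
In Six Sigma management, the appropriate statistical methods and tools must be chosen according to the measurement scale of the data.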
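One way to see how the scale drives the choice of method is to pick the measure of central tendency by scale. This is a minimal sketch that follows a common textbook convention (mode for nominal, median for ordinal, mean for interval and ratio data), which the text above does not state. The names `CENTER_BY_SCALE` and `central_value` and the example data are hypothetical.

```python
import statistics

# Conventional choice of "center" per measurement scale (an assumption,
# not a rule taken from the source text).
CENTER_BY_SCALE = {
    "nominal": statistics.mode,         # categories: only frequency is meaningful
    "ordinal": statistics.median_low,   # ranks: order is meaningful, distances are not
    "interval": statistics.mean,        # differences are meaningful
    "ratio": statistics.mean,           # differences and ratios are meaningful
}


def central_value(values, scale):
    """Return a measure of central tendency suited to the given scale."""
    return CENTER_BY_SCALE[scale](values)


defect_types = ["scratch", "dent", "scratch", "stain"]  # nominal
satisfaction = [1, 2, 2, 3, 5]                          # ordinal ratings
temperatures_c = [20.0, 22.0, 24.0]                     # interval

print(central_value(defect_types, "nominal"))
print(central_value(satisfaction, "ordinal"))
print(central_value(temperatures_c, "interval"))
```

`median_low` is used for ordinal data so that the result is always one of the observed ratings rather than an average of two of them.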
Measures of central tendency
In statistics, measures of central tendency are used to describe where the center of the data lies.