Statistics CH04 (English Edition)
Statistics in English

Statistics is a branch of mathematics that deals with the collection, analysis, interpretation, presentation, and organization of data. This article discusses the key concepts and principles of statistics.

Sample and Population

Statistics is based on the idea of sampling. A sample is a subset of a population that is selected for analysis. The population is the entire group that is the subject of the study. For example, if we want to study the average age of university students in a country, the population is all the university students in that country. Because studying the entire population is usually impractical, we select a sample of students from different universities and use statistics to make inferences about the population based on the sample.

Descriptive and Inferential Statistics

Descriptive statistics is concerned with summarizing and describing data. It includes measures of central tendency, such as the mean, median, and mode, and measures of variability, such as the range and standard deviation. Descriptive statistics helps us understand the characteristics of the data.

Inferential statistics, on the other hand, is concerned with drawing conclusions about a population based on a sample. It involves testing hypotheses and estimating parameters. For example, we may want to test the hypothesis that the average age of university students in the country is 20 years. We would select a sample of students, calculate the sample mean, and use a statistical test to determine whether the difference between the sample mean and the hypothesized population mean is significant.

Variables and Data Types

A variable is a characteristic of a population or a sample that can take on different values. There are two types of variables: quantitative and qualitative. Quantitative variables are numerical, such as age, weight, and height.
Qualitative variables are categorical, such as gender, ethnicity, and occupation.

Data can be collected in different ways, such as through surveys, experiments, and observations. Data can also be classified into four levels of measurement: nominal, ordinal, interval, and ratio. Nominal data are unordered categories, such as gender or race. Ordinal data are categories with a natural order, such as level of academic achievement or social status. Interval data are numerical but lack a true zero point, such as temperature in degrees Celsius or calendar time. Ratio data are numerical and have a true zero point, such as weight or height.

Measures of Central Tendency

Measures of central tendency summarize the data with a single value that represents the typical score. The three most commonly used measures of central tendency are the mean, median, and mode.

The mean is the arithmetic average of the scores. It is calculated by adding up all the scores and dividing by the number of scores. The mean is sensitive to outliers, or extreme scores, which can pull it away from the bulk of the data.

The median is the middle score when the scores are arranged in order. It is largely unaffected by outliers and is a better measure of central tendency when the distribution is skewed.

The mode is the most common score. It is the only one of the three measures that can be used with nominal data, and it can also be used with ordinal data.

Measures of Variability

Measures of variability describe the spread or dispersion of the data. The most commonly used measures of variability are the range, variance, and standard deviation.

The range is the difference between the largest and smallest scores. Because it depends only on the two most extreme scores, it is strongly affected by outliers and is not a very reliable measure of variability.

The variance is a measure of how much the scores deviate from the mean. It is calculated by subtracting the mean from each score, squaring the differences, and averaging the squares. (When the data are a sample rather than the whole population, the sum of squares is usually divided by n - 1 instead of n.) The variance is not as intuitive as the other measures of variability, but it is useful for statistical analysis.

The standard deviation is the square root of the variance.
It is a more intuitive and commonly used measure of variability because it is expressed in the same units as the data. The standard deviation is useful for describing how much the scores deviate from the mean and for estimating confidence intervals.

Hypothesis Testing

Hypothesis testing is a process for deciding, based on a sample of data, whether a statement about a population is plausible. The statement being tested is called the null hypothesis, and the competing statement is called the alternative hypothesis. We collect data and use statistics to test the null hypothesis.

We use a significance level, or alpha, to decide whether the results are statistically significant. The p-value is the probability, assuming the null hypothesis is true, of observing data at least as extreme as the sample. If the p-value is less than the significance level, we reject the null hypothesis in favor of the alternative hypothesis; otherwise, we fail to reject the null hypothesis.

Conclusion

Statistics is a powerful tool for analyzing and interpreting data. Understanding the concepts and principles of statistics is essential for making informed decisions and drawing accurate conclusions from data.
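The measures of central tendency and variability described in this article can be sketched with Python's standard `statistics` module. This is only a minimal illustration: the `describe` helper and the sample of ages are hypothetical and do not come from the article, and the variance here is the population form (dividing by n).

```python
import statistics


def describe(scores):
    """Return the summary measures discussed above for a list of numbers."""
    return {
        "mean": statistics.mean(scores),           # arithmetic average
        "median": statistics.median(scores),       # middle score after sorting
        "mode": statistics.mode(scores),           # most common score
        "range": max(scores) - min(scores),        # largest minus smallest
        "variance": statistics.pvariance(scores),  # mean squared deviation (divides by n)
        "std_dev": statistics.pstdev(scores),      # square root of the variance
    }


# Hypothetical ages of sampled students; 35 acts as an outlier.
ages = [19, 20, 20, 21, 22, 20, 35]
summary = describe(ages)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this made-up sample the single outlier raises the mean above the median, which is the sensitivity to extreme scores that the article describes. For a sample-based estimate, `statistics.variance` and `statistics.stdev` divide by n - 1 instead.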
An Introduction to Statistics in English (Chinese-English Edition)

Introduction to Statistics

Statistics is a branch of mathematics that deals with the collection, analysis, interpretation, presentation, and organization of data. It plays a crucial role in many fields, including economics, biology, and psychology. Using statistical methods, we can extract meaningful information from data and make informed decisions based on it.
The beauty of statistics lies in its ability to reduce complex phenomena to quantifiable measures, enabling us to understand patterns, trends, and relationships within the data. Fundamental concepts such as the mean, median, and mode help us summarize and describe data, while techniques such as hypothesis testing and regression analysis allow us to make predictions and draw inferences.
Statistics (English Edition)

Part 1: Gathering and Exploring Data (descriptive statistics)

Different Types of Data (2.1)

Variable

A variable is any characteristic observed on the subjects in a study.
Examples: Marital status, Height, Weight, IQ, Sqft, Price, NE.

A variable can be classified as either
- Categorical (in categories), or
- Quantitative (numerical)

A variable is called categorical if each observation belongs to one of a set of categories.
Examples:
- Gender (Male or Female)
- Religious Affiliation (Catholic, Jewish, …)
- Type of Residence (Apartment, Condo, …)
- Belief in Life After Death (Yes or No)
- NE (located in the northeast sector of the city (1) or not (0))

A variable is called quantitative if observations on it take numerical values that represent different magnitudes of the variable.
Examples: Age, Number of Siblings, Annual Income, Selling price, Sqft

Discrete versus continuous quantitative variables

A quantitative variable is discrete if its possible values form a set of separate numbers, such as 0, 1, 2, 3, …
- The set of possible values is not dense
- Examples:
  o Number of pets in a household
  o Number of children in a family
  o Number of foreign languages spoken by an individual

A quantitative variable is continuous if its possible values form an interval
- The set of possible values is dense
- Examples:
  o Height/Weight
  o Age
  o Blood pressure

Exercise

Identify the variable type
1. Number of siblings in a family
2. County of residence
3. Distance (in miles) of commute to school
4. Marital status
5. Length of time to take a test
6. Number of people waiting in line
7. Number of speeding tickets received last year
8. Your dog's weight

Proportion & Percentage (Relative Frequencies)

- The proportion of the observations that fall in a certain category is the frequency (count) of observations in that category divided by the total number of observations:
  proportion = (frequency of that category) / (sum of all frequencies)
- The percentage is the proportion multiplied by 100
- Proportions and percentages are also called relative frequencies

Example

The table classifies the 630
parliamentary seats of the Italian Chamber of Deputies by coalition (2013 elections).

                     Seats
Coalition            Freq.   Prop.   Perc.
Pierluigi Bersani    345     0.548   54.8
Silvio Berlusconi    125     0.198   19.8
Beppe Grillo         109     0.173   17.3
Mario Monti          47      0.075   7.46
Vallee d'Aoste       1       0.002   0.16
MAIAE                2       0.003   0.32
USEI                 1       0.002   0.16
Antonio Ingroia      0       0       0
Total                630     1       100

So, for Bersani, 345 is the frequency, 0.548 = 345/630 is the proportion (relative frequency), and 54.8 is the percentage: 0.548 × 100 = 54.8%.

Frequency Table

A frequency table is a listing of possible values for a variable, together with the number of observations and/or relative frequencies for each value.

Raw data
Code      Gender
000001    F
000002    M
...       ...
100000    F

Frequency table
Gender    n_i      f_i     p_i
F         1000     0.01    1
M         99000    0.99    99

Example

A stock broker has been following different stocks over the last month and has recorded whether each stock is up, the same, or down in value. The results were:

Performance of stock    Up    Same    Down
Count                   21    7       12

- What are the subjects?
- What is the variable of interest?
- What type of variable is it?
- Add proportions to this frequency table.

Describe data using graphical summaries (2.2)

Distribution

- A graph or frequency table describes a distribution.
- A distribution tells us the possible values/categories a variable takes, as well as how often those values occur (frequency, relative frequency, or percentage).

In the 2008 General Social Survey, 2020 respondents answered the question, "How many children have you ever had?"
The results were: [results table missing]

Graphs for categorical data: bar graphs and pie charts

Use pie charts and bar graphs to summarize categorical variables:
- Pie Chart
  o A circle where each category is represented as a "slice of the pie"
  o The size of each pie slice is proportional to the percentage of observations falling in that category
- Bar Graph
  o Bar graphs display a vertical bar for each category
  o The height of each bar represents either counts ("frequencies") or percentages ("relative frequencies") for that category

Pie Chart

[Figure: pie chart titled "Cars sold", with categories FIAT, FORD, OPEL, RENAULT and slices of 52%, 18%, 17%, 13%]

Bar Graph

- Bar graph: easier to compare categories
- Bar graphs are called Pareto charts when the categories are ordered by their frequency, from the tallest bar to the shortest bar

Graphs for quantitative data: dot plot

A dot plot shows a dot for each subject (observation) placed above its value on a number line. To construct a dot plot:
- Draw a horizontal line and label it with the name of the variable.
- Mark regular values of the variable on it.
- For each observation, place a dot above its value on the number line.

Graphs for quantitative data: histograms

A histogram is a graph that uses bars to portray the frequencies or the relative frequencies of the possible outcomes for a quantitative variable.

Steps for constructing a histogram
1. Divide the range of the data into intervals of equal width.
2. Count the number of observations in each interval, creating a frequency table.
3. On the horizontal axis, label the values or the endpoints of the intervals.
4. Draw a bar over each value or interval with height equal to its frequency (or proportion or percentage), values of which are marked on the vertical axis.
5. Label and title appropriately.

Displaying Data over Time: time plots

- Used for displaying a time series, a data set collected over time.
- Plots each observation on the vertical scale against the time it was measured on the horizontal scale.
Points are usually connected.
- Common patterns in the data over time, known as trends, should be noted.

Measuring the Center of Quantitative Data (2.3)
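Before moving on, the proportion and percentage definitions from section 2.1 can be checked with a short Python sketch. The seat counts are the ones in the Chamber of Deputies table above; the function name `relative_frequencies` is a hypothetical helper introduced here for illustration, not something from the notes.

```python
def relative_frequencies(counts):
    """Map each category to (frequency, proportion, percentage)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one observation is required")
    return {
        category: (freq, freq / total, 100 * freq / total)
        for category, freq in counts.items()
    }


# Seat counts copied from the 2013 Chamber of Deputies table.
seats = {
    "Pierluigi Bersani": 345,
    "Silvio Berlusconi": 125,
    "Beppe Grillo": 109,
    "Mario Monti": 47,
    "Vallee d'Aoste": 1,
    "MAIAE": 2,
    "USEI": 1,
    "Antonio Ingroia": 0,
}

table = relative_frequencies(seats)

# Print the rows from the largest frequency to the smallest.
for coalition, (freq, prop, perc) in sorted(
    table.items(), key=lambda item: item[1][0], reverse=True
):
    print(f"{coalition:<20}{freq:>5}{prop:>8.3f}{perc:>8.2f}")
```

Sorting the rows by frequency gives the same ordering a Pareto chart would use for its bars.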