Minimum Cluster Size Estimation and Cluster Refinement


Translations of Some Common Statistical Terms

一些常见的统计术语翻译Absolute deviation, 绝对离差Absolute number , 绝对数Absolute r esiduals, 绝对残差Acceler ation arr ay, 加速度立体阵Acceler ation in an arbitr ary dir ection, 任意方向上的加速度Acceler ation nor mal, 法向加速度Acceler ation spac e dimension, 加速度空间的维数Acceler ation tangential, 切向加速度Acceler ation vector , 加速度向量Acceptable hypothesis, 可接受假设Accum ulation, 累积Accuracy, 准确度Actual fr equency, 实际频数Adaptive estimator , 自适应估计量Addition, 相加Addition theor em , 加法定理Additivity, 可加性Adjusted r ate, 调整率Adjusted value, 校正值Adm issible error , 容许误差Aggregation, 聚集性Alternative hypothesis, 备择假设Among gr oups, 组间Amounts, 总量Analysis of c orr elation, 相关分析Analysis of c ovarianc e, 协方差分析Analysis of r egr ession, 回归分析Analysis of time series, 时间序列分析Analysis of varianc e, 方差分析Angular tr ansfor mation, 角转换ANOVA (analysis of variance ), 方差分析ANOVA Models, 方差分析模型Arcing, 弧/ 弧旋Arcsine tr ansfor mation, 反正弦变换Area under the curve, 曲线面积AREG , 评估从一个时间点到下一个时间点回归相关时的误差ARIMA, 季节和非季节性单变量模型的极大似然估计Arithmetic grid paper , 算术格纸Arithmetic mean, 算术平均数Arrhenius r elation, 艾恩尼斯关系Assessing fit, 拟合的评估Associative laws, 结合律Asymmetric distribution, 非对称分布Asymptotic bias, 渐近偏倚Asymptotic efficiency, 渐近效率Asymptotic variance, 渐近方差Attributable risk, 归因危险度Attribute data, 属性资料Attribution, 属性Autoc orrelation, 自相关Autoc orrelation of residuals, 残差的自相关Aver age, 平均数Aver age c onfidenc e interval length, 平均置信区间长度Aver age growth r ate, 平均增长率Bar c hart, 条形图Bar gr aph, 条形图Base period, 基期Bayes' theorem , Bayes 定理Bell-shaped curve, 钟形曲线伯努力分布Ber noulli distribution,Best-trim estimator , 最好切尾估计量Bias, 偏性Binary logistic r egr ession, 二元逻辑斯蒂回归Binomial distribution, 二项分布Bisquare, 双平方Bivariate Corr elate, 二变量相关Bivariate nor mal distribution, 双变量正态分布Bivariate nor mal population, 双变量正态总体Biweight inter val, 双权区间Biweight M-estimator, 双权M 估计量Bloc k, 区组/ 配伍组BMDP(Biomedic al computer pr ograms), BMDP 统计软件包Boxplots, 箱线图/ 箱尾图Breakdown bound, 崩溃界/ 崩溃点Canonical c orrelation, 典型相关Caption, 纵标目Case-c ontrol study , 病例对照研究Categoric al variable, 分类变量Catenary, 悬链线Cauchy distribution, 
柯西分布Cause-and-effect r elationship, 因果关系Cell, 单元Censoring, 终检Center of symmetry , 对称中心Centering and sc aling, 中心化和定标Centr al tendency, 集中趋势Centr al value, 中心值CHAID - x 2 Automatic Inter action Detector ,卡方自动交互检测Chanc e, 机遇Chanc e error , 随机误差Chanc e variable, 随机变量Char acteristic equation, 特征方程Char acteristic root, 特征根Char acteristic vector , 特征向量Chebshev criterion of fit, 拟合的切比雪夫准则Chernoff fac es, 切尔诺夫脸谱图Chi-square test, 卡方检验/咒2检验Choleskey dec omposition, 乔洛斯基分解Circle chart, 圆图Class interval, 组距Class mid-value, 组中值Class upper limit, 组上限Classified variable, 分类变量Cluster analysis, 聚类分析Cluster sampling, 整群抽样Code, 代码Coded data, 编码数据Coding, 编码Coefficient of c ontingency, 列联系数Coefficient of deter mination, 决定系数Coefficient of multiple c orr elation, 多重相关系数Coefficient of partial c orrelation, 偏相关系数Coefficient of pr oduction-moment c orrelation, 积差相关系数Coefficient of r ank corr elation, 等级相关系数Coefficient of r egr ession, 回归系数Coefficient of skewness, 偏度系数Coefficient of variation, 变异系数Cohort study, 队列研究Column, 列Column effect, 列效应Column factor , 列因素Combination pool, 合并Combinative table, 组合表Common factor , 共性因子Common regr ession coefficient, 公共回归系数Common value, 共同值Common varianc e, 公共方差Common variation, 公共变异Communality varianc e, 共性方差Compar ability, 可比性Comparison of bathes, 批比较Comparison value, 比较值Compartment model, 分部模型Compassion, 伸缩Complement of an event, 补事件Complete association, 完全正相关Complete dissociation, 完全不相关Complete statistic s, 完备统计量Completely r andomized design, 完全随机化设计Composite event, 联合事件Composite events, 复合事件Concavity, 凹性Conditional expectation, 条件期望Conditional likelihood, 条件似然Conditional pr obability, 条件概率Conditionally linear , 依条件线性Confidenc e interval, 置信区间Confidenc e lim it, 置信限Confidenc e lower lim it, 置信下限Confidenc e upper limit, 置信上限Confir matory Factor Analysis , 验证性因子分析Confir matory research, 证实性实验研究Confounding factor , 混杂因素Conjoint, 联合分析Consistency, 相合性Consistency chec k, 一致性检验Consistent asymptotic ally nor mal estimate, 相合渐近正态估计Consistent estimate, 
相合估计Constr ained nonlinear r egr ession, 受约束非线性回归Constr aint, 约束Contam inated distribution, 污染分布Contam inated Gausssian, 污染高斯分布Contam inated nor mal distribution, 污染正态分布Contam ination, 污染Contam ination model, 污染模型Contingency table, 列联表Contour , 边界线Contribution r ate, 贡献率Control, 对照Controlled experiments, 对照实验Conventional depth, 常规深度Convolution, 卷积Corrected factor , 校正因子Corrected mean, 校正均值Correction coefficient, 校正系数Correctness, 正确性Correlation c oefficient, 相关系数Correlation index, 相关指数Correspondenc e, 对应Counting, 计数Counts, 计数/ 频数Covarianc e, 协方差Covariant, 共变Cox Regression, Cox 回归Criteria for fitting, 拟合准则Criteria of least squar es, 最小二乘准则Critic al r atio, 临界比Critic al r egion, 拒绝域Critic al value, 临界值Cr oss-over design, 交叉设计Cr oss-section analysis, 横断面分析Cr oss-section survey, 横断面调查Cr osstabs , 交叉表Cr oss-tabulation table, 复合表Cube r oot, 立方根Cumulative distribution function, 分布函数Cumulative probability, 累计概率Curvatur e, 曲率/ 弯曲Curvatur e, 曲率Curve fit , 曲线拟和Curve fitting, 曲线拟合Curvilinear r egression, 曲线回归Curvilinear r elation, 曲线关系Cut-and-try method, 尝试法Cycle, 周期Cyclist, 周期性D test, D 检验Data acquisition, 资料收集Data bank, 数据库Data c apacity, 数据容量Data deficiencies, 数据缺乏Data handling, 数据处理Data manipulation, 数据处理Data proc essing, 数据处理Data r eduction, 数据缩减Data set, 数据集Data sourc es, 数据来源Data tr ansfor mation, 数据变换Data validity, 数据有效性Data-in, 数据输入Data-out, 数据输出Dead time, 停滞期Degr ee of fr eedom, 自由度Degr ee of pr ecision, 精密度Degr ee of r eliability , 可靠性程度Degr ession, 递减Density function, 密度函数Density of data points,数据点的密度Dependent variable,应变量/ 依变量/ 因变量Dependent variable,因变量Depth, 深度Derivative matrix, 导数矩阵Derivative-fr ee methods, 无导数方法Design, 设计Deter minacy, 确定性Deter minant, 行列式Deter minant, 决定因素Deviation, 离差Deviation from aver age, 离均差Diagnostic plot, 诊断图Dichotomous variable, 二分变量Differential equation,微分方程Direct standardization, 直接标准化法Discr ete variable, 离散型变量DISCRIMINAN T, 判断Discriminant analysis, 判别分析Discriminant c oeffic ient, 判别系数Discriminant function, 判别值Disper sion, 散布/ 分散度Dispr 
oportional, 不成比例的Dispr oportionate sub-class numbers, 不成比例次级组含量Distribution free, 分布无关性/ 免分布Distribution shape, 分布形状Distribution-free method, 任意分布法Distributive laws, 分配律Distur banc e, 随机扰动项Dose response curve, 剂量反应曲线Double blind method, 双盲法Double blind trial, 双盲试验Double exponential distribution, 双指数分布Double logarithmic, 双对数Downward r ank, 降秩Dual-spac e plot, 对偶空间图DUD, 无导数方法新法Duncan's new multiple r ange method, 新复极差法/DuncanE-LEffect, 实验效应Eigenvalue, 特征值Eigenvector , 特征向量Ellipse, 椭圆Empiric al distribution, 经验分布Empiric al pr obability , 经验概率单位Enumer ation data, 计数资料Equal sun-class number , 相等次级组含量Equally likely , 等可能Equivarianc e, 同变性Error , 误差/ 错误Errorof estimate, 估计误差Error type I, 第一类错误Error type II, 第二类错误Estimand, 被估量Estimated err or mean squares, 估计误差均方Estimated err or sum of squar es, 估计误差平方和Euclidean distanc e,欧式距离Event, 事件Event, 事件Exc eptional data point, 异常数据点Expectation plane, 期望平面Expectation surfac e, 期望曲面Expected values, 期望值Experiment, 实验Experimental sampling, 试验抽样Experimental unit, 试验单位Explanatory variable, 说明变量Explor atory data analysis, 探索性数据分析Explore Summarize, 探索- 摘要Exponential curve, 指数曲线Exponential growth, 指数式增长EXSMOOTH, 指数平滑方法Extended fit, 扩充拟合Extr a par ameter ,附加参数Extr apolation, 外推法Extr eme observation, 末端观测值Extr emes, 极端值/ 极值F distribution, F分布 F test, F 检验Factor , 因素 / 因子Factor analysis, 因子分析Factor Analysis, 因子分析Factor scor e, 因子得分Factorial, 阶乘Factorial design, 析因试验设计False negative, 假阴性False negative error , 假阴性错误 Fam ily of distributions, 分布族 Fam ily of estimator s, 估计量族 Fanning, 扇面Fatality r ate, 病死率Field investigation, 现场调查Field survey , 现场调查Finite population, 有限总体 Finite-sample, 有限样本First derivative, 一阶导数First principal component,First quartile, 第一四分位数Fisher infor mation, 费雪信息量Fitted value, 拟合值Fourth, 四分点Frequency, 频率Frontier point, 界限点Function r elationship, 泛函关系Gaussian distribution, 高斯分布 / 正态分布Gini's mean difference,基尼均差 GLM (Gener al liner models), 通用线性模型Fitting a c urve, 曲线拟合 Fixed base,定基 Fluctuation, 随机起伏 For ec ast, 预测 Four fold 
table,四格表Fraction blow, 左侧比率Fractional error, 相对误差 Frequency polygon,频数多边图 Gamma distribution, 伽玛分布Gauss increment, 高斯增量Gauss-Newton incr ement, 高斯- 牛顿增量 Gener al census, 全面普查GENLOG (Gener alized liner models), 广义线性模型 Geometric mean,几何平均数第一主成分Goodness of fit, 拟和优度/ 配合度Gradient of deter m inant, 行列式的梯度Graec o-Latin squar e, 希腊拉丁方Grand mean, 总均值Gross error s, 重大错误Gross-error sensitivity, 大错敏感度Group aver ages, 分组平均Grouped data, 分组资料Guessed mean, 假定平均数Half-life, 半衰期Hampel M-estimators, 汉佩尔M 估计量Happenstanc e, 偶然事件Har monic mean, 调和均数Hazar d function, 风险均数Hazar d r ate, 风险率Heading, 标目Heavy-tailed distribution, 重尾分布Hessian arr ay, 海森立体阵Heterogeneity , 不同质Heterogeneity of variance, 方差不齐Hier archic al classific ation, 组内分组Hier archic al clustering method, 系统聚类法High-lever age point, 高杠杆率点HILOGLINEAR, 多维列联表的层次对数线性模型Hinge, 折叶点Histogr am, 直方图Historical c ohort study, 历史性队列研究Holes, 空洞HOMALS, 多重响应分析Homogeneity of varianc e, 方差齐性Homogeneity test, 齐性检验Huber M-estimators, 休伯M 估计量Hyper bola, 双曲线Hypothesis testing, 假设检验Hypothetic al universe, 假设总体Impossible event, 不可能事件Independenc e, 独立性Independent variable, 自变量Index, 指标/ 指数Indir ect standardization, 间接标准化法Individual, 个体Infer enc e band, 推断带Infinite population, 无限总体Infinitely gr eat, 无穷大Infinitely small, 无穷小Influence curve, 影响曲线Intercept, 截距Interpolation, 内插法Invarianc e, 不变性Inverse matrix, 逆矩阵Inverse sine tr ansfor mation, 反正弦变换Iter ation, 迭代Jac obian deter m inant, 雅可比行列式Joint distribution function,分布函数 Joint probability, 联合概率Joint probability distribution,联合概率分布 K means method, 逐步聚类法Kaplan-Meier , 评估事件的时间长度Kaplan-Merier c hart, Kaplan-Merier图 Kendall's r ank c orrelation, Kendall等级相关 Kinetic, 动力学Kolmogor ov-Smirnove test, 柯尔莫哥洛夫 - 斯米尔诺夫检验Kruskal and Wallis test, Kr uskal 及 Wallis 检验 / 多样本的秩和检验 /H 检验 Kurtosis, 峰度Lac k of fit, 失拟Ladder of powers, 幂阶梯Lag, 滞后Lar ge sample, 大样本Lar ge sample test, 大样本检验Latin squar e, 拉丁方Latin squar e design, 拉丁方设计Leakage, 泄漏Least favor able c onfigur ation, 最不利构形Least favor able distribution, 最不利分布Least 
signific ant differ enc e, 最小显著差法Least squar e method, 最小二乘法Least-absolute-r esiduals estimates, Least-absolute-r esiduals fit, 最小绝对残差拟合 Least-absolute-r esiduals line, 最小绝对残差线 Legend, 图例L-estimator , L 估计量Infor mation capacity, 信息容量 Initial condition,初始条件 Initial estimate,初始估计值 Initial level,最初水平 Interaction,交互作用 Interaction terms, 交互作用项Interquartile range,四分位距 Interval estimation,区间估计 Intervals of equal probability, 等概率区间 Intrinsic c urvature,固有曲率Inverse probability,逆概率最小绝对残差估计L-estimator of loc ation, 位置L 估计量L-estimator of sc ale, 尺度L 估计量Level, 水平Life expectanc e, 预期期望寿命Life table, 寿命表Life table method, 生命表法Light-tailed distribution, 轻尾分布似然函数Likelihood function,似然比Likelihood r atio,line gr aph, 线图直线相关Linear corr elation,线性方程Linear equation,Linear pr ogr amm ing, 线性规划直线回归Linear regr ession,线性回归Linear Regression,Linear trend, 线性趋势Loading, 载荷Loc ation and sc ale equivarianc e, 位置尺度同变性Loc ation equivarianc e, 位置同变性Loc ation invarianc e, 位置不变性Loc ation sc ale family, 位置尺度族Log r ank test, 时序检验Logarithm ic curve, 对数曲线Logarithm ic nor mal distribution, 对数正态分布Logarithm ic sc ale, 对数尺度Logarithm ic tr ansfor mation, 对数变换Logic chec k, 逻辑检查Logistic distribution, 逻辑斯特分布Logit tr ansfor mation, Logit 转换LOGLINEAR, 多维列联表通用模型Lognor mal distribution, 对数正态分布Lost function, 损失函数Low corr elation, 低度相关Lower lim it, 下限Lowest-attained varianc e, 最小可达方差LSD, 最小显著差法的简称Lur king variable, 潜在变量M-RMain effect, 主效应Major heading, 主辞标目Marginal density function, 边缘密度函数Marginal pr obability, 边缘概率Marginal pr obability distribution, 边缘概率分布Matched data, 配对资料Matched distribution, 匹配过分布Matching of distribution, 分布的匹配Matching of tr ansfor mation, 变换的匹配Mathematic al expectation, 数学期望Mathematic al model, 数学模型Maximum L-estimator , 极大极小L 估计量Maximum likelihood method, 最大似然法Mean, 均数Mean squar es between groups, 组间均方Mean squar es within gr oup, 组内均方Means (Compar e means), 均值- 均值比较Median, 中位数Median effective dose, 半数效量Median lethal dose, 半数致死量Median polish, 中位数平滑Median test, 中位数检验Minimal sufficient statistic, 
最小充分统计量Minimum distanc e estimation, 最小距离估计Minimum effective dose, 最小有效量Minimum lethal dose, 最小致死量Minimum varianc e estimator , 最小方差估计量MIN ITAB, 统计软件包Minor heading, 宾词标目Missing data, 缺失值Model specific ation, 模型的确定Modeling Statistic s , 模型统计Models for outliers, 离群值模型Modifying the model, 模型的修正Modulus of c ontinuity , 连续性模Mor bidity , 发病率Most favor able c onfigur ation, 最有利构形Multidimensional Sc aling (ASCAL), 多维尺度/ 多维标度Multinomial Logistic Regression , 多项逻辑斯蒂回归Multiple c omparison, 多重比较Multiple c orr elation , 复相关Multiple c ovarianc e, 多元协方差Multiple linear r egr ession, 多元线性回归Multiple r esponse , 多重选项Multiple solutions, 多解Multiplic ation theor em , 乘法定理Multir esponse, 多元响应Multi-stage sampling, 多阶段抽样Multivariate T distribution, 多元T 分布Mutual exclusive, 互不相容Mutual independenc e, 互相独立Natur al boundary, 自然边界Natur al dead, 自然死亡Natur al zer o, 自然零Negative c orr elation, 负相关Negative linear corr elation, 负线性相关Negatively skew ed, 负偏Newman-Keuls method, q 检验NK method, q 检验No statistic al signific ance, 无统计意义Nom inal variable, 名义变量Nonc onstancy of variability, 变异的非定常性Nonlinear regr ession, 非线性相关Nonpar ametric statistics, 非参数统计Nonpar ametric test, 非参数检验Nonpar ametric tests, 非参数检验Normal deviate, 正态离差Normal distribution, 正态分布Normal equation, 正规方程组Normal r anges, 正常范围Normal value, 正常值Nuisanc e par ameter , 多余参数/ 讨厌参数Null hypothesis, 无效假设Numeric al variable, 数值变量Objective function, 目标函数观察单位Observation unit,观察值Observed value,One sided test, 单侧检验One-way analysis of varianc e, 单因素方差分析Oneway ANOVA , 单因素方差分析Open sequential trial, 开放型序贯设计Optrim, 优切尾Optrim efficiency, 优切尾效率Order statistic s, 顺序统计量Or dered categories, 有序分类Or dinal logistic r egr ession , 序数逻辑斯蒂回归有序变量Or dinal variable,正交基Orthogonal basis,Orthogonal design, 正交试验设计Orthogonality c onditions, 正交条件ORTHOPLAN, 正交设计Outlier cutoffs, 离群值截断点Outlier s, 极端值OVE RALS , 多组变量的非线性正规相关Over shoot, 迭代过度Pair ed design, 配对设计Pair ed sample, 配对样本Pairwise slopes, 成对斜率Par abola, 抛物线Par allel tests, 平行试验Par ameter , 参数Par ametric statistic s, 参数统计Par 
ametric test, 参数检验Partial c orrelation, 偏相关Partial r egression, 偏回归Partial sorting, 偏排序Partials r esiduals, 偏残差Patter n, 模式Pear son curves, 皮尔逊曲线Peeling, 退层Perc ent bar gr aph, 百分条形图Perc entage, 百分比Perc entile, 百分位数Perc entile curves, 百分位曲线Periodicity , 周期性Per mutation, 排列P-estimator , P 估计量Pie graph, 饼图Pitman estimator , 皮特曼估计量Pivot, 枢轴量Planar , 平坦Planar assumption, 平面的假设PLANCARDS, 生成试验的计划卡Point estimation, 点估计Poisson distribution, 泊松分布Polishing, 平滑Polled standar d deviation, 合并标准差Polled varianc e, 合并方差Polygon, 多边图Polynomial, 多项式Polynomial c urve, 多项式曲线Population, 总体Population attributable risk,人群归因危险度Qualitative classific ation, 属性分类Qualitative method, 定性方法Quantile-quantile plot, Quantitative analysis, Quartile, 四分位数Quic k Cluster , 快速聚类Radix sort, 基数排序Random alloc ation, 随机化分组Random bloc ks design, 随机区组设计Random event, 随机事件Random ization, 随机化Range, 极差/ 全距Rank c orr elation, 等级相关Rank sum test, 秩和检验Rank test, 秩检验 Ranked data, 等级资料Rate, 比率Ratio, 比例 Positive c orrelation, 正相关Positively skewed, 正偏Posterior distribution, 后验分布Power of a test, 检验效能 Precision,精密度Predicted value, 预测值Preliminary analysis, 预备性分析Principal c omponent analysis, 主成分分析Prior distribution, 先验分布 Prior pr obability, Probabilistic model, probability, 概率Probability density Product moment, 先验概率概率模型, 概率密度乘积矩 / 协方差Profile tr ace, 截面迹图Proportion, 比/ 构成比Proportion alloc ation in str atified random sampling, Proportionate, 成比例Proportionate sub-class numbers, 成比例次级组含量Prospective study , 前瞻性调查Proximities, 亲近性Pseudo F test, 近似 F 检验Pseudo model, 近似模型Pseudosigma, 伪标准差Purposive sampling, 有目的抽样QR dec omposition, QR 分解Quadratic approximation, 二次近似按比例分层随机抽样分位数-分位数图 /Q-Q 图定量分析Raw data, 原始资料Raw residual, 原始残差Rayleigh's test, 雷氏检验Rayleigh's Z, 雷氏Z 值Recipr ocal, 倒数Recipr ocal tr ansfor mation, 倒数变换Rec or ding, 记录Redesc ending estimators, 回降估计量Reducing dimensions, 降维Re-expression, 重新表达Refer enc e set, 标准组Region of acc eptanc e, 接受域Regr ession coefficient, 回归系数Regr ession sum of squar e, 回归平方和Rej ection point, 拒绝点Relative 
dispersion, 相对离散度 Relative number, 相对数 Reliability, 可靠性 Reparametrization, 重新设置参数 Replication, 重复 Report Summaries, 报告摘要 Residual sum of square, 剩余平方和 Resistance, 耐抗性 Resistant line, 耐抗线 Resistant technique, 耐抗技术 R-estimator of location, 位置 R 估计量 R-estimator of scale, 尺度 R 估计量 Retrospective study, 回顾性调查 Ridge trace, 岭迹 Ridit analysis, Ridit 分析 Rotation, 旋转 Rounding, 舍入 Row, 行 Row effects, 行效应 Row factor, 行因素 R×C table, R×C 表
S-Z
Sample, 样本 Sample regression coefficient, 样本回归系数 Sample size, 样本量 Sample standard deviation, 样本标准差 Sampling error, 抽样误差 SAS (Statistical analysis system), SAS 统计软件包 Scale, 尺度/量表 Scatter diagram, 散点图 Schematic plot, 示意图/简图 Score test, 计分检验 Screening, 筛检 SEASON, 季节分析 Second derivative, 二阶导数 Second principal component, 第二主成分 SEM (Structural equation modeling), 结构化方程模型 Semi-logarithmic graph, 半对数图 Semi-logarithmic paper, 半对数格纸 Sensitivity curve, 敏感度曲线 Sequential analysis, 贯序分析 Sequential data set, 顺序数据集 Sequential design, 贯序设计 Sequential method, 贯序法 Sequential test, 贯序检验法 Serial tests, 系列试验 Short-cut method, 简捷法 Sigmoid curve, S 形曲线 Sign function, 正负号函数 Sign test, 符号检验 Signed rank, 符号秩 Significance test, 显著性检验 Significant figure, 有效数字 Simple cluster sampling, 简单整群抽样 Simple correlation, 简单相关 Simple random sampling, 简单随机抽样 Simple regression, 简单回归 Simple table, 简单表 Sine estimator, 正弦估计量 Single-valued estimate, 单值估计 Singular matrix, 奇异矩阵 Skewed distribution, 偏斜分布 Skewness, 偏度 Slash distribution, 斜线分布 Slope, 斜率 Smirnov test, 斯米尔诺夫检验 Source of variation, 变异来源 Spearman rank correlation, 斯皮尔曼等级相关 Specific factor, 特殊因子 Specific factor variance, 特殊因子方差 Spectra, 频谱 Spherical distribution, 球型正态分布 Spread, 展布 SPSS (Statistical package for the social science), SPSS 统计软件包 Spurious correlation, 假性相关 Square root transformation, 平方根变换 Stabilizing variance, 稳定方差 Standard deviation, 标准差 Standard error, 标准误 Standard error of difference, 差别的标准误 Standard error of estimate, 标准估计误差 Standard error of rate, 率的标准误 Standard normal distribution, 标准正态分布 Standardization, 标准化 Starting value, 起始值 Statistic, 统计量 Statistical control, 统计控制 Statistical graph, 统计图 Statistical inference, 统计推断 Statistical table, 统计表 Steepest descent, 最速下降法 Stem and leaf display, 茎叶图 Step factor, 步长因子 Stepwise regression, 逐步回归 Storage, 存 Strata, 层（复数） Stratified sampling, 分层抽样 Strength, 强度 Stringency, 严密性 Structural relationship, 结构关系 Studentized residual, 学生化残差/t 化残差 Sub-class numbers, 次级组含量 Subdividing, 分割 Sufficient statistic, 充分统计量 Sum of products, 积和 Sum of squares, 离差平方和 Sum of squares about regression, 回归平方和 Sum of squares between groups, 组间平方和 Sum of squares of partial regression, 偏回归平方和 Sure event, 必然事件 Survey, 调查 Survival, 生存分析 Survival rate, 生存率 Suspended rootgram, 悬吊根图 Symmetry, 对称 Systematic error, 系统误差 Systematic sampling, 系统抽样 Tags, 标签 Tail area, 尾部面积 Tail length, 尾长 Tail weight, 尾重 Tangent line, 切线 Target distribution, 目标分布 Taylor series, 泰勒级数 Tendency of dispersion, 离散趋势 Testing of hypotheses, 假设检验 Theoretical frequency, 理论频数 Time series, 时间序列 Tolerance interval, 容忍区间 Tolerance lower limit, 容忍下限 Tolerance upper limit, 容忍上限 Torsion, 挠率 Total sum of square, 总平方和 Total variation, 总变异 Transformation, 转换 Treatment, 处理 Trend, 趋势 Trend of percentage, 百分比趋势 Trial, 试验 Trial and error method, 试错法 Tuning constant, 细调常数 Two sided test, 双向检验 Two-stage least squares, 二阶最小平方 Two-stage sampling, 二阶段抽样 Two-tailed test, 双侧检验 Two-way analysis of variance, 双因素方差分析 Two-way table, 双向表 Type I error, 一类错误/α 错误 Type II error, 二类错误/β 错误 UMVU, 方差一致最小无偏估计简称 Unbiased estimate, 无偏估计 Unconstrained nonlinear regression, 无约束非线性回归 Unequal subclass number, 不等次级组含量 Ungrouped data, 不分组资料 Uniform coordinate, 均匀坐标 Uniform distribution, 均匀分布 Uniformly minimum variance unbiased estimate, 方差一致最小无偏估计 Unit, 单元 Unordered categories, 无序分类 Upper limit, 上限 Upward rank, 升秩 Vague concept, 模糊概念 Validity, 有效性 VARCOMP (Variance component estimation), 方差元素估计 Variability, 变异性 Variable, 变量 Variance, 方差 Variation, 变异 Varimax orthogonal rotation, 方差最大正交旋转 Volume of distribution, 容积 W test, W 检验 Weibull distribution, 威布尔分布 Weight, 权数 Weighted Chi-square test, 加权卡方检验/Cochran 检验 Weighted linear regression method, 加权直线回归 Weighted mean, 加权平均数 Weighted mean square, 加权平均方差 Weighted sum of square, 加权平方和 Weighting coefficient, 权重系数 Weighting method, 加权法 W-estimation, W 估计量 W-estimation of location, 位置 W 估计量 Width, 宽度 Wilcoxon paired test, 威尔科克森配对法/配对符号秩和检验 Wild point, 野点/狂点 Wild value, 野值/狂值 Winsorized mean, 缩尾均值 Withdraw, 失访 Youden's index, 尤登指数 Z test, Z 检验 Zero correlation, 零相关 Z-transformation, Z 变换。

Sample Size Determination

Introduction
Freiman JA, NEJM, 1978;299:690-4
Reviewed the power of 71 published RCTs which had failed to detect a difference
Found that 67 could have missed a 25% therapeutic improvement
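The finding above is a statement about statistical power. As a rough illustration only (this is not the method used in the cited review), the sketch below approximates the power of a two-arm trial that compares two event proportions with a two-sided normal-approximation test. The function name `approx_power`, the event rates (40% vs 30%) and the arm sizes are assumptions chosen for the example.

```python
import math
from statistics import NormalDist


def approx_power(p_control, p_treatment, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions.

    Uses the normal approximation with equal arm sizes and ignores the
    (usually negligible) probability of rejecting in the wrong direction.
    All inputs are illustrative assumptions, not data from the review.
    """
    if not (0 < p_control < 1 and 0 < p_treatment < 1):
        raise ValueError("proportions must lie strictly between 0 and 1")
    if p_control == p_treatment:
        raise ValueError("the two proportions must differ")
    if n_per_group <= 0:
        raise ValueError("n_per_group must be positive")

    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)          # two-sided critical value
    p_bar = (p_control + p_treatment) / 2          # pooled proportion under H0
    sd_null = math.sqrt(2 * p_bar * (1 - p_bar))   # per-subject SD under H0
    sd_alt = math.sqrt(p_control * (1 - p_control)
                       + p_treatment * (1 - p_treatment))  # SD under H1
    delta = abs(p_control - p_treatment)
    z = (delta * math.sqrt(n_per_group) - z_alpha * sd_null) / sd_alt
    return norm.cdf(z)


if __name__ == "__main__":
    # A hypothetical 25% relative improvement: event rate 40% -> 30%.
    for n in (50, 150, 400):
        print(n, "per group -> power", round(approx_power(0.40, 0.30, n), 2))
```

With small arms the computed power stays low, which is the situation in which a "negative" trial says little about whether a real effect exists.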
Factors That Influence Sample Size Calculations
Is the F/U long enough to be of any clinical relevance?
Desired level of significance
Desired power
One- or two-tailed test
Any explanation for the possible ranges or variations in outcome that is
Research subjects
  Target population
  Inclusion & exclusion criteria
  Baseline risk
  Pt. compliance rate
  Pt. drop-out rate
considerations)
Adjustments (accounting for potential dropouts or effect of covariates)
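Several of the factors listed above (significance level, power, one- or two-tailed test, baseline risk, drop-out rate) can be combined in a single calculation. The sketch below is a minimal example under stated assumptions: a two-arm trial with equal allocation, a binary outcome, and the standard normal-approximation formula for comparing two proportions. The function name `n_per_group`, its parameters and the example rates are hypothetical, and the drop-out adjustment simply inflates the result by 1 / (1 - drop-out rate).

```python
import math
from statistics import NormalDist


def n_per_group(p_control, p_treatment, alpha=0.05, power=0.80,
                two_tailed=True, dropout=0.0):
    """Subjects to enrol per arm for comparing two proportions.

    p_control   -- baseline (control) event rate
    p_treatment -- event rate expected under treatment
    alpha       -- desired level of significance
    power       -- desired power (1 - beta)
    two_tailed  -- True for a two-tailed test, False for one-tailed
    dropout     -- expected fraction of subjects lost to follow-up
    """
    if not (0 < p_control < 1 and 0 < p_treatment < 1):
        raise ValueError("proportions must lie strictly between 0 and 1")
    if p_control == p_treatment:
        raise ValueError("the two proportions must differ")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    if not 0 <= dropout < 1:
        raise ValueError("dropout must be in [0, 1)")

    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_tailed else z(1 - alpha)
    z_beta = z(power)

    p_bar = (p_control + p_treatment) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p_control * (1 - p_control)
                                      + p_treatment * (1 - p_treatment))) ** 2
    n_evaluable = numerator / (p_control - p_treatment) ** 2

    # Inflate so that enough subjects remain after the expected drop-out.
    return math.ceil(n_evaluable / (1 - dropout))


if __name__ == "__main__":
    # Hypothetical inputs: baseline risk 40%, expected risk 30% on treatment.
    print("no drop-out:  ", n_per_group(0.40, 0.30))
    print("20% drop-out: ", n_per_group(0.40, 0.30, dropout=0.20))
    print("one-tailed:   ", n_per_group(0.40, 0.30, two_tailed=False))
```

Raising the desired power, choosing a two-tailed test, or expecting more drop-outs all increase the required enrolment; a larger expected difference between the arms reduces it.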
Introduction
Consequences of getting it wrong

Machine Learning Terminology: English-Chinese Glossary

机器学习专业词汇中英⽂对照activation 激活值activation function 激活函数additive noise 加性噪声autoencoder ⾃编码器Autoencoders ⾃编码算法average firing rate 平均激活率average sum-of-squares error 均⽅差backpropagation 后向传播basis 基basis feature vectors 特征基向量batch gradient ascent 批量梯度上升法Bayesian regularization method 贝叶斯规则化⽅法Bernoulli random variable 伯努利随机变量bias term 偏置项binary classfication ⼆元分类class labels 类型标记concatenation 级联conjugate gradient 共轭梯度contiguous groups 联通区域convex optimization software 凸优化软件convolution 卷积cost function 代价函数covariance matrix 协⽅差矩阵DC component 直流分量decorrelation 去相关degeneracy 退化demensionality reduction 降维derivative 导函数diagonal 对⾓线diffusion of gradients 梯度的弥散eigenvalue 特征值eigenvector 特征向量error term 残差feature matrix 特征矩阵feature standardization 特征标准化feedforward architectures 前馈结构算法feedforward neural network 前馈神经⽹络feedforward pass 前馈传导fine-tuned 微调first-order feature ⼀阶特征forward pass 前向传导forward propagation 前向传播Gaussian prior ⾼斯先验概率generative model ⽣成模型gradient descent 梯度下降Greedy layer-wise training 逐层贪婪训练⽅法grouping matrix 分组矩阵Hadamard product 阿达马乘积Hessian matrix Hessian 矩阵hidden layer 隐含层hidden units 隐藏神经元Hierarchical grouping 层次型分组higher-order features 更⾼阶特征highly non-convex optimization problem ⾼度⾮凸的优化问题histogram 直⽅图hyperbolic tangent 双曲正切函数hypothesis 估值，假设identity activation function 恒等激励函数IID 独⽴同分布illumination 照明inactive 抑制independent component analysis 独⽴成份分析input domains 输⼊域input layer 输⼊层intensity 亮度/灰度intercept term 截距KL divergence 相对熵KL divergence KL分散度k-Means K-均值learning rate 学习速率least squares 最⼩⼆乘法linear correspondence 线性响应linear superposition 线性叠加line-search algorithm 线搜索算法local mean subtraction 局部均值消减local optima 局部最优解logistic regression 逻辑回归loss function 损失函数low-pass filtering 低通滤波magnitude 幅值MAP 极⼤后验估计maximum likelihood estimation 极⼤似然估计mean 平均值MFCC Mel 倒频系数multi-class classification 多元分类neural networks 神经⽹络neuron 神经元Newton’s method ⽜顿法non-convex function ⾮凸函数non-linear feature ⾮线性特征norm 范式norm bounded 有界范数norm constrained 范数约束normalization 归⼀化numerical roundoff errors 
数值舍⼊误差numerically checking 数值检验numerically reliable 数值计算上稳定object detection 物体检测objective function ⽬标函数off-by-one error 缺位错误orthogonalization 正交化output layer 输出层overall cost function 总体代价函数over-complete basis 超完备基over-fitting 过拟合parts of objects ⽬标的部件part-whole decompostion 部分-整体分解PCA 主元分析penalty term 惩罚因⼦per-example mean subtraction 逐样本均值消减pooling 池化pretrain 预训练principal components analysis 主成份分析quadratic constraints ⼆次约束RBMs 受限Boltzman机reconstruction based models 基于重构的模型reconstruction cost 重建代价reconstruction term 重构项redundant 冗余reflection matrix 反射矩阵regularization 正则化regularization term 正则化项rescaling 缩放robust 鲁棒性run ⾏程second-order feature ⼆阶特征sigmoid activation function S型激励函数significant digits 有效数字singular value 奇异值singular vector 奇异向量smoothed L1 penalty 平滑的L1范数惩罚Smoothed topographic L1 sparsity penalty 平滑地形L1稀疏惩罚函数smoothing 平滑Softmax Regresson Softmax回归sorted in decreasing order 降序排列source features 源特征sparse autoencoder 消减归⼀化Sparsity 稀疏性sparsity parameter 稀疏性参数sparsity penalty 稀疏惩罚square function 平⽅函数squared-error ⽅差stationary 平稳性（不变性）stationary stochastic process 平稳随机过程step-size 步长值supervised learning 监督学习symmetric positive semi-definite matrix 对称半正定矩阵symmetry breaking 对称失效tanh function 双曲正切函数the average activation 平均活跃度the derivative checking method 梯度验证⽅法the empirical distribution 经验分布函数the energy function 能量函数the Lagrange dual 拉格朗⽇对偶函数the log likelihood 对数似然函数the pixel intensity value 像素灰度值the rate of convergence 收敛速度topographic cost term 拓扑代价项topographic ordered 拓扑秩序transformation 变换translation invariant 平移不变性trivial answer 平凡解under-complete basis 不完备基unrolling 组合扩展unsupervised learning ⽆监督学习variance ⽅差vecotrized implementation 向量化实现vectorization ⽮量化visual cortex 视觉⽪层weight decay 权重衰减weighted average 加权平均值whitening ⽩化zero-mean 均值为零Letter AAccumulated error backpropagation 累积误差逆传播Activation Function 激活函数Adaptive Resonance Theory/ART ⾃适应谐振理论Addictive model 加性学习Adversarial Networks 对抗⽹络Affine Layer 仿射层Affinity matrix 亲和矩阵Agent 代理 / 智能体Algorithm 算法Alpha-beta 
pruning α-β剪枝Anomaly detection 异常检测Approximation 近似Area Under ROC Curve／AUC Roc 曲线下⾯积Artificial General Intelligence/AGI 通⽤⼈⼯智能Artificial Intelligence/AI ⼈⼯智能Association analysis 关联分析Attention mechanism 注意⼒机制Attribute conditional independence assumption 属性条件独⽴性假设Attribute space 属性空间Attribute value 属性值Autoencoder ⾃编码器Automatic speech recognition ⾃动语⾳识别Automatic summarization ⾃动摘要Average gradient 平均梯度Average-Pooling 平均池化Letter BBackpropagation Through Time 通过时间的反向传播Backpropagation/BP 反向传播Base learner 基学习器Base learning algorithm 基学习算法Batch Normalization/BN 批量归⼀化Bayes decision rule 贝叶斯判定准则Bayes Model Averaging／BMA 贝叶斯模型平均Bayes optimal classifier 贝叶斯最优分类器Bayesian decision theory 贝叶斯决策论Bayesian network 贝叶斯⽹络Between-class scatter matrix 类间散度矩阵Bias 偏置 / 偏差Bias-variance decomposition 偏差-⽅差分解Bias-Variance Dilemma 偏差 – ⽅差困境Bi-directional Long-Short Term Memory/Bi-LSTM 双向长短期记忆Binary classification ⼆分类Binomial test ⼆项检验Bi-partition ⼆分法Boltzmann machine 玻尔兹曼机Bootstrap sampling ⾃助采样法／可重复采样／有放回采样Bootstrapping ⾃助法Break-Event Point／BEP 平衡点Letter CCalibration 校准Cascade-Correlation 级联相关Categorical attribute 离散属性Class-conditional probability 类条件概率Classification and regression tree/CART 分类与回归树Classifier 分类器Class-imbalance 类别不平衡Closed -form 闭式Cluster 簇/类/集群Cluster analysis 聚类分析Clustering 聚类Clustering ensemble 聚类集成Co-adapting 共适应Coding matrix 编码矩阵COLT 国际学习理论会议Committee-based learning 基于委员会的学习Competitive learning 竞争型学习Component learner 组件学习器Comprehensibility 可解释性Computation Cost 计算成本Computational Linguistics 计算语⾔学Computer vision 计算机视觉Concept drift 概念漂移Concept Learning System /CLS 概念学习系统Conditional entropy 条件熵Conditional mutual information 条件互信息Conditional Probability Table／CPT 条件概率表Conditional random field/CRF 条件随机场Conditional risk 条件风险Confidence 置信度Confusion matrix 混淆矩阵Connection weight 连接权Connectionism 连结主义Consistency ⼀致性／相合性Contingency table 列联表Continuous attribute 连续属性Convergence 收敛Conversational agent 会话智能体Convex quadratic programming 凸⼆次规划Convexity 凸性Convolutional neural network/CNN 
卷积神经⽹络Co-occurrence 同现Correlation coefficient 相关系数Cosine similarity 余弦相似度Cost curve 成本曲线Cost Function 成本函数Cost matrix 成本矩阵Cost-sensitive 成本敏感Cross entropy 交叉熵Cross validation 交叉验证Crowdsourcing 众包Curse of dimensionality 维数灾难Cut point 截断点Cutting plane algorithm 割平⾯法Letter DData mining 数据挖掘Data set 数据集Decision Boundary 决策边界Decision stump 决策树桩Decision tree 决策树／判定树Deduction 演绎Deep Belief Network 深度信念⽹络Deep Convolutional Generative Adversarial Network/DCGAN 深度卷积⽣成对抗⽹络Deep learning 深度学习Deep neural network/DNN 深度神经⽹络Deep Q-Learning 深度 Q 学习Deep Q-Network 深度 Q ⽹络Density estimation 密度估计Density-based clustering 密度聚类Differentiable neural computer 可微分神经计算机Dimensionality reduction algorithm 降维算法Directed edge 有向边Disagreement measure 不合度量Discriminative model 判别模型Discriminator 判别器Distance measure 距离度量Distance metric learning 距离度量学习Distribution 分布Divergence 散度Diversity measure 多样性度量／差异性度量Domain adaption 领域⾃适应Downsampling 下采样D-separation （Directed separation）有向分离Dual problem 对偶问题Dummy node 哑结点Dynamic Fusion 动态融合Dynamic programming 动态规划Letter EEigenvalue decomposition 特征值分解Embedding 嵌⼊Emotional analysis 情绪分析Empirical conditional entropy 经验条件熵Empirical entropy 经验熵Empirical error 经验误差Empirical risk 经验风险End-to-End 端到端Energy-based model 基于能量的模型Ensemble learning 集成学习Ensemble pruning 集成修剪Error Correcting Output Codes／ECOC 纠错输出码Error rate 错误率Error-ambiguity decomposition 误差-分歧分解Euclidean distance 欧⽒距离Evolutionary computation 演化计算Expectation-Maximization 期望最⼤化Expected loss 期望损失Exploding Gradient Problem 梯度爆炸问题Exponential loss function 指数损失函数Extreme Learning Machine/ELM 超限学习机Letter FFactorization 因⼦分解False negative 假负类False positive 假正类False Positive Rate/FPR 假正例率Feature engineering 特征⼯程Feature selection 特征选择Feature vector 特征向量Featured Learning 特征学习Feedforward Neural Networks/FNN 前馈神经⽹络Fine-tuning 微调Flipping output 翻转法Fluctuation 震荡Forward stagewise algorithm 前向分步算法Frequentist 频率主义学派Full-rank matrix 满秩矩阵Functional neuron 功能神经元Letter GGain ratio 增益率Game theory 博弈论Gaussian kernel function 
⾼斯核函数Gaussian Mixture Model ⾼斯混合模型General Problem Solving 通⽤问题求解Generalization 泛化Generalization error 泛化误差Generalization error bound 泛化误差上界Generalized Lagrange function ⼴义拉格朗⽇函数Generalized linear model ⼴义线性模型Generalized Rayleigh quotient ⼴义瑞利商Generative Adversarial Networks/GAN ⽣成对抗⽹络Generative Model ⽣成模型Generator ⽣成器Genetic Algorithm/GA 遗传算法Gibbs sampling 吉布斯采样Gini index 基尼指数Global minimum 全局最⼩Global Optimization 全局优化Gradient boosting 梯度提升Gradient Descent 梯度下降Graph theory 图论Ground-truth 真相／真实Letter HHard margin 硬间隔Hard voting 硬投票Harmonic mean 调和平均Hesse matrix 海塞矩阵Hidden dynamic model 隐动态模型Hidden layer 隐藏层Hidden Markov Model/HMM 隐马尔可夫模型Hierarchical clustering 层次聚类Hilbert space 希尔伯特空间Hinge loss function 合页损失函数Hold-out 留出法Homogeneous 同质Hybrid computing 混合计算Hyperparameter 超参数Hypothesis 假设Hypothesis test 假设验证Letter IICML 国际机器学习会议Improved iterative scaling/IIS 改进的迭代尺度法Incremental learning 增量学习Independent and identically distributed/i.i.d. 独⽴同分布Independent Component Analysis/ICA 独⽴成分分析Indicator function 指⽰函数Individual learner 个体学习器Induction 归纳Inductive bias 归纳偏好Inductive learning 归纳学习Inductive Logic Programming／ILP 归纳逻辑程序设计Information entropy 信息熵Information gain 信息增益Input layer 输⼊层Insensitive loss 不敏感损失Inter-cluster similarity 簇间相似度International Conference for Machine Learning/ICML 国际机器学习⼤会Intra-cluster similarity 簇内相似度Intrinsic value 固有值Isometric Mapping/Isomap 等度量映射Isotonic regression 等分回归Iterative Dichotomiser 迭代⼆分器Letter KKernel method 核⽅法Kernel trick 核技巧Kernelized Linear Discriminant Analysis／KLDA 核线性判别分析K-fold cross validation k 折交叉验证／k 倍交叉验证K-Means Clustering K – 均值聚类K-Nearest Neighbours Algorithm/KNN K近邻算法Knowledge base 知识库Knowledge Representation 知识表征Letter LLabel space 标记空间Lagrange duality 拉格朗⽇对偶性Lagrange multiplier 拉格朗⽇乘⼦Laplace smoothing 拉普拉斯平滑Laplacian correction 拉普拉斯修正Latent Dirichlet Allocation 隐狄利克雷分布Latent semantic analysis 潜在语义分析Latent variable 隐变量Lazy learning 懒惰学习Learner 学习器Learning by analogy 类⽐学习Learning rate 学习率Learning Vector Quantization/LVQ 
学习向量量化Least squares regression tree 最⼩⼆乘回归树Leave-One-Out/LOO 留⼀法linear chain conditional random field 线性链条件随机场Linear Discriminant Analysis／LDA 线性判别分析Linear model 线性模型Linear Regression 线性回归Link function 联系函数Local Markov property 局部马尔可夫性Local minimum 局部最⼩Log likelihood 对数似然Log odds／logit 对数⼏率Logistic Regression Logistic 回归Log-likelihood 对数似然Log-linear regression 对数线性回归Long-Short Term Memory/LSTM 长短期记忆Loss function 损失函数Letter MMachine translation/MT 机器翻译Macron-P 宏查准率Macron-R 宏查全率Majority voting 绝对多数投票法Manifold assumption 流形假设Manifold learning 流形学习Margin theory 间隔理论Marginal distribution 边际分布Marginal independence 边际独⽴性Marginalization 边际化Markov Chain Monte Carlo/MCMC 马尔可夫链蒙特卡罗⽅法Markov Random Field 马尔可夫随机场Maximal clique 最⼤团Maximum Likelihood Estimation/MLE 极⼤似然估计／极⼤似然法Maximum margin 最⼤间隔Maximum weighted spanning tree 最⼤带权⽣成树Max-Pooling 最⼤池化Mean squared error 均⽅误差Meta-learner 元学习器Metric learning 度量学习Micro-P 微查准率Micro-R 微查全率Minimal Description Length/MDL 最⼩描述长度Minimax game 极⼩极⼤博弈Misclassification cost 误分类成本Mixture of experts 混合专家Momentum 动量Moral graph 道德图／端正图Multi-class classification 多分类Multi-document summarization 多⽂档摘要Multi-layer feedforward neural networks 多层前馈神经⽹络Multilayer Perceptron/MLP 多层感知器Multimodal learning 多模态学习Multiple Dimensional Scaling 多维缩放Multiple linear regression 多元线性回归Multi-response Linear Regression ／MLR 多响应线性回归Mutual information 互信息Letter NNaive bayes 朴素贝叶斯Naive Bayes Classifier 朴素贝叶斯分类器Named entity recognition 命名实体识别Nash equilibrium 纳什均衡Natural language generation/NLG ⾃然语⾔⽣成Natural language processing ⾃然语⾔处理Negative class 负类Negative correlation 负相关法Negative Log Likelihood 负对数似然Neighbourhood Component Analysis/NCA 近邻成分分析Neural Machine Translation 神经机器翻译Neural Turing Machine 神经图灵机Newton method ⽜顿法NIPS 国际神经信息处理系统会议No Free Lunch Theorem／NFL 没有免费的午餐定理Noise-contrastive estimation 噪⾳对⽐估计Nominal attribute 列名属性Non-convex optimization ⾮凸优化Nonlinear model ⾮线性模型Non-metric distance ⾮度量距离Non-negative matrix factorization ⾮负矩阵分解Non-ordinal attribute 
⽆序属性Non-Saturating Game ⾮饱和博弈Norm 范数Normalization 归⼀化Nuclear norm 核范数Numerical attribute 数值属性Letter OObjective function ⽬标函数Oblique decision tree 斜决策树Occam’s razor 奥卡姆剃⼑Odds ⼏率Off-Policy 离策略One shot learning ⼀次性学习One-Dependent Estimator／ODE 独依赖估计On-Policy 在策略Ordinal attribute 有序属性Out-of-bag estimate 包外估计Output layer 输出层Output smearing 输出调制法Overfitting 过拟合／过配Oversampling 过采样Letter PPaired t-test 成对 t 检验Pairwise 成对型Pairwise Markov property 成对马尔可夫性Parameter 参数Parameter estimation 参数估计Parameter tuning 调参Parse tree 解析树Particle Swarm Optimization/PSO 粒⼦群优化算法Part-of-speech tagging 词性标注Perceptron 感知机Performance measure 性能度量Plug and Play Generative Network 即插即⽤⽣成⽹络Plurality voting 相对多数投票法Polarity detection 极性检测Polynomial kernel function 多项式核函数Pooling 池化Positive class 正类Positive definite matrix 正定矩阵Post-hoc test 后续检验Post-pruning 后剪枝potential function 势函数Precision 查准率／准确率Prepruning 预剪枝Principal component analysis/PCA 主成分分析Principle of multiple explanations 多释原则Prior 先验Probability Graphical Model 概率图模型Proximal Gradient Descent/PGD 近端梯度下降Pruning 剪枝Pseudo-label 伪标记Letter QQuantized Neural Network 量⼦化神经⽹络Quantum computer 量⼦计算机Quantum Computing 量⼦计算Quasi Newton method 拟⽜顿法Letter RRadial Basis Function／RBF 径向基函数Random Forest Algorithm 随机森林算法Random walk 随机漫步Recall 查全率／召回率Receiver Operating Characteristic/ROC 受试者⼯作特征Rectified Linear Unit/ReLU 线性修正单元Recurrent Neural Network 循环神经⽹络Recursive neural network 递归神经⽹络Reference model 参考模型Regression 回归Regularization 正则化Reinforcement learning/RL 强化学习Representation learning 表征学习Representer theorem 表⽰定理reproducing kernel Hilbert space/RKHS 再⽣核希尔伯特空间Re-sampling 重采样法Rescaling 再缩放Residual Mapping 残差映射Residual Network 残差⽹络Restricted Boltzmann Machine/RBM 受限玻尔兹曼机Restricted Isometry Property/RIP 限定等距性Re-weighting 重赋权法Robustness 稳健性/鲁棒性Root node 根结点Rule Engine 规则引擎Rule learning 规则学习Letter SSaddle point 鞍点Sample space 样本空间Sampling 采样Score function 评分函数Self-Driving ⾃动驾驶Self-Organizing Map／SOM ⾃组织映射Semi-naive Bayes classifiers 半朴素贝叶斯分类器Semi-Supervised 
Learning 半监督学习semi-Supervised Support Vector Machine 半监督⽀持向量机Sentiment analysis 情感分析Separating hyperplane 分离超平⾯Sigmoid function Sigmoid 函数Similarity measure 相似度度量Simulated annealing 模拟退⽕Simultaneous localization and mapping 同步定位与地图构建Singular Value Decomposition 奇异值分解Slack variables 松弛变量Smoothing 平滑Soft margin 软间隔Soft margin maximization 软间隔最⼤化Soft voting 软投票Sparse representation 稀疏表征Sparsity 稀疏性Specialization 特化Spectral Clustering 谱聚类Speech Recognition 语⾳识别Splitting variable 切分变量Squashing function 挤压函数Stability-plasticity dilemma 可塑性-稳定性困境Statistical learning 统计学习Status feature function 状态特征函Stochastic gradient descent 随机梯度下降Stratified sampling 分层采样Structural risk 结构风险Structural risk minimization/SRM 结构风险最⼩化Subspace ⼦空间Supervised learning 监督学习／有导师学习support vector expansion ⽀持向量展式Support Vector Machine/SVM ⽀持向量机Surrogat loss 替代损失Surrogate function 替代函数Symbolic learning 符号学习Symbolism 符号主义Synset 同义词集Letter TT-Distribution Stochastic Neighbour Embedding/t-SNE T – 分布随机近邻嵌⼊Tensor 张量Tensor Processing Units/TPU 张量处理单元The least square method 最⼩⼆乘法Threshold 阈值Threshold logic unit 阈值逻辑单元Threshold-moving 阈值移动Time Step 时间步骤Tokenization 标记化Training error 训练误差Training instance 训练⽰例／训练例Transductive learning 直推学习Transfer learning 迁移学习Treebank 树库Tria-by-error 试错法True negative 真负类True positive 真正类True Positive Rate/TPR 真正例率Turing Machine 图灵机Twice-learning ⼆次学习Letter UUnderfitting ⽋拟合／⽋配Undersampling ⽋采样Understandability 可理解性Unequal cost ⾮均等代价Unit-step function 单位阶跃函数Univariate decision tree 单变量决策树Unsupervised learning ⽆监督学习／⽆导师学习Unsupervised layer-wise training ⽆监督逐层训练Upsampling 上采样Letter VVanishing Gradient Problem 梯度消失问题Variational inference 变分推断VC Theory VC维理论Version space 版本空间Viterbi algorithm 维特⽐算法Von Neumann architecture 冯 · 诺伊曼架构Letter WWasserstein GAN/WGAN Wasserstein⽣成对抗⽹络Weak learner 弱学习器Weight 权重Weight sharing 权共享Weighted voting 加权投票法Within-class scatter matrix 类内散度矩阵Word embedding 词嵌⼊Word sense disambiguation 词义消歧Letter ZZero-data learning 零数据学习Zero-shot learning 
零次学习

A
approximations 近似值
arbitrary 任意的，随意的
affine 仿射的
amino acid 氨基酸
amenable 经得起检验的
axiom 公理，原则
abstract 提取；抽象
architecture 架构，体系结构；建造业
absolute 绝对的
arsenal 军火库
assignment 分配
algebra 代数
asymptotically 渐近地
appropriate 恰当的

B
bias 偏差
brevity 简短，简洁；短暂
broader 更广泛的
briefly 简短地
batch 批量

C
convergence 收敛，集中到一点
convex 凸的
contours 轮廓
constraint 约束
constant 常数
commercial 商务的
complementarity 互补性
coordinate ascent 坐标上升
clipping 剪下物；剪报；修剪
component 分量；部件
continuous 连续的
covariance 协方差
canonical 正规的，正则的
concave 凹的
corresponds 相符合；相当；通信
corollary 推论
concrete 具体的事物，实在的东西
cross validation 交叉验证
correlation 相互关系
convention 约定
cluster 一簇
centroids 质心，形心
converge 收敛
computationally 计算(机)的
calculus 微积分

D
derive 推导；获得，取得
dual 对偶的；二元的
duality 二元性；二象性；对偶性
derivation 求导；得到；起源
denote 预示，表示，是…的标志；意味着，[逻]指称
divergence 散度；发散性
dimension 尺度，规格；维数
dot 小圆点
distortion 变形
density 密度
discrete 离散的
discriminative 有识别能力的
diagonal 对角
dispersion 分散，散开
determinant 行列式；决定因素
disjoint 不相交的

E
encounter 遇到
ellipses 椭圆
equality 等式
extra 额外的
empirical 经验的；观察的
enumerate 列举，计数
exceed 超过，越出
expectation 期望
efficient 高效的
endow 赋予
explicitly 明确地
exponential family 指数族
equivalently 等价地

F
feasible 可行的
foray 初次尝试
finite 有限的，限定的
forgo 摒弃，放弃
filter 过滤
frequentist 频率学派的
forward search 前向式搜索
formalize 形式化；使定形

G
generalized 广义的
generalization 泛化；概括，归纳；普遍化
guarantee 保证；抵押品
generate 形成，产生
geometric margins 几何间隔
gap 裂口；间隙
generative 生成的；有生产力的

H
heuristic 启发式的；启发法；启发程序
hone 磨练
hyperplane 超平面

I
initial 最初的
implement 执行
intuitive 凭直觉获知的
incremental 增加的
intercept 截距
intuition 直觉
instantiation 例子
indicator 指示物，指示器
iterative 重复的，迭代的
integral 积分
identical 相等的；完全相同的
indicate 表示，指出
invariance 不变性，恒定性
impose 把…强加于
intermediate 中间的
interpretation 解释，翻译

J
joint distribution 联合分布

L
lieu 替代
logarithmic 对数的，用对数表示的
latent 潜在的
Leave-one-out cross validation 留一法交叉验证

M
magnitude 大小，量级
mapping 绘图，制图；映射
matrix 矩阵
mutual 相互的，共同的
monotonically 单调地
minor 较小的，次要的
multinomial 多项的
multi-class classification 多分类问题

N
nasty 讨厌的
notation 记号，符号
naïve 朴素的

O
obtain 得到
oscillate 摆动
optimization problem 最优化问题
objective function 目标函数
optimal 最理想的
orthogonal (矢量，矩阵等)正交的
orientation 方向
ordinary 普通的
occasionally 偶然的

P
partial derivative 偏导数
property 性质
proportional 成比例的
primal 原始的，最初的
permit 允许
pseudocode 伪代码
permissible 可允许的
polynomial 多项式
preliminary 预备
precision 精度
perturbation 扰动
posit 假定，设想
positive semi-definite 半正定的
parentheses 圆括号
posterior probability 后验概率
pictorially 形象地，以图示方式
parameterize 确定…的参数
Poisson distribution 泊松分布
pertinent 相关的

Q
quadratic 二次的
quantity 量，数量；分量
query 查询；疑问

R
regularization 正则化
reoptimize 重新优化
restrict 限制；限定；约束
reminiscent 回忆往事的；提醒的；使人联想…的（of）
remark 注意
random variable 随机变量
respect 考虑
respectively 各自地；分别地
redundant 过多的；冗余的

S
susceptible 敏感的
stochastic 随机的
symmetric 对称的
sophisticated 复杂的
spurious 假的；伪造的
subtract 减去
simultaneously 同时发生地；同步地
suffice 满足
scarce 稀有的，难得的
split 分解，分离
subset 子集
statistic 统计量
successive iterations 连续的迭代
scale 标度
sort of 有几分的
squares 平方

T
trajectory 轨迹
temporarily 暂时的
terminology 专用名词
tolerance 容忍；公差
thumb 翻阅
threshold 阈，临界
theorem 定理
tangent 正切；切线

U
unit-length vector 单位向量

V
valid 有效的，正确的
variance 方差
variable 变量；变元
vocabulary 词汇
valued 经估价的；宝贵的

W
wrapper 包装

sse Package User Guide

Package 'sse'
October 14, 2022

Type Package
Title Sample Size Estimation
Version 0.7-17
Author Thomas Fabbro [aut, cre]
Maintainer Thomas Fabbro <***********************>
URL /projects/power/
BugReports /projects/power/
Description Provides functions to evaluate user-defined power functions for a parameter range, and draws a sensitivity plot. It also provides a resampling procedure for semi-parametric sample size estimation and methods for adding information to a Sweave report.
License GPL-3
LazyLoad yes
Imports methods, grid, lattice, graphics, stats, parallel
Suggests testthat
NeedsCompilation no
Repository CRAN
Repository/R-Forge/Project power
Repository/R-Forge/Revision 44
Repository/R-Forge/DateTimeStamp 2021-05-19 12:29:42
Date/Publication 2021-05-19 15:20:02 UTC

R topics documented:

    Extracting actual elements from objects of class powPar
    Extracting from objects of class powPar
    inspect
    plot
    powCalc
    powEx
    powPar
    refine
    tex
    update


Extracting actual elements from objects of class powPar
Extracting an actual n, theta, and xi

Description

Extracting the actual (i.e. current) n, theta, or xi from an object of class powPar. These functions are needed within the 'power-function' so that the current element is always extracted during evaluation.

Usage

    n(x)
    theta(x)
    xi(x)

Arguments

    x    An object of class powPar.

Details

During the evaluation process with powCalc, every combination of n, theta, and xi is evaluated. The functions described here extract the actual n, theta, or xi during that process. powCalc changes the actual element step by step to ensure that all combinations are evaluated.

When an object of class powPar is created, the first element of n, theta, and xi is set to be the actual element. This allows these methods to be used outside the evaluation with powCalc as well, for testing the 'power function'.

Value

An integer value for n. A numeric value for theta and xi.

Note

Do not use the method pp inside the power-function, e.g. like pp(x, "n"), because this would extract the whole vector of n and not just the actual element.

See Also

pp, for extracting all other elements provided by the user (except n, theta, and xi).

Examples

    ## defining the range of n and theta to be evaluated
    psi <- powPar(n = seq(from = 20, to = 60, by = 2),
                  theta = seq(from = 0.5, to = 1.5, by = 0.1),
                  muA = 0,
                  muB = 1)

    ## extracting all elements of psi individually, starting with the first
    n(psi)
    theta(psi)
    xi(psi)

    ## extracting all elements, not just the actual:
    pp(psi, name = "n")
    pp(psi, name = "theta")
    pp(psi, name = "xi")

    ## an example of usage
    powFun <- function(psi) {
      power.t.test(n = n(psi),
                   delta = pp(psi, "muA") - pp(psi, "muB"),
                   sd = theta(psi)
                   )$power
    }

    ## testing the power-function
    powFun(psi)


Extracting from objects of class powPar
Extracting from an object of class powPar

Description

All information needed for the 'power-function' should be provided by an object of class powPar. To extract this information, the function pp should be used.

Usage

    pp(x, name)

Arguments

    x       An object of class powPar.
    name    A character indicating the name of the object to be extracted.

Value

Anything that can be stored within a list can be returned.

Note

The name pp is an abbreviation for power parameter.

See Also

For extracting individual elements of n, theta and xi, the functions n, theta, or xi should be used.

Examples

    psi <- powPar(theta = seq(from = 0.5, to = 1.5, by = 0.5),
                  n = seq(from = 10, to = 30, by = 10),
                  muA = 0,
                  muB = 1)

    pp(psi, name = "muA")

    ## an example of usage
    powFun <- function(psi) {
      power.t.test(n = n(psi),
                   delta = pp(psi, "muA") - pp(psi, "muB"),
                   sd = theta(psi)
                   )$power
    }

    ## testing the power-function
    powFun(psi)


inspect
Inspection Plot

Description

Creating a plot that allows the sample size calculation to be inspected.

Usage

    inspect(object)

Arguments

    object    An object of class power.

Details

For every evaluated theta, the plot shows the sample size and the power on a transformed scale. The method used for sample size estimation, 'step' or 'lm', is indicated. If the method 'lm' is used, a red regression line is shown for the range that was used for estimation.

Value

A plot is generated but nothing is returned.

Examples

    ## defining the range of n and theta to be evaluated
    psi <- powPar(theta = seq(from = 0.5, to = 1.5, by = 0.1),
                  n = seq(from = 20, to = 60, by = 2),
                  muA = 0,
                  muB = 1)

    ## defining a power-function
    powFun <- function(psi) {
      power.t.test(n = n(psi) / 2,
                   delta = pp(psi, "muA") - pp(psi, "muB"),
                   sd = theta(psi)
                   )$power
    }

    ## evaluating the power-function for all combinations of n and theta
    calc <- powCalc(psi, powFun)

    ## adding example at theta of 1 and power of 0.9
    pow <- powEx(calc, theta = 1, power = 0.9)

    ## drawing an inspection plot
    inspect(pow)


plot
Power Plot

Description

A sensitivity plot (called power plot) for the sample size. Using a contour for a given power, it shows how the sample size changes if theta is varied.

Usage

    plot(x, y, ...)

Arguments

    x      The object of class power used for plotting.
    y      Not used.
    ...    Additional arguments implemented:

           • at = c(0.9, 0.8, 0.85, 0.95): a numeric vector giving breakpoints along the power range. Contours (if any) will be drawn at these values. The contour line of the example will be emphasised. If example = FALSE, the first number indicates which contour should be emphasised.
           • smooth = FALSE: a logical that indicates if the contours should be smoothed. If TRUE, a span of 0.75 will be used by default. Alternatively, the argument smooth can also take a numeric value that will be used for smoothing. See the documentation of the function loess for details.
           • example = TRUE: a logical indicating if an example should be drawn or not. An example is an arrow that points from the particular theta on the x-axis to the contour line and on to the sample size on the y-axis.
           • reflines = TRUE: a logical indicating if reference lines should be drawn or not. Reference lines are drawn at every n and theta that was used for evaluating the power function. If reference lines are drawn, the background will be grey.

Details

Generates a contour plot with theta on the x-axis, n on the y-axis, and the contours for the estimated power (indicated with the argument at).

Value

A plot is generated but nothing is returned.

See Also

inspect for drawing an inspection plot and levelplot for further arguments that can be passed to plot.

Examples

    ## defining the range of n and theta to be evaluated
    psi <- powPar(theta = seq(from = 0.5, to = 1.5, by = 0.1),
                  n = seq(from = 20, to = 60, by = 2),
                  muA = 0,
                  muB = 1)

    ## defining a power-function
    powFun <- function(psi) {
      power.t.test(n = n(psi) / 2,
                   delta = pp(psi, "muA") - pp(psi, "muB"),
                   sd = theta(psi)
                   )$power
    }

    ## evaluating the power-function for all combinations of n and theta
    calc <- powCalc(psi, powFun)

    ## adding example at theta of 1 and power of 0.9
    pow <- powEx(calc, theta = 1, power = 0.9)

    ## drawing the power plot with 3 contour lines
    plot(pow,
         xlab = "Standard Deviation",
         ylab = "Total Sample Size",
         at = c(0.85, 0.9, 0.95))

    ## without example the contour line at the first element of at is bold
    plot(pow, example = FALSE)


powCalc
Power calculation

Description

The user-defined 'power-function' provided as statistic will be evaluated for the whole range of n, theta, and xi as specified in the powPar object.

Usage

    powCalc(object, statistic, n.iter = NA, cluster = FALSE)

Arguments

    object       An object of class powPar.
    statistic    A function that takes an object of class powPar as argument. Ideally this is also the only argument. The function should return a vector of numeric values or a vector of logicals, depending on the type. See Details.
    n.iter       A number specifying how often the power-function is evaluated.
    cluster      Still experimental! This argument can be logical, indicating if the library parallel should be used or not, or numeric. In the latter case the number is passed as integer to the function makeCluster from library parallel. The default is FALSE.
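The interplay between a logical-valued statistic and n.iter can be illustrated in base R: repeat the statistic n.iter times and take the share of TRUE results as the power estimate. The sketch below is only an illustration of that idea and makes no claim about how sse implements it; estimate_power and one_trial are hypothetical helpers that do not exist in the package.

```r
set.seed(1)

## Hypothetical helper (not part of sse): evaluate a logical-valued
## statistic n.iter times and interpret the proportion of TRUE as power.
estimate_power <- function(statistic, n.iter) {
  hits <- replicate(n.iter, statistic())
  mean(hits)
}

## Hypothetical single trial: two groups of n/2 observations each,
## shifted by theta, compared with a Wilcoxon rank-sum test.
one_trial <- function(n = 40, theta = 1) {
  x <- rnorm(n / 2)
  y <- rnorm(n / 2) + theta
  wilcox.test(x = x, y = y)$p.value < 0.05
}

## Power estimate from 200 simulated trials (a value between 0 and 1).
power_hat <- estimate_power(one_trial, n.iter = 200)
power_hat
```

With few iterations the estimate is noisy, which is why increasing the number of iterations afterwards can be worthwhile for resampling-based calculations.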
Details

If the statistic does not return the power (a numeric value between 0 and 1) but returns a logical (TRUE or FALSE), the argument n.iter is expected. The statistic will then be evaluated n.iter times and the proportion of successes will be interpreted as the power.

Value

An object of class powCalc.

Examples

    ## defining the range of n and theta to be evaluated
    psi <- powPar(theta = seq(from = 0.5, to = 1.5, by = 0.1),
                  n = seq(from = 20, to = 60, by = 2),
                  muA = 0,
                  muB = 1)

    ## defining a power-function
    powFun <- function(psi) {
      power.t.test(n = n(psi) / 2,
                   delta = pp(psi, "muA") - pp(psi, "muB"),
                   sd = theta(psi)
                   )$power
    }

    ## evaluating the power-function for all combinations of n and theta
    calc <- powCalc(psi, powFun)


powEx
Defining the example to be used and the method to be used for sample size estimation.

Description

A function for constructing an object of class power, used for drawing an example in a sensitivity plot and for estimating the sample size.

Usage

    powEx(x, theta, xi = NA, endpoint = NA, power = 0.9, drop = 0,
          method = c("default", "lm", "step"), lm.range = NA,
          forceDivisor = FALSE)

Arguments

    x               An object of class powCalc.
    theta           A numeric value indicating for which theta to draw the example in the sensitivity plot and where to evaluate the sample size. It only makes sense to use a theta within the evaluated range.
    xi              A numeric value, as theta but for xi.
    endpoint        Object of class character, indicating for which endpoint the sample size should be evaluated.
    power           Object of class numeric, indicating for what power the sample size should be evaluated.
    method          Defines how the sample size for the example is calculated. method = "default" uses "lm" if resampling was used to calculate the powCalc object, otherwise "step" is used.
    lm.range        The range of evaluations that are used for estimating the sample size if method = "lm" or if method evaluates to "lm". For the default lm.range = 0.2 this means from 80 to 120% of the power in the example, e.g. for the power of 0.9 this is a range from 0.72 to 1.08. Note that the range is cut at 0 and 1.
    drop            Object of class numeric (range: 0 to 1), indicating how many drop-outs are expected. This information is used to calculate the number of subjects that should be recruited (addressed e.g. by the function tex using type nRec).
    forceDivisor    If TRUE, the biggest common divisor of all evaluated sample sizes is used as divisor and the estimated sample size is increased to be divisible by this divisor. If an integer is provided, it is used as divisor.

Details

For method equal to "lm", a linear model is fit as lm(sample.size ~ transformed(power)) with all data where theta and xi are equal to the theta and xi of the example and where the power lies within the range defined by the argument lm.range. This model is then used for predicting the sample size. Always inspect the result using inspect!

The method "step" returns the last element in the sequence of sample size-power pairs, sorted with decreasing power, where the power is above the power defined for the example.

Value

An object of class power.

Note

In older versions of the package, the function merge was used together with an object of class powEx to form an object of class power.

Examples

    ## defining the range of n and theta to be evaluated
    psi <- powPar(theta = seq(from = 0.5, to = 1.5, by = 0.1),
                  n = seq(from = 20, to = 60, by = 2),
                  muA = 0,
                  muB = 1)

    ## defining a power-function
    powFun <- function(psi) {
      power.t.test(n = n(psi) / 2,
                   delta = pp(psi, "muA") - pp(psi, "muB"),
                   sd = theta(psi)
                   )$power
    }

    ## evaluating the power-function for all combinations of n and theta
    calc <- powCalc(psi, powFun)

    ## adding example at theta of 1 and power of 0.9
    pow <- powEx(calc, theta = 1, power = 0.9)

    ## drawing the power plot with 3 contour lines
    plot(pow,
         xlab = "Standard Deviation",
         ylab = "Total Sample Size",
         at = c(0.85, 0.9, 0.95))

    ## changing the estimation method
    pow2 <- powEx(calc, theta = 1, power = 0.9, method = "lm")

    ## drawing an inspection plot
    inspect(pow2)


powPar
Constructing an object of class 'powPar'.

Description

A function for constructing an object of class powPar. Such an object is used for evaluating the user-defined 'power function' for a parameter range. All information that is needed for calculating the power (e.g. a pilot data set) should be provided by making use of the ... argument.

Usage

    powPar(n, theta = NA, xi = NA, ...)

Arguments

    n        A numeric vector, indicating for which sample sizes to evaluate the power function.
    theta    A numeric vector that will be used for evaluating the power function. The method theta can be used within the power function to extract the elements of this vector one by one.
    xi       A numeric vector that will be used for evaluating the power function. Since an individual sensitivity plot has to be constructed for every element of xi, the xi vector is usually short.
    ...      This argument is used to provide additional parameters needed by the power function for calculating the power. These parameters can be extracted using the function pp.

Details

An object of class powPar is used to evaluate the 'power function' for a range of n and theta and optionally for several xi values.

The user can write a 'power function' and extract the individual elements using the functions n, theta, xi and pp. It is good practice to include everything that is needed for the calculation, including data sets etc.

To extract the whole vector of theta, instead of individual values, you can use the method pp with the name theta.

For historical reasons: if the argument theta = NA, the argument (a character) has to be used to indicate the name of a numeric vector that was passed to the argument (...). The same is true for the argument xi.

Value

An object of the class powPar.

Examples

    ## defining the range of n and theta to be evaluated
    psi <- powPar(n = seq(from = 20, to = 60, by = 2),
                  theta = seq(from = 0.5, to = 1.5, by = 0.05))

    ## defining a power-function
    powFun <- function(psi) {
      return(power.t.test(n = n(psi) / 2,
                          delta = theta(psi),
                          sig.level = 0.05)$power)
    }

    ## evaluating the power-function for all combinations of n and theta
    calc <- powCalc(psi, statistic = powFun)

    ## adding example at theta of 1 and power of 0.9 (the default)
    pow <- powEx(calc, theta = 1)

    ## drawing the power plot
    plot(pow, xlab = "Difference", ylab = "Total Sample Size")


refine
Refining the estimation

Description

Increasing the number of iterations for estimating the sample size for the 'theta' and 'xi' as specified for the example.

Usage

    refine(object, factor = 10)

Arguments

    object    An object of class power.
    factor    An integer larger than one that is multiplied with the available number of iterations to form the target number of iterations.

Value

An object of class power.

Note

This function is only useful if the object of class power was generated using a resampling approach.

Examples

    ## takes quite some time

    ## defining the range of n and theta to be evaluated
    psi <- powPar(theta = seq(from = 0.5, to = 1.5, by = 0.1),
                  n = seq(from = 20, to = 60, by = 2))

    ## defining a power-function
    powFun <- function(psi) {
      x <- rnorm(n(psi) / 2)
      y <- rnorm(n(psi) / 2) + theta(psi)
      return(wilcox.test(x = x, y = y)$p.value < 0.05)
    }

    ## evaluating the power-function for all combinations of n and theta
    calc <- powCalc(psi, powFun, n.iter = 10)

    ## adding example at theta of 1 and power of 0.9
    pow <- powEx(calc, theta = 1, power = 0.9)

    ## another 90 (= 100 - 10) iterations
    refine(pow)


tex
Preparing text for using with LaTeX

Description

Methods for function tex.

Usage

    tex(x, type, ...)

Arguments

    x       The object of class power used for extraction.
    type    Currently available:

            • "drop", indicating the drop-out rate used for calculation.
            • "nRec", sample size that was increased to take into account the drop-out rate.
            • "nEval", sample size needed for evaluation without taking into account the drop-out rate.
            • "n.iter", number of iterations used for calculation.
            • "power", 'power' used for calculation.
            • "sampling", a description of the sampling process.
            • "theta", 'theta' used for calculation.
    ...     Not used so far.

Value

A character string.

Methods

These methods prepare strings that can be used directly for including information from objects of class power in Sweave reports.

Examples

    ## defining the range of n and theta to be evaluated
    psi <- powPar(theta = seq(from = 0.5, to = 1.5, by = 0.1),
                  n = seq(from = 20, to = 60, by = 2),
                  muA = 0,
                  muB = 1)

    ## defining a power-function
    powFun <- function(psi) {
      power.t.test(n = n(psi) / 2,
                   delta = pp(psi, "muA") - pp(psi, "muB"),
                   sd = theta(psi)
                   )$power
    }

    ## evaluating the power-function for all combinations of n and theta
    calc <- powCalc(psi, powFun)

    ## adding example at theta of 1 and power of 0.9
    pow <- powEx(calc, theta = 1, power = 0.9)

    ## drawing the power plot with 3 contour lines
    plot(pow,
         xlab = "Standard Deviation",
         ylab = "Total Sample Size",
         at = c(0.85, 0.9, 0.95))

    ## extracting a description of the sampling process
    tex(pow, type = "sampling")


update
Updating a powCalc or a power object.

Description

A function for updating an existing object of class powCalc or power.

Usage

    update(object, ...)

Arguments

    object    An object of class powCalc or power.
    ...       The following elements (slots) of the object can be updated:

              n.iter       A number indicating the number of iterations used to determine the power if the calculation is based on resampling. The existing iterations will be kept. n.iter indicates the number of iterations after evaluation, therefore n.iter has to be equal to or larger than the existing number of iterations.
              n            A vector with numbers for evaluating the power. New elements will be evaluated and existing elements reused. If some elements of the original are not part of n, they will be omitted.
              theta        See n for details.
              xi           See n for details.
              statistic    A function of an object of class psi. If a new statistic is provided, all elements will be evaluated again.

Value

An object of class powCalc.

Note

Be careful if you use this function to update objects of class power.

See Also

powCalc for generating new objects of class powCalc.

Examples

    ## defining the range of n and theta to be evaluated
    psi <- powPar(theta = seq(from = 0.5, to = 1.5, by = 0.1),
                  n = seq(from = 20, to = 60, by = 2),
                  muA = 0,
                  muB = 1)

    ## defining a power-function
    powFun <- function(psi) {
      power.t.test(n = n(psi) / 2,
                   delta = pp(psi, "muA") - pp(psi, "muB"),
                   sd = theta(psi)
                   )$power
    }

    ## evaluating the power-function for all combinations of n and theta
    calc <- powCalc(psi, powFun)

    ## updating by using additional elements for "n"
    calc2 <- update(calc, n = seq(from = 20, to = 90, by = 2))

    ## adding example at theta of 1 and power of 0.9
    pow <- powEx(calc2, theta = 1, power = 0.9)

    ## drawing the power plot with 3 contour lines
    plot(pow,
         xlab = "Standard Deviation",
         ylab = "Total Sample Size",
         at = c(0.85, 0.9, 0.95))
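The lm.range arithmetic described for powEx can be checked with a few lines of base R. This is a sketch under stated assumptions: lm_power_window is a hypothetical helper, not a function of sse, and it assumes the window is power * (1 ± lm.range) cut to the interval from 0 to 1, as the "0.72 to 1.08" example in the text suggests.

```r
## Hypothetical helper (not part of sse): the power window used for the
## "lm" method, assumed to be power * (1 -/+ lm.range), cut at 0 and 1.
lm_power_window <- function(power, lm.range = 0.2) {
  lower <- max(0, power * (1 - lm.range))
  upper <- min(1, power * (1 + lm.range))
  c(lower = lower, upper = upper)
}

## For power = 0.9 the uncut window is 0.72 to 1.08;
## after cutting at 1 the upper bound becomes 1.
lm_power_window(0.9)

## A narrower window for comparison.
lm_power_window(0.8, lm.range = 0.1)
```

Under this assumption, only evaluations whose estimated power falls inside the window would enter the linear model, which is one reason the manual recommends checking the fit with inspect.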

English for Mathematics

MATHS ENGLISHabsolute value 绝对值 acceptable region 接受域 additivity 可加性alternative hypothesis 对立假设 analysis of covariance 协方差分析analysis of variance 方差分析 arithmetic mean 算术平均值 association 相关性 assumption checking 假设检验 availability 有效度 band 带宽bar chart 条形图 beta-distribution 贝塔分布 between groups 组间的binomial distribution 二项分布binomial test 二项检验center of gravity 重心 central tendency 中心趋势 hi-square distribution 卡方分布 chi-square test 卡方检验 classify 分类 cluster analysis 聚类分析coefficient 系数 coefficient of correlation 相关系数 collinearity 共线性 components 构成，分量 compound 复合的 confidence interval 置信区间consistency 一致性continuous variable 连续变量control charts 控制图 correlation 相关 covariance 协方差 covariance matrix 协方差矩阵 critical point 临界点 critical value 临界值 cross tab 列联表 cubic term 三次项 cumulative distribution function 累加分布函数curve estimation 曲线估计 default 默认的 deleted residual 剔除残差density function 密度函数dependent variable 因变量design of experiment 试验设计 df.(degree of freedom) 自由度 diagnostic 诊断discrete variable 离散变量discriminant function 判别函数discriminatory analysis 判别分析 D-optimal design D-优化设计 effects of interaction 交互效应eigenvalue 特征值equal size 等含量estimation of parameters 参数估计 estimations 估计量 exact value 精确值 expected value 期望值 exponential指数的 exponential distribution 指数分布 extreme value 极值 factor analysis 因子分析 factor score 因子得分 factorial designs 析因设计 factorial experiment 析因试验fitted line 拟合线fitted value 拟合值fixed variable 固定变量fractional factorial design 部分析因设计 F-test F检验 full factorial design 完全析因设计 gamma distribution 伽玛分布 geometric mean 几何均值 harmonic mean 调和均值 heterogeneity 不齐性 histogram 直方图homogeneity 齐性homogeneity of variance 方差齐性 hypothesis test 假设检验independence独立independent variable 自变量independent-samples 独立样本index of correlation 相关指数interclass correlation 组内相关 interval estimate 区间估计inverse 倒数的iterate 迭代kurtosis 峰度large sample problem 大样本问题least-significant difference 最小显著差数 least-square estimation 最小二乘估计 least-square method 最小二乘法 level of significance 显著性水平 leverage value 中心化杠杆值 life test 寿命试验likelihood 
function 似然函数 likelihood ratio test 似然比检验 linear estimator 线性估计linear model 线性模型 linear regression 线性回归 linear relation 线性关系 linear term 线性项 logarithmic 对数的 logarithms 对数 lost function 损失函数 main effect 主效应matrix 矩阵 maximum 最大值maximum likelihood estimation 极大似然估计mean squared deviation(MSD) 均方差 mean sum of square 均方和 measure 衡量 media中位数M-estimator M估计minimum 最小值missing values 缺失值mixed model 混合模型mode 众数 Monte Carle method 蒙特卡罗法 moving average移动平均值 multicollinearity 多元共线性 multiple comparison 多重比较multiple correlation 多重相关multiple correlation coefficient 复相关系数 multiple correlation coefficient 多元相关系数multiple regression analysis 多元回归分析 multiple regression equation 多元回归方程 multiple response 多响应 multivariate analysis 多元分析negative nonadditively 不可加性 nonlinear 非线性 nonlinear regression 非线性回归 noparametric tests 非参数检验 normal distribution 正态分布null hypothesis 零假设number of cases 个案数one-sample 单样本one-tailed test 单侧检验one-way ANOVA 单向方差分析one-way classification 单向分类 optimal 优化的 optimum allocation 最优配制order statistics 次序统计量 origin 原点 orthogonal 正交的 outliers 异常值paired observations 成对观测数据paired-sample 成对样本parameter estimation 参数估计partial correlation 偏相关partial correlation coefficient 偏相关系数 partial regression coefficient 偏回归系 percentiles 百分位数 pie chart 饼图 point estimate 点估计poisson distribution 泊松分布 polynomial curve 多项式曲线 polynomial regression 多项式回归 polynomials 多项式 positive relationship 正相关 power 幂 P-P plot P-P概率图 predicted value 预测值prediction intervals 预测区间principal component analysis 主成分分析proability 概率 probability density function 概率密度函数 quadratic 二次的 Q-Q plot Q-Q概率图 quadratic term 二次项 quality control 质量控制 quantitative 数量的，度量quartiles 四分位数 random sampling 随机取样random seed 随机数种子random variable 随机变量randomization 随机化range 极差rank correlation 秩相关rank statistic 秩统计量regression analysis 回归分析regression coefficient 回归系数 regression line 回归线rejection region 拒绝域residual 残差 residual sum of squares 剩余平方和 risk function 风险函数 robustness 稳健性 root mean square 标准差 row 行 run test 游程检验sample size 样本容量 sample space 
样本空间 sampling 取样sampling inspection 抽样检验 scatter chart 散点图 S-curve S形曲线sets 集合sign test 符号检验significance level 显著性水平significance testing 显著性检验significant digits 有效数字skewed distribution 偏态分布 small sample problem 小样本问题 sort 排序sources of variation 方差来源 ion 标准离差 standard error of mean 均值的标准误差 statistical quality control 统计质量控制 std. residual 标准残差 stepwise regression analysis 逐步回归 strong assumption 强假设 stud. deleted residual 学生化剔除残差 stud. residual 学生化残差subsamples 次级样本 sufficient statistic 充分统计量 sum of squares 平方和t-distribution t分布test criterion 检验判据test for linearity 线性检验test of goodness of fit 拟合优度检验test of homogeneity 齐性检验 test of independence 独立性检验 test rules 检验法则test statistics 检验统计量 testing function 检验函数 timeseries 时间序列 tolerance limits 容许限 trimmed mean 截尾均值 true value 真值 t-test t检验 two-tailed test 双侧检验unbiased estimation 无偏估计 unbiasedness 无偏性 uniform distribution 均匀分布 value of estimator 估计值variance 方差 variance components 方差分量 variance ratio 方差比weighted average 加权平均值 within groups 组内的 Z score Z分数 active constraint 活动约束 active set method 活动集法 analytic gradient 解析梯度 approximate 近似 arbitrary 强制性的 argument 变量attainment factor 达到因子 bandwidth 带宽 be equivalent to 等价于best-fit 最佳拟合 coefficient 系数 complex-value 复数值 component 分量constrained 有约束的 constraint function 约束函数 converge 收敛cubic polynomial interpolation method 三次多项式插值法 curve-fitting 曲线拟合 data-fitting 数据拟diagonal 对角的 direct search method 直接搜索法direction of search 搜索方向eigenvalue 特征值empty matrix 空矩阵exceeded 溢出的feasible solution 可行解finite-difference 有限差分 first-order 一阶 Gauss-Newton method 高斯-牛顿法 goal attainment problem 目标达到问题 gradient method 梯度法 handle 句柄 Hessian matrix 海色矩阵 independent variables 独立变量inequality 不等式infeasibility 不可行性initial feasible solution 初始可行解 initialize 初始化 invoke 激活 iteration 迭代Jacobian 雅可比矩阵 Lagrange multiplier 拉格朗日乘子 large-scale 大型的least square 最小二乘least squares sense 最小二乘意义上的Levenberg-Marquardt method 列文伯格-马夸尔特法 line search 一维搜索linear equality constraints 线性等式约束 linear programming problem 线性规划问题local 
solution 局部解 medium-scale 中型的 mixed quadratic and cubic polynomial interpolation and extrapolation method 混合二次、三次多项式内插、外插法 multi objective 多目标的 norm 范数 observed data 测量数据optimization routine 优化过程optimizer 求解器over-determined system 超定系统 partial derivatives 偏导数polynomial interpolation method 多项式插值法quadrati二次的quadratic interpolation method 二次内插法quadratic programming 二次规划real-value 实数值 residuals 残差robust 稳健的robustness 稳健性，鲁棒性scalar 标量semi-infinitely problem 半无限问题Sequential Quadratic Programming method 序列二次规划法 simplex search method 单纯形法sparse matrix 稀疏矩阵 sparsity pattern 稀疏模式 sparsity structure 稀疏结构 starting point 初始点 step length 步长 subspace trust region method 子空间置信域法symmetric matrix 对称矩阵termination message 终止信息 termination tolerance 终止容限 the exit condition 退出条件 the method of steepest descent 最速下降法 transpose 转置unconstrained 无约束的under-determined system 负定系统weighting matrix 加权矩阵approximation 逼近a spline in b-form/b-spline b样条 a spline of polynomial piece /ppform spline 分段多项式样条bivariate spline function 二元样条函数break/breaks 断点coefficient/coefficients 系数cubic interpolation 三次插值/三次内插cubic polynomial 三次多项式cubic smoothing spline 三次平滑样条cubic spline 三次样条 cubic spline interpolation 三次样条插值/三次样条内插 curve 曲线 degree of freedom 自由度 end conditions 约束条件input argument 输入参数 interpolation 插值/内插 interval 取值区间knot/knots 节点least-squares approximation 最小二乘拟合 multiplicity 重次 multivariate function 多元函数 optional argument 可选参数 output argument 输出参数point/points 数据点rational spline 有理样条rounding error 舍入误差（相对误差）sequence 数列（数组spline approximation 样条逼近/样条拟合spline function 样条函数spline curve 样条曲线 spline interpolation 样条插值/样条内插 spline surface 样条曲面 smoothing spline 平滑样条 tolerance 允许精度univariate function 一元函数 absolute error 绝对误差 absolute tolerance 绝对容限adaptive mesh 适应性网格 boundary condition 边界条件 contour plot 等值线图coordinate 坐标系decomposed geometry matrix 分解几何矩阵diagonal matrix 对角矩阵Dirichlet boundary conditions 边界条件eigenvalue 特征值 elliptic 椭圆形的 error estimate 误差估计exact solution 精确解 generalized Neumann boundary condition 
推广的Neumann 边界条件geometry description matrix 几何描述矩阵 geometry matrix 几何矩阵 graphical user interface（GUI）图形用户界面 hyperbolic 双曲线的 initial mesh 初始网格 jiggle 微调Lagrange multipliers 拉格朗日乘子 Laplace equation 拉普拉斯方程 linear interpolation 线性插值machine precision 机器精度mixed boundary condition 混合边界条件Neuman boundary condition Neuman边界条件 node point 节点 nonlinear solver 非线性求解器normal vector 法向量Parabolic 抛物线型的partial differential equation 偏微分方程plane strain 平面应变 plane stress 平面应力 Poisson's equation 泊松方程 polygon 多边形positive definite 正定refined triangular mesh 加密的三角形网格relative tolerance 相对容限 relative tolerance 相对容限 residual norm 残差范数 singular 奇异的postulate假定, 基本条件, 基本原理,要求, 假定,要求conic, conical圆锥的；圆锥形的ellipse椭圆, 椭圆形ellipt hyperbolic 双曲线的parabolic用寓言表达的: 抛物线的，像抛物线的algebraic代数的, 关于代数学的mineralogy 矿物学axiom公理collinear在同一直线上的同线的convex 凸出的；凸面的triangle三角形, 三人一组, 三角关系parallelogram平行四边形straight angle平角right angle 直角acute angle锐角obtuse angle钝角reflex angle优角rectilinear直线的；由直线组成的；循直线进行的isosceles triangle等腰三角形equilateral triangle等边三角形right triangle n. 直角三角形obtuse triangle钝角三角形acute triangle锐角三角形equiangular triangle正三角形,等角三角形hypotenuse（直角三角形的）斜边infinitesimal 无穷小的, 极小的, 无限小的calculus 微积分学, 结石inscribe 记下polygon多角形, 多边形curvilinear曲线的, 由曲线组成的intuition 直觉, 直觉的知识integral积分, 完整, 部分defective有缺陷的, (智商或行为有)欠缺的differential coefficient 微分系数irrational numbers无理数domain 定义域contradiction 矛盾continuous variable 连续变量;［连续变数］variation 变分, 变化independent variable 自变量dependent variable 应变量rectangular coordinate 直角坐标abscissa〈数〉横坐标ordinate纵线, 纵座标differential 微分的,微分(differentiation)Integral 积分, 完整, 部分(integration) trigonometry 三角法exponential 指数的, 幂数的logarithm 对数derivative导数；微商tangent 切线正切definite integral 定积分culminate 达到顶点differential equation 微分方程extreme value 极值multiple integral 多重积分functional analysis 泛函分析cardinal number 基数（如：1, 2, 3, ... 
有别于序数）denumerable可数的aggregate 合计, 总计, 集合体，合计的, 集合的, 聚合的，聚集, 集合, 合计purport主旨，声称superior 长者, 高手, 上级，较高的, 上级的, 上好的, 出众的, 高傲的cumbersome 讨厌的, 麻烦的, 笨重的drastically 激烈地, 彻底地conservation 守衡律quadrature求积, 求积分interpolation插值extrapolation外推法, 推断internal point 内点generalized solution 广义解hydrodynamics 流体力学，水动力学divergence 发散（性），梯度，发散integro-interpolation method 积分插值法Variational method 变分方法comparatively 比较地, 相当地self-adjoint (nonself-adjoint) 自治的，自伴的，自共轭的finite element method 有限元法spline approximation 样条逼近Particles-in-the-Cell 网格质点法herald 使者, 传令官, 通报者, 先驱, 预兆，预报, 宣布, 传达, 欢呼advection水平对流fluctuation波动, 起伏mean-square 均方dispersion离差, 差量nterpolation 插值divisible 可分的dice, die 骰子pitfall 缺陷celestial天上的macroscopic肉眼可见的, 巨观的classical field theory 经典场理论rigit 刚硬的, 刚性的, 严格的quantum量, 额, [物] 量子, 量子论inception 起初, 获得学位pertain 适合, 属于encompass 包围, 环绕, 包含或包括某事物ingredient 成分, 因素acquainted有知识的, 知晓的 synonymous同义的configuration 构造, 结构, 配置, 外形inertia 惯性, 惯量attribute 特性momentum动量designate 指明projectile 射弹，发射的ballistics 弹道学, 发射学intractable 难处理的furnish 供应, 提供, 装备, 布置torque n. 扭矩, 转矩moment 力矩的dissipation 消散, 分散, 挥霍, 浪费, 消遣, 放荡, 狂饮constitutive构成的, 制定的continuum mechanics 连续介质力学superposition重叠, 重合, 叠合reckon 计算, 总计, 估计, 猜想，数, 计算, 估计, 依赖, 料想strength 强度load 载荷empirical 以经验为依据的insofar 在……范围cohesive 内聚性的stiffness 硬度furnish 供给turbulent 湍流laminar 层流isothermal 等温isotropic 各向同性eddy 旋涡viscosity 粘性、粘度adiabatic 绝热的reversible 可逆的 isentropic 等熵的stream tube 流管 tangential 切线的incompressible 不可压缩的similitude 相似性hydraulic 水力的,水力学的spillway (河或水坝的)放水道,泄洪道prototype 原型,样板vibratory 振动的,摆动的propagation 传播acoustic 听觉的,声学的damp 阻尼,衰减restore 复职,归还neutral 平衡 exciting force 激励力resonant共振的,谐振的stiffness 刚度,刚性magnitude 数值,大小substantially实质上的perturb 干扰,扰乱Fourier series 傅里叶级数shredder 切菜器metropolitan 大都市的at-grade 在同一水平面上elevated 高架的guide way 导轨rigid body 刚体medium 介质aging 老化polymeric聚合(物)的consolidate 把…联合为一体,统一radically 根本地,本质上deliberate 从容不迫的,深思熟虑Attribute赋予medieval 中世纪的etch 蚀刻,蚀镂fingernail 指甲bar chart 直方图joystick 游戏杆trial-error 试制, 试生产junction n. 
连接, 接合, 交叉点, 汇合处contrive v. 发明, 设计, 图谋snooker (=snooker pool)彩色台球, 桌球****公理 axiom 命题 proposition 被加数augend , summand 加数addend 被减数minuend 减数subtrahend 差remainder 被乘数multiplicand, faciend 乘数multiplicator 积 product 被除数 dividend 除数 divisor 商 quotient 大于等于 is equal or greater than 小于等于 is equal or lesser than 运算符operator 算术平均数geometric mean n个数之积的n次方根（reciprocal） x的倒数为1/x 有理数 rational number 无理数irrational number 整数 integer小数点 decimal point分数 fraction 分子 numerator 分母 denominator 比 ratio 十进制 decimal system 二进制binary system 十六进制 hexadecimal system 权 weight, significance 截尾 truncation 四舍五入 round 下舍入 round down 上舍入 round up 有效数字significant digit 无效数字 insignificant digit 代数 algebra 单项式monomial 多项式polynomial, multinomial 系数coefficient 未知数 unknown, x-factor, y-factor, z-factor 等式，方程式 equation 一次方程simple equation 二次方程quadratic equation 三次方程cubic equation 四次方程 quartic equation 阶乘 factorial 对数logarithm 指数，幂 exponent 乘方 power 二次方，平方 square 三次方，立方 cube 四次方 the power of four, the fourth power n次方 the power of n, the nth power 开方 evolution, extraction 二次方根，平方根 square root 三次方根，立方根 cube root 四次方根 the root of four, the fourth root n次方根 the root of n, the nth root 坐标系coordinates 坐标轴 x-axis, y-axis, z-axis 横坐标 x-coordinate 纵坐标y-coordinate 原点origin 象限quadrant 截距(有正负之分)intercede （方程的）解solution 线段 segment 射线 radial 平行parallel 相交intersect 角度degree 弧度radian 钝角obtuse angle 平角 straight angle 周角 perigon 底 base 锐角三角形 acute triangle 直角边 leg 斜边 hypotenuse 勾股定理 Pythagorean theorem 钝角三角形 obtuse triangle 不等边三角形 scalene triangle 等腰三角形isosceles triangle 等边三角形equilateral triangle 四边形quadrilateral 平行四边形parallelogram 周长perimeter 全等congruent 三角 trigonometry 正弦 sine 余弦 cosine 正切 tangent 余切 cotangent 正割 secant 余割 cosecant 反正弦 arc sine 反余弦 arc cosine 反正切 arc tangent 反余切 arc cotangent 反正割arc secant 反余割 arc cosecant 集合aggregate 空集 void 子集subset 交集intersection 并集union 补集complement 映射mapping 定义域 domain, field of definition 值域 range 单调性monotonicity 图象 image 数列，级数 series 导数 derivative 无穷小infinitesimal 复数complex 
number 矩阵matrix 行列式determinant 半圆 semicircle 扇形 sector 环 ring 椭圆 ellipse 圆周 circumference 轨迹 locus, loca(pl.) 平行六面体 parallelepiped 立方体 cube 七面体 heptahedron 八面体 octahedron 九面体 enneahedron 十面体 decahedron 十一面体 hendecahedron 十二面体 dodecahedron 二十面体 icosahedron 多面体 polyhedron 四面体 tetrahedron 五面体pentahedron 六面体hexahedron 菱形rhomb, rhombus, rhombi(pl.), diamond 正方形 square 梯形 trapezoid 直角梯形 right trapezoid 等腰梯形 isosceles trapezoid 五边形 pentagon 六边形 hexagon 七边形heptagon 八边形 octagon 九边形 enneagon 十边形 decagon 十一边形hendecagon 十二边形dodecagon 多边形polygon 正多边形equilateral polygon 相位 phase 振幅 amplitude 内心 incentre(BrE), incenter(AmE) 外心 excentre(BrE), excenter(AmE) 旁心 escentre(BrE), escenter(AmE) 垂心orthocentre(BrE), orthocenter(AmE) 重心barycentre(BrE), barycenter(AmE) 内切圆 inscribed circle 外切圆circumcircle 方差variance 标准差root-mean-square deviation, standard deviation 百分点 percentage 百分位数 percentile 排列permutation 分布 distribution 正态分布 normal distribution 非正态分布abnormal distribution 条形统计图bar graph 柱形统计图histogram 折线统计图 broken line graph 曲线统计图 curve diagram 扇形统计图pie diagram**** mutually disjoint events 互不相交事件mutually disjoint subsets 互不相交子集 mutually independent events 互相独立事件myria 万 myriad 无数的 multiplicity 重数 mid square method 平方取中法 midperpendicular 中垂线 minor 子式 minor arc 劣弧 mixed number 带分数 regular convergence 正则收敛 relative discriminant 相对判别式relative error 相对误差 relative extremum 局部极值 ricci equatoin 李奇恒等式ricci identity 李奇恒等式riemann function 黎曼函数riemann integral 黎曼积分 right direct product 右直积 right endpoint 右端点 right inner product 右内积 ring of integers 整数环 ring of matrices 矩阵环 root mean square error 均方根差 root of equation 方程式的根 rotation of axes 坐标轴的旋转 rotation of co ordinate system 坐标轴的旋转 round off error 舍入规则 round up error 舍入规则 runge kutta method 龙格库塔法 n disk n维圆盘 nth member 第n项 nth partial quotient 第n偏商 nth power operation n次幂运算 nth root n次根 nth term 第n项 n times continuously differentiable n次连续可微的natural injection 自然单射natural isomorphism 自然等necessary and sufficient conditions 必要充分的条件necessary and 
sufficient statistic 必要充分统计量 neutral element 零元素 neutral line 中线 nonhomogeneous linear boundary value problem 非齐次线性边值问题 nonhomogeneous linear differential equation 非齐次线性微分方程nonhomogeneous linear system of differential equations 非齐次线性微分方程组interval algebra 区间代数 interval analysis 区间分析 interval closed at the right 右闭区间 interval estimation 区域估计 intervalfunction 区间函数 interval graph 区间图 interval of convergence 收敛区间 interval of definition 定义区间 interval topology 区间拓扑irreducible set 不可约集 irreducible r module 不可约r模 periodical decimal fraction 循环十进小数 pentad 拼五小组 pentadecagon 十五边形pentagon 五角形 pentagonal number 五角数 pentagonal pyramid 五角锥pentagram 五角星 pentahedron 五面体 pentaspherical coordinates 五球坐标 penalty method 补偿法 pascal distribution 帕斯卡分布 partition function 分折函数 partial differential equation of elliptic type 椭圆型偏微分方程 partial differential equation of first order 一阶偏微分方程 partial differential equation of hyperbolic type 双曲型偏微分方程 partial differential equation of mixed type 混合型偏微分方程partial differential equation of parabolic type 抛物型偏微分方程partial differential operator 偏微分算子parametric test 参数检验particular solution 特解parallelogram axiom 平行四边形公理orthogonality relation 正交关系 ordinary differential equation 常微分方程optimal value function 最优值函数opposite angles 对角opposite category 对偶范畴 one to one mapping 一一映射 onto mapping 满射 open mapping theorem 开映射定理 one to many mapping 一对多映射one sided limit 单侧极限 numerical solution of linear equations 线性方程组的数值解法 null set 空集 null solution 零解 third boundary condition 第三边界条件two sided neighborhood 双侧邻域unbiased estimating equation 无偏估计方程unbounded function 无界函数unbounded quantifier 无界量词uncertainty principle 测不准原理uncorrelated random variables 不相关随机变量 undetermined coefficient 末定系数 velocity distribution 速度分布 velocity optimal 速度最优的weak approximation theorem 弱逼近定理 weak completeness 弱完备性weak continuity 弱连续性 weak convergence 弱收敛 wiener measure 维纳测度word group 自由群 sample correlation coefficient 样本相关系数sample covariance 样本协方差schwarz inequality 施瓦尔兹不等式second boundary condition 
诺伊曼边界条件second comparison test 第二比较检验second limit theorem 第二极限定理 self adjoint differential equation 自伴微分方程 semimajor axis 半长轴semiminor axis 半短轴 sentential calculus 命题演算 set of measure zero 零测度集 set topology 集论拓扑simple connectedness 单连通性slope function 斜率函数 solution curve 积分曲线 solution domain 解域solution set of equation 方程的解集spatial co ordinate 空间坐标specific address 绝对地址spherical bessel function 球贝塞耳函数 spherical cap 球冠 spherical coordinates 球极坐标 spherical curvature 球面曲率 spherical shell 球壳 spherical zone 球带spline function 样条函数spline interpolation 样条内插stability conditions 稳定条件 statistical hypothesis testing 统计假设检验strict inequality 严格不等式strict isotonicity 严格保序性strict isotony 严格保序性strict increasing 严格递增system of partial differential equations 偏微分方程组system of ordinary differential equations 常微分方程组system of linear homogeneousequations 线性齐次方程组 system of linear inhomogeneous equations 线性非齐次方程组system of inequalities 联立不等式system of polarcoordinates 极坐标系system of variational equations 变分方程组system with concentrated parameters 集中参数系统system with distributedparameters 分布参数系统 t1topological space t1拓扑空间 t2topologicalspace t2拓扑空间 t3topological space 分离空间 t4topological space 正则拓扑空间 t5 topological space 正规空间 t6topological space 遗传正规空间 tangent cone 切线锥面 telegraph equation 电报方程 theorem for damping 阻尼定理****充分条件sufficient condition必要条件necessary condition 充要条件sufficient and necessary condition……的充要条件是……… if and only if …****abscissa 横坐标 alternatingseries 交错级数 angle of the sector 扇形角 arbitrary constant 任意常数 augmented matrix 增广矩阵 axis of parabola 拋物线的轴 axis of revolution 旋转轴 axis of rotation 旋转轴 binomial series 二项级数binomial theorem 二项式定理 binomial distribution 二项分布 bisectionmethod 分半法；分半方法 bounded above 有上界的；上有界的 boundedbelow 有下界的；下有界的bounded function 有界函数boundedsequence 有界序列brace 大括号bracket 括号Cartesian coordinates 笛卡儿坐标 certain event 必然事件 circumcentre 外心；外接圆心 circumcircle 外接圆 classical theory of probability 古典概率论 cofactor 余因子; 余因式 common denominator 同分母；公分母 commondifference 公差 common divisor 
公约数；公约 common logarithm 常用对数 common multiple 公位数；公倍 common ratio 公比 commutative law 交换律 compasses 圆规 Cauchy-Schwarz inequality 柯西 - 许瓦尔兹不等式central limit theorem 中心极限定理 centripedal acceleration 向心加速度concave downward 凹向下的concurrent 共点concyclic 共圆concyclic points 共圆点Euclidean geometry 欧几里德几何Euler'sformula 尤拉公式；欧拉公式 even function 偶函数 even number 偶数（2）博奕 Gaussian distribution 高斯分布 greatest term 最game （1）对策；大项 greatest value 最大值 harmonic mean (1) 调和平均数; (2) 调和中项 harmonic progression 调和级数 higher order derivative 高阶导数improper fraction 假分数improper integral 广义积分; 非正常积分implicit function 隐函数 incircle 内切圆 inclined plane 斜 included angle 夹角 indefinite integral 不定积分 initial condition 原始条件；初值条件 initial-value problem 初值问题 interior angles on the same side of the transversal 同旁内角interior opposite angle 内对角isosceles triangle 等腰三角形 iterate (1)迭代值; (2)迭代 Lagrange interpolating polynomial 拉格朗日插值多项代Laplace expansion 拉普拉斯展式 lemniscate 双纽线 left hand limit 左方极限 limiting case 极限情况limiting position 极限位置line of best-fit 最佳拟合line segment 线段 logarithmic equation 对数方程 mathematical analysis 数学分析mathematical induction 数学归纳法monotonic decreasing function 单调递减函数 monotonic convergence 单调收敛性 monotonic increasing function 单调递增函数multiple-angle formula 倍角公式multiple root 多重根 mutually disjoint 互不相交 mutually exclusive events 互斥事件mutually independent 独立; 互相独立mutually perpendicular lines 互相垂直 numerical method 计算方法；数值法oblique cone 斜圆锥 orthogonal circles 正交圆 orthogonality 正交性oscillatory convergence 振动收敛性 ordinary differential equation 常微分方程pairwise mutually exclusive events 两两互斥事件place holder 补位数字 point of inflection (inflexion) 拐点; 转折点Pisson distribution 泊松分布point-slope form 点斜式polar coordinate plane 极坐标平面polynomial equation 多项式方程posterior probability 后验概率; 事后概率premultiply 前乘; 自左乘prime factor 质因子；质因素 prime number 素数；质数 principal angle 主角principal axis 主轴 principal value 主值 prior probability 先验概率; 事先概率 probability density function 概率密度函数 product and sum formula 和积互变公式 product sample space 积样本空间 product to sum 
formula 积化和差公式 proof by contradiction 反证法; 归谬法 proper fraction 真分数proper integral 正常积分proper subset 真子集propositional calculus 命题演算propositional inference 命题推演protractor 量角器Pythagoras' theorem 勾股定理Pythagorean triplet 毕氏三元数组 quadratic convergence 二阶收敛性 quadrature 求积法 quotient set 商集 radial component 沿径分量 radical axis 根轴range 值域；区域；范围；极差；分布域 rationalization 有理化 raw data 原始数据 rectifiable 可求长的 reciprocal 倒数 rectangular coordinate plane 直角坐标平面 recurrence formula 递推公式 reducibility 可约性; 可化简性 reflexive relation 自反关系 reference angle 参考 reference line 基准线 reflex angle 优角；反角 region of acceptance 接受区域region of convergency 收敛区域 region of rejection 否定区域 right circular cone 直立圆锥（体） resolution of vector 向量分解; 矢量分解right hand limit 右方极限 right prism 直立棱柱；直立角柱(体) right pyramid 直立棱锥；直立角锥(体) right-angled triangle 直角二角形scalene triangle 不等边三角形；不规则三角形 scatter diagram 散点图scientific notation 科学记数法semi-conjugate axis 半共轭轴semi-transverse axis 半贯轴semi-vertical angle 半顶角separable differential equation 可分微分方程septic equation 七次方程set square 三角尺；三角板 shaded portion 有阴影部分 significance level 显著性水平 significant figure 有效数字 similar triangles 相似三角形simple iteration method 简单迭代法simple pendulum 单摆Simpson's integral 森逊积分 standard deviation 标准差；标准偏离 standard normal distribution 标准正态分布; 标准常态分布 stationary point 平稳点; 逗留点; 驻点 strictly monotonic 严格单调 statistical chart 统计分析submultiple angle formula 半角公式subsidiary angle 辅助角substitution 代入; 代入法successive approximation 逐次逼近法successive derivative 逐次导数 successive differentiation 逐次微分法suffix 下标 sum to infinity 无限项之和 sum to product formula 和化积公式 superimposing 迭合 supplementary angle 补角 surjection 满射symmetric relation 对称关系 tautology 恒真命题；恒真式 Taylor’s expansion 泰勒展开式 Taylor’s series 泰勒级数 Taylor’s theorem 泰勒定理 test criterion 检验标准 test of significance 显著性检验 to the nearest 至最接近之torque 转矩torus 环面transcendental function 超越函数 transformation of variable 变数转换 transitive 可传递的 transpose of matrix 倒置矩阵；转置矩阵 transversal 截；横截的 triangle law of addition 三角形加法 travel graph 行程图 tree diagram 
树形图trapezoidal integral 梯形积分truncated Taylor’s series 截断泰勒级数 two-tailed test 双尾检验；只端检验 type I error I 型误差type II error II型误差unbiased estimator 无偏估计量undetermined coefficient 待定系数 unique solution 唯一解 vertical asymptote 垂直渐近线 vertically opposite angles 对顶角 without loss of generality 不失一般性****分子Numerator 分母Denominator 阿拉伯数字Hindu-Arabic numeral假分数Improper fraction 最大公因子Highest Common Factor (H.C.F.) 最小公倍数 Lowest Common Multiple (L.C.M.) 行列式 determinant****interval closed at the right 右闭区间 interval of convergence 收敛区间interval of definition 定义区间invariance theorem 不变性定理 invariant of an equation 方程的不变量 inverse circular function 反三角函数 inverse hyperbolic function 反双曲函数inversion formula 反演公式 isotonic injective mapping 保序单射映射jacobi identity 雅可比恒等式 jump point 跳跃点 law of double negation 双重否定律 law of inertia 惯性律law of large numbers 大数定律 leading ideal 猪想 liouville theorem 刘维尔定理 lipschitz condition 李普希茨条件 markov transform 马尔可夫变换 mathematical approximation 数学近似法 mathematical model 数学模型 maximum condition 极大条件 maximum deviation 最大偏差 mean square deviation 方差 mean square of error 误差的均方 meromorphic function 亚纯函数柱形统计图 histogram 折线统计图 broken line graph 曲线统计图 curve diagram 扇形统计图 pie diagram排列 permutatio内切圆 inscribed circle 外切圆 circumcircle正多边形 equilateral polygon metric space 度量空间 metric subspace 度量子空间method of runge kutta type 朗格库塔型的方法 method of steepest ascent 最速上升法 method of steepest descent 最速下降法method of finite elements 有限元法method of fractional steps 分步法method of exhaustion 穷竭法 method of approximation 近似法 method of artificial variables 人工变量法method of balayage 扫除法method of characteristic curves 特者法 method of comparison 比较法 method of conjugate gradients 共轭梯度法lateral area 侧面积 last multiplier 最后乘子large sample test 大样本检验lattice constant 点阵常数lattice design 格子设计method of difference 差分法method of elimination 消元法method of estimation 估计法meromorphic differential 亚纯微分 median 中位数 measuring rule 量尺 mean term 内项mean term 内项mean term 内项irreducibility criterion 不可约性判别准则irreducible polynomial 不可约多项式 
irreducible generating set 不可约生成集 irregular divisor class 非正则因子类 irregular point 非正则点irregular singular point 非正则奇点isometric circle 等距圆isometric embedding 等距嵌入isomorphic field 同构域isomorphic graph 同构图isomorphic group 同构群isomorphic image 同构象isothermal parameter 等温参数iterated function 叠函数iterated integral 累积分joint distribution 联合分布 jordan algebra 约当代数kernel of an integral equation 积分方程的核l'hospital rule 洛必达规则laboratory system of coordinates 实验室坐标系 labyrinth 迷宫lacation principle 介值定理lag correlation coefficient 滞后相关系数lag regression 落后回归laguerre differential equation 拉盖尔微分方程lame equation 拉梅方程language of formula 公式语言 laplace beltrami operator 拉普拉斯贝尔特拉米算子lateral area 侧面积 last multiplier 最后乘子large sample test 大样本检验lattice constant 点阵常数lattice design 格子设计 left adjoint 左伴随的 left derivative 左导数left differential 左微分 left direct product 左直积 left end point 左端点 left length 左长 left limit value 左极限值left multiplication ring 左乘环length of curve 曲线的长 length of normal 法线的长 levi decomposition 列维分解 limes inferior 下极限 limes superior 上极限limit circle 极限圆 limit circle type 极限圆型logarithm to the base 10 常用对数logarithmic normal distribution 对数正态分布logic of relations 关系逻辑magic circle 幻圆magic cube 幻立方manifold without boundary 无边廖many valued mapping 多值映射marginal distribution density function 边缘分布密度函数 marginal distribution function 边缘分布函数mathematical programming 数学规划 mathematical random sample 数学随机样本 mathematical statistics 数理统计 maximum likelihood estimating function 极大似然估计量 independent variable 自变量 dependent variable 应变量equiangular triangle 正三角形,等角三角形命题proposition 差remainder 积product 除数divisor 商quotient 截尾truncation 未知数unknown, x-factor, y-factor, z-factor 阶乘 factorial 集合 aggregate 空集 void 子集 subset 交集 intersection 并集 union 补集 complement 映射 mapping 勾股定理 Pythagorean theorem 菱形 rhomb, rhombus, rhombi(pl.), diamond 双曲线 hyperbola 抛物线 parabola topology of bounded convergence 有界收敛拓扑toroid 超环面 toroidal coordinates 圆环坐标trace of dyadic 并向量的迹transcendental integral function 超越整函数transformation formulas of the 
coordinates 坐标的变换公式 transformation to principal axes 轴变换 transversal lines 截线trapezoid method 梯形公式trefoil knot 三叶形纽结truth function 真值函项two sided test 双侧检定 two sided neighborhood 双侧邻域two sided surface 双侧曲面two termed expression 二项式ultrahyperbolic equation 超双曲型方程。

Specialized English Vocabulary for Statistics

概率论与数理统计词汇英汉对照表Aabsolute value 绝对值accept 接受acceptable region 接受域additivity 可加性adjusted 调整的alternative hypothesis 对立假设analysis 分析analysis of covariance 协方差分析analysis of variance 方差分析arithmetic mean 算术平均值association 相关性assumption 假设assumption checking 假设检验availability 有效度average 均值Bbalanced 平衡的band 带宽bar chart 条形图beta-distribution 贝塔分布between groups 组间的bias 偏倚binomial distribution 二项分布binomial test 二项检验Ccalculate 计算case 个案category 类别center of gravity 重心central tendency 中心趋势chi-square distribution 卡方分布chi-square test 卡方检验classify 分类cluster analysis 聚类分析coefficient 系数coefficient of correlation 相关系数collinearity 共线性column 列compare 比较comparison 对照components 构成，分量compound 复合的confidence interval 置信区间consistency 一致性constant 常数continuous variable 连续变量control charts 控制图correlation 相关covariance 协方差covariance matrix 协方差矩阵critical point 临界点critical value 临界值crosstab 列联表cubic 三次的，立方的cubic term 三次项cumulative distribution function 累加分布函数curve estimation 曲线估计Ddata 数据default 默认的definition 定义deleted residual 剔除残差density function 密度函数dependent variable 因变量description 描述design of experiment 试验设计deviations 差异df.(degree of freedom) 自由度diagnostic 诊断dimension 维discrete variable 离散变量discriminant function 判别函数discriminatory analysis 判别分析distance 距离distribution 分布D-optimal design D-优化设计Eeaqual 相等effects of interaction 交互效应efficiency 有效性eigenvalue 特征值equal size 等含量equation 方程error 误差estimate 估计estimation of parameters 参数估计estimations 估计量evaluate 衡量exact value 精确值expectation 期望expected value 期望值exponential 指数的exponential distributon 指数分布extreme value 极值Ffactor 因素，因子factor analysis 因子分析factor score 因子得分factorial designs 析因设计factorial experiment 析因试验fit 拟合fitted line 拟合线fitted value 拟合值fixed model 固定模型fixed variable 固定变量fractional factorial design 部分析因设计frequency 频数F-test F检验full factorial design 完全析因设计function 函数Ggamma distribution 伽玛分布geometric mean 几何均值group 组Hharmomic mean 调和均值heterogeneity 不齐性histogram 直方图homogeneity 齐性homogeneity of variance 方差齐性hypothesis 假设hypothesis test 假设检验Iindependence 
独立independent variable 自变量independent-samples 独立样本index 指数index of correlation 相关指数interaction 交互作用interclass correlation 组内相关interval estimate 区间估计intraclass correlation 组间相关inverse 倒数的iterate 迭代Kkernal 核Kolmogorov-Smirnov test柯尔莫哥洛夫-斯米诺夫检验kurtosis 峰度Llarge sample problem 大样本问题layer 层least-significant difference 最小显著差数least-square estimation 最小二乘估计least-square method 最小二乘法level 水平level of significance 显著性水平leverage value 中心化杠杆值life 寿命life test 寿命试验likelihood function 似然函数likelihood ratio test 似然比检验linear 线性的linear estimator 线性估计linear model 线性模型linear regression 线性回归linear relation 线性关系linear term 线性项logarithmic 对数的logarithms 对数logistic 逻辑的lost function 损失函数Mmain effect 主效应matrix 矩阵maximum 最大值maximum likelihood estimation 极大似然估计mean squared deviation(MSD) 均方差mean sum of square 均方和measure 衡量media 中位数M-estimator M估计minimum 最小值missing values 缺失值mixed model 混合模型mode 众数model 模型Monte Carle method 蒙特卡罗法moving average 移动平均值multicollinearity 多元共线性multiple comparison 多重比较multiple correlation 多重相关multiple correlation coefficient 复相关系数multiple correlation coefficient 多元相关系数multiple regression analysis 多元回归分析multiple regression equation 多元回归方程multiple response 多响应multivariate analysis 多元分析Nnegative relationship 负相关nonadditively 不可加性nonlinear 非线性nonlinear regression 非线性回归noparametric tests 非参数检验normal distribution 正态分布null hypothesis 零假设number of cases 个案数Oone-sample 单样本one-tailed test 单侧检验one-way ANOVA 单向方差分析one-way classification 单向分类optimal 优化的optimum allocation 最优配制order 排序order statistics 次序统计量origin 原点orthogonal 正交的outliers 异常值Ppaired observations 成对观测数据paired-sample 成对样本parameter 参数parameter estimation 参数估计partial correlation 偏相关partial correlation coefficient 偏相关系数partial regression coefficient 偏回归系数percent 百分数percentiles 百分位数pie chart 饼图point estimate 点估计poisson distribution 泊松分布polynomial curve 多项式曲线polynomial regression 多项式回归polynomials 多项式positive relationship 正相关power 幂P-P plot P-P概率图predict 预测predicted value 预测值prediction intervals 预测区间principal component analysis 
主成分分析proability 概率probability density function 概率密度函数probit analysis 概率分析proportion 比例Qqadratic 二次的Q-Q plot Q-Q概率图quadratic term 二次项quality control 质量控制quantitative 数量的，度量的quartiles 四分位数Rrandom 随机的random number 随机数random number 随机数random sampling 随机取样random seed 随机数种子random variable 随机变量randomization 随机化range 极差rank 秩rank correlation 秩相关rank statistic 秩统计量regression analysis 回归分析regression coefficient 回归系数regression line 回归线reject 拒绝rejection region 拒绝域relationship 关系reliability 可靠性repeated 重复的report 报告，报表residual 残差residual sum of squares 剩余平方和response 响应risk function 风险函数robustness 稳健性root mean square 标准差row 行run 游程run test 游程检验Ssample 样本sample size 样本容量sample space 样本空间sampling 取样sampling inspection 抽样检验scatter chart 散点图S-curve S形曲线separately 单独地sets 集合sign test 符号检验significance 显著性significance level 显著性水平significance testing 显著性检验significant 显著的，有效的significant digits 有效数字skewed distribution 偏态分布skewness 偏度small sample problem 小样本问题smooth 平滑sort 排序soruces of variation 方差来源space 空间spread 扩展square 平方standard deviation 标准离差standard error of mean 均值的标准误差standardization 标准化standardize 标准化statistic 统计量statistical quality control 统计质量控制std. residual 标准残差stepwise regression analysis 逐步回归stimulus 刺激strong assumption 强假设stud. deleted residual 学生化剔除残差stud. 
residual 学生化残差subsamples 次级样本sufficient statistic 充分统计量sum 和sum of squares 平方和summary 概括，综述Ttable 表t-distribution t分布test 检验test criterion 检验判据test for linearity 线性检验test of goodness of fit 拟合优度检验test of homogeneity 齐性检验test of independence 独立性检验test rules 检验法则test statistics 检验统计量testing function 检验函数time series 时间序列tolerance limits 容许限total 总共，和transformation 转换treatment 处理trimmed mean 截尾均值true value 真值t-test t检验two-tailed test 双侧检验Uunbalanced 不平衡的unbiased estimation 无偏估计unbiasedness 无偏性uniform distribution 均匀分布Vvalue of estimator 估计值variable 变量variance 方差variance components 方差分量variance ratio 方差比various 不同的vector 向量Wweight 加权，权重weighted average 加权平均值within groups 组内的ZZ score Z分数最优化方法词汇英汉对照表Aactive constraint 活动约束active set method 活动集法analytic gradient 解析梯度approximate 近似arbitrary 强制性的argument 变量attainment factor 达到因子Bbandwidth 带宽be equivalent to 等价于best-fit 最佳拟合bound 边界Ccoefficient 系数complex-value 复数值component 分量constant 常数constrained 有约束的constraint 约束constraint function 约束函数continuous 连续的converge 收敛cubic polynomial interpolation method 三次多项式插值法curve-fitting 曲线拟合Ddata-fitting 数据拟合default 默认的，默认的define 定义diagonal 对角的direct search method 直接搜索法direction of search 搜索方向discontinuous 不连续Eeigenvalue 特征值empty matrix 空矩阵equality 等式exceeded 溢出的Ffeasible 可行的feasible solution 可行解finite-difference 有限差分first-order 一阶GGauss-Newton method 高斯-牛顿法goal attainment problem 目标达到问题gradient 梯度gradient method 梯度法Hhandle 句柄Hessian matrix 海色矩阵Iindependent variables 独立变量inequality 不等式infeasibility 不可行性infeasible 不可行的initial feasible solution 初始可行解initialize 初始化inverse 逆invoke 激活iteration 迭代iteration 迭代JJacobian 雅可比矩阵LLagrange multiplier 拉格朗日乘子large-scale 大型的least square 最小二乘least squares sense 最小二乘意义上的Levenberg-Marquardt method列文伯格-马夸尔特法line search 一维搜索linear 线性的linear equality constraints 线性等式约束linear programming problem 线性规划问题local solution 局部解Mmedium-scale 中型的minimize 最小化mixed quadratic and cubic polynomial interpolation and extrapolation method 混合二次、三次多项式内插、外插法multiobjective 
多目标的Nnonlinear 非线性的norm 范数Oobjective function 目标函数observed data 测量数据optimization routine 优化过程optimize 优化optimizer 求解器over-determined system 超定系统Pparameter 参数partial derivatives 偏导数polynomial interpolation method多项式插值法Qquadratic 二次的quadratic interpolation method 二次内插法quadratic programming 二次规划Rreal-value 实数值residuals 残差robust 稳健的robustness 稳健性，鲁棒性Sscalar 标量semi-infinitely problem 半无限问题Sequential Quadratic Programming method序列二次规划法simplex search method 单纯形法solution 解sparse matrix 稀疏矩阵sparsity pattern 稀疏模式sparsity structure 稀疏结构starting point 初始点step length 步长subspace trust region method 子空间置信域法sum-of-squares 平方和symmetric matrix 对称矩阵Ttermination message 终止信息termination tolerance 终止容限the exit condition 退出条件the method of steepest descent 最速下降法transpose 转置Uunconstrained 无约束的under-determined system 负定系统Vvariable 变量vector 矢量Wweighting matrix 加权矩阵样条词汇英汉对照表Aapproximation 逼近array 数组a spline in b-form/b-spline b样条a spline of polynomial piece /ppform spline分段多项式样条Bbivariate spline function 二元样条函数break/breaks 断点coefficient/coefficients 系数cubic interpolation 三次插值/三次内插cubic polynomial 三次多项式cubic smoothing spline 三次平滑样条cubic spline 三次样条cubic spline interpolation三次样条插值/三次样条内插curve 曲线Ddegree of freedom 自由度dimension 维数Eend conditions 约束条件Iinput argument 输入参数interpolation 插值/内插interval 取值区间Kknot/knots 节点Lleast-squares approximation 最小二乘拟合Mmultiplicity 重次multivariate function 多元函数Ooptional argument 可选参数order 阶次output argument 输出参数Ppoint/points 数据点Rrational spline 有理样条rounding error 舍入误差（相对误差）Sscalar 标量sequence 数列（数组）spline 样条spline approximation 样条逼近/样条拟合spline function 样条函数spline curve 样条曲线spline interpolation 样条插值/样条内插spline surface 样条曲面smoothing spline 平滑样条Ttolerance 允许精度Uunivariate function 一元函数Vvector 向量Wweight/weights 权重4 偏微分方程数值解词汇英汉对照表Aabsolute error 绝对误差absolute tolerance 绝对容限adaptive mesh 适应性网格Bboundary condition 边界条件Ccontour plot 等值线图converge 收敛coordinate 坐标系Ddecomposed 分解的decomposed geometry matrix 分解几何矩阵diagonal matrix 对角矩阵Dirichlet boundary 
conditionsDirichlet边界条件Eeigenvalue 特征值elliptic 椭圆形的error estimate 误差估计exact solution 精确解Ggeneralized Neumann boundary condition推广的Neumann边界条件geometry 几何形状geometry description matrix 几何描述矩阵geometry matrix 几何矩阵graphical user interface（GUI）图形用户界面Hhyperbolic 双曲线的Iinitial mesh 初始网格Jjiggle 微调LLagrange multipliers 拉格朗日乘子Laplace equation 拉普拉斯方程linear interpolation 线性插值loop 循环Mmachine precision 机器精度mixed boundary condition 混合边界条件NNeuman boundary condition Neuman边界条件node point 节点nonlinear solver 非线性求解器normal vector 法向量PParabolic 抛物线型的partial differential equation 偏微分方程plane strain 平面应变plane stress 平面应力Poisson's equation 泊松方程polygon 多边形positive definite 正定Qquality 质量Rrefined triangular mesh 加密的三角形网格relative tolerance 相对容限relative tolerance 相对容限residual 残差residual norm 残差范数Ssingular 奇异的。

Clustering

• For a given cluster assignment C, compute the cluster means:
\[ m_k = \frac{\sum_{i:\,C(i)=k} x_i}{N_k}, \quad k = 1, \dots, K. \]
• For a current set of cluster means, assign each observation as:
\[ C(i) = \arg\min_{1 \le k \le K} \lVert x_i - m_k \rVert^2, \quad i = 1, \dots, N \]
• Iterate above two steps until convergence
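The two alternating steps can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names (`kmeans`, `sq_dist`, `mean`), the toy data, and the choice to initialise the means with the first k observations are all assumptions made for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def kmeans(xs, k, max_iter=100):
    # Assumption for this sketch: start from the first k observations.
    means = list(xs[:k])
    labels = None
    for _ in range(max_iter):
        # Assignment step: C(i) = argmin_k ||x_i - m_k||^2
        new_labels = [min(range(k), key=lambda j: sq_dist(x, means[j])) for x in xs]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: m_k = mean of the observations currently assigned to k
        for j in range(k):
            members = [x for x, c in zip(xs, labels) if c == j]
            if members:  # keep the old mean if a cluster becomes empty
                means[j] = mean(members)
    return labels, means


# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, means = kmeans(points, k=2)
print(labels, means)
```

In practice the result depends on the starting means, which is why implementations usually restart from several random initialisations and keep the best run.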
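Within-cluster and between-cluster scatter:
\[ W(C) = \frac{1}{2} \sum_{k=1}^{K} \sum_{C(i)=k} \sum_{C(j)=k} d(x_i, x_j) \]
\[ B(C) = \frac{1}{2} \sum_{k=1}^{K} \sum_{C(i)=k} \sum_{C(j)\neq k} d(x_i, x_j) \]
With squared Euclidean distance, d(x_i, x_j) = \lVert x_i - x_j \rVert^2, the within-cluster scatter becomes
\[ W(C) = \sum_{k=1}^{K} N_k \sum_{C(i)=k} \lVert x_i - m_k \rVert^2 \]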
K-means: Setup
• x1, …, xN are data points or vectors of observations
• Each observation (vector xi) will be assigned to one and only one cluster
• C(i) denotes the cluster number for the ith observation
Choice of K?
• Can W_K(C), the within-cluster distance as a function of K, serve as an indicator?
• Note that W_K(C) decreases monotonically with increasing K; that is, the within-cluster scatter decreases as the number of centroids increases.
• Instead, look for gap statistics (successive differences between W_K(C)):
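The idea of looking at successive differences can be illustrated on a tiny one-dimensional example. The sketch below is only an illustration under stated assumptions: the data are made up, W_K is taken as the N_k-weighted within-cluster scatter, and the search is restricted to contiguous groups of the sorted points so that it can be done by brute force. The function names are hypothetical.

```python
from itertools import combinations


def within_scatter(cluster):
    """One cluster's contribution to W: N_k * sum (x_i - m_k)^2 for 1-D data."""
    n = len(cluster)
    m = sum(cluster) / n
    return n * sum((x - m) ** 2 for x in cluster)


def best_w(points, k):
    """Smallest W_K over all splits of the sorted points into k contiguous groups."""
    xs = sorted(points)
    n = len(xs)
    best = float("inf")
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        w = sum(within_scatter(xs[a:b]) for a, b in zip(bounds, bounds[1:]))
        best = min(best, w)
    return best


def successive_differences(ws):
    """Drop in W when going from K to K + 1 clusters."""
    return [ws[i] - ws[i + 1] for i in range(len(ws) - 1)]


# Made-up data with three visible groups near 1, 5 and 9.
data = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2, 8.8, 9.0, 9.2]
w_by_k = [best_w(data, k) for k in range(1, 5)]  # W_1 .. W_4
drops = successive_differences(w_by_k)
print(w_by_k)
print(drops)
```

W_K keeps decreasing as K grows, but the drop from K = 3 to K = 4 is far smaller than the drop from K = 2 to K = 3, which points to K = 3 for this toy data.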

ClussCluster Package User Manual

Package 'ClussCluster'
October 12, 2022

Type: Package
Title: Simultaneous Detection of Clusters and Cluster-Specific Genes in High-Throughput Transcriptome Data
Version: 0.1.0
Description: Implements a new method 'ClussCluster' described in Ge Jiang and Jun Li, "Simultaneous Detection of Clusters and Cluster-Specific Genes in High-throughput Transcriptome Data" (Unpublished). Simultaneously performs clustering analysis and signature gene selection on high-dimensional transcriptome data sets. To do so, 'ClussCluster' adds a Lasso-type regularization penalty term to the objective function of K-means so that cell-type-specific signature genes can be identified while clustering the cells.
Depends: R (>= 2.10.0)
Suggests: knitr, rmarkdown (>= 1.13)
VignetteBuilder: knitr
Imports: stats (>= 3.5.0), utils (>= 3.5.0), VennDiagram, scales (>= 1.0.0), reshape2 (>= 1.4.3), ggplot2 (>= 3.1.0), rlang (>= 0.3.4)
License: GPL-3
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.1.1
NeedsCompilation: no
Author: Li Jun [cre], Jiang Ge [aut], Wang Chuanqi [ctb]
Maintainer: Li Jun <*************>
Repository: CRAN
Date/Publication: 2019-07-02 16:30:16 UTC

R topics documented:
    ClussCluster
    filter_gene
    Hou_sim
    plot_ClussCluster
    plot_ClussCluster_Gap
    print_ClussCluster
    print_ClussCluster_Gap
    sim_dat


ClussCluster    Performs simultaneous detection of cell types and cell-type-specific signature genes

Description
    ClussCluster takes single-cell transcriptome data and returns an object containing cell types and type-specific signature gene sets.
    ClussCluster_Gap selects the tuning parameter using a permutation approach. The tuning parameter controls the L1 bound on w, the feature weights.

Usage
    ClussCluster(x, nclust = NULL, centers = NULL, ws = NULL, nepoch.max = 10, theta = NULL, seed = 1, nstart = 20, iter.max = 50, verbose = FALSE)
    ClussCluster_Gap(x, nclust = NULL, B = 20, centers = NULL, ws = NULL, nepoch.max = 10, theta = NULL, seed = 1, nstart = 20, iter.max = 50, verbose = FALSE)

Arguments
    x            An n x p data matrix. There are n cells and p genes.
    nclust       Number of clusters desired if the cluster centers are not provided. If both are provided, nclust must equal the number of cluster centers.
    centers      A set of initial (distinct) cluster centres if the number of clusters (nclust) is NULL. If both are provided, the number of cluster centres must equal nclust.
    ws           One or multiple candidate tuning parameters to be evaluated and compared. Determines the sparsity of the selected genes. Should be greater than 1.
    nepoch.max   The maximum number of epochs. In one epoch, each cell is evaluated to determine whether its label needs to be updated.
    theta        Optional argument. If provided, theta is used as the initial cluster labels of the ClussCluster algorithm; if not, K-means is performed to produce starting cluster labels.
    seed         This seed is used wherever K-means is used.
    nstart       Argument passed to kmeans. It is the number of random sets used in kmeans.
    iter.max     Argument passed to kmeans. The maximum number of iterations allowed.
    verbose      Print the updates inside every epoch? If TRUE, the updates of the cluster labels and the value of the objective function are printed.
    B            Number of permutation samples.

Details
    Takes the normalized and log-transformed number of reads mapped to genes (e.g., log(RPKM+1) or log(TPM+1), where RPKM stands for Reads Per Kilobase of transcript per Million mapped reads and TPM stands for Transcripts Per Million) but NOT centered.

Value
    ClussCluster returns a list containing the optimal tuning parameter, s, the group labels of the clustering, theta, and the type-specific weights of genes, w.
    ClussCluster_Gap returns a list containing a vector of candidate tuning parameters, ws; the corresponding values of the objective function, O; a matrix of values of the objective function for each permuted data set and tuning parameter, O_b; the gap statistics and their one standard deviations, Gap and sd.Gap; the result given by ClussCluster, run; and the tuning parameters with the largest Gap statistic and within one standard deviation of the largest Gap statistic, bestw and onesd.bestw.

Examples
    data(Hou_sim)
    hou.dat <- Hou_sim$x
    run.ft <- filter_gene(hou.dat)
    hou.test <- ClussCluster(run.ft$dat.ft, nclust = 3, ws = 4, verbose = FALSE)


filter_gene    Gene Filter

Description
    Filters out genes that are not suitable for differential expression analysis.

Usage
    filter_gene(dfname, minmean = 2, n0prop = 0.2, minsd = 1)

Arguments
    dfname    name of the expression data frame
    minmean   minimum mean expression for each gene
    n0prop    minimum proportion of zero expression (count) for each gene
    minsd     minimum standard deviation of expression for each gene

Details
    Takes an expression data frame that has been properly normalized but NOT centered. It returns a list with the slot dat.ft being the data set that satisfies the pre-set thresholds on minimum mean, standard deviation (sd), and proportion of zeros (n0prop) for each gene.
    If the data has already been centered, one can still apply the filters on mean and sd but not on n0prop.

Value
    a list containing the data set with genes satisfying the thresholds, dat.ft, the name of dat.ft, and the indices of the kept genes, index.

Examples
    dat <- matrix(rnbinom(300 * 60, mu = 2, size = 1), 300, 60)
    dat_filtered <- filter_gene(dat, minmean = 2, n0prop = 0.2, minsd = 1)


Hou_sim    A truncated subset of the scRNA-seq expression data set from Hou et al. (2016)

Description
    This data contains expression levels (normalized and log-transformed) for 33 cells and 100 genes.

Usage
    data(Hou_sim)

Format
    An object containing the following variables:
    x        An expression data frame of 33 HCC cells on 100 genes.
    y        Numerical group indicator of all cells.
    gnames   Gene names of all genes.
    snames   Cell names of all cells.
    groups   Cell group names.
    note     A simple note of the data set.

Details
    This data contains raw expression levels (log-transformed but not centered) for 33 HCC cells and 100 genes. The 33 cells belong to three different subpopulations and exhibit different biological characteristics. For a description of how this data was generated, please refer to the paper.

Source
    https:///geo/query/acc.cgi?acc=GSE65364

References
    Hou, Yu, et al. "Single-cell triple omics sequencing reveals genetic, epigenetic, and transcriptomic heterogeneity in hepatocellular carcinomas." Cell Research 26.3 (2016): 304-319.

Examples
    data(Hou_sim)
    data <- Hou_sim$x


plot_ClussCluster    Plots the results of ClussCluster

Description
    Plots the number of signature genes against the tuning parameters if multiple tuning parameters are evaluated in the object. If only one is included, then plot_ClussCluster returns a Venn diagram and a heatmap at this particular tuning parameter.

Usage
    plot_ClussCluster(object, m = 10, snames = NULL, gnames = NULL, ...)
    top.m.hm(object, m, snames = NULL, gnames = NULL, ...)

Arguments
    object   An object that is obtained by applying the ClussCluster function to the data set.
    m        The number of top signature genes selected to produce the heatmap.
    snames   The names of the cells.
    gnames   The names of the genes.
    ...      Additional parameters, sent to the method.

Details
    Takes the normalized and log-transformed number of reads mapped to genes (e.g., log(RPKM+1) or log(TPM+1), where RPKM stands for Reads Per Kilobase of transcript per Million mapped reads and TPM stands for Transcripts Per Million) but NOT centered.
    If multiple tuning parameters are evaluated in the object, the number of signature genes is computed for each cluster and plotted against the tuning parameters. Each color and line type corresponds to a cell type.
    If only one tuning parameter is evaluated, two plots are produced. One is the Venn diagram of the cell-type-specific genes; the other is the heatmap of the data with the cells and the top m signature genes. See more details in the paper.

Value
    a ggplot2 object of the heatmap with the top signature genes selected by ClussCluster

Examples
    data(Hou_sim)
    # `res` is a placeholder: the name of the result object is missing in the source
    res <- ClussCluster(Hou_sim$x, nclust = 3, ws = c(2.4, 5, 8.8))
    plot_ClussCluster(res, m = 5, snames = Hou_sim$snames, gnames = Hou_sim$gnames)


plot_ClussCluster_Gap    Plots the results of ClussCluster_Gap

Description
    Plots the gap statistics and the number of genes selected as the tuning parameter varies.

Usage
    plot_ClussCluster_Gap(object)

Arguments
    object   object obtained from ClussCluster_Gap()


print_ClussCluster    Prints out the results of ClussCluster

Description
    Prints out the results of ClussCluster.

Usage
    print_ClussCluster(object)

Arguments
    object   An object that is obtained by applying the ClussCluster function to the data set.


print_ClussCluster_Gap    Prints out the results of ClussCluster_Gap

Description
    Prints out the results of ClussCluster_Gap: the gap statistics and the number of genes selected for each candidate tuning parameter.

Usage
    print_ClussCluster_Gap(object)

Arguments
    object   An object that is obtained by applying the ClussCluster_Gap function to the data set.


sim_dat    A simulated expression data set

Description
    An example data set containing expression levels for 60 cells and 200 genes. The 60 cells belong to 4 cell types with 15 cells each. Each cell type is uniquely associated with 30 signature genes, i.e., the first cell type is associated with the first 30 genes, the second cell type is associated with the next 30 genes, and so on. The remaining 80 genes show indistinct expression patterns among the four cell types and are considered noise genes.

Usage
    data(sim_dat)

Format
    A data frame with 60 cells on 200 genes.

Value
    A simulated data set used to demonstrate the application of ClussCluster.

Examples
    data(sim_dat)
    head(sim_dat)

English-Chinese Glossary of Statistical Terms

A
abscissa  横坐标
absence rate  缺勤率
absolute number  绝对数
absolute value  绝对值
accident error  偶然误差
accumulated frequency  累积频数
alternative hypothesis  备择假设
analysis of data  分析资料
analysis of variance (ANOVA)  方差分析
arith-log paper  算术对数纸
arithmetic mean  算术均数
assumed mean  假定均数
arithmetic weighted mean  加权算术均数
asymmetry coefficient  偏度系数
average  平均数
average deviation  平均差

B
bar chart  直条图、条图
bias  偏性
binomial distribution  二项分布
biometrics  生物统计学
bivariate normal population  双变量正态总体

C
cartogram  统计图
case fatality rate (or case mortality)  病死率
census  普查
chi-square (χ²) test  卡方检验
central tendency  集中趋势
class interval  组距
classification  分组、分类
cluster sampling  整群抽样
coefficient of correlation  相关系数
coefficient of regression  回归系数
coefficient of variability (or coefficient of variation)  变异系数
collection of data  收集资料
column  列（栏）
combinative table  组合表
combined standard deviation  合并标准差
combined variance (or pooled variance)  合并方差
complete survey  全面调查
complete correlation  完全相关
completely random design  完全随机设计
confidence interval  可信区间，置信区间
confidence level  可信水平，置信水平
confidence limit  可信限，置信限
constituent ratio  构成比，结构相对数
continuity  连续性
control  对照
control group  对照组
coordinate  坐标
correction for continuity  连续性校正
correction for grouping  归组校正
correction number  校正数
correction value  校正值
correlation  相关，联系
correlation analysis  相关分析
correlation coefficient  相关系数
critical value  临界值
cumulative frequency  累积频率

D
data  资料
degree of confidence  可信度，置信度
degree of dispersion  离散程度
degree of freedom  自由度
degree of variation  变异度
dependent variable  应变量
design of experiment  实验设计
deviation from the mean  离均差
diagnose accordance rate  诊断符合率
difference without significance  差别不显著
difference with significance  差别显著
discrete variable  离散变量
dispersion tendency  离中趋势
distribution  分布、分配

E
effective rate  有效率
eigenvalue  特征值
enumeration data  计数资料
equation of linear regression  线性回归方程
error  误差
error of replication  重复误差
error of type II  Ⅱ型错误，第二类误差
error of type I  Ⅰ型错误，第一类误差
estimate value  估计值
event  事件
experiment design  实验设计
experiment error  实验误差
experimental group  实验组
extreme value  极值

F
fatality rate  病死率
field survey  现场调查
fourfold table  四格表
frequency  频数
frequency distribution  频数分布

G
Gaussian curve  高斯曲线
geometric mean  几何均数
grouped data  分组资料

H
histogram  直方图
homogeneity of variance  方差齐性
homogeneity test of variances  方差齐性检验
hypothesis test  假设检验
hypothetical universe  假设总体

I
incidence rate  发病率
incomplete survey  非全面调查
independent variable  自变量
individual difference  个体差异
infection rate  感染率
inferior limit  下限
initial data  原始数据
inspection of data  检查资料
intercept  截距
interpolation method  内插法
interval estimation  区间估计
inverse correlation  负相关

K
kurtosis coefficient  峰度系数

L
Latin square design  拉丁方设计
least significant difference  最小显著差数
least square method  最小平方法，最小二乘法
leptokurtic distribution  尖峭态分布
leptokurtosis  峰态，峭度
linear chart  线图
linear correlation  直线相关
linear regression  直线回归
linear regression equation  直线回归方程
link relative  环比
logarithmic normal distribution  对数正态分布
logarithmic scale  对数尺度
lognormal distribution  对数正态分布
lower limit  下限

M
matched pair design  配对设计
mathematical statistics  数理统计（学）
maximum value  极大值
mean  均值
mean of population  总体均数
mean square  均方
mean variance  均方，方差
measurement data  计量资料
median  中位数
medical statistics  医学统计学
mesokurtosis  正态峰
method of least squares  最小平方法，最小二乘法
method of grouping  分组法
method of percentiles  百分位数法
mid-value of class  组中值
minimum value  极小值
mode  众数
moment  动差，矩
morbidity  患病率
mortality  死亡率

N
natality  出生率
natural logarithm  自然对数
negative correlation  负相关
negative skewness  负偏态
no correlation  无相关
non-linear correlation  非线性相关
non-parametric statistics  非参数统计
normal curve  正态曲线
normal deviate  正态离差
normal distribution  正态分布
normal population  正态总体
normal probability curve  正态概率曲线
normal range  正常范围
normal value  正常值
normal kurtosis  正态峰
normality test  正态性检验
nosometry  患病率
null hypothesis  无效假设，检验假设

O
observed unit  观察单位
observed value  观察值
one-sided test  单侧检验
one-tailed test  单尾检验
order statistic  顺序统计量
ordinal number  秩号
ordinate  纵坐标

P
pairing data  配对资料
parameter  参数
percent  百分率
percentage  百分数，百分率
percentage bar chart  百分条图
percentile  百分位数
pie diagram  圆图
placebo  安慰剂
planning of survey  调查计划
point estimation  点估计
population  总体，人口
population mean  总体均数
population rate  总体率
population variance  总体方差
positive correlation  正相关
positive skewness  正偏态
power of a test  把握度，检验效能
prevalence rate  患病率
probability  概率，机率
probability error  偶然误差
proportion  比，比率
prospective study  前瞻研究
prospective survey  前瞻调查
public health statistics  卫生统计学

Q
quality control  质量控制
quartile  四分位数

R
random  随机
random digits  随机数字
random error  随机误差
random numbers table  随机数目表
random sample  随机样本
random sampling  随机抽样
random variable  随机变量
randomization  随机化
randomized blocks  随机区组，随机单位组
randomized blocks analysis of variance  随机单位组方差分析
randomized blocks design  随机单位组设计
randomness  随机性
range  极差、全距
range of normal values  正常值范围
rank  秩，秩次，等级
rank correlation  等级相关
rank correlation coefficient  等级相关系数
rank-sum test  秩和检验
rank test  秩(和)检验
ranked data  等级资料
rate  率
ratio  比
recovery rate  治愈率
registration  登记
regression  回归
regression analysis  回归分析
regression coefficient  回归系数
regression equation  回归方程
relative number  相对数
relative ratio  比较相对数
relative ratio with fixed base  定基比
remainder error  剩余误差
replication  重复
retrospective survey  回顾调查
Ridit analysis  参照单位分析
Ridit value  参照单位值

S
sample  样本
sample average  样本均数
sample size  样本含量
sampling  抽样
sampling error  抽样误差
sampling statistics  样本统计量
sampling survey  抽样调查
scatter diagram  散点图
schedule of survey  调查表
semi-logarithmic chart  半对数线图
semi-measurement data  半计量资料
semi-quartile range  四分位数间距
sensitivity  灵敏度
sex ratio  性比例
sign test  符号检验
significance  显著性，意义
significance level  显著性水平
significance test  显著性检验
significant difference  差别显著
simple random sampling  单纯随机抽样
simple table  简单表
size of sample  样本含量
skewness  偏态
slope  斜率
sorting data  整理资料
sorting table  整理表
sources of variation  变异来源
square deviation  方差
standard deviation (SD)  标准差
standard error (SE)  标准误
standard error of estimate  标准估计误差
standard error of the mean  均数的标准误
standardization  标准化
standardized rate  标化率
standardized normal distribution  标准正态分布
statistic  统计量
statistics  统计学
statistical induction  统计归纳
statistical inference  统计推断
statistical map  统计地图
statistical method  统计方法
statistical survey  统计调查
statistical table  统计表
statistical test  统计检验
statistical treatment  统计处理
stratified sampling  分层抽样
stochastic variable  随机变量
sum of cross products of deviation from mean  离均差积和
sum of ranks  秩和
sum of squares of deviation from mean  离均差平方和
superior limit  上限
survival rate  生存率
symmetry  对称(性)
systematic error  系统误差
systematic sampling  机械抽样

T
t-distribution  t分布
t-test  t检验
tabulation method  划记法
test of normality  正态性检验
test of one-sided  单侧检验
test of one-tailed  单尾检验
test of significance  显著性检验
test of two-sided  双侧检验
test of two-tailed  双尾检验
theoretical frequency  理论频数
theoretical number  理论数
treatment  处理
treatment factor  处理因素
treatment of data  数据处理
two-factor analysis of variance  双因素方差分析
two-sided test  双侧检验
two-tailed test  双尾检验
type I error  第一类误差
type II error  第二类误差
typical survey  典型调查

U
u test  u检验
universe  总体，全域
ungrouped data  未分组资料
upper limit  上限

V
variable  变量
variance  方差，均方
variance analysis  方差分析
variance ratio  方差比
variate  变量
variation coefficient  变异系数
velocity of development  发展速度
velocity of increase  增长速度

W
weight  权数
weighted mean  加权均数

Z
zero correlation  零相关

Additional terms
population  母体
sample  样本
census  普查
sampling  抽样
quantitative  量的
qualitative/categorical  质的
discrete  离散的
continuous  连续的
population parameters  母体参数
sample statistics  样本统计量
descriptive statistics  叙述统计学
inferential/inductive statistics  推论/归纳统计学
levels of measurement  衡量尺度
nominal scale  名目尺度
ordinal scale  顺序尺度
interval scale  区间尺度
ratio scale  比例尺度
frequency distribution  次数分配
relative frequency  相对次数
range  全距
class midpoint  组中点
class limits  组限
class boundaries  组界
class width  组距
cumulative frequency  (以下) 累加次数
decumulative frequency  以上累加次数
histogram  直方图
pie chart  饼图
ogive  肩形图
frequency polygon  多边形图
cumulative frequency polygon  累加次数多边形图
box plot  盒须图
stem and leaf plot  枝叶图
measures of central tendency  中央趋势量数
mean  平均数
median  中位数
mode  众数
location measures  位置量数
percentile  百分位数
quartile  四分位数
decile  十分位数
dispersion measures  分散量数
range  全距
interquartile range (IQR)  四分位距
mean absolute deviation  平均绝对离差
variance  变异数
standard deviation  标准差
coefficient of variation  变异系数
left-skewed  左偏
negative-skewed  负偏
right-skewed  右偏
positive-skewed  正偏
contingency table  列联表
sampling distribution (of a statistic)  (某个统计量的) 抽样分配
point estimate  点估计值
point estimator  点估计式
unbiased estimator  不偏点估计式
efficient estimator  有效点估计式
consistent estimator  一致点估计式
confidence level  信赖水准
confidence interval  信赖区间
null hypothesis  虚无假设
alternative hypothesis  对立假设
left-tailed test  左尾检定
right-tailed test  右尾检定
two-tailed test  双尾检定
test statistic  检定统计量
critical value  临界值

Probability and Statistics English

Basic Terms of "Probability Theory and Mathematical Statistics": English-Chinese Table

English  Chinese
probability theory  概率论
mathematical statistics  数理统计
deterministic phenomenon  确定性现象
random phenomenon  随机现象
sample space  样本空间
random occurrence  随机事件
fundamental event  基本事件
certain event  必然事件
impossible event  不可能事件
random test  随机试验
incompatible events  互不相容事件
frequency  频率
classical probabilistic model  古典概型
geometric probability  几何概率
conditional probability  条件概率
multiplication theorem  乘法定理
Bayes's formula  贝叶斯公式
prior probability  先验概率
posterior probability  后验概率
independent events  相互独立事件
Bernoulli trials  贝努利试验
random variable  随机变量
probability distribution  概率分布
distribution function  分布函数
discrete random variable  离散随机变量
distribution law  分布律
hypergeometric distribution  超几何分布
random sampling model  随机抽样模型
binomial distribution  二项分布
Poisson distribution  泊松分布
geometric distribution  几何分布
probability density  概率密度
continuous random variable  连续随机变量
uniform distribution  均匀分布
exponential distribution  指数分布
numerical character  数字特征
mathematical expectation  数学期望
variance  方差
moment  矩
central moment  中心矩
n-dimensional random variable  n-维随机变量
two-dimensional discrete random variable  二维离散随机变量
joint probability distribution  联合概率分布
joint distribution law  联合分布律
joint distribution function  联合分布函数
marginal distribution law  边缘分布律
marginal distribution function  边缘分布函数
two-dimensional exponential distribution  二维指数分布
two-dimensional continuous random variable  二维连续随机变量
joint probability density  联合概率密度
marginal probability density  边缘概率密度
conditional distribution  条件分布
conditional distribution law  条件分布律
conditional probability density  条件概率密度
covariance  协方差
correlation coefficient  相关系数
normal distribution  正态分布
limit theorem  极限定理
standard normal distribution  标准正态分布
logarithmic normal distribution  对数正态分布
covariance matrix  协方差矩阵
central limit theorem  中心极限定理
Chebyshev's inequality  切比雪夫不等式
Bernoulli's law of large numbers  贝努利大数定律
statistics  统计量
simple random sample  简单随机样本
sample distribution function  样本分布函数
sample mean  样本均值
sample variance  样本方差
sample standard deviation  样本标准差
sample covariance  样本协方差
sample correlation coefficient  样本相关系数
order statistics  顺序统计量
sample median  样本中位数
sample fractiles  样本极差
sampling distribution  抽样分布
parameter estimation  参数估计
estimator  估计量
estimate value  估计值
unbiased estimator  无偏估计
unbiasedness  无偏性
biased error  偏差
mean square error  均方误差
relative efficiency  相对有效性
minimum variance  最小方差
asymptotic unbiased estimator  渐近无偏估计量
consistent estimator  一致性估计量
moment method of estimation  矩法估计
maximum likelihood method of estimation  极大似然估计法
likelihood function  似然函数
maximum likelihood estimator  极大似然估计值
interval estimation  区间估计
hypothesis testing  假设检验
statistical hypothesis  统计假设
simple hypothesis  简单假设
composite hypothesis  复合假设
rejection region  拒绝域
acceptance domain  接受域
test statistics  检验统计量
linear regression analysis  线性回归分析

English-Chinese Vocabulary of Probability Theory and Mathematical Statistics

A
absolute value  绝对值
accept  接受
acceptable region  接受域
additivity  可加性
adjusted  调整的
alternative hypothesis  对立假设
analysis  分析
analysis of covariance  协方差分析
analysis of variance  方差分析
arithmetic mean  算术平均值
association  相关性
assumption  假设
assumption checking  假设检验
availability  有效度
average  均值

B
balanced  平衡的
band  带宽
bar chart  条形图
beta-distribution  贝塔分布
between groups  组间的
bias  偏倚
binomial distribution  二项分布
binomial test  二项检验

C
calculate  计算
case  个案
category  类别
center of gravity  重心
central tendency  中心趋势
chi-square distribution  卡方分布
chi-square test  卡方检验
classify  分类
cluster analysis  聚类分析
coefficient  系数
coefficient of correlation  相关系数
collinearity  共线性
column  列
compare  比较
comparison  对比
components  构成，分量
compound  复合的
confidence interval  置信区间
consistency  一致性
constant  常数
continuous variable  连续变量
control charts  控制图
correlation  相关
covariance  协方差
covariance matrix  协方差矩阵
critical point  临界点
critical value  临界值
crosstab  列联表
cubic  三次的，立方的
cubic term  三次项
cumulative distribution function  累加分布函数
curve estimation  曲线估计

D
data  数据
default  默认的
definition  定义
deleted residual  剔除残差
density function  密度函数
dependent variable  因变量
description  描述
design of experiment  实验设计
deviations  差异
df. (degree of freedom)  自由度
diagnostic  诊断
dimension  维
discrete variable  离散变量
discriminant function  判别函数
discriminatory analysis  判别分析
distance  距离
distribution  分布
D-optimal design  D-优化设计

E
equal  相等
effects of interaction  交互效应
efficiency  有效性
eigenvalue  特征值
equal size  等含量
equation  方程
error  误差
estimate  估计
estimation of parameters  参数估计
estimations  估计量
evaluate  衡量
exact value  精确值
expectation  期望
expected value  期望值
exponential  指数的
exponential distribution  指数分布
extreme value  极值

F
factor  因素，因子
factor analysis  因子分析
factor score  因子得分
factorial designs  析因设计
factorial experiment  析因实验
fit  拟合
fitted line  拟合线
fitted value  拟合值
fixed model  固定模型
fixed variable  固定变量
fractional factorial design  部分析因设计
frequency  频数
F-test  F检验
full factorial design  完全析因设计
function  函数

G
gamma distribution  伽玛分布
geometric mean  几何均值
group  组

H
harmonic mean  调和均值
heterogeneity  不齐性
histogram  直方图
homogeneity  齐性
homogeneity of variance  方差齐性
hypothesis  假设
hypothesis test  假设检验

I
independence  独立性
independent variable  自变量
independent-samples  独立样本
index  指数
index of correlation  相关指数
interaction  交互作用
interclass correlation  组间相关
interval estimate  区间估计
intraclass correlation  组内相关
inverse  倒数的
iterate  迭代

K
kernel  核
Kolmogorov-Smirnov test  柯尔莫哥洛夫-斯米诺夫检验
kurtosis  峰度

L
large sample problem  大样本问题
layer  层
least-significant difference  最小显著差数
least-square estimation  最小二乘估计
least-square method  最小二乘法
level  水平
level of significance  显著性水平
leverage value  中心化杠杆值
life  寿命
life test  寿命试验
likelihood function  似然函数
likelihood ratio test  似然比检验
linear  线性的
linear estimator  线性估计
linear model  线性模型
linear regression  线性回归
linear relation  线性关系
linear term  线性项
logarithmic  对数的
logarithms  对数
logistic  逻辑的
loss function  损失函数

M
main effect  主效应
matrix  矩阵
maximum  最大值
maximum likelihood estimation  极大似然估计
mean squared deviation (MSD)  均方差
mean sum of square  均方和
measure  衡量
median  中位数
M-estimator  M估计
minimum  最小值
missing values  缺失值
mixed model  混合模型
mode  众数
model  模型
Monte Carlo method  蒙特卡罗法
moving average  移动平均值
multicollinearity  多元共线性
multiple comparison  多重比较
multiple correlation  多重相关
multiple correlation coefficient  复相关系数，多元相关系数
multiple regression analysis  多元回归分析
multiple regression equation  多元回归方程
multiple response  多响应
multivariate analysis  多元分析

N
negative relationship  负相关
nonadditivity  不可加性
nonlinear  非线性
nonlinear regression  非线性回归
nonparametric tests  非参数检验
normal distribution  正态分布
null hypothesis  零假设
number of cases  个案数

O
one-sample  单样本
one-tailed test  单侧检验
one-way ANOVA  单向方差分析
one-way classification  单向分类
optimal  优化的
optimum allocation  最优配置
order  排序
order statistics  次序统计量
origin  原点
orthogonal  正交的
outliers  异常值

P
paired observations  成对观测数据
paired-sample  成对样本
parameter  参数
parameter estimation  参数估计
partial correlation  偏相关
partial correlation coefficient  偏相关系数
partial regression coefficient  偏回归系数
percent  百分数
percentiles  百分位数
pie chart  饼图
point estimate  点估计
Poisson distribution  泊松分布
polynomial curve  多项式曲线
polynomial regression  多项式回归
polynomials  多项式
positive relationship  正相关
power  幂
P-P plot  P-P概率图
predict  预测
predicted value  预测值
prediction intervals  预测区间
principal component analysis  主成分分析
probability  概率
probability density function  概率密度函数
probit analysis  概率分析
proportion  比例

Q
quadratic  二次的
Q-Q plot  Q-Q概率图
quadratic term  二次项
quality control  质量控制
quantitative  数量的，度量的
quartiles  四分位数

R
random  随机的
random number  随机数
random sampling  随机取样
random seed  随机数种子
random variable  随机变量
randomization  随机化
range  极差
rank  秩
rank correlation  秩相关
rank statistic  秩统计量
regression analysis  回归分析
regression coefficient  回归系数
regression line  回归线
reject  拒绝
rejection region  拒绝域
relationship  关系
reliability  可靠性
repeated  重复的
report  报告，报表
residual  残差
residual sum of squares  剩余平方和
response  响应
risk function  风险函数
robustness  稳健性
root mean square  均方根
row  行
run  游程
run test  游程检验

S
sample  样本
sample size  样本容量
sample space  样本空间
sampling  取样
sampling inspection  抽样检验
scatter chart  散点图
S-curve  S形曲线
separately  单独地
sets  集合
sign test  符号检验
significance  显著性
significance level  显著性水平
significance testing  显著性检验
significant  显著的，有效的
significant digits  有效数字
skewed distribution  偏态分布
skewness  偏度
small sample problem  小样本问题
smooth  平滑
sort  排序
sources of variation  方差来源
space  空间
spread  扩展
square  平方
standard deviation  标准离差
standard error of mean  均值的标准误差
standardization  标准化
standardize  标准化
statistic  统计量
statistical quality control  统计质量控制
std. residual  标准残差
stepwise regression analysis  逐步回归
stimulus  刺激
strong assumption  强假设
stud. deleted residual  学生化剔除残差
stud. residual  学生化残差
subsamples  次级样本
sufficient statistic  充分统计量
sum  和
sum of squares  平方和
summary  概括，综述

T
table  表
t-distribution  t分布
test  检验
test criterion  检验判据
test for linearity  线性检验
test of goodness of fit  拟合优度检验
test of homogeneity  齐性检验
test of independence  独立性检验
test rules  检验法则
test statistics  检验统计量
testing function  检验函数
time series  时间序列
tolerance limits  容许限
total  总共，和
transformation  转换
treatment  处理
trimmed mean  截尾均值
true value  真值
t-test  t检验
two-tailed test  双侧检验

U
unbalanced  不平衡的
unbiased estimation  无偏估计
unbiasedness  无偏性
uniform distribution  均匀分布

V
value of estimator  估计值
variable  变量
variance  方差
variance components  方差分量
variance ratio  方差比
various  不同的
vector  向量

W
weight  加权，权重
weighted average  加权平均值
within groups  组内的

Z
Z score  Z分数


Minimum Cluster Size Estimation and Cluster Reﬁnement for the Randomized GravitationalClustering AlgorithmJonatan Gomez1,Elizabeth León1,and Olfa Nasraoui21Alife&Midas Research GroupsComputer Systems EngineeringUniversidad Nacional de Colombia{jgomezpe,eleonguz}@.co2Knowledge Discovery&Web Mining LabDept.of Computer Engineering&Computer ScienceUniversity of Louisvilleolfa.nasraoui@Abstract.Although clustering is an unsupervised learning approach,most clustering algorithms require setting several parameters(such as thenumber of clusters,minimum density or distance threshold)in advanceto work properly.In this paper,we eliminate the necessity of settingthe minimum cluster size parameter of the Randomized GravitationalClustering algorithm proposed by Gomez et al.Basically,the minimumcluster size is estimated using a heuristic that takes in considerationthe functional relation between the number of clusters and the clusterswith at least a given number of points.Then a data point’s region ofaction(region of the space assigned to a point)is deﬁned and a clusterreﬁnement process is proposed in order to merge overlapping clusters.Our experimental results show that the proposed algorithm is able todeal with noise,whileﬁnding an appropriate number of clusters withoutrequiring a manual setting of the minimum cluster size1.Keywords:data mining,data clustering,gravitational clustering,clus-ter reﬁnement,cluster size estimation.1IntroductionClustering is a learning technique that accepts unlabeled data points(data records)and classiﬁes them into diﬀerent groups or clusters according to some similarity measure(points assigned to the same cluster have high similarity be-tween them,while points assigned to diﬀerent clusters have low similarity be-tween them)[1,6,7,11,13].Although clustering is considered an unsupervised learning approach,most of the clustering algorithms require the setting of some 1This paper was partially funded by a Colciencias Grant(1101-521-28885).J.Gomez was 
partially funded by the2010-2011Fulbright Visiting Scholar Program Grant. J.Pavón et al.(Eds.):IBERAMIA2012,LNAI7637,pp.51–60,2012.c Springer-Verlag Berlin Heidelberg201252J.Gomez,E.León,and O.Nasraouiparameters like the number of clusters k.Such is the case of partitional algo-rithms like k-means[11].However,ﬁnding the right clusters is a diﬃcult task since clusters can vary in shape,size and density and can suﬀer from the pres-ence of noise.In fact,the presence of noise can deteriorate the results of many of the clustering techniques that are based on the Least Squares estimate[13]. DBSCAN[3]is a density-based clustering algorithm used to discover arbitraily shaped clusters in the presence of noise.Basically,a concept of data point neigh-borhood is deﬁned as the set of points located within a distance smaller than a certain threshold(Epsilon).Then points are categorized into core point(if they have at least a number of predeﬁned neighbors(MinPoints)),border point if they have fewer than such a predeﬁned number of neighbors and belong to some neighborhood of a core point and noise point if they have fewer than such a predeﬁned number of neighbors and do not belong to some neighborhood of a core point.Then,a cluster is deﬁned as a set of density-conected points that is maximal with respect to density-reachability.One diﬃculty in using DBScan is its sensitivity to the setting of the parameters Epsilon and MinPoints[8].Data clustering algorithms inspired by natural phenomena such as natural evolution[10,12],and gravitational force have also been proposed in order to tackle these problems.In particular,gravitational clustering algorithms are a type of agglomerative hierarchical algorithms that are based on concepts inspired fromﬁeld theory in physics[14,9],but suﬀer from the relatively high complexity of the algorithm.One such case is the GC algorithm[14]that simulates the universal gravitational system by considering each data point as a particle in a space exposed 
to gravitationalﬁelds.A unit mass is associated with each data point,and they are moved toward the cluster centers due to gravitationalﬁelds. In[4,5],a gravitational clustering algorithm(Rgc)that is robust to noise and does not require the number of clusters was proposed.Like GC,it is inspired from gravitational theory,however it redeﬁned the clustering target and the dynamics of the system,thus reducing the time complexity of the original GC to less than quadratic in addition to being able to resist noise.The computational complexity was reduced mainly by considering only a sample instead of all the other data points when making a decision about moving a given data point. Both interacting data points are moved according to a simpliﬁed version of the Universal Gravitational Law and Newton’s Second Motion Law.Points that are close enough end up merged into virtual clusters.Finally,the big crunch eﬀect (convergence to one single big cluster at the end)was eliminated by introducing a cooling mechanism similar to the one used in simulated annealing.In this paper,we extend the Rgc algorithm by deﬁning a heuristic for estimat-ing the minimum cluster size parameter that is based on analyzing the behavior of the number of clusters(c m)with at least m points.A notion of data point’s region of action based on the merging process is deﬁned and a heuristic for re-ﬁning the extracted clusters(merging overlapping clusters),based on the newly developed notion of data point’s region of action.The remainder of this paper is organized as follow.An overview of the Randomized Gravitational Clustering algorithm is done in Sect.2.Then we introduce the heuristic for estimating theMinimum Cluster Size Estimation and Cluster Reﬁnement 53minimum cluster size,the data points’region of action concept and the cluster reﬁnement heuristic in Sect.3and 4.Our Experimental results are presented in Sect.5,and ﬁnally,some conclusions are drawn in Sect.6.2Randomized Gravitational Clustering (Rgc 
)Gravitational clustering is an agglomerative hierarchical clustering algorithm based on the concepts of ﬁeld theory in physics [9,14].The algorithm simu-lates a gravitational system,following Newton’s gravitational law,where each data point is considered as a single particle,with unit mass,exposed to some gravitational ﬁelds in the feature space (deﬁned by the data set)and is moved toward the cluster centers due to these gravitational ﬁelds generated by other data points.Finally,the hierarchy of emergent clusters of particles is extracted[14].When two particles are close enough to be merged,they are considered to belong to the same cluster,one of them is removed from the simulation,and the mass of the other is increased by the amount of the mass of the particle being removed.The process is stopped when only one particle remains in the system.The main problem with Wright’s algorithm is its high time complexity,O N 3 ,with respect to the size of the data set (N )[9].Also,the gravitational clustering algorithm does not automatically detect noise.In order to reduce the time complexity of the Gravitational clustering algorithm,Gomez et al proposed a randomized version in [4],that modiﬁes four components of the original al-gorithm:First,every particle is moved according to the gravitational force ﬁeld induced by a single randomly selected particle.In this way,Equation 1is used for deﬁning the force vector applied to a given particle and Equation 2is used for moving,in an asynchronous fashion 2,both particles (the particle under con-sideration and the randomly selected one).F x,y =Gm x m yd 2x,y (1)y t +1=y t +Gm x m y −−→d x,y3−−→d x,y (2)Second,when the given particles are close enough to be merged,both of them are kept in the system and an optimal disjoin-union set structure (see [2])is updated to track the formation of clusters.Since no data points are removed from the system,nor are modiﬁcations on the mass of the particles considered,Equation 2is simpliﬁed as 
follows:

$$y_{t+1} = y_t + \vec{d}_{x,y}\,\frac{G}{\|\vec{d}_{x,y}\|^{3}} \tag{3}$$

Footnote 2: The original gravitational clustering works in a synchronous fashion, since the movement of every particle is computed before any of them is moved.

Algorithm 1. Randomized Gravitational Clustering: Rgc(x, G, Δ(G), M, ε)
for i = 1 to N do
    Make(i)
for i = 1 to M do
    for j = 1 to N do
        k = random point index such that k ≠ j
        Move(x_j, x_k)    // Move both points using Eq. 3
        if d_{x_j, x_k} ≤ ε then Union(j, k)
    G = (1 − Δ(G)) * G
for i = 1 to N do
    Find(i)
return disjoint-sets

Algorithm 2. Cluster Extraction: GetClusters(clusters, α, N)
newClusters = ∅
for i = 1 to number of clusters do
    if size(cluster_i) ≥ α then
        newClusters = newClusters ∪ {cluster_i}
return newClusters

Third, a cooling factor (Δ(G)), which reduces the value of the "gravitational constant" G, is introduced in order to reduce or eliminate the big crunch effect. In this way, a new parameter, the number of iterations (M), is introduced. Fourth, a parameter (α) is used to determine the minimum number of data points that a cluster should include in order to be considered a valid cluster.

Algorithm 1 shows the randomized gravitational clustering algorithm (Rgc), which produces the disjoint-set union structure, while Algorithm 2 is used for extracting the set of clusters that are considered valid.

In [5], Gomez et al. introduced two elements in the Rgc algorithm: the first one reduces the effect of the data set size on the system's dynamics, and the second one automatically sets the initial gravitational constant value. In order to determine an appropriate value of G, an extended bisection search algorithm is used [2]. Basically, the value of G is considered appropriate if the number of clustered points is close to $N/2$, i.e., $N/2 \pm \tau$, after $\sqrt{N}$ iterations (these values were obtained after analyzing the behavior of the cluster formation when running the Rgc algorithm on different data sets). If more than half of the points ($N/2$) are clustered, then G is reduced; otherwise G is increased.

3 Estimation of the Minimum Cluster Size (α)

The minimum cluster
size, i.e., the number of data points that a cluster should contain in order to be considered a valid cluster, can be estimated by analyzing the behavior of the number of clusters ($c_m$) with at least m points. First, noise points are expected to be far enough from any other point, thus reducing the possibility of being merged into clusters, so noisy points would be merged into clusters with a low number of points. Second, noise is expected to be present in different regions of the space, so it is expected that the number of clusters with a low number of points (noise points) will be high. Third, it is expected that real clusters will be defined by a large number of points. Fourth, it is expected, in general, that the number of clusters defining the data set is very small compared to the number of points represented by such clusters. Figure 1 shows the expected behavior of the number of clusters $c_m$ with at least m points.

Fig. 1. Expected relation between m and the number of clusters with at least m points

Therefore, we compute the behavior of $c_m$ and obtain the size α where the slope is closest to $\pi/4$, i.e., we choose the value α where the number of clusters (nc) with at least α points is higher than the value α (nc > α), see Fig. 1.

4 Data Point Region of Action and Cluster Refinement

Due to the random nature of the merging process of the Rgc algorithm, it is possible that some good points, i.e., points that actually belong to some cluster, will not be assigned to any cluster. Moreover, it is possible that some 'overlapping' clusters will not be merged at all. In order to tackle these disadvantages, we introduce a notion of a data point's region of action based on the merging process, and we propose a heuristic for refining the extracted clusters based on this notion.
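
As a minimal sketch of the minimum cluster size heuristic of Sect. 3, the following Python code computes $c_m$ from a list of cluster sizes and then picks α. It assumes one particular reading of the rule nc > α: α is taken as the largest m, scanning upward from 1, for which $c_m > m$ still holds, i.e., the last point before the decreasing curve meets the diagonal. The function names `count_clusters_at_least` and `estimate_alpha` and the example sizes are hypothetical and are not part of the Rgc algorithm as published.

```python
from collections import Counter


def count_clusters_at_least(sizes):
    """Return a list c where c[m] is the number of clusters with at least m points.

    Index 0 is unused; the last entry is always 0.
    """
    max_size = max(sizes)
    exact = Counter(sizes)          # exact[m] = clusters with exactly m points
    c = [0] * (max_size + 2)
    for m in range(max_size, 0, -1):
        c[m] = c[m + 1] + exact[m]  # suffix sum over cluster sizes
    return c


def estimate_alpha(sizes):
    """Assumed reading of the heuristic: largest m (from 1 upward) with c[m] > m."""
    if not sizes:
        return 1
    c = count_clusters_at_least(sizes)
    alpha = 1
    for m in range(1, len(c)):
        if c[m] > m:
            alpha = m
        else:
            break                   # c[m] is non-increasing, so stop at the crossing
    return alpha


# Hypothetical example: many tiny (noise) clusters and ten large clusters.
sizes = [1] * 150 + [2] * 35 + [3] * 11 + [4] * 3 + [5] * 1 + [100] * 10
alpha = estimate_alpha(sizes)
valid = [s for s in sizes if s >= alpha]
print(alpha, len(valid))
```

For these example sizes the sketch yields α = 9, so only the ten large clusters pass the size filter used by Algorithm 2.
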
Due to the dynamic behavior of the Rgc, it is possible not only to track the cluster formation, but also to track the real distance (the distance in the original space, before points were moved) between points that have been merged. This distance gives us an indication of the strength of the attraction force exerted by the region of action of the data point. In this way, we associate two values with every data point k: the aggregated merging distance of data point k (denoted $d_k$) and the total number of data points that data point k has been merged with (denoted $n_k$). Notice that $d_k$ and $n_k$ can be updated in constant time, by incrementing $d_k$ and $n_k$ every time a data point j is merged with data point k (just by adding to $d_k$ the distance between data points k and j and by adding one to $n_k$). Finally, we compute the average merging distance ($\bar{d}_k = d_k / n_k$) of data point k, and we use it as the radius of the region of action (a hyper-sphere centered at the data point with radius $\bar{d}_k$) defined by data point k.

We define the concept of a data point's region of action with two main objectives in mind: the first is to be able to determine the cluster that an unknown and potentially new data point should belong to, and the second is to be able to refine clusters by merging 'overlapping' clusters. In this direction, we define the sub-cluster relation between clusters as follows: a cluster $c_1$ is said to be a sub-cluster of a cluster $c_2$ if every data point x of cluster $c_1$ falls inside the region of action of some data point y in cluster $c_2$.
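
The bookkeeping of $d_k$ and $n_k$ and the sub-cluster test defined above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' implementation: the class `RegionTracker`, its methods, and the function `is_subcluster` are hypothetical names; both points of a merged pair are updated symmetrically (an assumption); a point that was never merged is given radius 0; and clusters are represented as lists of point indices.

```python
import math


class RegionTracker:
    """Tracks, for every data point k, the aggregated merging distance d_k and count n_k."""

    def __init__(self, points):
        # Original (unmoved) coordinates; distances are measured in the original space.
        self.points = [tuple(p) for p in points]
        self.dist_sum = [0.0] * len(self.points)  # d_k
        self.count = [0] * len(self.points)       # n_k

    def record_merge(self, j, k):
        """Constant-time update when points j and k are merged (symmetric by assumption)."""
        d = math.dist(self.points[j], self.points[k])
        for i in (j, k):
            self.dist_sum[i] += d
            self.count[i] += 1

    def radius(self, k):
        """Average merging distance d_k / n_k; 0.0 if k was never merged (assumption)."""
        return self.dist_sum[k] / self.count[k] if self.count[k] else 0.0

    def covers(self, k, x):
        """True if coordinates x fall inside the region of action of data point k."""
        return math.dist(self.points[k], x) <= self.radius(k)


def is_subcluster(tracker, c1, c2):
    """True if every point of c1 lies in the region of action of some point of c2."""
    return all(
        any(tracker.covers(y, tracker.points[x]) for y in c2)
        for x in c1
    )


# Hypothetical example with four points on a line.
tracker = RegionTracker([(0.0, 0.0), (1.0, 0.0), (0.5, 0.0), (10.0, 0.0)])
tracker.record_merge(0, 1)
tracker.record_merge(0, 2)
print(is_subcluster(tracker, [2], [0, 1]), is_subcluster(tracker, [3], [0, 1]))
```

The `covers` method also serves the first objective stated above, since it can be called with the coordinates of a new, unseen data point.
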
As expected, two clusters will be merged if one of them is a sub-cluster of the other one. This process can be repeated until no further clusters are merged.

5 Experimental Results

Experiments were conducted on synthetic data sets with Gaussian clusters and with clusters of different shapes.

5.1 Gaussian Cluster Data Sets

Tests were conducted on four synthetic data sets with different cluster densities and sizes, and with different levels of noise (10% and 20%), see Fig. 2.

Fig. 2. Data sets with Gaussian clusters: (a) Two clusters, (b) Five clusters, (c) Ten clusters, and (d) Fifteen clusters

Due to the lack of space, we show the results for only the most challenging data set, i.e., the ten-cluster data set with 20% noise. Figure 4 shows the typical clustering result obtained after every 500 iterations, up to the end of the process (when the stopping criterion is applied). Notice that the majority of data points inside the clusters move towards the cluster centers (see first column), while almost all noisy points remain still or barely move. As expected, clusters emerge as in a dynamic system: first the densest clusters (see point labels' column at iteration 250) and then the less dense clusters (see point labels' column at iteration 1000). Moreover, noisy data points either do not form clusters or form very tiny clusters. It can be seen that the proposed heuristic for determining the minimum size of valid clusters works well, since almost every noise point is removed and only 'real' clusters are extracted (see third column). Finally, it looks like both the simplification of the heuristic for estimating the initial gravitational constant G and the heuristic for stopping the Rgc algorithm work well, since, in general, no 'real' clusters were merged and the structure of the extracted clusters does not change much between the last two shown iterations (see iterations 1000 and 1500), indicating that no further changes are expected.

Figure 5 shows the behavior of the
PFRgc algorithm on all four Gaussian cluster data sets (when applying the refinement to the extracted clusters) and compares the results against those obtained by DBScan. As can be noticed, some small clusters were merged with bigger clusters, producing a more compact cluster model (see footnote 3). However, for the 15-cluster data set, PFRgc divided one of the clusters into a few smaller ones. This may be because this cluster is not dense and is located between two very dense clusters. In the end, all the clusters are extracted and noise is removed.

5.2 Data Sets with Clusters of Different Shapes

Tests were conducted on three Chameleon benchmark data sets, see Fig. 3.

Fig. 3. Chameleon data sets: (a) Six clusters, (b) Nine clusters, and (c) Eight clusters

Due to the lack of space, we show results for only the nine-cluster data set. Figure 6 shows the typical clustering result obtained after every 400 iterations, up to the end of the process (when the stopping criterion is applied). As expected, the behavior is similar to the one observed on the data sets with Gaussian clusters.
Interestingly, the majority of data points inside the clusters are seen to move towards some kind of cluster core (see first column), while almost all the noisy points remain still or barely move. In this way, any cluster shape is detected by the PFRgc algorithm. Notice that some 'overlapping' clusters are generated by the PFRgc, but such clusters are merged when using the refinement process, see Fig. 7.

Footnote 3: It is possible to apply the refinement process to the generated clusters before extracting them. However, we perform it after extracting the clusters in order to simplify the analysis of the refinement process.

Fig. 4. Typical clustering result for the Gaussian ten-cluster data set with 20% noise after every 500 iterations (panels at 500, 1000, and 1500 iterations). Row one shows the position of the data points after the given number of iterations, and row two shows the extracted clusters after the given number of iterations.

Fig. 5. Typical clustering results obtained by the Rgc algorithm on the Gaussian data sets: (a) Extracted clusters, (b) Refined clusters, and (c) DBScan results with Max Distance 0.014 and MinPoints 5.

As expected, the behavior for all the data sets is similar to that observed on the nine-cluster data set. However, for the eight-cluster data set, PFRgc merges three clusters and splits one of them, which may be because one of the clusters is not dense and is located very close to a dense cluster. As in the previous experiments, all the clusters are extracted and noise is removed.
Notice that, on these data sets, PFRgc generally performs better than DBScan.

Fig. 6. Typical clustering result for the nine-cluster Chameleon data set (t7.10) after every 400 iterations (panels at 400, 800, and 1200 iterations). Row one shows the position of the points after the given number of iterations, and row two shows the extracted clusters after that many iterations.

Fig. 7. Typical clustering results obtained by the Rgc algorithm on the Chameleon data sets (last iteration): (a) Extracted clusters, (b) Refined clusters, and (c) DBScan results with Max Distance 0.014 and MinPoints 5

6 Conclusions

In this paper, we established a heuristic for estimating the minimum cluster size parameter by analyzing the behavior of the number of clusters ($c_m$) with at least m points, we defined the notion of a data point's region of action based on the merging process, and we developed a heuristic for refining the extracted clusters (merging overlapping clusters) based on this notion. Since the PFRgc algorithm neither requires a special initialization nor assumes a parametric model, it is able to find good clusters without knowing the number of clusters in advance, regardless of their shape, and in the presence of noise. As shown, the rich dynamic system behavior of the Rgc algorithm allows us to obtain useful information for the clustering process.

References

1. Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press (1981)
2. Cormen, T., Leiserson, C., Rivest, R.: Introduction to Algorithms. McGraw-Hill (1990)
3. Ester, M., Kriegel, H., Sander, J., Xu, X.: A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In: 2nd Intl. Conf. on Knowledge Discovery and Data Mining (KDD 1996), pp. 226–231. AAAI (1996)
4. Gomez, J., Dasgupta, D., Nasraoui, O.: A New Gravitational Clustering Algorithm. In: 3rd SIAM Intl. Conf. on Data Mining (SDM 2003), vol. 3, pp. 83–94. Society for Industrial and Applied
Mathematics (2003)
5. Gomez, J., Nasraoui, O., Leon, E.: RAIN – Data Clustering Using Randomized Interactions between Data Points. In: 3rd Intl. Conf. on Machine Learning and Applications (ICMLA 2004), pp. 250–255 (2004)
6. Han, J., Kamber, M.: Data Mining – Concepts and Techniques. Morgan Kaufmann (2000)
7. Jain, A.K.: Data Clustering – 50 Years Beyond K-Means. Pattern Recognition Letters 31(8), 651–666 (2010)
8. Karypis, G., Han, E., Kumar, V.: CHAMELEON – A Hierarchical Clustering Algorithm Using Dynamic Model. IEEE Computer 32(8), 68–75 (1999)
9. Kundu, S.: Gravitational Clustering – A New Approach Based on the Spatial Distribution of the Points. Pattern Recognition 32(7), 1149–1160 (1999)
10. Leon, E., Nasraoui, O., Gomez, J.: A Scalable Evolutionary Clustering Algorithm with Self-Adaptive Genetic Operators. In: 2010 IEEE Congress on Evolutionary Computation (CEC 2010), pp. 4010–4017. IEEE (2010)
11. MacQueen, J.: Some Methods for Classification and Analysis of Multivariate Observations. In: 5th Berkeley Symposium on Mathematics, Statistics, and Probabilities, pp. 281–297. University of California (1967)
12. Nasraoui, O., Krishnapuram, R.: A Novel Approach to Unsupervised Robust Clustering Using Genetic Niching. In: 9th IEEE Intl. Conf. on Fuzzy Systems (FUZZ IEEE 2000), vol. 1, pp. 170–175 (2000)
13. Rousseeuw, P.J., Leroy, A.M.: Robust Regression and Outlier Detection. John Wiley & Sons (1987)
14. Wright, W.E.: Gravitational Clustering. Pattern Recognition 9(3), 151–166 (1977)