Short-Term Prediction of Wind Farm Power
Wind Power Prediction Using a Time Series Method Based on Robust Estimation
Authors: 朱晓荣; 刘艳萍
Abstract: A time series method based on robust estimation is introduced for short-term prediction of wind farm power output. The data are first preprocessed; the least squares method and the robust estimation method are then each used to build an autoregressive integrated moving average model; finally, the wind power for the next 30 minutes is forecast, and the forecast is repeated 10 times. The results show that, with the time series model based on robust estimation, the forecasting error at most points is within 5%, with a single point reaching 10.1%. This is significantly smaller than the error of the conventional time series model, which indicates that robust estimation gives better forecast accuracy than the conventional autoregressive model when the modeling data contain a few outliers.
Journal: 电力系统及其自动化学报 (Proceedings of the CSU-EPSA)
Year (volume), issue: 2012 (024) 003
Total pages: 5 (pp. 107-110, 126)
Keywords: wind farm output prediction; time series method; robust estimation; least squares method
Affiliation: School of Electrical and Electronic Engineering, North China Electric Power University, Baoding 071003 (both authors)
Language of the text: Chinese
CLC number: TM614
The randomness and intermittency of wind mean that a steady power output cannot be guaranteed, which creates many difficulties for power system stability and for drawing up generation and operating plans.
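A New Short-Term Wind Power Prediction Method with Kernel Function Switching
Authors: 欧阳庭辉; 查晓明; 秦亮; 熊一; 夏添; 黄鹤鸣
Abstract: To reduce the potential threat that large-scale wind power integration poses to the grid, a new chaotic time series prediction method based on a kernel function switching mechanism is proposed to further improve short-term wind power prediction. First, the phase space of the original wind power series is reconstructed by combining the mutual information method with the false nearest neighbors method. The recurrence plot and the largest Lyapunov exponent confirm that wind power comes from a chaotic system with both deterministic and random components, which supports the feasibility of chaotic prediction methods. Second, an implementation of chaotic time series prediction using kernel functions is given. Analysis on training samples shows that it outperforms traditional prediction methods, and, based on the training results, a switching mechanism is proposed in which a support vector machine (SVM) is trained to select the optimal kernel function, further improving prediction accuracy. Finally, using BPA data from the United States as a case study, a comparison of prediction error indices shows that the kernel function prediction method with the switching mechanism achieves effective short-term wind power prediction and improves prediction performance.
Journal: 电力自动化设备 (Electric Power Automation Equipment)
Year (volume), issue: 2016 (036) 009
Total pages: 7 (pp. 80-86)
Keywords: wind power; prediction; kernel function; support vector machine; switching mechanism; chaotic time series; wind power prediction
Affiliation: School of Electrical Engineering, Wuhan University, Wuhan 430072, Hubei (all authors)
Language of the text: Chinese
CLC number: TM614
0 Introduction
As fossil energy is steadily consumed, the energy crisis grows more serious, and renewable energy is therefore being vigorously developed worldwide [1].
At present, wind power, as an abundant and exploitable resource, has a steadily rising penetration in the grid.
However, because wind energy is random and fluctuating, especially under large-scale, highly concentrated wind power development [2], wind power brings more challenges than opportunities. For example, a large and damaging power ramp-down event occurred in Texas, USA, in 2008 [3].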
ELECTRIC DRIVE (电气传动), 2023, Vol. 53, No. 5
Research on Short-term Wind Power Forecasting Based on VMD-CNN-LSTM Optimized by Sparrow Algorithm
ZHANG Zihua 1, LI Yan 1, XU Tianqi 1, WANG Yangguang 2, DENG Xiaoliang 2
(1. The Key Laboratory of Cyber-physical Power System of Yunnan Colleges and Universities, Yunnan Minzu University, Kunming 650504, Yunnan, China; 2. State Grid Hunan Electric Power Co., Ltd., Changsha 410004, Hunan, China)
Abstract: Considering the influencing factors among related wind farms can effectively improve the wind power prediction accuracy of a newly built wind farm. Variational mode decomposition (VMD) is used to preprocess the wind power of each single wind farm by decomposing it into intrinsic mode functions (IMF). The components of each wind farm in the same frequency band, i.e., the low-frequency, high-frequency, and residual components, are then combined into a two-dimensional feature matrix that serves as the input of a convolutional neural network (CNN). The CNN extracts the spatial feature information within each sub-mode, and its output is fed to a long short-term memory (LSTM) network, which captures the long-term dependencies in the time series and produces the prediction. Finally, the component predictions are summed to obtain the complete prediction result.
Compared with a single model, the hyperparameter settings of the combined neural network have a larger effect on prediction accuracy. A new sparrow search algorithm (SSA) is adopted to save the time spent on manual parameter tuning and to improve the accuracy and efficiency of hyperparameter setting.
The proposed method is used to predict the power of a newly built benchmark wind farm in a wind power cluster. The results show that the SSA-optimized VMD-CNN-LSTM model predicts the wind power cluster data with higher accuracy than the comparison models LSTM, CNN-LSTM, and SSA-VMD-LSTM.
Key words: wind power; variational mode decomposition (VMD); convolutional neural network (CNN); long short-term memory (LSTM) network; sparrow search algorithm (SSA)
CLC number: TM743. Document code: A. DOI: 10.19457/j.1001-2095.dqcd24196
Foundation item: National Natural Science Foundation of China (62062068, 61761049)
About the authors: ZHANG Zihua (born 1997), female, master's degree, main research area: new energy power generation forecasting technology, Email: ****************
Corresponding author: LI Yan (born 1977), female, associate professor, main research area: cyber-physical power systems, Email: **************.cn
In recent years, as the energy situation has grown increasingly severe [1-2], new energy has become increasingly active in electricity markets worldwide. The wind power industry has developed particularly fast: wind power construction has evolved from early dispersed, small-scale projects into centralized, large-scale wind power clusters, and the compactness of the resource means that a newly built wind farm is often correlated with the wind farms already operating in the same region [3].
Short-Term Wind Speed Forecasting with ARMA-LSSVM Based on Wavelet Transform
Authors: 赵辉; 李斌; 李彪; 岳有军
Abstract: Accurate forecasting of wind speed at a wind farm can effectively reduce the impact of grid-connected wind power on the grid and improve the competitiveness of wind power in the electricity market. A hybrid short-term wind speed forecasting method is proposed that combines the autoregressive moving average (ARMA) time series model with the least squares support vector machine (LS-SVM) model. The wavelet transform (WT) is used to decompose the historical wind speed series into series with different frequency characteristics. According to the characteristics of the decomposed components, the low-frequency trend component is predicted with the LS-SVM method, the high-frequency fluctuation components are predicted with the ARMA model, and the final forecast is obtained by wavelet reconstruction. A simulation example shows that the overall prediction accuracy differs between methods, and that the hybrid model has the lowest root mean square error, 11.5%, improving on the accuracy of the single forecasting methods.
Journal: 中国电力 (Electric Power)
Year (volume), issue: 2012 (045) 004
Total pages: 4 (pp. 78-81)
Keywords: short-term wind speed forecasting; wavelet transform; time series; least squares support vector machine
Affiliations: Tianjin Key Laboratory of Complex Control Theory and Applications, Tianjin University of Technology, Tianjin 300384 (赵辉, 李斌, 岳有军); Shaanxi Changling Textile Mechanical and Electronic Technology Co., Ltd., Baoji 721013, Shaanxi (李彪)
Language of the text: Chinese
CLC number: TM614
Connecting large-scale wind power to the grid brings new problems for power system operation, in particular for system operation and dispatch [2]; it also benefits wind farms that are about to be connected to the grid when they take part in generation bidding [3-5].
Short-Term Wind Power Forecasting Based on an RBF Neural Network
Authors: 武小梅; 白银明; 文福拴
Abstract: Accurate forecasting of wind power output plays an important role in power system dispatching, power system stability, and wind farm operation. Historical data on wind speed, environmental temperature, and wind power were obtained from an operating wind farm, and a short-term wind power forecasting model based on the radial basis function (RBF) neural network was built. The model was used for hour-ahead forecasting of wind power output, with a prediction error of about 12%. Comparing the forecasts with the actual wind power output shows that the method gives acceptable and stable forecasting results. This work is supported by National Natural Science Foundation of China (No. 70673032).
Journal: 电力系统保护与控制 (Power System Protection and Control)
Year (volume), issue: 2011 (039) 015
Total pages: 4 (pp. 80-83)
Keywords: wind power; power system dispatching; wind farm; RBF neural network; short-term forecasting
Affiliations: School of Electric Power, South China University of Technology, Guangzhou 510640, Guangdong; School of Automation, Guangdong University of Technology, Guangzhou 510006, Guangdong; Chengnan Power Supply Branch, Tianjin Electric Power Company, Tianjin 300210; College of Electrical Engineering, Zhejiang University, Hangzhou 310027, Zhejiang
Language of the text: Chinese
CLC number: TM614
0 Introduction
With global warming and the gradual depletion of fossil fuels, a non-renewable primary energy source, the use of renewable energy has received widespread attention around the world.
Short-Term Wind Speed Forecasting for Wind Farms with a Least Squares Support Vector Machine Optimized by a Genetic Algorithm
Authors: 杨洪; 古世甫; 崔明东; 孙禹
Abstract: Accurate short-term wind speed forecasts for a wind farm provide timely and effective information for the planning, scheduling, operation, and control of grid-connected wind power. The support vector machine is based on the structural risk minimization principle; it fits the data while taking the overall smoothness of the regression curve into account, so it can track the trend of the wind speed promptly. Because the parameters of the support vector machine are difficult to determine, a genetic algorithm is used to optimize the penalty factor C and the kernel parameter σ² of the least squares support vector machine (LS-SVM). In the genetic coding of the parameters, a logarithmic transformation improves the search sensitivity and speeds up model convergence. Finally, using 150 h of continuously measured on-site wind speed samples, the last 12 h are predicted. Compared with the general regression neural network (GRNN), LS-SVM shows better generalization ability, and the mean absolute relative error is 8.32%.
Journal: 电力系统保护与控制 (Power System Protection and Control)
Year (volume), issue: 2011 (039) 011
Total pages: 6 (pp. 44-48, 61)
Keywords: genetic algorithm; support vector machine; parameter optimization; short-term wind speed forecasting
Affiliations: School of Electrical and Information Engineering, Xihua University, Chengdu 610039, Sichuan (杨洪, 古世甫); Yunnan Yilihe Power Plant, China Huadian Corporation, Huize 654200, Yunnan (崔明东, 孙禹)
Language of the text: Chinese
CLC number: TM614
0 Introduction
In recent years, wind power, as a green and environmentally friendly renewable energy source, has been developing steadily toward large scale and industrialization.
An Ultra-Short-Term Wind Power Forecasting Model Based on Wavelet Transform and Time Series
Authors: 苏展; 张志霞; 朴在林; 孙卓; 赵丽华; 王强
Abstract: To improve the accuracy of the ultra-short-term wind power forecasting model, the original wind power time series is decomposed and reconstructed by wavelet transform to obtain the corresponding high-frequency and low-frequency sequences. An autoregressive moving average model is built for each sequence, and the Lagrange multiplier test is used to verify whether a Lagrange multiplier effect is present, so that the corresponding autoregressive conditional heteroskedasticity model or generalized autoregressive conditional heteroskedasticity model can be established. The individual forecasts are combined by linear superposition to give the final result. A case study and a comparison with several other forecasting models show that the model combining wavelet transform and time series effectively improves the accuracy of ultra-short-term wind power forecasting.
Journal: 可再生能源 (Renewable Energy Resources)
Year (volume), issue: 2017 (035) 009
Total pages: 6 (pp. 1381-1386)
Keywords: wavelet transform; autoregressive moving average model; time series; wind power forecasting; ultra-short-term
Affiliations: College of Information and Electrical Engineering, Shenyang Agricultural University, Shenyang 110866, Liaoning (first three authors); State Grid Liaoyang Power Supply Company, Liaoyang 111000, Liaoning; State Grid Chaoyang Power Supply Company, Chaoyang 122000, Liaoning; State Grid Jinzhou Power Supply Company, Jinzhou 121000, Liaoning
Language of the text: Chinese
CLC number: TK81; TM711
With the continued development and use of renewable energy, the development and application of wind energy among clean energy sources has become fairly mature [1].
Short-Term Prediction of Wind Farm Power: A Data Mining Approach
Andrew Kusiak, Member, IEEE, Haiyang Zheng, and Zhe Song, Student Member, IEEE

Abstract: This paper examines time series models for predicting the power of a wind farm at different time scales, i.e., 10-min and hour-long intervals. The time series models are built with data mining algorithms. Five different data mining algorithms have been tested on various wind farm datasets. Two of the five algorithms performed particularly well. The support vector machine regression algorithm provides accurate predictions of wind power and wind speed at 10-min intervals up to 1 h into the future, while the multilayer perceptron algorithm is accurate in predicting power over hour-long intervals up to 4 h ahead. Wind speed can be predicted fairly accurately based on its historical values; however, the power cannot be accurately determined given a power curve model and the predicted wind speed. Test computational results of all time series models and data mining algorithms are discussed. The tests were performed on data generated at a wind farm of 100 turbines. Suggestions for future research are provided.

Index Terms: Data mining algorithms, multiperiod prediction, multiscale prediction, time series model, wind farm power prediction.

I. INTRODUCTION

Wind power generation is rapidly expanding into a large-scale industry. As most wind farms are relatively new, it is natural that their performance has not been adequately studied. Prediction of the power produced by a wind farm at different time scales is of interest to the electricity grid.

A number of different approaches have been applied to forecast wind speed and the power produced by wind farms. Potter and Negnevitsky [6] applied the adaptive neuro-fuzzy inference approach to forecast short-term wind speed and direction.
Barbounis et al. [20] used the nonlinear recursive least-squares method to train a recurrent neural network (NN) based on the meteorological data. Their model has improved the accuracy of long-term wind speed and power forecasting. Damousis et al. [8] developed a fuzzy logic model and trained it with a genetic algorithm. The model was then used to forecast wind speed over horizons ranging from 0.5 to 2 h. Li et al. [19] compared regression and NN models for wind turbine power estimation, and reported that the NN model outperformed the regression model. Sfetsos [3] presented a novel method to forecast the mean hourly wind speed using a time series analysis, and showed that the developed model outperformed the conventional forecasting models. Torres et al. [23] built the autoregressive moving average (ARMA) model based on time series data after transformation and standardization, and predicted mean hourly wind speed for up to 10 h ahead.

Developing prediction models for wind farms is a challenge, as the power is mainly determined by the wind speed that is difficult to forecast accurately. The wind speed depends on parameters such as air pressure, temperature, terrain topography, etc. The stochastic nature of a wind farm environment calls for new modeling approaches to accurately predict the power to be produced in the future time periods.

Data mining is a promising approach to model wind farm performance. Numerous successful applications of data mining in manufacturing, marketing, medical informatics, and the energy industry have been reported in the literature [1], [10], [15], [16]. In this paper, a data mining approach has been applied to build time series models for the prediction of wind farm power over short horizons, e.g., 10–70 min, as well as longer horizons, e.g., 1–4 h. Two different methodologies for power prediction have been employed. The models are built using historical data collected by supervisory control and data acquisition (SCADA) systems installed at a wind farm. A short-term power prediction is important in dispatching power to meet customer needs. For long horizon predictions, meteorological data are usually used.

Manuscript received March 5, 2008; revised May 27, 2008; accepted September 4, 2008. First published January 13, 2009; current version published February 19, 2009. This work was supported by Iowa Energy Center under Grant IEC 07-01. Paper no. TEC-00080-2008.
The authors are with the Intelligent Systems Laboratory, Department of Mechanical and Industrial Engineering, The University of Iowa, Iowa City, IA 52242-1527 USA (e-mail: andrew-kusiak@).
Color versions of one or more of the figures in this paper are available online at .
Digital Object Identifier 10.1109/TEC.2008.2006552

II. BASIC METHODOLOGIES FOR WIND POWER PREDICTION

A. Time Series Prediction Modeling

Time series prediction [5], [24] focuses on determining future events based on known events, measured typically at successive times and spaced at (often uniform) time intervals. The basic time series prediction model is as follows [11]:

$$\hat{y}(t+T) = f\big(y(t),\, y(t-T),\, \ldots,\, y(t-mT)\big) \tag{1}$$

where T is the sampling time (time interval), ŷ(t + T) is the predicted parameter, y(t), y(t − T), ..., y(t − mT) are the current and past observed parameters, and m + 1 is the number of inputs (predictors) to the model.

To obtain an accurate prediction model with a data mining approach, appropriate predictors need to be selected. Data mining offers different algorithms to perform this task. For example, the boosting tree algorithm [13], [14] can be used to select the best predictors, as well as the wrapper approach [26] using the genetic or the best-first search algorithms [9], [12].

To maximize the performance of the prediction model, a boosting tree algorithm was employed to select a set of the most important predictors among {y(t), y(t − T), ..., y(t − mT)}.
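As an illustration of how the inputs and target of model (1) might be laid out before any learning algorithm is applied, the following minimal Python sketch builds the predictor rows [y(t), y(t − T), ..., y(t − mT)] and the targets y(t + T) from an evenly sampled series. The function name `make_lagged` and the list-of-lists layout are assumptions made for this example; they are not taken from the paper.

```python
def make_lagged(series, m):
    """Build predictors and targets for model (1) with m + 1 lagged inputs.

    Each row of X is [y(t), y(t-T), ..., y(t-mT)] (newest value first) and
    the matching entry of y is y(t+T). Consecutive samples are assumed to
    be exactly one sampling interval T apart.
    """
    s = [float(v) for v in series]
    n_rows = len(s) - (m + 1)
    if n_rows <= 0:
        raise ValueError("series is too short for the requested number of lags")
    # Row i uses time index t = i + m; lag k reads the value k steps earlier.
    X = [[s[i + m - k] for k in range(m + 1)] for i in range(n_rows)]
    y = [s[i + m + 1] for i in range(n_rows)]
    return X, y


# Toy series; with m = 2 every row holds three predictors.
X, y = make_lagged([0, 1, 2, 3, 4, 5], m=2)
print(X[0], y[0])  # [2.0, 1.0, 0.0] 3.0
```

The rows of `X` and the entries of `y` can then be passed to whichever regression algorithm is being evaluated.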
Two metrics, i.e., the absolute error (2) and the relative error (3), were used to select the most accurate model (1) extracted with data mining algorithms:

$$\text{Absolute error} = \left|\hat{y}(t+T) - y(t+T)\right| \tag{2}$$

$$\text{Relative error} = \frac{\hat{y}(t+T) - y(t+T)}{y(t+T)} \times 100\% \tag{3}$$

where ŷ(t + T) is the predicted parameter and y(t + T) is the observed (measured) parameter.

0885-8969/$25.00 © 2009 IEEE

B. Data Description

The data used in this research were generated at a wind farm of 100 turbines. The data were collected by a SCADA system installed at each wind turbine. The SCADA system of every wind turbine collects data for more than 120 parameters. Though the data are sampled at high frequency, e.g., 2 s, they are averaged and stored at 10-min intervals (referred to as the 10-min average data). The data used in this research were collected over a period of 1 month for all turbines of the wind farm. The wind speed was shown to be an important predictor of wind farm power in the previous research [25]. The time series prediction models for wind speed and wind farm power are discussed in Sections III and IV.

The wind speed and power recorded at 10-min intervals resulted in 4455 instances (dataset 1 in Table I), starting from "January 1, 2006 1:40 A.M." and continuing up to "January 31, 2006 11:50 P.M." During this time period, the overall wind farm performance was normal. Dataset 1 was divided into two subsets, dataset 2 and dataset 3. Dataset 2 contains 3568 data points and was used to develop a prediction model with data mining algorithms. Dataset 3 includes 887 data points and was used to test the prediction performance of the model learned from dataset 2. For the testing data, the mean and standard deviation of the two statistical measures (2) and (3) are the most important indicators for selecting the data mining algorithms to learn model (1) of Section II-A.

Table I. Dataset description.

C. Feature Selection

Important predictors are determined by the importance index generated by the boosting tree algorithm [13], [14]. The basic idea of the boosting tree algorithm is to build a number of trees (e.g., binary trees) splitting the dataset and approximating the underlying function. The importance of each predictor is measured by its contribution to the prediction accuracy on the training dataset. It is not surprising to observe that the importance of the predictors (past and present values of the model (1)) is ranked in the order of time sequence. The ranking order is as follows:

$$I[y(t)] > I[y(t-T)] > I[y(t-2T)] > \cdots > I[y(t-(m-1)T)] > I[y(t-mT)]$$

where I[·] is the importance (rank) of predictors.

Table II. Rank order of predictors.

In this paper, three different models are considered: a 10-min time series model of wind speed, a 10-min time series model of wind farm power, and a 1-h time series model of mean hourly wind farm power. The performance of the hourly time series model applied to wind speed is rather poor; however, the wind farm power time series model performs well. Therefore, the wind speed results produced by the hourly time series model are not discussed. Note that here the mean hourly power is the average of the wind farm power produced over an hour.

Table II shows the importance of ten predictors computed by the boosting tree algorithm based on the 10-min power data. It is important to select predictors with the highest information content among {y(t), y(t − T), ..., y(t − mT)} to maximize prediction accuracy. A threshold value of 0.75 has been established heuristically to select the predictors for the three models. For the 10-min time series models of wind speed and power, six predictors {y(t), y(t − T), y(t − 2T), y(t − 3T), y(t − 4T), y(t − 5T)} have been selected. For the hourly time series model of power, the selected predictors are {y(t), y(t − T), y(t − 2T), y(t − 3T)}.

The threshold value of 0.75 used in computation has produced good quality results. A lower threshold value leads to more predictors. A large number of predictors could result in inferior performance of extracted models due to "the curse of dimensionality" principle.

D. Multiperiod Predictions

Using the selected predictors, model (1) predicts the values of wind speed and power at future time periods.

Fig. 1(a)–(c) illustrates the concept of a multiperiod prediction with the 10-min time series model. In this model, the sampling time T is 10 min. In Fig. 1(a), using the average measured values at the intervals [t = −60, t = −50), [t = −50, t = −40), ..., [t = −10, t = 0), the average value at the subsequent interval [t = 0, t = 10) is predicted. In Fig. 1(b), based on the average measured values at the intervals [t = −50, t = −40), [t = −40, t = −30), ..., [t = −10, t = 0) and the previously predicted value at [t = 0, t = 10) as predictors, the average value at the subsequent interval [t = 10, t = 20) is predicted. In Fig. 1(c), using the average measured values at the intervals [t = −40, t = −30), ..., [t = −10, t = 0) and the previously predicted values at [t = 0, t = 10) and [t = 10, t = 20) as predictors, the average value at the subsequent interval [t = 20, t = 30) is predicted. Similarly, the average values at the intervals [t = 30, t = 40), ..., [t = 50, t = 60) are predicted.

Fig. 1. Description of the 10-min time series prediction model. (a) 10-min ahead prediction. (b) 20-min ahead prediction. (c) 30-min ahead prediction.
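The rolling scheme of Fig. 1, in which each new prediction is fed back as an input for the next period, can be sketched in a few lines of Python. This is only an illustrative sketch and not the authors' implementation: the names `recursive_forecast`, `persistence`, and `linear_trend` are hypothetical, and the two toy one-step predictors merely stand in for whatever one-step model of the form (1) has been learned.

```python
def recursive_forecast(predict_one, history, n_lags, n_steps):
    """Roll a one-step-ahead model forward over n_steps periods.

    predict_one: callable that takes the window
                 [y(t), y(t-T), ..., y(t-(n_lags-1)T)] (newest value first)
                 and returns an estimate of y(t+T).
    history:     measured values ordered from oldest to newest.
    """
    if len(history) < n_lags:
        raise ValueError("history must contain at least n_lags values")
    # Newest measurement first, matching the predictor order of model (1).
    window = [float(v) for v in history[-n_lags:]][::-1]
    forecasts = []
    for _ in range(n_steps):
        y_hat = float(predict_one(window))
        forecasts.append(y_hat)
        # The prediction becomes the newest input; the oldest value drops out.
        window = [y_hat] + window[:-1]
    return forecasts


def persistence(window):
    """Toy stand-in model: the next value equals the latest value."""
    return window[0]


def linear_trend(window):
    """Toy stand-in model: continue the slope of the last two values."""
    return 2 * window[0] - window[1]


# Six lagged inputs, as selected for the 10-min models, three periods ahead.
print(recursive_forecast(linear_trend, [1, 2, 3, 4, 5, 6], n_lags=6, n_steps=3))
# [7.0, 8.0, 9.0]
```

Because later periods rely on earlier predictions rather than on measurements, any one-step error is carried forward through the window, which is one reason to expect accuracy to degrade as the horizon grows.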
Fig.2(a)–(c)illustrate the concept of a multiperiod predic-tion with the hourly time series model.In this model,the sam-pling time T is an hour.In Fig.2(a),using the mean mea-sured hourly power at the intervals[t=−4,t=−3),[t=−3, t=−2),[t=−2,t=−1),[t=−1,t=0−),the mean hourly power at the subsequent interval[t=0,t=1)is predicted.In Fig.2(b),based on the mean measured hourly power at the inter-vals[t=−3,t=−2),[t=−2,t=−1),[t=−1,t=0)and the previously predicted value at[t=0,t=1−)as predictors, the mean hourly power at the subsequent interval[t=1,t=2) is predicted.In Fig.2(c),using the mean measured hourly power at the intervals[t=−2,t=−1),[t=−1,t=0)and the pre-viously predicted values at[t=0,t=1)and[t=1,t=2−) as predictors,the mean hourly power at the subsequent interval [t=2,t=3)is predicted.Similarly,the mean hourly power at [t=3,t=4)is predicted.E.Integrated k Nearest Neighbor(kNN)and Time Series Pre-diction ModelBased on the time series prediction model,two ways to predict the short-term wind farm power are proposed.One is to directly use the power values measured in the past to predict the future power.The other is to use the wind speed measured in the past to predict the future wind speedfirst,and then use the predicted wind speed to predict the wind farm power.The basic equation of wind power density[2]is shown inP w=0.5ρv3(4) where P w is the power density(watts per square meter),ρis the air density(in kilogram per cubic centimeter),and v is the horizontal component of the mean freestream wind velocity (meter per second).As the hub of the turbine is usually located60–80m above the ground,the air densityρis frequently considered constant at that height.Though wind direction changes,the yaw position is controlled to face the wind to capture the maximum energy from the wind.Therefore,the wind speed is a significant predictor of the wind farm power.Knowing the wind speed,power produced by a wind farm is usually computed using the power curve function provided128IEEE 
TRANSACTIONS ON ENERGY CONVERSION,VOL.24,NO.1,MARCH2009Fig.2.Description of the hourly time series prediction model.(a)1-h ahead prediction.(b)2-h ahead prediction.(c)3-h aheadprediction.Fig.3.Structure of the integrated prediction model.by the turbine manufacturer.For wind farms containing diverse type turbines,different power curves are used.Factors such as turbine location,wind conditions,operations,and control make the actual power curve different from the one offered by the turbine ing the manufacturer’s power curve function to compute the wind farm power output for known wind speed leads to significant errors due to the difference between the actual power curve and the one provided by the puting wind farm power,based on the current wind speed,has been discussed in the literature [25].It has been shown that the kNN model [9]accurately determines wind farm power,given the wind speed collected at the corresponding 10-min time interval.In this paper,the time series model pre-dicting the wind speed and the kNN model are combined to predict wind farm power as shown in Fig.3.In Fig.3,the future wind speed is predicted with the time series model.Then,the predicted wind speed is used as an input of the kNN model to compute the wind farm power.III.W IND S PEED T IME S ERIES P REDICTIONA.Algorithm Selection for the 10-min Time Series Prediction ModelThe important predictors {y (t ),y (t −T ),y (t −2T ),y (t −3T ),y (t −4T ),y (t −5T )}of the time series model for wind speed have been selected in Section II-C.The relative error (2)and absolute error (3)have been used to select the most suitable algorithm for building the time series model (1).Five data mining algorithms that appeared to be the most promising have been used to construct the 10-min time seriesKUSIAK et al.:SHORT-TERM PREDICTION OF WIND FARM POWER:A DATA MINING APPROACH 129TABLE IIIE RROR S TATISTICS OF D IFFERENT M ODELS B ASED ON D ATASET 3OF T ABLEIFig.4.Predicted and observed wind speed for the first 
150data points from dataset 3of Table I.prediction model (1)for wind speed.They include:the sup-port vector machine regression (SVMreg)algorithm [4],[18],multilayer perceptron (MLP)algorithm [9],[17],M5P tree algo-rithm [7],[9],[27],Reduced Error Pruning (REP)tree (decision or regression tree)[9],[28],and the bagging tree [9],[21],[22].Table III summarizes the model’s prediction accuracy based on the dataset 3of Table I.The SVMreg algorithm outperformed the other four algorithms.The MLP and REP tree algorithms performed the worst.The SVMreg algorithm was finally se-lected for building the 10-min time series wind speed prediction model.Fig.4shows the first 150observed and predicted wind speeds from dataset 3of Table I (the 10-min ahead prediction).It is obvious that the observed and predicted wind speeds are almost identical.B.Multiperiod Prediction With 10-min Time Series Model Based on the approach described in Section II-D,the time se-ries model built by the SVMreg algorithm is used for multiperiod predictions.The test dataset for the 10-min ahead predictions containing 887data points is reduced by one,when the predic-tion period moves forward by one step.Figs.5(a)–(e)illustrates the first 150observed and predicted wind speeds at 20-,30-,40-,50-,and 60-min ahead periods,respectively.Tables IV and V show the absolute and relative errors for each of the one-to six-period ahead predictions.The standard deviation,mean,and maximum of the absolute and relative errors all increase as the prediction horizon increases.However,the minimum error does not substantially increase,which could be due to the relative stability of the wind.Persistent forecasting with traditional methods has been stud-ied in the literature [5],[29].The time series model built by datamining algorithms has enhanced the prediction accuracy by at least 20%.Further improvements can likely be made as data mining offers a variety of algorithms.IV .W IND F ARM P OWER T IME S ERIES P REDICTIONA.Algorithm 
Selection for the 10-min Time Series Prediction ModelThe total power of the wind farm analyzed in this section was scaled to the interval [0,100MW].Five of the most promising data mining algorithms (the same as in Section III-A)were applied to extract the 10-min time series prediction model of wind power.Table VI summarizes the prediction accuracy based on the dataset 3of Table I.The SVMreg algorithm outperformed the other four algorithms.The MLP and REP tree algorithms performed the worst.Therefore,the SVMreg algorithm was finally selected to build the 10-min time series prediction model (1)of wind farm power.Fig.6shows the first 200observed and predicted (10-min ahead)wind power values from dataset 3of Table I.It is easy to see that both the observed and predicted wind farm power values are almost identical.B.Multiperiod Power Prediction With the 10-min Time Series ModelIn this section,the time series model learned by the SVMreg algorithm is used to predict the wind farm power over 10-min intervals,10–60min ahead into the future.The test data for the 10-min prediction used in Section IV-A containing 887points is reduced by one for each of the next 10-min period predictions.Fig.7(a)–(e)shows the first 200observed and predicted wind farm power values over 20-,30-,40-,50-,and 60-min future time intervals,respectively.Tables VII and VIII summarize the statistics of absolute and relative errors of the future power prediction over six differ-ent 10-min intervals.The mean,the standard deviation,and the maximum error all increase when the prediction horizon in-creases.However,the minimum error remains relatively stable.The relative errors reported in Table VIII are for power when the wind farm power is greater than 7MW (7%of the maximum power),as the relative error at the low power output is usually large,while the absolute error remains small.Note that the pre-diction accuracy with smaller relative error is more meaningful when the power generated is greater than 
7MW.C.Algorithm Selection for the Hourly Time Series Prediction ModelThe important predictors of the hourly time series model of wind farm power that were selected by the boosting tree al-gorithm are {y (t ),y (t −T ),y (t −2T ),y (t −3T )}.The sam-pling time for this model is an hour,and thus all variables need to be at the same time scale.As the original dataset 1in Table I contains 10-min average data,every six consecutive data points are aggregated into a mean hourly power value (the average of six measured power values).Therefore,the original 4455dataset has been aggregated into a dataset with 742data points130IEEE TRANSACTIONS ON ENERGY CONVERSION,VOL.24,NO.1,MARCH2009Fig.5.Observed and predicted wind speeds at different periods for the first 150data points from dataset 3of Table I.(a)20-min ahead prediction.(b)30-min ahead prediction.(c)40-min ahead prediction.(d)50-min ahead prediction.(e)60-min ahead prediction.TABLE IVA BSOLUTE E RROR S TATISTICS FOR 10–60-min A HEAD PREDICTIONSdescribed in Table IX.This dataset begins from “January 1,20061:30A .M .”and continues to “January 31,200611:30P .M .”All the time stamps in Table IX denote the hourly intervals,e.g.,TABLE VR ELATIVE E RROR S TATISTICS FOR 10–60-min A HEAD PREDICTIONS“January 1,20065:30A .M .”represents the mean hourly power during the hourly interval from “January 1,20064:30A .M .”to “January 1,20065:30A .M .”Dataset 1of Table I was dividedKUSIAK et al.:SHORT-TERM PREDICTION OF WIND FARM POWER:A DATA MINING APPROACH 131TABLE VIE RROR S TATISTICS OF F IVE D IFFERENT A LGORITHMS B ASED ON D ATASET3OF T ABLEIFig.6.Observed and predicted wind power for the first 200data points from dataset 3of Table I.into two datasets,dataset 2and dataset 3.Dataset 2contains 593data points and is used to develop a prediction model with data mining algorithms.Dataset 3includes 149data points and it tests the prediction performance of the model learned from dataset 2.Five of the most promising data mining algorithms 
(the same used previously)were selected to construct the hourly time se-ries model.Table X summarizes the prediction accuracy of dif-ferent algorithms based on dataset 3of Table IX.Here,the MLP algorithm outperformed the other four algorithms.The REP algorithm performed the worst.Based on its performance,the MLP algorithm was selected to build the hourly time series model (1)for the mean hourly wind farm power prediction.Fig.8shows the observed and predicted wind power for dataset 3of Table IX.This is an hour ahead prediction,and the predicted power closely trails the observed power.D.Four-Period Forward Prediction of the Hourly Time Series ModelBased on the approach discussed in Section II-D,multiperiod predictions were tested.The accuracy of the time series model decreases as the time horizon extends.The test data of Table IX for hourly predictions contains 149data points,and it will be reduced by one data point for each subsequent prediction in-terval.Fig.9(a)–(c)illustrates the observed and predicted mean hourly wind farm power over 2-,3-,and 4-h ahead time intervals,respectively.Tables XI and XII show the absolute and relative error statis-tics for the mean hourly power prediction over four different hourly intervals.The mean,the standard deviation,and the max-imum error all increase as the prediction horizon lengthens.The relative error here is computed when the mean hourly power is greater than 7MW (7%of the maximum power).V .I NTEGRATED M ODEL FOR W IND P OWER P REDICTION A.kNN Model for Wind Power PredictionIntegrating the kNN model with the wind speed time series model for power prediction has been inspired by the wind in-dustry practice.The prevailing approach to the wind farm power prediction is to forecast the wind speed first and use it to compute power based on a predefined power curve function.The kNN is a machine learning algorithm predicting unknown value for an instance (here power)using the data supporting that instance.The predicted value is 
associated with the majority votes of these k neighbors. Euclidean distance is often used to measure the closeness of the data points. In this paper, the kNN algorithm predicts wind farm power based on the wind speed. The basic steps of the kNN algorithm are as follows.

1) Represent each instance in a multidimensional space.
2) Divide the entire dataset into training and test datasets.
3) Given a test instance, a distance metric is computed between the test instance and all training instances, then the k nearest neighbor instances are selected from the training data.
4) Compute the distance-weighted (or nonweighted) average of the output of the k nearest neighbor instances selected from the training data. This average becomes the predicted value for the test (unknown) instance.

Previous research [9], [25] has shown that the kNN model is quite accurate for computing wind farm power given the wind speed as input. Using the measured wind speed, the wind farm power is computed fairly accurately when the wind farm is operating under normal conditions. The normal conditions exclude states where the wind speed is too low or too high, turbines undergo maintenance, and the turbine power output is low due to other issues. When the wind speed is below the cut-in speed, the power output of a wind turbine is about zero. When the wind speed is above the rated speed, the power output of a wind turbine is almost constant. Removing the corresponding data points allows the kNN model to identify the relationship between wind speed and power output.

To build a kNN model, the original dataset of Table I has been preprocessed into the format shown in Table XIII. Dataset 1 was divided into two data subsets, dataset 2 and dataset 3. Dataset 2 contains 3476 data points and is used to develop a prediction model with the kNN algorithm. Dataset 3 of 871 data points was used to test the prediction performance of the model learned from dataset 2. Table XIV shows the error statistics of the kNN model for dataset 3 of Table XIII.

B. Comparison of the
Integrated Model and the Time Series Models

The computational experience reported in the previous sections showed that the kNN algorithm provided accurate power predictions. It is desirable to make sure that the wind speed predictions are dependable. As presented in Section II-E, the kNN model and the 10-min time series wind speed prediction model have been integrated to predict future power. The 10-min time series model for the wind speed prediction was discussed in Section III. The test data set for the 10-min ahead prediction of 871 data points will be reduced by one for each future prediction interval. Fig. 10(a)–(f) shows the first 200 observed and predicted wind farm power over future 10-, 20-, 30-, 40-, 50-, and 60-min time intervals, respectively.

Tables XV and XVI show the absolute and relative error statistics of the integrated prediction model. Like the time series prediction model, the performance of the integrated model decreases as the prediction horizon increases. The mean, the standard deviation, and the maximum error all increase in time. However, the minimum error remains relatively stable.

Fig. 7. Observed and predicted wind farm power at different periods for the first 200 data points from dataset 3 of Table I. (a) 20-min ahead prediction. (b) 30-min ahead prediction. (c) 40-min ahead prediction. (d) 50-min ahead prediction. (e) 60-min ahead prediction.
TABLE VII. ABSOLUTE ERROR STATISTICS FOR 10–60-min AHEAD PREDICTIONS
TABLE VIII. RELATIVE ERROR STATISTICS FOR 10–60-min AHEAD PREDICTIONS
TABLE IX. DESCRIPTION OF THE HOURLY DATASET
TABLE X. PREDICTION ACCURACY OF FIVE DIFFERENT ALGORITHMS BASED ON DATASET 3 OF TABLE IX
Fig. 8. Observed and predicted hourly power of dataset 3 from Table IX.

The computational results reported in this paper have shown that the 10-min time series model for wind power prediction outperforms the integration
model. Though the kNN model and the 10-min wind speed time series prediction model perform well individually, the integrated model produces a larger error when predicting future power. This could be due to the fact that the power is a cubic function of the wind speed, as indicated by the wind power density function (4) of Section II-E. Additionally, the wind speed in the kNN model is too sensitive as a predictor for wind farm power, and thus it might lead to a worse prediction for the integration model. The integration of the two models did not improve prediction accuracy.

Fig. 9. Observed and predicted hourly power at different future intervals. (a) 2-h ahead prediction. (b) 3-h ahead prediction. (c) 4-h ahead prediction.
TABLE XI. ABSOLUTE ERROR STATISTICS FOR DIFFERENT HOURLY INTERVALS
Fig. 10. Observed and predicted power at different future intervals. (a) 10-min ahead prediction. (b) 20-min ahead prediction. (c) 30-min ahead prediction. (d) 40-min ahead prediction. (e) 50-min ahead prediction. (f) 60-min ahead prediction.
TABLE XII. RELATIVE ERROR STATISTICS FOR DIFFERENT HOURLY INTERVALS
TABLE XIII. DATASET DESCRIPTION
TABLE XIV. ERROR STATISTICS OF THE kNN MODEL
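
The kNN steps listed in Section V-A (compute distances from a test instance to all training instances, select the k nearest, and average their outputs) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `knn_predict`, the inverse-distance weighting, and the four synthetic (wind speed, power) pairs are assumptions made for illustration only; the paper does not state which weighting scheme or which value of k was used.

```python
def knn_predict(train_speed, train_power, query_speed, k=3, weighted=True):
    """Predict power at query_speed from the k nearest training wind speeds.

    Hypothetical helper: the name, defaults, and weighting are illustrative.
    """
    if len(train_speed) != len(train_power):
        raise ValueError("train_speed and train_power must have equal length")
    if not 1 <= k <= len(train_speed):
        raise ValueError("k must be between 1 and the number of training points")

    # Step 3: distance from the test instance to every training instance.
    # With a single predictor (wind speed), Euclidean distance is |a - b|.
    neighbors = sorted(
        (abs(s - query_speed), p) for s, p in zip(train_speed, train_power)
    )[:k]

    # Step 4 (nonweighted variant): plain average of the k neighbor outputs.
    if not weighted:
        return sum(p for _, p in neighbors) / k

    # An exact match has zero distance; average such matches directly
    # so that the inverse-distance weights below never divide by zero.
    exact = [p for d, p in neighbors if d == 0.0]
    if exact:
        return sum(exact) / len(exact)

    # Step 4 (weighted variant): inverse-distance weighting, assumed here.
    weights = [1.0 / d for d, _ in neighbors]
    return sum(w * p for w, (_, p) in zip(weights, neighbors)) / sum(weights)


# Synthetic training data (assumed values, not taken from the paper):
# wind speed in m/s and wind farm power in MW.
train_speed = [4.0, 6.0, 8.0, 10.0]
train_power = [10.0, 30.0, 60.0, 90.0]

estimate = knn_predict(train_speed, train_power, query_speed=6.5, k=2)
print(f"Estimated power at 6.5 m/s: {estimate:.1f} MW")
```

In the integrated model described above, `query_speed` would be the wind speed forecast by the 10-min time series model rather than a measured value, so any wind speed forecast error passes directly into the power estimate.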