
Applied Soft Computing 23 (2014) 521–529
doi: 10.1016/j.asoc.2014.05.033


Efficiency enhancement of a process-based rainfall–runoff model using a new modified AdaBoost.RT technique

Shuang Liu a, Jingwen Xu a,*, Junfang Zhao b, Xingmei Xie a, Wanchang Zhang c

a College of Resources and Environment, Sichuan Agricultural University, Chengdu 611130, PR China
b Chinese Academy of Meteorological Sciences, Beijing 100081, PR China
c Key Laboratory of Digital Earth Science, Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100094, PR China

* Corresponding author. Tel.: +86 182********; fax: +86 028 82690983. E-mail address: x.j.w@… (J. Xu).

Article info

Article history:

Received 3 July 2013
Received in revised form 5 May 2014
Accepted 23 May 2014
Available online 5 June 2014

Keywords:

AdaBoost.RT algorithm

Particle swarm optimization
Process-based hydrologic model

Abstract

High-efficiency rainfall–runoff forecasting is extremely important for flood disaster warning. A single process-based rainfall–runoff model can hardly capture all the runoff characteristics, especially for flood periods and dry periods. To address this issue, an effective multi-model ensemble approach is urgently required. The Adaptive Boosting (AdaBoost) algorithm is one of the most robust ensemble learning methods. However, it has never been utilized for the efficiency improvement of process-based rainfall–runoff models.

Therefore the AdaBoost.RT (Adaptive Boosting for Regression problems, with "T" for a threshold demarcating the correct from the incorrect) algorithm is innovatively proposed to build an aggregation (AdaBoost-XXT) of a process-based rainfall–runoff model called XXT (a hybrid of TOPMODEL and the Xinanjiang model). To adapt to the hydrologic situation, some modifications were made to AdaBoost.RT. Firstly, the weights of wrongly predicted examples are increased rather than kept unchanged, so that those "hard" samples can be highlighted. Then the stationary threshold demarcating the correct from the incorrect was replaced with the dynamic mean value of the absolute errors. In addition, two other minor modifications were also made. Particle swarm optimization (PSO) was then employed to determine the model parameters. Finally, the applicability of AdaBoost-XXT was tested in Linyi watershed, a large-scale and semi-arid area, and in Youshuijie catchment, a small-scale area with a humid climate. The results show that the modified AdaBoost.RT algorithm significantly improves the performance of XXT in daily runoff prediction, especially for the large-scale watershed or low runoff periods, in terms of Nash–Sutcliffe efficiency coefficients and coefficients of determination. Furthermore, AdaBoost-XXT has more satisfactory generalization ability in processing input data, especially in Linyi watershed. Thus the method of using this modified AdaBoost.RT to enhance model performance is promising and easily extended to other process-based rainfall–runoff models.

© 2014 Elsevier B.V. All rights reserved.

Introduction

For daily rainfall–runoff modeling and sustainable water resources management, the efficiency of hydrologic models plays a decisive role. Hence, in order to meet ever higher demands for precise prediction, many hydrologists have employed various methods to improve traditional rainfall–runoff models for better accuracy [1–6].

However, practical experience with model calibration suggests that no single-objective function is adequate to match the important characteristics of the observed data [7]. Specifically speaking, a single model, even a quite good distributed model, may well capture the spatial characteristics of a catchment but cannot successfully depict the runoff variation during different periods. Krishnamurti et al. [8] assigned weights to each model of their superensemble, whose error for the monthly mean fields was shown to be quite small compared with all other individual models for seasonal climate simulations using the Atmospheric Model Intercomparison Project (AMIP) dataset. Moreover, Zhang [9] has indicated that no single model can provide the best forecast result for all kinds of time series data. Therefore, combining multiple forecasts effectively is regarded as an effective way to reduce prediction errors and hence provides considerably increased accuracy [10]. Without doubt, a deterministic rainfall–runoff model is likewise not sufficiently adapted to all natural conditions over the whole time series, such as the rainy season and the dry season. Therefore an ensemble method may be a useful and promising technique for improving the accuracy of process-based rainfall–runoff models.

The Adaptive Boosting (AdaBoost) algorithm, which reweights samples according to their errors, is one of the most robust ensemble learning methods and has been frequently studied in past years [11–14]. It derived from the concept of ensemble learning and was originally designed for classification problems. However, by using different loss criteria, boosting copes not only with classification problems but also with regression problems [15]. Avnimelech and Intrator [16] developed a threshold-based boosting algorithm for regression which used an analogy between classification errors and big errors in regression. Zemel [17] proposed an analogous formulation for adaptive boosting of regression problems, utilizing a novel objective function that leads to a simple boosting algorithm, and obtained better results than general methods. Similarly, it has been shown that AdaBoost outperforms other boosting methods, bagging, artificial neural networks, and a single M5 model tree for regression problems [18]. Alfaro et al. [19] compared AdaBoost and neural networks for bankruptcy forecasting, and their results show that their method can decrease the generalization error by about thirty percent with respect to the error produced with a neural network. Sun et al. [20] constructed AdaBoost ensembles with a single attribute test (SAT) and with a decision tree (DT) for financial distress prediction. The experiments indicate that the AdaBoost ensemble with SAT outperforms the AdaBoost ensemble with DT, a single DT classifier and a single support vector machine classifier. Moreover, Wang and Yu [21] concluded that the BP-AdaBoost.RT model is better than multiple linear regression and a single BP (back propagation) network. It can be clearly seen that different modifications of the boosting algorithm suit different regression problems. However, few reports of boosting methods are related to hydrologic modeling. Recently, Brochero [22] adopted AdaBoost.RT (Adaptive Boosting for Regression problems, with "T" for the threshold "φ" demarcating the correct from the incorrect) [23] to build a stacked ANN model for streamflow forecasting exploiting radar rainfall estimates. One of the conclusions suggests that the stacked ANN response performs better than the best single ANN.

Nevertheless, the combination of the boosting approach and a process-based hydrologic model seems never to have been reported. Hence, a novel approach of using the AdaBoost method to enhance the efficiency of a process-based daily rainfall–runoff model is first proposed in this paper. In addition, based on the soil moisture storage capacity distribution curve (SMSCC) of the Xinanjiang model and the simple model structure of TOPMODEL, a rainfall–runoff model named XXT (X for Xinanjiang model and T for TOPMODEL) was developed by Xu et al. [24]. The relevant results make it clear that the XXT model outperforms TOPMODEL and the Xinanjiang model for watershed rainfall–runoff modeling, so XXT can be regarded as a very worthy process-based rainfall–runoff model. We have therefore employed the XXT model as the base learner of the AdaBoost algorithm. Then, we have used a new modified AdaBoost.RT algorithm to build an ensemble of base learners. Finally, we have applied the AdaBoost-XXT model and the well-trained single XXT in two watersheds of different climate types and scales.

Methods

Brief introduction of XXT model

The XXT model is a hybrid of the Xinanjiang model and TOPMODEL. The first letter "X" in XXT denotes the Xinanjiang model, the second letter "X" represents hybrid because the sign "X" means hybrid in agronomy, and the last letter "T" indicates TOPMODEL. The vertical structure of the model presented in this paper includes the interception zone (including the vegetation layer and root zone of the soil), the unsaturated zone, and the saturated zone. Rainfall is first received by the interception zone, which accounts for vegetation interception, depression storage, and initial soil moisture storage. The interception zone has a maximum moisture storage value (SRmax), which must be filled before infiltration takes place. Evaporation is allowed from this zone at the estimated potential rate until there is no moisture in it. The excess of interception zone storage goes into the unsaturated zone. The unsaturated zone has a non-uniform horizontal distribution of soil moisture storage capacity. The distribution is depicted by a new soil moisture storage capacity distribution curve, which is derived from the Xinanjiang model [25]. Rainfall directly falling on the saturated zone immediately becomes surface flow, while water infiltrated into the saturated zone from the unsaturated zone immediately becomes subsurface flow. The concrete derivation of the formulas and the calculation method [24] is omitted in this paper. However, different from the original version of XXT, in order to follow the terrain law in estimating the possible maximum discharge of subsurface flow [26], we retained the input of topographic index data. The routing approach of the XXT model relies on the isochrones method [27] based on a DEM (digital elevation model). It is essentially a time–area routing method, in which the travel time in a watershed is divided into equal intervals. At each time interval, the area within the watershed boundaries and the specific distance increment contributes to the runoff at the watershed outlet. The partial runoff at the watershed outlet from each sub-area equals the total runoff (the sum of surface and subsurface runoff) times the area of the contributing portion of the watershed. Summing the partial runoff of all contributing areas at each time step gives the total runoff at the watershed outlet for each time step of the hydrograph [28].
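To make the time–area routing idea concrete, the following minimal Python sketch convolves the total runoff generated at each time step with a time–area histogram (the fraction of watershed area in each travel-time band derived from the DEM isochrones). The function name and example numbers are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def time_area_routing(total_runoff, area_fractions):
    """Route generated runoff to the outlet with a time-area (isochrone) histogram.

    total_runoff   : 1-D array, surface + subsurface runoff generated per time step
    area_fractions : 1-D array, fraction of watershed area in each travel-time band
                     (derived from DEM isochrones; should sum to 1)
    Returns the routed hydrograph at the watershed outlet.
    """
    n_steps = len(total_runoff)
    n_bands = len(area_fractions)
    outlet = np.zeros(n_steps + n_bands - 1)
    for t, q in enumerate(total_runoff):
        # runoff generated at step t reaches the outlet over the next n_bands steps,
        # weighted by the contributing area of each travel-time band
        outlet[t:t + n_bands] += q * area_fractions
    return outlet[:n_steps]

# Example with assumed values: 40% of the area is within one time step of the outlet, etc.
runoff = np.array([0.0, 2.0, 5.0, 1.0, 0.0, 0.0])
fractions = np.array([0.4, 0.35, 0.25])
print(time_area_routing(runoff, fractions))
```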

In this study, XXT adopts 6 main parameters: SZM (maximum storage capacity of the unsaturated zone), SRmax, ln(T0) (natural logarithm of the soil moisture conductivity at saturation), Chv (valid convergence speed of the slope), WM (the soil moisture deficit depth for a watershed when it is very dry) and B (the areal heterogeneity of soil moisture storage capacity). The non-uniformity of the soil moisture storage capacity distribution increases as B grows. All of these parameters were determined by the PSO algorithm, which is introduced in the following section.

Particle swarm optimization technique

Particle swarm optimization was first proposed to solve continuous optimization problems by Kennedy [29]. In recent years, this algorithm has been widely adopted for optimization [30–33]. PSO imitates the hunting of a swarm of birds, consisting of a number of individuals refining their knowledge of the given search space. The individuals of PSO have a position and a velocity and are denoted as particles. This study adopted the modified PSO version introduced by Shi [34], which added the inertia weight (w). The velocity and position of each particle are then updated according to the following equations.

V_i = w × V_i + c1 × rand() × (pbest_i − x_i) + c2 × rand() × (gbest − x_i)   (1)

x_i = x_i + V_i   (2)

where x_i represents the position of the particle; w is the inertia weight; pbest_i indicates the present best position found by the particle; gbest is the global best found by the whole swarm; rand() is randomly generated at each step; and c1 and c2 are positive constant parameters called acceleration coefficients. The position of each particle is updated at each iteration by adding the velocity vector V_i to the position vector x_i.

In this paper, in order to ensure a fair comparison between the XXT and AdaBoost-XXT models, the same PSO algorithm is used on the training set. Simply speaking, during the training process XXT and AdaBoost-XXT select their best parameters by PSO in preparation for testing. The numbers of particles and iterations are respectively 10 and 500. The acceleration coefficients c1 and c2 are set to 2 and the inertia weight is 0.5. Unlike AdaBoost used for classification problems, the ability of boosting for regression is limited by some factors, such as the parameter ranges and the number of weak learners, which is similar to the work of Shrestha and Solomatine [18]. We therefore analyzed the relationship between the efficiency coefficients and the number of base predictors, and finally determined the relatively satisfactory number "t" of weak predictors in the section 'Calibration of XXT model and AdaBoost-XXT model'.
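A minimal Python sketch of the PSO update rules in Eqs. (1) and (2), with the settings stated above (10 particles, 500 iterations, c1 = c2 = 2, w = 0.5), is given below. The objective in the usage example is only a stand-in; in this study the objective would be the rainfall–runoff model error over the calibration period, and all names here are assumptions.

```python
import numpy as np

def pso_minimize(objective, bounds, n_particles=10, n_iter=500,
                 w=0.5, c1=2.0, c2=2.0, seed=0):
    """Particle swarm optimization following Eqs. (1) and (2)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T        # bounds: sequence of (low, high) pairs
    dim = len(lo)
    x = rng.uniform(lo, hi, size=(n_particles, dim))  # particle positions
    v = np.zeros_like(x)                              # particle velocities
    pbest = x.copy()                                  # personal best positions
    pbest_val = np.array([objective(p) for p in x])
    gbest = pbest[np.argmin(pbest_val)].copy()        # global best position
    gbest_val = float(pbest_val.min())

    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)  # Eq. (1)
        x = np.clip(x + v, lo, hi)                                 # Eq. (2), kept inside bounds
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        if vals.min() < gbest_val:
            gbest = x[vals.argmin()].copy()
            gbest_val = float(vals.min())
    return gbest, gbest_val

# Toy usage: a 6-dimensional sphere function standing in for the calibration
# objective over the 6 XXT parameters (SZM, SRmax, ln(T0), Chv, WM, B).
best, val = pso_minimize(lambda p: float(np.sum(p ** 2)), bounds=[(-5, 5)] * 6)
print(best, val)
```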

Modified AdaBoost algorithm for time series forecasting

As mentioned before, a single rainfall–runoff model can hardly capture all the runoff characteristics, especially during flood and dry periods. Thus, we need to use weighted samples to focus learning on the most difficult hydrologic features and to combine predictors with weighted votes to build a sturdy predictor. AdaBoost is an iterative algorithm for constructing a "strong" classifier as a linear combination of weak learners. It adopts weighted samples to focus learning on the "hardest" examples and combines classifiers with weighted votes. At the beginning, boosting schemes were designed for binary classification problems, then extended to the multi-class case by the versions called AdaBoost.M1 and AdaBoost.M2. Afterwards, Freund and Schapire [35] extended AdaBoost.M2 to regression problems. The boosting scheme in this study is derived from the AdaBoost.RT algorithm of Solomatine and Shrestha [23]. We have retained most of the mathematics of the original version; the specific derivation and proof of the primary algorithm is omitted in this paper. However, several modifications were made: (a) we adopted the absolute error instead of the absolute relative error after trial and error; (b) we replaced the original loss function with a weighted loss function for selecting the proper best weak learners; (c) we replaced the threshold "φ" with the mean value of the absolute errors to demarcate the correct from the incorrect, because "φ" is difficult to decide in advance and would cause additional complexity [18]; and (d) the weights of wrongly predicted examples are increased at each epoch rather than kept unchanged, so that those "hard" samples can be highlighted. The concrete procedure of the new modified AdaBoost.RT algorithm is as follows:

(1) Input:
• Training dataset: (x1, y1), (x2, y2), ..., (xn, yn) with y ∈ R;
• Base learner: every iterative step produces a hypothesis f(x) whose accuracy is judged by the weighted error εt.

(2) Initialize the distribution of weights:

p1(x_i) = p_i^1 = w_i^1 = 1/n   (3)

(3) Iteration over t steps:
• Call the base learners, and then find the best weak learner that minimizes εt under the distribution pt(x_i):

εt = Σ_i p_i^t × |f_t(x_i) − y_i|   (4)

• Calculate the mean value of the absolute errors over the whole training data set:

r = Σ_i |f_t(x_i) − y_i| / n   (5)

• Calculate the error rate over the samples with |f_t(x_i) − y_i| > r:

e = Σ_{i: |f_t(x_i) − y_i| > r} p_i^t   (6)

• Set

β_t = e^2   (7)

• Update the training distribution and normalize:

w_i^{t+1} = w_i^t × β_t, if |f_t(x_i) − y_i| ≤ r;  w_i^t × (1/β_t), otherwise   (8)

p_i^{t+1} = w_i^{t+1} / Σ_{j=1}^{n} w_j^{t+1}   (9)

(4) Construct the strong regression model y:

y = [Σ_t ln(1/β_t) × f_t(x)] / [Σ_t ln(1/β_t)]   (10)
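The following Python sketch mirrors steps (1)–(4) for a generic base learner that accepts sample weights; a regression tree is used purely for illustration. It is one interpretation of the modified AdaBoost.RT procedure, not the authors' code, which couples the boosting loop with PSO-calibrated XXT models.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def adaboost_rt_modified(X, y, n_rounds=7,
                         make_learner=lambda: DecisionTreeRegressor(max_depth=3)):
    """Modified AdaBoost.RT: dynamic threshold r (mean absolute error), weighted
    loss for learner selection, and increased weights for the 'hard' samples."""
    n = len(y)
    p = np.full(n, 1.0 / n)                 # step (2): initial sample distribution
    learners, betas = [], []

    for _ in range(n_rounds):
        f = make_learner()
        f.fit(X, y, sample_weight=p)        # base learner trained on current distribution
        abs_err = np.abs(f.predict(X) - y)
        r = abs_err.mean()                  # Eq. (5): dynamic threshold
        e = p[abs_err > r].sum()            # Eq. (6): weighted error rate
        e = min(max(e, 1e-10), 1 - 1e-10)   # keep beta well defined
        beta = e ** 2                       # Eq. (7)
        w = np.where(abs_err <= r, p * beta, p / beta)  # Eq. (8): boost the 'hard' samples
        p = w / w.sum()                     # Eq. (9): normalize
        learners.append(f)
        betas.append(beta)

    weights = np.log(1.0 / np.array(betas))
    def predict(X_new):
        preds = np.array([f.predict(X_new) for f in learners])
        return weights @ preds / weights.sum()          # Eq. (10): weighted combination
    return predict

# Toy usage on synthetic data (illustrative only)
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
model = adaboost_rt_modified(X, y)
print(model(X[:5]))
```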

Building the AdaBoost-XXT model

As briefly described before, the XXT model is a simple and qualified process-based rainfall–runoff model for hydrologic modeling. Taking the advantages of the proposed ensemble algorithm into account, we have utilized it to build an ensemble of individual XXT models. Therefore, each XXT model is regarded as a base learner, or weak predictor, in the present study. For some regression problems, the AdaBoost algorithm places quite high demands on the predictor's ability. To find the best learner for each iteration "t", the PSO approach has been utilized to seek the optimal model parameters. Finally, a new robust predictor, namely AdaBoost-XXT, is obtained by combining the base learners with the weight distribution of Eq. (10). The whole roadmap is shown in Fig. 1.
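As a sketch of how one boosting round in Fig. 1 could couple PSO with the base model, the helper below searches the XXT parameter space for the set that minimizes the weighted absolute error εt under the current sample distribution. The xxt_simulate function is only a placeholder for the real XXT model, pso_minimize refers to the PSO sketch given earlier, and every name here is an assumption rather than the authors' code.

```python
import numpy as np

def xxt_simulate(params, forcing):
    """Placeholder for the XXT rainfall-runoff model: maps a 6-element parameter
    vector (SZM, SRmax, ln(T0), Chv, WM, B) and forcing data to a runoff series.
    Replace with the real model; the response below is purely illustrative."""
    rain = forcing["rain"]
    return params[0] * 1e-3 * np.convolve(rain, np.ones(3) / 3, mode="same")

def fit_weak_xxt(forcing, q_obs, p, bounds, pso_minimize):
    """One boosting round: use PSO to find the XXT parameter set minimizing the
    weighted absolute error eps_t = sum_i p_i * |f(x_i) - y_i| under the current
    sample distribution p (cf. Eq. (4))."""
    def weighted_loss(params):
        q_sim = xxt_simulate(params, forcing)
        return float(np.sum(p * np.abs(q_sim - q_obs)))
    best_params, _ = pso_minimize(weighted_loss, bounds)
    return best_params
```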

Experimental setup

Study areas and data

Case 1

Linyi watershed (32°–37° N, 114°–121° E), with an area of 10,040 km2, annual rainfall of about 800 mm, and one of the densest networks of rivers and reservoirs, is located upstream of the Yishusi catchment in Shandong province, China, and has been selected for this work. It is a typical semi-arid area. Daily rainfall and pan evaporation from Linyi meteorological stations and daily runoff depth data observed at the Linyi gauging station were used as input data for the two models. Five years of daily data, from 2001 to 2003 and from 2006 to 2007, were adopted; the 3-year daily data (2001–2003) were used for calibration and the 2-year daily data (2006–2007) were considered as the validation set.

Case 2

Youshuijie catchment, with a drainage area of about 946 km2, is located in the upper branch of the Hanjiang catchment in the Yangtze River basin. As a typical warm humid area in the rainfall-rich Hanjiang River basin, which lies in the subtropical monsoon climate region, it has an average temperature of 7.7–15.7 °C, an annual mean humidity of about 74%, and annual rainfall of 800–1200 mm. About 80% of the rainfall occurs from May to October. Daily rainfall and pan evaporation data from local meteorological stations and daily streamflow data from the hydrologic station were used. A 5-year daily series from 1981 to 1985 was collected; 3-year daily data were used for calibration and 2-year daily data for model validation.

Fig. 1. Flowchart of establishment of the AdaBoost-XXT model.

Fig. 2. Relationship between the NE coefficient and the number of base predictors in Linyi watershed.

Evaluation criteria of model performance

The performance of the models is evaluated by the criterion most frequently used for assessing hydrologic modeling performance, the Nash–Sutcliffe efficiency coefficient (NE) [36], expressed as:

NE = 1 − [Σ_{i=1}^{n} (Q_obs,i − Q_sim,i)^2] / [Σ_{i=1}^{n} (Q_obs,i − Q̄_obs)^2]   (11)

where n is the total length of the data; Q_obs,i and Q_sim,i denote the observed and simulated values at the ith time step, and Q̄_obs is the observed mean value. NE ranges from −∞ to 1 and becomes larger as the fit between observed and simulated streamflow improves.

Moreover, although the coefficient of determination R2 is deemed inappropriate for hydrologic modeling [37], we used it to assist the model evaluation in some cases. In addition, a t-test (Student's t-test) [38] on NE is adopted as a statistical complement to the evaluation criteria in the section 't-Test of NE coefficient'.
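A short helper for the two criteria is sketched below, assuming equal-length observed and simulated series; R2 is computed here as the squared Pearson correlation, which is one common reading of the coefficient of determination in this context.

```python
import numpy as np

def nash_sutcliffe(q_obs, q_sim):
    """Eq. (11): NE = 1 - sum((Qobs - Qsim)^2) / sum((Qobs - mean(Qobs))^2)."""
    q_obs, q_sim = np.asarray(q_obs, float), np.asarray(q_sim, float)
    return 1.0 - np.sum((q_obs - q_sim) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)

def coeff_determination(q_obs, q_sim):
    """R^2 taken here as the squared Pearson correlation (one common convention)."""
    q_obs, q_sim = np.asarray(q_obs, float), np.asarray(q_sim, float)
    return float(np.corrcoef(q_obs, q_sim)[0, 1] ** 2)

# Example with illustrative runoff depths (m)
obs = np.array([0.010, 0.020, 0.015, 0.030])
sim = np.array([0.012, 0.018, 0.016, 0.027])
print(nash_sutcliffe(obs, sim), coeff_determination(obs, sim))
```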

Results and discussion

Calibration of XXT model and AdaBoost-XXT model

For the XXT model, the 6 main parameters SZM, SRmax, ln(T0), Chv, WM and B have been calibrated by the PSO technique. For the AdaBoost-XXT model, however, the iteration number "t" should be confirmed before the training process. "t" denotes the number of weak models integrated into the AdaBoost algorithm, and it differs between watersheds. Hence, we analyzed the relationship between the NE coefficient and the number of machines for the training data set. As shown in Fig. 2, the first eight iterations are taken as an example for Linyi watershed. From the tendency of the line chart, it is obvious that the AdaBoost approach improves the NE coefficient from about 0.74 up to about 0.78, but when the number of weak learners exceeds five, the further gain in the coefficient is very slight. By the same method, the first eight iterations are also taken as an example for Youshuijie catchment, as demonstrated in Fig. 3. The NE coefficient increases from about 0.86 to above 0.90. When the iteration number increases from 1 to 3, the gradient of the NE coefficient change is significantly large; afterwards, the NE coefficient improves only slightly. Generally speaking, the AdaBoost algorithm successfully enhances the degree of fitting in both study areas. One difference between them is that more noise exists in the primary data of Linyi watershed because of its large area and low runoff coefficient. Moreover, many researchers agree that AdaBoost is prone to overfitting, especially in noisy domains [12,39–41]. Hence, this paper has selected just the 7 best weak learners of AdaBoost in Linyi watershed. Due to the very satisfactory performance of the single model in Youshuijie catchment, we have set "t" to 20 in this study area.

Fig. 3. Relationship between the NE coefficient and the number of base predictors in Youshuijie catchment.

Table 1
NE and R2 of two models for the calibration period.

              Linyi watershed    Youshuijie catchment
              NE      R2         NE      R2
XXT           0.76    0.76       0.89    0.90
AdaBoost-XXT  0.78    0.79       0.91    0.92

The evaluation criteria, including the NE coefficient and the coefficient of determination R2, for both river basins are exhibited in Table 1. Due to the wider search range of the AdaBoost algorithm over all time series periods, the parameter range settings of the two models were different; the parameter ranges in the AdaBoost-XXT model should be larger. So the trained NE value of the best single XXT model is not the same as in Figs. 2 and 3. From Table 1 it can be seen that (a) in Linyi watershed, the NE and R2 of the XXT model are respectively 0.76 and 0.76, while they are 0.78 and 0.79 for AdaBoost-XXT; (b) in Youshuijie catchment, the NE and R2 of the XXT model are respectively 0.89 and 0.90, while they are 0.91 and 0.92 for AdaBoost-XXT. Apparently, AdaBoost-XXT can resemble the observed streamflow better than the best single XXT model in both river basins.
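The selection of the iteration number "t" described above can be sketched as a simple sweep: grow the ensemble one weak learner at a time, track the training NE, and stop once the gain flattens (7 learners were kept for Linyi and 20 for Youshuijie). The wrapper below is an assumption; train_and_predict stands for any routine that trains a t-learner AdaBoost-XXT ensemble and returns its training predictions.

```python
import numpy as np

def choose_n_learners(train_and_predict, X, y, max_rounds=20, min_gain=0.005):
    """Sweep the number of weak learners and report the point where the training
    NE improvement falls below `min_gain` (cf. Figs. 2 and 3).

    train_and_predict(X, y, t) must return the training predictions of a
    t-learner ensemble, e.g. a wrapper around the adaboost_rt_modified sketch."""
    y = np.asarray(y, float)
    ne_values = []
    for t in range(1, max_rounds + 1):
        y_hat = train_and_predict(X, y, t)
        ne = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
        ne_values.append(ne)
        # stop once an extra learner no longer improves NE appreciably
        if t > 1 and ne_values[-1] - ne_values[-2] < min_gain:
            return t, ne_values
    return max_rounds, ne_values
```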

Validation and comparison of two models

Case 1

From Table 2, in Linyi watershed the NE and R2 of the XXT model are respectively 0.66 and 0.69, but 0.74 and 0.75 for the AdaBoost-XXT model. By both evaluation criteria, the AdaBoost-XXT model outperforms the best single XXT model. The discharge hydrographs are displayed in Figs. 4 and 5, respectively. Scatter plots of observed and forecasted streamflow by the XXT model and the AdaBoost-XXT model are presented in Fig. 6(a and b). The results from the ensemble model are more convergent to the 45° line than those of the other model. For both models, most of the forecasted values are hard to fit to the observed ones. An important reason may be the presence of at least 797 reservoirs: regulating reservoirs can effectively lessen the flood risk, but they bring a big challenge to runoff prediction. This is consistent with our previous work [42]. However, Figs. 4 and 5 suggest that the AdaBoost-XXT model does a better job from about the 541st to the 631st time step. For the largest flood peak, both models perform well, but the AdaBoost-XXT model is better than the XXT model for the second largest peak. Furthermore, on about the 187th day, the best XXT model did not respond to the rainfall information at all, while the AdaBoost-XXT model commendably predicted the daily runoff. As a whole, XXT still maintains a certain advantage for modeling large flood peaks, while AdaBoost-XXT is significantly better during low runoff periods.

Fig. 4. Forecasted by XXT model in Linyi watershed.

Fig. 5. Forecasted by AdaBoost-XXT model in Linyi watershed.

Fig. 6. (a) Scatter plot of observed and forecasted streamflow by the XXT model in Linyi watershed; (b) the results of the AdaBoost-XXT model.

Table 2
NE and R2 of two models for the validation period.

              Linyi watershed    Youshuijie catchment
              NE      R2         NE      R2
XXT           0.66    0.69       0.84    0.86
AdaBoost-XXT  0.74    0.75       0.89    0.89

Fig. 7. Forecasted by XXT model in Youshuijie catchment.

Fig. 8. Forecasted by AdaBoost-XXT model in Youshuijie catchment.

Case 2

The hydrographs are presented in Figs. 7 and 8, respectively. Scatter plots of observed and forecasted discharge by the XXT model and the AdaBoost-XXT model are shown in Fig. 9(a and b). As exhibited in Table 2, the NE and R2 of the XXT model are respectively 0.84 and 0.86, while they are 0.89 and 0.89 for the AdaBoost-XXT model in Youshuijie catchment. The efficiency has been improved significantly by the AdaBoost algorithm. From Figs. 7 and 8, the rainfall during this period is much greater than in Linyi watershed, so both models can execute the task successfully. Specifically, the XXT model overestimates the largest peak at about the 187th time step: its output of 0.0415 m is much higher than the observed value of 0.0373 m, whereas the result of the AdaBoost-XXT model, 0.0380 m, is closer to the observed one. Moreover, for some lower runoff, the XXT model has difficulty reproducing the practical situation while the AdaBoost-XXT model fits it nicely. On about the 606th day, neither of them captures the observed flow. From the relationship between rainfall and streamflow, the rainfall on that day is very little but the observed runoff is 0.0147 m. Hence, uncertainties like this will also affect the forecast results.

Fig. 9. (a) Scatter plot of observed and forecasted streamflow by the XXT model in Youshuijie catchment (R2 = 0.86); (b) the results of the AdaBoost-XXT model (R2 = 0.89).

From Fig. 9(a and b), it is clear that the results of both models lie close to the 45° line. However, the low-runoff-depth results from AdaBoost-XXT are apparently more convergent to the 45° line than those of the other model. On the whole, the predicted runoff of the AdaBoost-XXT model matches the observed one much better than that of the best single XXT model.

t-Test of NE coefficient

As shown above, a difference from 0.76 to 0.78 in the NE coefficient for the Linyi watershed, for example, may not be statistically significant. In order to identify whether the results are affected by the new approach or by systematic error, a t-test is used for confirmation. In this paper, we replicated the experiments 20 times in both catchments for the training and testing periods. Then the data for the different watersheds were blended into one list in different blending ratios. Finally, two lists of data, the NE of XXT and the NE of AdaBoost-XXT, formed the test samples. The results were calculated with SPSS (Statistical Package for the Social Sciences). The average values for the two periods are 0.8341 and 0.8154 respectively. Both significance levels are lower than 0.01, so the differences are regarded as highly significant. Thus, it can be concluded that the differences between XXT and AdaBoost-XXT are not caused by systematic error; in other words, the improvement in NE very likely originates from the new method.
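The test itself was computed in SPSS; an equivalent paired t-test on the 20 replicate NE values could be run as below, where the placeholder arrays stand in for the recorded NE lists.

```python
import numpy as np
from scipy import stats

# Illustrative placeholder values: replace with the 20 recorded NE values per model.
rng = np.random.default_rng(1)
ne_xxt = 0.80 + 0.02 * rng.standard_normal(20)
ne_adaboost_xxt = ne_xxt + 0.02 + 0.01 * rng.standard_normal(20)

t_stat, p_value = stats.ttest_rel(ne_adaboost_xxt, ne_xxt)  # paired t-test on NE
print(f"paired t = {t_stat:.2f}, p = {p_value:.4f}")
# A significance level below 0.01 would support the conclusion that the NE
# improvement is not due to systematic error alone.
```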

Therefore, whether for the large-scale, semi-arid watershed or the small-scale, humid catchment, the AdaBoost-XXT model always completes the forecast work more satisfactorily than the best single XXT model, which is almost in accord with the conclusion of Brochero [22]. AdaBoost-XXT has the better generalization ability, particularly in the large-scale, semi-arid watershed. However, theoretically speaking, AdaBoost-XXT should have manifested even more advantages over the single model. Eq. (10) shows that all weak learners are combined for each sample, but some base predictors are not suitable for some samples, so there should be some thresholds to control the number and types of weak learners in the combination; due to the complexity of rainfall–runoff processes, determining them is a tough task. In addition, the whole data set in this paper was divided into only two parts, namely training data and testing data, so both models were not sufficiently validated. On this issue, multifold cross-validation or the leave-one-out method [43] may be helpful. The efficiency of the boosting algorithm is also related to other uncertainties, such as data noise and parameter settings. If these problems were addressed, the results might be better.

Conclusions

The AdaBoost algorithm is a widely used and mature ensemble approach for aggregating weak learners with a weight distribution. In order to improve the efficiency of a process-based rainfall–runoff model, this study has utilized a new modified version of the AdaBoost.RT algorithm to build a robust ensemble of XXT models. We then applied the XXT model and the AdaBoost-XXT model in two basins of different climatic types. In Linyi watershed, the observed data from 2001 to 2003 were used for calibration and the daily data from 2006 to 2007 were considered as the validation set. In Youshuijie catchment, the observed daily rainfall, pan evaporation and streamflow data from 1981 to 1983 were used for calibration and the daily data from 1984 to 1985 were considered as the validation set. The results show that (a) for the calibration period, the NE and R2 of the XXT model are respectively 0.76 and 0.76 while they are 0.78 and 0.79 for the AdaBoost-XXT model in Linyi watershed, and the NE and R2 of the XXT model are respectively 0.89 and 0.90 while they are 0.91 and 0.92 for AdaBoost-XXT in Youshuijie catchment; (b) for the validation period, the NE and R2 of the XXT model are respectively 0.66 and 0.69 but 0.74 and 0.75 for the AdaBoost-XXT model in Linyi watershed, and the NE and R2 of the XXT model are respectively 0.84 and 0.86 while they are 0.89 and 0.89 for the AdaBoost-XXT model in Youshuijie catchment. Furthermore, the t-test proves that the difference in NE is highly significant. In general, the AdaBoost-XXT model performs much better than the best single XXT model, especially in Linyi watershed and during low runoff periods, and it has the better generalization ability for runoff prediction. Hence, using the modified AdaBoost.RT to enhance the predictive ability of XXT is a promising approach, and it can easily be extended to other process-based rainfall–runoff models.

Acknowledgments

This study is financially supported by the National Natural Science Foundation of China (31101073), the Natural Science Research Fund of the Education Department of Sichuan Province (09ZA075), and the Open Research Fund of the Meteorological Center for Huaihe Watershed (HRM200905).


References

[1] J. Xu, Y. Liu, J. Zhao, T. Tang, X. Xie, Improving TOPMODEL performance in rainfall–runoff simulating based on ANN, in: 2010 International Conference on Display and Photonics, SPIE, Nanjing, China, 2010, pp. 77491A–77495A.
[2] G. Cerda-Villafana, S.E. Ledesma-Orozco, E. Gonzalez-Ramirez, Tank model coupled with an artificial neural network, in: Advances in Artificial Intelligence, Proceedings, MICAI 2008, vol. 5317, 2008, pp. 343–350.
[3] B. Yong, W. Zhang, Development of a Land-surface Hydrological Model TOPX and Its Coupling Study with Regional Climate Model RIEMS, Nanjing University, Nanjing, 2007, pp. 132.
[4] C. Yao, Z.-J. Li, H.-J. Bao, Z.-B. Yu, Application of a developed grid-Xinanjiang model to Chinese watersheds for flood forecasting purpose, J. Hydrol. Eng. 14 (2009) 923–934.
[5] T.K. Ao, H.T.Q. Ishidaira, Introduction of block-wise use of TOPMODEL and Muskingum–Cunge method for the hydro-environmental simulation of a large ungauged basin, Hydrol. Sci. J. 44 (1999) 633–646.
[6] S. Srinivasulu, A. Jain, River flow prediction using an integrated approach, J. Hydrol. Eng. 14 (2009) 75–83.
[7] P.O. Yapo, H.V. Gupta, S. Sorooshian, Multi-objective global optimization for hydrologic models, J. Hydrol. 204 (1998) 83–97.
[8] T.N. Krishnamurti, C.M. Kishtawal, Z. Zhang, T.E. LaRow, D. Bachiochi, E. Williford, S. Gadgil, S. Surendran, Multimodel ensemble forecasts for weather and seasonal climate, J. Climate 13 (2000) 4196–4216.
[9] G.P. Zhang, Time series forecasting using a hybrid ARIMA and neural network model, Neurocomputing 50 (2003) 159–175.
[10] R. Adhikari, R. Agrawal, A novel weighted ensemble technique for time series forecasting, in: Advances in Knowledge Discovery and Data Mining, 2012, pp. 38–49.
[11] R. Wang, AdaBoost for feature selection, classification and its relation with SVM – a review, Phys. Procedia 25 (2012) 800–807.
[12] Y. Gao, F. Gao, Edited AdaBoost by weighted kNN, Neurocomputing 73 (2010) 3079–3088.
[13] J.-F. Ge, Y.-P. Luo, A comprehensive study for asymmetric AdaBoost and its application in object detection, Acta Autom. Sin. 35 (2009) 1403–1409.
[14] H. Drucker, Improving regressors using boosting techniques, in: Machine Learning – International Workshop then Conference, 1997.
[15] D.-S. Cao, Q.-S. Xu, Y.-Z. Liang, L.-X. Zhang, H.-D. Li, The boosting: a new idea of building models, Chemom. Intell. Lab. Syst. 100 (2010) 1–11.
[16] R. Avnimelech, N. Intrator, Boosting regression estimators, Neural Comput. 11 (1999) 499–520.
[17] R.S. Zemel, T. Pitassi, A gradient-based boosting algorithm for regression problems, in: NIPS 2000, vol. 13, 2001, pp. 696–702.
[18] D.L. Shrestha, D.P. Solomatine, Experiments with AdaBoost.RT, an improved boosting scheme for regression, Neural Comput. 18 (2006) 1678–1710.
[19] E. Alfaro, N. García, M. Gámez, D. Elizondo, Bankruptcy forecasting: an empirical comparison of AdaBoost and neural networks, Decis. Support Syst. 45 (2008) 110–122.
[20] J. Sun, M.-Y. Jia, H. Li, AdaBoost ensemble for financial distress prediction: an empirical comparison with data from Chinese listed companies, Expert Syst. Appl. 38 (2011) 9305–9312.
[21] J. Wang, J. Yu, Scientific creativity research based on generalizability theory and BP AdaBoost RT, Procedia Eng. 15 (2011) 4178–4182.
[22] D. Brochero, F. Anctil, C. Gagné, Forward greedy ANN input selection in a stacked framework with Adaboost.RT – a streamflow forecasting case study exploiting radar rainfall estimates, Geophys. Res. Abstr. 14 (2012) 6683.
[23] D.P. Solomatine, D.L. Shrestha, AdaBoost.RT: a boosting algorithm for regression problems, in: Proceedings 2004 IEEE International Joint Conference on Neural Networks, 2004, pp. 1163–1168.
[24] J. Xu, W. Zhang, Z. Zheng, J. Chen, M. Jiao, Establishment of a hybrid rainfall–runoff model for use in the Noah LSM, Acta Meteorol. Sin. 26 (2012) 85–92.
[25] R.J. Zhao, The Xinanjiang model applied in China, J. Hydrol. 135 (1992) 371–381.
[26] D.M. Wolock, Simulating the variable-source-area concept of streamflow generation with the watershed model TOPMODEL, in: U.S. Geological Survey Water-Resources Investigation Report 93-4124, U.S. Geological Survey, Lawrence, 1993.
[27] B. Saghafian, P.Y. Julien, H. Rajaie, Runoff hydrograph simulation based on time variable isochrone technique, J. Hydrol. 261 (2002) 193–203.
[28] R.B. Nageshwar, …, N.F. Mark, Runoff modeling of a mountainous catchment using TOPMODEL: a case study, J. Am. Water Resour. Assoc. 41 (2005) 107–121.
[29] J. Kennedy, R.C. Eberhart, D.M. Tsai, Particle swarm optimization, in: Proc. IEEE Int'l Conf. on Neural Networks, IEEE Service Center, NJ, 1995, pp. 1942–1948.
[30] Y. Wang, B. Li, T. Weise, J. Wang, B. Yuan, Q. Tian, Self-adaptive learning based particle swarm optimization, Inf. Sci. 181 (2011) 4515–4538.
[31] B. Majhi, G. Panda, Robust identification of nonlinear complex systems using low complexity ANN and particle swarm optimization technique, Expert Syst. Appl. 38 (2010) 321–333.
[32] E. Assareh, M.A. Behrang, M.R. Assari, A. Ghanbarzadeh, Application of PSO (particle swarm optimization) and GA (genetic algorithm) techniques on demand estimation of oil in Iran, Energy 35 (2010) 5223–5229.
[33] Y. Jiang, T. Hu, C. Huang, X. Wu, An improved particle swarm optimization algorithm, Appl. Math. Comput. 193 (2007) 231–239.
[34] Y.H. Shi, R.C. Eberhart, A modified particle swarm optimizer, in: IEEE International Conference on Evolutionary Computation, Anchorage, Alaska, 1998.
[35] Y. Freund, R.E. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, J. Comput. Syst. Sci. 55 (1997) 119–139.
[36] J.E. Nash, J.V. Sutcliffe, River flow forecasting through conceptual models, J. Hydrol. 10 (1970) 282–290.
[37] C.L. Wu, K.W. Chau, Data-driven models for monthly streamflow time series prediction, Eng. Appl. Artif. Intel. 23 (2010) 1350–1367.
[38] T.K. Seize, Student's t-test, South. Med. J. 70 (1977) 1299.
[39] G. Rätsch, T. Onoda, K.R. Müller, Soft margins for AdaBoost, Mach. Learn. 42 (2001) 287–320.
[40] J. Cao, S. Kwong, R. Wang, A noise-detection based AdaBoost algorithm for mislabeled data, Pattern Recognit. 45 (2012) 4451–4465.
[41] E. Souza, S. Matwin, Improvements to AdaBoost dynamic, in: L. Kosseim, D. Inkpen (Eds.), Advances in Artificial Intelligence, Springer, Berlin, Heidelberg, 2012, pp. 293–298.
[42] S. Liu, J. Xu, J. Zhao, X. Xie, W. Zhang, An innovative method for dynamic update of initial water table in XXT model based on neural network technique, Appl. Soft Comput. 13 (2013) 4185–4193.
[43] S. Borra, A. Di Ciaccio, Measuring the prediction error. A comparison of cross-validation, bootstrap and covariance penalty methods, Comput. Stat. Data Anal. 54 (2010) 2976–2989.
