English Terms in Quantitative Research

Variable (变量): A variable is a basic element of research; it can be a number, a word, or a symbol. In quantitative research, variables are typically used to represent different characteristics or attributes of the objects under study.

Measurement (测量): Measurement is the process of quantifying the objects under study. Through measurement, we can express the concrete values of a variable. In quantitative research, measurement is an essential means of obtaining data.

Sample (样本): A sample is a subset of research objects drawn from the population. By studying the sample, we can infer the characteristics and patterns of the population. In quantitative research, the choice of sample and of research method has a crucial influence on the results.

Population (总体): The population is the entirety of the research objects. The population contains all objects under study, while a sample is a part drawn from it. In quantitative research, studying the whole population can provide more complete information, but it usually requires more time and resources.

Dependent variable (因变量): A dependent variable is a variable that is influenced by other variables in the study. Changes in the dependent variable can reflect the effect of the independent variable. In quantitative research, the choice of dependent variable and of research method has a crucial influence on the results.

Independent variable (自变量): An independent variable is a variable that can influence other variables in the study. Changes in the independent variable can cause changes in the dependent variable. In quantitative research, the choice of independent variable and of research method has a crucial influence on the results.

Control variable (控制变量): A control variable is a variable that must be controlled or accounted for in the study. The influence of control variables can be excluded or held constant so that the relationship between the independent and dependent variables can be studied more cleanly. In quantitative research, the choice of control variables and of research method has a crucial influence on the results.
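To make these roles concrete, here is a minimal R sketch with simulated data (all variable names are illustrative, not from the glossary): the dependent variable is regressed on an independent variable while a control variable is held in the model.

```r
set.seed(42)
n         <- 100
age       <- rnorm(n, mean = 40, sd = 10)          # control variable
treatment <- rbinom(n, size = 1, prob = 0.5)       # independent variable
outcome   <- 2 * treatment + 0.1 * age + rnorm(n)  # dependent variable
summary(lm(outcome ~ treatment + age))             # treatment effect, adjusting for age
```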
A distortion-free data hiding scheme for high dynamic range images

Chung-Min Yu, Kuo-Chen Wu, Chung-Ming Wang
Institute of Computer Science and Engineering, National Chung Hsing University, 250 Kuo Kuang Road, Taichung 402, Taiwan, ROC

Article info: Received 18 December 2009; received in revised form 8 February 2011; accepted 21 February 2011; available online 27 February 2011. doi:10.1016/j.displa.2011.02.004
Keywords: High dynamic range images; Data hiding; Distortion-free; Message embedding; Steganography

Abstract

In this paper we present a distortion-free data hiding algorithm which can embed secret messages into high dynamic range (HDR) images. Our scheme provides three significant benefits. First, it enables us to convey secret messages to produce a stego HDR image. When we operate the tone mapping technique to reduce the high contrast to a displayable range, no distortion is encountered between the tone-mapped cover and stego images. A quantitative measure verifies that the histograms of the cover and stego HDR images are correlated with linear dependency. To the best of our knowledge, our algorithm is the first approach in the HDR literature that provides the capability of distortion-free data embedding. For the application of image annotation, the average capacity offered by our method is in the range of 0.12–0.29 bits per pixel. Our scheme provides an average capacity in the range of 0.0010–0.0026 bits per pixel for the application of image steganography, where the stego HDR image preserves an HDR image encoding format which does not cause any suspicion by eavesdroppers. Second, our algorithm performs adaptive message embedding, where pixels conceal different amounts of secret messages based on their homogeneous representations. Quantitative analysis indicates that our algorithm produces an insignificantly small magnitude of the maximal pixel difference between the cover and stego HDR images. This feature, together with the similarity of the histogram distributions of the cover and stego HDR images, increases the difficulty of detecting whether any message is hidden in an HDR image. Third, our scheme is efficient: the time required for message embedding or extraction is in the range of several hundred milliseconds. Our approach belongs to blind detection, where the messages can be extracted without referring to the original cover HDR image. We believe our proposed scheme is suitable for applications such as image annotation or image steganography. © 2011 Elsevier B.V. All rights reserved.

1. Introduction

Transmission of private information through the internet in a secret manner is now more frequent due to the prevalence of computer science and the internet. This trend encourages researchers to investigate techniques for covert communication. Besides cryptography, data hiding [11] provides an alternative solution for achieving covert communication. Data hiding is a form of secret communication carried out by using various digital multimedia to convey critical messages, and therefore the major demands are both good imperceptibility and a high embedding capacity. Generally, the object in which we intend to embed the secret message is called the cover object, indicating that the secret message has not yet been embedded [19]. After it has conveyed the secret message, we refer to it as the stego object. While a variety of media, such as text [1], images [17], audio [8], video [10], 3D models [2], or general multimedia [7], can serve as a cover object, the image is the most popular medium employed for data hiding.
An image data hiding technique is usually evaluated in terms of visual quality and embedding capacity. The image data hiding algorithm should maximize the amount of messages that can be conveyed in the cover image, and minimize the distortion appearing in the stego image caused by the hidden message [17]. In addition, data hiding algorithms can be developed to provide reversibility. These are referred to as reversible data hiding algorithms [14,18]; they allow the receiver to extract the embedded data and completely restore the cover image without losing any information. Going one step further, we can produce a stego image without incurring any image distortion, which minimizes the distortion to the extreme. This kind of algorithm is referred to as a distortion-free data hiding algorithm. An intriguing feature of the distortion-free algorithm is the camouflage of the stego image; consequently, the stego image will not attract much attention from eavesdroppers when it is delivered to the receiver through a public channel. This property makes it useful for data hiding applications such as medical or military image authentication, where the quality of the stego image is strictly required, and/or image annotation, where the sensitivity of the secret message is strictly confidential.

In recent years, there has been an explosion of interest in high dynamic range (HDR) images [13]. The "dynamic range" of a scene is the contrast ratio between its brightest and darkest parts. In contrast to low dynamic range (LDR) images, HDR images represent luminance values using floating-point numbers in order to accurately represent the wide range of intensity levels found in real scenes, ranging from direct sunlight to the deepest shadows. Fig. 1 demonstrates the visual difference between LDR and HDR images.
The scene has a high contrast ratio because it contains both an outdoor and an indoor landscape. When we directly display the LDR image, we lose the details of the outdoor scene because the luminance is out of the range supported by an ordinary device. Similarly, the detail in the indoor scene is not visible when we directly exhibit the HDR image. However, we can visualize both details when the HDR image is processed by a tone-mapping operator. Several image processing packages and computer games have been developed to support HDR images, and they are becoming increasingly popular in various fields such as digital photography, computer graphics, movies, video games, and medical imaging.

Unfortunately, research in steganography has not kept pace with the advances of HDR images, even though they are expected to replace low dynamic range (LDR) images and become the new image standard. To the best of our knowledge, there has been only very limited data hiding work done on HDR images [3]. This work produces stego images with high visual quality that is acceptable to human perception. However, image distortion is inevitable due to the hidden messages.

In this paper, we provide a new data hiding algorithm that embeds secret messages into HDR images encoded with the radiance RGBE format [15]. Our scheme takes advantage of encoding secret messages into the homogeneous representations inherent in the radiance RGBE encoding format, which has found widespread use in the image community. The scheme provides three significant benefits. First, it enables us to convey secret messages to produce a stego HDR image; no distortion is encountered between the tone-mapped cover and stego images. Second, our algorithm performs adaptive message embedding. The insignificantly small magnitude of the maximal pixel difference and the similarity of the histogram distributions increase the difficulty of detecting whether any message is hidden in an HDR image. Third, our scheme is efficient, as the time required for message embedding or extraction is in the range of several hundred milliseconds. Our algorithm belongs to blind detection, where the messages can be extracted without referring to the original cover HDR image. Experimental results have verified the feasibility of our algorithm.

This paper is organized as follows. In Section 2, we review data hiding approaches for HDR images. We then put forward our algorithm in Section 3. Experimental results are shown in Section 4, followed by the Conclusion and Future Work in Section 5.

2. Related works

This section surveys data hiding approaches for HDR images. We were surprised to find only one paper in the current literature which presents information on HDR data hiding [3]. Since that paper utilizes the radiance RGBE encoding as the cover image, we believe it can provide more insight for developing our proposed algorithm. For reference purposes we briefly describe the "distortionless data hiding" term that is misleading in the low dynamic range literature, and the approach to producing distortion-free data embedding using permutation. In the following paragraphs we highlight the HDR format and review the only HDR data hiding algorithm that we have found.

A number of data hiding techniques have been proposed which use the LDR image as the cover medium to convey secret messages. One of these algorithms uses the title of "distortionless data hiding" [20]. This causes misconceptions, because the algorithm provides the capability of "reversibility", which means that once the secret messages are extracted, the original cover image can be restored. Nevertheless, this type of algorithm generates a stego image that still differs from the cover image.

Fig. 1. The difference between LDR and HDR images: the LDR image is directly displayed in the first column; the HDR image is directly displayed in the second column; the third column displays the tone-mapping result of the HDR image.

A distortion-free embedding can instead be produced by permutation: rearranging n elements can encode an optimal message capacity of up to log2(n!) bits, where n is the number of elements to be arranged.
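As a quick sense of scale for this permutation capacity, log2(n!) can be evaluated directly; a small R illustration (ours, not from the paper):

```r
# Capacity of arranging n elements: log2(n!) bits.
log2(factorial(16))     # ~44.25 bits for n = 16
lgamma(101) / log(2)    # log2(100!) ~ 524.76 bits, via lgamma to avoid overflow
```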
The radiance RGBE encoding format [15], originally known as the Radiance picture format, was first introduced as part of the Radiance lighting simulation and rendering system [16]. This encoding format has found widespread use for HDR photography and image-based lighting. Other encoding formats include OpenEXR, LogLuv, etc. In the RGBE format, a pixel's color is represented by four channels: the red, green, blue, and exponent channels. This encoding leads to a feature whereby the representation of a color is not unique. We take advantage of this feature to present a distortion-free data hiding algorithm in this paper.

Cheng and Wang proposed an adaptive data hiding approach with authentication for high dynamic range images [3]. To the best of our knowledge, their scheme is the first such approach for HDR images using the radiance 32-bit RGBE encoding. In a radiance-format HDR image, the range of luminance intensity is decided by the 8-bit exponent value E. Cheng and Wang's method uses this property to classify pixels into flat and boundary areas. This pixel classification enables their scheme to remove the restriction of a fixed amount of message embedding at each pixel, in order to provide larger embedding capacity with little visual distortion. In their reports, their algorithm achieves an embedding capacity in the range of 5.13–9.69 bits out of the 32 bits of RGBE encoding. Although their algorithm causes image distortion between the cover and stego images, the PSNR values for the tone-mapped stego images are greater than 30 dB, which is acceptable to human perception.

Our survey indicates that an HDR image data hiding algorithm has been presented based on the 32-bit radiance RGBE encoding, and that it causes image distortion because of the hidden message. Given the advantages of distortion-free data embedding, we believe it is necessary to develop a distortion-free data hiding algorithm that takes this encoding into consideration. Our algorithm is detailed in the next section.

3. Our proposed algorithm

This section presents our proposed method for embedding a secret message into an HDR image without causing any distortion. The method is referred to as the concise fundamental method (CF).
This method is simple and direct, with an intuitive manner. Low dynamic range (LDR) images use 8 bits to represent each of the primary colors, red, green, and blue, leading to a total of 24 bits of image representation. In an HDR image encoded with the radiance format, a pixel is represented by three primary channels followed by an exponent channel, resulting in a total of 32 bits of image representation. Each channel uses 8 bits of storage, so the value in each channel is in the range of 0 to 255.

Without loss of generality, let P(r, g, b, e) represent a pixel encoded with the radiance format, where r, g, and b represent the primary color channels and e indicates the exponent channel, which is based on a power of two with a bias of 128. The color of this pixel is a floating-point value which can be derived using the floating-point conversion shown in Eq. (1). Similarly, given a pixel color with floating-point values (R, G, B), we can convert the pixel into the radiance (r, g, b, e) encoding using the integer conversion shown in Eq. (2), where max(R, G, B) represents the maximum of the R, G, and B color components.

R = ((r + 0.5)/256) × 2^(e−128)
G = ((g + 0.5)/256) × 2^(e−128)
B = ((b + 0.5)/256) × 2^(e−128)    (1)

e = ⌈log2(max(R, G, B))⌉ + 128
r = ⌊(256 × R)/2^(e−128)⌋
g = ⌊(256 × G)/2^(e−128)⌋
b = ⌊(256 × B)/2^(e−128)⌋    (2)

Due to the exponent channel introduced in the radiance format, there is more than one representation describing the color of a pixel. For example, we can apply the division operator with divisor 2 to each color channel and add 1 to the exponent channel. Given an original pixel P(r, g, b, e), this division operator produces a representation A(r/2, g/2, b/2, e+1) which gives nearly the same floating-point color value, and the identical color after tone mapping, provided that the components r/2, g/2, and b/2 remain in integer form. Similarly, we can apply the multiplication operator with multiplier 2 to each color channel and subtract 1 from the exponent channel. This multiplication operator produces a representation B(2r, 2g, 2b, e−1) which also gives nearly the same floating-point color value, and the identical color after tone mapping, provided that the components 2r, 2g, and 2b are within the legal range between 0 and 255. Since each pixel may contain a number of different representations, the concise fundamental method we propose takes advantage of this feature to convey the secret message without producing any image distortion. We detail this method in the following paragraphs.
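To make the two conversions and the two operators concrete, here is a minimal R sketch; the function names are ours, not the paper's, and rounding edge cases are not handled:

```r
# Eq. (1): radiance (r, g, b, e) integers -> floating-point color (R, G, B).
rgbe_to_float <- function(r, g, b, e) {
  scale <- 2^(e - 128) / 256
  c(R = (r + 0.5) * scale, G = (g + 0.5) * scale, B = (b + 0.5) * scale)
}

# Eq. (2): floating-point color (R, G, B) -> radiance (r, g, b, e) integers.
float_to_rgbe <- function(R, G, B) {
  e <- ceiling(log2(max(R, G, B))) + 128
  scale <- 2^(e - 128)
  c(r = floor(256 * R / scale), g = floor(256 * G / scale),
    b = floor(256 * B / scale), e = e)
}

# The division operator (r/2, g/2, b/2, e+1) and the multiplication operator
# (2r, 2g, 2b, e-1) leave rgbe_to_float() essentially unchanged; this
# redundancy is what the concise fundamental method exploits.
```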
3.1. Our embedding method

Given an arbitrary pixel P(r, g, b, e), we define the homogeneous representation group (HRG) for this pixel as the set of representations in which every element describes a pixel color identical to P(r, g, b, e). We use HRG_P, with the suffix "P", to denote the homogeneous representation group for the pixel P. We define the homogeneity value (HV_P) for this pixel as the number of elements in the homogeneous representation group. We sort the elements of the HRG by the value of the exponent channel in ascending order, and assign each sorted element an index. This allows us to define a homogeneity index (HI) for every sorted element in the HRG, where the HI ranges from 0 to HV_P − 1. As an example, let P(24, 160, 52, 127) represent a pixel. The homogeneous representation group of this pixel, HRG_P, contains three elements, as shown in Table 1, where HRG_P = {(24, 160, 52, 127), (12, 80, 26, 128), (6, 40, 13, 129)}. This pixel has a homogeneity value of 3 (HV_P = 3), and the element (24, 160, 52, 127) is assigned the smallest homogeneity index of 0 (HI_P = 0) because it has the smallest value in the exponent channel. Accordingly, the element (6, 40, 13, 129) has a homogeneity index of 2 (HI_P = 2). Note that the blue channel of the element (6, 40, 13, 129) contains an odd value of 13, which terminates the possibility of applying the division operator. We refer to the blue channel as the dominated channel for this pixel.

Table 1. An example of a pixel P with three sorted elements in the homogeneous representation group (HRG) and homogeneity value HV_P = 3.

Pixel P               HV_P   Sorted elements in HRG_P   HI_P
P(24, 160, 52, 127)   3      (24, 160, 52, 127)         0
                             (12, 80, 26, 128)          1
                             (6, 40, 13, 129)           2

Depending on the particular values of a pixel, there are two special cases in which we do not determine a pixel's corresponding HRG for message embedding. For a pixel P(r, g, b, e), the first case is when the pixel values in the primary color and exponent channels are all zeros, i.e., P(r, g, b, e) = (0, 0, 0, 0). We refer to this type of pixel as the "null" pixel. Note that it might be possible to apply the division operator 255 times to produce the homogeneous representation group HRG_P = {(0, 0, 0, 0), (0, 0, 0, 1), ..., (0, 0, 0, 255)}, which has a homogeneity value of 256, allowing us to embed up to 8 bits of secret message. However, the embedding would produce a relatively large pixel difference after message embedding (see the analysis of pixel difference described in Section 3.3). Therefore, our method does not embed any secret messages when encountering the "null" pixel.

The second special case occurs when the pixel values in the primary color channels are powers of 2, or when one or two of the pixel values are zeros, i.e., P(r, g, b, e) = (2^k || 0, 2^k || 0, 2^k || 0, e), where k is an integer satisfying 0 ≤ k ≤ 7 and || represents "or" notation. We refer to this type of pixel as the "neutral" pixel. Note that the pixel values cannot be all zeros in the three channels, because we have defined that kind of pixel as the "null" pixel. Note also that if the exponent e is less than or equal to 248, we can apply the division operator up to seven times in order to produce a homogeneous representation group with a homogeneity value of 8. For example, when P(r, g, b, e) = (128, 128, 128, 248), there are eight elements in the HRG_P, where HRG_P = {(1, 1, 1, 255), (2, 2, 2, 254), (4, 4, 4, 253), ..., (128, 128, 128, 248)}. It might be possible to adopt this HRG_P to embed 3 bits of secret message. Unfortunately, the embedding would produce a much larger pixel difference, which becomes evident from the analysis of pixel difference described in Section 3.3. Consequently, our method excludes the "neutral" pixel from message embedding.
It is not difficult to determine the homogeneous representation group for a given pixel. In particular, we first apply the multiplication operator with multiplier 2 to the extreme, before applying the division operator with divisor 2 to the extreme. For the multiplication operator, "the extreme" means that a further multiplication would make some component of the color channels exceed the maximum value of 255. For the division operator, "the extreme" means that a further division would turn some component of the color channels into floating-point form. If we apply the multiplication operator MU times and the division operator DI times, then the homogeneity value of a pixel P is HV_P = MU + DI + 1. As an example, given a pixel K(20, 16, 60, 127), we consider the color channels (20, 16, 60) and apply the multiplication operator at most two times (MU = 2), producing the two elements (40, 32, 120, 126) and (80, 64, 240, 125). We cannot apply the multiplication operator any further, because if we did, the component 480 would exceed the maximum value of 255. Similarly, we apply the division operator two times only (DI = 2), producing the two elements (10, 8, 30, 128) and (5, 4, 15, 129). We cannot apply the division operator any further, because if we did, the components 2.5 and 7.5 would be in floating-point form. The homogeneous representation group of the pixel K is HRG_K = {(80, 64, 240, 125), (40, 32, 120, 126), (20, 16, 60, 127), (10, 8, 30, 128), (5, 4, 15, 129)}, and the homogeneity value of the pixel K is HV_K = 2 + 2 + 1 = 5. Note that the homogeneity value of a pixel has a maximal value of 7, because the components of the color channels must be in the range of 0 to 255. This also means that the homogeneous representation group contains at most seven elements. The homogeneity value of a pixel has a minimal value of 1, indicating that the only element in the homogeneous representation group is the pixel itself.

Once we have determined the homogeneous representation group and the homogeneity value HV_K of a pixel K, we can compute the pixel capacity in bits, denoted C_K, as shown in Eq. (3). The pixel capacity indicates how many bits of secret message this pixel can convey. Certainly, the pixel capacity depends on how many elements are in the homogeneous representation group.

C_K = ⌊log2(HV_K)⌋    (3)
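The procedure just described (multiplication to the extreme, then division to the extreme) can be sketched as follows. This is our illustrative helper, not code from the paper, and the "null" and "neutral" special cases of Section 3.1 are assumed to be screened out beforehand:

```r
# Enumerate the homogeneous representation group of p = c(r, g, b, e),
# sorted by ascending exponent; HV = group size, C = floor(log2(HV)), Eq. (3).
homogeneous_group <- function(p) {
  group <- list(p)
  q <- p
  while (all(q[1:3] * 2 <= 255) && q[4] - 1 >= 0) {   # multiplication operator
    q <- c(q[1:3] * 2, q[4] - 1)
    group <- c(list(q), group)                        # smaller exponent first
  }
  q <- p
  while (all(q[1:3] %% 2 == 0) && q[4] + 1 <= 255) {  # division operator
    q <- c(q[1:3] / 2, q[4] + 1)
    group <- c(group, list(q))
  }
  group
}

K   <- c(20, 16, 60, 127)
hrg <- homogeneous_group(K)   # (80,64,240,125), ..., (5,4,15,129)
HV  <- length(hrg)            # homogeneity value: 5
C   <- floor(log2(HV))        # pixel capacity: 2 bits
```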
The embedding process for a cover pixel K(r, g, b, e) can be facilitated using the homogeneity index table (HIT) shown in Table 2. Given a cover pixel K, we can determine the homogeneity value HV_K. Depending on this value, the first column of Table 2 lists the number of bits that can be conveyed, and the remaining columns describe the bit pattern of the secret message that can be concealed for each homogeneity index. By referring to the HIT, we can alter the cover status C(HV_K, HI_K), which records the status of the cover pixel, to the stego status S(HV_K, HI'_K), indicating that the desired bit pattern of the secret message has been conveyed by the stego pixel.

Table 2. Homogeneity index table used to embed secret messages into a cover pixel K with different homogeneity values HV_K.

Bits conveyed   HV_K   HI=0   HI=1   HI=2   HI=3   HI=4   HI=5   HI=6
0               1      NP     –      –      –      –      –      –
1               2      "0"    "1"    –      –      –      –      –
1               3      "1"    "0"    NA     –      –      –      –
2               4      "00"   "01"   "10"   "11"   –      –      –
2               5      "01"   "10"   "11"   "00"   NA     –      –
2               6      "10"   "11"   "00"   "01"   NA     NA     –
2               7      "11"   "00"   "01"   "10"   NA     NA     NA

The embedding process is best illustrated by the example below. Given a cover pixel K(20, 16, 60, 127), following the example shown above, we can produce the homogeneous representation group for this cover pixel. Table 3 shows the five sorted elements of the group, where HRG_K = {(80, 64, 240, 125), (40, 32, 120, 126), (20, 16, 60, 127), (10, 8, 30, 128), (5, 4, 15, 129)}. Clearly, the cover status is C(HV_K, HI_K) = C(5, 2), because the cover pixel has a homogeneity value of HV_K = 5 and, according to the sorted elements in HRG_K, a homogeneity index of HI_K = 2.

Note that since HV_K = 5, this cover pixel can embed 2 bits of secret message, based on the expression shown in Eq. (3). In particular, by altering the cover status C(HV_K, HI_K) = C(5, 2) to S(HV_K, HI'_K) = S(5, 0), appearing in the first row, third column of Table 3, we embed the two bits "01". In other words, in order to convey the two bits "01", the stego pixel we should select is the element of the homogeneous representation group HRG_K that has homogeneity index 0. Consequently, the stego pixel is K'(80, 64, 240, 125). As another example, if we intend to embed the two bits "10", we alter C(HV_K, HI_K) = C(5, 2) to S(HV_K, HI'_K) = S(5, 1) in the second row, indicating that the stego pixel will be K'(40, 32, 120, 126), which has homogeneity index 1. We do not need to take any action if we intend to embed the bits "11", because the cover pixel C(HV_K, HI_K) = C(5, 2) happens to have homogeneity index HI_K = 2, which means that C(HV_K, HI_K) = C(5, 2) and S(HV_K, HI'_K) = S(5, 2); consequently, the stego pixel is exactly the same as the cover pixel. Note that we will not change the cover status C(5, 2) to the stego status S(5, 4) appearing in the final row. This is because, although the cover pixel has a homogeneity value of HV_K = 5, we assign only four patterns to represent two bits of secret message. As a result, we denote "NA" in the final row to indicate that no bit pattern is assigned.

Table 3. An example of embedding 2 bits of secret message into the cover pixel K(20, 16, 60, 127) with homogeneity value HV_K = 5 and cover status C(5, 2).

Sorted elements in HRG_K   HI_K   Status of stego pixel   Conveyed message
(80, 64, 240, 125)         0      S(5, 0)                 "01"
(40, 32, 120, 126)         1      S(5, 1)                 "10"
(20, 16, 60, 127)          2      S(5, 2)                 "11"
(10, 8, 30, 128)           3      S(5, 3)                 "00"
(5, 4, 15, 129)            4      S(5, 4)                 NA

We have described the secret message embedding by referring to the homogeneity index table through the example above. Observing the homogeneity index table in Table 2 again, the symbol "NP" in the first row means that it is not possible to embed a secret message if a pixel has only one homogeneity index. The "–" symbol represents that the homogeneity index is out of range. Taking the second row as an example, if a pixel K has two elements in its homogeneous representation group, the homogeneity value of this pixel is HV_K = 2 and the pixel has two homogeneity indices, either HI_K = 0 or HI_K = 1; homogeneity indices greater than or equal to 2 are certainly out of range.
Similar to the embedding example, the "NA" symbol in Table 2 means that we do not assign any bit pattern. It is worth mentioning that the bit pattern associated with a homogeneity index is not identical across homogeneity values, even when the same number of bits is conveyed. For example, (HV_K, HI_K) = (4, 0) in the fourth row, third column and (HV_K, HI_K) = (5, 0) in the fifth row, third column of Table 2 can both convey two bits of secret message; however, the former indicates the bit pattern "00", while the latter indicates the bit pattern "01". The benefit of adopting diverse bit patterns is to avoid coincident alteration of the homogeneity index when embedding the same amounts of secret messages. Another benefit is that diverse bit patterns reduce the changes encountered in the histogram distributions (in the red, green, and blue color channels) of the stego HDR image. This avoids the histogram-inspection attack commonly employed in steganalysis. We will present a quantitative measure of the histogram distribution in the experimental results, demonstrating that our proposed algorithm produces a similar histogram distribution even when a number of secret messages have been conveyed in the stego HDR image. Finally, the homogeneity index table is a necessity both in message embedding and in extraction. Therefore, we can use a secret key, Key-1, to increase security, thereby avoiding attacks by eavesdroppers.

Given an HDR image with a resolution of M × N, message embedding in the concise fundamental method proceeds in the following four steps:

Step 1: We examine every pixel according to a secret key, Key-2, which determines the embedding order of the secret message. For the examined pixel, such as K, we determine the corresponding homogeneous representation group (HRG_K) and calculate the homogeneity value (HV_K) of the HRG_K.

Step 2: For the examined pixel, the pixel capacity C_K is computed using Eq. (3). If the homogeneity value (HV_K) is less than or equal to 1, this pixel cannot convey any secret message; we go back to Step 1 and process the next pixel. Otherwise, we read in C_K bits of the secret message.

Step 3: We compute the current cover pixel status C(HV_K, HI_K). We then determine the desired stego pixel status S(HV_K, HI'_K) by referring to C(HV_K, HI_K), the homogeneity index table, and the secret message.

Step 4: We alter the current cover pixel K to become the stego pixel K' by selecting the element of HRG_K that has homogeneity index HI'_K. Once this has been done, we process the next pixel, starting from Step 1.

The total embedding capacity (TMC) of an HDR image can be computed by examining the homogeneity value of each pixel, as shown in Eq. (4).

TMC = Σ_{i=1}^{M×N} ⌊log2(HV_i)⌋    (4)

The extraction of the secret message is straightforward. Given a stego HDR image, we examine every pixel in the specific order derived using the secret key, Key-2. For each stego pixel K' we inspect, we compute its homogeneity value HV_K'. If HV_K' is less than or equal to 1, then this pixel conveys no secret message, and we process the next pixel in the specific order. Otherwise, we produce the homogeneous representation group HRG_K' for this stego pixel and calculate how many bits of secret message, say SM, are concealed in the stego pixel K', using Eq. (3). By comparing the stego pixel K' with all of the elements in HRG_K', we can determine the homogeneity index of the stego pixel, HI_K', and produce the status of the stego pixel S(HV_K, HI'_K). Given the secret key Key-1, we can produce the homogeneity index table (HIT). Finally, we can extract the SM bits of secret message by referring to the HIT and S(HV_K, HI'_K). This ends the extraction for the stego pixel K', and we proceed to extracting the secret messages concealed in the next stego pixel.
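A single-pixel embed/extract round trip, using homogeneous_group() from the sketch above and the HV = 5 row of Table 2, can be illustrated as follows (again ours, not the paper's code; a full implementation would dispatch on every HV row and traverse pixels in the order given by Key-2):

```r
hit_row5 <- c("01", "10", "11", "00")   # Table 2, HV = 5: patterns for HI 0..3

embed_bits <- function(p, msg) {        # msg is a 2-bit string, e.g. "10"
  hrg <- homogeneous_group(p)
  hi  <- match(msg, hit_row5) - 1       # desired stego index HI'
  hrg[[hi + 1]]                         # stego pixel K'
}

extract_bits <- function(p_stego) {
  hrg <- homogeneous_group(p_stego)
  hi  <- which(sapply(hrg, identical, p_stego)) - 1
  hit_row5[hi + 1]
}

K  <- c(20, 16, 60, 127)
Ks <- embed_bits(K, "10")               # -> c(40, 32, 120, 126), as in Table 3
extract_bits(Ks)                        # -> "10"
```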
3.2. Pixel category classification

The proposed method embeds secret messages into HDR images. In this section, we further discuss the issue of pixel classification in order to provide insight into the pixel distribution within a single HDR image. In addition, the pixel classification illustrates which pixels are eligible for message embedding.

We classify the pixels of an HDR image into a total of seven categories, as shown in Table 4, where max(r, g, b) represents the maximal value of a pixel over the three color channels. Whenever possible, we illustrate an example of a pixel in each category. Based on its features, a pixel is classified as either a "regular" or an "irregular" pixel, as shown in Table 4. A "regular" pixel has max(r, g, b) ≥ 128; in contrast, an "irregular" pixel has max(r, g, b) ≤ 127. Note that a pixel can belong to only one of these two categories ("regular" or "irregular"), not both. From the point of view of message embedding, a pixel can be classified into five categories. Pixels belonging to the first two

Table 4. The categories of pixels, based on two classification bases, in an HDR image encoded by the RGBE format.

Classification basis   Pixel category   Satisfied conditions                                 Example of a pixel P(r, g, b, e)
Pixel features         Regular          max(r, g, b) ≥ 128                                   P(12, 19, 132, 131)
                       Irregular        max(r, g, b) ≤ 127                                   P(127, 43, 64, 133)
Message embedding      Embeddable       2 ≤ HV_P ≤ 7                                         P(12, 132, 26, 134)
                       Promising        max(r, g, b) = 127 (HV_P = 2), or                    P(43, 127, 56, 125)
                                        max(r, g, b) = 254 and r, g, b all even (HV_P = 2)   P(254, 142, 38, 129)
                       Singular         HV_P = 1                                             P(129, 124, 122, 130)
                       Null             r = g = b = 0 (HV_P > 8)                             P(0, 0, 0, 128)
                       Neutral          HV_P = 8                                             P(128, 128, 128, 248)
Knowledge · Smart Study · Mastery: Expert Analysis of New Words

survey
Sound analysis: "ur" is pronounced /ɜː/; "ey" takes the weak sound /eɪ/.
Form analysis: sur + vey; a similar word is survive.
Meaning analysis: a detailed inspection or investigation.
Example: In five of the villages that were surveyed, non-farm work provided one quarter of their income.
Extension: related phrases: do a survey; make a survey of the land.

add
Sound analysis: before a doubled consonant letter, the vowel letter "a" takes the short sound /æ/.
Form analysis: a + dd (the doubled letters suggest adding together).
Meaning analysis: to find the total of two or more numbers; plus.
Example: The fire is going out; will you add some wood?
Extension: remember the related phrases: add in (include); add on (attach); add to (increase); add ... to ... (add something to something else).
Usage note, add vs. increase: add means "join so as to increase", e.g. add a name to the list; add a few words to what has been said. increase means "make or become larger in amount or number", e.g. His employer has increased his wages.
Example: Add up all the money I owe you.
2023-2024 Academic Year, Jiaozuo, Henan Province: Mid-term English Exam for Senior Grade 2, Second Semester

The Ark
Shanghai International Dance Center Theater will present "The Ark" this weekend, a two-dance performance by Chinese and foreign female choreographers. "Build Beauty" by Chinese choreographer Gong Xingxing and "Last Man Standing" by German choreographer Sita Ostheimer comprise "The Ark". Artists from Xiexin Dance Theater will perform both works.
Time: December 23, 7:30 pm
Admission: 180-580 yuan
Venue: Shanghai International Dance Center Theater

Belt and Road Initiative
The exhibition narrates the history of the ancient Silk Road and Shanghai's modern development. It features over 250 documents, artifacts, photos and videos. About 80 percent of the exhibits are on display in Shanghai for the first time. Highlighted items include a tiny replica of the treasure ship of Zheng He and some historical documents.
Time: Through late April, 2024
Admission: Free
Venue: Shanghai Archives

Live in Love!
The Shanghai Rainbow Chamber Singers will lead audiences to welcome the New Year with a concert, "Live in Love!" Starting at 10 pm on Sunday, the concert features RCS's original compositions covering the themes of love, memory, and farewells. The concert will end with the title song "Live in Love!" Audiences will be invited to stand up and set their emotions free together with the singers to welcome the New Year.
Time: December 31, 10 pm
Admission: 180-1,080 yuan
Venue: Shanghai Oriental Art Center

Inside No. 9
The popular British TV series "Inside No. 9" has been adapted into an immersive live theater performance. Three "Inside No. 9" stories will be performed live for the audience. The specially designed seats and stages will provide audiences with a one-of-a-kind immersive theater experience.
Time: Through February 29, 2:50 pm / 7:30 pm / 8:20 pm
Admission: 489-589 yuan
Venue: Shanghai Grand Theater

1. What is special about "The Ark"?
A. It will contain dances from the East and the West.
B. It will be composed all by German choreographers.
C. It will provide specially designed stages.
D. It will offer an immersive theater experience.

2. What can you enjoy at Shanghai Grand Theater?
A. Some brilliant dances.
B. Some operas with the theme of love.
C. Some performances based on a TV play.
D. Some videos about the ancient Silk Road.

3. Which one should you choose to experience a festive celebration?
A. The Ark.
B. Live in Love!
C. Belt and Road Initiative.
D. Inside No. 9.

This year, it was harder than ever to get into Harvard University. The prestigious college announced their lowest acceptance rate ever, welcoming only 1,968 of 57,435 first-year applicants into their hallowed halls. Thanks to Abigail Mack's moving, insightful essay, she will be one of the lucky students to matriculate this fall.

The Massachusetts high school senior used TikTok to share a part of the essay that made her one of the 4 percent of applicants who made the cut. Her essay focused on an unusual theme: the letter "S".

"I hate the letter 'S'," she read aloud on TikTok. "Of the 164,777 words with 'S', I only struggle with one. To condemn an entire letter because of its use 0.0006 percent of the time sounds statistically unreasonable, but that one case changed 100 percent of my life. I used to have two parents, but now I have one, and the 'S' in 'parents' isn't going anywhere."

"'S' follows me," she continued. "I can't get through a day without being reminded that while my friends went out to dinner with their parents, I ate with my parent.
As I write this essay, there is a blue line under the word 'parent' telling me to check my grammar; even Grammarly assumes that I should have parents, but cancer doesn't listen to edit suggestions."

She went on to explain that she fled that dreaded letter by throwing herself into school activities. She joined clubs and sports, and performed in theatrical productions, all in an effort to lessen the pain of losing her mom. Eventually, she realized she was hiding from her pain and decided to face it head-on. She took over the "S" for her own purposes. Now, instead of thinking about the "S" in parents, she concentrates on the double "S" in passion.

Abigail's essay earned her a spot at several top colleges and she has officially been accepted into the class of 2025 at Harvard. In the meantime, her essay has gone viral with over 16 million views!

4. What did the letter "S" mean to Abigail Mack?
A. A terrible failure.
B. An unfortunate fact.
C. A special challenge.
D. A meaningful experience.

5. What can we infer from paragraph 4 about Abigail Mack?
A. She isn't good at spelling.
B. She has poor grammar.
C. She has been struggling with cancer.
D. She has lost one of her parents.

6. How did Abigail Mack deal with her situation?
A. By writing more and more essays.
B. By reading all kinds of books.
C. By participating in various activities.
D. By competing with others secretly.

7. What would be the best title for the text?
A. Teen's Special Feeling for the Letter "S"
B. Teen's Essay Won Great Popularity Online
C. Teen's Secret to Achieving Academic Success
D. Teen Got Admitted to Harvard for Her Essay

Do you believe that most people are greedy or generous? It is easy to come up with examples of stories that could support either conclusion if we are relying on our memories or on our guts.

Recently, a team of researchers sought to investigate this question in partnership with the TED organization. TED generously gave away $10,000 each to 200 lucky individuals (yes, you read that correctly), which essentially means these participants won a lottery. Besides, they were asked to spend all the money in three months rather than save it. These participants were from three low-income countries (Indonesia, Brazil, Kenya) and four high-income countries (Australia, Canada, UK, USA). Over the next three months, participants were asked to track their spending to examine how generously or selfishly this money was spent. They reported their spending to the researchers a few months later.

Of the $10,000 participants received, they spent $6,431 on other people. To be clear, this also included certain behaviors in which the participants themselves benefited personally (such as taking their friends out to dinner or paying for a family vacation). But still, people are very generous. Participants gave away $1,697 strictly to charity or nonprofit organizations.

The researchers expected that if people publicly shared how they spent their money, they would be more generous. To check if this was correct, they asked half of the participants to post on Twitter about how they spent the money. The other half were asked to keep their spending "private". Surprisingly, the researchers saw that "generous spending was similar" between the Twitter and private groups. The mini lottery winners were no more or less generous depending on whether they posted their spending on Twitter or kept it to themselves. The authors admitted they expected the Twitter group to spend more generously, but this prediction was not supported by the data.
People did not need to have their spending shown publicly to behave generously.

8. What's the purpose of the researchers?
A. To confirm a scientific theory.
B. To research into human nature.
C. To analyze people's economic behaviour.
D. To classify people's spending habits.

9. What do we know about the study conducted by the team?
A. It was divided into two stages.
B. It focused on low-income people.
C. The participants were required to report their spending.
D. The participants could spend the money without restriction.

10. What does the underlined word "this" in paragraph 4 refer to?
A. People's sharing how they spent.
B. People's keeping their spending private.
C. People's spending habits in private.
D. People's being more generous in public.

11. What does the author intend to tell us?
A. Humans are fundamentally generous.
B. Money that is easily got will be spent soon.
C. Sharing spending online makes people generous.
D. People prefer to keep their spending to themselves.

Climate change causes tens of billions of dollars in economic damage in the United States every year. Climate change is expensive, deadly, but preventable, according to the new National Climate Assessment, the most sweeping, sophisticated federal analysis of climate change compiled to date.

"Climate change affects us all, but it doesn't affect us all equally," says climate scientist Katharine Hayhoe, one of the authors of the assessment.

"The research indicates that people with lower income have more trouble adapting to climate change, because adaptation comes at a cost," says Solomon Hsiang, a climate economist at the University of California.

For example, one of the simplest ways to adapt to severe heat waves is to run your air conditioner more. But "if people can't pay for it, then they can't protect themselves," explains Hsiang. Weather-related disasters in the U.S. cause about $150 billion each year in direct losses, according to the report. That's a lot of money, and it's only expected to go up as the Earth gets hotter. And the hotter it gets, the more profound the economic harm: twice as much planetary warming leads to more than twice as much economic harm, the assessment warns.

But it also points out many successful efforts underway to adapt to the new reality and to prevent worse outcomes. "It's not the message that if we don't hit 1.5 degrees, we're all going to die," says Hayhoe. "It's the message that everything we do matters. Every 10th of a degree of warming we avoid, there's a benefit to that."

There's been a slight shift in the report's perspective since the last one, says Candis Callison, a sociologist and author of the report. There's now a clear acknowledgement, developed through years of rigorous research, that the fossil fuel-powered society the U.S. built over generations was profoundly unjust. "Climate change actually provides us with an opportunity to address some of those inequities and injustices, and to respond to these impacts," Callison says. "That's really a powerful thing."

12. What do Katharine Hayhoe and Solomon Hsiang stress about climate change?
A. It results in lower income.
B. It leads to new unfairness.
C. It needs immediate action.
D. It causes economic damage.

13. What does the author intend to show by giving the example of the air conditioner?
A. Heat waves can be easily defeated.
B. Climate change leads to serious heat.
C. Adapting to climate change is time-consuming.
D. Dealing with climate change is expensive.

14. What does Katharine Hayhoe focus on in paragraph 6?
A. The potential risks of the new reality.
B. The consequences of not hitting 1.5 degrees.
C. The value of each small effort underway.
D. The achievements we have made.

15. What is Candis Callison's attitude towards climate change?
A. Optimistic.
B. Doubtful.
C. Worried.
D. Uncaring.

Gratitude, which is a positive emotional state, can have a profound impact on our mental, emotional, and physical well-being.

Gratitude offers us a way of embracing all that makes our lives what they are. __16__ It includes the willingness to expand our attention so that we perceive more of the goodness we are always receiving.

Robert Emmons is one of the world's leading experts on the science of gratitude. __17__ The first is an affirmation of goodness: people can learn to wake up to the good around them and notice the gifts they have received. The second part of gratitude is recognizing that the source of this goodness rests outside of oneself: we receive these gifts from other people, and sometimes from fate or the natural world. __18__

In one study involving nearly 300 adults seeking counseling services at a university, one group wrote a gratitude letter each week for three weeks. The gratitude group reported significantly better mental health (compared to the control group) 12 weeks after the last writing exercise. __19__ A study of this practice found that people who wrote down three things that had gone well in their day and identified the causes of those good things were significantly happier and less depressed.

__20__ When you practice gratitude, you shift your thoughts away from negative emotions and uncomfortable sensations. Instead, you begin to focus on good things that you may have overlooked. Rather than focusing on the misfortune of having a flat tire, for example, you consider how your job has made it possible to pay for repairs. Or you shift your focus to how fortunate you are to have close friends who are willing to drive you home.

I was sitting in the doctor's office waiting for my annual check-up. The doctor threw in a(n) __21__ that took me off guard.

"So Robin, what are you going to do after high school? Why don't you go to college to become a(n) __22__ like me?" he asked while writing on the file.

Go to college to become a doctor? Who was this man kidding? I thought he was __23__ for even suggesting it. My grades were __24__ and I wasn't college material. Embarrassed by his question, I __25__ to the doctor, "I'm not __26__ enough to be a doctor."

The doctor immediately looked me straight in the eyes when he said very __27__, "You don't have to be clever to be a doctor. You just have to be persistent."

Even though I wasn't college material, what the doctor said __28__ me. Three years later, I __29__ to a college close to my home and soon found myself walking the campus as a new student. I began __30__ anything that seemed frightening. I put all my energy toward passing the assignments one by one. I __31__ that when I was persistent, I could __32__ things I never believed possible.

I graduated with a master's degree in September 2023, two decades after that __33__ with my doctor. I __34__ I could shake his hand and tell him "thank you".
Sometimes even the smallest moments in time can have a life-changing __35__.

21. A. present  B. request  C. question  D. invitation
22. A. teacher  B. expert  C. doctor  D. scientist
23. A. crazy  B. boring  C. strange  D. considerate
24. A. stable  B. average  C. formal  D. excellent
25. A. complained  B. apologized  C. lied  D. replied
26. A. smart  B. careful  C. outgoing  D. patient
27. A. regretfully  B. proudly  C. gratefully  D. seriously
28. A. impressed  B. disturbed  C. limited  D. discouraged
29. A. pointed  B. returned  C. applied  D. adapted
30. A. taking over  B. breaking down  C. setting aside  D. giving up
31. A. promised  B. announced  C. agreed  D. discovered
32. A. understand  B. avoid  C. control  D. achieve
33. A. cooperation  B. experiment  C. conversation  D. argument
34. A. wish  B. think  C. insist  D. recall
35. A. purpose  B. influence  C. chance  D. choice

Read the following passage and fill in each blank with one appropriate word or the correct form of the word given in brackets. Write your answers on the answer sheet.
A very brief guide to using MXM

Michail Tsagris, Vincenzo Lagani, Ioannis Tsamardinos

1 Introduction

MXM is an R package which contains functions for feature selection, cross-validation, and Bayesian networks. The main functionalities focus on feature selection for different types of data. We highlight the option for parallel computing and the fact that some of the functions have been either partially or fully implemented in C++. As for the others, we always try to make them faster.

2 Feature selection related functions

MXM offers many feature selection algorithms, namely MMPC, SES, MMMB, FBED, and forward and backward regression. The target set of variables to be selected, ideally what we want to discover, is called the Markov blanket; it consists of the parents, children, and parents of children (spouses) of the variable of interest, assuming a Bayesian network over all variables.

MMPC stands for Max-Min Parents and Children. The idea is to use the max-min heuristic when choosing variables to put in the selected-variables set and proceed in this way. "Parents and children" comes from the fact that the algorithm will identify the parents and children of the variable of interest, assuming a Bayesian network. What it will not recover is the spouses of the children of the variable of interest. For more information the reader is addressed to [23]. MMMB (Max-Min Markov Blanket) extends MMPC to discovering the spouses of the variable of interest [19]. SES (Statistically Equivalent Signatures), on the other hand, extends MMPC to discovering statistically equivalent sets of selected variables [18, 9]. Forward and backward selection are the two classical procedures.

The functionality, or flexibility, offered by all these algorithms is their ability to handle many types of dependent variables, such as continuous, survival, categorical (ordinal, nominal, binary), and longitudinal. Let us now see all of them one by one. The relevant functions are:

1. MMPC and SES. SES uses MMPC to return multiple statistically equivalent sets of variables; MMPC returns only one set of variables. In all cases, the log-likelihood ratio test is used to assess the significance of a variable. These algorithms accept categorical-only, continuous-only, or mixed data on the predictor-variables side.

2. wald.mmpc and wald.ses. SES and MMPC using the Wald test. These two algorithms accept continuous predictor variables only.

3. perm.mmpc and perm.ses. SES and MMPC where the p-value is obtained using permutations. Similarly to the Wald versions, these two algorithms accept continuous predictor variables only.

4. ma.mmpc and ma.ses. MMPC and SES for multiple datasets measuring the same variables (dependent and predictors).

5. MMPC.temporal and SES.temporal. The usual SES and MMPC modified for correlated data, such as clustered or longitudinal data. The predictor variables can only be continuous.

6. fbed.reg. The FBED feature selection method [2]. The log-likelihood ratio test or the eBIC (BIC is a special case) can be used.

7. fbed.glmm.reg. FBED with generalised linear mixed models, for repeated measures or clustered data.

8. fbed.ge.reg. FBED with GEE, for repeated measures or clustered data.

9. ebic.bsreg. Backward selection method using the eBIC.

10. fs.reg. Forward regression method for all types of predictor variables and for most of the available tests below.

11. glm.fsreg. Forward regression method specifically for logistic and Poisson regression. The user can call this directly if he knows his data.

12. lm.fsreg. Forward regression method for normal linear regression. The user can call this directly if he knows his data.

13. bic.fsreg. Forward regression using only the BIC to decide whether to add a new variable. No statistical test is performed.

14. bic.glm.fsreg. The same as before, but for linear, logistic, and Poisson regression (GLMs).

15. bs.reg. Backward regression method for all types of predictor variables and for most of the available tests below.

16. glm.bsreg. Backward regression method for linear, logistic, and Poisson regression (GLMs).

17. iamb. The IAMB algorithm [20], which stands for Incremental Association Markov Blanket. The algorithm performs a forward regression at first, followed by a backward regression, offering two options: either the usual backward regression is performed, or a faster but perhaps less correct variation. In the usual backward regression, at every step the least significant variable is removed; in the original IAMB version, all non-significant variables are removed at every step.

18. mmmb. This algorithm works for continuous or categorical data only. After applying the MMPC algorithm, one can go to the selected variables and perform MMPC on each of them.
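As a usage illustration for the first two functions above, here is a hedged sketch on simulated continuous data; the argument names and output slots follow the package documentation at the time of writing, so check ?SES in your installed version:

```r
library(MXM)
set.seed(1)
dataset <- matrix(rnorm(200 * 50), nrow = 200)  # 200 samples, 50 candidate predictors
target  <- dataset[, 10] + 0.5 * dataset[, 20] + rnorm(200)
mod <- SES(target, dataset, max_k = 3, threshold = 0.05, test = "testIndFisher")
mod@selectedVars   # one selected set of variables
mod@signatures     # the statistically equivalent signatures
```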
A list of the available options for the test argument is given below. Make sure you include the test name within quotation marks ("") when you supply it. Most of these tests come in Wald and perm (permutation-based) versions, in which they may have slightly different acronyms; for example, waldBinary and WaldOrdinal denote logistic and ordinal regression respectively.

1. testIndFisher. A standard test of independence when both the target and the set of predictor variables are continuous (continuous-continuous).

2. testIndSpearman. A non-parametric alternative to the testIndFisher test [6].

3. testIndReg. In the case of the target-predictors being continuous-mixed or continuous-categorical, the suggested test is via standard linear regression. If the robust option is selected, M-estimators [11] are used. If the target variable consists of proportions or percentages (within the (0, 1) interval), the logit transformation is applied beforehand.

4. testIndRQ. Another robust alternative to testIndReg for the case of continuous-mixed (or continuous-continuous) variables. If the target variable consists of proportions or percentages (within the (0, 1) interval), the logit transformation is applied beforehand.

5. testIndBeta. When the target is a proportion (or percentage, i.e., between 0 and 1, not inclusive), the user can fit a regression model assuming a beta distribution [5]. The predictor variables can be either continuous, categorical, or mixed.

6. testIndPois. When the target is discrete, specifically count data, the default test is via Poisson regression. The predictor variables can be either continuous, categorical, or mixed.

7. testIndNB. As an alternative to Poisson regression, we have included negative binomial regression to capture cases of overdispersion [8]. The predictor variables can be either continuous, categorical, or mixed.

8. testIndZIP. When the number of zeros is larger than expected under a Poisson model, zero-inflated Poisson regression is to be employed [10]. The predictor variables can be either continuous, categorical, or mixed.

9. testIndLogistic. When the target is categorical with only two outcomes (success or failure, for example), binary logistic regression is to be used. Whether regression or classification is the task of interest, this method is applicable. Its advantage over linear or quadratic discriminant analysis is that it allows for categorical predictor variables as well, and for mixed types of predictors.
10. testIndMultinom. If the target has more than two outcomes but is of nominal type (political party, nationality, preferred basketball team), with no ordering of the outcomes, multinomial logistic regression will be employed. Again, this regression is suitable for classification purposes as well, and it allows for categorical predictor variables. The predictor variables can be either continuous, categorical, or mixed.

11. testIndOrdinal. This is a special case of multinomial regression in which the outcomes have an ordering, such as not satisfied, neutral, satisfied. The appropriate method is ordinal logistic regression. The predictor variables can be either continuous, categorical, or mixed.

12. testIndTobit (Tobit regression for left-censored data). Suppose you have measurements for which values below some threshold were not recorded. These are left-censored values, and by using a normal distribution we can bypass this difficulty. The predictor variables can be either continuous, categorical, or mixed.

13. testIndBinom. When the target variable is a matrix of two columns, where the first is the number of successes and the second is the number of trials, binomial regression is to be used. The predictor variables can be either continuous, categorical, or mixed.

14. gSquare. If all variables, both target and predictors, are categorical, the default test is the G² test of independence. An alternative to the gSquare test is testIndLogistic: with the latter, depending on the nature of the target (binary, unordered multinomial, or ordered multinomial), the appropriate regression model is fitted. The predictor variables can be either continuous, categorical, or mixed.

15. censIndCR. For time-to-event data, a Cox regression model [4] is employed. The predictor variables can be either continuous, categorical, or mixed.

16. censIndWR. A second model for time-to-event data: a Weibull regression model [14, 13]. Unlike the semi-parametric Cox model, the Weibull model is fully parametric. The predictor variables can be either continuous, categorical, or mixed.

17. censIndER. A third model for time-to-event data: an exponential regression model. The predictor variables can be either continuous, categorical, or mixed. This is a special case of the Weibull model.

18. testIndIGreg. When you have non-negative data, i.e., the target variable takes positive values (including 0), a suggested regression is based on the inverse Gaussian distribution. The link function is not the inverse of the square root as expected, but the logarithm. This ensures that the fitted values will always be non-negative. An alternative model is Weibull regression (censIndWR). The predictor variables can be either continuous, categorical, or mixed.

19. testIndGamma (gamma regression). The gamma distribution is designed for strictly positive data (greater than zero). It is used in reliability analysis as an alternative to Weibull regression. This test, however, does not accept censored data, just the usual numeric data. The predictor variables can be either continuous, categorical, or mixed.

20. testIndNormLog (Gaussian regression with a log link). Gaussian regression using the log link (instead of the identity) allows non-negative data to be handled naturally. Unlike gamma or inverse Gaussian regression, zeros are allowed. The predictor variables can be either continuous, categorical, or mixed.

21. testIndClogit. When the data come from a case-control study, the suitable test is via conditional logistic regression [7]. The predictor variables can be either continuous, categorical, or mixed.
mixed.
22. testIndMVReg. In the case of a multivariate continuous target, the suggested test is via a multivariate linear regression. The target variable can be compositional data as well [1]. These are positive data whose vectors sum to 1. They can sum to any constant, as long as it is the same, but for convenience reasons we assume that they are normalised to sum to 1. In this case the additive log-ratio transformation (multivariate logit transformation) is applied beforehand. The predictor variables can be either continuous, categorical or mixed.
23. testIndGLMMReg. In the case of a longitudinal or clustered target (continuous, or proportions within 0 and 1, not inclusive), the suggested test is via a (generalised) linear mixed model [12]. The predictor variables can only be continuous. This test is only applicable in SES.temporal and MMPC.temporal.
24. testIndGLMMPois. In the case of a longitudinal or clustered target (counts), the suggested test is via a (generalised) linear mixed model [12]. The predictor variables can only be continuous. This test is only applicable in SES.temporal and MMPC.temporal.
25. testIndGLMMLogistic. In the case of a longitudinal or clustered target (binary), the suggested test is via a (generalised) linear mixed model [12]. The predictor variables can only be continuous. This test is only applicable in SES.temporal and MMPC.temporal.

To avoid mistakes or a wrongly selected test, you are advised to specify the test you want the algorithms to use. All of these tests can be used with SES and MMPC, and with the forward and backward regression methods. MMMB accepts only testIndFisher, testIndSpearman and gSquare. The reason for this is that MMMB was designed for variables (dependent and predictors) of the same type. For more information the user should see the help page of each function. An example of supplying a test by name is sketched below.
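The following minimal R sketch shows how a test is supplied by name. The simulated data and object names are ours, purely for illustration, and the argument defaults shown (max_k = 3, threshold = 0.05) reflect our understanding of the package and should be checked against the help pages of the installed version:

    ## Minimal sketch: supplying conditional independence tests by name.
    library(MXM)
    set.seed(1)
    dataset <- matrix(rnorm(200 * 50), ncol = 50)  # 200 samples, 50 continuous predictors
    target <- rnorm(200)                           # continuous target

    ## Continuous target and predictors: the standard choice is testIndFisher.
    m1 <- MMPC(target, dataset, max_k = 3, threshold = 0.05, test = "testIndFisher")

    ## For a count target, the Poisson based test would be used instead.
    counts <- rpois(200, lambda = 5)
    m2 <- SES(counts, dataset, max_k = 3, threshold = 0.05, test = "testIndPois")

    m1@selectedVars  # indices of the selected variables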
2.1 A more detailed look at some arguments of the feature selection algorithms

SES, MMPC, MMMB, and the forward and backward regression methods offer the option for robust tests (the argument robust). This is currently supported for the Pearson correlation coefficient and linear regression only. We plan to extend this option to binary logistic and Poisson regression as well. These algorithms also have an argument user test. In the case that the user wants to use his own test, for example mytest, he can supply it in this argument as is, without quotes (""). For all previously mentioned regression based conditional independence tests, the argument works as test = "testIndFisher". In the case of the user test it works as user test = mytest.

The max k argument must always be at least 1 for SES, MMPC and MMMB; otherwise it is a simple filtering of the variables. The argument ncores offers the option for parallel implementation of the first step of the algorithms, the filtering step, where the significance of each predictor is assessed. If you have a few thousand variables, this option may bring no significant improvement. But if you have more, and a "difficult" regression test, such as quantile regression (testIndRQ), then with 4 cores this could reduce the computational time of the first step by up to nearly 50%. For the Poisson, logistic and normal linear regression we have included C++ code to speed up this process without the use of parallel.

FBED (Forward Backward Early Dropping) is a variant of forward selection: forward selection is performed in the first phase, followed by the usual backward regression. The variation is that, at every step, all non-significant variables are dropped, until no more significant variables are found or there is no variable left.

The forward and backward regression methods have a few different arguments. For example stopping, which can be either "BIC" or "adjrsq", with the latter being used only in the linear regression case. Every time a variable is significant it is added to the selected variables set. But it may be the case that it is actually not necessary, and for this reason we also calculate the BIC of the relevant model at each step. If the difference in BIC is less than the tol (argument) threshold value, the variable does not enter the set and the algorithm stops. The forward and backward regression methods can also proceed via the BIC alone. At every step of the algorithm, the BIC of the relevant model is calculated, and if the BIC of the model including a candidate variable is reduced by more than the tol (argument) threshold value, that variable is added. Otherwise the variable is not included and the algorithm stops.

2.2 Other relevant functions

Once SES or MMPC has finished, the user might want to see the model produced. For this reason the functions ses.model and mmpc.model can be used. If the user wants some summarised results with MMPC for many combinations of max k and threshold values, he can use the mmpc.path function. A sketch of these searches is given below.
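The sketch below runs a BIC-based forward search and then a grid of MMPC configurations. It is an illustration only; in particular the grid argument names (max_ks, alphas) are our reading of the package and should be verified against the mmpc.path help page:

    ## Sketch: BIC-based forward regression, then a grid of MMPC runs.
    library(MXM)
    set.seed(2)
    x <- matrix(rnorm(300 * 30), ncol = 30)
    y <- x[, 1] - 0.5 * x[, 7] + rnorm(300)

    ## Add a variable only if it reduces the BIC by more than tol.
    fs <- bic.fsreg(y, x, tol = 2)

    ## Summarised MMPC results over several max_k / threshold combinations
    ## (argument names here are assumptions based on the text above).
    path <- mmpc.path(y, x, max_ks = 2:4, alphas = c(0.01, 0.05))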
Ridge regression (ridge.reg and ridge.cv) has been implemented. Note that ridge regression is currently offered only for linear regression with continuous predictor variables. As for miscellaneous functions, we have implemented the zero inflated Poisson and beta regression models, should the user want to use them.

2.3 Cross-validation

cv.ses and cv.mmpc perform a K-fold cross-validation for most of the aforementioned regression models. There are many metric functions to be used, appropriate for each case. The folds can be generated in a stratified fashion when the dependent variable is categorical.

3 Networks

Currently three algorithms for constructing Bayesian networks (or their skeleton) are offered, plus modifications (an example is sketched after the list of utility functions below).

MMHC (Max-Min Hill-Climbing) [23] (mmhc.skel), which constructs the skeleton of the Bayesian network (BN). This has the option of running SES [18] instead.

MMHC (Max-Min Hill-Climbing) [23] (local.mmhc.skel), which constructs the skeleton around a selected node. It identifies the parents and children of that node and then finds their parents and children.

MMPC followed by the PC rules. This is the command mmpc.or.

The PC algorithm [15] (pc.skel), for which the orientation rules (pc.or) have been implemented as well. Both of these algorithms accept continuous only, categorical data only, or a mix of continuous, multinomial and ordinal data. The skeleton of the PC algorithm has the option for permutation based conditional independence tests [21].

The functions ci.mm and ci.fast perform a symmetric test with mixed data (continuous, ordinal and binary data) [17]. This is employed by the PC algorithm as well.

Bootstrap of the PC algorithm to estimate the confidence of the edges (pc.skel.boot).

PC skeleton with repeated measures (glmm.pc.skel). This uses the symmetric test proposed by [17] with generalised linear models.

Skeleton of a network with continuous data using forward selection (the command corfs.network performs a task similar to MMHC). It goes to every variable and, instead of applying the MMPC algorithm, it applies the forward selection regression. All data must be continuous, since the Pearson correlation is used. The algorithm is fast, since the forward regression with the Pearson correlation is very fast.

We also have utility functions, such as:
1. rdag and rdag2. Data simulation assuming a BN [3].
2. findDescendants and findAncestors. Descendants and ancestors of a node (variable) in a given Bayesian network.
3. dag2eg. Transforming a DAG into an essential (mixed) graph, its class of equivalent DAGs.
4. equivdags. Checking whether two DAGs are equivalent.
5. is.dag. In fact this checks whether cycles are present, by trying to topologically sort the edges. BNs do not allow for cycles.
6. mb. The Markov blanket of a node (variable) given a Bayesian network.
7. nei. The neighbours of a node (variable) given an undirected graph.
8. undir.path. All paths between two nodes in an undirected graph.
9. transitiveClosure. The transitive closure of an adjacency matrix, with and without arrowheads.
10. bn.skel.utils. Estimation of the false discovery rate [22], plus AUC and ROC curves based on the p-values.
11. bn.skel.utils2. Estimation of the confidence of the edges [16], plus AUC and ROC curves based on the confidences.
12. plotnetwork. Interactive plot of a graph.
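By way of example, a skeleton might be estimated as in the sketch below. The data are simulated, and the exact argument names and defaults are as we understand the package; they should be checked against the help pages:

    ## Sketch: skeletons via the MMHC-style and PC-style algorithms.
    library(MXM)
    set.seed(3)
    dat <- matrix(rnorm(500 * 10), ncol = 10)  # continuous data only

    ## Skeleton of a Bayesian network, with MMPC run around every variable.
    sk1 <- mmhc.skel(dat, max_k = 3, threshold = 0.05, test = "testIndFisher")

    ## PC algorithm skeleton, followed by the orientation rules.
    sk2 <- pc.skel(dat, method = "pearson", alpha = 0.05)
    dag <- pc.or(sk2)

    plotnetwork(sk1$G)  # interactive plot of the estimated graph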
4 Acknowledgments

The research leading to these results has received funding from the European Research Council under the European Union's Seventh Framework Programme (FP/2007-2013) / ERC Grant Agreement n. 617393.

References

[1] John Aitchison. The Statistical Analysis of Compositional Data. Chapman and Hall, London, 1986.
[2] Giorgos Borboudakis and Ioannis Tsamardinos. Forward-Backward Selection with Early Dropping, 2017.
[3] Diego Colombo and Marloes H Maathuis. Order-independent constraint-based causal structure learning. Journal of Machine Learning Research, 15(1):3741–3782, 2014.
[4] David R Cox. Regression models and life-tables. Journal of the Royal Statistical Society, 34(2):187–220, 1972.
[5] Silvia Ferrari and Francisco Cribari-Neto. Beta regression for modelling rates and proportions. Journal of Applied Statistics, 31(7):799–815, 2004.
[6] Edgar C Fieller and Egon S Pearson. Tests for rank correlation coefficients: II. Biometrika, 48:29–40, 1961.
[7] Mitchell H Gail, Jay H Lubin, and Lawrence V Rubinstein. Likelihood calculations for matched case-control studies and survival studies with tied death times. Biometrika, 68(3):703–707, 1981.
[8] Joseph M Hilbe. Negative Binomial Regression. Cambridge University Press, 2011.
[9] Vincenzo Lagani, Giorgos Athineou, Alessio Farcomeni, Michail Tsagris, and Ioannis Tsamardinos. Feature selection with the R package MXM: Discovering statistically-equivalent feature subsets. Journal of Statistical Software, 80(7), 2017.
[10] Diane Lambert. Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics, 34(1):1–14, 1992.
[11] Ricardo A Maronna, R Douglas Martin, and Victor J Yohai. Robust Statistics. John Wiley & Sons, Chichester, 2006.
[12] Jose Pinheiro and Douglas Bates. Mixed-Effects Models in S and S-PLUS. Springer Science & Business Media, 2006.
[13] FW Scholz. Maximum likelihood estimation for type I censored Weibull data including covariates, 1996.
[14] Richard L Smith. Weibull regression models for reliability data. Reliability Engineering & System Safety, 34(1):55–76, 1991.
[15] Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. The MIT Press, second edition, 2001.
[16] Sofia Triantafillou, Ioannis Tsamardinos, and Anna Roumpelaki. Learning neighborhoods of high confidence in constraint-based causal discovery. In European Workshop on Probabilistic Graphical Models, pages 487–502. Springer, 2014.
[17] Michail Tsagris, Giorgos Borboudakis, Vincenzo Lagani, and Ioannis Tsamardinos. Constraint-based causal discovery with mixed data. In The 2017 ACM SIGKDD Workshop on Causal Discovery, 14/8/2017, Halifax, Nova Scotia, Canada, 2017.
[18] I Tsamardinos, V Lagani, and D Pappas. Discovering multiple, equivalent biomarker signatures. In Proceedings of the 7th Conference of the Hellenic Society for Computational Biology & Bioinformatics, Heraklion, Crete, Greece, 2012.
[19] Ioannis Tsamardinos, Constantin F Aliferis, and Alexander Statnikov. Time and sample efficient discovery of Markov blankets and direct causal relations. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 673–678. ACM, 2003.
[20] Ioannis Tsamardinos, Constantin F Aliferis, and Alexander R Statnikov. Algorithms for large scale Markov blanket discovery. In FLAIRS Conference, volume 2, pages 376–380, 2003.
[21] Ioannis Tsamardinos and Giorgos Borboudakis. Permutation testing improves Bayesian network learning. In ECML PKDD '10: Proceedings of the 2010 European Conference on Machine Learning and Knowledge Discovery in Databases, pages 322–337. Springer-Verlag, 2010.
[22] Ioannis Tsamardinos and Laura E Brown. Bounding the false discovery rate in local Bayesian network learning. In AAAI, pages 1100–1105, 2008.
[23] Ioannis Tsamardinos, Laura E Brown, and Constantin F Aliferis. The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning, 65(1):31–78, 2006.
2022-2023 Academic Year Beijing Xicheng District Senior Three (First) Mock Examination: English
School: ___________ Name: ___________ Class: ___________ Examination number: ___________
Notes:
1. Before answering, candidates must write their name and examination number on the answer sheet.
2. For multiple-choice questions, after choosing the answer to each question, use a pencil to blacken the label of the corresponding answer on the answer sheet; to change an answer, erase it cleanly before marking another label. For non-multiple-choice questions, write the answers on the answer sheet; answers written on the test paper are invalid.
3. After the examination, hand in this test paper together with the answer sheet.
第I卷(选择题)一、阅读理解(本大题共14小题,共分)ADear Teachers and Parents,This June, during Financial Literacy Month, we have some to share.In 2015, a free online financial education course named FutureSmart was introduced to middle school students, specifically targeting this group at a time in their lives when financial habits take hold and grow.Fast forward to today, FutureSmart, available in English and Spanish, has reached over 13,000 schools across all 50 states. More than two million students have completed the course, with almost half coming from low-to-moderate income families.But we aren't stopping there. We promise to reach four million more students by the end of 2025.Why? Because this moment calls for brave action. Never before have money management and investment decisions been so easy to conduct at any time or place through the use of a smartphone. It is time to offer students more critical financial literacy education to encourage them to make good financial decisions on a daily basis as they make their way through a complex world.From weighing opportunity costs to delaying instant satisfaction for long-term financial gain, FutureSmart educates our youth using hands-on simulations (模拟) to introduce concepts like daily financial decisions and the rewards of long-term planning. Teaching young learners how to build solid financial foundations is an important step in building financially healthy communities.Although our work is far from complete, we know that FutureSmart works. And it works exceptionally well.In the largest study of its kind, supported by the MassMutual Foundation and EVERFI, the University of Massachusetts Donahue Institute (UMDI) recently concluded that 90% of students saw a statistically significant and educationally meaningful increase in knowledge after taking the FutureSmart course.What's more, these results were consistent across all student demographics including race, age, gender, school year, and socioeconomic status.We have a long way to go to reach every single middle school student, but we welcome the challenge. Together, our teams have started a movement to provide equal access to financial education, and we invite others to join us.Visit to learn more and see how you can bring FutureSmart to the young people in your life. MICHAEL FANNING RAY MARTINEZHead of MassMutual US President and Co-Founder of EVERFI1. The course FutureSmart _______.A. is offered in two different languagesB. requires skillful smartphone operationC. has been bought by a large number of schoolsD. targets students from low-to-moderate income families2. How does FutureSmart introduce financial concepts?A. By establishing financially healthy communities.B. By managing opportunities and rewards.C. By simulating real-life situations.D. By delaying financial gain.3. After taking the course, the students should be able to ______.A. improve their academic performanceB. accept financial challenges at any timeC. understand people from various backgroundsD. build a stable financial foundation for the futureBI was sitting in a chemistry lab class during my first year of university, nervous about the experiment we were to perform. I grabbed a pipette and, as I feared, my hand started to shake. The experience was disheartening. I was hoping to pursue a career in science, but I started to wonder whether that would be possible. I thought my dreams had crashed to the ground.I was a boy born with brain damage. My family managed to find good doctors where we lived, in Leningrad (now St. 
Petersburg), Russia, and I took part in clinical trials testing new treatments. Shortly after my first birthday, I started walking and it became clear my intelligence function was unaffected. So, in some sense, I was lucky. Still, I couldn't do some things growing up. Both hands shook, especially when I was nervous or embarrassed. My left hand was much worse than my right, so I learned to write and do simple tasks with my right hand, but it wasn't easy to do anything precisely.As a teenager, I faced a lot of bullying at school. Feeling alone, I joined a study group called "The natural world". I thought that getting into the world of animals would keep me away from people. That's how I came into the field of biology. At university, I enjoyed the lectures in my science classes. Many lab tasks proved impossible, however. As I struggled with my mood, I read a book about depression. From then on, the physiology of mental disorders became my scientific passion. I looked into what was being done locally and was excited to discover a lab that did behavioral experiments in rats to study depression.At the end of my second year, I approached the professor of the lab to see whether I could work with her. I was afraid to admit I couldn't do some lab tasks. To my relief, she was completely supportive. She set me to work performing behavioral experiments for others in the lab with the help of colleagues. I loved the supportive atmosphere and stayed there to complete my master's and Ph.D.I've come to realize that my hands aren't the barrier I thought they were. By making use of my abilities and working as part of a team, I've been able to follow my passions. I've also realized that there's much more to being a scientist than performing the physical labor. I may not collect all the data in mypapers, but I'm fully capable of designing experiments and interpreting results, which, to me, is the most exciting part of science.4. What was the author's dream?A. To live a normal life.B. To become a scientist.C. To get a master's degree.D. To recover from depression.5. The author said he was lucky in Paragraph 2 because ________.A. he didn't lose the function of both handsB. he learned how to walk at the age of oneC. his family could afford to see good doctorsD. his brain damage didn't affect his intellectual capacity6. From the passage, it is clear that ________.A. the author's own depression inspired him to help others with mental disordersB. the author was surrounded by a team who urged him to further his studyC. the author's loneliness moved him towards the world of biologyD. the author finally finished the lab tasks on his own7. What message does the author want to express?A. Loving yourself makes a difference.B. Opportunity follows prepared people.C. A bright future begins with a small dream.D. The sun somehow shines through the storm.CImagine a simple blood test that could flag most kinds of cancers at the earliest, most curable stage. Liquid biopsies could, in theory, detect a tumor (肿瘤) well before it could be found by touch, symptoms or imaging. Blood tests could avoid the need for surgeons to cut tissue samples and make it possible to reveal cancer hiding in places needles and scalpels cannot safely reach. They could also determine what type of cancer is taking root to help doctors decide what treatment might work best to destroy it.Liquid biopsies are not yet in hand, because it is hard to find definitive cancer signals in a tube of blood, but progress in recent years has been impressive. 
Last year the journal Science published the firstbig prospective study of a liquid biopsy for DNA and proteins from multiple types of cancers. Though far from perfect, the blood test called CancerSEEK found 26 tumors that had not been discovered with conventional screenings.Liquid biopsies can rely on a variety of biomarkers in addition to tumor DNA and proteins, such as free-floating cancer cells themselves. But what makes the search difficult, Ana Robles, a cancer biologist of the National Cancer Institute, explains, is that "if you have an early-stage cancer or certain types of cancer, there might not be a lot of tumor DNA," and tests might miss it. The ideal blood test will be both very specific and very sensitive so that even tiny tumors can be found. To tackle this challenge, CancerSEEK looks for cancer-specific mutations (突变) on 16 genes, and for eight proteins that are linked to cancer and for which there are highly sensitive tests.Simple detection is not the only goal. An ideal liquid biopsy will also determine the likely location of the cancer so that it can be treated. "Mutations are often shared among different kinds of cancer, so if you find them in blood, you don't know if that mutation is coming from a stomach cancer or lung cancer," says Anirban Maitra, a cancer scientist at the Anderson Cancer Center. To solve that problem, some newer liquid biopsies look for changes in gene expression. Such changes, Maitra notes, are "moreorgan-specific".On the nearer horizon are liquid biopsies to help people already diagnosed with cancer. Last year the government approved the first two such tests, which scan for tumor DNA so doctors can select mutation-targeted drugs. Scientists are working on blood tests to detect the first signs of cancer recurrence (复发) in patients who have completed treatment. This work is moving fast, but does it save lives?That is the question companies such as Thrive and Grail must answer for their broadly ambitious screening tests. "These companies have to prove that they can detect early cancer and, more important, that the early detection can have an impact on cancer survival," Maitra observes.8. According to the passage, liquid biopsies are expected to ______.A. flag cancer and determine the treatmentB. detect cancer signals from a sample of bloodC. take images of tumors and prevent potential cancersD. show types of cancer by measuring the amount of proteins9. What can we learn from the passage?A. Signs of cancer recurrence are not detectable.B. Different kinds of cancer have different gene mutations.C. Biomarkers are much more reliable than tumor DNA and proteins.D. Organ-specific cancers will be identified through changes in gene expression.10. The author is mostly concerned about whether ______.A. liquid biopsies can discover tumors conventional screenings can't findB. liquid biopsies can improve the application of mutation-targeted drugsC. liquid biopsies can help save the lives of those with cancerD. liquid biopsies can be developed for cancer preventionDTechnology seems to discourage slow, immersive reading. Reading on a screen, particularly a phone screen, tires your eyes and makes it harder for you to keep your place. So online writing tends to be more skimmable and list-like than print. The cognitive neuroscientist Mary Walt argued recently that this "new norm" of skim reading is producing "an invisible, game-changing transformation" in how readers process words. 
The neuronal circuit that sustains the brain's capacity to read now favors the rapid absorption of information, rather than skills developed by deeper reading, like critical analysis.We shouldn't overplay this danger. All readers skim. Skimming is the skill we acquire as children as we learn to read more skillfully. From about the age of nine, our eyes start to bounce around the page, reading only about a quarter of the words properly, and filling in the gaps by inference. Nor is there anything new in these fears about declining attention spans. So far, the anxieties have proved to be false alarms. "Quite a few critics have been worried about attention span lately and see very short stories as signs of cultural decline," the American author Selvin Brown wrote. "No one ever said that poems were evidence of short attention spans."And yet the Internet has certainly changed the way we read. For a start, it means that there is more to read, because more people than ever are writing. If you time travelled just a few decades into the past, you would wonder at how little writing was happening outside a classroom. And digital writing is meant for rapid release and response. An online article starts forming a comment string underneath as soon as it is published. This mode of writing and reading can be interactive and fun. But often it treats other people'swords as something to be quickly harvested as fodder to say something else. Everyone talks over the top of everyone else, desperate to be heard.Perhaps we should slow down. Reading is constantly promoted as a social good and source of personal achievement. But this advocacy often emphasizes "enthusiastic", "passionate" or "eager" reading, none of which adjectives suggest slow, quiet absorption.To a slow reader, a piece of writing can only be fully understood by immersing oneself in the words and their slow comprehension of a line of thought. The slow reader is like a swimmer who stops counting the number of pool laps he has done and just enjoys how his body feels and moves in water.The human need for this kind of deep reading is too tenacious for any new technology to destroy. We often assume that technological change can't be stopped and happens in one direction, so that older media like "dead-tree" books are kicked out by newer, more virtual forms. In practice, older technologies can coexist with new ones. The Kindle has not killed off the printed book any more than the car killed off the bicycle. We still want to enjoy slowly-formed ideas and carefully-chosen words. Even in a fast-moving age, there is time for slow reading.11. What is the author's attitude towards Selvin Brown's opinion?A. Favorable.B. Critical.C. Doubtful.D. Objective.12. The author would probably agree that _______.A. advocacy of passionate reading helps promote slow readingB. digital writing leads to too much speaking and not enough reflectionC. the public should be aware of the impact skimming has on neuronal circuitsD. the number of Internet readers is declining due to the advances of technology13. What does the underlined word "tenacious" in Paragraph 6 probably mean?A. Comprehensive.B. Complicated.C. Determined.D. Apparent.14. Which would be the best title for the passage?A. Slow Reading Is Here to StayB. Digital Technology Prevents Slow ReadingC. Screen vs. Print: Which Requires Deep Reading?D. Reading Is Not a Race: The Wonder of Deep Reading二、阅读七选五(本大题共5小题,共分)Adults are often embarrassed about asking for aid. It's an act that can make people feel emotionally unsafe. 
(1) Seeking assistance can feel like you are broadcasting your incompetence.New research suggests young children don't seek help in school, even when they need it, for the same reason. Until recently, psychologists assumed that children did not start to care about their reputation and their friends' thoughts about them until around age nine.But our research suggests that as early as age seven, children begin to connect asking for help with looking incompetent in front of others. At some point, every child struggles in the classroom. (2) To learn more about how children think about reputation, we created simple stories and then asked children questions about these situations to allow kids to showcase their thinking.Across several studies, we asked 576 children, ages four to nine, to predict the behavior of two kids in a story. One of the characters genuinely wanted to be smart, and the other merely wanted to seem smart to others. In one study, we told children that both kids did poorly on a test. (3) The four-year-olds were equally likely to choose either of the two kids as the one who would seek help. But by age seven or eight, children thought that the kid who wanted to seem smart would be less likely to ask for assistance. And children's expectations were truly "reputational" in nature-they were specifically thinking about how the characters would act in front of others. When assistance could be sought privately (on a computer rather than in person), children thought both characters were equally likely to ask for it.(4) Teachers could give children more opportunities to seek assistance privately. They should also help students realize asking questions in front of others as normal, positive behavior. (5) Parents could point out how a child's question kicked off a valuable conversation in which the entire family got to talk and learn together. Adults could praise kids for seeking assistance. These responses send a strong signal that other people value a willingness to ask for aid and that seeking help is part of a path to success.A. Kids could be afraid to ask their parents for help.B. Seeking help could even be taught as socially desirable.C. In another study we told them that only one kid did poorly.D. Such reputational barriers likely require reputation-based solutions.E. The moment you ask for directions, after all, you reveal that you are lost.F. But if they are afraid to ask for help because their classmates are watching, learning will suffer.G. We then asked which of these characters would be more likely to raise their hand in front of their class to ask the teacher for help.15. A. A B. B C. C D. D E. EF. FG. G16. A. A B. B C. C D. D E. EF. FG. G17. A. A B. B C. C D. D E. EF. FG. G18. A. A B. B C. C D. D E. EF. FG. G19. A. A B. B C. C D. D E. EF. FG. G三、完形填空(本大题共10小题,共分)As a child growing up in the 1980s, Marlene Irvin took many trips to Joyland, an amusement park in her hometown of Wichita, Kansas. She got excited the moment her family drove intoJoyland's parking lot. "The carousel circling at the entrance to the park was always the (20) for me," Marlene said. "I could watch the horses for hours."Joyland certainly made a/an (21) impression on Marlene, as she got her "first real job" years later at Wichita's Chance Manufacturing, the largest maker of amusement park rides in the world at the time. Marlene started in the fiberglass workshop, where the carousel horses' frames, along with parts for Ferris wheels, roller coasters, and other rides, were pieced together. 
She (22) found her way to Chance's art and decoration department, becoming one of the lead horse artists. Then, afterworking at Chance for nearly fifteen years, Marlene decided to start her own business, focusing on carousel restoration.Around the same time, Joyland started experiencing a (23) in attendance. At last, to the heartbreak of Wichitans young and old, Joyland (24) after more than fifty years of operation. Local preservation organizations purchased some of the park's historical items, and Joyland'sthirty-six carousel horses were donated to Botanica, a Wichita-owned botanical garden. Botanica asked Marlene to (25) the old, broken horses, and she accepted the challenge.As Marlene finished each horse, Botanica (26) them for the public to see. Although they looked (27) compared to their glory (辉煌) days at Joyland, thanks to Marlene's artistic efforts, the horses impressed observers even more than they had before. When native Wichitans saw them, their most (28) question was: "Will we be able to ride them?" Even as (29) , they remembered riding the horses at Joyland when they were kids.Marlene always smiled and answered: "They've been waiting for you to come back."20. A. memory B. dream C. highlight D. comfort21. A. immediate B. lasting C. accurate D. general22. A. suddenly B. definitely C. hesitantly D. eventually23. A. decline B. break C. boost D. return24. A. went down B. fell down C. got down D. shut down25. A. replace B. rearrange C. restore D. reuse26. A. displayed B. moved C. protected D. advertised27. A. modern B. different C. attractive D. unique28. A. basic B. unexpected C. common D. remarkable29. A. repairmen B. customers C. residents D. adults第II卷(非选择题)四、语法填空(本大题共3小题,共分)30.My name is Barbara and I work at a department store. I (1) (work) there for one year when another Barbara joined the staff. Then I changed my name tag from "Barb" to "Barbie". (2) made me feel funny was how small kids talked about me. "Is she really Barbie?" they asked. I changed it at my other job, too and began answering the phone, "This is Barbie. How can I help you?" The callers have gotten used to that over time, ninety percent of (3) now respond with my name: "Barbie, can you tell me." Pronouncing that long "e" sound forces your mouth into a smile, but I have found the smile is usually returned voluntarily.(1)(2)(3)31. It's said that for the Englishman, his house is his castle. However, this does not mean that his house is a beautiful palace that others (1) (invite) to see. For the British, the home is a place to protect oneself from the outside world. It's a private place in which he goes to hide away (2) the troubles of life. To the American, the home is an expression of (3) (he). Much money is often spent on each and every room (4) (create) the right "feel" according to the person's lifestyle. Therefore, he is happy to show his house to others.(1)(2)(3)(4)32. Smoke jumpers are firefighters, trained to fight fires in places that fire engines can't reach. They travel in small planes and, (1) (use) a parachute, jump into remote wild areas to fight fires. Smoke jumpers have to respond quickly. While a fire is still small, the pilot (2) (drop) team members into the area as needed. Their first job may be to build a fire line to stop the fire from spreading. Water is sent down to them. Smoke jumpers must be (3) (high) trained, very experienced and extremely fit. Their job is very dangerous.(1)(2)(3)五、阅读表达(本大题共1小题,共分)33. 
In Martin County, Florida, two non-profit organizations have come together to plant seeds of hope through community gardening. Recently, the House of Hope charity for the homeless and people with addictions and other mental health issues partnered with Project L.I.F.T., an organization that helps at-risk teens, to grow community gardens in four small towns across the county.

The teens in Project L.I.F.T.'s program (many of them aged 14-19, who are also struggling with addictions or managing mental health or legal issues) visit the gardens every day after school, where they grow seeds, maintain and water plants, harvest the produce and learn to create their own meals. They take some of the produce home to their families, but most is sent to House of Hope for the homeless community.

Beyond the need for food, Project L.I.F.T. hoped the gardens would provide an educational opportunity for their teens. "We're trying to teach kids nutrition to deal with the health problems (diabetes and obesity) in our community, but when we get into the garden, now they're doing hands-on stuff that really connects," Bob Zaccheo, the executive director of Project L.I.F.T., tells Guideposts.org.

The gardens also offer the teens professional skills that can help them find work later in their largely rural county. Beyond skills, this project has helped the teens find confidence and hope for their futures.

So far, the four gardens around Martin County have generated 100 pounds of produce for House of Hope and the community at large. Although the amount of food can't meet the greater need of the area, the program is an opportunity to teach kids that the importance of giving back is just as valuable as the food they're harvesting.

"You see a major shift in the thinking of these kids," Zaccheo says. "You see them giving. The kids are learning to give at a bigger level than they've ever been able to give at before."

(1) What kind of organization is Project L.I.F.T.?
______________________________________________________________________
(2) What do the teens do when they visit the gardens?
______________________________________________________________________
(3) Please decide which part is false in the following statement, then underline it and explain why.
The four gardens were built only to provide an educational opportunity for at-risk teens.
______________________________________________________________________
(4) In addition to what is mentioned in the passage, what else could at-risk teens learn through community gardening? Explain why. (In about 40 words)
______________________________________________________________________

VI. Written Expression (1 question in this section)

34. Suppose you are Li Hua, a Senior Three student at Hongxing Middle School. You have recently received a letter from your British friend Jim and learned that his plan to take a week-long cycling trip with his friends has not been approved by his parents, and he feels very frustrated.
Network impacts of a road capacity reduction: Empirical analysis and model predictions

David Watling a, David Milne a, Stephen Clark b
a Institute for Transport Studies, University of Leeds, Woodhouse Lane, Leeds LS2 9JT, UK
b Leeds City Council, Leonardo Building, 2 Rossington Street, Leeds LS2 8HD, UK

Article history: Received 24 May 2010. Received in revised form 15 July 2011. Accepted 7 September 2011.

Keywords: Traffic assignment; Network models; Equilibrium; Route choice; Day-to-day variability

Abstract: In spite of their widespread use in policy design and evaluation, relatively little evidence has been reported on how well traffic equilibrium models predict real network impacts. Here we present what we believe to be the first paper that together analyses the explicit impacts on observed route choice of an actual network intervention and compares this with the before-and-after predictions of a network equilibrium model. The analysis is based on the findings of an empirical study of the travel time and route choice impacts of a road capacity reduction. Time-stamped, partial licence plates were recorded across a series of locations, over a period of days both with and without the capacity reduction, and the data were 'matched' between locations using special-purpose statistical methods. Hypothesis tests were used to identify statistically significant changes in travel times and route choice, between the periods of days with and without the capacity reduction. A traffic network equilibrium model was then independently applied to the same scenarios, and its predictions compared with the empirical findings. From a comparison of route choice patterns, a particularly influential spatial effect was revealed of the parameter specifying the relative values of distance and travel time assumed in the generalised cost equations. When this parameter was 'fitted' to the data without the capacity reduction, the network model broadly predicted the route choice impacts of the capacity reduction, but with other values it was seen to perform poorly. The paper concludes by discussing the wider practical and research implications of the study's findings. (c) 2011 Elsevier Ltd. All rights reserved.

1. Introduction

It is well known that altering the localised characteristics of a road network, such as a planned change in road capacity, will tend to have both direct and indirect effects. The direct effects are imparted on the road itself, in terms of how it can deal with a given demand flow entering the link, with an impact on travel times to traverse the link at a given demand flow level. The indirect effects arise due to drivers changing their travel decisions, such as choice of route, in response to the altered travel times. There are many practical circumstances in which it is desirable to forecast these direct and indirect impacts in the context of a systematic change in road capacity. For example, in the case of proposed road widening or junction improvements, there is typically a need to justify economically the required investment in terms of the benefits that will likely accrue. There are also several examples in which it is relevant to examine the impacts of road capacity reduction. For example, if one proposes to reallocate road space between alternative modes, such as increased bus and cycle lane provision or a pedestrianisation scheme, then typically a range of alternative designs exist which may differ in their ability to accommodate efficiently the new traffic and routing patterns.

Through mathematical modelling, the alternative designs may be tested in a simulated environment and the most efficient selected for implementation. Even after a particular design is selected, mathematical models may be used to adjust signal timings to optimise the use of the transport system. Road capacity may also be affected periodically by maintenance to essential services (e.g. water, electricity) or to the road itself, and often this can lead to restricted access over a period of days and weeks. In such cases, planning authorities may use modelling to devise suitable diversionary advice for drivers, and to plan any temporary changes to traffic signals or priorities. Berdica (2002) and Taylor et al. (2006) suggest more of a pro-active approach, proposing that models should be used to test networks for potential vulnerability, before any reduction materialises, identifying links which, if reduced in capacity over an extended period [1], would have a substantial impact on system performance.

[1] Clearly, more sporadic and less predictable reductions in capacity may also occur, such as in the case of breakdowns and accidents, and environmental factors such as severe weather, floods or landslides (see for example, Iida, 1999), but the responses to such cases are outside the scope of the present paper.

There are therefore practical requirements for a suitable network model of travel time and route choice impacts of capacity changes. The dominant method that has emerged for this purpose over the last decades is clearly the network equilibrium approach, as proposed by Beckmann et al. (1956) and developed in several directions since. The basis of using this approach is the proposition of what are believed to be 'rational' models of behaviour and other system components (e.g. link performance functions), with site-specific data used to tailor such models to particular case studies. Cross-sectional forecasts of network performance at specific road capacity states may then be made, such that at the time of any 'snapshot' forecast, drivers' route choices are in some kind of individually-optimum state. In this state, drivers cannot improve their route selection by a unilateral change of route, at the snapshot travel time levels.

The accepted practice is to 'validate' such models on a case-by-case basis, by ensuring that the model, when supplied with a particular set of parameters, input network data and input origin-destination demand data, reproduces current measured mean link traffic flows and mean journey times, on a sample of links, to some degree of accuracy (see for example, the practical guidelines in TMIP (1997) and Highways Agency (2002)). This kind of aggregate level, cross-sectional validation to existing conditions persists across a range of network modelling paradigms, ranging from static and dynamic equilibrium (Florian and Nguyen, 1976; Leonard and Tough, 1979; Stephenson and Teply, 1984; Matzoros et al., 1987; Janson et al., 1986; Janson, 1991) to micro-simulation approaches (Laird et al., 1999; Ben-Akiva et al., 2000; Keenan, 2005). While such an approach is plausible, it leaves many questions unanswered, and we would particularly highlight two:

1. The process of calibration and validation of a network equilibrium model may typically occur in a cycle. That is to say, having initially calibrated a model using the base data sources, if the subsequent validation reveals substantial discrepancies in some part of the network, it is then natural to adjust the model parameters (including perhaps even the OD matrix elements) until the model outputs better reflect the validation data [2]. In this process, then, we allow the adjustment of potentially a large number of network parameters and input data in order to replicate the validation data, yet these data themselves are highly aggregate, existing only at the link level. To be clear here, we are talking about a level of coarseness even greater than that in aggregate choice models, since we cannot even infer from link-level data the aggregate shares on alternative routes or OD movements. The question that arises is then: how many different combinations of parameters and input data values might lead to a similar link-level validation, and even if we knew the answer to this question, how might we choose between these alternative combinations? In practice, this issue is typically neglected, meaning that the 'validation' is a rather weak test of the model.

2. Since the data are cross-sectional in time (i.e. the aim is to reproduce current base conditions in equilibrium), then in spite of the large efforts required in data collection, no empirical evidence is routinely collected regarding the model's main purpose, namely its ability to predict changes in behaviour and network performance under changes to the network/demand. This issue is exacerbated by the aggregation concerns in point 1: the 'ambiguity' in choosing appropriate parameter values to satisfy the aggregate, link-level, base validation strengthens the need to independently verify that, with the selected parameter values, the model responds reliably to changes. Although such problems (of fitting equilibrium models to cross-sectional data) have long been recognised by practitioners and academics (see, e.g., Goodwin, 1998), the approach described above remains the state-of-practice.

[2] Some authors have suggested more systematic, bi-level type optimization processes for this fitting process (e.g. Xu et al., 2004), but this has no material effect on the essential points above.

Having identified these two problems, how might we go about addressing them? One approach to the first problem would be to return to the underlying formulation of the network model, and instead require a model definition that permits analysis by statistical inference techniques (see for example, Nakayama et al., 2009). In this way, we may potentially exploit more information in the variability of the link-level data, with well-defined notions (such as maximum likelihood) allowing a systematic basis for selection between alternative parameter value combinations. However, this approach is still using rather limited data, and it is natural not just to question the model but also the data that we use to calibrate and validate it. Yet this is not altogether straightforward to resolve. As Mahmassani and Jou (2000) remarked: 'A major difficulty ... is obtaining observations of actual trip-maker behaviour, at the desired level of richness, simultaneously with measurements of prevailing conditions'. For this reason, several authors have turned to simulated gaming environments and/or stated preference techniques to elicit information on drivers' route choice behaviour (e.g. Mahmassani and Herman, 1990; Iida et al., 1992; Khattak et al., 1993; Vaughn et al., 1995; Wardman et al., 1997; Jou, 2001; Chen et al., 2001). This provides potentially rich information for calibrating complex behavioural models, but has the obvious limitation that it is based on imagined rather than real route choice situations.

Aside from its common focus on hypothetical decision situations, this latter body of work also signifies a subtle change of emphasis in the treatment of the overall network calibration problem. Rather than viewing the network equilibrium calibration process as a whole, the focus is on particular components of the model; in the cases above, the focus is on that component concerned with how drivers make route decisions. If we are prepared to make such a component-wise analysis, then certainly there exists abundant empirical evidence in the literature, with a history across a number of decades of research into issues such as the factors affecting drivers' route choice (e.g. Wachs, 1967; Huchingson et al., 1977; Abu-Eisheh and Mannering, 1987; Duffell and Kalombaris, 1988; Antonisse et al., 1989; Bekhor et al., 2002; Liu et al., 2004), the nature of travel time variability (e.g. Smeed and Jeffcoate, 1971; Montgomery and May, 1987; May et al., 1989; McLeod et al., 1993), and the factors affecting traffic flow variability (Bonsall et al., 1984; Huff and Hanson, 1986; Ribeiro, 1994; Rakha and Van Aerde, 1995; Fox et al., 1998). While these works provide useful evidence for the network equilibrium calibration problem, they do not provide a framework in which we can judge the overall 'fit' of a particular network model in the light of uncertainty, ambient variation and systematic changes in network attributes, be they related to the OD demand, the route choice process, travel times or the network data. Moreover, such data does nothing to address the second point made above, namely the question of how to validate the model forecasts under systematic changes to its inputs. The studies of Mannering et al. (1994) and Emmerink et al. (1996) are distinctive in this context in that they address some of the empirical concerns expressed in the context of travel information impacts, but their work stops at the stage of the empirical analysis, without a link being made to network prediction models. The focus of the present paper therefore is both to present the findings of an empirical study and to link this empirical evidence to network forecasting models.

More recently, Zhu et al. (2010) analysed several sources of data for evidence of the traffic and behavioural impacts of the I-35W bridge collapse in Minneapolis. Most pertinent to the present paper is their location-specific analysis of link flows at 24 locations; by computing the root mean square difference in flows between successive weeks, and comparing the trend for 2006 with that for 2007 (the latter with the bridge collapse), they observed an apparent transient impact of the bridge collapse. They also showed there was no statistically-significant evidence of a difference in the pattern of flows in the period September-November 2007 (a period starting 6 weeks after the bridge collapse), when compared with the corresponding period in 2006. They suggested that this was indicative of the length of a 're-equilibration process' in a conceptual sense, though did not explicitly compare their empirical findings with those of a network equilibrium model.
compare their empiricalfindings with those of a network equilibrium model.The structure of the remainder of the paper is as follows.In Section2we describe the process of selecting the real-life problem to analyse,together with the details and rationale behind the survey design.Following this,Section3describes the statistical techniques used to extract information on travel times and routing patterns from the survey data.Statistical inference is then considered in Section4,with the aim of detecting statistically significant explanatory factors.In Section5 comparisons are made between the observed network data and those predicted by a network equilibrium model.Finally,in Section6the conclusions of the study are highlighted,and recommendations made for both practice and future research.2.Experimental designThe ultimate objective of the study was to compare actual data with the output of a traffic network equilibrium model, specifically in terms of how well the equilibrium model was able to correctly forecast the impact of a systematic change ap-plied to the network.While a wealth of surveillance data on linkflows and travel times is routinely collected by many local and national agencies,we did not believe that such data would be sufficiently informative for our purposes.The reason is that while such data can often be disaggregated down to small time step resolutions,the data remains aggregate in terms of what it informs about driver response,since it does not provide the opportunity to explicitly trace vehicles(even in aggre-gate form)across more than one location.This has the effect that observed differences in linkflows might be attributed to many potential causes:it is especially difficult to separate out,say,ambient daily variation in the trip demand matrix from systematic changes in route choice,since both may give rise to similar impacts on observed linkflow patterns across re-corded sites.While methods do exist for reconstructing OD and network route patterns from observed link data(e.g.Yang et al.,1994),these are typically based on the premise of a valid network equilibrium model:in this case then,the data would not be able to give independent information on the validity of the network equilibrium approach.For these reasons it was decided to design and implement a purpose-built survey.However,it would not be efficient to extensively monitor a network in order to wait for something to happen,and therefore we required advance notification of some planned intervention.For this reason we chose to study the impact of urban maintenance work affecting the roads,which UK local government authorities organise on an annual basis as part of their‘Local Transport Plan’.The city council of York,a historic city in the north of England,agreed to inform us of their plans and to assist in the subsequent data collection exercise.Based on the interventions planned by York CC,the list of candidate studies was narrowed by considering factors such as its propensity to induce significant re-routing and its impact on the peak periods.Effectively the motivation here was to identify interventions that were likely to have a large impact on delays,since route choice impacts would then likely be more significant and more easily distinguished from ambient variability.This was notably at odds with the objectives of York CC,170 D.Watling et al./Transportation Research Part A46(2012)167–189in that they wished to minimise disruption,and so where possible York CC planned interventions to take place at times of day and 
of the year where impacts were minimised;therefore our own requirement greatly reduced the candidate set of studies to monitor.A further consideration in study selection was its timing in the year for scheduling before/after surveys so to avoid confounding effects of known significant‘seasonal’demand changes,e.g.the impact of the change between school semesters and holidays.A further consideration was York’s role as a major tourist attraction,which is also known to have a seasonal trend.However,the impact on car traffic is relatively small due to the strong promotion of public trans-port and restrictions on car travel and parking in the historic centre.We felt that we further mitigated such impacts by sub-sequently choosing to survey in the morning peak,at a time before most tourist attractions are open.Aside from the question of which intervention to survey was the issue of what data to collect.Within the resources of the project,we considered several options.We rejected stated preference survey methods as,although they provide a link to personal/socio-economic drivers,we wanted to compare actual behaviour with a network model;if the stated preference data conflicted with the network model,it would not be clear which we should question most.For revealed preference data, options considered included(i)self-completion diaries(Mahmassani and Jou,2000),(ii)automatic tracking through GPS(Jan et al.,2000;Quiroga et al.,2000;Taylor et al.,2000),and(iii)licence plate surveys(Schaefer,1988).Regarding self-comple-tion surveys,from our own interview experiments with self-completion questionnaires it was evident that travellersfind it relatively difficult to recall and describe complex choice options such as a route through an urban network,giving the po-tential for significant errors to be introduced.The automatic tracking option was believed to be the most attractive in this respect,in its potential to accurately map a given individual’s journey,but the negative side would be the potential sample size,as we would need to purchase/hire and distribute the devices;even with a large budget,it is not straightforward to identify in advance the target users,nor to guarantee their cooperation.Licence plate surveys,it was believed,offered the potential for compromise between sample size and data resolution: while we could not track routes to the same resolution as GPS,by judicious location of surveyors we had the opportunity to track vehicles across more than one location,thus providing route-like information.With time-stamped licence plates, the matched data would also provide journey time information.The negative side of this approach is the well-known poten-tial for significant recording errors if large sample rates are required.Our aim was to avoid this by recording only partial licence plates,and employing statistical methods to remove the impact of‘spurious matches’,i.e.where two different vehi-cles with the same partial licence plate occur at different locations.Moreover,extensive simulation experiments(Watling,1994)had previously shown that these latter statistical methods were effective in recovering the underlying movements and travel times,even if only a relatively small part of the licence plate were recorded,in spite of giving a large potential for spurious matching.We believed that such an approach reduced the opportunity for recorder error to such a level to suggest that a100%sample rate of vehicles passing may be feasible.This was tested in a pilot study conducted by the project team,with 
dictaphones used to record a100%sample of time-stamped, partial licence plates.Independent,duplicate observers were employed at the same location to compare error rates;the same study was also conducted with full licence plates.The study indicated that100%surveys with dictaphones would be feasible in moderate trafficflow,but only if partial licence plate data were used in order to control observation errors; for higherflow rates or to obtain full number plate data,video surveys should be considered.Other important practical les-sons learned from the pilot included the need for clarity in terms of vehicle types to survey(e.g.whether to include motor-cycles and taxis),and of the phonetic alphabet used by surveyors to avoid transcription ambiguities.Based on the twin considerations above of planned interventions and survey approach,several candidate studies were identified.For a candidate study,detailed design issues involved identifying:likely affected movements and alternative routes(using local knowledge of York CC,together with an existing network model of the city),in order to determine the number and location of survey sites;feasible viewpoints,based on site visits;the timing of surveys,e.g.visibility issues in the dark,winter evening peak period;the peak duration from automatic trafficflow data;and specific survey days,in view of public/school holidays.Our budget led us to survey the majority of licence plate sites manually(partial plates by audio-tape or,in lowflows,pen and paper),with video surveys limited to a small number of high-flow sites.From this combination of techniques,100%sampling rate was feasible at each site.Surveys took place in the morning peak due both to visibility considerations and to minimise conflicts with tourist/special event traffic.From automatic traffic count data it was decided to survey the period7:45–9:15as the main morning peak period.This design process led to the identification of two studies:2.1.Lendal Bridge study(Fig.1)Lendal Bridge,a critical part of York’s inner ring road,was scheduled to be closed for maintenance from September2000 for a duration of several weeks.To avoid school holidays,the‘before’surveys were scheduled for June and early September.It was decided to focus on investigating a significant southwest-to-northeast movement of traffic,the river providing a natural barrier which suggested surveying the six river crossing points(C,J,H,K,L,M in Fig.1).In total,13locations were identified for survey,in an attempt to capture traffic on both sides of the river as well as a crossing.2.2.Fishergate study(Fig.2)The partial closure(capacity reduction)of the street known as Fishergate,again part of York’s inner ring road,was scheduled for July2001to allow repairs to a collapsed sewer.Survey locations were chosen in order to intercept clockwiseFig.1.Intervention and survey locations for Lendal Bridge study.around the inner ring road,this being the direction of the partial closure.A particular aim wasFulford Road(site E in Fig.2),the main radial affected,with F and K monitoring local diversion I,J to capture wider-area diversion.studies,the plan was to survey the selected locations in the morning peak over a period of approximately covering the three periods before,during and after the intervention,with the days selected so holidays or special events.Fig.2.Intervention and survey locations for Fishergate study.In the Lendal Bridge study,while the‘before’surveys proceeded as planned,the bridge’s actualfirst day of closure on Sep-tember11th2000also 
In the Lendal Bridge study, while the 'before' surveys proceeded as planned, the bridge's actual first day of closure on September 11th 2000 also marked the beginning of the UK fuel protests (BBC, 2000a; Lyons and Chatterjee, 2002). Traffic flows were considerably affected by the scarcity of fuel, with congestion extremely low in the first week of closure, to the extent that any changes could not be attributed to the bridge closure; neither had our design anticipated how to survey the impacts of the fuel shortages. We thus re-arranged our surveys to monitor more closely the planned re-opening of the bridge. Unfortunately these surveys were hampered by a second unanticipated event, namely the wettest autumn in the UK for 270 years and the highest level of flooding in York since records began (BBC, 2000b). The flooding closed much of the centre of York to road traffic, including our study area, as the roads were impassable, and therefore we abandoned the planned 'after' surveys. As a result of these events, the useable data we had (not affected by the fuel protests or flooding) consisted of five 'before' days and one 'during' day.

In the Fishergate study, fortunately no extreme events occurred, allowing six 'before' and seven 'during' days to be surveyed, together with one additional day in the 'during' period when the works were temporarily removed. However, the works over-ran into the long summer school holidays, when it is well known that there is a substantial seasonal effect of much lower flows and congestion levels. We did not believe it possible to meaningfully isolate the impact of the link fully re-opening while controlling for such an effect, and so our plans for 'after re-opening' surveys were abandoned.

3. Estimation of vehicle movements and travel times

The data resulting from the surveys described in Section 2 are in the form of (for each day and each study) a set of time-stamped, partial licence plates, observed at a number of locations across the network. Since the data include only partial plates, they cannot simply be matched across observation points to yield reliable estimates of vehicle movements, since there is ambiguity over whether the same partial plate observed at different locations was truly caused by the same vehicle.
Indeed, since the observed system is 'open' (in the sense that not all points of entry, exit, generation and attraction are monitored), the question is not just which of several potential matches to accept, but also whether there is any match at all. That is to say, an apparent match between data at two observation points could be caused by two separate vehicles that passed no other observation point. The first stage of analysis therefore applied a series of specially designed statistical techniques to reconstruct the vehicle movements and point-to-point travel time distributions from the observed data, allowing for all such ambiguities in the data. Although the detailed derivations of each method are not given here, since they may be found in the references provided, it is necessary to understand some of the characteristics of each method in order to interpret the results subsequently provided. Furthermore, since some of the basic techniques required modification relative to the published descriptions, it is also necessary to understand some of the theoretical basis in order to explain these adaptations.

3.1. Graphical method for estimating point-to-point travel time distributions

The preliminary technique applied to each data set was the graphical method described in Watling and Maher (1988). This method is derived for analysing partial registration plate data for unidirectional movement between a pair of observation stations (referred to as an 'origin' and a 'destination'). Thus, in the data studied here, it must be independently applied to given pairs of observation stations, without regard for the interdependencies between observation station pairs. On the other hand, it makes no assumption that the system is 'closed'; there may be vehicles that pass the origin but do not pass the destination, and vice versa. While limited to considering only two-point surveys, the attraction of the graphical technique is that it is a non-parametric method, with no assumptions made about the arrival time distributions at the observation points (they may, in particular, be non-uniform), and no assumptions made about the journey time probability density. It is therefore very suitable as a first means of investigative analysis for such data.

The method begins by forming all pairs of possible matches in the data, of which some will be genuine matches (the pair of observations were due to a single vehicle) and the remainder spurious matches. Thus, for example, if there are three origin observations and two destination observations of a particular partial registration number, then six possible matches may be formed, of which clearly no more than two can be genuine (and possibly only one or zero are genuine). A scatter plot may then be drawn, for each possible match, of the observation time at the origin versus that at the destination. The characteristic pattern of such a plot is as shown in Fig. 4a, with a dense 'line' of points (which will primarily be the genuine matches) superimposed upon a scatter of points over the whole region (which will primarily be the spurious matches). If we were to assume uniform arrival rates at the observation stations, then the spurious matches would be uniformly distributed over this plot; however, we shall avoid making such a restrictive assumption. The method proceeds by making a coarse estimate of the total number of genuine matches across the whole of this plot. As part of this analysis we then assume knowledge of, for any randomly selected vehicle, the probabilities

\[ \theta_k = \Pr(\text{vehicle is of the } k\text{th type of partial registration plate}) \quad (k = 1, 2, \ldots, m), \qquad \text{where } \sum_{k=1}^{m} \theta_k = 1. \]
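As a rough feel for the scale of the spurious-match problem (a back-of-envelope illustration with our own numbers, not figures from the study): if the recorded fragment were, say, the last three digits of the plate, there would be \(m = 1000\) types; if these were equally likely, \(\theta_k \approx 1/1000\). With \(n_O\) origin and \(n_D\) destination observations, the expected number of same-type pairs is approximately

\[ n_O\, n_D \sum_{k=1}^{m} \theta_k^2 \approx \frac{500 \times 500}{1000} = 250 \quad \text{for } n_O = n_D = 500, \]

most of them spurious. This is why the analysis cannot simply accept every possible match, and must instead estimate the genuine-match rate statistically.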
Statistically Hiding Sets

Manoj Prabhakaran¹,* and Rui Xue²,**

¹ Dept. of Computer Science, University of Illinois, Urbana-Champaign. mmp@
² State Key Laboratory of Information Security, Institute of Software, Chinese Academy of Sciences. rxue@

* Supported in part by NSF grants CNS07-16626 and CNS07-47027.
** Work done mostly while visiting UIUC. Supported by the China Scholarship Council, the 973 Program (No. 2007CB311202), the 863 Program (No. 2008AA01Z347), and NSFC grants (No. 60873260, 60773029).

Abstract. Zero-knowledge set is a primitive introduced by Micali, Rabin, and Kilian (FOCS 2003) which enables a prover to commit a set to a verifier, without revealing even the size of the set. Later the prover can give zero-knowledge proofs to convince the verifier of membership/non-membership of elements in/not in the committed set. We present a new primitive called Statistically Hiding Sets (SHS), similar to zero-knowledge sets, but providing an information-theoretic hiding guarantee, rather than one based on efficient simulation. Then we present a new scheme for statistically hiding sets, which does not fit into the "Merkle-tree/mercurial-commitment" paradigm that has been used for all zero-knowledge set constructions so far. This not only provides efficiency gains compared to the best schemes in that paradigm, but also lets us provide statistical hiding; previous approaches required the prover to maintain growing amounts of state with each new proof for such statistical security.

Our construction is based on an algebraic tool called trapdoor DDH groups (TDG), introduced recently by Dent and Galbraith (ANTS 2006). However, the specific hardness assumptions we associate with TDG are different, and of a strong nature: strong RSA and a knowledge-of-exponent assumption. Our new knowledge-of-exponent assumption may be of independent interest. We prove this assumption in the generic group model.

1 Introduction

Zero-knowledge set is a fascinating cryptographic primitive introduced by Micali, Rabin, and Kilian [22], which has generated much interest since then [24,21,10,8,17,9]. It enables a party (the prover) to commit a set (without revealing even its size) to another party (the verifier). Later the verifier can make membership queries with respect to the committed set; the prover can answer these queries and give proofs to convince the verifier of the correctness of the answers, without revealing anything further about the set.

In this paper, we revisit the notion of zero-knowledge sets. We provide an alternate notion, which we call statistically hiding sets (SHS), that replaces the zero-knowledge property by the slightly weaker requirement of "statistical hiding." Statistical hiding, unlike zero-knowledge, does not require efficient simulatability; this relaxation is comparable to how witness-independence is a weakening of the zero-knowledge property for zero-knowledge proofs. But the intuitive security guarantees provided by SHS are the same as those provided by zero-knowledge sets. (In particular, the informal description in the previous paragraph is applicable to both.)

Then we present a novel scheme for this new primitive, significantly departing from previous approaches for building zero-knowledge sets. While all previous approaches for zero-knowledge sets used a tree-based construction (along with a primitive called mercurial commitments), ours is a direct algebraic construction.
To the best of our knowledge, this is the first construction for this kind of primitive that does not fit into the Merkle-tree/mercurial-commitment paradigm. This construction (a) provides statistical zero-knowledge, without the prover having to maintain growing amounts of state with each new proof¹, and (b) provides efficiency gains compared to previous constructions of zero-knowledge sets. Further, since the techniques used are different, our construction opens up the possibility of building zero-knowledge sets (or SHS) with certain features that are not amenable to the Merkle-tree/mercurial-commitment based approach.

Our construction is based on trapdoor DDH groups (TDG), a primitive introduced recently by Dent and Galbraith [13]. Ours is perhaps the first non-trivial application of this cryptographic primitive, illustrating its potential and versatility. The specific hardness assumptions we associate with TDG are different from those in [13], and of a strong nature (strong RSA and a knowledge-of-exponent assumption). While we believe these assumptions are reasonable given the state of the art in algorithmic number theory, we do hope that our approach leads to newer efficient constructions of statistically hiding sets and similar tools based on more standard assumptions. We also hope that our work will help in better understanding certain powerful assumptions. In particular, the new simple and powerful knowledge-of-exponent assumption that we introduce (and prove to hold in a generic group model) may be of independent interest. See Section 1.4 below for more details.

¹ The Merkle-tree based approach requires the prover to use a pseudorandom function to eliminate the need for maintaining state that grows with each new proof. This makes the resulting zero-knowledge computational rather than perfect or statistical.

1.1 Our Contributions

We briefly point out the highlights of this work, and discuss the tradeoffs we achieve.

– Prior constructions for ZK sets either required the prover to accumulate more state information with each query, or guaranteed only computational hiding (which was based on the security of pseudorandom functions). In contrast, our construction for SHS provides unconditional statistical hiding (without growing state information). In particular, this makes our scheme unconditionally forward secure: even an unbounded adversary cannot break the security after the commitments and proofs are over. However, our soundness guarantee depends on new complexity assumptions (see Section 1.4). But as we explain below, complexity assumptions are more justified when used for soundness than when used for hiding.

– Compared to previous ZK set constructions, we obtain efficiency gains in communication complexity and in the amount of private storage that the prover has to maintain after the commitment phase. The computational complexity of verification of the proofs is also better, depending on the specifics of the mercurial commitments and the group operations involved. However, the computational complexity of generating the proofs is higher in our case. In [25] we provide a detailed comparison.

– Since all previous constructions of ZK sets use a Merkle-tree/mercurial-commitment based approach, we consider it an important contribution to provide an alternate methodology. We hope that this can lead to constructions with features that could not be achieved previously. In particular, our construction suggests the possibility of achieving a notion of updateability with better privacy than obtained in [21].
We do not investigate this here.

– The definition of SHS is also an important contribution of this work. It differs from the definition of ZK sets in certain technical aspects (see Section 1.3) which might be more suitable in some situations. But more importantly, it provides a technically relaxed definition of security, retaining the conceptual security guarantees of ZK sets.² This technical relaxation has already helped us achieve a fixed-state construction with statistical security. Going further, we believe SHS could lead to better composability than ZK sets, because the hiding guarantee is formulated as a statistical guarantee and not in terms of efficient simulation. Again, we leave this for future investigation.

– Finally, in this work we introduce a new "knowledge-of-exponent" assumption (called KEA-DDH), closely related to the standard DDH assumption. We prove that the assumption holds in the generic group model. Due to its naturality, the new assumption could provide a useful abstraction for constructing and analysing new cryptographic schemes.

² The relaxation is in that we do not require efficient simulation. Note that compared to computational ZK sets, SHS' security is stronger in some aspects. We remark that though one could define computationally hiding sets, such a primitive does not have the above-mentioned advantages that SHS has over ZK sets.

On the use of computational assumptions for soundness. The main disadvantage of our construction is the use of non-standard computational assumptions. However, we point out an arguably desirable trade-off it achieves, compared to existing constructions of ZK sets. When maintaining growing state information is not an option, existing ZK set constructions offer only computational security for the prover, based on the security of the pseudorandom function being used. If the pseudorandom function gets broken eventually (due to advances in cryptanalysis, or better computational technology), then the prover's security is lost.

In contrast, our SHS construction provides unconditional and everlasting security for the prover. Further, the security guarantee for the verifier depends only on assumptions on the prover's computational ability during the protocol execution. So, in future, if the assumptions we make turn out to be false, and even if an explicit algorithm is found to violate them, this causes no concern to a verifier who accepted a proof earlier.

In short, our use of stronger complexity assumptions is offset by the fact that they are required to hold only against adversaries operating during the protocol execution. In return, we obtain unconditional security after the protocol execution.

Statistically Hiding Databases. Merkle-tree based constructions of zero-knowledge sets naturally extend to zero-knowledge databases with little or no overhead. Our construction does not extend in this manner. However, in the full version [25] we point out a simple way to use zero-knowledge sets (or statistically hiding sets) in a black-box manner to implement zero-knowledge databases (or statistically hiding databases, respectively).

1.2 Related Work

Micali, Rabin and Kilian introduced the concept of zero-knowledge sets, and provided a construction based on a Merkle-tree based approach [22]. All subsequent constructions have followed this essential idea, until now. Chase et al. abstracted the properties of the commitments used in [22] and successfully formalized a general notion of commitments named mercurial commitments [10].
Catalano et al. [8] further clarified the notion of mercurial commitments used in these constructions. More recently, Catalano et al. [9] introduced a variant of mercurial commitments that allowed using q-ary Merkle-trees in the above construction to obtain a constant-factor reduction in the proof sizes (under stronger assumptions on groups with bilinear pairing). Liskov [21] augmented the original construction of Micali et al. to be updatable. Gennaro and Micali [17] introduced a non-malleability-like requirement called independence and constructed mercurial commitments with appropriate properties which, when used in the Merkle-tree based construction, resulted in a zero-knowledge set scheme with the independence property. Ostrovsky et al. [24] extended the Merkle-tree based approach to handle more general data structures (directed acyclic graphs); however, in their notion of privacy the size of the data structure is allowed to be publicly known, and as such they do not require mercurial commitments.

Our construction is motivated by prior constructions of accumulators. The notion of accumulators was first presented in [5] to allow rolling a set of values into one value, such that there is a short proof for each value that went into it. Barić and Pfitzmann [3] proposed a construction of a collision-resistant accumulator under the strong RSA assumption. Camenisch and Lysyanskaya [6] further developed a dynamic accumulator, so that accumulated values can be added to or removed from the accumulator. The most important difference between an accumulator and a zero-knowledge set is that the former does not require the prover to provide a proof of non-membership for elements not accumulated. Our scheme bears resemblance to the "universal accumulators" proposed by Li et al. [20], which do allow proofs of non-membership, but do not have the zero-knowledge or hiding property. Subsequent to our work, Xue et al. [27] have proposed a more efficient scheme, secure in the random oracle model.

Trapdoor DDH groups were introduced by [13], and to the best of our knowledge have not been employed in any cryptographic application (except a simple illustrative example in [13]). They also gave a candidate for this primitive based on elliptic curve groups of composite order, using "hidden pairings." Indeed, Galbraith and McKee [15] had pointed out that if the pairing operation is not hidden, typical hardness assumptions like the RSA assumption may not be justified in those groups. [13] also defined another primitive called trapdoor discrete logarithm groups; however, the candidate they proposed for this (with some reservations) was subsequently shown to be insecure [23].

Our hardness assumptions on TDG are different from those in [13]. The first assumption we use, namely the strong RSA assumption, was introduced by Barić and Pfitzmann in their work on accumulators [3] mentioned above, as well as by Fujisaki and Okamoto [14]. Subsequently it has been studied extensively and used in a variety of cryptographic applications (e.g. [16,11,7,1,6,2,26]). The second assumption we make is a knowledge-of-exponent assumption (KEA). The first KEA, now called KEA1, was introduced by [12] to construct an efficient public-key cryptosystem secure against chosen-ciphertext attacks. Hada and Tanaka [19] employed it together with another assumption (called KEA2) to propose 3-round, negligible-error zero-knowledge arguments for NP. Bellare and Palacio in [4] falsified the KEA2 assumption and used another extension of KEA1, called KEA3, to restore the results in [19].

1.3 Differences with the Original Definition
Our definition differs from that of Micali, Rabin and Kilian [22] in several technical aspects. The original definition was in the setting of the trusted setup of a common reference string; it also required proofs to be publicly verifiable; the zero-knowledge property was required for PPT verifiers and was defined in terms of a PPT simulator (the indistinguishability could be computational, statistical or perfect).

In contrast, we define an SHS scheme as a two-party protocol, with no requirement of public verifiability of the proofs; but we do not allow a trusted setup. We require the hiding property to hold even when the verifier is computationally unbounded; but we do not require an efficient simulation.

As mentioned before, we consider these differences to be of a technical nature: the basic utility provided by SHS is the same as that by ZK sets. However, we do expect qualitative differences to show up when considering composition issues (cf. parallel composition of zero-knowledge proofs and witness indistinguishable proofs). We leave this for future investigation.

Our definition of the statistical hiding property is formulated using a computationally unbounded simulation. It is instructive to cast our security definition (when the verifier is corrupt) in the "real-world/ideal-world" paradigm of definitions that are conventional for multi-party computation. In the ideal world the corrupt verifier (simulator) can be computationally unbounded, but gets access only to a blackbox to answer the membership queries. We require a statistically indistinguishable simulation, effectively requiring security even in a computationally unbounded "environment." (However, our definition is not in the Universal Composition framework, as we do not allow the environment to interact with the adversary during the protocol.) In [25] we include a further discussion of the new definition, comparing and contrasting it with the definition of zero-knowledge proofs.

1.4 Assumptions Used

The hardness assumptions used in this work are of a strong nature. We use a combination of a strong RSA assumption and a knowledge-of-exponent assumption. Further, these assumptions are applied to a relatively new family of groups, namely, trapdoor DDH groups. Therefore we advise caution in using our protocol before gaining further confidence in these assumptions. Nevertheless, we point out that the assumptions are used only for soundness. The statistical hiding property is unconditional. This means that even if an adversary manages to violate our assumptions after one finishes using the scheme, it cannot violate the security guarantee at that point.

The knowledge-of-exponent assumption we use, called KEA-DH, is a new proposal, but similar to KEA1 introduced in 1991 by Damgård [12]. We believe this powerful assumption could prove very useful in constructing efficient cryptographic schemes, yet is reasonable enough to be safely assumed in different families of groups. In this work we combine KEA-DH with another (more standard) assumption called the strong RSA assumption. In Section 3 we describe these assumptions, and in the full version [25] we provide some further preliminary observations about them (including a proof that KEA-DH holds in a generic group model).

Our construction depends on the idea of trapdoor DDH groups by Dent and Galbraith [13]. (But our use of this primitive does not require exactly the same features as a trapdoor DDH scheme offers; in particular, we do not use the DDH assumption.)

2 Statistically Hiding Sets
In this section we present our definition of statistically hiding sets (SHS). It is slightly different from the original definition of zero-knowledge sets of Micali et al. [22], but offers the same intuitive guarantees.

Notation. We write Pr[experiment : condition] to denote the probability that a condition holds after an experiment. An experiment is a probabilistic computation (typically described as a sequence of simpler computations); the condition will be in terms of variables set during the experiment. We write {experiment : random variable} to denote the distribution in which a random variable will be distributed after an experiment. The security parameter will be denoted by k. By a PPT algorithm we mean a non-uniform probabilistic algorithm which terminates within poly(k) time for some polynomial poly. We say two distributions are almost the same if the statistical difference between them is negligible in k.

Statistically Hiding Set Scheme: Syntax. A (non-interactive) SHS scheme consists of four PPT algorithms: setup, commit, prove and verify.

– setup: This is run by the verifier to produce a public-key/private-key pair: (PK, VK) ← setup(1^k). Here k is the security parameter; the public key PK is to be used by a prover who makes a commitment to the verifier, and the private (verification) key VK is used by the verifier to verify proofs of membership and non-membership.

– commit: Algorithm used by the prover to generate a commitment σ of a finite set S, along with private information ρ used for generating proofs: (σ, ρ) ← commit(S, PK).

– prove: Algorithm used by the prover to compute non-interactive proofs of membership or non-membership of queried elements: π ← prove(x, ρ, PK). Here π is a proof of membership or non-membership of x in a set S that was committed using commit; ρ is the private information computed by commit.

– verify: Algorithm used by the verifier to verify a given proof of membership or non-membership: b ← verify(π, σ, x, VK). The output b is either a bit 0 or 1 (corresponding to accepting a proof of non-membership, or a proof of membership, respectively), or ⊥ (corresponding to rejecting the proof).

Statistically Hiding Set Scheme: Security. Let U_k stand for the universal set of elements allowed by the scheme for security parameter k (i.e., the sets committed using the scheme are subsets of U_k). U_k can be finite (e.g., {0,1}^k) or infinite (e.g., {0,1}*). We say that the PPT algorithms (setup, commit, prove, verify) form a secure SHS scheme if they satisfy the properties stated below.

• Perfect completeness: For any finite set S ⊆ U_k and any x ∈ U_k,

\[ \Pr\big[\, (PK, VK) \leftarrow setup(1^k);\ (\sigma, \rho) \leftarrow commit(PK, S);\ \pi \leftarrow prove(PK, \rho, x);\ b \leftarrow verify(VK, \pi, \sigma, x)\ :\ x \notin S \Rightarrow b = 0 \ \text{and}\ x \in S \Rightarrow b = 1 \,\big] = 1. \]

That is, if the prover and verifier honestly follow the protocol, then the verifier is always convinced by the proofs given by the prover (it never outputs ⊥).

• Computational soundness: For any PPT adversary A, there is a negligible function ν such that

\[ \Pr\big[\, (PK, VK) \leftarrow setup(1^k);\ (\sigma, x, \pi_0, \pi_1) \leftarrow A(PK);\ b_0 \leftarrow verify(VK, \pi_0, \sigma, x);\ b_1 \leftarrow verify(VK, \pi_1, \sigma, x)\ :\ b_0 = 0 \ \text{and}\ b_1 = 1 \,\big] \le \nu(k). \]

That is, except with negligible probability, an adversary cannot produce a commitment σ and two proofs π_0, π_1 which will be accepted by the verifier respectively as a proof of non-membership and of membership of an element x in the committed set.
• Statistical hiding: There exists a distribution simcommit(PK), and two distributions simprove_0(PK, ρ, x) and simprove_1(PK, ρ, x), such that for every adversary A (not necessarily PPT), every finite S ⊆ U_k and any polynomial t = t(k), the following two distributions have a negligible statistical distance between them:

\[ \Big\{\, (PK, s_0) \leftarrow A(1^k; aux(S));\ (\sigma, \rho) \leftarrow commit(PK, S);\ \pi_0 := \sigma;\ \text{for } i = 1, \ldots, t:\ (x_i, s_i) \leftarrow A(s_{i-1}, \pi_{i-1});\ \pi_i \leftarrow prove(PK, \rho, x_i);\ \text{endfor}\ :\ (s_t, \pi_t) \,\Big\} \]

\[ \Big\{\, (PK, s_0) \leftarrow A(1^k; aux(S));\ (\sigma, \rho) \leftarrow simcommit(PK);\ \pi_0 := \sigma;\ \text{for } i = 1, \ldots, t:\ (x_i, s_i) \leftarrow A(s_{i-1}, \pi_{i-1});\ \pi_i \leftarrow simprove_{\chi_S(x_i)}(PK, \rho, x_i);\ \text{endfor}\ :\ (s_t, \pi_t) \,\Big\} \]

Here aux(S) denotes arbitrary auxiliary information regarding the set S being committed to.³ χ_S(x) is a bit indicating whether x ∈ S or not, which is the only bit of information that each invocation of simprove uses. In the outcome of the experiment, the state s_t may include all information that A received so far, including σ and all the previous proofs π_i. Note that the second of these two distributions does not depend on S, but only on whether x_i ∈ S for i = 1, ..., t. We do not require the distributions simcommit(·) and simprove(·) to be efficiently sampleable.

An alternate (but equivalent) definition of the hiding property is in terms of a (computationally unbounded) simulator which can statistically simulate an adversary's view, with access only to an oracle which tells it whether x ∈ S for each x for which it has to produce a proof.

³ We include aux only for clarity, because by virtue of the order of quantifiers (∀ A and S) we are allowing A auxiliary information about the set S.

3 Trapdoor DDH Group

Following Dent and Galbraith [13], we define a trapdoor DDH group (TDG, for short) as follows. The hardness assumptions we make on such a group are different. A TDG is defined by two PPT algorithms: a group generation algorithm Gen and a trapdoor DDH algorithm TDDH.

– Gen takes as input the security parameter k (in unary), and outputs a description (i.e., an algorithm for the group operation) for a group G, an element g ∈ G, an integer n = 2^{O(k)} as an upper bound on the order of g in G, and a trapdoor τ. The representation of this output should be such that, given a tuple (G, g, n) (purportedly) produced by Gen(1^k), it should be possible to efficiently verify that G is indeed a group and g ∈ G has order at most n.⁴

– TDDH takes as input (G, g, n, τ) produced by Gen, as well as elements A, B, C, and outputs 1 if and only if A, B, C ∈ G and there exist integers a, b, c such that A = g^a, B = g^b, C = g^{ab}.

⁴ The upper bound on the order of g can often be the order of G itself, which could be part of the description of G. We include this upper bound explicitly in the output of Gen as it plays an important role in the construction and the proof of security. However, we stress that the exact order of g will be (assumed to be) hard to compute from (G, g, n).

We make the following two hardness assumptions on a TDG.

Assumption 1 (Strong RSA Assumption). For every PPT algorithm A, there is a negligible function ν such that

\[ \Pr\big[\, (G, g, n, \tau) \leftarrow Gen(1^k);\ (x, e) \leftarrow A(G, g, n)\ :\ x \in G,\ e > 1 \ \text{and}\ x^e = g \,\big] < \nu(k). \]

The probability is over the coin tosses of Gen and A.

Assumption 2 (Diffie-Hellman Knowledge of Exponent Assumption, KEA-DH). For every PPT adversary A, there is a PPT extractor E and a negligible function ν such that

\[ \Pr\big[\, (G, g, n, \tau) \leftarrow Gen(1^k);\ (A, B, C) \leftarrow A(G, g, n; r);\ z \leftarrow E(G, g, n; r)\ :\ \exists a, b : A = g^a,\ B = g^b,\ C = g^{ab},\ C \ne A^z,\ C \ne B^z \,\big] < \nu(k). \]

Here r stands for the coin tosses of A; E may toss additional coins. The probability is over the coin tosses of Gen, A (namely r), and any extra coins used by E. Further, this holds even if A is given oracle access to a DDH oracle for the group G. The extractor E does not access this oracle.

Informally, KEA-DH says that if an adversary takes g generated by Gen and outputs a DDH tuple (g^a, g^b, g^{ab}), then it must know either a or b. However, since A may not know the order of g in G, the integers a, b are not unique.
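As a concrete (and purely illustrative) picture of the objects in KEA-DH, the toy Python snippet below builds a DDH tuple in the subgroup of Z_p^* generated by g and checks the condition the extractor must satisfy. The modulus and exponents are arbitrary demo values, and Z_p^* with a public prime p is of course not a trapdoor DDH group; the point is only the shape of the tuple (A, B, C) and of the extractor's output z.

```python
# Toy demo values; Z_p^* with a public prime p is NOT a trapdoor
# DDH group. This only illustrates the shape of the KEA-DH objects.
p = 2**127 - 1        # a Mersenne prime, used as the modulus
g = 3

a, b = 123456789, 987654321      # the adversary's implicit "knowledge"
A = pow(g, a, p)
B = pow(g, b, p)
C = pow(g, a * b, p)             # (A, B, C) is a DDH tuple w.r.t. g

# KEA-DH: whenever an efficient adversary outputs such a tuple, an
# extractor can output some z with C = A^z or C = B^z.
z = b                            # here the extractor "knows" b
assert C == pow(A, z, p) or C == pow(B, z, p)
print("extractor condition holds with z = b")
```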
For our use later, we remark that the extractor E can be modified to output its inputs as well as the output of A, along with z. Then KEA-DH asserts that if there exists an adversary that takes (G, g, n) as input and, with non-negligible probability, produces (α, A, B, C) which satisfies some prescribed property and in which (A, B, C) is a DDH tuple, then there is a machine E which takes (G, g, n) as input, and with non-negligible probability outputs (G, g, α, A, B, C, z) such that (α, A, B, C) satisfies the prescribed property and either C = A^z or C = B^z.

In [25] we discuss the trapdoor DDH group proposed by Dent and Galbraith [13], and also make preliminary observations about our assumptions above. In particular, we prove that KEA-DH holds in a generic group with a bilinear pairing operation. Note that in Assumption 2 we allow A access to a DDH oracle in G, but require an extractor which does not have this access. This captures the intuition that A cannot effectively make use of such an oracle: either a query to such an oracle can be seen to be a DDH tuple by considering how the tuple was derived, or, if not, it is highly unlikely that it will be a DDH tuple. Indeed, this is the case in the generic group. Finally, we point out that in the KEA-DH assumption the adversary does not obtain any auxiliary information (from Gen, for instance). For proving the security of our construction it will be sufficient to restrict to such adversaries (even though we allow auxiliary information in the security definition).

4 Our Construction

Our SHS construction somewhat resembles the constructions of accumulators in [3,16,18,6,20]. In the following we require the elements in the universe (a finite subset of which will be committed to) to be represented by sufficiently large prime numbers. In [25] we show that such a representation is easily achieved (under standard number-theoretic conjectures on the distribution of prime numbers).
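The construction below thus assumes each element is already encoded as a large prime. One natural way to obtain such an encoding (our own sketch of a standard hash-and-increment technique; [25] may use a different method, and a real deployment would also need the resulting primes to exceed the bound N fixed in the setup) is:

```python
import hashlib
import random

def is_probable_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for q in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % q == 0:
            return n == q
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def element_to_prime(element: bytes, bits: int = 256) -> int:
    """Map an element to a (probable) prime of the given size: hash
    it, force the top bit and oddness, then scan forward to the next
    prime. The same element always maps to the same candidate chain,
    so the encoding is repeatable, as the construction requires."""
    h = int.from_bytes(hashlib.sha256(element).digest(), "big")
    candidate = (h % (1 << bits)) | (1 << (bits - 1)) | 1
    while not is_probable_prime(candidate):
        candidate += 2
    return candidate

print(element_to_prime(b"some set element"))
```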
Construction 1 (Statistically Hiding Sets). The construction consists of the following four algorithms:

1. setup: The verifier takes the security parameter k as the input and runs a trapdoor DDH group generation algorithm to obtain (G, g, n, τ). The verifier then sends (G, g, n) to the prover. The prover verifies that G is indeed a group and g ∈ G has order at most n. (Recall that an upper bound on the order of g is explicitly included as part of the description of G. See Footnote 4.) In the following, let N = 2^k n.

2. commit: To commit to a set S = {p_1, p_2, ..., p_m}, the prover chooses an integer v ∈ {0, ..., N−1} at random and computes

\[ u = \prod_{i=1}^{m} p_i, \qquad C = g^{uv}, \]

and outputs σ = C as the public commitment and ρ = uv as the private information for generating proofs in future.

3. prove: When the prover receives a query about an element p, it responds depending on whether p ∈ S or not:
– p ∈ S: In this case the prover computes u_0 = u/p and C_0 = g^{v u_0}, and sends π = (yes; C_0) to the verifier.
– p ∉ S: Since v < p and gcd(p, u) = 1, we have gcd(p, uv) = 1. The extended Euclidean algorithm allows the prover to compute integers (a_0, b_0) such that a_0 uv + b_0 p = 1. The prover then chooses a number γ ∈ {0, ..., N−1} at random, and forms the pair of integers (a, b) := (a_0 + γp, b_0 − γuv). Finally, the prover evaluates A = g^a, B = g^b, D = C^a and sends π = (no; A, B, D) to the verifier.

4. verify: When the verifier queries about an element p and receives a proof π from the prover, it does the following:
– If π is of the form (yes, C_0), i.e., the prover asserts that p ∈ S, then the verifier checks whether C_0^p = C holds. It outputs 1 if the preceding equation holds, and ⊥ otherwise.
– If π is of the form (no, A, B, D), i.e., the prover claims p ∉ S, then the verifier checks whether TDDH(A, C, D; G, g, τ) = 1 (i.e., whether (g, A, C, D) is a DDH tuple in G) and whether D · B^p = g holds in the group. It outputs 0 if both checks are satisfied, and ⊥ otherwise.

5 Security of Our Scheme

Theorem 1. Under the strong RSA and the KEA-DH assumptions, Construction 1 is a secure statistically hiding set scheme.

In the rest of this section we prove this. The soundness of the construction depends on the computational assumptions (Lemma 2), but the statistical hiding property is unconditional (Lemma 6, which is proven using Lemma 3, Lemma 4 and Lemma 5).

Completeness. For the completeness property, we need to show that an honest prover can always convince the verifier about the membership of the queried element.

Lemma 1. Construction 1 satisfies the completeness property.

Proof. For any finite set S and any element p, we show that an honest prover's proof of membership or non-membership will be accepted by an honest verifier. Let S = {p_1, ..., p_m}, let u, v be computed as in the commitment phase of our construction, and let p be an element queried by the verifier.

If p ∈ S, then p | uv and so uv/p is an integer. So the prover can indeed compute C_0 = g^{uv/p} and send it to the verifier. The verifier will accept this proof since C_0^p = C.

If p ∉ S, then p ∤ u. Note that since we require p ≥ N and 0 ≤ v < N, we have p > v. So p ∤ uv and, p being prime, gcd(p, uv) = 1. So the prover can run the extended Euclidean algorithm and find (a_0, b_0) ∈ Z × Z such that a_0 uv + b_0 p = 1.
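To see the algebra of Construction 1 end to end, here is a toy Python walkthrough. It is emphatically not a secure instantiation: Z_p^* with a public prime modulus is not a trapdoor DDH group, the demo primes are far too small to satisfy p ≥ N, and the verifier's TDDH test is mocked by letting the proof carry the exponent a (flagged below), which a real proof must never do. What it does show is why the verification equations in steps 3 and 4 hold.

```python
import random

# Toy group: Z_p^* for a Mersenne prime. A real instantiation needs a
# trapdoor DDH group (Gen, TDDH); both are mocked here.
P = 2**127 - 1
G = 3
K = 32
N_BOUND = (2**K) * (P - 1)        # stands in for N = 2^k * n

def ext_gcd(x, y):
    """Return (a, b) with a*x + b*y = gcd(x, y)."""
    if y == 0:
        return 1, 0
    a, b = ext_gcd(y, x % y)
    return b, a - (x // y) * b

def commit(primes):
    """sigma = C = g^{uv}; rho = uv is kept private by the prover."""
    u = 1
    for q in primes:
        u *= q
    v = random.randrange(N_BOUND)
    uv = u * v
    return pow(G, uv, P), uv

def prove(p, uv, C):
    if uv % p == 0:                       # membership: p divides uv
        return ("yes", pow(G, uv // p, P))
    a0, b0 = ext_gcd(uv, p)               # a0*uv + b0*p = 1
    gamma = random.randrange(N_BOUND)
    a, b = a0 + gamma * p, b0 - gamma * uv
    A, B = pow(G, a, P), pow(G, b, P)     # pow handles negative b (Py >= 3.8)
    D = pow(C, a, P)
    # 'a' is returned ONLY so the demo can mock the TDDH trapdoor test;
    # in the real scheme the verifier runs TDDH(g, A, C, D) using tau.
    return ("no", A, B, D, a)

def verify(p, C, proof):
    if proof[0] == "yes":
        return 1 if pow(proof[1], p, P) == C else None
    _, A, B, D, a = proof
    ddh_ok = pow(G, a, P) == A and pow(C, a, P) == D   # mock TDDH check
    eq_ok = (D * pow(B, p, P)) % P == G                # D * B^p = g
    return 0 if (ddh_ok and eq_ok) else None

S = [1000003, 1000033, 1000037]           # demo elements, already prime
C, rho = commit(S)
assert verify(1000003, C, prove(1000003, rho, C)) == 1   # member
assert verify(1000039, C, prove(1000039, rho, C)) == 0   # non-member
print("membership and non-membership proofs verify")
```

The identity a·uv + b·p = 1 is what makes D · B^p = g^{a·uv + b·p} = g, while the DDH check on (g, A, C, D) ties D to the committed exponent uv; the re-randomisation by γ replaces (a_0, b_0) with a fresh Bezout pair on each proof.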