
Digital Object Identifier (DOI) 10.1007/s00446-004-0113-4

Distrib. Comput. (2005) 17: 191–207

Efficient low-contention asynchronous consensus with the value-oblivious adversary scheduler

Yonatan Aumann 1, Michael A. Bender 2

1 Department of Mathematics and Computer Science, Bar-Ilan University, Ramat-Gan, Israel (e-mail: aumann@cs.biu.ac.il)

2 Department of Computer Science, State University of New York at Stony Brook, Stony Brook, NY 11794-4400, USA

Abstract. We consider asynchronous consensus in the shared-memory setting. We present the first efficient low-contention consensus algorithm for the weak-adversary-scheduler model. The algorithm achieves consensus in O(n log² n) total work and O(log n) (hot-spot) contention, both expected and with high probability. The algorithm assumes the value-oblivious scheduler, which is defined in the paper. Previous efficient consensus algorithms for weak adversaries suffer from Θ(n) memory contention.

1 Introduction

1.1 Overview

The problem. Asynchronous consensus is a central problem in distributed computing. The consensus problem on n processors is defined as follows. Initially, each processor P_i has a private input value v_{P_i}. The goal is to agree on a common consensus value v, which must be one of the input values. If all the input values are either 0 or 1, then the problem is called binary consensus. In this work we consider the asynchronous binary-consensus problem in the shared-memory setting. The asynchronous consensus problem can be viewed as a game between the processors and an adversary scheduler. The processors seek to attain consensus, while the adversary scheduler attempts to prevent consensus by controlling the processors' speeds. Technically, at each instant in time, the adversary determines the next processor to operate.

An early version of this paper was presented in the 23rd International Colloquium on Automata, Languages, and Programming (ICALP '96) [8].

This work was partially completed while the author was at Harvard University, supported in part by ONR contract ONR-N00014-91-J-1981.

This work was supported in part by HRL Laboratories, Sandia National Laboratories, and NSF Grants ACI-032497, CCR-0208670, and EIA-0112849. This work was partially completed while the author was at Harvard University, supported in part by NSF grants CCR-9700365, CCR-9504436, and CCR-9313775.

Asynchronous consensus is impossible deterministically [15,16,19,21] (see Sect. 1.3). Hence, we seek randomized algorithms. The best known algorithms for the standard (strong) adversary scheduler require expected Õ(n²) total work¹ [13]. Recently, Aspnes [2] proved that this bound is the best possible. Specifically, [2] proves that any shared-memory protocol for asynchronous consensus with the standard adversary model requires Ω(n²/log² n) total work. Thus, in order to attain sub-Õ(n²) work, weaker adversary models must be used. (In general, a weaker adversary model is obtained by imposing restrictions on the information available to the adversary for making its adversarial decisions. The exact definitions are provided later.) Chandra [14] and Aumann [7] provide algorithms that obtain consensus in expected Õ(n) total work, assuming a weak adversary scheduler (O(n log² n) in [14] and O(n log n) in [7]). However, both algorithms suffer from Θ(n) contention on shared-memory registers. In fact, Θ(n) is not only the worst-case contention, but also the expected contention if the processors operate in a more-or-less synchronized fashion (see Sect. 1.3). Note that if processors must queue (and busy-wait) for accessing shared-memory variables, then Θ(n) contention translates to a Θ(n) additional factor in execution cycles, thus bringing the total number of cycles back to Õ(n²).

Our results. In this paper we present an efficient low-contention consensus algorithm for a weak-adversary-scheduler model. We present a randomized algorithm that obtains consensus in O(n log² n) total work, O(log n) hot-spot contention, and O(n log³ n) contention cost. Total work is defined as the total number of instruction steps performed by all processors collectively, regardless of the distribution of the instructions among the processors and regardless of the processors' relative speeds and interleaving activity. Hot-spot contention is the maximal number of simultaneous accesses to any shared-memory variable. Contention cost is an upper bound on the total number of busy-waiting cycles that processors incur throughout the execution. Formal definitions of the performance measures are provided in Sect. 1.2. The performance bounds are obtained assuming the Value-Oblivious adversary scheduler, also defined in Sect. 1.2. Our algorithm is error free (Las Vegas) and the performance is guaranteed both with high probability and in expectation.

¹ The notation g(n) = Õ(f(n)) means that g(n) = O(f(n) log^c n) for some constant c; i.e., g is O(f) up to polylogarithmic factors. Similarly, g(n) = Ω̃(f(n)) means that g(n) = Ω(f(n)/log^c n) for some constant c.

We say that an event E occurs with high probability if for any constant c > 0 there exists a proper choice of constants such that Pr[E] ≥ 1 − n^{−c}.

Outline. In the remainder of this section we present the formal model and discuss related work. The description of the consensus protocol is divided into four parts. First, in Sect. 2 we explain how the processors collectively toss O(log n) weak random coins. Next, in Sect. 3 we show how to use the weak random coins to obtain consensus with high probability. The algorithm thus obtained has a polynomially small failure probability. In Sect. 4 we provide a construction that allows us to transform the consensus algorithm into one that never fails, thus obtaining a Las Vegas guarantee. In Sect. 5 we show how to reduce the contention from O(log² n) to O(log n). Finally, we integrate the results in Sect. 6 and present open problems.

1.2 The model

Operations. We assume a system containing n asynchronous processors. The processors communicate via shared memory with multi-reader/multi-writer registers. The processors have a set of atomic operations that they may execute. The atomic operations include:

1. reading a word from shared memory,
2. writing a word to shared memory, and
3. executing one of a fixed set of basic computations (e.g., add, multiply, coin toss, etc.) in local registers.

We emphasize that no atomic operation can both read from and write to shared memory. Thus, no compound operation such as test&set or compare&swap is atomic.

Asynchrony. The system operates in a sequence of discrete steps. At each step, exactly one processor performs one atomic operation. The ordering of the operations among the processors is arbitrary and is determined by an adversary scheduler. Prior to each step, there is an enabled operation associated with each processor that has not yet completed the procedure. The enabled operation is the operation that the processor performs whenever it is granted the right to operate. There are no fairness constraints imposed on the schedule. In particular, all but one processor may fail.

The value-oblivious adversary scheduler. At each step, the adversary scheduler determines the next processor to operate. In defining the adversary scheduler, we must specify what information is available to it for making this decision. Here, we introduce the Value-Oblivious adversary scheduler, which, we believe, faithfully captures the asynchronous behavior of real-world systems. Our algorithms assume the Value-Oblivious adversary scheduler.

The Value-Oblivious scheduler is assumed to know the entire state of the system prior to the beginning of the execution of the consensus procedure, including all values stored in all memory cells, all of the processors' input values, and the code of the protocol itself. The definition of the Value-Oblivious scheduler distinguishes between the functional description of an operation and the values description of an operation. The functional description of an operation consists of

1. a description of the type of the operation, e.g., read, write, add, multiply, etc., and
2. a description of the addresses and locations of the operands of the operation.

The values description of an operation provides the actual values manipulated by the operation. For example, suppose that register R stores the value 'abc', and consider the operation A[100] ← R. In this case, the functional description of the operation says that the operation is a copy operation, copying the value stored in register R to the 100-th location of the array A. The values description of the operation says that the value being copied is 'abc'.
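To make the split concrete, here is a small Python sketch (our own illustration, not part of the paper's formal model) that separates the two descriptions of the copy operation A[100] ← R above:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Operation:
    """A shared-memory operation, split into the two descriptions
    used to define the Value-Oblivious scheduler."""
    op_type: str     # functional: e.g. "copy", "read", "write"
    operands: tuple  # functional: addresses/locations of operands
    value: object    # values description: the actual data moved

    def functional_description(self):
        # What the Value-Oblivious adversary is allowed to see.
        return (self.op_type, self.operands)

    def values_description(self):
        # Hidden from the adversary (e.g., outcomes of coin tosses).
        return self.value

# The example from the text: register R holds 'abc'; execute A[100] <- R.
op = Operation(op_type="copy", operands=("R", "A[100]"), value="abc")

assert op.functional_description() == ("copy", ("R", "A[100]"))
assert op.values_description() == "abc"
```

Note that the adversary's view (`functional_description`) names the source register and the target array location but never exposes 'abc' itself, which is exactly the information gap the scheduler definition relies on.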

The Value-Oblivious scheduler is provided with the functional description of the operations, but not the values description. Specifically, the Value-Oblivious scheduler has knowledge of:

1. the functional description of all operations performed so far, and
2. the functional description of all enabled operations.

The Value-Oblivious adversary is not provided with the values description of the past or enabled operations. In particular, the adversary is not (directly) provided with the outcomes of the random coin tosses. The scheduler can gain knowledge of the outcome of a coin toss only insofar as the outcome affects the functional description of the computation (e.g., by affecting a branch decision or the location of an operation's operand).

We claim that the power given to the Value-Oblivious scheduler captures the common sources of asynchronous behavior of real-world systems. That is, the real-world schedule may depend on the functional description of operations, but it does not depend on the values description. To illustrate, common causes of asynchrony such as hardware failures, clock skews, operating system interrupts, cache misses, page faults, and network delays all depend solely on the operations performed, not on the actual values manipulated by the operations. This is due to the fact that on most modern processors common integer ALU operations run in the same number of cycles regardless of the values of the operands.² Similarly, the time to run a load or store operation may depend on the location in memory but rarely on the actual data. Accordingly, the scheduler, which is designed to model real-world scheduling dynamics, is provided with the functional description of operations but not the values description.

We note that there may be cases in which this model breaks down, but we believe that such cases are rare. Chandra [14] gives a hypothetical example of hardware where the cache optimization mechanism skips writes that do not change the register value. In this case, the values description may affect the dynamics of the execution, and hence the schedule.

² This may not apply to the division operation and to some floating-point operations. Consequently, our algorithms should not use these operations.

Contention. In defining contention we follow [17]. In [17] each memory-access operation is assumed to consist of two stages: initiation (by the processor) and response (by the system). Between initiation and response the operation is pending. Each initiation event must ultimately have a single matching response event, but the ordering of responses is arbitrary and is determined by the system. In our model, initiation can be identified with the point in time when the operation is enabled, and the response with the point where it is executed.

For an operation op on a shared-memory variable v, the contention cost of op is the number of response events from v that occur between the initiation of op and its response. Intuitively, the contention cost counts the number of busy-waiting cycles of the operation. The contention cost of an execution is the sum of the contention costs of all operations in the execution.
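As a small worked example of the definition (our own illustration, following [17] as summarized above), the contention cost of an operation can be read off a time-ordered trace of initiation and response events on a single variable:

```python
def contention_cost(trace, op_id):
    """Contention cost of operation `op_id`: the number of response
    events on the same variable that fall strictly between this
    operation's initiation and its response.
    `trace` is a time-ordered list of (event, op_id) pairs,
    where event is "init" or "resp"."""
    start = trace.index(("init", op_id))
    end = trace.index(("resp", op_id))
    return sum(1 for ev, oid in trace[start + 1:end]
               if ev == "resp" and oid != op_id)

# Three operations pending on one variable; operation "a" busy-waits
# while "b" and "c" receive their responses first.
trace = [("init", "a"), ("init", "b"), ("init", "c"),
         ("resp", "b"), ("resp", "c"), ("resp", "a")]
assert contention_cost(trace, "a") == 2   # two busy-wait cycles
assert contention_cost(trace, "b") == 0
```

The contention cost of the whole execution would then be the sum over all three operations (here 2 + 0 + 1 = 3), matching the definition in the text.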

The hot-spot contention of an execution is defined as the maximum, taken over all shared-memory variables v, of the maximum number of simultaneous operations pending on v.

Work. Performance is measured using the total work measure. Total work counts the total number of operations executed throughout the duration of the protocol, summed over all processors. Total work does not account for busy waiting due to contention; this quantity is measured by the contention cost.

An asynchronous protocol is said to be wait free if any processor can complete the protocol after a finite number of its own steps, regardless of the operations of the other processors [20]. A protocol A is bounded wait free if the following holds: there exists a function f_A(n) such that for any number of processors n and any processor P that is participating in an n-processor implementation of protocol A, processor P completes its part in the protocol within f_A(n) steps, regardless of the operations of the other processors; the protocol is expected bounded wait free if the processor completes within expected f_A(n) steps.

The consensus protocol we present here is expected bounded wait free.

1.3 Related work

Fischer et al. [19] proved the impossibility of deterministic asynchronous consensus in a message-passing environment even if only one processor may fail. Loui and Abu-Amara [21], Chor et al. [15], and Dolev et al. [16] proved analogous results for the shared-memory setting. The first randomized solution for the shared-memory, wait-free setting was presented by Chor et al. [15]; this algorithm runs in expected O(n²) total work and assumes a weak adversary scheduler. The first solution for the strong (standard) adversary model was presented by Abrahamson [1], but the work bounds are exponential. Aspnes and Herlihy [4] gave the first polynomial-work algorithm for the strong-adversary model. A series of papers followed, ultimately reducing the complexity to O(n² log n) total work [13] and O(n log² n) steps per processor [5]. The reader is referred to [3] for a comprehensive survey of consensus algorithms. Aspnes [2] showed that these last results are the best possible up to polylogarithmic factors by proving that any protocol for the strong-adversary model requires Ω(n²/log² n) total work. Thus, in order to obtain sub-Õ(n²) performance, weaker adversary models must be used.

We now focus our attention on weak adversary models. Several different weak adversary models exist in the literature. We denote these: the Oblivious scheduler, the Value-Oblivious scheduler, the Write-Oblivious scheduler, and the Local-Oblivious scheduler. Appendix A provides a review of the definitions of these schedulers and the relationships between them. The Oblivious scheduler is the weakest, requiring the adversary to determine the entire schedule in advance. The other three schedulers allow for dynamic choices of the adversary, but differ in the information made available to the adversary for making these choices. Of these three models, the Value-Oblivious scheduler is the weakest, and the other two are incomparable.

We now review the results for the weak models.

Aumann et al. [9] consider the consensus problem in the Oblivious-adversary setting. Their protocol obtains simultaneous agreement on n words in O(n log n log log n) total work.

Chor et al. [15] provide an expected O(n²) total work protocol for the Local-Oblivious scheduler. Abrahamson [1] presents a protocol with expected O(n² log n) steps per processor, also for the Local-Oblivious scheduler.

Chandra [14] provides a protocol that obtains consensus in expected O(log² n) steps per processor, assuming the Write-Oblivious scheduler. The consensus may be on inputs of any size, not only binary inputs. Following the first execution of the protocol, subsequent executions can be completed in O(log n) steps per processor; subsequent executions of binary consensus complete in O(1) steps per processor. Aumann [7] achieves consensus in expected O(log n) steps per processor, also assuming the Write-Oblivious scheduler.³ Here too, the consensus may be on inputs of any size, not only binary.

The results of Chandra [14] and Aumann [7] translate to O(n log² n) and O(n log n) total work, respectively. In addition to total-work bounds, both Chandra and Aumann give upper bounds on the number of steps performed by any individual processor. In this respect, the bounds of Chandra and of Aumann are stronger than those in this paper. However, as mentioned in the introduction, the expected hot-spot contention of both these protocols is Θ(n), and the expected contention cost is Θ(n²). The reason is that in both algorithms, in each round there are only a constant number of registers into which all processors write. Thus, for example, if in the first round all processors are synchronous, then there must be at least one register that has Θ(n) hot-spot contention and hence Θ(n²) contention cost.

Aumann and Kapach-Levy [10] consider the problem of asynchronous consensus using only single-writer/single-reader variables. They provide a binary-consensus algorithm that completes in O(n log n · e^{√log n}) total work, assuming the Value-Oblivious scheduler.

It is worthwhile to compare the techniques used in our paper to those used in other papers on asynchronous consensus, and in particular to those of Chandra [14] and Aumann [7]. Most of the other papers are based, following the protocol of Ben-Or [12], on a strategy where processors operate in a sequence of rounds. In each round each processor proposes a value and reads the proposals of all other processors. If all of the proposals in the round (or the last two rounds for some algorithms) agree, then consensus is obtained. If there is no agreement in the round, then the computation advances to the next round in one way or another. In the protocols of Chandra [14] and of Aumann [7], if there is no agreement in the round then the processors toss a shared weak coin to choose their next proposal.

This paper, in contrast, uses a different approach. Initially, processors repeatedly write their input value to random locations in a shared-memory array of size Θ(n). The array then holds a mix of zeros and ones. The algorithm then gradually modifies the array so that one principal value is stored throughout the array. This convergence is achieved by repeatedly sampling the array and writing back a new value based on the values present in the sample. A two-threshold policy guarantees convergence within O(log n) phases with high probability (see [23] for an early use of a two-threshold technique). Following this convergence stage, an additional stage is executed to guarantee that convergence has indeed been obtained (the Estimate Counting and Exit-Stage procedures). If not, an alternate consensus procedure is executed, thus ensuring that consensus is obtained unconditionally.

³ The protocol as published in [7] contains a bug. However, the bug is easily fixable, and the result still holds. The bug-free protocol shall appear in the full version of the paper (in preparation).

Weak-Random-Coins (for processor P_i)
Repeat
1   if total number of operations of processor exceeds n² then
        run Alternate (Fig. 10)
2   k ← Read-Clock
3   if k = "undefined" then goto line 13
4   else if k > d then exit                               // algorithm completed
5   else if k = 0 then                                    // phase 0, fill inputs
6       choose i ∈ [1,…,β log n], j ∈ [1,…,N] at random   // random butterfly B_i and cell B_i[0][j]
7       choose r ∈ {0,1} at random with Pr[r = 1] = 1/N
8       B_i[0][j] ← r
9   else                                                  // phase k > 0
10      for t = 1 to log n do
11          choose i ∈ [1,…,β log n], j ∈ [1,…,N] at random   // random butterfly B_i and cell B_i[k][j]
12          B_i[k][j] ← B_i[k−1][j] ∨ B_i[k−1][j|k−1]     // logical OR of predecessors
13  Clock-Update

Fig. 1. Weak-Random-Coins procedure

Dwork et al. [17] study contention issues arising in the context of distributed shared-memory protocols. Dwork et al. prove that for any wait-free randomized consensus algorithm with hot-spot contention bounded by c, there must be a processor for which the (worst-case) expected number of steps is (n−1)/c. The result is stated for the strong adversary scheduler, but the proof also holds for the Value-Oblivious adversary scheduler. This result means that Õ(1) hot-spot contention implies the existence of a processor with expected Ω̃(n) steps. Thus, any protocol that has low contention, such as the one in the current paper, cannot guarantee low per-processor bounds, which explains our focus on total work as the performance metric.

Saks et al. [24] obtain a solution that is both wait free in the general case and fast in the near-synchronous case. Using the interleaving-algorithms method ([6]), they provide a protocol that, with f faulty processors, terminates in O(log n) rounds if f < √n, and in O(n³/(n−f)) rounds otherwise. The length of a round is determined by the speed of the slowest nonfaulty processor.

2 Weak random coins

A weak random coin is not necessarily fair, but nonetheless:

1. there is at least a constant probability that the coin falls on heads, i.e., gets the value 1, and
2. there is at least a constant probability that the coin falls on tails, i.e., gets the value 0.

There may also be some probability that the weak random coin has an 'undefined' value or that the coin exhibits different outcomes to different viewers. The notion of a weak random coin has been used in other agreement protocols (e.g., [4,13,18]).

In this section we develop an algorithm for the collective of processors to toss Θ(log n) weak random coins. Specifically, we give a procedure by which the processors collectively toss β log n = Θ(log n) coins, half of which are guaranteed (with high probability) to be weak random coins. The adversary may determine which coins are random and which are not, but the majority are weak random coins.

2.1 The protocol

Let d = ⌈log(n/log n)⌉ and N = 2^d = O(n/log n) (all logs are to the base 2). The coin-toss procedure employs the following data structure. Associated with each coin r_i, i = 1,…,β log n, is a fixed N × (d+1) array B_i. We view each array B_i as a butterfly structure with d+1 levels as follows. For k = 0,…,d, the cells B_i[k][·] constitute the k-th level of the butterfly. Each cell B_i[k][j] of level k is said to have two predecessor cells in the previous level: B_i[k−1][j] and B_i[k−1][j|k] (here j|k is the number whose binary representation is identical to that of j except for the k-th bit, which is flipped). We call the cells in level 0 the inputs and the cells in level d the outputs.
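The predecessor indexing can be sketched in a few lines of Python (our own illustration; we assume, consistent with line 12 of Fig. 1, that a cell in level k flips the bit in position k−1 when positions are counted from 0):

```python
def predecessors(k, j):
    """Predecessor cells of B[k][j] in one butterfly: B[k-1][j] and
    B[k-1][j'], where j' is j with one bit flipped. We assume the
    flipped bit is in position k-1 (0-indexed), matching the
    pseudocode's j|k-1."""
    assert k >= 1, "level-0 cells (the inputs) have no predecessors"
    return (k - 1, j), (k - 1, j ^ (1 << (k - 1)))

# Cell B[1][0] has predecessors B[0][0] and B[0][1];
# cell B[3][0] has predecessors B[2][0] and B[2][4] (bit 2 flipped).
assert predecessors(1, 0) == ((0, 0), (0, 1))
assert predecessors(3, 0) == ((2, 0), (2, 4))
```

Because each level flips a different bit, following predecessors from any output back through all d levels reaches every one of the N inputs, which is the connectivity the OR-propagation below exploits.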

All cells are initialized to 0. The pseudocode of the weak-coin protocol appears in Fig. 1. The protocol is performed in rounds, which the processors execute repeatedly until the algorithm terminates. A round is a single execution of lines 1–13.

Throughout the protocol, each processor counts the number of operations that it performs. If any processor exceeds n² operations, then it exits the entire consensus protocol and enters the Alternate protocol, as described in Fig. 10. The use of Alternate ensures that high-probability guarantees can be converted into bounds on the expected work. (The choice of n² as the bound is arbitrary. Any polynomial bound that is ω(n log n) would suffice.)

The algorithm is composed of d+1 phases. For each k, 0 ≤ k ≤ d, phase k is devoted to filling the k-th level of the butterflies. Thus, in phase 0 the processors write to, or fill, input cells of all the butterflies. To do so, each processor repeatedly picks a random input cell in a random butterfly and writes a 0 or a 1 to it, where

Pr[writing 1] = 1/N,
Pr[writing 0] = 1 − 1/N.

Phase 0 completes after there have been O(N log N) = O(n) fillings of input cells.

Phases k, for k > 0, are longer; each subsequent phase consists of O(N log² N) = O(n log n) fillings. To decide which cell to fill during phase k, a processor (as before) picks a random butterfly and a random cell in level k of the butterfly and writes a 0 or a 1 to it. The goal of the procedure is now to propagate 1's from the butterfly's input cells to its output cells. Accordingly, when a processor fills a cell, it writes the logical OR of the values appearing in the two predecessor cells (i.e., two 0's propagate a 0, and a 1 at either of the predecessors propagates a 1).

Each butterfly corresponds to a coin toss. A butterfly with a sufficiently large fraction of 1's in the output corresponds to a 1 coin flip, and a butterfly with a sufficiently large fraction of 0's in the output corresponds to a 0 coin flip. If there are neither enough 1's nor enough 0's, then the coin is undefined. We show that most of the butterflies have at least a constant fraction of their inputs filled, and for each of these butterflies there is a constant probability that some input is a 1 and a constant probability that all of the inputs are 0's. The objective is to propagate any 1 input to most or all of the outputs of the butterfly. If all inputs are 0's, then 0's will be propagated to the outputs.

The adversary can try to hinder the algorithm by stopping some 1's from being propagated. If the adversary knew which input cells contained 1's, it could effectively stop all 1's from being propagated to the output cells of a butterfly, thus forcing all outputs to be 0 and causing all coin tosses to be 0. Luckily, the Value-Oblivious adversary cannot be so destructive, because it does not know which input cells contain the 1's. We show that for at least half of the butterflies, there is a constant probability that at least 3/4 of the outputs are 1's, and a constant probability that at least 3/4 of the outputs are 0's.

Fig. 2. One of the butterflies from the weak-coins procedure. There is a constant probability of having a 1 in the input level and a constant probability of having no 1's in the input level. When subsequent levels are filled, the 1's propagate. If the adversary knew which inputs were 1's, it could force all outputs to be 0's by stopping processors from propagating 1's. Luckily for the algorithm, the adversary is value oblivious
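To see the weak-coin behavior concretely, here is a self-contained simulation of a single butterfly (our own sketch, under the simplifying assumptions of no adversary and every cell filled in its phase; the lemmas below only guarantee this for a large fraction of cells):

```python
import random

def toss_butterfly(d, rng):
    """Simulate one butterfly coin: N = 2**d inputs, each set to 1
    with probability 1/N (phase 0), followed by d levels of
    OR-propagation (phases 1..d). Returns the output layer."""
    N = 2 ** d
    level = [1 if rng.random() < 1.0 / N else 0 for _ in range(N)]
    for k in range(1, d + 1):
        # Cell j of level k is the OR of its two predecessors,
        # j and j with bit k-1 flipped, in level k-1.
        level = [level[j] | level[j ^ (1 << (k - 1))] for j in range(N)]
    return level

outputs = toss_butterfly(d=6, rng=random.Random(1))
# With full propagation, any single 1 among the inputs reaches every
# output, so the output layer is all-0 or all-1.
assert len(set(outputs)) == 1
```

Since Pr[some input is 1] = 1 − (1 − 1/N)^N ≈ 1 − 1/e, both outcomes occur with constant probability: exactly the two conditions in the weak-random-coin definition above.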

An important issue is how the processors know when to advance from one phase to the next. That is, how do the processors know when enough cells are filled in the current phase, so that the computation may advance to the next? For this task we employ the Phase Clock construction of [11], which we describe below.

2.2 The phase clock

Most of the procedures described in this paper operate in a sequence of phases. Within each phase the processors perform a specific set of operations associated with the phase. Following a fixed amount of work (e.g., Θ(n) or Θ(n log n)) the phase completes, and the next phase begins. Thus, it is necessary to provide a tool that measures the amount of work performed in the phase and indicates when the phase is completed. Aumann and Rabin [11] present a general Phase Clock structure, which provides the necessary features. We briefly review the main characteristics of this Phase Clock. For details the reader is referred to [11].

Properties. The Phase Clock supports two procedures: Read-Clock and Update-Clock, each of which takes O(log n) operations. The procedure Read-Clock returns a non-negative integer or "undefined". The Phase-Clock procedures guarantee that for any non-negative value i (up to any exponential bound), there exist time intervals Strong(i) and Weak(i) such that with high probability:⁴

• Strong(i) ⊆ Weak(i).
• Throughout interval Strong(i), Read-Clock necessarily returns the value i.
• Only during interval Weak(i) can Read-Clock return the value i.
• Weak(i) ends before Weak(i+1) starts.
• Strong(i) consists of Θ(n) invocations of Update-Clock. Furthermore, for any α, the clock can be configured such that Strong(i) consists of at least αn invocations of Update-Clock.
• O(n) invocations of Update-Clock are sufficient to advance from the beginning of Weak(i) to the beginning of Weak(i+1).

⁴ In fact, the error probability is exponentially low.

Thus, the clock advances from one integral value to the next, staying at each value for Θ(n) invocations of Update-Clock.

Using the clock. The clock, as described so far, operates in isolation. By interleaving clock updates with other work, as in Fig. 1, we obtain a measure of the total work performed in the system.

In our protocol we use a separate Phase Clock for each procedure.
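The real Phase Clock of [11] is distributed and tolerates asynchrony; purely as intuition for the interface, here is a toy, centralized stand-in (our own sketch: a single counter, with none of the probabilistic Strong/Weak guarantees above):

```python
class ToyPhaseClock:
    """Toy stand-in for the Phase Clock interface (Read-Clock /
    Update-Clock). The clock value advances by one after every
    `alpha_n` updates, mimicking 'Strong(i) consists of at least
    alpha*n invocations of Update-Clock'. The actual construction
    of [11] is shared-memory, fault tolerant, and probabilistic."""

    def __init__(self, alpha_n):
        self.alpha_n = alpha_n
        self.updates = 0

    def update_clock(self):
        # In the protocols, one call per round, interleaved with work.
        self.updates += 1

    def read_clock(self):
        # Current phase number; the real clock may also return
        # "undefined" near phase boundaries.
        return self.updates // self.alpha_n

clock = ToyPhaseClock(alpha_n=100)
for _ in range(250):
    clock.update_clock()
assert clock.read_clock() == 2   # 250 updates -> phase 2
```

The point of the interleaving in Fig. 1 is visible here: since every round performs one Clock-Update, the clock value is a proxy for the total work done so far, which is what triggers phase transitions.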

2.3 Obtaining the outcome of the coin flip

We will show that when the Weak-Random-Coins procedure completes, with high probability, for at least half of the butterfly arrays B_i, 3/4 of the cells of the output layer share a common value r ∈ {0,1}. Furthermore, there is a constant probability that r is 1 and a constant probability that r is 0. Thus, if we were to sample the output layer of these butterflies, we would get a weak-random-coin behavior. However, directly sampling the output layers of the butterflies would result in high contention. Hence, in order to reduce contention, we "transfer" the values from the output layer of each butterfly B_i, which is of size Θ(n/log n), into an array R_i of size Θ(n). This is achieved by the processors, collectively, reading Θ(n) random values from the output cells of B_i and writing them to random locations of R_i. The Transfer procedure is provided in Fig. 3.

In order to obtain the value of the i-th coin flip, denoted by r_i, a processor samples Θ(log n) locations of R_i and takes the majority value.
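The readout step can be sketched as follows (our own illustration; the sample size c·log n uses an unspecified constant c, here arbitrarily set to 8):

```python
import math
import random

def read_coin(R_i, n, c=8, rng=random):
    """Obtain the value of coin i by sampling Theta(log n) random
    locations of the array R_i and taking the majority value."""
    k = max(1, int(c * math.log2(n)))
    sample = [R_i[rng.randrange(len(R_i))] for _ in range(k)]
    return 1 if 2 * sum(sample) > len(sample) else 0

# If R_i predominantly holds one value, the majority of a
# logarithmic-size sample recovers it with high probability.
n = 1024
assert read_coin([1] * n, n) == 1
assert read_coin([0] * n, n) == 0
```

By a Chernoff bound, when at least a 3/4 fraction of R_i holds the value r, a sample of c·log n locations returns majority r with probability 1 − n^{−Ω(c)}, which is why Θ(log n) samples suffice here.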

2.4 Analysis

2.4.1 Correctness

We denote by phase k, k = 0,…,d, the time interval when the clock value is k (i.e., Strong(k)).

We view the process of randomly filling cells as a balls-and-bins game. There is a subtle difference, however, because the adversary knows the next operation of each processor and can thus decide to stop any given operation. Fortunately, the adversary cannot stop too many cells from being filled.

Lemma 1. Consider the process where c₁n balls are thrown into c₂n bins. Suppose that an adversary can stop at most n of the balls in mid-air. Then, for any c₂ and ε < 1, there exists a c₁ such that, for all n, with high probability, at most εc₂n bins do not hold a ball at the end of the process.

Similarly, if c₁n log n balls are thrown, then at most εc₂n/log n bins do not hold a ball.
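Before the formal proof, the lemma can be explored with a small Monte Carlo sketch (our own illustration; the adversary here plays a simple greedy strategy of stopping balls aimed at still-empty bins, which is not necessarily optimal):

```python
import random

def empty_bins_after_adversary(c1, c2, n, rng):
    """Throw c1*n balls into c2*n bins; an adversary may stop up to
    n balls in mid-air. Greedy adversary: stop a ball whenever its
    target bin is still empty and stopping budget remains."""
    bins = [0] * (c2 * n)
    budget = n
    for _ in range(c1 * n):
        j = rng.randrange(len(bins))
        if bins[j] == 0 and budget > 0:
            budget -= 1       # adversary stops this ball mid-air
        else:
            bins[j] += 1      # ball lands
    return sum(1 for b in bins if b == 0)

empty = empty_bins_after_adversary(c1=16, c2=2, n=1000,
                                   rng=random.Random(0))
# With c1 large relative to c2, only a tiny fraction of the
# c2*n = 2000 bins stays empty, as the lemma predicts.
assert empty < 200   # i.e., fewer than eps*c2*n for eps = 0.1
```

The intuition matches the proof below: the adversary's budget of n stops is quickly exhausted, after which the remaining (c₁ − O(1))n balls fill all but an exponentially small fraction of the bins.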

Proof. First consider the situation without the adversary. For a given bin, the expected number of balls aimed at the bin is c₁/c₂. Say that a bin is weakly aimed if fewer than c₁/(2c₂) balls are aimed at it, and heavily aimed otherwise. By the Chernoff bounds, the probability that a bin is weakly aimed is ≤ e^{−c₁/(8c₂)} < ε/4, for c₁ sufficiently large. Applying the Chernoff bound again, we obtain

Pr[the number of weakly aimed bins is > εc₂n/2] ≤ e^{−εc₂n/8},

which decays faster than any inverse polynomial in n. If a bin is heavily aimed, then the adversary must stop at least c₁/(2c₂) balls in mid-air in order to stop the bin from holding a ball. Thus, the adversary can keep empty at most n/(c₁/(2c₂)) ≤ εc₂n/2 heavily aimed bins, for c₁ sufficiently large. The total number of bins with no balls is bounded by the number of weakly aimed bins plus the number of heavily aimed bins for which the adversary stops all balls in mid-air. Thus, the total number is at most εc₂n. The proof for the case of c₁n log n balls is similar.

Lemma 2. For any β (where β log n is the number of butterflies), with high probability the following holds:

1. The total number of inputs not filled during phase 0 is at most 2n.
2. For any k > 0, the total number of cells of level k not filled during phase k is at most n/log n.

Proof. Line 13 is reached at least αn times during phase 0. Thus, line 6 is reached at least αn times. Level 0 contains βn cells, and each time a processor reaches line 6 it chooses a cell at random. After a processor chooses a cell, the adversary can stop the processor from advancing to line 7. However, there are only n processors. Thus, the situation corresponds to that of Lemma 1, with c₁ = α and c₂ = β. Choose ε = 2/β and the result follows.

Now consider phase k > 0. Line 13 is reached at least αn times during the phase. There are n processors. Thus, the loop of lines 10–12 is executed at least (α−1)n times. Hence, line 11 is reached at least (α−1)n log n times. After a processor chooses a cell, the adversary can stop the processor from advancing to line 12. Because there are only n processors, the situation corresponds to that of Lemma 1 with c₁ = α−1 and c₂ = β. Choose ε = 1/β and the result follows.

Transfer
Repeat
1   if total number of operations of processor exceeds n² then
2       run Alternate (Fig. 10)
3   k ← Read-Clock
4   if k = "undefined" then goto line 9
5   else if k > β log n then exit        // algorithm completed
6   else
7       read random output cell of butterfly B_k
8       write the value to random location of R_k
9   Clock-Update

Fig. 3. Transfer procedure: Transferring the values from the output layer of the butterflies to the arrays R_i

For input cell x and output cell y of butterfly B_i, there is exactly one directed path x = x(0), x(1), …, x(d−1), x(d) = y extending from input x to output y.

Definition 1 Let x(0), x(1), ..., x(d) be an input-output path in one of the butterflies B_i. We say that input x(0) successfully affects output x(d) if for each k = 0,...,d, cell x(k) is filled during phase k. We say that input cell x(0) is strongly affecting if it successfully affects at least 3/4 of the outputs of B_i.

Definition 2 We say that butterfly B_i is strongly input dependent if

1. at least half of its inputs are strongly affecting, and
2. at most N/4 writes to its inputs occur after the completion of phase 0.

The following lemma guarantees that a sufficient number of butterflies are strongly input dependent. We will show in Lemma 4 that if a butterfly is strongly input dependent, then there is enough connectivity between the inputs and the outputs so that the butterfly is a weak coin.

Lemma 3 There exists a constant β₀ such that for any β > β₀, with high probability at least β log n/2 (i.e., half) of the B_i's are strongly input dependent.

Proof. We use a counting argument. First we obtain a bound on the number of butterflies for which fewer than half of the inputs are strongly affecting. Consider a butterfly B_i. For k = 0,...,d, let F^(i)_k denote the set of cells of level k in B_i not filled in phase k. Set F^(i) = ∪_{k=0}^{d} F^(i)_k. For input x₀ let D(x₀) be the set of outputs not successfully affected by x₀. By definition, if x_d ∈ D(x₀), then there must be a cell x ∈ F^(i) such that x is on the path from x₀ to x_d. However, for each x ∈ F^(i) exactly N input-output pairs (x₀, x_d) have x on their connecting path. Thus,

N·|F^(i)| ≥ Σ_{inputs x₀} |D(x₀)| ≥ Σ_{x₀ not strongly affecting} |D(x₀)| ≥ Σ_{x₀ not strongly affecting} N/4.

(The last inequality is true since, by definition, if input x₀ is not strongly affecting, then |D(x₀)| ≥ N/4.) Thus, if in B_i there are N/2 inputs that are not strongly affecting, then |F^(i)| ≥ N/8. By Lemma 2, Σ_i |F^(i)| ≤ 2n + dn/log n. Thus, in at most (8/N)(2n + dn/log n) ≤ c₁ log n of the

Fig. 4. A naïve solution. The processors repeatedly sample the array and write the majority value. In the ranges far from the 50% threshold, with high probability all the samples have the same majority value and the array converges to one value. Near the 50% threshold, samples can have different majority values and the array may not converge

Fig. 5. Fixing the naïve solution. Before the processors begin converging, they toss a common weak coin to decide on the threshold. Now, for any initial number of 1's in the array, the array converges with constant probability. By repeating the converging phase Θ(log n) times, the array converges with high probability. Once the array converges, the (overwhelming) majority value will not change

B_i's (for some constant c₁) fewer than half the inputs are strongly affecting.

Let ℓ_i be the number of cells of level 0 of B_i written after the completion of phase 0. Each processor writes to level 0 at most once after the phase is completed (thereafter it reads the clock and realizes that the phase has completed). Thus, Σ_{i=1}^{β log n} ℓ_i ≤ n. Thus, the number of B_i's for which more than 1/4 of the input cells are written after phase 0 is completed is at most 4n/N ≤ c₂ log n (for some constant c₂). Thus, the total number of B_i's that are not strongly input dependent is at most c₁ log n + c₂ log n. With β ≥ 2(c₁ + c₂) the claim follows.

Definition 3 Consider the state of a butterfly B_i after the procedure completes. We say that B_i is 1-dominated (resp. 0-dominated) if at least 3/4 of the outputs have value 1 (resp. 0).

Lemma 4 There is a constant p such that for any strongly input dependent butterfly B_i, Pr[B_i is 1-dominated] ≥ p and Pr[B_i is 0-dominated] ≥ p.

Proof. Let t₀ be the time when phase 0 completes. For each input cell x, consider the last write to x before t₀ and all writes after t₀. Denote by W(x) the set of these writes, and set W = ∪_{input x} W(x). Notice that any write to the inputs not in W has no bearing on the values appearing in the outputs. This is because the levels k > 0 are only filled after phase 0 completes.

Since B_i is strongly input dependent, there are at most N/4 late writes to level 0. Thus, by definition |W| ≤ N + N/4. By construction, for any write w ∈ W, Pr[value 1 is written] = 1/N, independently. Any input x with W(x) = ∅ (never written) stores the initial value 0. Thus,

Pr[at all times after t₀ all inputs store the value 0] ≥ (1 − 1/N)^{5N/4},

which is bounded below by a positive constant (it tends to e^{−5/4}). In this case, all outputs have the value 0, and B_i is 0-dominated.

On the other hand, since B_i is strongly input dependent, N/4 of its inputs are strongly affecting and never overwritten after t₀. With probability ≥ 1 − (1 − 1/N)^{N/4} ≥ 1 − e^{−1/4}, at least one of these affecting inputs has the value 1. In this case, at least 3/4 of the outputs get the value 1, and B_i is 1-dominated.

Obtaining the outcome of the coin flip r_i is achieved by sampling Θ(log n) locations of the corresponding array R_i and taking the majority value.

Corollary 5 With high probability at least half of the values r_i constitute weak random coins.

Proof. By Lemma 3, at least half of the butterflies B_i are strongly input dependent. By Lemma 4, for a strongly input dependent butterfly B_i, with high probability 3/4 of the output cells of B_i share the same value b, which with constant probability is 0 and with constant probability is 1. Thus, with high probability the same guarantee also holds for R_i. In this case, with high probability the majority value of any Θ(log n) sample of R_i will provide the value b. Thus, for the β log n/2 strongly input dependent butterflies B_i, the coin r_i constitutes a weak random coin.

2.4.2 Work and contention

Lemma 6 With high probability the total work of the Weak-Random-Coins procedure and of the Transfer procedure is O(n log² n), and the Alternate procedure is not executed.

Proof. Consider the Weak-Random-Coins procedure. With high probability at most O(n) invocations of the procedure Clock-Update are necessary in order to advance the clock by one. Thus, with high probability the total number of times line 13 is performed is Θ(dn) = O(n log n). Each round (lines 1–13) takes O(log n) operations. There are at most n rounds completed after the clock reaches value d+1 (one for each processor). Thus, with high probability the total number of operations is O((n log n) + n) · O(log n) = O(n log² n).

Similarly for the Transfer procedure. With high probability line 8 is executed O(n log n) times, and each iteration of the loop takes O(1) operations. Hence, with high probability the total work is O(n log n).

In both cases, the Alternate procedure is not executed.

The following well-established fact helps bound the hot-spot contention:

Lemma 7 Suppose that t balls are randomly thrown into m bins such that with high probability t ≤ M for some M. Then

1. for M = O(m), with high probability no bin receives more than O(log m) balls, and
2. for M = Ω(m log m), with high probability no bin receives more than O(M/m) balls.

The proof is by standard Chernoff bounds.
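As an informal sanity check on the two regimes of Lemma 7 (not from the paper; the bin count m = 2¹², the slack constants 4 and 3, and the fixed seed are our arbitrary choices), the following sketch throws balls into bins and inspects the maximum load:

```python
import math
import random

def max_load(t, m, seed=0):
    """Throw t balls into m bins uniformly at random; return the max bin load."""
    rng = random.Random(seed)
    bins = [0] * m
    for _ in range(t):
        bins[rng.randrange(m)] += 1
    return max(bins)

m = 1 << 12
# Case 1 of Lemma 7: t = O(m) balls -> max load O(log m) w.h.p.
light = max_load(m, m)
# Case 2: t = Omega(m log m) balls -> max load O(t/m) w.h.p.
t = 4 * m * int(math.log2(m))
heavy = max_load(t, m)
print(light, heavy)
```

The observed maxima sit well below the generous bounds 4·log₂ m and 3·t/m, matching the two cases of the lemma.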

Lemma 8 In the Weak-Random-Coins procedure and the Transfer procedure, with high probability any memory location, except those of the Phase-Clock, is accessed at most O(log n) times.

Proof. Consider the levels of the butterflies in the Weak-Random-Coins procedure. Each level contains Nβ log n = O(n) locations in total. The first level is accessed in line 8 (for k = 0), and subsequent levels are accessed in line 12 (for k = 1,...,d). With high probability line 8 is performed O(n) times, and line 12 is performed O(n log n) times for each k. Each of these invocations accesses a random location of the level. Hence, by Lemma 7, with high probability any single location is accessed O(log n) times.

Similarly for the Transfer procedure. First, consider the butterfly output cells. For each k, with high probability there are O(n) reads from random output cells of B_k. There are O(n/log n) outputs of B_k. Hence, by Lemma 7, with high probability each memory location is accessed O(log n) times. Now consider the arrays R_k. With high probability there are O(n) writes to random locations of R_k. The size of R_k is n. Hence, by Lemma 7, with high probability each memory location is accessed O(log n) times.

3 The convergence procedure

3.1 The procedure

We now explain how to achieve consensus using the weak-random-coin procedure of the previous section. A pseudocode description appears in Fig. 6. The procedure employs a sequence of arrays C₀, C₁, ..., C_{β log n} of size γn, where γ is a constant to be determined later. All the cells of the arrays C_k are initialized to Λ.

The convergence procedure works in β log n + 1 = Θ(log n) phases (measured by a Phase Clock), each consisting of Θ(n log n) operations. In phase 0, the filling phase, each active processor P_i repeatedly chooses a random cell of C₀ and copies its private input, v_{P_i}, into the cell. By the end of this phase, with high probability most cells of C₀ contain values (which are either 0 or 1).

Phases k > 0 are the convergence phases. In these phases the goal is to have all values converge to a single consensus value v ∈ {0,1}. In each phase k, processors sample cells of C_{k−1} and write a value to C_k. A naïve method would have processors write the majority value of the sample. If the initial distribution of 1's and 0's in the array C₀ is far from half and half, then the values will converge to the majority value. However, if the distribution of values is approximately half and half, then this procedure may not converge. (By artfully stopping and starting processors, the adversary could keep the distribution balanced around the 50% threshold.)

Thus, we employ a two-threshold policy, as follows. The processors use the values produced by the weak-random-coin procedure. If the value of the coin is 0, then a sample with 1/3 ones suffices to write back a one. If the value of the coin is

Convergence (for processor P_i)

repeat
1   if total number of operations of processor exceeds n² then
        run Alternate (Fig. 10)
2   k ← Read-Clock
3   if k > β log n then exit // algorithm completed
4   else if k = "undefined" then goto line 14
5   else if k = 0 then // filling phase, k = 0
6       s ← local input v_{P_i} // copy local input
7   else // converging phases, k > 0
8       sample b log n locations of array C_{k−1}
9       let λ be the fraction of 1's in the sample
10      sample Θ(log n) locations of R_k and let r_k be the majority value
11      s ← [λ > (1 + r_k)/3] // threshold decision
12  choose j ∈ [1,...,γn] at random
13  C_k[j] ← s // write s in random cell of array C_k
14  Clock-Update

Fig. 6. The convergence procedure

1, then 2/3 ones in the sample are necessary in order to write back a one. Let λ be the fraction of 1's in the sample, and r_k the value of the k-th weak random coin. Then the value s written to C_k is

    if r_k = 0:  s = 1 if λ > 1/3, and s = 0 otherwise;
    if r_k = 1:  s = 1 if λ > 2/3, and s = 0 otherwise.
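In code, the rule amounts to a single comparison (a sketch; the function and parameter names are ours, not the paper's):

```python
def threshold_decision(lam, r_k):
    """Line 11 of Fig. 6: with coin r_k = 0 the threshold is 1/3;
    with r_k = 1 it is 2/3. Returns the value s written to C_k."""
    return 1 if lam > (1 + r_k) / 3 else 0

# A sample with 40% ones lies between the two thresholds, so the
# written value depends on the shared coin.
print(threshold_decision(0.4, 0), threshold_decision(0.4, 1))  # -> 1 0
```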

This policy guarantees that regardless of the initial distribution of values, there is a constant probability that the distribution is far from the threshold. The idea of randomly choosing between two thresholds was first proposed by Rabin [23].

3.2 Analysis

3.2.1 Correctness

Recall that the arrays are of size γn and that with high probability the clock remains at each value for at least αn invocations of Clock-Update, where α can be tuned as needed.

Lemma 9 For any γ there exists an α such that with high probability, for each k, at least 0.9 of the cells of C_k are written during phase k.

Proof. Line 14 is reached at least αn times during the phase. Thus, line 12 is reached at least the same number of times. There are n processors, and the adversary can stop any processor from advancing from line 12 to line 13. Thus, the situation corresponds to that of Lemma 1 with c₁ = α and c₂ = γ. Choose ε = 0.1 and the result follows for sufficiently large α.

For 1 ≤ k ≤ β log n, let m_k be the fraction of cells in C_{k−1} that store the value 1 when phase k starts. Recall that the sample in line 8 is of size b log n.

Lemma 10 There exist a sample-size parameter b and an array-size parameter γ such that for all k > 0 the following holds. If m_k ≥ 1/2 (resp. m_k < 1/2) and r_k = 0 (resp. r_k = 1), then with high probability all the writes to C_k during phase k are with the value 1 (resp. 0).

Proof. Suppose m_k ≥ 1/2 and r_k = 0. Let I_k be the time interval during which the clock reads k. We prove that with high probability all writes within I_k (to array C_k) output the value 1. There are at most n tardy writes to C_{k−1} during I_k. Thus, throughout I_k the number of 1's in C_{k−1} is at least γn·m_k − n. Hence, with γ sufficiently large, if m_k ≥ 1/2, then throughout the interval I_k the fraction of 1's in C_{k−1} is at least 0.4. Thus, by the Chernoff bounds, with high probability any b log n sample taken during this interval finds at least 1/3 of the cells with value 1. Since r_k = 0, it outputs the value 1. The argument for m_k < 1/2 and r_k = 1 is analogous.

Lemma 11 For β sufficiently large, with high probability, after the completion of the Consensus Procedure at least 90% of the cells in the last array C_{β log n} are filled, and all filled cells store the same value v ∈ {0,1}.

Proof. By Corollary 5, with high probability for at least half the values of k, r_k constitutes a weak random coin. For these values of k, there is a constant probability that r_k is 1 and a constant probability that r_k is 0. Thus, for these values of k, by Lemma 10, there is a constant probability that all writes to C_k are with an identical value. If this happens, then in all subsequent phases, all writes will also be with the same value. Thus, with high probability after β log n phases all writes are with the same value. By Lemma 9, at least 90% of the cells of C_{β log n} hold this value.

The following lemma holds unconditionally.

Lemma 12 Any value (other than Λ) appearing in C_{β log n} is one of the original input values (i.e., there is a processor P_i for which this value is its private input value v_{P_i}).

Proof. In the first phase the array C₀ is filled only with input values. From then on, values are only copied from one array to the next.

Thus, with high probability, the value appearing in the majority of the cells of C_{β log n} is a consensus value, and processors can sample this array to obtain the consensus value. However, there is still a low probability that consensus is not achieved. In the next section we show how to guarantee that consensus is obtained with no error probability.
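To see why the random threshold forces convergence, it may help to simulate an idealized, synchronous version of the phases (no adversary, no tardy writes; the array size, sample size, phase count, and seed below are our toy choices, far smaller than the paper's parameters):

```python
import random

def converge(p0, n=2048, phases=40, sample=60, seed=1):
    """Idealized synchronous convergence: in each phase, every cell of the
    next array gets the two-threshold decision on a fresh random sample of
    the current array, using one shared coin r per phase."""
    rng = random.Random(seed)
    arr = [1 if rng.random() < p0 else 0 for _ in range(n)]
    for _ in range(phases):
        r = rng.randrange(2)                  # shared weak coin
        thresh = (1 + r) / 3                  # threshold 1/3 or 2/3
        arr = [1 if sum(rng.choice(arr) for _ in range(sample)) / sample > thresh
               else 0 for _ in range(n)]
        if sum(arr) in (0, n):                # all cells agree: converged
            break
    return sum(arr) / n

# A lopsided start converges to its majority value; even a balanced start
# collapses to all-0 or all-1 once a coin pushes it off the threshold.
print(converge(0.9), converge(0.1), converge(0.5))
```

In this toy model a 50/50 start converges within a couple of phases, because whichever threshold the coin picks, almost every sample falls on the same side of it.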

3.2.2 Work and contention

Lemma 13 With high probability the total work of the convergence procedure is O(n log² n) and the Alternate procedure is not executed.

Proof. The algorithm completes when the value of the clock reaches β log n + 1. With high probability O(n) invocations of the procedure Clock-Update suffice in order to advance the clock by one. Thus, with high probability the total number of times that line 14 is performed is O(n log n). Each round (lines 1–14) takes O(log n) operations. There are at most n rounds completed after the clock reaches value β log n + 1 (one for each processor). Thus, with high probability the total number of operations is O(n log n + n) · O(log n) = O(n log² n). In this case, the Alternate procedure is not executed.

Lemma 14 In the convergence protocol, with high probability any memory location, except those of the Phase-Clock, is accessed at most O(log n) times.

Proof. Consider the arrays C_k. For each k, the array C_k is accessed in line 13 during phase k and in line 8 during phase k+1. Line 13 accesses one random location and line 8 accesses O(log n) random locations. With high probability line 14 is executed O(n) times for each k, and hence so are lines 8 and 13. Hence, by Lemma 7, with high probability each location is accessed O(log n) times.

Consider the arrays R_k. For each k, array R_k is accessed during phase k in line 10, which accesses O(log n) random locations. With high probability line 10 is executed O(n) times during the phase. Hence, by Lemma 7, with high probability each location is accessed O(log n) times.

4 Guaranteeing consensus

The algorithm described so far has a polynomially small probability of failure. In this section we show how to transform the algorithm into a fault-free algorithm in the Las Vegas sense. We do so by augmenting the two stages presented so far with an extra stage, in which processors either exit with a proof that consensus is reached or else discover a failure and run an alternate consensus protocol.

At first inspection it seems that this Las Vegas addition is in itself a consensus protocol, since the processors apparently agree on whether consensus has been obtained. In fact, the processors do not have to reach consensus, and thus we avoid circular reasoning. In particular, some processors may run the alternate protocol while others have a proof of consensus. The guarantee is that if any processor thinks that consensus has been obtained with value v, then any other processor that runs the alternate consensus protocol will ultimately decide on the value v.

4.1 The Write-Once Register

The algorithm relies on a new building block, which we call the Write-Once Register (WOR). A WOR is a deterministic data structure that has the property that it can be successfully written only once, by a single processor. If two processors attempt to write to a WOR simultaneously, then the WOR obtains a "garbage" value. Processors arriving after the WOR is successfully written cannot change the value of the WOR, and can recognize whether the WOR contains a "garbage" value or a valid one. We show how to construct a WOR from three elementary read/write registers having only atomic reads and writes.

Let w1, w2, and w3 be the three atomic read/write registers constituting the WOR w. Each w_i has two fields: w_i.value and w_i.P-id. When a processor writes to any of the registers, it attaches its unique identification number (P-id) to the value it writes. The three registers are initialized to Λ. The WOR is said to hold a value iff all three registers of the WOR store identical values, including identical P-ids. If the three registers store different values, the value of the WOR is defined to be "garbage". To write to a WOR, the processor writes to the three registers in sequence, checking before each write that no other processor has attempted to write to the same WOR. A full description of the WOR procedures is provided in Fig. 7.

Lemma 15 Consider a WOR w and suppose that each processor attempts to write to the WOR at most once. Suppose that two separate reads of w return the values v1 and v2. If v1, v2 ∉ {Λ, "garbage"}, then v1 = v2.

Proof. Assume the contrary. Without loss of generality, suppose that processor P1 writes v1 and that processor P2 writes v2. Also without loss of generality, assume that P1 writes to w1 before P2 does. Then, for P1 to write w3, it must be that when P1 reads w1 in line 9, it finds w1 = (v1, P1). Therefore, P2 must reach line 4 after P1 reaches line 9. Thus, when P2 reaches line 5, P1 has already written w2. The test in line 6 must fail, and P2 aborts. Thus, v2 can never be written in all three registers.
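For concreteness, here is a direct single-threaded transcription of the WOR protocol into Python (illustrative only: sequential code cannot capture the shared-memory atomicity and interleavings that Lemma 15 reasons about, and the class/field names are ours):

```python
LAMBDA = None  # the initial, unwritten register value

class WOR:
    """Write-once register built from three plain registers w1, w2, w3.
    A write succeeds only if all three registers end up holding the same
    (value, P-id) pair; interleaved writers leave the WOR as 'garbage'."""

    def __init__(self):
        self.w = [LAMBDA, LAMBDA, LAMBDA]

    def write(self, value, pid):
        item = (value, pid)
        for i in range(3):
            # before each register write, re-read and check that no other
            # processor has started writing this WOR (lines 2, 6, 9 of Fig. 7)
            if self.w[:i] != [item] * i or any(r is not LAMBDA for r in self.w[i:]):
                return False  # abort
            self.w[i] = item
        return True

    def read(self):
        w1, w2, w3 = self.w
        if w1 is LAMBDA and w2 is LAMBDA and w3 is LAMBDA:
            return LAMBDA            # never written
        if w1 == w2 == w3:
            return w1[0]             # all three agree: a valid value
        return "garbage"

r = WOR()
assert r.write(7, pid=1)        # first (uncontended) writer succeeds
assert r.read() == 7
assert not r.write(9, pid=2)    # any later writer aborts; value is unchanged
assert r.read() == 7
```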

4.2 The estimate counting procedure

We now present a procedure by which processors are either convinced that consensus is achieved or else are warned of a failure and run the Alternate protocol. The procedure is composed of an estimate counting stage and then an exit stage. The pseudocode appears in Figs. 8 and 9.

Let C_ℓ be the last array of the consensus protocol from Sect. 3, where ℓ = Θ(log n). In Sect. 3 we showed that with high probability a consensus value appears in 90% of the γn cells of the array C_ℓ. To obtain a proof that this consensus is obtained, the processors collectively estimate the number of 1's and 0's in this array. The procedure employs a butterfly structure W having WOR registers at the first and last levels and regular registers at intermediate levels. Each level of W has γn cells and there are log γn + 1 levels. Nodes of intermediate levels are initialized to 0.

The processors fill the levels of W in sequence. To fill the input level, processors copy the values from C_ℓ, substituting a −1 for each 0 in C_ℓ and a 0 for each Λ in C_ℓ. In subsequent levels,

Write-WOR(v, id)
1   read w1, w2, and w3
2   if w1 ≠ Λ or w2 ≠ Λ or w3 ≠ Λ then // another processor is writing
3       abort
4   write w1 ← (v, id)
5   read w1, w2, and w3
6   if w1 ≠ (v, id) or w2 ≠ Λ or w3 ≠ Λ then abort // another processor is writing
7   write w2 ← (v, id)
8   read w1, w2, and w3
9   if w1 ≠ (v, id) or w2 ≠ (v, id) or w3 ≠ Λ then // another processor is writing
10      abort
11  write w3 ← (v, id)

Read-WOR()
12  read w1, w2, and w3 in order
13  if w1 = w2 = w3 then
14      return w1.value
15  else
16      return "garbage"

Fig. 7. The Write-Once Register (WOR)

Estimate-Counting (for processor P_i)

repeat
1   if total number of operations of the processor exceeds n² then
        run Alternate (Fig. 10)
2   k ← Read-Clock
3   if k > log γn then
4       exit // algorithm completed
5   else if k = "undefined" then goto line 15
6   for i = 1 to log n do
7       choose j ∈ [1,...,γn] at random
8       if k = 0 then // fill inputs
9           if C_ℓ[j] = Λ then // cell C_ℓ[j] not filled
10              W[0][j] ← 0 // substitute a 0 for Λ
11          else // C_ℓ[j] has a value
12              W[0][j] ← 2·C_ℓ[j] − 1 // substitute a −1 for 0
13      else // subsequent levels
14          W[k][j] ← W[k−1][j] + W[k−1][j^(k)] // sum of predecessors
15  Clock-Update

Fig. 8. The Estimate-Counting procedure

node W[k][j] is written with the sum of the values appearing in its two predecessors. A garbage value is treated as a 0.

4.3 The Exit-Stage procedure

Once the last level of W is filled, the processors obtain the consensus value by running the Exit-Stage procedure, presented in Fig. 9. The processors obtain the consensus value by searching for an output having an absolute value greater than γn/2. We call such an output a witness output. The sign (positive/negative) of the witness output determines the consensus value. A positive witness testifies that most of the cells of C_ℓ have value 1, and hence the processor exits with 1 as the consensus value. Similarly, a negative witness output testifies that most of the cells of C_ℓ have value 0, and hence the processor exits with the value 0. We prove that no two processors will find witnesses with opposite signs.
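The counting butterfly itself is easy to sketch in isolation (a synchronous, failure-free toy; the helper name is ours): since level k adds each cell to its partner whose index differs in bit k, every output of the last level equals the sum of all inputs, so any output with absolute value above γn/2 certifies a majority.

```python
def butterfly_count(inputs):
    """Sum an array through a butterfly: level k combines cell j with its
    partner j XOR 2^k. After log2(n) levels every cell holds the total sum
    (n must be a power of two)."""
    n = len(inputs)
    level = list(inputs)
    k = 0
    while (1 << k) < n:
        level = [level[j] + level[j ^ (1 << k)] for j in range(n)]
        k += 1
    return level

# Encode the converged array as +1 for a 1, -1 for a 0 (0 for unfilled cells).
bits = [1, 1, 1, 0, 1, 1, 1, 1]
enc = [1 if b else -1 for b in bits]
out = butterfly_count(enc)
print(out)  # every output equals sum(enc) = 6, i.e. a positive witness
```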

4.4 The Alternate procedure

The Alternate procedure (Fig. 10) is reached from any of the other procedures if and when something goes wrong. Specifically, the procedure is executed by any processor that performs more than n² operations in any other procedure. This is a low-probability event, which indicates that something has gone wrong.

In the Alternate procedure the processors participate in an alternate consensus protocol, implemented using any existing polynomial-time protocol (e.g., [4]). Note that one processor may detect a failure while another exits the Exit Stage with

Exit-Stage (for processor P_i)

Repeat
1   if total number of operations of processor exceeds n² then
        run Alternate (Fig. 10)
2   choose j ∈ [1,...,γn] at random // choose random output
3   w ← W[log γn][j]
4   if |w| > γn/2 then // if witness output
5       V[i] ← 1/2 + w/(2|w|) // write value as candidate exit value
6       w′ ← W[log γn][j] // reread output
7       if w = w′ then // if output value W[log γn][j] unchanged
8           exit with consensus value v = 1/2 + w/(2|w|)

Fig. 9. Exit-Stage procedure

Alternate (for processor P_i)
1   for all j ∈ {1,...,γn} in order do
2       overwrite W[log γn][j] with "garbage"^a // erase all outputs
3   s ← v_{P_i} // use private input
4   for j = 1 to γn do // scan the array V
5       if V[j] ≠ Λ then // a processor exited with value V[j]
6           s ← V[j] // use this value
7   run agreement protocol of [4] using s as the private input value

^a Here, the WOR write procedure is not followed. The WOR is overwritten regardless of what is already there.

Fig. 10. Alternate procedure

a value v. In this case we guarantee that all processors that participate in the alternate procedure start the protocol with v as their private input value. The Candidate Value Array V, initialized to Λ, helps to provide this guarantee.

4.5 Analysis

4.5.1 Correctness

Lemma 16 For any ε > 0, there exists an array-size parameter γ sufficiently large such that the following holds. With high probability at most an ε-fraction of the inputs and an ε-fraction of the outputs obtain "garbage" values.

Proof. Consider an input w = W[0][j]. This is a WOR register. Thus, w obtains the value "garbage" only if, during the first time that it is written, two different processors try to write to it simultaneously (otherwise the second processor notices that w is already written and aborts). Specifically, it must be the case that two processors choose w in line 7 (of the estimate counting algorithm) and perform lines 8–12 simultaneously, and that this is the first time lines 8–12 are applied to cell w. We call such an input w a clobbered input. Observe that the adversary can stop processors from advancing from line 7 to lines 8–12. We bound the number of clobbers that the adversary can produce.

An input cell is in one of three states: empty, successfully written, or clobbered. We call each execution of lines 7–11 a round. We say that a round is effective if the input cell that it chooses in line 7 was never selected before. We say that a round clobbers if it chooses an input cell for which the effective round that selected it is still active. We say that the round is neutral if it chooses an input that was already selected by a round that is no longer active.

Set ε′ = ε/3. Let t₀ be the first time when (1−ε′)γn inputs have been chosen (in line 7). Consider the nonneutral rounds before t₀. At any given time before t₀ there are at most n active rounds (because there are only n processors), and at least ε′γn cells have not yet been chosen. Thus, for a nonneutral round R before t₀,

Pr[round R clobbers | round R is not neutral] ≤ n/(ε′γn) = 1/(ε′γ).    (1)

Moreover, for any two rounds, the probabilities that they clobber are negatively correlated, and so the Chernoff bounds still apply. Let g be the number of effective rounds started before t₀, and let h be the number of clobbering rounds started before t₀. By Eq. (1) and the Chernoff bound,

Pr[h/(g+h) ≥ 2/(ε′γ)] ≤ e^{−(g+h)/(4ε′γ)}.    (2)

By definition g = (1−ε′)γn. Thus, for sufficiently large γ, with high probability up until t₀ there are at most h = ε′γn clobber rounds. At most n of the effective rounds do not complete before t₀. Thus, at least g − h − n ≥ (1−3ε′)γn = (1−ε)γn cells are successfully written, as claimed. A similar proof applies to the outputs.

Recall that the clock remains at each integral value for at least αn clock updates.

Lemma 17 For any γ there exists a clock parameter α such that with high probability the following holds. For any 0 ≤ k ≤ log γn, at most n/log n of the cells of level k are not filled during round k.

Proof. The proof is similar to the proof of Lemma 2. In each round, with high probability line 15 is reached at least αn times. Thus, lines 7–14 are invoked at least αn log n times. In line 7 a cell j is chosen at random. The adversary can stop at most n processors from continuing from line 7 to lines 8–14. Thus, by Lemma 1, setting c₂ = γ and ε = 1/γ, there exists an α = c₁ such that with high probability at most n/log n cells are not filled in the round.

Recall that a witness output has an absolute value greater than γn/2.

Lemma 18
1. With array-size parameter γ and clock parameter α sufficiently large, with high probability at least half of the outputs are witness outputs.
2. All witness outputs are either all positive or all negative.

Proof. (1) Set d = log γn + 1. Consider a cell c = C_ℓ[j₀] from the converging array (Fig. 6) and an output cell w = W[d][j₁] from the estimate counting butterfly (Fig. 8).

There is exactly one directed path c, w₀, ..., w_d = w from c to w. Suppose that for all rounds k, cell w_k is filled during round k, and that neither w₀ nor w has the value "garbage." In this case we say that c affects w. If c affects w and c = 1, then c contributes 1 to the sum stored in w. If c affects w and c = 0, then c contributes −1 to the sum stored in w. By Lemma 11, with high probability 0.9 of the cells of C_ℓ store the same value. In this case, if w is affected by more than 0.9 of the c's, then it is a witness. Using a counting argument, we show that at least half the w's are affected by 0.9 of the c's.

For each output w, let I(w) be the set of cells C_ℓ[j] such that C_ℓ[j] does not affect w. For level k, let F(k) be the set of cells not filled during round k. Let G_in be the set of inputs having the value "garbage" and G_out the set of outputs having the value "garbage". For any output w and cell c = C_ℓ[j], such that c does not affect w (i.e., c ∈ I(w)), one of the following must hold:

1. w ∈ G_out;
2. W[0][j] ∈ G_in;
3. there exists a k and w_k ∈ F(k) such that w_k is on the path from W[0][j] to w.

For each w_k ∈ F(k) there are at most γn pairs (c, w) such that w_k is along the path from c to w. Let B be the set of outputs w such that |I(w)| ≥ 0.1γn (these are the "bad" outputs). Then, with Lemmas 17 and 16, we get

|B| · 0.1γn ≤ Σ_{w∈B} |I(w)| ≤ Σ_{w output} |I(w)| ≤ γn ( Σ_{k=0}^{d} |F(k)| + |G_in| + |G_out| ) ≤ γn ( dn/log n + 2εγn ).

Thus, for sufficiently small ε (and γ sufficiently large), |B| ≤ γn/2.

(2) Suppose output w is a witness. Assume without loss of generality that w is positive. Thus, there must have been at least γn/2 + 1 inputs with the value 1. Inputs are WOR registers. Thus, at no time can there be γn/2 inputs with value 0. (Note that this claim holds unconditionally.)

The following corollary and the next two lemmata hold unconditionally.

Corollary 19 There exists a value b ∈ {0,1} such that:

1. b is the only value possibly written to any entry of the array V (line 5 of the Exit-Stage);
2. all processors successfully terminating the Exit Stage at line 8 exit with the consensus value b;
3. b is one of the original input values.

Proof. By Lemma 18, there are no two witnesses with opposite signs. If all witnesses are positive then set b = 1, and otherwise b = 0.

1. Entry V[j] is only written in line 5 of the Exit Stage. In this case, it gets value 1 if the witness is positive and value 0 if the witness is negative. Since all witnesses have the same sign, all writes to V are with the value b.
2. The exit value in line 8 is the same as that written in line 5 to V[j].
3. By Lemma 12, any value appearing in C_{β log n} (other than Λ) is an input value. Thus, if all inputs are 1 then there is no negative value in the first level of the array W (W[0][·]), and hence no negative witness, and hence b = 1. Similarly, if all inputs are 0, then b = 0.

This completes the proof that with high probability consensus is achieved. We now consider the low-probability event of failure and prove that consensus is still obtained.

Lemma 20 Any value s used as an input value to the alternate consensus protocol in line 7 of the Alternate procedure is one of the original input values.

Proof. In line 3 of the Alternate procedure, s is set to v_{P_i}. This value can only change in line 6, to a value V[j] for some j. In this case, by Corollary 19, V[j] is an input value.

Lemma 21 If any processor exits the Exit-Stage procedure in line 8 with a consensus value v, then any processor participating in the Alternate procedure enters the alternate consensus protocol with s = v.

Proof. Suppose P1 exits with the value v and P2 participates in the alternate execution. Let x = W[log γn][j] be the witness output found by P1. Processor P1 reads x twice (lines 3 and 6). In lines 1–2 of the Alternate procedure, processor P2 erases all outputs. Thus, P1 must have read x for the second time before P2 reached line 4 of the Alternate procedure. By this time, P1 has already written the value v in V[1]. Thus, at line 5, P2 finds the value v in V, and uses it as its input value for the alternate protocol (lines 6–7). By Corollary 19, no other value can be written to any other entry in V.

Data structure: 3 arrays, A0, A1, A2, each with cn cells; A_j is initialized to j.

Update-Clock
1   for j = 0, 1, 2 do
2       sample Θ(log n) cells of A_j
3       if at least 80% of the sampled cells equal some value i then
4           write i+1 to A_{(j+1) mod 3}

Read-Clock
1   sample Θ(log n) cells of A0
2   if at least 70% of the sampled cells share the same value i then
3       return i/3
4   else
5       return "undefined"

Fig. 11. Phase clock of Aumann-Rabin [11]
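The following toy, failure-free rendering of the clock may help make the figure concrete. It reflects our reading of the figure only: the sample size, array sizes, seed, and the choice to write i+1 into a single random cell per update are illustrative, not the paper's tuned parameters.

```python
import random

rng = random.Random(0)
C, N = 4, 32
A = [[j] * (C * N) for j in range(3)]        # A_j initialized to j

def update_clock():
    """Toy Update-Clock: for each array, if an 80%-majority of a sample
    agrees on a value i, write i+1 into a random cell of the next array."""
    for j in range(3):
        sample = [rng.choice(A[j]) for _ in range(24)]
        i = max(set(sample), key=sample.count)
        if sample.count(i) >= 20:            # ~80% of the 24 samples
            nxt = A[(j + 1) % 3]
            nxt[rng.randrange(len(nxt))] = i + 1

def read_clock():
    """Toy Read-Clock: if a 70%-majority of a sample of A0 agrees on i,
    the phase is i // 3 (values advance by 3 per full cycle of the arrays)."""
    sample = [rng.choice(A[0]) for _ in range(24)]
    i = max(set(sample), key=sample.count)
    return i // 3 if sample.count(i) >= 17 else None

start = read_clock()                         # initially phase 0
for _ in range(3000):
    update_clock()
print(start, read_clock())                   # the clock has advanced
```

Because an array advances only after the previous one reaches an 80%-majority, the arrays transition one at a time and the clock value moves forward monotonically.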

4.5.2Work and contention

Lemma22With high probability the Estimate-Counting and the Exit Stage procedures complete in O(n log2n)total work and the Alternate procedure is not executed.

Proof. Consider the Estimate-Counting procedure. The procedure completes when the value of the clock reaches log_γ n = O(log n). With high probability at most O(n) invocations of Update-Clock are necessary to advance the clock by one. Thus, the total number of times line 15 is performed is O(n log n). Each round (lines 1-15) takes O(log n) operations. At most n rounds complete after the clock reaches value log_γ n (one for each processor). Thus, with high probability the total number of operations is O((n log n) + n) · O(log n) = O(n log^2 n), and the Alternate-Route is not reached.

Consider the Exit-Stage procedure. By Lemma 18, with high probability at least half of the outputs are witnesses. By Lemma 16, at most a constant fraction (< 1/4) of these witness outputs may change their value (to "garbage"). Thus, a processor executing the Exit-Stage procedure has a constant probability of hitting (line 3) a witness output which does not change its value. In this case, the processor exits in line 8. Hence, with high probability every processor completes within O(log n) rounds of lines 1-7. Each round takes O(1) steps, for a total of O(n log n).
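The w.h.p. O(log n) bound on Exit-Stage rounds is a maximum-of-geometrics argument: if each probe independently hits a non-changing witness with constant probability, then all n processors finish within O(log n) probes with high probability. A toy Python check (the success probability 3/8 = 1/2 · 3/4 and all other constants are illustrative assumptions, not the paper's):

```python
import math
import random

n = 1000          # number of processors (toy value)
p = 3.0 / 8.0     # per-probe success probability: at least 1/2 of the
                  # outputs are witnesses, and at most 1/4 of those
                  # change, leaving a "good" fraction of at least 3/8
rng = random.Random(42)

def rounds_until_success():
    # Probes made by one processor before hitting a non-changing witness.
    r = 1
    while rng.random() >= p:
        r += 1
    return r

# The quantity bounded in the lemma: the slowest processor's round count.
max_rounds = max(rounds_until_success() for _ in range(n))
bound = 12 * math.ceil(math.log2(n))   # a generous c * log n bound
print(max_rounds, bound)
```

With p constant, the chance that any single processor needs more than t probes is (1 - p)^t, so a union bound over n processors gives the O(log n) w.h.p. guarantee used in the proof.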

The Alternate-Route procedure may be reached in two cases:

1. Less than a constant fraction of the outputs are witnesses whose value does not change. By Lemmata 18 and 16, this event has polynomially low probability.

2. At least a constant fraction of the outputs are non-changing witnesses, but the processor fails to find such an output in n^2 operations. Each round of lines 1-7 takes O(1) operations. Thus, the probability of this event is at most 2^{-Ω(n)}.

Lemma 23 In the Estimate-Counting and Exit-Stage procedures, with high probability any memory location, except those of the Phase Clock, is accessed at most O(log n) times.

Proof. Consider the Estimate-Counting procedure. The array C is accessed in lines 9 and 12 during the first phase. The first level of the butterfly is accessed in lines 10 and 12, also during the first phase. For subsequent levels, level k is accessed in line 14, during phases k and k+1. In each round (lines 1-15), each of the above mentioned lines is executed O(log n) times. With high probability each phase consists of O(n) rounds of lines 1-15. Thus, the total number of accesses to any single level is O(n log n), and these are to random locations. Hence, by Lemma 7, with high probability each location is accessed O(log n) times.

Consider the Exit-Stage procedure. By Lemma 18, with high probability at least half of the outputs are non-changing witnesses. In this case, with high probability each processor finds a non-changing witness output and exits after at most O(log n) rounds of the Exit-Stage procedure. Each round accesses O(1) memory locations and there are n processors. Hence, by Lemma 7, with high probability each location is accessed O(log n) times.

5 Contention on the phase clock

In this section we analyze the contention on the Phase Clock, and show how its hot-spot contention can be reduced to O(log n). In order to do so, we must first describe the structure of the Phase Clock, as presented in [11].

The Aumann-Rabin Phase Clock. The Phase Clock as described in [11] is composed of three arrays A_0, A_1, and A_2, each containing cn = Θ(n) cells. The clock arrays drive each other in a circular fashion. Initially, all cells of A_0 are initialized to 0, those of A_1 to 1, and those of A_2 to 2. Now, the value 2, in A_2, starts driving the value of A_0 from 0 to 3. This, in turn, drives the value of A_1 from 1 to 4, and so forth. The Phase Clock supports two procedures: Read-Clock, which returns the current value of the clock, and Update-Clock, which allows processors to participate in advancing the clock. The procedures are presented in Fig. 11.
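The circular driving can be exercised in a toy, sequential Python simulation. The cell count, sample size, the rule of writing one random cell per qualifying update, and the single-threaded loop are all illustrative assumptions, not the paper's constants:

```python
import random

CN = 60          # cells per array (cn = Θ(n); toy value)
SAMPLE = 15      # Θ(log n) sample size (toy value)
A = [[j] * CN for j in range(3)]  # A[j] initialized to j

def update_clock(rng):
    # One Update-Clock invocation: for each array, sample cells; if at
    # least 80% agree on a value i, write i+1 into a random cell of the
    # next array (mod 3), helping drive it toward the next clock value.
    for j in range(3):
        sample = [A[j][rng.randrange(CN)] for _ in range(SAMPLE)]
        i = max(set(sample), key=sample.count)
        if sample.count(i) >= 0.8 * SAMPLE:
            A[(i + 1) % 3][rng.randrange(CN)] = i + 1

def read_clock(rng):
    # Sample A[0]; if at least 70% agree on a value i, the clock reads i/3.
    sample = [A[0][rng.randrange(CN)] for _ in range(SAMPLE)]
    i = max(set(sample), key=sample.count)
    if sample.count(i) >= 0.7 * SAMPLE:
        return i // 3
    return None  # "undefined"

rng = random.Random(1)
best = 0
for _ in range(2000):
    update_clock(rng)
    v = read_clock(rng)
    if v is not None:
        best = max(best, v)
print(best)  # the observed clock value advances past its initial 0
```

Repeated invocations fill A_0 with 3s (driven by the 2s in A_2), after which reads of the clock return 1, and the cycle continues with A_1, A_2, and so on.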

Reducing the contention. Using the Phase Clock as described so far results in O(log^2 n) hot-spot contention. The reason is that during each phase each array is accessed Θ(n log n) times, and the same three arrays are reused again and again

Data Structure: 3k arrays, A_0, ..., A_{3k-1}, each with cn cells; all initialized to 0.

Update-Clock
1  ℓ ← Read-Clock
2  for j = 0, 1, 2 do
3      sample Θ(log n) cells of A_{3ℓ+j}
4      if at least 80% of the sampled cells equal 3ℓ+j then
5          write 3ℓ+j+1 to A_{3ℓ+j+1}

Read-Clock
Static Variable: ℓ - most recent known clock value (remembered from previous execution); initialized to 0

   repeat
1      sample Θ(log n) cells of A_{3(ℓ+1)}
2      if at least 70% of the sampled cells share the same value 3(ℓ+1) then
3          ℓ ← ℓ + 1
4      else
5          return ℓ
6  until ℓ = k + 1

Fig. 12. "Unfolded" phase clock

for all O(log n) phases. In order to reduce the contention to O(log n) we "unfold" the clock construction, as follows. Suppose we want to count for k phases. Then, instead of reusing the same three arrays, we keep 3k distinct arrays, A_0, ..., A_{3k-1}. All arrays are initialized to 0. The arrays drive each other in a sequential manner, with A_0 driving the values written in A_1 from 0 to 1, which, in turn, drives the values in A_2 to 2, and so forth. The value of the clock is defined as the highest number i for which the array A_{3i} is filled, i.e., at least 70% of the cells hold the value 3i. The resulting code for Read-Clock and Update-Clock is provided in Fig. 12.

As can be seen, the code for the unfolded clock is analogous to the original one, except that in each phase the processors operate on a different triplet of arrays (A_{3ℓ}, A_{3ℓ+1}, and A_{3ℓ+2} for phase ℓ). Consequently, processors must first determine the triplet of arrays on which to operate. In the Update-Clock procedure, the processor does this by first performing a Read-Clock to find the current phase. In the Read-Clock procedure, the processor samples the arrays in increasing order, until it finds the first array (whose index is divisible by 3) that is not filled. For efficiency reasons the search for the last filled array does not begin with A_0, but rather from the last array already known to be filled. This guarantees that while some invocations of Read-Clock may take O(k log n) steps, the amortized cost is O(log n).
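The amortization can be illustrated with a toy sequential sketch of the unfolded clock (constants and the one-random-cell write rule are again illustrative assumptions): a reader that caches its last known phase pays one failed probe per call, plus at most one successful probe per phase advance over its whole lifetime.

```python
import random

K = 8            # count for k phases (toy value)
CN = 50          # cells per array
SAMPLE = 12      # Θ(log n) sample size
A = [[0] * CN for _ in range(3 * K)]  # all 3k arrays initialized to 0

class Reader:
    """Unfolded Read-Clock with a static (cached) phase value."""
    def __init__(self, rng):
        self.rng = rng
        self.ell = 0      # most recent known clock value
        self.probes = 0   # total sampling passes, to check amortization

    def read_clock(self):
        # Advance the cached value while the next index-divisible-by-3
        # array is filled; stop at the first one that is not.
        while self.ell + 1 < K:
            self.probes += 1
            idx = 3 * (self.ell + 1)
            sample = [A[idx][self.rng.randrange(CN)] for _ in range(SAMPLE)]
            if sample.count(idx) >= 0.7 * SAMPLE:
                self.ell += 1
            else:
                break
        return self.ell

def update_clock(rng, reader):
    ell = reader.read_clock()
    for j in range(3):
        idx = 3 * ell + j
        sample = [A[idx][rng.randrange(CN)] for _ in range(SAMPLE)]
        if sample.count(idx) >= 0.8 * SAMPLE and idx + 1 < 3 * K:
            A[idx + 1][rng.randrange(CN)] = idx + 1

rng = random.Random(7)
reader = Reader(rng)
calls = 3000
for _ in range(calls):
    update_clock(rng, reader)
final = reader.read_clock()
# Each read pays at most one failed probe, and successful probes total
# at most k over the run, so probes <= (number of reads) + k.
print(final, reader.probes)
```

Because the cached value never regresses, total successful probes are bounded by k, giving the O(log n) amortized cost per Read-Clock despite the O(k log n) worst case.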

We thus obtain behavior analogous to that of the original clock of [11]. In particular, with high probability advancing the clock from one value to the next takes Θ(n) invocations of Update-Clock.

Lemma 24 Suppose that the Unfolded Phase Clock is used such that each invocation of Read-Clock is coupled with an invocation of Update-Clock. Then, with high probability any shared-memory location is accessed O(log n) times.

Proof. Consider an array A_i, i = 3ℓ + j, 0 ≤ j ≤ 2. The array A_i may be accessed in the following cases:

1. During phases ℓ and ℓ - 1, by invocations of Update-Clock and Read-Clock reading and writing A_i. With high probability there are Θ(n) invocations of Update-Clock before any phase is over. Since each Read-Clock is coupled with an Update-Clock, with high probability there are also only O(n) invocations of Read-Clock during these phases. Thus, the total number of such invocations is O(n). Each such invocation accesses at most O(log n) random locations of the array.

2. Invocations by tardy processors, acting after the completion of phase ℓ. Each processor performs one such invocation before realizing that the phase is over. Thus, the total number of such tardy invocations is O(n). Each such invocation accesses at most O(log n) random locations of the array.

Thus, with high probability there are a total of at most O(n log n) accesses to any array A_i, and these are to random locations. Hence, by Lemma 7, the result follows.

Corollary 25 In the procedures Weak-Random-Coins, Transfer, Convergence, Estimate-Counting, and Exit-Stage, with high probability any shared-memory location of the Phase Clocks is accessed O(log n) times.
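The per-location bounds in this section all reduce, via Lemma 7, to a balls-into-bins fact: O(n log n) accesses to uniformly random locations of a Θ(n)-cell array touch each cell O(log n) times with high probability. A quick empirical check (parameters illustrative; this is a sanity check, not a proof of Lemma 7):

```python
import random
from collections import Counter

n = 1024                 # array size (Θ(n) cells)
accesses = 10 * n        # stands in for O(n log n): log2(1024) = 10
rng = random.Random(0)

# Throw the accesses into the cells uniformly at random and record loads.
loads = Counter(rng.randrange(n) for _ in range(accesses))
max_load = max(loads.values())
print(accesses // n, max_load)  # average load vs. maximum load
```

The maximum load stays within a small constant factor of the O(log n) average, matching the w.h.p. bound invoked throughout.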

6 Conclusions

We thus obtain the following:

Theorem 1 The procedures described in Sects. 2-5 together guarantee consensus. With high probability the total work is O(n log^2 n), the hot-spot contention is O(log n), and the contention cost is O(n log^3 n). The same bounds also hold for the expected total work, hot-spot contention, and contention cost, respectively.

Proof. First, we prove correctness. If no processor exits the Exit-Stage procedure in line 8, then all processors participate in the Alternate procedure and correctness follows from that of the protocol of [4]. Otherwise, by Corollary 19, all processors exiting in line 8 of the Exit Stage have the same consensus value v, which, by Lemma 21, is also the input value for all processors participating in the Alternate procedure. Thus, v must also be the output of all processors participating in the Alternate procedure.

The high-probability bound for total work follows from Lemmata 6, 13, and 22. The high-probability bound for the hot-spot contention follows from Lemmata 8, 14, 23, and Corollary 25. For the contention cost, note that for any shared-memory variable v, if the total number of accesses to v is c, then the contention cost of any operation on v is also bounded by c. With high probability any memory location is accessed at most O(log n) times, and hence with high probability the contention cost of any operation is O(log n). With high probability the total number of operations is O(n log^2 n). Hence, with high probability the total contention cost is O(n log^3 n).

Next, we prove the bounds on the expected total work. With high probability the total work is O(n log^2 n). Consider the low-probability event that the work is more than O(n log^2 n). This event occurs with probability at most n^{-c}, where c can be set to any constant. In this low-probability event, the expected total work is still bounded by O(n^4) per processor, based on the complexity of the alternate protocol, which uses [4]. Thus, setting c > 5, the low-probability event adds at most a constant to the total work, which thus remains O(n log^2 n). Similarly, the hot-spot contention in the low-probability event is bounded by O(n), and hence the expected hot-spot contention remains O(log n). The expected contention cost in the low-probability event is bounded by O(n · n^4), and hence the overall contention cost remains O(n log^3 n).
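For concreteness, the expectation calculation for the total work can be written out (a sketch; W denotes the total work, and the O(n^4)-per-processor fallback is taken as an O(n · n^4) = O(n^5) bound on total work):

```latex
\mathbb{E}[W] \;\le\; O(n\log^2 n)
  \;+\; \Pr[\text{low-probability event}]\cdot O(n\cdot n^{4})
  \;\le\; O(n\log^2 n) + n^{-c}\cdot O(n^{5})
  \;=\; O(n\log^2 n) \qquad \text{for } c > 5 .
```

The same one-line computation, with O(n) and O(n · n^4) in place of the fallback work bound, gives the expected hot-spot contention and contention cost, respectively.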

Open problems. Several problems remain open:

- Can the total work be further reduced while keeping the contention low? Specifically, can the total work be reduced to O(n log n) without increasing the asymptotic hot-spot contention?
- Can the contention cost be reduced?
- Is it possible to obtain analogous low-contention results for one of the stronger adversary schedulers (the Write-Oblivious or the Local-Oblivious)?

Acknowledgements. We are grateful to the two anonymous referees for many helpful and constructive comments. The presentation of this paper has benefited much from their comments.

References

1. Abrahamson K: On achieving consensus using shared memory. In: Proceedings of the 7th Annual ACM Symposium on Principles of Distributed Computing (PODC), 1988, pp 291-302
2. Aspnes J: Lower bounds for distributed coin-flipping and randomized consensus. Journal of the ACM 45(3):415-450 (1998)
3. Aspnes J: Randomized protocols for asynchronous consensus. Distributed Computing 16(2-3):165-175 (2003)
4. Aspnes J, Herlihy M: Fast randomized consensus using shared memory. Journal of Algorithms 11(3):441-461 (1990)
5. Aspnes J, Waarts O: Randomized consensus in expected O(N log^2 N) operations per processor. SIAM Journal on Computing 25(5):1024-1044 (1996)
6. Attiya H, Lynch N, Shavit N: Are wait-free algorithms fast? In: Proceedings of the 31st Annual Symposium on Foundations of Computer Science (FOCS), 1990, pp 55-64
7. Aumann Y: Efficient asynchronous consensus with the weak adversary scheduler. In: Proceedings of the 16th Annual ACM Symposium on Principles of Distributed Computing (PODC), 1997, pp 209-218
8. Aumann Y, Bender MA: Efficient asynchronous consensus with the value-oblivious adversary scheduler. In: Proceedings of the 23rd International Colloquium on Automata, Languages, and Programming (ICALP), 1996, pp 622-633
9. Aumann Y, Bender MA, Zhang L: Efficient execution of nondeterministic parallel programs on asynchronous systems. Information and Computation 139(1):1-16 (1997)
10. Aumann Y, Kapah-Levy A: Cooperative sharing and asynchronous consensus using single-reader single-writer registers. In: Proceedings of the 10th ACM-SIAM Annual Symposium on Discrete Algorithms (SODA), 1999, pp 61-70
11. Aumann Y, Rabin MO: Clock construction in fully asynchronous parallel systems and PRAM simulation. Theoretical Computer Science 128:3-30 (1994)
12. Ben-Or M: Another advantage of free choice: Completely asynchronous agreement protocols. In: Proceedings of the 2nd Annual ACM Symposium on Principles of Distributed Computing (PODC), 1983, pp 27-30
13. Bracha G, Rachman O: Randomized consensus in expected O(n^2 log n) operations. In: Proceedings of the 5th International Workshop on Distributed Algorithms (WDAG), vol. 579 of Lecture Notes in Computer Science, 1991, pp 143-150
14. Chandra TD: Polylog randomized wait-free consensus. In: Proceedings of the 15th ACM Symposium on Principles of Distributed Computing (PODC), 1996, pp 166-175
15. Chor B, Israeli A, Li M: Wait-free consensus using asynchronous hardware. SIAM Journal on Computing 23(4):701-712 (1994)
16. Dolev D, Dwork C, Stockmeyer L: On the minimal synchronism needed for distributed consensus. Journal of the ACM 34(1):77-97 (1987)
17. Dwork C, Herlihy M, Waarts O: Contention in shared memory algorithms. Journal of the ACM 44(6):779-805 (1997)
18. Dwork C, Shmoys D, Stockmeyer L: Flipping persuasively in constant time. SIAM Journal on Computing 19(3):472-499 (1990)
19. Fischer MJ, Lynch NA, Paterson MS: Impossibility of distributed consensus with one faulty process. Journal of the ACM 32(2):374-382 (1985)
20. Herlihy M: Wait-free synchronization. ACM Transactions on Programming Languages and Systems 13(1):124-149 (1991)
21. Loui M, Abu-Amara H: Memory requirements for agreement among unreliable asynchronous processes. Advances in Computing Research 4:163-183 (1987)
22. Martel C, Park A, Subramonian R: Asynchronous PRAMs are (almost) as good as synchronous PRAMs. In: Proceedings of the 31st Annual Symposium on Foundations of Computer Science (FOCS), 1990, pp 590-599
23. Rabin MO: Randomized Byzantine generals. In: Proceedings of the 24th Annual Symposium on Foundations of Computer Science (FOCS), 1983, pp 403-409
24. Saks M, Shavit N, Woll H: Optimal time randomized consensus - making resilient algorithms fast in practice. In: Proceedings of the 2nd ACM-SIAM Annual Symposium on Discrete Algorithms (SODA), 1991, pp 351-362

Appendix

Appendix A: Weak adversary scheduler models

In this section we review the different weak adversary scheduler models, and compare their relative strengths. For comparison, it is worthwhile to recall the definition of the strong scheduler:

The Strong Scheduler: At each point in time, the strong scheduler chooses the next processor to operate. If a processor is granted the right to operate, the processor performs its enabled atomic operation. The adversary scheduler can base its choice on the entire history of the state of the system, including the protocol, the processors' input values, the history of all operations performed, and all values stored anywhere in the system's memory. In particular, the outcome of a coin toss is available to the scheduler immediately after the coin is tossed, even if the outcome is only stored in private memory. The only restriction on the scheduler's knowledge is that the scheduler cannot foresee the outcome of future coin tosses.

The weak adversary models place limitations on the information available to the scheduler. These limitations generally prevent the scheduler from learning certain information about the outcomes of random coin tosses. The code of the protocol and the processors' input values are available to the scheduler in all weak models.

Oblivious Scheduler: This scheduler has no knowledge of the outcomes of coin tosses, ever. Thus, effectively, the entire schedule must be determined in advance. The Oblivious adversary is the standard adversary used in the APRAM model (e.g., [22]).

Value-Oblivious Scheduler: This scheduler is used in this paper and in [10]. The scheduler does not get direct access to the outcomes of coin tosses. Rather, the scheduler can learn the outcome of a coin toss only insofar as the outcome affects the dynamics of the execution, i.e., which operations are executed and which memory locations are accessed. Enabled operations are known to the scheduler.

Write-Oblivious Scheduler: This scheduler is used in [14] and [7]. The outcome of a coin toss is hidden from the scheduler until some processor reads its value. Writing the outcome of a coin toss does not reveal its value, and similarly copying this value from private memory to public memory does not reveal its value. Enabled operations are known to the scheduler. The rationale for this scheduler is similar to that of the Value-Oblivious scheduler. Namely, as long as no processor reads the value, the value cannot affect the dynamics of the execution and hence does not affect the schedule.

Local-Oblivious Scheduler: This scheduler is used in [15] and [1]. The scheduler knows the outcome of a coin toss as soon as it is written to shared memory. However, local operations based on the outcome of the coin toss are hidden from the scheduler. Also, if the value written to shared memory depends on the outcome of a coin toss (e.g., it is x if heads and y if tails), then the scheduler does not know the value until after it is written to shared memory. (In [1] the model is described as allowing an atomic p-write operation, i.e., write with probability p and do nothing with probability 1 - p. For the algorithms of [15] and [1] this model is equivalent to Local-Oblivious.)

The Oblivious Scheduler is strictly weaker than all the other schedulers mentioned here. By definition, the Value-Oblivious scheduler is weaker than the Write-Oblivious scheduler. The Value-Oblivious scheduler is also weaker than the Local-Oblivious scheduler. The reason is that any protocol for the Local-Oblivious scheduler can be converted into a protocol for the Value-Oblivious scheduler, by replacing any sequence of randomized local steps resulting in a specific output value v by a corresponding sequence of deterministic steps, coin tosses, and arithmetic operations resulting in the same output value. The details are beyond the scope of this paper. The Write-Oblivious and the Local-Oblivious schedulers are not directly comparable. On the one hand, the Local-Oblivious scheduler knows the outcome of a coin toss immediately upon its being written to shared memory, unlike the Write-Oblivious scheduler, for which the value remains hidden until it is read. On the other hand, local operations using the outcome of a coin toss would reveal its value to the Write-Oblivious scheduler, whereas this information is hidden from the Local-Oblivious scheduler.

潮流英语

People don’t always need advice. Sometimes all they need is a hand to hold, an ear to listen, and a heart to understand. 人们并不总是需要建议。有时候,人们只是需要有人来 执手支持,有人来侧耳倾听,有人来读懂心声 Me? Weird? Please, I'm limited edition. 我?很怪异?拜托,我可是限量版 Never expect, never assume, and never demand. Just let it be, because if it's meant to be, it will happen the way you want it to... 永不期待,永不假设,永不强求。顺其自然,若是注定 发生,必会如你所愿 How much you put into a relationship determines how much it will hurt when it ends. 一段 感情里,你付出了多少,在结束的时候,就决定了你有多伤 Sometimes she'll wait and see if you will text her first just to see if you really care. 有时, 女生会等着,看你会不会先给她发短信,来判断你是否真的在乎 You can love someone easily. But it's not easy to love someone unconditionally. 爱上一个人 很容易,但无条件的爱则很难 I don’t ever need anyone to promise me that they will never hurt me because sooner or later... it will happen. 不需要任何人承诺说,不会伤我。因为迟早,这都会发生 The best gift you can give to someone is your time, because you're giving them something you can never get back. 你能给别人的最好礼物就是你的时间,因为你给了他们你再也得不到 的东西 There are many words that I want to say to you, but I don’t know how…有很多话我想跟你 说,却不知从何说起 Take a risk. If the outcome isn’t what you expected, at least you can say you tried. 冒险一 试。就算结果不是你所期望的,至少你可以说你已经试过了 What makes you different makes you beautiful. 你的与众不同让你美丽动人。 It’s the rule of life that everything that you have always wanted comes at the very second you stop looking for it. 你一直渴望得到的东西,却在你即将放弃的那一刻不期而至,这就是生 活的法则 Don’t expect someone to stay sweet forever because even the sweetest candy has an expiration date. 不要期待有人会一直甜美可爱,因为就算是最甜的糖果也有保质期 A real woman avoids drama, because she knows her time is precious and she's not wasting it on unimportant things. 一个真正的女人会避免小题大做,因为她知道时间宝贵,不会任之浪 费在无谓的事情上 When you’re up, your friends know who you really are. When you’re down, you know who your friends are. 当你春风得意之时,你的朋友都知道你是谁;当你落魄潦倒之时,你就知道 谁是你的朋友了 Don’t need a perfect boyfriend; I just want someone who won’t give up on me. 
不需要完美 的男朋友,我只想要一个不会放弃我的人 Love turns the smart to foolish, the evil to kind, and the weak to strong. 爱,让智者变愚钝, 让恶人变善良,让懦夫变坚强

潮流英语026

世界真的很小,好像一转身,就不知道会遇见谁…世界真的很大,好像一转身,就不知道谁会消失…The world is really small, when you turn around, you don’t know who you’re gonna encounter with…The world is really big, when you turn around, you don’t know who’s already gone. Don’t look back——you’re not going that way! 别回头——那不是你要走的方向! Your presence is a present to the world. 你的存在是献给世界的一份礼物。 Every story has an ending.but in life, every end is a new beginning.每段故事都有一个结局。但是在人的一生中,每一个终点同时也是一个新的起点。 I love you not because of who you are,but because of who I am when I am with you.我爱你,不是因为你是一个什么样的人,而是因为我喜欢与你在一起时的感觉。 You’re unique and one of a kind. 你是唯一的,无可替代。 Whenever you say “I love you”, please say it honestly. 无论何时说“我爱你”,请真心实意。 In fact, just intoxicate wine does not drink it remembered that unbearable past.其实酒不醉人,只是在喝的时候想起了那不堪的过去。 For in all adversity of fortune the worst sort of misery is to have been happy. 在所有不幸中,最不幸的事是曾经幸福过。 It takes a minute to have a crush on someone, an hour to like someone and a day to love someone ,but it takes a lifetime to forget someone.一分钟心动,一小时喜欢,一天爱上。忘记他,却是一辈子。

时尚潮流英文句子

时尚潮流英文句子 1、Fashion,isakindofaestheticview.Brotherisapunk,yousatisfied 时尚,就是一种审美观。哥就朋克,你不服吗? 2、Wheatcolorskintoahealthysenseofvitality,wearingNikeacomple tesetofpurewhitepinkedgesportswear,thetinycurlybrownhairti edinarelaxedandlivelybraids,alwaystheconfidenceofcuteexpres sions. 小麦色的皮肤给人一种健康活力的感觉,穿着耐克的一整套的纯白带粉色边运动服,微卷的褐色头发扎成一个轻松活泼的辫子,总是那自信可爱的表情。 3、Devilkillerbody,alargewavygoldenhairshine,slenderlegswearin gayellowgooseminiskirts,showfigureoftheperfect. 魔鬼般惹火的身材,一头大波浪形金黄卷发发出耀眼的光芒,修长的大腿穿着一条鹅黄色的超短迷你裙,显出身材的完美绝伦。 4、Whitecapsetheruplonghairandhalfofhisfaceisobscured,butfelts hemustbeverybeautiful,breathtakingbeauty!

白色的鸭舌帽把她那盘起的长发和半张脸都给遮住了,但能感觉出她一定很漂亮,惊人的漂亮! 5、Fashionisnotonlyakindofappearance,oraninner,popularmaynot besuitableforyou,butaccordingtotheirowncharacteristicstodre ssupyourself,youbelongtothatkind,mature,lady,orsimpleandna tural,orpure,ormovement,afactthatcanallbefashionable. 时尚不光是一种外表,还是一种内在,时下流行的不一定都适合你,而是要根据自身的特点来打扮自己,看你属于那一种类型,成熟的,淑女的,还是简单自然的,还是清纯的,还是运动的,其实那一种都可以是时尚的。 6、Hip-hop,cowboywind,andthewindwindwind,occupation,fur,all-match,hippie,ladiesfashion,Korean,Japanese,whatisitFashionist heurbanspeciallogo,isacityinthevastcityofspecialpsychological needs. 嘻哈风、牛仔风、欧美风、职业风、皮草风、百搭、嘻皮、淑女、韩流、哈日,(next88)时尚到底是什么?时尚其实是都市特殊的标志,是都市人在纷繁芜杂的城市中特殊的心理需要。 7、Wearingablueskirt,microstripwheatcolorskinlookssohealthy,bl ackhairlikeawaterfallverticallyovertheshoulders,withareddishfa

潮流英语

?Never say you're not good enough, if that person can't see you for who you are, they are the ones who aren't good enough. 永远别说自己不够好,如果那个人无法看清真实的你,那他们也不怎么样。 ?Sometimes that mountain you've been climbing is just a grain of sand. What you've been out there searching for forever is in your hands. 有时候,你攀登的山峰也许只是一粒沙子。而你一直找寻的永远,不经意已经在你的手上。 ?People say, we'd rather have something than nothing at all. Truth is, to have something halfway is harder than having nothing at all. 人们总说,有总比没有好。事实是,不能彻底的拥有,比一无所有更难受。 ?I used to believe in forever, but forever is too good to be true. 我曾相信永远,但永远太美好,而不真实。 ?The hardest part of moving on is not looking back at the memories you once shared with someone. 分手之后继续前行中最艰难的部分就是不去回首那些你曾与某人共同享有的回忆。 ?Happiness depends upon ourselves. –Aristotle幸福,由我们自己掌控。 ?Never flaunt what you have - there's always someone who has more than you do. As soon as you start to become arrogant, losing everything could be right around the corner. 永远不要炫耀自己所拥有的,总有人比你富足。一旦你开始狂妄,离失去这一切就不远了。 ?No matter how much life sucks, if you have good friends, they can make you laugh. 无论生活多么操蛋,要是你有帮好朋友,他们也会逗你开心。 ?You don't have to be a size zero to be pretty. Beauty should be judged by the size of your heart. 身材纤细不一定美丽。美丽应该由你心胸气度决定。 ?You can't lose what you never had, you c an't keep what’s not yours and you can't hold on to something that does not want to stay. 从未拥有过,你就无法失去。本不属于你,你就无法拥有。本不想为你停留,你就无法坚持挽留。 ?I really love the people never will I go, no matter how much trouble.真心爱我的人永远不会我走,不管遇到多大的困境。 ?t’s a great and terrible life. Don’t take it too seriously because none of us get out alive. 生活既美好又糟糕。别太认真,因为没有人能活着离开这个世界。 ?Share a Smile, pass it along and brighten another person's day! 分享笑容,传递笑容,照亮他人的一天。

史上最潮流英文

Some songs can make you sad&cry when you hear them. But it’s actually not the song that makes you cry, it’s the people behind the memories. ------ 总有那么一些歌,让我们悲伤,让我们哭泣。但其实让我们哭泣的并不是那些歌本身,而是藏在回忆里的那些人。 The greatest hurt in the world is not the pain in love but the betray you give me when I devote my heart to you. ------ 世界上最心痛的感觉,不是失恋,而是我把心给你的时候,你却在欺骗我。 The game of love, you will lose once you take it seriously. 爱情这个游戏,你一认真就输了 Miss, the same word, both a miss, but also missed. I miss you most when I realized that I missed you. ------ Miss,同一个单词,既是想念,也是错过。在我最想念你的时候,我才发觉我错过了你。 It's very easy to hurt someone and then say "sorry",but it's really very difficult to get hurt ans say "I'm fine". ------伤害他人然后说“对不起”是一件很容易的事,然而,受伤害后说“我没事”却是一件很难的事。 The greatest hurt in the world is not the pain in love but the betray you give me when I devote my heart to you. ------ 世界上最心痛的感觉,不是失恋,而是我把心给你的时候,你却在欺骗我。 It just hurts, that’s all. Sometimes a little discomfort in the beginning can save a whole lot of pain down the road. ------ 只是感觉有点受伤,没什么。有时侯,起初的隐忍,可以避免一路的疼痛。 Not easily cut open to irrelevant, because others are hilarious, and the pain is yourself. ------ 不要轻易把伤口揭开给不相干的看,因为别人看的是热闹,而痛的却是自己。 The reason why people worry are four sentences:you don't let go,you can't bear,you don't see through and you can't forget.You can convince all of the people, but failed to convince yourself. 人的烦恼不过就4句话:放不下,想不开,看不透,忘不了。你可以说服所有人,却说服不了自己。 I learned to have the mask smile, even if I were unhappy. I'm tired of people who judge me without knowing my history. ------ 我学会了带着面具微笑,即使我并不开心。我讨厌那些不了解我,却对我指手画脚的人。 You are the reason why I became stronger.But still,you are my weakness. ------ 因为你,我懂得了成长,可你,依旧是我的伤。 A man is not old until regrets take the place of dreams.Delay is the deadliest form of denial.------遗憾是一生最初的苍老。拖延,其实就是最彻底的拒绝。

潮流英语

Meeting you was fate, becoming your friend was a choice, but falling in love with you was beyond my control.——遇见你是命运的安排,成为了朋友是我的选择,而爱上你是我无法控制的意外。 Don’t complain that life is too unfair to you just because life simply does not know who you are .——不要抱怨生活对你不公平,因为生活根本就不知道你是谁。 I can accept failure but I can't accept not trying. ( Michael Jordan) 我可以接受失败,但绝对不能接受未曾奋斗过的自己。 Memory is a wonderful thing if you don't have to deal with the past。------回忆本来是非常美好的,只要你能让过去的都过去。 It's better to try hard to love yourself more than to wait someone to love u. if today u don't like yourself more than yesterday, so what's the meaning of tomorrow? -----与其等别人来爱你,不如自己学着努力多爱自己一些,如果今天的你没有比昨天更喜欢自己,那明天对你来说又有什么意义呢? 【十二星座的英文怎么说】白羊座(Aries)、金牛座(Taurus)、双子座(Gemini)、巨蟹座(Cancer)、狮子座(Leo)、处女座(Virgo)、天秤座(Libra)、天蝎座(Scorpio)、射手座(Sagittarius)、摩羯座(Capricorn)、水瓶座(Aquarius)、双鱼座(Pisces)。 I want to be his favorite hello and his hardest goodbye. 我要成为他最心动的相遇,最不舍的离别。 Make up your mind to act decidedly and take the consequences. No good is ever done in this world by hesitation. 下定决心,果断行动,并承担后果。在这世界上犹豫不决成就不了任何事。 It takes being away from someone for a while, to realize how much you really need them in your life. 有的时候,你需要离开某人一段时间,才会发现自己有多需要他。Never be dependent anyone in this world . Because even your shadow leaves you when you’re in the dark. 在这个世上不要过分依赖任何人,因为即使是你的影子都会在某些时候离开你。 Sometimes it sucks being strong .Because when people know that you are strong, they

潮流英语4

When you’re scared but you still do it anyway, that’s brave.即使你害怕,但依然去做,这就是勇气。 That's the worst part. Liking someone so much and knowing he'll never feel the same way. 世界上最悲哀的事情是,你深深的恋上一个人,但心里却清楚得很,他不可能给你同样的回应。 If you hate me, I don't really care. I don't live to please you. 如果你讨厌我,我一点也不介意。我活着不是为了取悦你。 If you have great talents, industry will improve them; if you have but moderate abilities, industry will supply their deficiency. 如果你很有天赋, 勤勉会使你完善; 如果你能力一般, 勤勉能补足你的缺陷 Life consists not in holding good cards but in playing those you hold well. 生活不在于你有一副好牌,而在于你如何打好手上的这副牌。 Nothing is permanent in this wicked world - not even our troubles.?这个世界上没有东西是永恒的,麻烦也不会例外。- Charlie Chaplin

Real love is a permanently self-enlarging experience.?真爱是一种永久性自我放大的体验。- M. Scott Peck verything will be okay in the end. If it's not okay, it's not the end. / 每件事最后都会是好事。如果不是好事,说明还没到最后。 Giving up doesn't mean you're weak, sometimes it means you're strong enough to let go. -------- 放弃并不总意味着你软弱,有时反而说明你足够坚强去舍弃。 The very remembrance of your former misfortune proves be a new one. 对于过去不幸的记忆,构成了新的不幸。People kiss because they have feelings for each other. (人们接吻是因为彼此有感觉。) It's not always about dressing up, but it‘s more about expressing ourselves. 时尚不只总是乔装打扮穿大牌,它更多是种自我表达。 If you love somebody, let them go. If they return, they were always yours. If they don't, they never were. 如果你真的爱上谁了,就放他走,如果他回来了,他永远是你的,如果没回来,他就从来未属于过你。 1. Don't take it to heart. 别往心里去!

潮流英文单词

low key 低调 I’ve been back and forth.我犹豫不定。 squeezed juice 鲜榨的果汁juice with pulp 带果肉的果汁 side effect 副作用 he can’t come to the phone now.他现在不能接电话 herbal tea 花草茶ready for a refill?我再给你倒一杯吧? I love what u have done with this place.我喜欢这里的布置。what was tonight?今晚本来要做什么?I can’t feel my hands.我手麻了。have an affair 外遇 will anyone miss me if i weren’t here?我在不在这里有什么区别吗? I saw a lot of stuff.我大开眼界了、call security 通知警卫dog walker 遛狗的人does sth. mean squat to u?对你来说sth狗屁不是吗?what’s up with the greedy?怎么这么贪啊?work an extra shift 多轮一班 go on, i dare u!有种你就去! u r a freak!你这个变态! I sensed it was u.我感觉到是你了、 I apologize on behalf of him.我替他道歉。 why are u changing the subject?为什么要转移话题? this is so meant to be!这就是天意! th ere’s no need to place blame.没有指责的必要。 curling iron 卷发机 it’s gonna leave a stain。这要留印子的。 I have part of the fault.我也有责任。 distract her with a doll 拿娃娃哄她开心 they are all well received 收到的反响都很好 talk u up 说你的好话stand firm to 努力坚持I was just leering 我只是用余光看看 organize my thoughts 整理思绪get a little preoccupied 事先有事 no way to recover 没有掩饰的机会了 bouncy 活泼Intern 实习生mug抢劫drug dealer 毒贩子 admire your candor你还真胆大we are rolling摄像机正在拍摄hairnet发罩 go through this stack 看看这一叠r u spying on me?你监视我?just messing with u!跟你开玩笑呢! enough is enough!闹够了flyers 寻人(物)海报it’s insensitive of me。我这么做很伤人 u don’t have to be brag。拽什么啊?nod along 跟着点头 a totally separate subject 完全题外话 I thought it was the other way around 我以为是反过来的close my account 注销银行卡 cuff him 把他铐起来 Woody,tingly 痒creep me out 雷死我了no peeking不要偷看啊sneakbite kit毒蛇解药 I feel wild today 我今天好亢奋!I’m kind of beat 我有点累了my ears r ringing so bad.我耳鸣得厉害。 can u get the door?你能去开门吗make a huge fool of myself出了洋相r u

最潮英语单词 eg

山寨copycatting “山寨”是依靠抄袭、模仿、恶搞等手段发展壮大起来,反权威、反主流且带有狂欢性、解构性、反智性以及后现代表征的亚文化的大众文化现象。 This Chinese term literally refers to the mountain strongholdsof bandits. First borrowed to describe rip-off products, it has evolvedto refer also to home made products, such as video parodies of movies。

囧be sunk/sunken 网义:郁闷、悲伤、无奈、无语等等,示意很好很强大,指处境困迫,喻尴尬,为难。 This is an ancient Chinese character, pronounced jiong. It means "light shining through a window". Young Chinese use it to express embarrassment, or a bad mood. Look at the character. Doesn't it look like a disappointed face?
