Maintaining Database Consistency in Peer to Peer Networks
- 格式:pdf
- 大小:147.06 KB
- 文档页数:14
MaintainingDatabaseConsistencyinPeertoPeerNetworks
BaruchAwerbuchandCiprianTutu
DepartmentofComputerScience
JohnsHopkinsUniversity
Baltimore,MD21218,USA
{baruch,ciprian}@cnds.jhu.edu
TechnicalReport
CNDS-2002-2
http://www.cnds.jhu.edu
February6th,2002
Abstract
Wepresentanalgorithmforpersistentconsistentdistributedcommit(distributeddatabasecommit)inadynamic,asynchronous,peertopeernetwork.Thealgorithmhasconstantoverheadintimeandspaceandalmostconstantcommunicationcomplexity,allowingittoscaletoverylargesizenetworks.Previoussolutionsrequiredlinearoverheadincommunicationandspace,makingthemunscalable.Weintroduceamodularsolutionbasedonseveralwelldefinedblockswithclearformalspeci-fications.Theseblockscanbeimplementedinavarietyofwaysandwegiveexamplesofpossibleimplementations.Mostoftheexistingsolutionsrequireacknowledgmentsfromeveryparticipantforeachaction.Ouralgorithmishighlyefficientbyaggregatingtheseacknowledgments.Also,incontrastwithexistingsolutions,ouralgorithmdoesnotrequireanymembershipknowledge.Componentsaredetectedbasedonlocalinformationandtheinformationisdisseminatedonanoverlayspanningtree.Thealgorithmmayprovetobemoresuitedforpracticalimplementationthantheexistingones,becauseofitssimplicity.
1Introduction
Wepresentanalgorithmforpersistentconsistentdistributedcommit(distributeddatabasecommit)
inadynamic,asynchronous,peertopeernetwork.Theproblemhasbeenrecognizedasavery
difficult,yetveryimportantproblem.Replicateddatabasesarebecominganimperativenecessity
andtheexistingsolutionstradestrictsemanticsforefficiencyorviceversa.Weintroduceamodular
solutionbasedonseveralwelldefinedblockswithclearformalspecifications.
Webaseoursolutionontheconstructionofastableoverlayspanningtreeoverthesetofactive,
connectednodes.Usingthespanningtreeastheunderlyingdatastructureguaranteesanefficient
communicationpatternandfacilitatesthedeploymentofsimplealgorithmswithclearcorrectness
proofs.ThisconstructionisequivalentinrequirementstotheconstructionemployedbyLamport’s
Paxosalgorithm[Lam98]whichisbasedontheelectionofaleaderorwiththeconstructionsemployed
bysolutionsbasedongroupcommunication.
1Mostoftheexistingsolutionsrequireacknowledgmentsfromeveryparticipantforeachaction.
Ouralgorithmaggregatestheseacknowledgmentsduringnormaloperation,whenthenetworkissta-
ble.Intuitively,werunavirtualclockonaspanningtreebymeansofpulsemessagesandconverging
acknowledgments.Sincetheactionsaredisseminatedonthesamespanningtreeusedbythepulse
messagesandbecauseoftheFIFOguaranteesonthelinks,apulsemessagepwillimplicitlyacknowl-
edgeallmessagessentduringthepreviouspulsep-11.Thepulsemechanismasasynchronizeris
derivedfrom[Awe85].
Therestofthepaperisorganizedasfollows:thefollowingsubsectionsdescribethemodeland
relatetoexistingwork.Section2evaluatesthecomplexityofthealgorithmandcomparesittoexisting
solutions.Section3introducesthemaincomponentofthealgorithmandprovesitscorrectness.
Section4presentsthecompletealgorithmthatsupportsdynamicnetworks.Section5discussesthe
generalityofthealgorithmandconcludesthepaper.
1.1Model
Weconsideradynamicasynchronousnetworksusceptibletonetworkpartitions/re-mergesandserver
crashes/recoveries.ThenetworkconfigurationisgivenbyagraphG(V(t),E(t))whereVrepresents
thesetofserversinthesystemandEthesetofoverlayconnectionsbetweenservers.Communication
ontheedgesisassumedreliableandFIFO.Weassumetheexistenceofanetworkmonitormodule
thatdetectsfailuresofthephysicalnetworkthatdisrupttheoverlaytopology.
Eachnodeholdsalocalcopyofthedistributeddatabaseandmayissueread/writeoperationson
behalfoftheclientsthatconnecttothem.Thedistributeddatabasemustobeyconsistencyguarantees
equivalenttoone-copy-serializability.Alltheoperationsareassumedtohavedeterministicoutcome.
Thealgorithmwillsatisfystandardcorrectnessandlivenessproperties.Inordertoguarantee
correctness,thealgorithmestablishesaglobaltotalorderofactionsthatisconsistentwithFIFO
specifications.Ifalltheserverscommittheactionsinthisagreedorder,thedatabaseconsistencyis
guaranteed.Thecorrectnessspecificationbelowissimilartotheoneusedin[AT01].
Property1(GlobalTotalOrder)
Ifbothserverssandrcommittedtheirithaction,thentheseactionsareidentical.
Property2(GlobalFIFOOrder)
Ifserverrcommittedanactionageneratedbyservers,thenralreadycommittedeveryactionthats
generatedpriortoa.
Property3(Liveness)
Ifserverscommitsactionaandthereexistsasetofserverscontainingsandr,andatimefrom
whichonthatsetdoesnotfacecommunicationorprocessfailures,thenserverreventuallyorders
actiona.
1.2RelatedWork
Two-phasecommit[GR93,LL93]algorithmsarethemostcommonapproachtoprovidingaconsistent
viewinadistributeddatabasesystemoveranunreliablenetwork.Howevertheseprotocolsimpose
asubstantialcommunicationcostoneachtransactionandmayrequireconnectivityofallreplicas