Usage of the nmissing command in Stata: a reply

Stata's nmissing command counts the number of missing values in variables. Missing values are data points with no recorded observation; they may arise from errors in data collection, non-response, omissions, or other causes, and their presence can undermine the accuracy of the data and the reliability of statistical analysis. The nmissing command lets us quickly determine how many missing values a variable contains, which serves as a measure of data completeness. The steps below describe how to use nmissing in Stata, along with some of its options and application examples.

Step 1: Load the dataset. First, load the dataset containing the variables to analyze:

    use "path\to\dataset.dta"

where "path\to\dataset.dta" is the location and file name of your dataset.

Step 2: Check the variable names. Before using nmissing, identify the variables to analyze. List all variables in the dataset with:

    describe

This displays information about every variable, including its name, data type, and storage format. Find the variables of interest so you can count their missing values.

Step 3: Use the nmissing command. Once the variable names are known, count the missing values with the basic syntax:

    nmissing varname

For example, to count the missing values in a variable named "age":

    nmissing age

After this command runs, Stata displays the number of missing values in "age".
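Putting the steps together, a minimal sketch is shown below. nmissing is a community-contributed command, so it may first need to be installed from SSC; the file path and variable names here are hypothetical.

    * install once if not already present (community-contributed command)
    ssc install nmissing

    * load a dataset and count missing values (path and variables hypothetical)
    use "C:\data\survey.dta", clear
    describe
    nmissing age
    nmissing age income education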
Step 4: Options of the nmissing command. nmissing also provides options for grouping and counting missing values as needed. A commonly used option is by(varlist), which groups the data by one or more variables and counts the missing values within each group. For example, the by option can count the missing values of an age variable separately for different cities or genders, as sketched below.
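A hedged example of the grouped count just described; the by() syntax follows the description in this text, and the variable names are hypothetical:

    * count missing values of age within each city (option as described above)
    nmissing age, by(city)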
Oracle Database Administration I & Oracle Database Administration II
Certification Overview and Sample Questions

Contents: Introduction; Certification Details; Certification Benefits (What IT Certification Offers, What Oracle Certification Offers, Oracle Certification Innovation with Digital Badging); Exam Preparation; Exam Topics (Oracle Database Administration I | 1Z0-082, Oracle Database Administration II | 1Z0-083); Sample Questions (Oracle Database Administration I | 1Z0-082, Oracle Database Administration II | 1Z0-083); Exam Registration Process; Exam Score; Oracle Certification Program Candidate Agreement; Oracle Certification Program Guidelines.

Introduction

Preparing to earn the Oracle Database Administration 2019 Certified Professional certification helps candidates gain the skills and knowledge to install, patch, and upgrade Oracle Database and Oracle Grid Infrastructure for a standalone server; create and manage a backup and recovery strategy using Recovery Manager (RMAN); use RMAN for database duplication and transportation; diagnose failures using RMAN; and manage all aspects of Multitenant container databases, pluggable databases, and application containers, including creation, cloning, security, transportation, and backup and recovery.

The Administration I exam and its recommended training focus on fundamental database administration topics, such as understanding the database architecture, managing database instances, managing users, roles, and privileges, and managing storage, which lay the foundation for an entry-level Database Administrator job role. The Administration I exam also assumes knowledge of SQL. The Administration II exam and its recommended training present advanced topics such as multitenancy, backup and recovery, deploying, patching, and upgrading.

Certification Benefits

What Oracle Certification Offers

By becoming a certified Oracle Database Administrator Professional, you demonstrate the full skill set needed to perform day-to-day administration of the Oracle Database. Preparing to take the Oracle Database certification exam broadens your knowledge and skills by exposing you to a wide array of important database features, functions, and tasks.
Oracle Database certification preparation teaches you how to perform complex, hands-on activities through labs, study, and practice. Additionally, Oracle certification exams validate your capabilities using real-world, scenario-based questions that assess and challenge your ability to think and perform.

What IT Certification Offers

Certification brings recognition of required skills by peers and management, along with greater confidence, respect, and opportunities. In Certification Magazine's annual salary surveys (January 2018 and January 2019 issues), 73 percent of respondents reported experiencing a greater demand for their skills, 65 percent a positive impact on their professional image through new skills, and 71 percent said certification was a key factor in a recent raise. This kind of longevity suggests that earning and maintaining a certification can keep you moving forward in your career, perhaps indefinitely.

Oracle Certification Innovation with Digital Badging

Oracle certification signifies a candidate's readiness to perform, and earned badges represent recognized skills and capabilities: a modern representation of skills tied to real-time job markets. A digital badge is a secure way to display and share your certification achievement; you have earned it, so get the recognition you deserve. You can display the badge from your profile, boost your professional image, and use it to apply for jobs matched to your skills by location, title, employer, or salary range. Oracle certification is industry recognized, credible, and role based, with product focus across Database, Applications, Cloud, Systems, Middleware, and Java, and it is globally one of the top 10 certification programs available.

Exam Preparation

By passing these exams, a certified individual proves fluency in and a solid understanding of the skills required to be an Oracle Database Administrator. To prepare successfully for the Oracle Database Administration I | 1Z0-082 and Oracle Database Administration II | 1Z0-083 exams, attend the recommended Oracle training. The courses below are currently available and are terrific tools to help you prepare not only for your exams, but also for your job as an Oracle Database Administrator. The new Oracle Database Administration Learning Subscription also helps you prepare for these exams with 24/7 access to continually updated training, hands-on labs, and integrated certification.

Recommended for 1Z0-082:
- Oracle Database: Administration Workshop
- Oracle Database: Introduction to SQL

Recommended for 1Z0-083:
- Oracle Database: Deploy, Patch and Upgrade Workshop
- Oracle Database: Backup and Recovery Workshop
- Oracle Database: Managing Multitenant Architecture
- Oracle Database Administration: Workshop
- Oracle Database 19c: New Features for Administrators
- Oracle Database 18c: New Features for Administrators (for 10g and 11g OCAs and OCPs)
- Oracle Database 12c R2: New Features for 12c R1 Administrators (12c R1 OCAs and OCPs)
- Oracle Database 11g: New Features for Administrators (for 10g OCAs and OCPs)

Exam Topics

The topics for 1Z0-082 are covered in the Oracle Database: Administration Workshop and Oracle Database: Introduction to SQL courses; the topics for 1Z0-083 are covered in the Oracle Database: Managing Multitenant Architecture, Oracle Database: Backup and Recovery Workshop, Oracle Database: Deploy, Patch and Upgrade Workshop, and Oracle Database 19c: New Features for Administrators courses.
Sample Questions

Oracle Database Administration I | 1Z0-082

1. Which two statements are true about the Oracle Database server architecture?
A. An Oracle Database server process represents the state of a user's login to an instance.
B. An Oracle Database server process is always associated with a session.
C. Each server process has its own User Global Area (UGA).
D. A connection represents the state of a user's login to an instance.
E. The entire data dictionary is always cached in the large pool.

2. Which two statements are true about the Oracle Database server during and immediately after SHUTDOWN IMMEDIATE?
A. New connection requests made to the database instance are refused.
B. Uncommitted transactions are rolled back automatically.
C. All existing connections to the database instance remain connected until all transactions either roll back or commit.
D. Uncommitted transactions are allowed to continue to the next COMMIT.
E. All existing transactions are aborted immediately.

3. Which three statements are true about Oracle database block space management?
A. A row can be migrated to a block in a different extent than the extent containing the original block.
B. An insert statement can result in a migrated row.
C. An update statement cannot cause chained rows to occur.
D. A row can be migrated to a block in the same extent as the extent containing the original block.
E. An insert statement can result in a chained row.

4. An Oracle Database server session has an uncommitted transaction in progress which updated 5000 rows in one table. In which two situations does the transaction complete, thereby committing the updates?
A. When a DDL statement is executed successfully by the same user in a different session.
B. When a DDL statement is executed successfully by the user in the same session.
C. When a DML statement is executed successfully by the same user in a different session.
D. When a DML statement is executed successfully by the user in the same session.
E. When a DBA issues a successful SHUTDOWN NORMAL statement and the session terminates normally.

5. Which two statements are true about indexes and their administration in an Oracle database?
A. An index can be scanned to satisfy a query without the indexed table being accessed.
B. A non-unique index can be converted to a unique index using a Data Definition Language (DDL) command.
C. A descending index is a type of bitmapped index.
D. An invisible index is maintained when a Data Manipulation Language (DML) command is performed on its underlying table.
E. An index is always created by scanning the key columns from the underlying table.

6. Which two statements are true about sequences in a single instance Oracle database?
A. Sequences that start with 1 and increment by 1 can never have gaps.
B. A sequence can issue the same number more than once.
C. Sequence numbers that are allocated require a COMMIT statement to make the allocation permanent.
D. A sequence can provide numeric values for more than one column or table.
E. The data dictionary is always updated each time a sequence number is allocated.

7. Examine the description of the SALES table:

    Name            Null?      Type
    --------------- ---------- --------------
    PRODUCT_ID      NOT NULL   NUMBER(10)
    CUSTOMER_ID     NOT NULL   NUMBER(10)
    TIME_ID         NOT NULL   DATE
    CHANNEL_ID      NOT NULL   NUMBER(5)
    PROMO_ID        NOT NULL   NUMBER(5)
    QUANTITY_SOLD   NOT NULL   NUMBER(10,2)
    PRICE                      NUMBER(10,2)
    AMOUNT_SOLD     NOT NULL   NUMBER(10,2)

The SALES table has 55,000 rows. Examine this statement:

    CREATE TABLE mysales (prod_id, cust_id, quantity_sold, price)
    AS
    SELECT product_id, customer_id, quantity_sold, price
    FROM sales
    WHERE 1 = 2;

Which two statements are true?
A. MYSALES is created with no rows.
B. MYSALES will have no constraints defined regardless of which constraints might be defined on SALES.
C. MYSALES has NOT NULL constraints on any selected columns which had that constraint in the SALES table.
D. MYSALES is created with 2 rows.
E. MYSALES is created with 1 row.

Oracle Database Administration II | 1Z0-083

1. Which three are true about an application container?
A. It always contains multiple applications.
B. Two or more application PDBs in the same application container can share access to tables.
C. It can have new application PDBs created by copying PDB$SEED.
D. Two or more application PDBs in the same application container can be given exclusive access to some tables.
E. It always has new application PDBs created by copying PDB$SEED.
F. It always contains a single application.

2. RMAN has just been connected to a target database and the recovery catalog database. In which two cases would an automatic partial resynchronization occur between this target database's control file and the RMAN recovery catalog?
A. When any control file metadata for data file backups or image copies is now older than CONTROL_FILE_RECORD_KEEP_TIME.
B. When a new data file is added to a tablespace in a registered target database.
C. When a backup of the current SPFILE is created.
D. When the target is first registered.
E. When any control file metadata for archive log backups or image copies is now older than CONTROL_FILE_RECORD_KEEP_TIME.

3. Which two are true about Oracle Grid Infrastructure for a Standalone Server?
A. Oracle Restart can be used without using ASM for databases.
B. Oracle Restart can attempt to restart a failed ASM instance automatically.
C. It must be installed before Oracle Database software is installed.
D. It must be installed after Oracle Database software is installed.
E. It allows ASM binaries to be installed without installing Oracle Restart.
F. It allows Oracle Restart binaries to be installed without installing ASM.

4. Which two are true about creating container databases (CDBs) and pluggable databases (PDBs) in Oracle 19c and later releases?
A. A CDB can be duplicated using the Database Configuration Assistant (DBCA) in silent mode.
B. A CDB can be duplicated using Recovery Manager (RMAN) with no configuration required before starting the duplication.
C. A PDB snapshot must be a full copy of a source PDB.
D. A PDB snapshot can be a sparse copy of a source PDB.
E. A CDB can be duplicated only by using the Database Configuration Assistant (DBCA).

5. Which two are true about the Oracle Optimizer?
A. It requires system statistics when generating SQL execution plans.
B. It always generates an index access operation when a statement filters on an indexed column with an equality operator.
C. It ignores stale object statistics in the Data Dictionary.
D. It can automatically re-optimize execution plans that were detected to be sub-optimal when executing.
E. It can re-write a statement internally in order to generate a more optimal plan.
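The CREATE TABLE ... AS SELECT construct in sample question 7 above can be tried directly. A minimal sketch, using the hypothetical sales schema from that question, shows why a predicate that is never true copies the column definitions but no rows:

    -- WHERE 1 = 2 is false for every row, so only the structure is copied
    CREATE TABLE mysales (prod_id, cust_id, quantity_sold, price)
    AS
    SELECT product_id, customer_id, quantity_sold, price
    FROM sales
    WHERE 1 = 2;

    SELECT COUNT(*) FROM mysales;   -- returns 0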
Exam Registration Process

Oracle exams are delivered through the independent company Pearson VUE; to register, create a Pearson VUE login.

Oracle Certification Program Candidate Agreement

In order to take your Oracle certification exam, you will need to agree to the Oracle Certification Program Candidate Agreement; please review that document.

Oracle Certification Program Guidelines

Learn more about Oracle Certification policies in the program guidelines. This certification overview and sample questions were created in June 2019. The content is subject to change, so please always check the web site for the most recent information regarding certifications and related exams.
Teradata window functions: a reply

What is a Teradata window function? A Teradata window function is an advanced database query technique that lets users perform aggregation, analysis, and ordering operations over the query result set. Window functions give access to every row of the result set and allow those rows to be grouped, ordered, and aggregated. With window functions, users can easily carry out complex analytical tasks without writing convoluted nested queries or resorting to temporary tables.

What are the benefits of using window functions? They greatly simplify query logic and improve both query efficiency and readability. A window function lets a single query perform several aggregation, ordering, and analysis operations that would otherwise require multiple separate queries. This is more efficient than nested queries or temporary tables because it reduces the number of data scans and the repeated operations on the data, and therefore the query's execution time.

What are the syntax and usage of window functions? In Teradata, a window function is defined through the OVER clause, which specifies the function's partitioning and ordering rules. Partitioning divides the query result set into subsets, where all rows of a subset share the same value of the specified columns or expressions; the ordering rule specifies how the rows within each partition are sorted. The OVER clause accepts the following keywords:

- PARTITION BY: specifies the partitioning columns or expressions; each distinct value creates a new partition.
- ORDER BY: specifies the ordering of the rows within each partition; several columns or expressions can be given, each sorted in ascending or descending order.
- ROWS BETWEEN: defines the window frame within the partition; the frame can run from one given row to another, from the first row of the partition to a given row, from a given row to the last row of the partition, or over the entire partition. A combined example follows this list.
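A hedged sketch combining the three keywords, computing a per-store running total of daily sales; the table and column names are hypothetical:

    SELECT store_id,
           sale_date,
           amount,
           SUM(amount) OVER (PARTITION BY store_id
                             ORDER BY sale_date
                             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
             AS running_total
    FROM daily_sales;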
What are the common types of window functions? Teradata provides several types of window functions to satisfy different analytical needs. The commonly used types include:

- Aggregate functions, such as SUM, COUNT, AVG, MIN, and MAX: these aggregate the rows of each partition and return an aggregated result.
- Analytic functions, such as RANK, DENSE_RANK, and ROW_NUMBER: these assign each row a rank or sequential number within its partition, as sketched below.
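For instance, a hedged ranking sketch over a hypothetical employees table, numbering employees by salary within each department:

    SELECT department_id,
           employee_name,
           salary,
           RANK()       OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rank,
           ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS row_num
    FROM employees;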
Intermediate-Level Network Security Administrator Practice Question Bank (with Answers)

Part I: Multiple-Choice Questions (40 questions, 1 point each, 40 points total)

1. Regarding the choice of physical location when building a machine room, which of the following is correct? ( )
A. The middle floors of the building
B. The top floor of the building
C. The basement
D. The ground floor
Correct answer: A

2. Which statement about IDS and IPS is correct? ( )
A. An IDS is deployed at the network border, an IPS inside the network
B. An IDS works in encrypted and switched environments, an IPS does not
C. Users must review IDS logs regularly, but not IPS logs
D. An IDS is deployed inside the network, an IPS at the network border
Correct answer: D

3. The minimum parallel clearance between twisted-pair cabling and a lightning down-conductor is ( ).
A. 600 mm
B. 800 mm
C. 1000 mm
D. 400 mm
Correct answer: C

4. CPU and memory usage can be viewed on the ( ) tab of Task Manager.
A. The "Applications" tab
B. The "Processes" tab
C. The "Performance" tab
D. The "Networking" tab
Correct answer: C

5. The 1000BASE-T standard specifies that the maximum unshielded twisted-pair length between a network card and a hub is ( ).
A. 50 meters
B. 100 meters
C. 200 meters
D. 500 meters
Correct answer: B

6. What is the advantage of using optical fiber cable in a network? ( )
A. It achieves higher transmission rates than coaxial cable or twisted pair
B. It is easy to install
C. It is an industrial standard that can be bought in any electrical shop
D. It is cheap
Correct answer: A

7. According to the China Southern Power Grid Co., Ltd. Information System Operation and Maintenance Management Measures (2014), a defect refers to ( ) occurring in an information system, where such anomalies or hidden risks will affect the information system's safe and reliable operation, performance, lifetime, or service quality.
A. Faults
B. Security risks
C. Anomalies or existing hidden risks (including information security vulnerabilities)
D. Events
Correct answer: C

8. A centralized operations monitoring system should be deployed to achieve ( ) of all servers, network devices, security devices, databases, middleware, and application systems.
A. Individual control
B. Centralized testing
C. Centralized monitoring
D. Distributed monitoring
Correct answer: C

9. Which of the following standards is an international standard for information security management? ( )
A. ISO 9000-2000
B. SSE-CMM
C. ISO 27000
D. ISO 15408
Correct answer: C

10. Which of the following cannot defend against ARP spoofing attacks? ( )
A. Using a static routing table
B. Using ARP firewall software
C. Using switches with anti-ARP-spoofing features
D. Actively querying IP and MAC addresses
Correct answer: A

11. In a database system, when the database schema changes, user programs need not be changed; this is the data's ( ).
Oracle SGA Explained

The Oracle SGA (System Global Area) is a key component of an Oracle database: a region of shared memory, allocated in RAM, that stores the information a database instance needs at run time. The SGA holds critical components such as the database buffer cache, the shared pool, and the redo log buffer, and it plays an essential role in the performance and stability of the database.

The main SGA components are:

1. Database Buffer Cache: stores data blocks read from disk in order to speed up database queries. When a user queries data, Oracle first looks in the buffer cache; if the required block is found there, it is returned to the user directly, avoiding the cost of a disk read.

2. Shared Pool: stores the execution plans of shared SQL and PL/SQL code, shared cursors, and the shared data dictionary cache. The shared pool improves the efficiency of SQL queries by avoiding the cost of repeatedly parsing and optimizing SQL statements.

3. Redo Log Buffer: stores the redo records describing database operations, guaranteeing the durability of transactions. When a user performs database operations, Oracle writes the redo records into the redo log buffer and periodically flushes them to the redo log files on disk, preventing data loss in case of a system failure.

4. Java Pool: stores the execution results of Java code running inside the Oracle database, along with Java classes and Java objects. The Java pool improves the execution efficiency of Java code by avoiding the cost of repeatedly compiling and loading it.

5. Large Pool: serves larger memory allocation requests, such as sort operations and parallel queries. The large pool improves the performance of these special operations and keeps them from consuming too much of the rest of the SGA.

6. Other components: these include the Java large object pool, the fixed area, and others; the exact set of components may differ between Oracle Database versions.
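To see how the SGA is actually carved up on a given instance, the dynamic performance views can be queried. A minimal sketch, assuming a session with the necessary privileges (e.g. a DBA account); the sizes returned are instance-specific:

    -- summary of SGA component sizes, largest first
    SELECT name, ROUND(bytes/1024/1024, 1) AS size_mb
    FROM   v$sgainfo
    ORDER  BY bytes DESC;

    SHOW SGA   -- SQL*Plus shortcut for the overall totals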
Grid Evolution and the Bologna TOF Site

M. L. Luvisetto, E. Ugolini

March 22, 2007

1 Introduction, Acknowledgements and Disclaimer

After the first setup and usage of Grid testbed sites, in 2004 the LHC experiments started using the Grid in a more intensive way in their Data Challenge computations. At the same time both LCG and EDG evolved: the European Community funded the new Grid project EGEE (Enabling Grids for E-Science in Europe) and LCG entered the next steps, LCG2 and gLite.

The LCG committee interacts closely with EGEE to recommend the common computing model and tools and to define the responsibilities of LCG teams, experiments, and sites, with the aim of implementing a standard protocol for Grid computations (hardware and software platforms, environment, software release level and upgrades, configuration, certification, accounting, security, testing, support, resource scheduling, planning and allocation, policies, etc.).

EGEE is revising the middleware to improve performance, scalability, and reliability. The middleware is evolving towards message-based services, with message interaction performed through message-handling standards. Services are network-capable units of software that implement logic, manage state, communicate via messages, and are governed by policy, as defined in the EGEE Middleware Architecture proposal. The new architecture foresees the introduction of a language to control authorization, and information services communicate through SQL as the query language (candidate model).

The present notes, starting from the DC04 lesson up to the present DC06 challenge, try to summarize in a simple and brief way the current status and foreseen evolution of the Bologna site as seen by the local managers and users. For a deeper insight refer to the bibliography, the glossary, and the site links. The notes that follow are based on personal experience gained managing the Bologna Grid site and at the SA1 phone conferences, on the Grid manuals, and on information gathered at the sites listed along the report and in the Reference Section. The Bologna site is managed by the author of the paper with the help of Franco Semeria for the Grid nodes and Paolo Mazzanti for the computer room infrastructure. The local monitoring facilities are implemented and maintained by Enzo Ugolini.

2 Experiment Computing: Traditional vs. Grid

Before entering the details of Grid computing evolution, we should review the off-line computing activities related to experiment data processing, comparing the traditional methods and the Grid. In general the computing task is divided into steps:

- the experiment collaboration takes care of the common software repository: libraries, simulation tools and programs, analysis tools, etc.
- each site takes care of the local installation of the common software and the related customization, with a common platform and one or more development groups
- tests, improvements, subdetector software, and specialized simulations are a local responsibility; once a new release is validated it is stored in the common repository
- the production task is a centralized collaboration job
- analysis is a widespread activity

Production refers to large-scale data processing with results stored at a central site. Other sites may store small fractions of the experiment data for variable periods depending on analysis needs. Production involves simulation and, once the experiment is ready for data taking, raw data processing.

First of all, let us take into account the tasks that a Grid environment must support:

1. network performance monitor: provide network measures with standard interfaces
2. trouble ticket system: provide network-managed support
3. security: security policy and incident response
4. operation monitoring: operation performance and reliability
5. accounting
6. job monitoring: job state viewing
7. storage interface: standards and interoperation of storage systems
8. information system to publish static and dynamic resources
9. job submission: user mapping and cross-grid job submission
10. multiple VOs
11. common operation policies
12. usage of proprietary software (like LSF, PBSpro, etc.)
13. future possible integration of academic and industrial usage of the grid
14. support, training and documentation

The above list is derived from the EGEE documents and is likely meaningless to end-users, who feel Grid computing is much more complex than the traditional use of personal workstations or local batch systems. The weak points of Grid computation for end-users are at least the following:

- users must install the UI middleware support
- certificate management is required, and users may have their jobs stopped because of expired certificates
- a new submission language to learn
- no easy way to check job and data progress
- at least a basic knowledge of the information services is required
- a new ticketing system to learn and, most likely, very little or no local support
- more complex, slower, and less versatile data access
- one or more site/job monitoring tools to learn
- no easy way for local customization of experiment software, as installation is performed only by the VO manager; it is not clear whether VOMS will enable the local manager to perform local experiment installation
- limited control over Grid storage (data ownership, management, movement, cleanup, etc.)
- dependency on the VO experiment manager's actions for installation, cleanup, etc., which results in slower operation and reduced flexibility

Having access to more powerful resources is not felt to be as important as the capability of exercising full control over jobs, checking data as they are created on file, and being able to drop jobs that are not working correctly. With disk sizes increasing at low prices, analysis jobs on the same data sample over a few months, performed on local workstations on data downloaded from Grid storage, are preferred to jobs run on Grid resources, whose efficiency depends on workload and network status with unknown response time. At present the Grid is seen mostly as a resource for centralized productions and data archival.

3 LCG2 and the DC04 Experience

Experiments exercise and stress the Grid every year by submitting jobs, collecting results, verifying performance, checking scalability, and learning from experience for the next test. This way of testing and stressing is called a Data Challenge (DC). Challenge tests are performed at the service level and at the experiment level. Service challenges (SC) test network and data transfer performance, giving feedback to LHC as a whole. Data challenges test the middleware as a whole and give feedback to the experiments, while physics data challenges (PDC) are customized for each experiment. Networking performance and problems of SC1 and SC2 performed during DC04 are described in the report "SC1 & 2 -- The wide-area-network point of view", stored at the CERN OpenLab site openlab-mu-internal.web.cern.ch.

In this Section we try to summarize the results of the DC performed in 2004 and the problems met by the LHC experiments. A summary document about the DC04 feedback to LCG2 is available in [27] (V2.2, October 29, 2004). DC04 feedback was restricted to the production experience, in which jobs are submitted in clusters by production managers (many jobs, few submitters).
The main points that emerge from the experiment reports are summarized below.

- Experiment software installation: procedures, problems, and the creation of VOBox units to customize experiment needs and requirements, adding features not supported by the middleware.
- The information system: substitute MDS (due to query hangs and site drop-out) with BDII and R-GMA. BDII is made of a standard OpenLDAP server with a Berkeley database and a database update script. Problems were met updating the (read-only) database with cron jobs. At the start there was one BDII per major site; during DC04, one BDII per VO. Query timing is critical and may cause a site to drop out if the local information index (GIIS) is not refreshed within the timeout (30 s).
- Bulk job submissions in the same JDL. Improved job handling (submit, retrieve, check status). Efficient handling of WN free disk space and allocation. At present (PDC06) bulk job submission is handled by the VOBox software, like AliEn.
- Synchronization of information index and bulk job submission. Job distribution and ranking (minimum waiting time before starting to run). Sites with misleading ranking metrics. Job rescheduling either on demand or by the RB (e.g. no transition from scheduled to running). Efficient evaluation of JDL requirements for job scheduling. Resource distribution among VOs, site configuration and resources (granularity). Do not reschedule to sites that have problems (job never enters execution or fails to produce output). User configuration: choose a specific RB, Replica Catalogue, etc.
- Running job resubmission problems: resubmission is triggered by the RB if a job is considered failed at a WN (not true in some cases, e.g. a job done in steps with some steps already completed). Normalized time limit (normalization depending on CPU performance), space limit preserving output sandboxes.
- Proxy expiration failures. VO-specific queues; queue setup problems in general. A safe and reliable method to publish really free CPUs. A more accurate site view: not accepting jobs, total jobs aborted, sandbox problems, running jobs, etc. Logging and bookkeeping cleanup. Do not review the JID for jobs already cleared, aborted, etc.
- Input sandbox site-wide for bulk jobs. Never lose output sandboxes. Better error reporting and debug tools.
- The RB finds out a reasonable SE for output. A more versatile way to access input/output data on SEs, selecting the best CE candidate.

3.1 DC04 Problems

The 2004 Data Challenge exercise has shown the weak points of Grid computing, as summarized in the following lists.

Grid middleware problems:

- server failures: IS, WMS, monitor server, File Catalogue server, HTTP, SFTP, GRIDFTP and BBFT servers
- failures and problems originated by hardware, software, and overload (with > 3500 concurrent jobs)
- network problems, connection blackouts, etc.
- NFS problems with the Resource Brokers mounting part of the filesystem from external disk servers
- Information System problems (hanging, off-line, no information published, etc.)
- NFS-exported filesystems with write permission problems
- random Globus errors at some sites
- network speed limits with remote SEs (small sites that must use a remote SE)
- SE storage failures, possible communication errors between CE and SE; a dedicated BDII is a possible solution

Job failures originated by:

- site misconfiguration
- job starts running, then fails
- lost output sandbox
- bad scheduling due to wrong CE information
- problems with commercial software (e.g. license expired for PBSpro)

Suggested improvements and user requirements:

- difficult debugging (too many distributed logs)
- CPU time limit handling
- disk space limit/allocation management
- allow local disk space allocation
- information for site shutdown
- include bulk submission of jobs in the middleware
Weak points that require attention, identified by Data Challenge 2004, may be summarized as follows:

- allow preselection of good sites by collecting site performance statistics
- create an experiment private BDII and a flag for bad sites, with a site blacklist maintained by the software manager
- improve support, answer speed, and the handling of the site-manager-unavailable condition
- avoid sites with small-memory WNs by using the IS attribute GlueHostMainMemoryRAMSize (in MB), so that the job will not arrive at the CE; the RB performs the matchmaking and sends jobs only to those sites which satisfy the requirements put into the JDL using the attributes of the IS
- add the option to kill long-lasting jobs
- create a special queue for each LHC experiment and for site monitoring

3.2 ALICE DC04 Problems

Problems that arise at installation:

- installation job never terminates; possible disk quota problem
- AliEn installation fails; a possible time skew confuses gmake
- installation fails due to faulty NFS mounts
- installation job is hanging
- installation is broken; the job ends in a segmentation violation
- wrong permissions on the experiment's software directory
- wrong mount point name and/or wrong value of VO_SW
- problems arising in mixed UI/RB nodes (security, hanging, etc.)
- memory size (512 MB per CPU is a minimum)
- SE node off-line during job execution
- scheduled shutdowns
- manual load balancing by submitting a mixture of long and short jobs
- queue mix-up: ALICE jobs should go to the dedicated ALICE queue, but they went to the normal queue
- check the available space if it is not correctly published
- random errors, like failing to start aliroot and then succeeding; the same problem when saving output

3.3 DC04 ALICE Memory Problems

ALICE jobs are memory and CPU demanding, so a job may fail after a long processing time due to insufficient memory. To avoid sites with small memory, the job can put into the JDL requirements a condition which sets the minimum value of the memory attribute. If the site is properly configured, the requirement in the JDL should work, but in production it is safer to also perform a check using the output of:

    cat /proc/meminfo

A DC04 improvement includes a script that parses the meminfo output and aborts the job if the WN's available memory is not sufficient. The test is now performed by Job Agents (JA). At the beginning of DC04 the GlueHostMainMemoryRAMSize parameter published by the sites was not very consistent, and most likely did not take into account the allowed number of jobs per WN, which would make the number irrelevant. It also does not take the swap space into account: you may have 512 MB of RAM but enough swap for the job to run. ALICE simulation jobs were fine-tuned during phase I of PDC04, and in phase II they were able to run on nodes with 600 MB of RAM, provided there was enough swap space to run a program that could grow to 1.2 GB. During PDC04 the ALICE active set (i.e. the amount of memory active at each moment, which needs to be pinned in physical memory) rarely exceeded 500 MB. Ideally there should be a parameter in Glue that tells which is the maximum size of the image that can be run on a node, which is NOT the physical memory size. In PDC04 ALICE efficiency went up to 80% when ALICE jobs were the only ones running in the grid; efficiency went down to 50% as the load increased.
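As an illustration of the two checks just described, a JDL fragment of the kind used for the memory requirement (the 1 GB threshold is hypothetical, not the DC04 production value):

    Requirements = other.GlueHostMainMemoryRAMSize >= 1024;

and a shell fallback on the WN in the spirit of the DC04 meminfo script (threshold again hypothetical):

    # abort early if the WN has less than about 1 GB of total memory
    mem_kb=$(awk '/^MemTotal/ {print $2}' /proc/meminfo)
    if [ "$mem_kb" -lt 1048576 ]; then
        echo "Job-abort: insufficient memory (${mem_kb} kB)"
        exit 1
    fi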
4 The gLite Evolution

gLite stands for Lightweight Middleware for Grid Computing and is the new-generation, generic, grid-services-based middleware. gLite components were initially deployed on a small-scale prototype infrastructure at CERN and the University of Wisconsin to collect user feedback on the services. The following sections summarize the most relevant existing or foreseen services of the gLite middleware.

4.1 gLite Load Balance and Job Submission

gLite implements a new Workload Manager (WMS) to improve load balance and job submission. The new WMS puts all jobs in its task queue (TQ) and immediately submits, in push mode, an agent to all CEs which satisfy the initial matchmaking job requirements. This agent makes a complete set of configuration checks and, only once these are satisfied, pulls the real jobs onto the WN, with one TQ per VO. The TQ works as a very long-term (longer than bookkeeping) job database, holding information, interacting with the notification stream from the LB, and providing a job view. The WMS will work both in push and in pull mode, but always with a task queue containing only the jobs submitted to a specific instance of the WMS.

To avoid the problem of jobs staying forever on an overloaded CE, a possible solution is the creation of pilot jobs, or job agents, that start the real jobs when the resource is ready. Job agents (JAs) are also eligible for periodic cancellation and resubmission if a job stays in a batch system queue for too long (idle jobs). For efficient job submission and execution, the WMS manager must maintain a reliable link with data access to distribute jobs according to input data needs. Data-driven submission is most important when analysis starts, with its random data access, as opposed to centralized job submission like in the present production jobs. The final goal is also to bring back to users only their application failures, and not Grid failures. With regard to job clusters, i.e. series of thousands of instances of the same application differing by a few input parameters, all jobs of a cluster must be treated in a special way, doing authentication and authorization only once.

4.2 VOMS, FQAN and G-Pbox

VO management is provided by VOMS, which allows an improved granularity for Grid resource access. VOMS is the user management service that handles user certificates and allows Grid access as appropriate. With regard to data management, VOMS enables ACL definition to grant data security. With regard to software managers and special user tasks, VOMS handles roles and site policy. Role handling allows the management of users belonging to more than one VO with the same or different roles. VOMS ACLs are set as attributes. The attributes of the VOMS entry for each registered user are:

- the VO, e.g. alice
- a group within the VO, by default at least one, i.e. the same as the VO, e.g. alice/manager
- a Role, e.g. normal user, software manager, etc.
- a Capability, where capability is a string assigned by the VOMS manager and foreseen for future use

After registering with voms-proxy-init --voms, the attributes are listed by voms-proxy-info --all, e.g.:

    attribute: /alice/Role=NULL/Capability=NULL
    attribute: /alice/lcg1/Role=NULL/Capability=NULL
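For a user in the alice VO, the registration and inspection steps just described amount to the following sketch (a valid user certificate is assumed):

    voms-proxy-init --voms alice   # create a VOMS proxy carrying the alice attributes
    voms-proxy-info --all          # list the proxy attributes, as in the example above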
G-Pbox is the policy handling service designed to manage resource allocation and sharing. The service interacts with VOMS, the CE, and the WMS to execute jobs and allocate storage in accordance with the site, VO, and Grid policies. Resource managers at each site interact with the service to grant or withdraw resource access according to the site policy, inside the wider resource management of the whole Grid. For more information about VOMS check [33].

4.3 CONSTANZA and G-DSE

The CONSTANZA service is designed to check and maintain replica consistency both of files and of databases. CONSTANZA could be used to mirror heterogeneous databases like VOMS, data catalogues, etc. G-DSE, the Grid Data Source Engine, is a model of database distribution over the Grid, with distributed queries that are transparent to the user with regard to database location and access method.

4.4 LCAS and LCMAPS

Sites participating in the Grid must be able to enforce local policies. The LCAS service can authorize local users and grant local resources. LCMAPS keeps track of credentials and their mapping between the Grid and the local site. Both services are part of gLite.

4.5 DGAS

DGAS is the accounting service that records usage for Grid jobs. The service records job resource consumption on the CE with several queueing products (e.g. LSF, PBS, Maui/Torque, Condor, etc.). The service is fully distributed, uses encryption, and grants access only to authorized users.

4.6 R-GMA: Relational Grid Monitoring Architecture

R-GMA is one of the monitoring components; it collects and publishes information for further queries, taking into account the dynamic nature of the Grid. As an example, an RB needs information updated to at most the last 10 seconds to be able to distribute and recover the workload in an efficient way. R-GMA, as a query system, should help HEP users access the LHC data storage, granting authorization, security, and privacy. R-GMA is supposed to replace the Globus MDS Toolkit, being more versatile and scalable, and it offers several query tools, like command line tools in Python and Java, the R-GMA web browser, etc.

4.7 SLA and Agreement Services

Agreement services are designed to allow reservation. A reservation grants users exclusive access for a specific time interval, according to resource requirements and availability. SLA stands for Service Level Agreement and is related to methods of evaluating Grid load and resources. SLAs enable users to schedule and execute complex multi-part jobs, also assuring the job start and end times. The methods should be able to match user requirements to the available resources in a dynamic and efficient way.

4.8 Tank & Spark

Tank & Spark is a service that automatically installs experiment-specific software on all sites supporting a specific VO. Experiment software installation is partly described in:

    https://edms.cern.ch/file/498080/1.0/SoftwareInstallation.pdf
    http://grid-deployment.web.cern.ch/grid-deployment/eis/docs/configuration tankspark

5 Current Data Challenges: SC4, DC06, PDC06

The first service challenges, SC1 (December 2004) and SC2 (March 2005), were focused on the basic infrastructure, without involving the experiments or the Tier-2s. The SCs produced useful feedback to build up and offer stable operation. Until the end of 2005, SC3 increased the complexity and functionality levels, involving all experiments, with specific solutions, and all Tier-1s. Beyond 2005, SC4 is designed to manage all LHC offline use cases, scale as needed, and set up the production services for LHC. The aim of SC4 is the validation of resources and performance in view of the first LHC data taking. The required services must provide the necessary levels of functionality, reliability, and scale, according to needs, by running in a realistic environment. The final goal of SC4 is the setup of the LCG production environment. The current SC4 planning is described at:

    https://twiki.cern.ch/twiki/bin/view/LCG/SC4ExperimentPlans

Data rates for CERN and the T1 centres in SC4 are targeted at 200 MB/s as the highest speed.
For the smallest centers the target is 50 MB/s, and CERN is targeted at 1600 MB/s. The planning is distributed among the LHC experiments, while the resource percentages are analyzed for each country. Just as an example, the INFN Tier-1 at CNAF is expected to provide the following resources: 7% of the ALICE resources, 7% of the ATLAS resources, 13% of the CMS resources, and 11% of the LHCb resources. Resources include CPU power in MSI2K, disk storage in PBytes for Tier-1s and Tier-2s, and tape storage in PBytes for Tier-1s only. The complete resource planning is available at:

    http://lcg.web.cern.ch/LCG/planning/phase2

DAQ is also involved in the SC performance evaluation, with the resource planning available at:

    https://uimon.cern.ch/twiki/bin/view/Main/DaqTierZeroTierOnePlanning

The DAQ data flow involves the T0/T1 system:

- DAQ → T0, involving the size and access of the DAQ disk buffer
- T0 → tape, involving file sizes, the I/O scheduling of read/write streams, and possible disk server congestion
- T0 → reconstruction farm, involving direct access to the disk servers with a local cache on the worker nodes
- T0 → Tier-1 export, involving a high number of parallel I/O and TCP streams towards the involved Tier-1s, taking into account that each Tier-1 may, and will, have a different mass storage system

6 ALICE Computing: TOF and Grid

This Section is an overview of the current status of both local and Grid computing and of site management for ALICE TOF in the ALICE framework. Local computing is performed using a small farm for batch job submission. Site management involves the setup of the computing hardware, operating system and middleware installation, and local tests and monitoring, as described in Section 6.2.

6.1 Overview of ALICE TOF Job Submission

Job submission requires a working installation of the basic software like root, aliroot, geant, etc. On the local farm the installation is performed by each user according to the processing needs. On the Grid the basic software must be the same at each site and must produce homogeneous output for a whole production set, so the installation is performed by the Grid software managers according to VOMS roles. In Grid production AliEn is responsible for the relevant information about the output data and the job parameters. As any job produces a log of its steps and errors, with optional run parameters, it is good practice to store the log information for any customized production as well, with special care for jobs submitted outside of the ALICE production. The following suggestions about log information and data storage structure are an example of useful practice for local computing that may also be valid in Grid computation, taking into account the different structure of Grid computing.

Since job environment information like date, job timings, executing node, etc. is published neither by local batch managers (BBS/Condor, PBS, LSF, etc.) nor by the middleware, the user should insert in the submission scripts a few flags at strategic points (job start, job end, job event number, data save start and end, etc.) that improve output checking, job timing evaluation, and failure debugging. The suggested flags are of the following type:

    echo "Job-beg-at: `date`"
    echo "Job-evt-no: 09999"
    echo "Job-WKnode: `hostname -f`"
    echo "Job-end-at: `date`"
    echo "Job-Aliroot Starting at: "`date` >> aliroot.log
    echo "Job-Aliroot Ending at: "`date` >> aliroot.log
    echo "Job-beg-save: `date`"
    echo "Job-end-save: `date`"

When we submit a cluster of TOF jobs without using the AliEn framework, we must use a local submission script that creates one job per event, with the event number range supplied to the script as an argument. The script action is the same for the queue managers and for the Grid; the difference applies only to the submission command (edg-job-submit, qsub, bbs_submit, etc.) and to the requirement description file (e.g. my-job.jdl), as sketched below.
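A minimal sketch of what such a my-job.jdl requirement description file might contain; the script name, arguments, and memory threshold are hypothetical, not the actual TOF production values:

    Executable    = "run_tof_evt.sh";
    Arguments     = "09999";
    StdOutput     = "std.out";
    StdError      = "std.err";
    InputSandbox  = {"run_tof_evt.sh"};
    OutputSandbox = {"std.out", "std.err", "aliroot.log"};
    Requirements  = other.GlueHostMainMemoryRAMSize >= 1024;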
It is good practice to store the job identifiers of each cluster submission in a file with a name related to the event number or range. As the Grid job identification numbers are meaningless to the user, in this way the user creates a connection between the event number and the related job identifier, which is essential for job retrieval. For Bologna local jobs, an example of directory tree naming for TOF simulations is:

    $TOF_ROOT/$Prod_year/$Run_ID/$Evt_ID/EVT_Collection

6.2 Site Management

The VOBox requirements are experiment dependent; ALICE requires a VOBox per site. To manage the computing resources and keep a high working efficiency, a site manager must perform the following highly demanding tasks:

- install new hardware
- maintain the existing hardware, plan warranty support, update very old nodes with new ones
- install the operating systems
- test new nodes by running stressing jobs
- install and configure the required tools as appropriate (middleware or local batch system)
- add a local monitoring service
- manage local users
- improve installation granularity
- install and/or test the local applications
- manage ACLs, if any
- check the on-line job monitoring frequently
- react quickly to Grid tickets concerning malfunctions of the site
- follow the meetings of the Grid and of the VO regarding site management and status

6.3 Monitoring Tools and Bologna Activity

Grid computing offers several monitoring tools, both at the user and at the administrator level.

- INFN Production Grid (af.infn.it): deployment of the INFN Grid, downtime advices, ticketing system.
- SAM and GSTAT: Service Availability Monitoring (SAM) jobs are launched every 3 hours to find out any submission problem; SAM is intended for site administrators and requires registration for access. GSTAT displays the status of the site with the most recent charts: .tw/gstat/INFN-BOLOGNA/
- GridIce (af.infn.it:50080/gridice/site/site.php): the default monitor view is by site; users may select the VO view and choose the appropriate VO. The main displayed metrics are related to CEs, SEs, hosts, jobs, and charts. Charts show the workload at each site; the default view displays data every 5 hours for the last 3 days. Historical information is available on request (see the example in Section F).
- MonALISA: the ALICE production is monitored by MonALISA, http://pcalimonitor.cern.ch:8889/show?page=index.html, which produces reports of the production activity over the Grid as a whole or for each partner site. More than 40 sites are running the production, with an average of more than 1000 jobs.

The ALICE PDC06 started on 15 April 2006. Bologna joined at the end of August, when we reorganized the local farm and gained a free node for the VOBox installation. The table below lists the activity since our installation.

Activity Summary, August-November 2006 (MonALISA):

    Month   Jobs    Av. kSI2K   xrootd out   Percent
    Aug     0.00    1           4.18 MB      86.58%
    Sep     60.8    2           29.14 MB     62.57%
    Oct     199.9   4           0.06 MB      97.14%
    Nov     178.9   3           0.66 MB      73.63%
Using Stata's nmissing command: a reply

The nmissing command in Stata counts the number of missing values in variables. Missing values are a common problem in data analysis, so understanding and handling them correctly is essential for obtaining accurate results. First, let us look at how to use the nmissing command.
In Stata, simply type the nmissing command in the Command window or in a do-file, followed by the list of variables whose missing values you want to count. The variables can be continuous or discrete. The syntax of the nmissing command is:

```stata
nmissing [varlist] [, clear]
```

where varlist is the list of variables whose missing values are to be counted, and the clear option removes from the current dataset all variables other than those involved in the count.
A simple use of nmissing is to count the missing values of every variable in a dataset. For example, consider a dataset named "data" containing four continuous variables: var1, var2, var3, and var4. To count the missing values in these variables, type:

```stata
nmissing var1 var2 var3 var4
```

After running this command, Stata returns a table with the number of missing values in each variable. Each row of the table corresponds to one variable, giving the variable name and its count of missing values.
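As a quick, self-contained illustration: nmissing is a community-contributed command distributed via SSC, so it must be installed once; the auto dataset ships with Stata and its rep78 variable has 5 missing values.

```stata
ssc install nmissing            // one-time installation from SSC

sysuse auto, clear              // built-in example dataset
nmissing price mpg rep78        // missing-value count for each listed variable

misstable summarize price mpg rep78   // built-in alternative, for comparison
```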
Once the basic usage of nmissing is mastered, we can go further into interpreting and handling missing values. In data analysis we often need to consider why values are missing and how to deal with them. The causes of missingness generally fall into two classes: random and non-random. Random missingness means the missing values are distributed randomly across the dataset; non-random missingness means the missingness is systematically related to the observed data. When missingness is random, we can usually ignore the missing values safely and base the analysis on the complete cases.
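A minimal sketch of that complete-case (listwise deletion) approach, reusing the hypothetical variables from the example above:

```stata
egen nmiss = rowmiss(var1 var2 var3 var4)   // per-observation missing count
keep if nmiss == 0                          // keep complete cases only
drop nmiss
```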
5 Reasons to run your business on Exadata

Oracle Exadata is the only platform that delivers optimum database performance and efficiency for mixed data, analytics, and OLTP workloads. With a full range of deployment options, it allows you to run your Oracle Database and data workloads where you want, how you want: on-premises, in the Oracle Cloud, Cloud at Customer in your data center, or any combination of these models. Here are five top reasons to choose Exadata to run your business.

"We chose Oracle Exadata for its integrated hardware and software platform. It costs 31 percent less than products from other vendors, such as IBM and SAP. By running Oracle's JD Edwards ERP system on Oracle Exadata, we've gained a high-performing, reliable, and scalable database platform that enabled us to create daily sales reports 60x faster, introduce products 36x faster, enhance user satisfaction, increase IT productivity by 40 percent, and reduce operating costs." (D. V. Jachak, General Manager, IT, Sai Prasad Group)

"By consolidating 92 percent of our IBM servers and four databases onto a single Oracle Exadata Database Machine, we gained an integrated, high-performing private cloud platform to support e-business growth. We can process online orders 8x faster, and have reduced operating costs by over 100,000 USD per year." (Zheng Tao, Head of IT, Wumart Stores, Inc.)

"By consolidating 40 disparate databases on Oracle Exadata Database Machine, we boosted sales and production system performance up to 60 percent and cut initial installation costs by 60 percent. We also enhanced IT governance across the organization, and supported 10 billion USD in turnover via business expansion." (Akio Yoshizawa, Senior Manager, IT Infrastructure Solutions Department, NSK Network and Systems Co. Ltd.)

"Ziraat Bank was always under pressure to optimize performance, because any small addition to end-of-day processing could negatively impact revenue and the bank's reputation. Oracle Exadata took all the steam off. We decreased the overnight batch window by more than 60 percent, reduced disk usage by 8x and overall system utilization from 70 percent to 30 percent, while improving uptime for our core banking online transaction processing system." (Serdar Mutlu, Manager, Database Systems, T.C. Ziraat Bankasi A.Ş.)

"Applications, databases, and infrastructure all have to work together in harmony. When we looked at other cloud providers, they offered these in pieces. It was up to us to craft a solution. Oracle offered an integrated solution for me. It was a natural choice." (Anantha Spirama, VP, Systems and Technology, Macy's)
Features and Benefits Overview: Control IT Harmony Rack Communications

Control Network (Cnet) is a high-speed data communication highway between nodes in the Symphony™ Enterprise Management and Control System. Cnet provides a data path among Harmony control units (HCU), human system interfaces (HSI), and computers. High system reliability and availability are key characteristics of this mission-critical communication network. Reliability is bolstered by redundant hardware and communication media, arranged so that the backup automatically takes over in the event of a fault in the primary. Extensive use of error checking and message acknowledgment assures accurate communication of critical process data.

Cnet uses exception reporting to increase the effective bandwidth of the communication network. This method offers the user the flexibility of managing the flow of process data and ultimately the process. Data is transmitted only when it has changed by a user-selectable amount, or when a predetermined time-out period is exceeded. The system provides default values for these parameters, but the user can customize them to meet the specific needs of the process under control (a minimal sketch of this reporting rule follows this overview).

■ Fast plant-wide communication network: Cnet provides fast response time to insure timely information exchange.
■ Efficient data transfer: message packing and multiple addressing increase data handling efficiency and throughput.
■ Plant-wide time synchronization: time synchronization of Cnet nodes throughout the entire control process insures accurate data time-stamping.
■ Independent node communication: each Cnet node operates independently of the other nodes and requires no traffic directors; each node is its own communication manager.
■ Accurate data exchange: multiple self-check features, including positive message acknowledgment, cyclic redundancy checks (CRC), and checksums, insure data integrity.
■ Automatic communications recovery: rack communication modules provide localized start-up/shutdown on power failure without operator intervention. Each type of interface supports redundancy.

Overview

Harmony rack communications encompasses various communication interfaces, as shown in Figure 1: Cnet-to-Cnet communication, Cnet-to-HCU communication, and Cnet-to-computer communication.

Figure 1. Harmony Rack Communications Architecture

The communication interface units transfer exception reports and system data, control, and configuration messages over Cnet. Exception-reported data appears as dynamic values, alarms, and state changes on displays and in reports generated by human system interfaces and other system nodes. Exception reporting is automatic at the Harmony controller level. Specifically, the controller generates an exception report periodically to update data, after a process point reaches a defined alarm limit or changes state, or after a significant change in value occurs.
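The deadband-plus-timeout reporting rule described above can be sketched in a few lines. This is an illustrative sketch only; the class name, thresholds, and API are hypothetical and do not represent ABB's implementation.

```python
import time

class ExceptionReporter:
    """Report a value only on significant change or after a timeout."""

    def __init__(self, deadband: float, timeout_s: float):
        self.deadband = deadband    # user-selectable significant-change amount
        self.timeout_s = timeout_s  # report at least this often
        self.last_value = None
        self.last_report = 0.0

    def should_report(self, value: float) -> bool:
        now = time.monotonic()
        changed = (self.last_value is None
                   or abs(value - self.last_value) >= self.deadband)
        timed_out = (now - self.last_report) >= self.timeout_s
        if changed or timed_out:
            self.last_value = value
            self.last_report = now
            return True
        return False
```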
Control Network

Cnet is a unidirectional, high-speed serial data network that operates at a 10-megahertz or 2-megahertz communication rate. It supports a central network with up to 250 system node connections. Multiple satellite networks can link to the central network, and each satellite network supports up to 250 system node connections. Interfacing the maximum number of satellite networks gives a system capacity of over 62,000 nodes. On the central network, a node can be a bridge to a satellite network, a Harmony control unit, a human system interface, or a computer, each connected through a Cnet communication interface. On a satellite network, a node can be a bridge to the central network, a Harmony control unit, a human system interface, or a computer.

Harmony Control Unit

The Harmony control unit is the fundamental control node of the Symphony system. It connects to Cnet through a Cnet-to-HCU interface. The HCU cabinet contains the Harmony controllers and input/output devices; the actual process control and management take place at this level. HCU connection to Cnet enables Harmony controllers to:

■ Communicate field input values and states for process monitoring and control.
■ Communicate configuration parameters that determine the operation of functions such as alarming, trending, and logging on a human system interface.
■ Receive control instructions from a human system interface to adjust process field outputs.
■ Provide feedback to plant personnel of actual output changes.

Human System Interface

A human system interface, such as a Signature Series workstation running Maestro or Conductor Series software, provides the ability to monitor and control plant operations from a single point. It connects to Cnet through a Cnet-to-computer interface. The number of workstations in a Symphony system varies and depends on the overall control plan and size of a plant. The workstation connection to Cnet gives plant personnel access to dynamic plant-wide process information, and enables monitoring, tuning, and control of an entire plant process from workstation color graphics displays and a pushbutton keyboard.

Computer

A computer can access Cnet for data acquisition, system configuration, and process control. It connects to Cnet through a Cnet-to-computer interface. The computer connection to Cnet enables plant personnel, for example, to develop and maintain control configurations, manage the system database, and create HSI displays remotely using Composer™ engineering tools. Additional Composer and Performer series tools and applications can access plant information through a Cnet-to-computer interface.

Cnet-to-Cnet Communication Interface

The Cnet-to-Cnet interfaces are the INIIR01 Remote Interface and the INIIL02 Local Interface. Figure 2 shows the remote interface and Figure 3 shows the local interface.

Figure 2. Cnet-to-Cnet Remote Interface (INIIR01)
Figure 3. Cnet-to-Cnet Local Interface (INIIL02)

INIIR01 Remote Interface

The INIIR01 Remote Interface consists of the INNIS01 Network Interface Module and the INIIT12 Remote Transfer Module (Fig. 2). This interface is a node on a central network that can communicate with an interface node on a remote satellite network. In this arrangement two interfaces are required: one for the central network and one for the satellite network. Bidirectional communication from the central network to the remote satellite network is through standard RS-232-C ports.

The remote interface supports hardware redundancy. Redundancy requires a full set of duplicate modules (two INNIS01 modules and two INIIT12 modules on each network). The secondary INIIT12 module continuously monitors the primary over dedicated Controlway. A failover occurs when the secondary module detects a primary module failure.
When this happens, the secondary interface takes over and the primary interface is taken offline.

INIIL02 Local Interface

The INIIL02 Local Interface consists of two INNIS01 Network Interface Modules and the INIIT03 Local Transfer Module (Fig. 3). This interface acts as a bridge between two local Cnets. One of the INNIS01 modules operates on the central network side and the other operates on the satellite network side. Bidirectional communication from the central network to the local satellite network is through a cable connection to the NTCL01 termination unit. The maximum distance between termination units on the two communication networks is 45.8 meters (150 feet).

The local interface supports hardware redundancy. Redundancy requires a full set of duplicate modules (four INNIS01 modules and two INIIT03 modules). The secondary INIIT03 module continuously monitors the primary over dedicated Controlway. A failover occurs when the secondary detects a primary module failure; when this happens, the secondary assumes responsibility and the primary is taken offline.

Cnet-to-HCU Communication Interface

The Harmony control unit interface consists of the INNIS01 Network Interface Module and the INNPM12 or INNPM11 Network Processing Module (Fig. 4). This interface can be used for a node on the central network or on a satellite network (Fig. 1). Through this interface the Harmony control unit has access to Cnet and to Controlway at the same time. Controlway is an internal cabinet communication bus between Harmony rack controllers and the communication interface modules.

The HCU interface supports hardware redundancy. Redundancy requires a full set of duplicate modules (two INNIS01 modules and two INNPM12 or INNPM11 modules). The secondary network processing module (INNPM12 or INNPM11) continuously monitors the primary through a direct ribbon cable connection. A failover occurs when the secondary detects a primary module failure; when this happens, the secondary assumes responsibility and the primary is taken offline.

Cnet-to-Computer Communication Interface

The Cnet-to-computer interfaces are the INICI03 and INICI12 interfaces. The INICI03 interface consists of the INNIS01 Network Interface Module, the INICT03A Computer Transfer Module, and the IMMPI01 Multifunction Processor Interface Module (Fig. 5). The INICI12 interface consists of the INNIS01 Network Interface Module and the INICT12 Computer Transfer Module (Fig. 6).

Figure 4. Cnet-to-HCU Interface
Figure 5. Cnet-to-Computer Interface (INICI03)
Figure 6. Cnet-to-Computer Interface (INICI12)

A computer interface can be used for a node on the central network or on a satellite network (Fig. 1). It gives a host computer access to point data over Cnet. The computer connects either through an RS-232-C serial link at rates up to 19.2 kilobaud or through a SCSI parallel port when using an INICI03 interface; it connects through an RS-232-C serial link at rates up to 19.2 kilobaud when using an INICI12 interface. Each interface is command driven through software on the host computer: it receives a command from the host computer, executes it, then replies to the host computer.

Note: a workstation running Conductor VMS software does not use an INICI03 or INICI12 Cnet-to-computer interface, but instead has its own dedicated version of the Cnet-to-computer interface (IIMCP02 and IIMLM01).

Communication Modules

Table 1 lists the available Harmony rack communication modules.
These modules, in certain combinations, create the various Cnet communication interfaces.

Table 1. Harmony Rack Communication Modules

| Module | Description | INIIR01 (Cnet-to-Cnet) | INIIL02 (Cnet-to-Cnet) | Cnet-to-HCU | INICI03 (Cnet-to-Computer) | INICI12 (Cnet-to-Computer) |
|--------|-------------|------------------------|------------------------|-------------|----------------------------|----------------------------|
| IMMPI01 | Multifunction processor interface | | | | • | |
| INICT03A | Cnet-to-computer transfer | | | | • | |
| INICT12 | Cnet-to-computer transfer | | | | | • |
| INIIT03 | Cnet-to-Cnet local transfer | | • | | | |
| INIIT12 | Cnet-to-Cnet remote transfer | • | | | | |
| INNIS01 | Network interface | • | • | • | • | • |
| INNPM11 or INNPM12 | Network processing | | | • | | |

Network Interface Module

The INNIS01 Network Interface Module is the front end of all the different Cnet communication interfaces. It is the intelligent link between a node and Cnet. The INNIS01 module works in conjunction with the transfer modules and the network processing module; this allows any node to communicate with any other node within the Symphony system.

The INNIS01 module is a single printed circuit board that occupies one slot in the module mounting unit (MMU). The circuit board contains microprocessor-based communication circuitry that enables it to communicate directly with the transfer modules and the network processing module, and to interface to Cnet.

The INNIS01 module connects to its Cnet communication network through a cable connected to an NTCL01 termination unit. Communication between nodes is through coaxial or twinaxial cables that connect to the termination units of each node.

Cnet-to-Cnet Remote Transfer Module

The INIIT12 Remote Transfer Module supports bidirectional communication through two RS-232-C ports. Port one passes system data only; port two passes system data or can be used as a diagnostic port. The central network INIIT12 module can use a variety of means to link to the satellite network INIIT12 module, such as modems, microwave, and transceivers. The INIIT12 module directly communicates with an INNIS01 module. Many of the operating characteristics of the INIIT12 module are determined by function code 202 (INIIT12 executive) specifications.

The INIIT12 module is a single printed circuit board that occupies one slot in the module mounting unit. The circuit board contains microprocessor-based communication circuitry that enables it to communicate serially with another INIIT12 module, to communicate directly with its INNIS01 module, and to interface to Controlway.

The INIIT12 module connects through a cable to an NTMP01 termination unit; the two RS-232-C ports are located on the termination unit.

Cnet-to-Cnet Local Transfer Module

The INIIT03 Local Transfer Module serves as the bridge between two local Cnet communication networks. It holds the node database and is responsible for transferring all messages between networks. Messages include exception reports, configuration data, control data, and system status. This module directly communicates with the INNIS01 modules of the central network and of the satellite network simultaneously.

The INIIT03 module is a single printed circuit board that occupies one slot in the module mounting unit. The circuit board contains microprocessor-based communication circuitry that enables it to communicate directly with its two INNIS01 modules and to interface to Controlway.

Cnet-to-Computer Transfer Module

The INICT03A Computer Transfer Module and the INICT12 Computer Transfer Module handle all communication with a host computer. These modules are command driven through software on the host computer. The module receives a command from the host computer, executes it, then replies.
Its firmware enables the host computer to issue commands for data acquisition, process monitoring, and process control, and to perform system functions such as security, time synchronization, status monitoring, and module configuration.

The INICT03A and INICT12 modules are single printed circuit boards that occupy one slot in the module mounting unit. Their capabilities and computer connection methods differ: the INICT03A module can store up to 30,000 point definitions (depending on point types), while the INICT12 module can store up to 10,000 point definitions.

For the INICT03A module, the circuit board contains microprocessor-based communication circuitry that enables it to communicate directly with its INNIS01 module and with an IMMPI01 module. It communicates with the IMMPI01 module through a ribbon cable connection. The IMMPI01 module handles the actual host computer interface and supports RS-232-C or SCSI serial communication.

For the INICT12 module, the circuit board contains microprocessor-based communication circuitry that enables it to communicate directly with its INNIS01 module and with a host computer using RS-232-C serial communication. The module cable connects to an NTMP01 termination unit; two RS-232-C ports are located on the termination unit, and the NTMP01 jumper configuration determines DTE or DCE operation.

Multifunction Processor Interface Module

The IMMPI01 Multifunction Processor Interface Module handles the I/O interface between the host computer and the INICT03A Computer Transfer Module. The IMMPI01 module supports either a SCSI or an RS-232-C computer interface. When communicating through the RS-232-C port, the module can act as data communication equipment (DCE) or data terminal equipment (DTE).

The IMMPI01 module is a single printed circuit board that occupies one slot in the module mounting unit. The circuit board contains microprocessor-based communication circuitry that enables it to communicate with its INICT03A module through a ribbon cable connection.

For the RS-232-C computer interface, the module cable connects to an NTMP01 termination unit; two RS-232-C ports are located on the termination unit, and the NTMP01 jumper configuration determines DTE or DCE operation. The SCSI port is located on the module faceplate; in this case, no termination unit is required.

Network Processing Module

The INNPM12 or INNPM11 Network Processing Module acts as a gateway between Cnet and Controlway. The module holds the Harmony control unit database and handles the communication between controllers residing on Controlway and the INNIS01 module.

The INNPM12 or INNPM11 module is a single printed circuit board that occupies one slot in the module mounting unit. The circuit board contains microprocessor-based communication circuitry that enables it to communicate directly with its INNIS01 module and to interface to Controlway.

Rack Communications Power

Harmony rack communication modules are powered by 5, 15, and -15 VDC logic power. Modular Power System II supplies the logic power. These operating voltages are distributed from the power system through a system power bus bar mounted in the cabinet. A module mounting unit connects to this bus bar and then routes the power to individual modules through backplane connectors.

Rack Communications Mounting Hardware

Harmony rack communication modules and their termination units mount in standard ABB cabinets. The option for small cabinet mounting is provided.
The number of modules that can be mounted in a single cabinet varies. Modules of an interface are always mounted in adjacent slots. An IEMMU11, IEMMU12, IEMMU21, or IEMMU22 Module Mounting Unit and an NFTP01 Field Termination Panel are used for module and termination unit mounting, respectively (Fig. 7). The mounting unit and termination panel both attach to standard 483-millimeter (19-inch) width side rails. Front-mount and rear-mount MMU versions are available to provide flexibility in cabinet mounting.

A module mounting unit is required to mount and provide power to rack-mounted modules. The unit is for mounting Harmony rack controllers, I/O modules, and communication interface modules. The MMU backplane connects and routes:

■ Controlway.
■ I/O expander bus.
■ Logic power to rack modules.

The Controlway and I/O expander bus are internal cabinet communication buses. Communication between rack controllers and HCU communication interface modules is over Controlway. The Cnet-to-Cnet interfaces use dedicated Controlway for redundancy communication; this dedicated Controlway is isolated from all other modules.

Figure 7. Rack I/O Mounting Hardware

Related Documents

| Number | Document Title |
|--------|----------------|
| WBPEEUD250001?? | Harmony Rack Communications, Data Sheet |
Dataphin Scheduling Logic

In today's era of information explosion, data processing and management are critically important. As enterprise data volumes keep growing, processing and managing data efficiently has become a major challenge for companies. Against this background, Dataphin's scheduling logic has emerged as a capable assistant for optimizing enterprise data processing workflows.
Dataphin scheduling logic refers to the mechanism by which the Dataphin platform schedules and manages data processing workflows. By planning, scheduling, and monitoring these workflows, it automates data processing and makes it more efficient. In practice, it helps enterprises automate data extraction, transformation, and loading (ETL), improving both the efficiency and the accuracy of data processing.
First, Dataphin scheduling logic plans and designs data processing workflows, making them visual and automated. On the Dataphin platform, users can easily create processing pipelines and attach scheduling rules so that jobs execute automatically on a timetable (a toy sketch of this idea follows this paragraph). This greatly reduces the workload of data engineers and improves efficiency.
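Dataphin's actual APIs are not shown in this text; the following is a generic, hypothetical illustration of attaching a fixed timetable to an ETL task, using only the Python standard library.

```python
import sched
import time

def etl_job() -> None:
    """Hypothetical ETL task: extract, transform, load (all stubbed)."""
    rows = [{"id": 1, "value": " 42 "}]                           # extract
    cleaned = [{**r, "value": r["value"].strip()} for r in rows]  # transform
    print(f"loaded {len(cleaned)} rows at {time.ctime()}")        # load

def run_on_schedule(scheduler: sched.scheduler, interval_s: float) -> None:
    """Run the job, then re-arm it so it fires again after interval_s."""
    etl_job()
    scheduler.enter(interval_s, 1, run_on_schedule, (scheduler, interval_s))

scheduler = sched.scheduler(time.time, time.sleep)
scheduler.enter(0, 1, run_on_schedule, (scheduler, 3600))  # hourly timetable
scheduler.run()  # blocks, executing the job on schedule
```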
Second, Dataphin scheduling logic monitors and manages running workflows. Through the platform, users can watch execution in real time and promptly detect and handle anomalies in data processing. This helps guarantee the accuracy and timeliness of data processing and improves its quality.
Finally, Dataphin scheduling logic supports intelligent optimization of workflows. Users can tune and re-schedule pipelines based on observed execution behavior and requirements, improving processing efficiency and performance. This helps the enterprise make better use of its data resources and supports data-driven decision making.
In summary, by planning, scheduling, and monitoring data processing workflows, Dataphin scheduling logic automates data processing and improves the efficiency and accuracy of enterprise data handling. As enterprise data volumes continue to grow, Dataphin scheduling logic will play an increasingly important role as a capable assistant for optimizing data processing workflows.
Stata commands for structural equation modeling

Structural equation modeling (SEM) is a statistical method widely used in social science research. It can be used to validate theoretical models, test causal relationships, and predict relationships among latent variables. In Stata, a small set of commands covers the estimation and analysis of structural equation models.
1. Availability of the sem command

Before fitting structural equation models, make sure the sem command is available. sem is built into Stata 12 and later, so no separate installation is needed; typing

```stata
help sem
```

in the Command window confirms that it is available.
2. Loading the data

Before fitting a structural equation model, load the data. Assuming the data file is named data.dta, it can be loaded with:

```stata
use "data.dta", clear
```

3. Specifying the model

Before estimation, the model must be specified. Specification covers two parts: the relationships among the variables and the measurement model.
(1) Relationships among variables. Suppose the model contains three latent variables: X, Y, and Z. We assume that X directly affects Y and Z, and that Y directly affects Z. These structural relationships can be written as:

```stata
sem (X -> Y) (X -> Z) (Y -> Z)
```

(2) The measurement model. In a structural equation model, latent variables are measured through observed variables.
We therefore need to specify the observed variables that measure each latent variable. Suppose latent variable X is measured by observed variables X1, X2, and X3; Y by Y1 and Y2; and Z by Z1 and Z2. The measurement model is then:

```stata
sem (X -> X1 X2 X3) (Y -> Y1 Y2) (Z -> Z1 Z2)
```

4. Estimating the model

Once the model is specified, it must be estimated.
Estimation is requested by adding the method() option to the sem command that contains the full model specification:

```stata
sem ..., method(ml)   // maximum likelihood estimation
```

where method(ml) selects maximum likelihood estimation and "..." stands for the path specification built in the previous steps (sem needs the model and the option in a single call).
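Putting the pieces together, here is a complete sketch under the article's assumptions: X1-X3, Y1-Y2, and Z1-Z2 are hypothetical observed variables that must exist in the loaded dataset, while the capitalized names X, Y, and Z, not found in the data, are treated as latent by sem's naming convention.

```stata
* Measurement and structural parts estimated in one sem call
sem (X -> X1 X2 X3)   /// measurement model for X
    (Y -> Y1 Y2)      /// measurement model for Y
    (Z -> Z1 Z2)      /// measurement model for Z
    (X -> Y) (X -> Z) (Y -> Z), method(ml)

estat gof, stats(all)  // goodness-of-fit statistics after estimation
```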
The "reallocated sector count" exclamation-mark warning: a reply

What is the "reallocated sector count"? Why does it attract so much attention, and how should we deal with it?

The reallocated sector count is one of the parameters in a hard drive's S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology) feature. It records the number of damaged sectors that have appeared on the disk. When the drive detects a damaged sector, it marks the sector as unusable and copies its data to a spare sector; this process is called "reallocation".

The reallocated sector count draws attention because it provides important information about drive health. A rising reallocated sector count means the number of damaged sectors is growing; if the value becomes too high, the drive may have failed or be about to fail. Drive failure can lead to data loss or to a completely unusable disk, so dealing with reallocated sector count problems promptly is very important.
So how should we handle a reallocated sector count problem? First, check and record the drive's S.M.A.R.T. parameters regularly, including the reallocated sector count. This allows problems to be spotted early so that appropriate measures can be taken. Most operating systems provide tools for reading and interpreting S.M.A.R.T. data, as sketched below.
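One common tool is the smartctl utility from the smartmontools package. The following minimal sketch shells out to smartctl and reads attribute 5 (Reallocated_Sector_Ct); the device path is an assumption, and the raw-value column format varies by vendor and firmware, so the parsing here is illustrative rather than robust.

```python
import subprocess

def reallocated_sector_count(device: str = "/dev/sda"):
    """Return the raw value of S.M.A.R.T. attribute 5, or None if absent."""
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    for line in result.stdout.splitlines():
        fields = line.split()
        if fields and fields[0] == "5":   # attribute ID 5 in the ATA table
            return int(fields[-1])        # RAW_VALUE is the last column
    return None

if __name__ == "__main__":
    print("Reallocated sectors:", reallocated_sector_count())
```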
Next, if the reallocated sector count keeps increasing, consider backing up important data. That way, even if the drive fails completely, the data can be restored from the backups. How often to back up should depend on how important the data is and on the user's needs.
SDAM: A New Data Aggregation Approach for Wireless Sensor Networks

Hang Qin, Software Engineering State Key Laboratory, Wuhan University, Wuhan 430072, China (hangqin100@)
Hui Wang, Software Engineering State Key Laboratory, Wuhan University, Wuhan 430072, China
Huaibei Zhou, International School of Software, Wuhan University, Wuhan 430072, China

Abstract: In this paper, a novel scalable scheme, Scalable Data Aggregation Monitoring (SDAM), is proposed for efficient data gathering with aggregation (fusion) in Wireless Sensor Networks (WSN). Different from existing schemes, SDAM not only optimizes the data transmission cost, but also incorporates the function of data fusion, which is crucial for emerging sensor networks with data management and high-availability requirements. Employing a randomized algorithm that allows fusion points to be chosen according to the nodes' data amount, SDAM achieves high availability for sensor nodes, which yields an optimal solution for today's system setups. Simulation results demonstrate that the SDAM scheme can reflect node status promptly and can save network throughput of the sensor nodes; therefore, the lifetime of the WSN is significantly extended.

Keywords: sensor networks, data aggregation, zone monitoring, timestamp

I. INTRODUCTION

A WSN is an accumulation of sensors interconnected by wireless communication channels. Under the control of the network, every sensor node is a small device that can collect data from the surrounding area, communicate with other nodes, and carry out computation. Long-distance communication is normally achieved in a multi-hop manner. Thanks to recent advances in remote monitoring systems, such networks are progressing rapidly and are expected to be popular in applications such as environment monitoring, intrusion detection, and earthquake warning.

Nonetheless, existing strategies miss one vital measure in the optimization space for routing correlated data, namely the data aggregation cost, which may not be negligible for certain applications. As a result, WSN monitoring of field parameters needs simple functions (for instance, average, maximum, or minimum) that remain cost-effective, while other sensor nodes may require complex operations for data aggregation. In terms of network availability for the monitoring algorithm, sensor data fusion has been shown to be of the same order as data communication. Partitioning and recombining information are demonstrated as two examples of the aggregation process in the experiments.

II. BACKGROUND AND RELATED WORK

WSN have acquired much attention because of the potential applications of the technology. Numerous data communication protocols have been proposed lately, such as EPAS (Energy-Efficient Protocol for Aggregator Selection) [1], TAG (Tiny AGgregation) [2], TTDD (Two-Tier Data Dissemination) [3], GRAB (GRAdient Broadcast) [4], and MFST (Minimum Fusion Steiner Tree) [5]. Three types of data collection in WSN are discussed. (1) Event-based data, such as intrusion detection or object tracking, are accumulated when an event occurs at a particular locale within the deployment region; the event is strengthened by detecting sensors with local agreement and presented to the control authority. (2) Global state-based data, such as temperature or humidity, are accumulated by sensor nodes all over the deployment area and transmitted to the sink. (3) Focused state-based data are accumulated in response to a query sent to selected sensors requesting relevant data.
Our interest here is in focused state-based data.

III. SYSTEM MODEL AND SDAM DESIGN

A. Problem Formulation and the Sensor Node's State Transformation

In this paper, the aggregation organizer determines the status of a node machine from the content and interval of the data packets sent out by the sensor node [6][7]; the conceivable node machine states are Down, Shift, Activate, and Pause. When the data packet of a node machine is received by the aggregating mechanism, the aggregation organizer can search a new buffer region in the shared memory area to save this node's information. Down means the aggregation organizer has not received a data packet after waiting a long time: the sensor node is shut down or faulty, and the aggregation organizer will report this status to the administrator. Shift means the sensor node has been shut down for a long time, and the aggregation organizer can remove this node machine from the shared memory area to save memory. Activate means the sensor node works normally, and the aggregation organizer can activate the node's previous data. Pause means the aggregation organizer has not received the sensor node's data packet within a certain time: the sensor node is busy, faulty, or failed, and the aggregation organizer can keep waiting for the data packet. Figure 1 depicts the model of the above states.

Figure 1. Aggregation States Model of Sensor Node

B. Finite Automata of the Sensor Node's State Transitions

To give a formal treatment of sensor network nodes, the node machine's state transformation diagram is refined in this section. In Figure 2, the start state and the terminal state are added: Ready depicts the initialization of the aggregation organizer, and double circles mark the final state of the diagram, which closes one state transformation procedure of the node machine. Thus a Deterministic Finite Automaton (DFA) [6] can be constructed from the sensor node's state transformation diagram.

Figure 2. Transition diagram of the DFA of a node via aggregation

Definition 3.1. DFA_node = (T, Q, δ, F, S). T = {p, d, s, r} is the input (transaction) set, where p denotes not having received a data packet within t_p seconds, d denotes not having received a data packet within t_d seconds, s denotes not having received a data packet within t_s seconds, and r denotes receiving one data packet. Q is the state set, Q = {R, S, D, P, A}, where R stands for Ready, S for Shift, D for Down, P for Pause, and A for Activate. δ is the state transformation function δ: Q × T → Q, given in Table 1. F is the set of final states, F = {S}, with F ⊆ Q.

Table 1. The sensor node's state transformation (aggregation) rule

| State | r | p | d | s |
|-------|---|---|---|---|
| R     | A |   |   |   |
| S     | A |   |   |   |
| D     | A |   |   | S |
| P     | A |   | D |   |
| A     | A | P |   |   |

From the above rules, the set of node machine actions can be derived; an action is a path from the start state to the failure (shutdown) state. This set can be described by the regular expression R_node = r(r + pr + pdr + pdsr)* pds. Any sentence produced by the regular expression R_node is one action of the node machine; for example, rpds depicts a failure, for some reason, after the sensor node has sent one information packet to the monitoring mechanism. Therefore, the finite automaton DFA_node and the regular expression R_node provide a formal description for the study of the sensor node's state transitions.
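A minimal sketch of the transition rules in Table 1 follows; the encoding and API are illustrative only, and the final assertion checks the paper's example action rpds.

```python
from enum import Enum

class NodeState(Enum):
    READY = "R"      # aggregation organizer initialized
    SHIFT = "S"      # node entry removed from shared memory
    DOWN = "D"       # node shut down or faulty
    PAUSE = "P"      # node busy; packet overdue by t_p seconds
    ACTIVATE = "A"   # node working normally

# delta: (state, input) -> next state, mirroring Table 1.
# Inputs: "r" = packet received; "p"/"d"/"s" = t_p/t_d/t_s timeouts expired.
DELTA = {
    (NodeState.READY, "r"): NodeState.ACTIVATE,
    (NodeState.SHIFT, "r"): NodeState.ACTIVATE,
    (NodeState.DOWN, "r"): NodeState.ACTIVATE,
    (NodeState.DOWN, "s"): NodeState.SHIFT,
    (NodeState.PAUSE, "r"): NodeState.ACTIVATE,
    (NodeState.PAUSE, "d"): NodeState.DOWN,
    (NodeState.ACTIVATE, "r"): NodeState.ACTIVATE,
    (NodeState.ACTIVATE, "p"): NodeState.PAUSE,
}

def run(word: str) -> NodeState:
    """Consume an input word such as 'rpds' and return the final state."""
    state = NodeState.READY
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state

assert run("rpds") is NodeState.SHIFT   # one complete node 'action'
```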
C. SDAM Algorithm Design

In this section, the data aggregation algorithm is proposed as follows.

Epoch 1 (sender). COUNT_packet is the number of 8K data packets, LENGTH_total is the overall length of the data to be sent, LENGTH_data is the length of the data held in one packet, MOD is the modulo operation, << is the shift-left operation, packet is one data packet, TS is a timestamp, and sequence is the serial number of the packet currently being sent. When LENGTH_total MOD LENGTH_data = 0, all the data can be held in COUNT_packet data packets:

1. LENGTH_data ← 8K - the length of the information packet header, COUNT_packet ← LENGTH_total / LENGTH_data, TS ← current system time, id ← 0;
2. if LENGTH_total MOD LENGTH_data = 0, then cover ← (1 << COUNT_packet) - 1, else cover ← (1 << (COUNT_packet + 1)) - 1;
3. if COUNT_packet = 0, then goto 7;
4. run distribute(), packet_cover ← cover, packet_id ← 1 << id, packet_timestamp ← TS, LENGTH_packet ← 8K;
5. copy LENGTH_data bytes of the data, starting at offset id × LENGTH_data, into the information packet; send the information packet;
6. id ← id + 1; if id ≠ COUNT_packet, then goto 4;
7. if LENGTH_total MOD LENGTH_data = 0, then goto 10;
8. packet_cover ← cover, packet_id ← 1 << id, packet_timestamp ← TS, LENGTH_packet ← LENGTH_total MOD LENGTH_data + the length of the information packet header;
9. copy the remaining data, starting at offset id × LENGTH_data, into the information packet; send the information packet;
10. return.

Epoch 2 (receiver). packet is an information packet received by the aggregation organizer, and TS is the timestamp at which the aggregation organizer received the previous information packet, initialized as TS ← 0. The aggregation organizer allocates a 256K buffer region to save the received data, LENGTH_data is the length of the data held in one packet, and | is the bitwise OR operation. The current packet is discarded when it is older than the previous packet:

1. if packet_timestamp < TS, goto 5;
2. copy the packet data, of length packet.LENGTH_data, into the buffer region at the position of the corresponding offset address;
3. run election(); if packet_timestamp > TS, then cover ← 0, TS ← packet_timestamp;
4. cover ← cover | packet_id;
5. if cover = packet_cover, then clean up the information in the buffer region and save it in the shared memory area;
6. return.

IV. TRANSMISSION AND PACKET SALVAGING ANALYSIS OF SDAM

The performance of this sensor monitoring system can be evaluated from two aspects: the response time and the sensor network bandwidth occupancy. In order to describe the performance of the remote monitoring system more accurately, it is modeled as follows.

Definition 4.1. Node machine performance 4-tuple PF = (F_com, V_max, TP, LT). F_com (Communication Frequency) is the bandwidth of communication, i.e., the number of monitoring requests sent from the monitoring node to the sensor nodes per second; V_max is the maximum transmission speed of the node machine's sensor network device; TP (Throughput Percentage) describes the percentage of sensor network bandwidth consumed while the node machine responds; LT (Latency Time) is the time from a monitoring request to the node machine's monitoring response. Therefore the response time LT is the sum of the sensor network communication time and the sensor node's data aggregation time: LT = T_agg + T_com.
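Before turning to throughput, here is a compact sketch of the two epochs above. The packet representation and the header size are illustrative assumptions; the 8K packet size and the cover bitmask follow the algorithm as stated.

```python
PACKET_SIZE = 8 * 1024                # the paper's 8K packets
HEADER_SIZE = 64                      # assumed header length (illustrative)
DATA_LEN = PACKET_SIZE - HEADER_SIZE  # LENGTH_data

def fragment(data: bytes, ts: float):
    """Epoch 1: split data into packets carrying a cover bitmask."""
    full, rem = divmod(len(data), DATA_LEN)
    count = full + (1 if rem else 0)
    cover = (1 << count) - 1          # one bit per expected packet
    for i in range(count):
        chunk = data[i * DATA_LEN:(i + 1) * DATA_LEN]
        yield {"cover": cover, "id": 1 << i, "ts": ts, "off": i, "data": chunk}

class Reassembler:
    """Epoch 2: rebuild the newest message from its fragments."""
    def __init__(self):
        self.ts, self.cover, self.parts = 0.0, 0, {}

    def receive(self, pkt):
        if pkt["ts"] < self.ts:       # older than current message: discard
            return None
        if pkt["ts"] > self.ts:       # newer message: reset the buffer
            self.ts, self.cover, self.parts = pkt["ts"], 0, {}
        self.parts[pkt["off"]] = pkt["data"]
        self.cover |= pkt["id"]
        if self.cover == pkt["cover"]:  # all fragments present: deliver
            return b"".join(self.parts[i] for i in sorted(self.parts))
        return None
```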
Sensor network throughput TP is closely related to the communication quantity, the communication frequency, and the maximum sensor network transmission rate between the sensor nodes and the monitoring node, as shown in formula (1), where S_cmd and S_data are the sizes of the command packet and the information packet, respectively:

$$ TP = \frac{F_{com}\,(S_{cmd} + S_{data})}{V_{max}} \times 100\% \quad (1) $$

For example (illustrative numbers), with F_com = 10 requests per second, S_cmd + S_data = 8 KB, and V_max = 1 MB/s, TP is roughly 7.8%.

Definition 4.2. Sensor aggregation monitoring performance model SDAM_PF = (M, E, N, ATP, ALT). M (Medium) represents the wireless network medium on which the remote monitoring system runs, M ∈ {L, X, P}: L (WLAN) is the wireless local area network, X (WiMAX) is the wireless metropolitan area network or the Internet, and P (WPAN) is the wireless personal area network. E (Environment) describes the hardware and software configuration environment of the whole sensor network. The hardware environment includes the configuration of the sensor nodes, the monitoring nodes, and the wireless network connectivity, such as the number of sensor nodes, the transmission rate of wireless network communication, and the capacity of the data storage devices; the software environment includes the version of the operating system, the communication protocol, and so on. N (Nodes) is the set of performance 4-tuples of the monitored sensor nodes, N = {x | x = PF}. ATP (Average Throughput Percentage) defines the network bandwidth percentage of the monitoring system in the whole wireless sensor network. ALT (Average Latency Time) describes the average response time of the sensor nodes.

$$ ATP = \frac{\sum_{n \in N} TP(n)}{|N|} = \frac{1}{|N|} \sum_{n \in N} \frac{F_{com}(n)\,(S_{cmd} + S_{data})}{V_{max}(n)} \times 100\% \quad (2) $$

$$ ALT = \frac{\sum_{n \in N} LT(n)}{|N|} = \frac{1}{|N|} \sum_{n \in N} \left( T_{com}(n) + T_{agg}(n) \right) \quad (3) $$

As far as the average latency time ALT is concerned, it depends on N and on T_com and T_agg at every sensor node. With an increasing number of sensor nodes |N|, T_com at every node becomes larger and larger, because the monitoring node must process more information packets from the sensor nodes, while the sensor nodes' processing time T_agg is invariant; with an increasing number of sensor nodes, the average response time will decrease. As for the percentage ATP of the monitoring system in the whole wireless sensor network, it is closely related to the number of sensor nodes |N|, the sizes of the command packet S_cmd and the information packet S_data, the maximum transmission speed V_max of the wireless network device, and the communication frequency F_com between the monitoring node and the sensor nodes. When the number of sensor nodes increases, ATP is practically invariant even though the response time at the sensor nodes grows, because S_cmd, S_data, V_max, and F_com are invariant. At the same time, ATP remains practically invariant when the overload of the sensor node system grows: since S_cmd and S_data are invariant, the change of ATP depends on F_com and V_max.

V. SIMULATION STUDY

To verify the algorithm introduced above, a simulation program was run, with the following results. Suppose the surveillance region covers 2500 square meters and is square in shape. Comparing the results of the two groups shown in Figure 3 and Figure 4, the overhead of the WSN increases slowly; moreover, if the probability of edge failure increases, the overhead decreases instead.

When the system overload grows as the sensor nodes increase, the processing time T_agg becomes larger, and T_com also becomes larger if these overloads are concentrated on the wireless network communication; otherwise T_com varies only a little as the sensor node overload increases.
Therefore the average response time becomes larger when the system overload increases.

Figure 3. Data aggregation and latency (latency time in ms versus the number of sensor nodes).

Figure 4. Transmission cost and latency (latency time in ms versus packet count).

Obviously, ALT increases as F_com grows and as V_max shrinks. This provides the administrator with an approach to reduce ALT when the wireless network overload is increasing: we can not only configure the sensor network monitoring system to increase the information collection interval and thereby decrease F_com, but also deploy faster sensor network devices to increase V_max.

VI. CONCLUSION

In this paper, a scalable data fusion server based on a WSN architecture is proposed. Techniques such as a parallel data storing and distribution policy and an adaptive timestamp data aggregation algorithm are discussed extensively. Simulation results based on an existing data aggregation monitoring system demonstrate the expected advantages, and the results will lead to further evaluation of these techniques later. In the future, the impacts on the performance of the data fusion system will be explored in detail, especially those induced by the reduction of communication operations. The principle of a dynamic transmitting rate at the aggregator to reduce user-perceived latency also needs further in-depth study.

REFERENCES

[1] Yuanzhu Peter Chen and Arthur L. Liestman, "A Hierarchical Energy-Efficient Framework for Data Aggregation in Wireless Sensor Networks", IEEE Transactions on Vehicular Technology, vol. 55, no. 3, May 2006, pp. 789-796.
[2] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong, "TAG: A tiny aggregation service for ad-hoc sensor networks", in Proc. 5th Symp. OSDI, Boston, MA, Dec. 2002.
[3] F. Ye, H. Luo, J. Cheng, S. Lu, and L. Zhang, "A two-tier data dissemination model for large-scale wireless sensor networks", in Proc. MobiCom, Atlanta, GA, Sep. 2002.
[4] F. Ye, G. Zhong, S. Lu, and L. Zhang, "Robust data delivery protocol for large scale sensor networks", in Proc. IEEE Int. Workshop IPSN, Palo Alto, CA, Apr. 2003.
[5] Hong Luo, Yonghe Liu, and Sajal K. Das, "Routing Correlated Data with Fusion Cost in Wireless Sensor Networks", IEEE Transactions on Mobile Computing, vol. 5, no. 11, Nov. 2006, pp. 1620-1632.
[6] Harry R. Lewis and Christos H. Papadimitriou, Elements of the Theory of Computation, 2nd ed., Prentice Hall, 2006, pp. 34-39.
[7] Alberto Cerpa and Deborah Estrin, "ASCENT: Adaptive Self-Configuring Sensor Networks Topologies", in Proc. 21st Joint Conf. of the IEEE Computer and Communications Societies (INFOCOM), New York, NY, June 2002.