Creating the DISEQuA Corpus: a Test Set for Multilingual Question Answering

Bernardo Magnini*, Simone Romagnoli*, Alessandro Vallin*
Jesús Herrera**, Anselmo Peñas**, Víctor Peinado**, Felisa Verdejo**
Maarten de Rijke***

* ITC-irst, Centro per la Ricerca Scientifica e Tecnologica
Via Sommarive, 38050 Povo (TN), Italy.
{magnini,romagnoli,vallin}@itc.it

** UNED, Spanish Distance Learning University, Dpto. Lenguajes y Sistemas Informaticos
Ciudad Universitaria, c./Juan del Rosal 16, 28040 Madrid, Spain.
{anselmo,felisa,jesus.herrera,victor}@lsi.uned.es

*** Language and Inference Technology Group, ILLC, University of Amsterdam
Nieuwe Achtergracht 166, 1018 WV Amsterdam, The Netherlands.
mdr@science.uva.nl

Abstract. This paper describes the procedure adopted by the three coordinators of the CLEF 2003 question answering track (ITC-irst, UNED and ILLC) to create the question set for the monolingual tasks. Despite the limited resources available, the three groups collaborated to formulate and verify a large pool of original questions posed in three different languages: Dutch, Italian and Spanish. A part of these queries was translated into English and shared among the three coordination groups. A second cross-verification was then conducted, in order to extract the queries that had an answer in all three monolingual document collections. The result of these joint efforts was the DISEQuA (Dutch Italian Spanish English Questions and Answers) corpus, a useful and reusable resource that is freely available to the research community. The article reports on the different stages of the corpus creation, from the monolingual kernels to the multilingual extension.

1 Introduction

The question answering (QA) track at CLEF 2003, building on the experience accumulated during the past TREC campaigns, focused on the evaluation of QA systems created for non-English European languages, and consequently promoted both monolingual (Dutch, Italian and Spanish) and cross-language tasks. Cross-linguality was a necessary step to push participants into designing systems that can find answers in languages different from the source language of the queries, which mirrors a possible scenario of future applications.

The document collections were those used at CLEF 2002, i.e. articles drawn from newspapers and news agencies of the years 1994 (Dutch, Italian, Spanish) and 1995 (Dutch). Nevertheless, as coordinators of the monolingual tasks, we first needed to create a corpus of questions with related answers for the evaluation exercise, i.e. a replicable gold standard.

According to the CLEF QA guidelines, which are based on those of previous TREC campaigns, the question set released to participants should be made up of simple, mostly short, straightforward and 'factoid' queries. Systems should process questions that sound naturally spontaneous, and a good, realistic question set should consist of questions arising from a real desire to know something about a particular event or situation. We could have extracted our questions directly from the document collection, simply turning assertive statements into interrogative ones. Such a procedure would have been quick and pragmatic, but it would have undermined the original intention of the QA track, which is to evaluate systems' performance in finding possible answers to open domain questions, independently of the target document collection used.
Drawing the queries from the corpus itself would have influenced our choice of topics and words, as well as the syntactic formulation of the questions.

The coordinators of the TREC 2002 QA track obtained their 500-question corpus from the question logs of WWW search engines (such as the MSN portal). They extracted a thousand queries that matched certain patterns from the millions of questions registered in the logs, and then, after correcting linguistic errors, searched for the answers in a 3 GB corpus. Similarly, the organizers of the TREC-8 QA track (held in 1999) drew one hundred of the 200 final questions from a pool of 1,500 candidate questions contained in the FAQFinder logs [3].

This strategy leads to a well-formed corpus of questions and answers, but it requires substantial resources: many native speakers involved in the verification of the questions, a huge document collection, access to the logs of search engine companies and, last but not least, a considerable amount of time. We could take advantage neither of question logs nor of a corpus big enough to enable the extraction of any kind of answer. In order to cope with this lack of resources, we conceived an alternative approach to the creation of the QA corpus, trying to preserve spontaneity of formulation and independence from the document collection.

The monolingual tasks of the CLEF 2003 QA track required a test set of 200 fact-based questions. Our goal was to collect a heterogeneous set of queries that would represent an extensive range of subjects and find their related answers in three different corpora. The creation of the three test sets constituted the first step toward the generation of a multilingual corpus of questions and answers, whose entries are written in four languages, with the related responses that the assessors extracted from each monolingual document collection during the verification phase.

Our activity can be roughly divided into four steps:

1. Formulation of a pool of 200 candidate questions with their answers in each language;
2. Selection of 150 questions from each monolingual set and their translation into English in order to share them with the other groups;
3. Second translation and further processing of each shared question in two different document collections;
4. Data merging and final construction of the DISEQuA corpus.

2 Question Generation

The corpora addressed by the questions for the monolingual tasks were three collections of newspaper and news agency documents released in 1994 and 1995, written in Dutch, Italian and Spanish respectively. We used the document collections licensed by the Cross Language Evaluation Forum. These articles constituted a heterogeneous, open domain text collection. Each article had a unique identifier, i.e. a DOCID number, which participants' systems had to return together with the answer string in order to prove that their responses were supported by the text. The Italian collection consisted of about 27 million words (200 MB) drawn from the newspaper La Stampa and the Swiss-Italian SDA press agency. The Spanish corpus contained more than 200,000 international news items from the EFE press agency, published during the year 1994.
The Dutch collection was the CLEF 2002 Dutch collection, which consists of the 1994 and 1995 editions of Algemeen Dagblad and NRC Handelsblad (about 200,000 documents, or 540 MB).

<DOC>
<DOCNO>EFE19940101-00001</DOCNO>
<DOCID>EFE19940101-00001</DOCID>
<DATE>19940101</DATE>
<TIME>00.28</TIME>
<SCATE>POX</SCATE>
<FICHEROS>94F.JPG</FICHEROS>
<DESTINO>ICX EXG</DESTINO>
<CATEGORY>POLITICA</CATEGORY>
<CLAVE>DP2403</CLAVE>
<NUM>736</NUM>
<TITLE> GUINEA-OBIANG PRESIDENTE SUGIERE RECHAZARA AYUDA EXTERIOR CONDICIONADA</TITLE>
<TEXT> Malabo, 31 dic (EFE).- El presidente de Guinea Ecuatorial, Teodoro Obiang Nguema, sugirió hoy, viernes, que su Gobierno podría rechazar la ayuda internacional que recibe si ésta se condiciona a que en el país haya "convulsiones políticas". En su discurso de fin de año, [......] conceptos de libertad, seguridad ciudadana y desarrollo económico y social. EFE DN/FMR 01/01/00-28/94</TEXT>
</DOC>

Fig. 1. Format of the target document collection (example drawn from the Spanish corpus)

The textual contents of the Spanish collection, as shown in figure 1, were not tagged in any way. The text sections of the Italian corpus, on the contrary, had been annotated according to the NIST guidelines with named entity tags such as <PERSON>, <LOCATION> and <AUTHOR>. The Dutch collections were formatted similarly.
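The collections are stored as concatenated SGML-style records rather than well-formed XML. As an illustration of how such a collection could be processed, the following minimal Python sketch (our addition, not part of the original track infrastructure; the file name, encoding and regex-based field handling are all assumptions) indexes the TEXT field of each record by its DOCID, the identifier systems had to return to support an answer.

import re

# Minimal sketch (hypothetical helper, not from the paper): index the TEXT
# field of each document by its DOCID so that an answer string can be traced
# back to a supporting document.
DOC_RE = re.compile(r"<DOC>(.*?)</DOC>", re.DOTALL)
FIELD_RE = re.compile(r"<(DOCID|TEXT)>(.*?)</\1>", re.DOTALL)

def index_collection(path):
    """Return a dict mapping each DOCID to the raw text of its TEXT field."""
    with open(path, encoding="latin-1") as f:  # encoding is an assumption
        raw = f.read()
    index = {}
    for doc in DOC_RE.finditer(raw):
        fields = dict(FIELD_RE.findall(doc.group(1)))
        index[fields["DOCID"].strip()] = fields.get("TEXT", "").strip()
    return index

if __name__ == "__main__":
    docs = index_collection("efe1994.sgml")  # hypothetical file name
    print(len(docs), "documents indexed")

A regular-expression scan is used deliberately instead of an XML parser, since records like the one in figure 1 carry no XML declaration and are not guaranteed to be well formed.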
Given these three corpora, our final goal was to formulate a set of 180 fact-based questions shared by all three monolingual QA tasks. The intention of having the same queries in all the tasks was motivated by the need to compare the systems' performance in different languages. Since the track was divided into several tasks and most of the participants took part in just one of them, the use of the same test set, though translated into different languages, would allow us to compare the accuracy of different runs. Besides the 180 shared queries, we planned to include in each test set 20 questions with no answer in the corpora (the so-called NIL questions).

2.1 From Topics to Keywords

The key element that guided our activity through the first phase of question generation was the CLEF collection of topics. If we had asked people to generate questions without any restraint, we would probably have obtained just a few queries usable for our purpose. Moreover, it would have been even more difficult to ask them to focus only on events that occurred in 1994 or 1995, which is the time coverage of the articles in our text collections. Besides, we noticed that the mental process of conceiving fact-based questions without any topic details could take a considerable amount of time: asking good questions can be as difficult as giving consistent answers. In order to cope with these drawbacks, to improve the relevance of the queries and to reduce the time necessary for their generation, we decided to use some CLEF topics.

Topics, which can be defined as "original user requests" [1], represent a resource developed for many NLP applications, including question answering. The team that generated the CLEF topics wanted to create a set of real-life subjects that would match the contents of the document collections. The main international political, social, cultural, economic, scientific and sporting issues and events of 1994 and 1995 were included, and topics were written in an SGML style, with three textual fields, as in figure 2.

<top>
<num> C001
<I-title>Architettura a Berlino
<I-desc>Trova documenti che riguardano l'architettura a Berlino.
<I-narr>I documenti rilevanti parlano, in generale, degli aspetti architettonici di Berlino o, in particolare, della ricostruzione di alcune parti della città dopo la caduta del Muro.
</top>

Fig. 2. An Italian topic released by CLEF in the year 2000 (translation in footnote 1)

1 <I-title>Architecture in Berlin <I-desc>Retrieve documents that concern architecture in Berlin. <I-narr>Generally speaking, the relevant documents deal with the architectural features of Berlin or, particularly, with the reconstruction of some parts of the city after the knocking down of the Wall.

The title field sketches straightforwardly the main content of the topic, the description field mirrors the needs of a potential user, presenting a more precise formulation in one sentence, and the narrative field gives more information concerning relevance.

In the very first experiment ITC-irst carried out to generate its question set, two volunteers were provided with three CLEF topics structured as above and asked to produce ten queries for each one. It took about forty-five minutes to complete the task, and it was immediately noticed that the questions were too closely related to the source topics. This pilot experiment thus exposed the weaknesses of the strategy, which would lead to overspecified questions, and underlined the need to improve the stimulating power of the topics by reducing their specificity without losing relevance to the corpus.

The simplest way to expand the structure of the topics and widen the scope of activity for the people in charge of question generation seemed to be to extract manually from each topic a series of relevant keywords that would replace the topics themselves. No particularly detailed instructions were given in that phase: we simply isolated the most semantically relevant words. A keyword can be defined as an independent, unambiguous and precise element that is meant to arouse interest and stimulate questions about a specific issue. We also inferred keywords that were not explicitly present in the topic, assuming that even external knowledge, as long as it related to the topic, could help to formulate pertinent questions. The ITC-irst coordinators took into consideration the topics developed by CLEF in the years 2000, 2001 and 2002. Three people were involved in the extraction of keywords, which were appended to each topic in the form of a 'signature', as the tag in the following example testifies. So the topic entitled "Architecture in Berlin" (shown in figure 2) was converted into a list of words that could even appear unrelated to each other:

<IT-tsig>architettura, Berlino, documenti, aspetti architettonici, ricostruzione, città, caduta del Muro, Muro</IT-tsig>2

2 Architecture, Berlin, documents, architectural aspects, reconstruction, city, knocking down of the Wall, Wall.
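For illustration, a topic file of this shape, with the appended signature, can be read with a few lines of Python. This is a hypothetical helper (the extraction of the keywords themselves was done by hand, as described above); the field names follow figure 2 and the <IT-tsig> example.

import re

# Minimal sketch (our illustration): pull the textual fields and the appended
# keyword signature out of one SGML-style topic block.
TOPIC_FIELDS = re.compile(
    r"<(num|I-title|I-desc|I-narr|IT-tsig)>(.*?)(?=<|$)", re.DOTALL)

def parse_topic(block):
    """Return a dict of topic fields; the signature becomes a keyword list."""
    fields = {tag: text.strip() for tag, text in TOPIC_FIELDS.findall(block)}
    if "IT-tsig" in fields:
        fields["IT-tsig"] = [kw.strip() for kw in fields["IT-tsig"].split(",")]
    return fields

# e.g. topic = open("topic_C001.txt", encoding="utf-8").read()  # hypothetical
#      print(parse_topic(topic)["I-title"], parse_topic(topic)["IT-tsig"])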
It is interesting to notice that the keywords, even though they originated from the topics, allowed a certain detachment from the restricted coverage of the topics themselves, without losing the relation with the important issues of the years 1994 and 1995 that constituted the core of the document collection. The experiment was therefore repeated, and much better results were achieved in terms of the variety and generality of the queries: the people who were given the keywords instead of the topics had more freedom to range over a series of concepts, without any restraint or condition of adherence to a single specific and detailed issue. Though the proximity of correlated keywords led to the generation of similar queries, this strategy was definitely adopted.

The CLEF topics also had a pivotal role in the generation of the Spanish and Dutch queries. As preparatory work, the Spanish UNED NLP group studied the test set used at TREC 2002 and tried to draw some conclusions in terms of question formulation style and the casuistry necessary to find the answer. Then four people were given the CLEF topics of the years 2000, 2001 and 2002 (but no keywords) with the task of producing 200 short, fact-based queries. The Dutch LIT group adopted the same strategy in its preparation. TREC QA topics (1-1893) were translated into Dutch, and old CLEF retrieval topics (1-140) were used to generate Dutch example questions, usually around 3 per topic.

2.2 From Keywords to Questions

Before generating the queries, the three groups agreed on common guidelines that would help to formulate a good and useful test set. Following the model of past TREC campaigns, and particularly of the TREC 2002 QA track, a series of basic instructions was formulated.

Firstly, questions should be fact-based and, if possible, they should address events that occurred in the years 1994 or 1995. When a precise reference to these two years was lacking in a question, it had to be borne in mind that systems would use a document collection of those years. No particular restraints were imposed on the length and syntactic form of the queries, but the coordinators kept them simple.

Secondly, questions should ask for an entity (i.e. a person, a location, a date, a measure or a concrete object), avoiding subjective opinions or explanations. "Why-questions" were therefore not allowed: queries like "Why does Bush want to attack Iraq?" or "Who is the most important Italian politician of the twentieth century?" could not be accepted. Since the TREC 2002 question set constituted a good term of comparison, and it did not include any definition question of the form "Who/What is X?", it was decided to avoid them as well.

Thirdly, the coordinators agreed that multiple-item questions, like those used in the TREC list task, should be avoided. If the community is interested in processing list questions, we could propose them in next year's track, possibly together with definition queries. As a pilot evaluation exercise, we did not want to introduce too many difficulties that could have discouraged potential participants. Similarly, the people in charge of question generation could not formulate 'double queries', in which a second, indirect question is subsumed within the main one (for instance, "Who is the president of the poorest country in the world?"). Finally, closed questions, known as yes/no questions, should be left out, too; a sketch of how such constraints might be screened automatically is given below.
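The three groups applied these guidelines by hand, but constraints of this kind lend themselves to a crude automatic screen. The following sketch is our illustration, not part of the original procedure: the patterns are deliberately over-eager heuristics whose only job would be to route suspect candidates to a human reviewer, never to decide on their own.

import re

# Hypothetical screening helper: flag candidate questions that appear to
# violate the generation guidelines listed above.
RULES = [
    (re.compile(r"^\s*why\b", re.I), "why-question"),
    (re.compile(r"^\s*(who|what)\s+(is|was)\s+[A-Z][\w' -]*\?\s*$"),
     "possible definition question"),
    (re.compile(r"^\s*(is|are|was|were|do|does|did|can|has|have)\b", re.I),
     "yes/no question"),
    (re.compile(r"^\s*(list|name)\b", re.I), "multiple-item question"),
]

def violations(question):
    """Return the labels of all guideline rules a candidate appears to break."""
    return [label for pattern, label in RULES if pattern.search(question)]

print(violations("Why does Bush want to attack Iraq?"))        # ['why-question']
print(violations("How many inhabitants are there in Berlin?")) # []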
Queries should be related to the topics or to the keywords extracted from the topics, without any particular restraint on word choice. It was not necessary to know the answer before formulating a question: on the contrary, assessors had to stay as close as possible to the information they found in the document collection, since prior knowledge of the answer could influence the search in the corpus.

Given these instructions, thirty people at ITC-irst were provided with two sets of keywords (extracted from two topics) and were asked to generate ten questions for each one. In this way, a large pool of 600 candidate queries was created. The examples shown in figure 3 demonstrate that the keywords extended the limited scope of the topic "Architecture in Berlin", allowing people to pose questions related to history or even politics. Some questions, such as numbers 5 and 9, lost their connection with the original form of the topic, introducing the name of a famous architect and asking for the number of inhabitants rather than focusing on the architectural features of the city. By adopting this strategy, we could preserve a certain adherence to the original content of the topic while introducing some new elements. Inevitably, as a side effect, a number of queries turned out to be useless because they were almost unrelated to the keywords or badly formulated.

<num>C001</num>
<keyword> architettura, Berlino, documenti, aspetti architettonici, ricostruzione, città, caduta del Muro, Muro </keyword>
<question n=1> Quando e' caduto il muro di Berlino? </question>
<question n=2> Chi ha costruito il Muro di Berlino? </question>
<question n=3> Quanto era lungo il muro di Berlino? </question>
<question n=4> Qual e' la piazza piu' importante di Berlino? </question>
<question n=5> Qual e' la professione di Renzo Piano? </question>
<question n=6> Quando e' stato costruito il muro di Berlino? </question>
<question n=7> Quando e' che Berlino e' ritornata ad essere capitale? </question>
<question n=8> Dove si trova Berlino? </question>
<question n=9> Quanti abitanti ha Berlino? </question>
<question n=10> Che cosa divideva il muro di Berlino? </question>

Fig. 3. Questions generated from a list of keywords (translation in footnote 3)

3 <question n=1> When did the Berlin Wall fall? </question> <question n=2> Who built the Berlin Wall? </question> <question n=3> How long was the Berlin Wall? </question> <question n=4> Which is the most important square in Berlin? </question> <question n=5> What is Renzo Piano's job? </question> <question n=6> When was the Berlin Wall built? </question> <question n=7> When did Berlin become the capital again? </question> <question n=8> Where is Berlin? </question> <question n=9> How many inhabitants are there in Berlin? </question> <question n=10> What did the Berlin Wall divide? </question>

In spite of the generation guidelines established before producing the candidate questions, some inconsistencies persisted. For instance, question 4 asks for a personal opinion rather than a fact-based datum: it is not clear how the importance of a place could be objectively measured. Similarly, question 7 deals with events that occurred later than 1994: although the German government took the decision in 1991, Berlin officially became the capital city in 1999.

2.3 Questions Verification

Once the candidate questions had been collected, it was necessary to verify whether they had an answer in the target document collection. This phase constituted the actual manual construction of the replicable gold standard for the CLEF QA track: systems would later process the questions automatically.

ITC-irst involved three native Italian speakers in this work. In order to cope with the large number of candidate questions, and with the possibility that many of them did not comply with the generation guidelines and could not be used for the QA track, the queries were arranged into three categories and each question was classified: list A contained the queries that respected the generation guidelines and whose answer was intuitively known; list B contained the relevant questions that, in the assessors' opinion, had a more difficult answer; and list C contained those that were badly formulated or did not respect the guidelines.
As expected, list B was the largest one, including 354 questions. At the end of the question verification phase, a total of 480 questions had been processed manually, and the remaining 120, most of them from list C, were eliminated.

Browsing a document collection in search of answers can be a very exhausting activity without a tool that facilitates the detection of the relevant strings. Fortunately, ITC-irst had a concordancer available4 that allowed the three assessors to make selective searches within the corpus, to find the correct answers and to trace them back to the docid, i.e. the unique identifier, of the document that supported each answer. The common strategy employed by the assessors was to type parts of the query, or parts of the known answer, into the concordancer, and then browse the most relevant documents retrieved by the software in search of a text snippet that justified and supported the correct answer. The Dutch group developed a small number of grep-based shell scripts for the same purpose.

4 The "Toolbox for Lexicographers" developed by Claudio Giuliano.
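A minimal version of such a search, in the spirit of those grep-based scripts, might look as follows. This is an assumption about what the scripts did, not a reproduction of them; index_collection is the hypothetical indexer sketched earlier in this section.

import re

# Minimal concordance sketch (our illustration): print each DOCID whose text
# contains the search string, with a window of context around every match.
def concordance(index, needle, window=60):
    for docid, text in index.items():
        for m in re.finditer(re.escape(needle), text, re.IGNORECASE):
            left = text[max(0, m.start() - window):m.start()]
            right = text[m.end():m.end() + window]
            print(f"{docid}: ...{left}[{m.group()}]{right}...")

# e.g. concordance(index_collection("efe1994.sgml"), "muro de Berlín")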
The problem of structuring the data and finding a sensible format to describe both questions and answers arose during this first phase of the creation of DISEQuA. The issue was addressed by conceiving an XML syntax that records the number of each question, the keyword set (or topic) from which it was generated, the person who verified it in the document collection and the type of entity it refers to. Similarly, the answers found for each question needed to be numbered, and the docid of the document that supported each response had to be logged. The adoption of a precise format solved the problem of losing track of the changes that each question could undergo, since new tags could be added to give more information. Secondly, structured data can easily be browsed and analyzed: for instance, the tag used to indicate the question type proved to be quite useful in balancing the test set. Thirdly, a common format for questions and answers was necessary to share them between the three groups that put together the DISEQuA corpus.

Figure 4 shows an example drawn from the Italian question set: the attribute 'cnt' indicates the number assigned to the question, and 'assessor' identifies the person who processed the query, which seemed important in case of inconsistencies. The attribute 'origin' gives the name of the file containing the keywords extracted from a single topic, while the attribute 'type' describes the category to which the answer belongs. Seven question types were considered: PERSON, LOCATION, MEASURE, DATE, ORGANIZATION, OBJECT (i.e. concrete things) and OTHER (when the response could not be labeled with one precise type). The aim was to create a well-balanced test set, with a good coverage of all these categories.

<qa>
<question cnt="42" assessor="ALE" origin="keyword_C001.txt" type="MEASURE">Quanti abitanti ha Berlino?</question>
<answer n="1" idx="SDA19940804.00147">3,5 milioni</answer>
</qa>

Fig. 4. Format of the verified questions (see question 9 in figure 3)

Likewise, the attribute 'n' in the tag <answer> represents a progressive number of responses: a single query could in fact have several correct answers in the same document collection. Dates and numbers in particular change across different news reports of the same event. Sometimes earlier news items in the collection are less precise than later ones, because they register a process that changes over a period of time. Since systems were expected to give an answer supported by a single document, and not the final or best answer in the whole corpus, such questions had many correct responses. The attribute 'idx' gives the docid identifier of the document in which each answer appears. Systems should return the docid as a justification of the answer; in strict evaluation, unsupported responses were considered incorrect.

When no answer was found in the target corpus, the answer attributes 'n' and 'idx' were labeled with 0 (zero), and the answer string was replaced by the string "NIL". Queries with no answer were not eliminated: on the contrary, twenty NIL questions were included in the final version of each monolingual test set to evaluate systems' accuracy in recognizing that there was no response.

Sometimes the responsiveness of the retrieved string was doubtful and the assessors could not decide whether it was acceptable. These cases required deeper analysis and agreement between different assessors. In order to signal the doubts that emerged during the verification phase, a "star" character (*) was put before the uncertain answers, and a significant remark justifying the uncertainty was appended to the question within the tag <rem>, as in the following example (see question 10 in figure 3):

<question n=5 origin=keyword_C001 type=LOCATION>Che cosa divideva il muro di Berlino?</question>
*<answer n="1" idx="LASTAMPA19941016.00038">Germania</answer>
*<answer n="2" idx="LASTAMPA19941016.00038">mondo</answer>
<rem>"Un evento inatteso, spettacolare, emozionante: sotto gli occhi del mondo cade il Muro di Berlino, simbolo materiale della divisione della Germania e del mondo."</rem>5

5 *<answer n="1"> Germany *<answer n="2"> the world <rem>"An unexpected, spectacular and exciting event: the eyes of the world are on the Berlin Wall that is falling, a concrete symbol of Germany's and the world's division."</rem>

A cut-and-pasted text snippet found in the document collection was usually placed in the tag <rem>, so that another assessor could take a decision without opening the corpus again in search of the necessary contextual information. In the example above, it was not clear whether the retrieved answers, which are metaphorical, could be accepted (the Berlin Wall actually isolated West Berlin from the German Democratic Republic), so the first assessor who processed the question left the response undetermined. If a second assessor could not take a decision, the question was passed to a third person, who normally resolved the doubts. Alternatively, badly formulated questions could be slightly modified in order to match the retrieved answer.
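A format of this kind is straightforward to process. The sketch below is our illustration, not part of the paper: it assumes the <qa> entries are wrapped in a single root element, and that working notations such as the leading star on uncertain answers have been normalized away (a bare * before a tag would make the file ill-formed XML). It tallies the question types and counts the NIL entries, the two quantities most relevant to balancing a test set as described above.

import xml.etree.ElementTree as ET
from collections import Counter

# Minimal sketch (assumed file layout and hypothetical file name): summarize
# a DISEQuA-style question file by type distribution and NIL count.
def summarize(path):
    """Return (Counter of question types, number of NIL answers)."""
    root = ET.parse(path).getroot()
    types = Counter(q.get("type") for q in root.iter("question"))
    nil = sum(1 for a in root.iter("answer") if (a.text or "").strip() == "NIL")
    return types, nil

# e.g. types, nil = summarize("disequa_it.xml")  # hypothetical file name
#      print(types.most_common(), nil, "NIL questions")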
Creating the DISEQuA Corpus:
a Test Set for Multilingual Question Answering

Bernardo Magnini*, Simone Romagnoli*, Alessandro Vallin*
Jesús Herrera**, Anselmo Peñas**, Víctor Peinado**, Felisa Verdejo**
Maarten de Rijke***

* ITC-irst, Centro per la Ricerca Scientifica e Tecnologica
Via Sommarive, 38050 Povo (TN), Italy.
{magnini,romagnoli,vallin}@itc.it
** UNED, Spanish Distance Learning University, Dpto. Lenguajes y Sistemas Informáticos
Ciudad Universitaria, c./Juan del Rosal 16, 28040 Madrid, Spain.
{anselmo,felisa,jesus.herrera,victor}@lsi.uned.es
*** Language and Inference Technology Group, ILLC, University of Amsterdam
Nieuwe Achtergracht 166, 1018 WV Amsterdam, The Netherlands.
mdr@science.uva.nl

Abstract. This paper describes the procedure adopted by the three coordinators of the CLEF 2003 question answering track (ITC-irst, UNED and ILLC) to create the question set for the monolingual tasks. Despite the limited resources available, the three groups collaborated to formulate and verify a large pool of original questions posed in three different languages: Dutch, Italian and Spanish. Part of these queries was translated into English and shared among the three coordination groups. A second cross-verification was then conducted, in order to extract the queries that had an answer in all three monolingual document collections. The result of these joint efforts was the creation of the DISEQuA (Dutch Italian Spanish English Questions and Answers) corpus, a useful and reusable resource that is freely available to the research community. The article reports on the different stages of the corpus creation, from the monolingual kernels to the multilingual extension.

1 Introduction

The question answering (QA) track at CLEF 2003, building on the experience accumulated during past TREC campaigns, focused on the evaluation of QA systems created for non-English European languages and consequently promoted both monolingual (Dutch, Italian and Spanish) and cross-language tasks. Cross-linguality was a necessary step to push participants toward designing systems that can find answers in languages other than the source language of the queries, mirroring a possible scenario of future applications.

The document collections were those used at CLEF 2002, i.e. articles drawn from newspapers and news agencies of the years 1994 (Dutch, Italian, Spanish) and 1995 (Dutch). Nevertheless, as coordinators of the monolingual tasks, we first needed to create a corpus of questions with related answers for the evaluation exercise, i.e. a replicable gold standard.

According to the CLEF QA guidelines, which are based on those of past TREC campaigns, the question set released to participants should be made up of simple, mostly short, straightforward and 'factoid' queries. Systems should process questions that sound naturally spontaneous, and a good, realistic question set should consist of questions arising from a real desire to know something about a particular event or situation. Admittedly, we could have extracted our questions directly from the document collection, simply turning assertive statements into interrogative ones. Such a procedure would have been quick and pragmatic, but it would have undermined the original intention of the QA track, which is to evaluate systems' performance in finding answers to open-domain questions independently of the target document collection used.
Drawing the queries from the corpus itself would have influenced our choice of topics and words, as well as the syntactic formulation of the questions.

The coordinators of the TREC 2002 QA track obtained their corpus of 500 questions from the question logs of WWW search engines (like the MSN portal). They extracted a thousand queries that matched certain patterns from the millions of questions registered in the logs, and then, after correcting linguistic errors, searched for the answers in a 3 GB corpus. Similarly, the organizers of the TREC-8 QA track (held in 1999) drew one hundred of the 200 final questions from a pool of 1,500 candidate questions contained in the FAQFinder logs [3].

This strategy leads to a well-formed corpus of questions and answers, but it requires substantial resources: many native speakers involved in the verification of the questions, a huge document collection, access to logs obtained from search engine companies and, last but not least, a considerable amount of time. We could take advantage neither of question logs nor of a corpus big enough to enable the extraction of any kind of answer. In order to cope with this lack of resources, we conceived an alternative approach to QA corpus creation, trying to preserve spontaneity of formulation and independence from the document collection.

The monolingual tasks of the CLEF 2003 QA track required a test set of 200 fact-based questions. Our goal was to collect a heterogeneous set of queries that would represent an extensive range of subjects and find their related answers in three different corpora. The creation of the three test sets constituted the first step toward the generation of a multilingual corpus of questions and answers, whose entries are written in four languages, with the related responses that the assessors extracted from each monolingual document collection during the verification phase.

Our activity can be roughly divided into four steps:

1. Formulation of a pool of 200 candidate questions with their answers in each language;
2. Selection of 150 questions from each monolingual set and their translation into English in order to share them with the other groups;
3. Second translation and further processing of each shared question in two different document collections;
4. Data merging and final construction of the DISEQuA corpus.

2 Question Generation

The corpora addressed by the questions for the monolingual tasks were three collections of newspaper and news agency documents released in 1994 and 1995, written in Dutch, Italian and Spanish respectively. We used the document collections licensed by the Cross Language Evaluation Forum. These articles covered a one-year period (two years for Dutch) and constituted a heterogeneous, open-domain text collection. Each article had a unique identifier, i.e. a DOCID number, that participants' systems had to return together with the answer string in order to prove that their responses were supported by the text. The Italian collection consisted of about 27 million words (200 MB) drawn from the newspaper La Stampa and the Swiss-Italian SDA press agency. The Spanish corpus contained more than 200,000 international news items from the EFE press agency published during the year 1994.
The Dutch collection was the CLEF 2002 Dutch collection, which consists of the 1994 and 1995 editions of Algemeen Dagblad and NRC Handelsblad (about 200,000 documents, or 540 MB).

<DOC>
<DOCNO>EFE19940101-00001</DOCNO>
<DOCID>EFE19940101-00001</DOCID>
<DATE>19940101</DATE>
<TIME>00.28</TIME>
<SCATE>POX</SCATE>
<FICHEROS>94F.JPG</FICHEROS>
<DESTINO>ICX EXG</DESTINO>
<CATEGORY>POLITICA</CATEGORY>
<CLAVE>DP2403</CLAVE>
<NUM>736</NUM>
<PRIORIDAD>U</PRIORIDAD>
<TITLE> GUINEA-OBIANG
PRESIDENTE SUGIERE RECHAZARA AYUDA EXTERIOR CONDICIONADA</TITLE>
<TEXT> Malabo, 31 dic (EFE).- El presidente de Guinea Ecuatorial, Teodoro
Obiang Nguema, sugirió hoy, viernes, que su Gobierno podría rechazar
la ayuda internacional que recibe si ésta se condiciona a que en el
país haya "convulsiones políticas".
En su discurso de fin de año, ...
... conceptos de libertad, seguridad ciudadana y
desarrollo económico y social. EFE
DN/FMR
01/01/00-28/94
</TEXT>
</DOC>

Fig. 1. Format of the target document collection (example drawn from the Spanish corpus)

The textual contents of the Spanish collection, as shown in figure 1, were not tagged in any way. The text sections of the Italian corpus, on the contrary, had been annotated according to the NIST guidelines with named entity tags such as <PERSON>, <LOCATION> and <AUTHOR>. The Dutch collection was formatted similarly.
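In practice, the answer-verification work described below presupposes fast access from a DOCID to the corresponding text. The following minimal Python sketch, which is not part of the original toolchain, indexes a collection file in the Figure 1 format; the tag names come from the figure, while the file name and the Latin-1 encoding are assumptions.

    import re

    # Minimal sketch (not the original tooling): index a CLEF-style SGML
    # collection so that a DOCID can be mapped back to its <TEXT> content.
    DOC_RE = re.compile(r"<DOC>(.*?)</DOC>", re.DOTALL)
    DOCID_RE = re.compile(r"<DOCID>\s*(.*?)\s*</DOCID>", re.DOTALL)
    TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL)

    def index_collection(path):
        """Return a dict mapping DOCID -> raw text of the <TEXT> field."""
        with open(path, encoding="latin-1") as f:  # encoding is an assumption
            raw = f.read()
        index = {}
        for doc in DOC_RE.findall(raw):
            docid = DOCID_RE.search(doc)
            text = TEXT_RE.search(doc)
            if docid and text:
                index[docid.group(1)] = text.group(1)
        return index

    # e.g. index = index_collection("efe94.sgml")  # hypothetical file name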
Given these three corpora, our final goal was to formulate a set of 180 fact-based questions shared by all three monolingual QA tasks. The intention of having the same queries in all the tasks was motivated by the need to compare the systems' performance across languages. Since the track was divided into several tasks and most of the participants took part in just one of them, the use of the same test set, albeit translated into other languages, would allow us to compare the accuracy of different runs. We aimed at collecting 180 shared queries rather than the full 200 provided to the participants because we planned to include in each test set 20 questions with no answer in the corpora (the so-called NIL questions). We did not have enough time to share the NIL queries as well, so each group added its own to the 180 shared ones.

2.1 From Topics to Keywords

The first step toward the creation of the DISEQuA test set was the generation of the questions for each monolingual document collection, which was carried out independently by our three groups. The key element that guided our activity through this first phase was the CLEF collection of topics. If we had asked people to generate questions without any restraint, we would probably have obtained just a few queries usable for our purpose. It would have been even more difficult to ask them to focus only on events that occurred in 1994 or 1995, the time span covered by the articles in our text collections. Besides, we noticed that the mental process of conceiving fact-based questions without any topic details could take a considerable amount of time: asking good questions can be as difficult as giving consistent answers. In order to cope with these drawbacks, to improve the relevance of the queries and to reduce the time needed to generate them, we decided to use some CLEF topics.

Topics, which can be defined as "original user requests" [1], represent a resource developed for many NLP applications, including question answering. They evolved over the years: at the beginning they were so thoroughly formulated that systems hardly needed to apply any query expansion techniques but, since this was not very realistic, they were later structured in a more concise way. The team that generated the CLEF topics wanted to create a set of real-life subjects that would match the contents of the document collections. The main international political, social, cultural, economic, scientific and sporting issues and events of 1994 and 1995 were included, and topics were written in an SGML style, with three textual fields, as in figure 2.

<top>
<num> C001
<I-title>Architettura a Berlino
<I-desc>Trova documenti che riguardano l'architettura a Berlino.
<I-narr>I documenti rilevanti parlano, in generale, degli aspetti architettonici di
Berlino o, in particolare, della ricostruzione di alcune parti della città
dopo la caduta del Muro.
</top>1

Fig. 2. An Italian topic released by CLEF in the year 2000 (translation in the footnote)

1 <I-title>Architecture in Berlin <I-desc>Retrieve documents that concern architecture in Berlin. <I-narr>Generally speaking, the relevant documents deal with the architectural features of Berlin or, particularly, with the reconstruction of some parts of the city after the knocking down of the Wall.

The title field sketches straightforwardly the main content of the topic, the description field mirrors the needs of a potential user in a more precise, one-sentence formulation, and the narrative field gives further information concerning relevance.

In the very first experiment that ITC-irst carried out to generate its question set, two volunteers were provided with three CLEF topics structured as above and asked to produce ten queries for each one. It took about forty-five minutes to conclude the task, and it was immediately noticed that the questions were too closely related to the source topics. This pilot experiment thus showed the weaknesses and drawbacks of the strategy, which would lead to overspecified questions, and underlined the need to improve the stimulating power of the topics by reducing their specificity without losing relevance to the corpus.

The simplest way to expand the structure of the topics and widen the scope of activity for the people in charge of question generation seemed to be to manually extract from each topic a series of relevant keywords that would replace the topics themselves. No particularly detailed instructions were given in that phase: we just isolated the most semantically relevant words. A keyword can be defined as an independent, unambiguous and precise element that is meant to arouse interest and stimulate questions on a specific issue. We also inferred keywords that were not explicitly present in the topic, assuming that even external knowledge, provided it was related to the topic, could help to formulate pertinent questions. The ITC-irst coordinators took into consideration the topics developed by CLEF in the years 2000, 2001 and 2002. Three people were involved in the extraction of keywords, which were appended to each topic in the form of a 'signature', as the tag in the following example testifies. Thus, the topic entitled "Architecture in Berlin" (shown in figure 2) was converted into a list of words that could even appear unrelated to each other:

<IT-tsig>architettura, Berlino, documenti, aspetti architettonici, ricostruzione, città, caduta del Muro, Muro</IT-tsig>2

2 architecture, Berlin, documents, architectural aspects, reconstruction, city, knocking down of the Wall, Wall.

It is interesting to notice that the keywords, even though they originated from the topics, allowed a certain detachment from the restricted coverage of the topics themselves, without losing the connection with the important issues of the years 1994 and 1995 that constituted the core of the document collection.
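As an illustration of the data formats involved, the short Python sketch below parses a topic in the Figure 2 format and appends a hand-picked keyword signature in the <IT-tsig> convention shown above. It is a hypothetical helper, not part of the original workflow: the keyword selection itself was done manually by the three annotators.

    import re

    # Hypothetical helper: split one <top>...</top> block into its fields
    # (markers as in Figure 2) and attach a manually chosen signature.
    FIELDS = ("num", "I-title", "I-desc", "I-narr")

    def parse_topic(raw):
        """Return a dict with the topic's labelled fields."""
        topic = {}
        for field in FIELDS:
            # each field runs from its marker up to the next tag
            m = re.search(rf"<{field}>\s*([^<]*)", raw)
            if m:
                topic[field] = " ".join(m.group(1).split())
        return topic

    def with_signature(raw, keywords):
        """Append an <IT-tsig> element before </top>, as in the example above."""
        tsig = "<IT-tsig>" + ", ".join(keywords) + "</IT-tsig>\n"
        return raw.replace("</top>", tsig + "</top>")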
Thus the experiment was repeated, and much better results in terms of variety and generality of the queries were achieved: the people who were given the keywords instead of the topics had more freedom to range over a series of concepts without any restraint or condition of adherence to a single specific and detailed issue. Although the proximity of correlated keywords led to the generation of similar queries, this strategy was definitely adopted.

The CLEF topics had a pivotal role also in the generation of the Spanish and Dutch queries. As preparatory work, the Spanish UNED NLP group studied the test set used at TREC 2002 and tried to draw some conclusions regarding question formulation style and the range of cases to be handled in finding the answer. Then, four people were given the CLEF topics of the years 2000, 2001 and 2002 (but no keywords) with the task of producing 200 short, fact-based queries. The Dutch LIT group adopted the same strategy in its preparation: TREC QA topics (1-1893) were translated into Dutch, and old CLEF retrieval topics (1-140) were used to generate Dutch example questions, usually around 3 per topic.

2.2 From Keywords to Questions

Before generating the queries, the three groups agreed on common guidelines that would help to formulate a good and useful test set. Following the model of past TREC campaigns, and particularly of the TREC 2002 QA track, a series of basic instructions was formulated.

Firstly, questions should be fact-based and, if possible, they should address events that occurred in the years 1994 or 1995. When a question lacked a precise reference to these two years, it had to be kept in mind that systems would search a document collection from that period.

Secondly, questions should ask for an entity (i.e. a person, a location, a date, a measure or a concrete object), avoiding subjective opinions or explanations. Thus "why-questions" were not allowed: queries like "Why does Bush want to attack Iraq?" or "Who is the most important Italian politician of the twentieth century?" could not be accepted. Since the TREC 2002 question set constituted a good term of comparison, and it did not include any definition questions of the form "Who/What is X?", it was decided to avoid them as well.

Thirdly, the coordinators agreed that multiple-item questions, like those used in the TREC list task, should be avoided. Similarly, the people in charge of question generation could not formulate 'double queries', in which a second, indirect question is subsumed within the main one (for instance, "Who is the president of the poorest country in the world?").

Finally, closed questions, known as yes/no questions, had to be left out as well. Queries should be closely related to the topics or to the keywords extracted from the topics, without any particular restraint on the word choice. It was not necessary to know the answer before formulating a question: on the contrary, knowing it in advance could bias the search for a response in the corpus.
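Several of these guidelines are mechanical enough that simple heuristics could pre-screen candidate questions before the manual triage described below. The following Python sketch shows such checks for English questions; the patterns are illustrative assumptions, not the method the coordinators actually used.

    import re

    # Illustrative heuristics (assumed, not the organizers' procedure):
    # flag candidates that obviously violate the generation guidelines.
    CHECKS = [
        (re.compile(r"^\s*why\b", re.I), "why-question"),
        (re.compile(r"^\s*(is|are|was|were|do|does|did|can|could|has|have)\b", re.I),
         "yes/no question"),
        (re.compile(r"^\s*(who|what)\s+(is|was)\s+\w+\s*\?\s*$", re.I),
         "possible definition question"),
    ]

    def flag_question(question):
        """Return the labels of every guideline check the question trips."""
        return [label for pattern, label in CHECKS if pattern.search(question)]

    assert flag_question("Why does Bush want to attack Iraq?") == ["why-question"]
    assert flag_question("When did the Berlin Wall fall?") == []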
Given these instructions, thirty people at ITC-irst were provided with two sets of keywords (extracted from two topics) and asked to generate ten questions for each one. In this way, a large pool of 600 candidate queries was created. The examples shown in figure 3 demonstrate that the keywords extended the limited scope of the topic "Architecture in Berlin", allowing people to pose questions related to history or even politics. Some questions, such as numbers 5 and 9, lost the connection with the original form of the topic, introducing the name of a famous architect or asking for the number of inhabitants rather than focusing on the architectural features of the city.

<num>C001</num>
<keyword> architettura, Berlino, documenti, aspetti architettonici, ricostruzione, città, caduta del Muro, Muro </keyword>
<question n=1> Quando e' caduto il muro di Berlino? </question>
<question n=2> Chi ha costruito il Muro di Berlino? </question>
<question n=3> Quanto era lungo il muro di Berlino? </question>
<question n=4> Qual e' la piazza piu' importante di Berlino? </question>
<question n=5> Qual e' la professione di Renzo Piano? </question>
<question n=6> Quando e' stato costruito il muro di Berlino? </question>
<question n=7> Quando e' che Berlino e' ritornata ad essere capitale? </question>
<question n=8> Dove si trova Berlino? </question>
<question n=9> Quanti abitanti ha Berlino? </question>
<question n=10> Che cosa divideva il muro di Berlino? </question>3

Fig. 3. Questions generated from a list of keywords (translation in the footnote)

3 <question n=1> When did the Berlin Wall fall? </question>
<question n=2> Who built the Berlin Wall? </question>
<question n=3> How long was the Berlin Wall? </question>
<question n=4> Which is the most important square in Berlin? </question>
<question n=5> What is Renzo Piano's job? </question>
<question n=6> When was the Berlin Wall built? </question>
<question n=7> When did Berlin become the capital again? </question>
<question n=8> Where is Berlin? </question>
<question n=9> How many inhabitants are there in Berlin? </question>
<question n=10> What did the Berlin Wall divide? </question>

In spite of the generation guidelines established before producing the candidate questions, some inconsistencies persisted. For instance, question 4 concerns a personal opinion rather than a fact-based datum: it is not clear how the importance of a place could be objectively measured. Similarly, question 7 deals with events that occurred later than 1994: although the German government took the decision in 1991, Berlin officially became the capital city in 1999.

2.3 Questions Verification

Once the candidate questions had been collected, it was necessary to verify whether they had an answer in the target document collection. This phase constituted the actual manual construction of the replicable gold standard for the CLEF QA track: systems would later process the questions automatically.

ITC-irst involved three native Italian speakers in this work. In order to cope with the large number of candidate questions, many of which might not comply with the generation guidelines and would therefore be unusable for the QA track, three categories of queries were arranged and each question was classified: the entries of list A were queries that respected the generation guidelines
and whose answer was intuitively known; list B held the relevant questions whose answer, in the assessors' opinion, was more difficult to find; and list C contained those that were badly formulated or did not respect the guidelines. As expected, list B was the largest one, including 354 questions. At the end of the question verification phase, a total of 480 questions had been processed manually; the remaining 120, most of them from list C, were eliminated.

Browsing a document collection in search of answers can be a very exhausting activity without a tool that facilitates the detection of the relevant strings. Fortunately, ITC-irst had a concordancer available4 that allowed the three assessors to make selective searches within the corpus, find the correct answers and go back to the docid, i.e. the unique identifier, of the document that supported each answer. The common strategy employed by the assessors was to type parts of the query or parts of the known answer into the concordancer, and then browse the most relevant documents retrieved by the software in search of a text snippet that justified and supported the correct answer. The Dutch group developed a small number of grep-based shell scripts for the same purpose.

4 The "Toolbox for Lexicographers" developed by Claudio Giuliano.
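A few lines of Python suffice to sketch the kind of selective search these tools support: given the DOCID-to-text index built in the earlier sketch, the function below finds every occurrence of a query or answer fragment and returns the docid together with a window of context. This is an illustrative reconstruction, not the actual concordancer or the Dutch shell scripts.

    def concordance(index, fragment, window=60):
        """Yield (docid, snippet) for every occurrence of `fragment` in the
        indexed collection, so the supporting document can be logged."""
        for docid, text in index.items():
            start = text.find(fragment)
            while start != -1:
                lo = max(0, start - window)
                hi = start + len(fragment) + window
                yield docid, " ".join(text[lo:hi].split())
                start = text.find(fragment, start + 1)

    # e.g. for docid, snippet in concordance(index, "muro di Berlino"):
    #          print(docid, "...", snippet, "...")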
The problem of structuring the data and finding a sensible format to describe both questions and answers arose during this first phase of the creation of DISEQuA. The issue was addressed by conceiving an XML syntax that records the number of each question, the keyword set (or topic) from which it was generated, the person who verified it in the document collection and the type of entity it refers to. Similarly, the answers found for each question needed to be numbered, and the docid of the document that supported each response had to be logged. The adoption of a precise format solved the problem of losing track of the changes that each question could undergo; in fact, new tags could be added to give more information. Secondly, structured data can be easily browsed and analyzed: for instance, the tag used to indicate the question type proved quite useful in balancing the test set. Thirdly, a common format for questions and answers was necessary to share them between the three groups that put together the DISEQuA corpus.

Figure 4 shows an example drawn from the Italian question set: the attribute 'cnt' indicates the number assigned to the question, and 'assessor' identifies the person who processed the query, which proved important in case of inconsistencies. The attribute 'origin' gives the name of the file containing the keywords extracted from a single topic, while the attribute 'type' describes the category to which the answer belongs. Seven question types were considered: PERSON, LOCATION, MEASURE, DATE, ORGANIZATION, OBJECT (i.e. concrete things) and OTHER (when the response could not be labeled with one precise type). The aim was to create a well-balanced test set, with a good coverage of all these categories. Likewise, the attribute 'n' in the tag <answer> is a progressive number of responses, since a single query could have several correct answers in the same document collection. Dates and numbers in particular change across different news reports of the same event; sometimes earlier news items in the document collection are less precise than later ones, because they register a process that changes over a period. Since systems were expected to give an answer supported by a single document, and not the final or best answer in the whole corpus, in such cases there were many correct responses. The attribute 'idx' gives the docid identifier of the document in which each answer appears. Systems should return the docid as a justification of the answer, and in strict evaluation unsupported responses were considered incorrect.

<qa>
<question cnt="42" assessor="ALE" origin="keyword_C001.txt" type="MEASURE">
Quanti abitanti ha Berlino?
</question>
<answer n="1" idx="SDA19940804.00147">3,5 milioni</answer>
</qa>

Fig. 4. Format of the verified questions (see question 9 in figure 3)

When no answer was found in the target corpus, the answer attributes 'n' and 'idx' were labeled with 0 (zero), and the answer string was replaced by the string "NIL". Queries with no answer were not eliminated: on the contrary, twenty NIL questions were included in the final version of each monolingual test set to evaluate systems' accuracy in recognizing that there was no response.

Sometimes the responsiveness of the retrieved string was doubtful and the assessors could not decide whether it was acceptable. These cases required a deeper analysis and an agreement between different assessors. In order to signal the doubts that emerged during the verification phase, a "star" character (*) was put before the uncertain answers and a significant remark justifying the uncertainty was appended to the question within the tag <rem>, as in the following example (see question 10 in figure 3):

<question n=5 origin=keyword_C001 type=LOCATION>
Che cosa divideva il muro di Berlino ?
</question>
*<answer n="1" idx="LASTAMPA19941016.00038">Germania</answer>
*<answer n="2" idx="LASTAMPA19941016.00038">mondo</answer>
<rem>"Un evento inatteso, spettacolare, emozionante: sotto gli occhi del mondo cade il Muro di Berlino, simbolo materiale della divisione della Germania e del mondo."</rem>5

5 *<answer n="1"> Germany *<answer n="2"> the world <rem>"An unexpected, spectacular and exciting event: the eyes of the world are on the Berlin Wall that is falling, a concrete symbol of Germany's and the world's division."</rem>

A cut-and-pasted text snippet found in the document collection was usually placed in the tag <rem>, so that another assessor could take a decision without opening the corpus again in search of the necessary contextual information. In the example above, it was not clear whether the retrieved answers, which are metaphorical, could be accepted (actually, the Berlin Wall isolated West Berlin from the German Democratic Republic), so the first assessor that processed the question left the response