An XML Storage and Query System Based on Full-Text Retrieval
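Establishment and Implementation of an XML Literature Database Retrieval System
Liu Hong (刘红)
【Journal】Journal of the China Society for Scientific and Technical Information (《情报学报》)
【Year (Volume), Issue】2003 (022) 004
【Abstract】This paper discusses how, and why, to build a literature database retrieval system based entirely on XML, and briefly discusses several related issues concerning XML databases.
【Total pages】6 (pp. 439-444)
【Author】Liu Hong (刘红)
【Affiliation】Shanghai Branch, Nanjing Institute of Politics, Shanghai 200433
【Language】Chinese
【CLC classification】G35
【Related literature】
1. Building a subject retrieval system for a water-saving agriculture literature database [J], Liu Xi (刘喜); Liang Jinping (梁金萍); Guo Jieguang (郭杰光); Cao Limeng (曹力萌); Ma Liping (马丽萍)
2. Building a classified retrieval system for a water-saving agriculture literature database: water-saving agriculture literature resources… [J], Liu Xi (刘喜); Liang Jinping (梁金萍)
3. A proposal for building a local-literature database retrieval system [J], Cao Zhimei (曹志梅); Qu Fang (渠芳)
4. Application of an information retrieval system in building the "Chinese Geological Literature Database" [J], Li Yongqiu (李永球)
5. A FoxPro-based system for automatic generation and retrieval of Chinese and Western-language literature databases [J], Liu Youcai (刘有才)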
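A Web Database Query System Based on the XML Data Model
Chen Yuzhe (陈玉哲); Dai Shucheng (代术成); Zhuang Chengsan (庄成三)
【Journal】Computer Applications (《计算机应用》)
【Year (Volume), Issue】2002 (022) 003
【Abstract】The paper proposes the concept and architecture of a Web database that uses XML as its unified data model, designs and implements querying over such an XML-based Web database, and proposes and implements a Web indexing mechanism for fast, efficient Web queries.
【Total pages】3 (pp. 41-43)
【Authors】Chen Yuzhe (陈玉哲); Dai Shucheng (代术成); Zhuang Chengsan (庄成三)
【Affiliation】Department of Computer Science, Sichuan University, Chengdu, Sichuan 610065 (all three authors)
【Language】Chinese
【CLC classification】TP311.132
【Related literature】
1. A discussion of an XML-based integrated system for querying and updating heterogeneous databases [J], Li Yanrong (李岩榕); Lin Feng (林锋); Lin Xiaodong (林晓东)
2. A virtual database query system based on XML technology [J], Tan Hansong (谭汉松); Hu Chunhua (胡春华)
3. Implementation of an XML-based query application system for petroleum exploration databases [J], Xue Ren (薛任); Xu Bin (徐斌); Zhang Zhe (张喆); Zhang Xinlei (张新雷); Qiao Yinmei (乔银梅)
4. An XML-based Web data model [J], Qin Jie (秦杰); Zhao Shumei (赵淑梅); Yang Shuqiang (杨树强); Dou Wenhua (窦文华)
5. The XML data model and Web-oriented data mining technology [J], Chen Yiming (陈一明)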
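Thesis Proposal: Research on the Storage Technology of the Native XML Database System Natix
I. Background
With the rapid development of Internet technology, data volumes keep growing and data must be processed ever faster, so the demands placed on database systems are changing as well.
Traditional relational database systems often suffer from rigid schemas, redundant data, and difficulty handling semi-structured data.
In recent years XML, a data exchange format widely used in Web technology, has attracted considerable attention as a form of data representation and exchange.
XML is self-describing, general-purpose, and easy to extend, and a growing share of data and applications use it for storage and exchange.
Against this background, XML-based database systems have become an active research topic.
This project studies the storage technology of the native XML database system Natix, with the aim of achieving efficient and stable XML storage and querying and of improving the efficiency and quality of data management.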
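II. Research Content
1. Introduction to the Natix system and its basic principles.
Natix is a native XML database system. According to this proposal, it uses a DOM-tree storage structure and supports the XQuery query language.
The study will analyze how Natix implements its architecture, storage layout, query engine, and indexing mechanism.
2. Study and optimization of Natix storage technology.
The study will examine the mechanisms and performance problems of Natix's storage structure, indexing mechanism, and query engine, and explore how storage can be optimized to improve query efficiency and to handle large-scale data.
3. Optimization of the XPath and XQuery query languages.
XPath and XQuery are the standard query languages for XML databases and support complex query operations, but query efficiency degrades as data volume grows.
The study will analyze how XPath and XQuery queries are evaluated and explore ways to make them more efficient.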
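To make the kind of path query discussed here concrete, the following is a minimal sketch in Python using the standard-library `xml.etree.ElementTree` module. It is not Natix's query engine: ElementTree implements only a limited subset of XPath, and the sample document and the function names `all_titles` and `book_titles_in_year` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative only.
SAMPLE = """<library>
  <book year="2002"><title>Native Storage</title></book>
  <book year="2003"><title>Path Queries</title></book>
  <journal year="2002"><title>Index Notes</title></journal>
</library>"""


def all_titles(root):
    """Descendant step: every <title> anywhere below the root, in document order."""
    return [t.text for t in root.findall(".//title")]


def book_titles_in_year(root, year):
    """Child steps plus an attribute predicate on <book>."""
    return [t.text for t in root.findall(f"./book[@year='{year}']/title")]
```

Both functions walk the in-memory tree on every call. That is acceptable for a small document, but it is the kind of cost that grows with data volume and that the proposed optimization work targets.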
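4. Database index optimization.
Indexes are key to query efficiency. The study will analyze index types, how indexes are built, and index-based query optimization, and will explore techniques for making index lookups faster.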
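As one illustration of what an index over XML data can look like, the sketch below builds a simple path index: a dictionary from each root-to-node tag path to the nodes found on that path. This is a generic technique chosen for illustration, not the index structure used by Natix, and the names `build_path_index` and `lookup_texts` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document; element names are illustrative only.
SAMPLE = """<library>
  <book><title>Native Storage</title></book>
  <book><title>Path Queries</title></book>
</library>"""


def build_path_index(root):
    """Map each tag path (e.g. '/library/book/title') to the nodes on that path."""
    index = defaultdict(list)

    def visit(node, prefix):
        path = f"{prefix}/{node.tag}"
        index[path].append(node)
        for child in node:
            visit(child, path)

    visit(root, "")
    return index


def lookup_texts(index, path):
    """Answer a simple absolute path query with one dictionary lookup."""
    return [node.text for node in index.get(path, [])]
```

The index is built in a single traversal. After that, a query such as `/library/book/title` costs one lookup instead of a tree walk, at the price of extra memory and of keeping the index current when the document changes.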
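III. Significance
XML-based database systems are highly expressive and extensible and can adapt to different application scenarios and needs, which is why they have been widely adopted and developed.
By studying and optimizing the Natix system, this project aims to improve the efficiency of XML storage and querying, to examine the storage and query principles of XML databases in depth, and to further the understanding of XML database systems.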
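Design and Implementation of an XML-Based ACCESS Database Document Grading System
Part 1
I. Introduction
With the rapid development of information technology, demand for grading systems in education keeps growing.
To make grading more efficient and accurate, this paper proposes a design and implementation scheme for a document grading system based on XML and an ACCESS database.
The system uses XML to store and transmit document data in a standardized form, and uses the ACCESS database for efficient data management and querying.
II. System Design
1. System architecture
The system uses a C/S (client/server) architecture with a front end and a back end.
The front end handles document upload, editing, and grading; the back end handles data storage, management, and querying.
The architecture should follow the principles of high cohesion and low coupling to keep the system stable and extensible.
2. Database design
The system uses an ACCESS database as its core data store.
The database design should follow normalization principles to ensure data integrity and consistency.
Table structures and indexes should also be designed carefully to improve query efficiency.
In addition, to support storing and transmitting XML data, the database should include fields for holding XML content.
3. Use of XML
XML (Extensible Markup Language) is an extensible markup language that is self-describing, cross-platform, and easy to read.
The system uses XML to store and transmit document data in a standardized form.
With XML, document data can be converted into a structured format that is convenient for later processing and analysis.
III. System Implementation
1. Front end
The front end is developed mainly with Windows Forms or Web technology.
Through the front-end interface, users can upload documents, edit them, and carry out grading.
For ease of use, the interface should offer a friendly interaction design and a rich set of functions.
2. Back end
The back end implements data storage, management, and querying.
The ACCESS database provides efficient storage and querying.
The back end should also expose API interfaces so that the front end can exchange data with it.
To process XML data, the back end needs to be able to parse and generate XML.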
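The XML generation and parsing required of the back end can be sketched as follows. This is a minimal illustration in Python with the standard-library `xml.etree.ElementTree` module, not the system's actual code. The element names (`submission`, `answer`), the attribute names, and the functions `build_answer_xml` and `parse_answer_xml` are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def build_answer_xml(student_id, answers):
    """Serialize one student's answers (question id -> text) into an XML string."""
    root = ET.Element("submission", {"student": student_id})
    for question_id, text in answers.items():
        item = ET.SubElement(root, "answer", {"question": question_id})
        item.text = text  # special characters such as '<' are escaped on output
    return ET.tostring(root, encoding="unicode")


def parse_answer_xml(xml_text):
    """Parse the XML string back into (student id, {question id: text})."""
    root = ET.fromstring(xml_text)
    answers = {a.get("question"): (a.text or "") for a in root.findall("answer")}
    return root.get("student"), answers
```

A string produced by `build_answer_xml` could then be written to one of the fields that hold XML content, and `parse_answer_xml` applied when the record is read back for grading.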
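3. System integration and testing
Throughout development and implementation, the system should be rigorously tested and debugged to ensure that it is stable and reliable.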
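Thesis Proposal: An XML-Based Document Management System
I. Background and Significance
With the rapid development of the Internet and electronic information technology, document management is becoming increasingly important.
Enterprises, government agencies, research institutes, and educational institutions all need to classify, store, retrieve, share, and transmit large numbers of documents.
Traditional document management, however, no longer meets modern requirements: low efficiency, insufficient security, and poor extensibility are drawing growing attention.
An XML-based document management system is a newer kind of system that stores documents in the standard XML format and uses XML technology to classify, store, retrieve, share, and transmit them.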
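Compared with traditional document management technology, an XML-based document management system has the following advantages:
1. Easy to extend and maintain.
XML is a standard format that is extensible and flexible, so the system can be extended and modified quickly as requirements change.
2. Good security.
The proposal holds that XML offers good security, and that an XML-based document management system can help guard against documents being lost, stolen, or tampered with.
3. Precise retrieval.
XML is well structured, so documents can be retrieved with methods such as XPath, which gives high retrieval precision.
4. Easy sharing.
XML allows documents to be shared across platforms: even under different operating systems, documents can be exchanged in XML format, which improves team collaboration.
Developing and studying an XML-based document management system therefore has both theoretical and practical value: it can meet the needs of modern document management, improve the efficiency and security of document management, and support scientific research and social development.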
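II. Research Content and Methods
The study covers the following:
1. Requirements analysis for the XML-based document management system.
Analyze existing document management systems and determine the requirements for an XML-based one.
2. Architecture design of the XML-based document management system.
Design the overall structure of the system, including its data storage approach, document classification structure, and document retrieval method.
3. Development and implementation of the XML-based document management system.
Develop the system in Java and implement its basic functions.
4. Testing and evaluation of the XML-based document management system.
Test the system and evaluate its efficiency, security, extensibility, and usability.
The study uses literature review, requirements analysis, system design, implementation, and experimental testing to research and develop the XML-based document management system.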
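A Method for Storing and Querying XML Documents Based on a Relational Database
XML has become the standard for representing, integrating, and exchanging data on the Web. Its format is simple and highly self-describing, and it separates content, structure, and presentation, which makes it well suited to data representation and exchange.
In recent years XML has been widely used in many fields, and large amounts of XML data have appeared on the Web.
To process and analyze XML data effectively, researchers in China and abroad have proposed a variety of XML query languages and storage management techniques.
Because relational databases are currently the most mature data management technology, storing and processing XML data in a relational database is a feasible and effective option among the available approaches, and it has received wide attention in academia.
However, because the two data models differ, storing and querying XML data in a relational database poses many new challenges for traditional database technology.
This work studies the relational storage of XML data and the query processing of path expressions in depth, and proposes a new method for storing and querying XML data in a relational database. The method stores the nodes of the XML document tree that have text values and the nodes that do not in two separate relational tables; it does not rely on the schema information in the document's DTD, and it requires no index structures.
Specifically, the main contributions are as follows:
(1) A new path-based relational storage method for XML data is proposed.
The method records the node, edge, and value information of an XML document losslessly: the relational tables store each element's name, id, parentid, and level, together with the path of every element or attribute that has a value, which speeds up query processing.
(2) For this storage structure, a new query translation algorithm based on table joins is proposed.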
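The two-table layout and a join-based path query described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: SQLite stands in for the relational database, the table names `inner_node` and `value_node` and the functions `shred` and `titles_by_author` are hypothetical, and the SQL is written by hand for one query rather than produced by the query translation algorithm, which the excerpt does not give.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative only.
SAMPLE = """<library>
  <book year="2002"><title>Native Storage</title><author>Li</author></book>
  <book year="2003"><title>Path Queries</title><author>Wang</author></book>
</library>"""


def shred(xml_text, conn):
    """Store nodes without text values and nodes with values in two tables."""
    conn.execute("CREATE TABLE inner_node (id INTEGER PRIMARY KEY, "
                 "parentid INTEGER, name TEXT, level INTEGER)")
    conn.execute("CREATE TABLE value_node (id INTEGER PRIMARY KEY, "
                 "parentid INTEGER, name TEXT, level INTEGER, path TEXT, value TEXT)")
    next_id = 0

    def visit(elem, parent_id, level, prefix):
        nonlocal next_id
        next_id += 1
        node_id = next_id
        path = f"{prefix}/{elem.tag}"
        text = (elem.text or "").strip()
        if text:  # element carrying a text value
            conn.execute("INSERT INTO value_node VALUES (?,?,?,?,?,?)",
                         (node_id, parent_id, elem.tag, level, path, text))
        else:     # purely structural element
            conn.execute("INSERT INTO inner_node VALUES (?,?,?,?)",
                         (node_id, parent_id, elem.tag, level))
        for attr, val in elem.attrib.items():  # attributes always carry a value
            next_id += 1
            conn.execute("INSERT INTO value_node VALUES (?,?,?,?,?,?)",
                         (next_id, node_id, "@" + attr, level + 1,
                          f"{path}/@{attr}", val))
        for child in elem:
            visit(child, node_id, level + 1, path)

    visit(ET.fromstring(xml_text), None, 1, "")


def titles_by_author(conn, author):
    """Hand-written SQL for /library/book[author=...]/title as a self-join."""
    rows = conn.execute(
        "SELECT t.value FROM value_node AS t "
        "JOIN value_node AS a ON a.parentid = t.parentid "
        "WHERE t.path = '/library/book/title' "
        "AND a.path = '/library/book/author' AND a.value = ? "
        "ORDER BY t.id", (author,))
    return [row[0] for row in rows]
```

Because every value-bearing node stores its full path, the path steps of the query reduce to equality tests on the `path` column, and only the predicate linking `title` and `author` under the same `book` needs a join on `parentid`.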
A Lucene-Based Similarity Retrieval System for XML Files
Wu Xinqiang (吴新强); Zhou Ya (周娅); Wang Ruyi (王如意); Zhang Jingwei (张敬伟); Lin Yuming (林煜明)
【Journal】Computer Systems & Applications (《计算机系统应用》)
【Year (Volume), Issue】2015 (000) 002
【Abstract】After analyzing the architecture of the open-source Lucene system and a special XML data source, and to address shortcomings in Lucene's search scoring formula, the paper proposes a formula that combines term position with secondary retrieval and designs a text search system. Experiments were carried out with the goals of improving retrieval performance, the accuracy of similarity search, the space efficiency of the index, and the time efficiency of queries, and the system was finally deployed on a Tomcat server. The experiments show that, compared with the original Lucene system, the improved system achieves better indexing efficiency, query efficiency, and accuracy.
【Total pages】6 (pp. 134-139)
【Authors】Wu Xinqiang (吴新强); Zhou Ya (周娅); Wang Ruyi (王如意); Zhang Jingwei (张敬伟); Lin Yuming (林煜明)
【Affiliation】School of Computer Science and Engineering, Guilin University of Electronic Technology, Guilin 541004 (all five authors)
【Language】Chinese
【Related literature】
1. Design of a Lucene-based near-synonym keyword retrieval system [J], Liu Tianyu (刘天宇)
2. Design of a water-conservancy literature retrieval system based on SSM and Lucene [J], Sun Min (孙敏); Ju Yong (鞠勇)
3. A metal equipment information retrieval system based on Lucene technology [J], Zhang Yanfei (张艳飞); Guo Yang (郭洋); Sun Yunfei (孙云飞)