Article title
A Cloud Computing Implementation of XML Indexing Method Using Hadoop
Table of contents
Introduction
Related work
Preliminaries and overview of Hadoop
Proposed system
Experimental results
Conclusion
Excerpt from the article
Query evaluation in the system follows the NCIM algorithm. The difference is that NCIM keeps its indexes in main memory, whereas the proposed system stores them in HDFS. The corresponding indexes must therefore be loaded before query evaluation can run.
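The load-before-evaluate requirement can be illustrated with a minimal Python sketch. It is not the NCIM implementation: the `QueryEvaluator` class, the JSON file layout, and the tag-path-to-node-id index are all hypothetical, and a local temporary file stands in for a file stored in HDFS.

```python
import json
import os
import tempfile


class QueryEvaluator:
    """Toy evaluator that refuses to run until its index has been loaded."""

    def __init__(self):
        self.index = None  # nothing is held in memory until load_index() runs

    def load_index(self, path):
        # Assumption: a local JSON file stands in for an index file in HDFS.
        with open(path, encoding="utf-8") as f:
            self.index = json.load(f)

    def evaluate(self, tag_path):
        if self.index is None:
            raise RuntimeError("index must be loaded before query evaluation")
        # Hypothetical index layout: tag path -> list of matching node ids.
        return self.index.get(tag_path, [])


def demo():
    """Write a toy index to disk, load it, and evaluate one query."""
    index = {"/book/title": [3, 17], "/book/author": [5]}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.json")
        with open(path, "w", encoding="utf-8") as f:
            json.dump(index, f)
        evaluator = QueryEvaluator()
        evaluator.load_index(path)
        return evaluator.evaluate("/book/title")
```

Calling `evaluate` on a fresh `QueryEvaluator` raises an error, which mirrors the rule that the corresponding index has to be in place before any query is evaluated.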
In the proposed system, we design two modes of query processing:
Continuous mode (mode 1)
Discontinuous mode (mode 2)
In continuous mode, the user's queries are treated as strings; each string is then treated as a split and assigned to a mapper.
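The way mode 1 hands query strings to mappers can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' Hadoop code: `INDEX`, `make_splits`, `mapper`, and `run_map_phase` are hypothetical names, the index is a toy dictionary, and a thread pool stands in for Hadoop map tasks.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy index: tag path -> ids of matching nodes.
INDEX = {"/book/title": [3, 17], "/book/author": [5]}


def make_splits(queries):
    """Treat every non-empty query string as its own split."""
    return [q.strip() for q in queries if q.strip()]


def mapper(split):
    """Evaluate one split (one query string) and emit a (query, result) pair."""
    return split, INDEX.get(split, [])


def run_map_phase(queries, workers=2):
    """Assign each split to a mapper and collect the emitted pairs."""
    splits = make_splits(queries)
    # Threads stand in for map tasks; Executor.map returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(mapper, splits))
```

For example, `run_map_phase(["/book/title", "/book/author"])` yields one pair per query string. In a real Hadoop job the framework, not the caller, would decide how splits are scheduled onto map tasks.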
Keywords:
A Cloud Computing Implementation of XML Indexing Method Using Hadoop∗
Wen-Chiao Hsu¹, I-En Liao²,**, and Hsiao-Chen Shih³
¹,²,³ Department of Computer Science and Engineering, National Chung-Hsing University, 250 Kuo Kuang Road, Taichung 402, Taiwan
phd9510@cs.nchu.edu.tw, ieliao@nchu.edu.tw, nic3p1217@gmail.com
Abstract. With the increasing of data at an incredible rate, the development of cloud computing technologies is of critical importance to the advances of researches. The Apache Hadoop has become a widely used open source cloud computing framework that provides a distributed file system for large scale data processing. In this paper, we present a cloud computing implementation of an XML indexing method called NCIM (Node Clustering Indexing Method), which was developed by our research team, for indexing and querying a large number of big XML documents using MapReduce. The experimental results show that NCIM is suitable for cloud computing environment. The throughput of 1200 queries per second for huge amount of queries using a 15-node cluster signifies the potential applications of NCIM to the fast query processing of enormous Internet documents.
Keywords: Hadoop, Cloud Computing, XML Indexing, XML query, Node Clustering Indexing Method.
1 Introduction
XML (eXtensible Markup Language) is widely used as the markup language for the web documents. The flexible nature of XML enables it to represent many kinds of data. However, the representation of XML is not efficient in terms of query processing. A number of indexing approaches for XML documents are proposed to accelerate query processing. Most of these works provide mechanisms to construct indexes and methods for query evaluation that deal with one or small amount of documents in a centralized fashion. In the real world, an XML database may contain a large number of XML documents which require the existing XML indexing methods to be scalable for high performance.
The concept of “cloud computing” has received considerable attention because it provides a solution to the increasing data demands and offers a shared,
∗ This research was partially supported by National Science Council, Taiwan, under contract
J.-S. Pan, S.-M. Chen, N.T. Nguyen (Eds.): ACIIDS 2012, Part III, LNAI 7198, pp. 256–265, 2012. © Springer-Verlag Berlin Heidelberg 2012