JP6180710B2

JP6180710B2 - Data storage method and apparatus

Info

Publication number: JP6180710B2
Application number: JP2012149299A
Authority: JP
Inventors: 基銑宋; 鍾皓文; 水亨金; 鉉傑李
Original assignee: Naver Corp
Current assignee: Naver Corp
Priority date: 2011-07-28
Filing date: 2012-07-03
Publication date: 2017-08-16
Anticipated expiration: 2032-07-03
Also published as: JP2013030165A

Description

The present invention relates to an apparatus and method for storing data. It discloses an apparatus and method for replicating data based on a multi-master model and, in particular, an apparatus and method for distributing and storing data across one or more storage areas using a scalable distributed index.

Data storage capacity may be increased by "vertical expansion" or "horizontal expansion". Vertical expansion means using higher-specification equipment to store the data. Horizontal expansion means enlarging storage capacity by adding more devices that store data. Vertical expansion cannot handle data that exceeds what a single device can process. For this reason, companies that must process large volumes of data, such as Internet companies, generally rely on horizontal expansion.

In general, a relational database management system (RDBMS) is required to provide atomicity, consistency, isolation, and durability, that is, the ACID properties.

When an RDBMS maintains multiple replicas of the same data, it may designate a master and one or more slaves among the replicas. The RDBMS may perform synchronized write operations for the sake of consistency. A typical synchronized write proceeds through the following steps (1) to (6).
(1) The client requests a write from the master.
(2) The master performs the requested write.
(3) The master requests the write from the slave.
(4) The slave performs the requested write.
(5) The master receives the slave's response to the write request.
(6) The master notifies the client of the write result.
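The six steps above can be sketched in Python as follows. This is a minimal illustration, not taken from the patent: the `Replica` class, its `healthy` flag, and the function `synchronous_write` are hypothetical names, and rollback of the master's own write after a slave failure is omitted.

```python
class Replica:
    """Hypothetical in-memory replica; a real RDBMS node would persist to disk."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.data = {}

    def write(self, key, value):
        if not self.healthy:
            raise IOError(f"{self.name} is unavailable")
        self.data[key] = value


def synchronous_write(master, slaves, key, value):
    """Steps (2) to (6): the master writes, waits for every slave, then reports."""
    master.write(key, value)              # (2) master performs the write
    for slave in slaves:
        try:
            slave.write(key, value)       # (3)-(4) slave performs the write
        except IOError:
            return False                  # a slave failure fails the whole write
    return True                           # (5)-(6) all responses in; notify client


master = Replica("master")
slaves = [Replica("slave-1"), Replica("slave-2")]
ok = synchronous_write(master, slaves, "A", 1)   # (1) client requests the write
```

Because the client is answered only after every slave has responded, a single unavailable slave is enough to make the whole write fail.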

Consistency is extremely important in an RDBMS. The slave must therefore hold exactly the same information as the master.

If a failure occurs at the slave during steps (1) to (6) above, the RDBMS, by its architecture, regards the write operation itself as having failed.

As networks have developed, a large amount of data accessible over the network has come into existence. Distributed processing systems such as cloud computing have been introduced to process this data, but a conventional RDBMS is limited in that it cannot provide the scalability such distributed processing systems require.

Accordingly, various attempts have been made to give up one of the consistency (C) and availability properties of a conventional RDBMS in order to introduce partition tolerance (P). A representative example of such attempts is No-SQL (Not only SQL).

No-SQL databases may be classified into key-value databases (DBs), document-oriented DBs, graph DBs, column-oriented DBs, and the like.

Among these, the replication scheme of a document-oriented DB is similar to that of an RDBMS. For example, the replicas of a document-oriented DB are divided into a master and slaves. However, whereas an RDBMS writes to a slave through a synchronized write process for the sake of consistency, a document-oriented DB (for example, MongoDB) can use both synchronized and asynchronous writes.

By its nature, a relational database management system cannot be expanded horizontally. Therefore, when an RDBMS is used for large volumes of data, its overall capacity may be expanded by sharding (or data partitioning). That is, one or more RDBMS devices that use the same schema but store different data may be used.

When one or more RDBMS devices are used, an application server that knows which RDBMS devices exist may be used, or separate middleware may hide the actual location of the data.

With sharding (or data partitioning), when an RDBMS device is added to handle a larger volume of data, the data must be redistributed among the RDBMS devices. This redistribution may take place while the RDBMS devices are in operation. Because redistribution takes time, however, immediacy is reduced.

A distributed key-value DB may be used to store data in a distributed manner. A distributed key-value DB may distribute data by using consistent hashing. Consistent hashing has a structural advantage for data expansion: a data storage system that uses it can increase the total amount of data it can handle by adding servers. However, because hashing does not preserve any ordering of keys, it is weak at range searches and cannot handle data of two or more dimensions.

Data in a system that uses a distributed key-value DB is distributed by hashing. A range search therefore cannot obtain its results sequentially; each key within the search range must be queried individually.

For example, to find data whose field A has a value between 1 and 10, an RDBMS may run a range query such as "select * from foo where A >= 1 and A <= 10". A key-value DB, by contrast, must query each of the values from 1 to 10 separately.
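The cost of emulating a range query on hash-distributed data can be illustrated with a short Python sketch. It makes several assumptions that are not in the patent: keys are integers, there are four nodes (`NODE_COUNT`), and placement uses a simple hash-modulo rule rather than true consistent hashing, which is enough to show that neighboring keys end up on unrelated nodes.

```python
import hashlib

NODE_COUNT = 4  # hypothetical cluster size


def node_for(key):
    """Map a key to a node by hashing; adjacent keys land on unrelated nodes."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % NODE_COUNT


# Populate the hypothetical cluster with rows whose field A runs from 1 to 20.
nodes = [dict() for _ in range(NODE_COUNT)]
for a in range(1, 21):
    nodes[node_for(a)][a] = f"row-{a}"


def range_lookup(low, high):
    """Emulate `A >= low and A <= high` with one point lookup per candidate key."""
    results = []
    lookups = 0
    for key in range(low, high + 1):
        lookups += 1
        value = nodes[node_for(key)].get(key)
        if value is not None:
            results.append((key, value))
    return results, lookups


rows, lookups = range_lookup(1, 10)
```

Each candidate key in the range costs one lookup, whereas an ordered index could serve the same range with a single sequential scan.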

For the same reason, a key-value DB that uses hashing cannot have a spatial index for data of two or more dimensions. That is, when data lying within a particular region of space must be processed, that data is spread across various nodes (that is, servers) of the key-value DB, so no single node can hold a complete index.

An object of the present invention is to provide an apparatus and method in which one or more nodes process a data access request using a messaging channel.

Another object of the present invention is to provide an apparatus and method that transmit a data access request to a selected node and, at the request of the selected node, multicast the data access request to one or more nodes.

Another object of the present invention is to provide an apparatus and method that store data using one or more storage areas arranged in a tree structure.

Another object of the present invention is to provide an apparatus and method that use a hierarchical key to determine the storage area in which data is stored.

According to an embodiment of the present invention, there is provided a data management method in which a selected node among one or more nodes receives a data access request, the selected node transmits a multicast request for the data access request to a messaging channel, the messaging channel receives the multicast request and multicasts the data access request to the one or more nodes, and each of the one or more nodes receives the multicast and processes the data access request, wherein the one or more nodes hold replicas of the data.

The data management method may further include the messaging channel receiving the data access request from a client, the messaging channel determining the selected node among the one or more nodes, and the messaging channel transmitting the data access request to the selected node.

The data access request may be one or more of reading, writing, inserting, deleting, and updating data.

According to another embodiment of the present invention, there is provided a data management method that includes receiving a multicast request for a data access request and transmitting the data access request to one or more nodes, wherein the one or more nodes are nodes that hold replicas of the data targeted by the data access request.

The data management method may further include receiving the data access request from a client, determining a selected node among the one or more nodes, and transmitting the data access request to the selected node.

Determining the node may include determining the selected node among the one or more nodes based on a round-robin scheme or load balancing.
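The two selection policies named above might look like the following in Python. This is only a sketch: the node names, the `pending` request counts, and the "fewest pending requests" rule are assumptions chosen for illustration, since the text does not specify how load is measured.

```python
import itertools

nodes = ["node-1", "node-2", "node-3"]   # hypothetical node identifiers

# Round robin: hand out the nodes in a fixed rotation.
rotation = itertools.cycle(nodes)


def pick_round_robin():
    return next(rotation)


# Load balancing (one possible policy): pick the node with the fewest
# pending requests. The counts below are made-up sample values.
pending = {"node-1": 5, "node-2": 2, "node-3": 7}


def pick_least_loaded():
    return min(pending, key=pending.get)


first_four = [pick_round_robin() for _ in range(4)]
```

Round robin needs no knowledge of node state, while a load-based policy requires the selector to track some measure of each node's current work.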

According to another embodiment of the present invention, there is provided a data management method by which a terminal processes a data access request from a client, the method including having the data access request transmitted by the client delivered to the terminal, transmitting a multicast request for the data access request to a messaging channel, receiving the data access request from the messaging channel via multicast, and processing the data access request.

The delivery of the data access request may include receiving the data access request from the messaging channel.

According to an embodiment of the present invention, there is provided a data management system including one or more nodes that hold replicas of the same data and a messaging channel that transmits a data access request to the one or more nodes, wherein any one of the one or more nodes transmits a multicast request for the data access request to the messaging channel, the messaging channel receives the multicast request and multicasts the data access request to the one or more nodes, and each of the one or more nodes processes the data access request received by the multicast.

The messaging channel may receive a data access request from a client, select any one of the one or more nodes, and transmit the data access request to the selected node.

The messaging channel may select the one node among the one or more nodes based on a round-robin scheme or load balancing.

According to an embodiment of the present invention, there is provided a messaging channel including a receiving unit that receives a data access request from a client, a control unit that determines a selected node among one or more nodes having data storage areas in a data storage device, and a transmitting unit that transmits the data access request to the selected node, wherein the receiving unit receives a multicast request for the data access request from the selected node and the transmitting unit multicasts the data access request to the one or more nodes.

According to an embodiment of the present invention, there is provided a data storage device including one or more storage areas arranged in a tree structure (each storage area corresponding to one node of the tree), wherein a hierarchical key having zero or more subkeys is assigned to each of the one or more storage areas, the storage areas in the subtree rooted at an arbitrary first storage area among the one or more storage areas store the data corresponding to a first key of the first storage area, the first key is a key formed by concatenating one or more subkeys to a second key, the second key is the key of a second storage area, and the second storage area is the parent storage area of the first storage area.

Each of the one or more storage areas may be a relational database device. Each of the one or more storage areas may include middleware that understands and processes the indexes, keys, and commands of a relational database.

The hierarchical key may be a character string that combines alphanumeric characters and delimiters.

Data corresponding to the first key may mean data for which one of the prefixes of the data's key is identical to the first key.

A prefix may be the first i subkeys of the n subkeys of the data's key.

Here, i may be at least 1 and at most n.
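The prefix rule can be made concrete with a few lines of Python. The sketch assumes that subkeys are joined by a "." delimiter (the text only says the key combines alphanumeric characters and delimiters), and the function names `prefixes` and `corresponds_to` are hypothetical.

```python
DELIMITER = "."  # assumed delimiter between subkeys


def prefixes(key):
    """Return the prefixes made of the first i subkeys, for i = 1 .. n."""
    subkeys = key.split(DELIMITER)
    return [DELIMITER.join(subkeys[:i]) for i in range(1, len(subkeys) + 1)]


def corresponds_to(data_key, area_key):
    """True when one of the data key's prefixes equals the storage area's key."""
    return area_key in prefixes(data_key)
```

For instance, `prefixes("a.b.c")` yields `["a", "a.b", "a.b.c"]`, so data with key "a.b.c" corresponds to storage areas keyed "a", "a.b", and "a.b.c", but data with key "a.bc" does not correspond to an area keyed "a.b", because prefixes are taken over whole subkeys rather than characters.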

The first storage area may store those data that correspond to the key of the first storage area but do not correspond to the key of any child storage area of the first storage area.

When a third storage area is added to the data storage device, data corresponding to a third key of the third storage area may be moved from the first storage area to the third storage area, the third storage area being a child storage area of the first storage area.

The third storage area may be added and the data moved when the amount stored in the first storage area reaches a predefined threshold.
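The storage rule and the expansion step can be sketched together in Python. Everything below is illustrative rather than the patent's implementation: the `StorageArea` class, the "." delimiter, the item-count `capacity` used as the predefined threshold, and the choice of "a.x" as the new child's key are all assumptions.

```python
class StorageArea:
    """Hypothetical storage area holding items under a hierarchical key."""

    def __init__(self, key, capacity=3):
        self.key = key            # hierarchical key, e.g. "a" or "a.x"
        self.capacity = capacity  # assumed threshold: number of stored items
        self.items = {}
        self.children = []

    def covers(self, data_key):
        """True when this area's key is a subkey-aligned prefix of data_key."""
        return data_key == self.key or data_key.startswith(self.key + ".")

    def put(self, data_key, value):
        """Store in the deepest covering area; return the area that took it."""
        for child in self.children:
            if child.covers(data_key):
                return child.put(data_key, value)
        self.items[data_key] = value
        return self

    def is_full(self):
        return len(self.items) >= self.capacity

    def add_child(self, child_key):
        """Add a child area and move the data matching its key into it."""
        child = StorageArea(child_key, self.capacity)
        for data_key in [k for k in self.items if child.covers(k)]:
            child.items[data_key] = self.items.pop(data_key)
        self.children.append(child)
        return child


first = StorageArea("a")
for k in ["a.x.1", "a.x.2", "a.y.1"]:
    first.put(k, k.upper())
if first.is_full():
    third = first.add_child("a.x")   # the new "third storage area"
```

After the split, only the keys under "a.x" have moved. Data under "a.y" stays in the parent, so adding a storage area touches a single subtree instead of redistributing the whole data set.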

The first storage area may request, from one or more of its child storage areas, a first list of data whose keys fall within a search range, merge a second list of the data stored in the first storage area itself that falls within the search range into the first list returned in response to the request, and return the merged list as the result for the search range.
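A recursive range search over such a tree could be sketched as below. The `StorageArea` class here is a hypothetical stand-in, keys are compared as plain strings (an assumption about ordering), and every child is queried without pruning, which a practical implementation would likely refine.

```python
class StorageArea:
    """Hypothetical storage area with locally stored items and child areas."""

    def __init__(self, key, items=None, children=None):
        self.key = key
        self.items = dict(items or {})
        self.children = list(children or [])

    def range_search(self, low, high):
        # First list: results requested from the child storage areas.
        first_list = []
        for child in self.children:
            first_list.extend(child.range_search(low, high))
        # Second list: matching data held by this storage area itself.
        second_list = [(k, v) for k, v in self.items.items() if low <= k <= high]
        # Merge both lists and return them as the result for the range.
        return sorted(first_list + second_list)


child_x = StorageArea("a.x", {"a.x.1": 1, "a.x.2": 2})
child_y = StorageArea("a.y", {"a.y.1": 3})
root = StorageArea("a", {"a.k": 9}, [child_x, child_y])
result = root.range_search("a.k", "a.x.9")
```

The caller sees one merged, ordered list and does not need to know which storage area held each entry.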

According to an embodiment of the present invention, there is provided a data storage method including arranging one or more storage areas in a tree structure (each storage area corresponding to one node of the tree), assigning a hierarchical key having zero or more subkeys to each of the one or more storage areas, and storing data corresponding to a first key of an arbitrary first storage area among the one or more storage areas in the storage areas of the subtree rooted at the first storage area, wherein the first key is a key formed by concatenating one or more subkeys to a second key, the second key is the key of a second storage area, and the second storage area is the parent storage area of the first storage area.

The storing may include storing, in the first storage area, those data that correspond to the key of the first storage area but do not correspond to the key of any child storage area of the first storage area.

The data storage method may further include adding a third storage area to the one or more storage areas as a child storage area of the first storage area, and moving data corresponding to a third key of the third storage area from the first storage area to the third storage area.

Adding the third storage area and moving the data may be performed when the amount stored in the first storage area reaches a predefined threshold.

The data storage method may further include requesting, from one or more child storage areas of the first storage area, a first list of data whose keys fall within a search range, the one or more child storage areas returning the first list, and merging a second list of the data stored in the first storage area that falls within the search range into the returned first list and returning the merged list as the result for the search range.

According to the present invention, it is possible to provide an apparatus and method that maintain data consistency by having one or more nodes process a data request simultaneously via multicast.

According to the present invention, it is possible to provide a data management system that can easily handle the insertion, removal, or failure of a node, because each of the one or more nodes maintains a connection only with the messaging channel.

According to the present invention, it is possible to provide an apparatus and method that select, with load balancing taken into account, the node among one or more nodes that processes a client's data access request.

According to the present invention, it is possible to provide an apparatus and method that store data using one or more storage areas arranged in a tree structure.

According to the present invention, it is possible to provide an apparatus and method that use a hierarchical key to determine the storage area in which data is stored.

According to the present invention, it is possible to provide an apparatus and method that, as data is stored, expand the storage areas in a tree structure and move data to the added storage area.

According to the present invention, it is possible to provide an apparatus and method that transmit a query to the child storage areas corresponding to child nodes, merge the data lists returned by the child storage areas, and return the merged list as the search result of the query.

A diagram for explaining asynchronous writing in a master-slave structure.
A diagram for explaining a multi-master replication scheme.
A diagram showing the structure of a data management system according to an example of the present invention.
A signal flowchart of a data management method according to an embodiment of the present invention.
A signal flowchart of a data management method according to an example of the present invention.
A block diagram of a messaging channel according to an embodiment of the present invention.
A diagram showing a data storage device according to an embodiment of the present invention.
A diagram for explaining the process of adding a storage area to a data storage device according to an example of the present invention.
A diagram for explaining a range search on a data storage device according to an example of the present invention.
A flowchart of a data storage method according to an embodiment of the present invention.
A flowchart of an expansion method in a data storage device according to an embodiment of the present invention.
A flowchart of a range search in a data storage device according to an example of the present invention.

Hereinafter, an embodiment of the present invention is described in detail with reference to the drawings. However, the present invention is not restricted or limited to the following embodiments. Like reference numerals in the drawings denote like members.

The embodiments of the present invention described below provide a method of implementing, in a multi-master manner, a distributed processing system that uses a key-value DB.

When a data management system is implemented, multiple nodes holding the same data need to be configured for purposes such as fault tolerance and load balancing. A node means a single unit that processes some piece of work. For example, a node may be one physical or logical server (or DB).

In the present invention, a cluster means a set made up of multiple replicas. A cluster may include one or more nodes. The one or more nodes in a cluster provide the same data to clients.

FIG. 1 is a diagram for explaining asynchronous writing in a master-slave structure.

When availability or performance is emphasized over the consistency described above, various schemes are used for replication. For example, asynchronous writes may be applied even in a master-slave structure. When No-SQL uses the master-slave model with two or more replicas, synchronized writes may be applied to one (or more) of the replicas and asynchronous writes to the rest.

An asynchronous write in a master-slave structure proceeds through the following steps (1) to (4).
(1) The client 110 requests a write (150).
(2) The master 120 performs the requested write (160).
(3) The master 120 requests an asynchronous write from the slave 130 (or from one or more slaves 132, 134, and 136) (170).
(4) The master 120 notifies the client 110 of the write result (180).
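The asynchronous variant can be sketched in Python as follows. The `Master` and `Slave` classes, the `pending` queue, and the `replicate` method are hypothetical names introduced for illustration, and the network is replaced by an in-process queue that is drained after the client has already been answered.

```python
from collections import deque


class Slave:
    """Hypothetical slave replica with a flag that simulates a failure."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.data = {}


class Master:
    def __init__(self, slaves):
        self.data = {}
        self.slaves = slaves
        self.pending = deque()   # replication work not yet applied to slaves

    def handle_write(self, key, value):
        self.data[key] = value                         # (2) master writes
        for slave in self.slaves:
            self.pending.append((slave, key, value))   # (3) request async write
        return "ok"                                    # (4) reply without waiting

    def replicate(self):
        """Apply queued writes later; return names of slaves that missed one."""
        missed = []
        while self.pending:
            slave, key, value = self.pending.popleft()
            if slave.healthy:
                slave.data[key] = value
            else:
                missed.append(slave.name)
        return missed


slaves = [Slave("slave-132"), Slave("slave-134", healthy=False)]
master = Master(slaves)
reply = master.handle_write("A", 1)   # (1) client requests the write
missed = master.replicate()
```

In this run the client receives "ok" even though one slave never applies the write, which is the kind of later inconsistency that asynchronous replication risks.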

In these steps, the input/output (I/O) at the slave 130 overlaps in time with the client 110 learning the result of the write request. Consequently, if a failure at the slave 130 prevents the write from completing correctly, a consistency problem arises later.

FIG. 2 illustrates a multi-master replication scheme.

An RDBMS, for which data consistency is important (that is, it is important that every copy of the data is correct), uses the master-slave model, whereas a DB that uses No-SQL may use a multi-master (or no-master) replication scheme.

When the multi-master model is used, the number of replicas needs to be specified at the outset.

The first node 210, the second node 220, and the third node 230 are each replicas in the multi-master model.

In a system that uses the multi-master model, different clients may access different replicas of the same content and update it, so that inconsistent information may be delivered to various clients.

Suppose two clients request updates to information A at the first node 210 and the third node 230, respectively. A client that subsequently reads information A obtains a different value depending on when and through which node (for example, the first node 210 or the third node 230) it accesses information A. If the client reads information A before the updated content has been fully replicated to the other nodes, the value it obtains depends on the node through which it accessed information A. Likewise, if the updates to A made at the first node 210 and the third node 230 differ from each other, the value the client obtains depends on the node through which it accessed information A.
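The divergence described here is easy to reproduce in a toy Python model. The dictionary `replicas`, the node labels, and the starting value 0 are assumptions made for illustration; no replication between nodes is performed, which corresponds to reading before the updates have propagated.

```python
# Three hypothetical multi-master replicas, all starting with A = 0.
replicas = {"node-210": {"A": 0}, "node-220": {"A": 0}, "node-230": {"A": 0}}


def update(node, key, value):
    """Apply an update at one replica only (propagation has not happened yet)."""
    replicas[node][key] = value


def read(node, key):
    return replicas[node][key]


update("node-210", "A", 1)   # first client updates A at the first node
update("node-230", "A", 2)   # second client updates A at the third node

# A reader sees a different value depending on which node it contacts.
observed = {node: read(node, "A") for node in replicas}
```

Here `observed` holds three different values for the same information A, one per node.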

As described above, when information updates are requested almost simultaneously at several different nodes in a replication relationship and there is no explicit master, it is necessary to determine which node holds the correct information.

A system that uses the multi-master model therefore corrects information when the information is read. This correction is called read repair, and read repair may degrade the read performance of the system.

FIG. 3 is a diagram showing the structure of a data management system according to an example of the present invention.

The data management system 300 (hereinafter, the system 300) is a system that stores data using replicas. The system 300 includes a messaging channel 320 and one or more nodes 330. The system 300 may further include a level 4 (L4) switch 325.

For convenience of description, the one or more nodes 330 shown in the figure represent replicas (a cluster) that store and provide the same data. The one or more nodes 330 may operate in a multi-master manner. Although not shown, the system 300 may further include a plurality of nodes holding replicas of other data.

The system 300 processes data access requests from a client 310. A data access request may be a request to read, write, insert, delete, or update specific data in the system 300.

There may be a plurality of clients 310. That is, the system 300 may receive a data access request from each client 310 and process it. A client 310 requests data from the system 300 by transmitting a request message that indicates the data request.

The messaging channel 320 acts as a middle tier that handles messages between the client 310 and the one or more nodes 330 that store data in the system. The messaging channel 320 may be message-oriented middleware (MOM) that performs the role of a router, or a server on which message-oriented middleware is installed. The messaging channel 320 may also be composed of a plurality of servers or software daemons in order to expand its processing capacity.

メッセージングチャネル３２０と接続される１つ以上のノード３３０は、固有のアドレス体系を用いる。メッセージングチャネル３２０で各ノードにメッセージを送信するためにアドレスを指定する方法は、（１）ユニキャスト（ｕｎｉｃａｓｔ）、（２）エニーキャスト（ａｎｙｃａｓｔ）、及び（３）マルチキャスト（または、ブロードキャスト）のいずれか１つ以上であってもよい。アドレスを指定する方法に応じて、メッセージングチャネル３２０が接続されたノードにメッセージを送信する方式が異なる。 The one or more nodes 330 connected to the messaging channel 320 use a unique addressing scheme. The messaging channel 320 may address a message to a node by one or more of (1) unicast, (2) anycast, and (3) multicast (or broadcast). How the messaging channel 320 delivers a message to the connected nodes depends on the addressing method.

ユニキャストは、固有アドレスによって指定された１つのノードにのみメッセージングチャネル３２０がメッセージを送信する方式である。エニーキャストは、メッセージングチャネル３２０がいずれかの群（ｄｏｍａｉｎ）の１つのノードにのみメッセージを送信する方式である。メッセージングチャネル３２０は、各ノード３３２、３３４、または３３６とユニキャストまたはエニーキャストでメッセージまたはデータを送受信する。例えば、メッセージングチャネル３２０は、データアクセス要請を処理するために、選択された特定ノードにユニキャストまたはエニーキャスト方式を用いて要請メッセージを送信する。 Unicast is a scheme in which the messaging channel 320 transmits a message to only one node specified by a unique address. Anycast is a scheme in which the messaging channel 320 sends a message only to one node in any domain. The messaging channel 320 sends and receives messages or data with each node 332, 334, or 336 in unicast or anycast. For example, the messaging channel 320 transmits a request message using a unicast or anycast method to a selected specific node in order to process a data access request.

マルチキャストは、いずれかの群の全てのノードにメッセージを送信する方式である。メッセージングチャネル３２０は、同一のデータを有する１つ以上のノード（すなわち、クラスタ）を１つの群として指定し、メッセージをマルチキャストしてもよい。したがって、メッセージングチャネル３２０は、特定のデータアクセス要請をマルチキャストして該当クラスタ内の全てのノードがデータアクセス要請を受信できるようにする。メッセージングチャネル３２０は、群に関する情報を管理する管理部またはこれを管理する別途のサーバをさらに含んで構成してもよく（図示せず）、ノードの追加または除去するときに新しいノードに関する情報をアップデートしてもよい。 Multicast is a scheme in which a message is sent to all nodes in a given group. The messaging channel 320 may designate one or more nodes having the same data (i.e., a cluster) as one group and multicast a message to it. Accordingly, the messaging channel 320 multicasts a specific data access request so that all nodes in the corresponding cluster receive it. The messaging channel 320 may further include a management unit that manages information about the groups, or a separate server that manages this information (not shown), and may update the information about a new node when a node is added or removed.
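As a rough illustration of the three addressing modes, the following Python sketch models a channel that keeps a registry of groups and delivers to in-memory inboxes. The class and method names (`MessagingChannel`, `register`, `unicast`, `anycast`, `multicast`) and the node addresses are hypothetical and are not taken from the patent; a real message-oriented middleware would deliver over a network.

```python
import random
from collections import defaultdict


class MessagingChannel:
    """Toy in-memory model of the channel; all names here are illustrative."""

    def __init__(self):
        self.inboxes = {}                # node address -> delivered messages
        self.groups = defaultdict(list)  # group name -> member addresses (one cluster)

    def register(self, address, group):
        # A joining node announces itself only to the channel.
        self.inboxes[address] = []
        self.groups[group].append(address)

    def unicast(self, address, message):
        # Deliver to exactly one node, identified by its unique address.
        self.inboxes[address].append(message)

    def anycast(self, group, message):
        # Deliver to a single, arbitrarily chosen member of the group.
        address = random.choice(self.groups[group])
        self.inboxes[address].append(message)
        return address

    def multicast(self, group, message):
        # Deliver to every member of the group.
        for address in self.groups[group]:
            self.inboxes[address].append(message)


channel = MessagingChannel()
for addr in ("node332", "node334", "node336"):
    channel.register(addr, "cluster330")

channel.unicast("node334", "m1")
chosen = channel.anycast("cluster330", "m2")
channel.multicast("cluster330", "m3")
```

After these calls, `m1` is in one specific inbox, `m2` is in exactly one inbox chosen by the channel, and `m3` is in all three.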

ここで、各ノード３３２、３３４、または３３６は、メッセージングチャネル３２０にのみ接続されるだけであって、各ノード相互間は通信することができない。この場合、ノードの追加または削除が発生する場合、新しいノードまたは削除されたノードに関する情報は、メッセージングチャネル３２０にのみ送信されてもよい。 Here, each node 332, 334, or 336 is only connected to the messaging channel 320, and cannot communicate with each other. In this case, when a node addition or deletion occurs, information about the new or deleted node may be sent only to the messaging channel 320.

クライアント３１０がメッセージングチャネル３２０のプロトコルを把握している場合、クライアント３１０は、メッセージングチャネル３２０に直接データアクセス要請を送信してもよい。または、クライアント３１０は、各ノード３３２、３３４または３３６に直接データアクセス要請を送信してもよい。この過程において、アクセス要請は、ネットワーク上に位置するＬ４スイッチ３２５を経由してもよい。Ｌ４スイッチ３２５は、仮想ＩＰ（ｖｉｒｔｕａｌＩＰ；ＶＩＰ）によって１つ以上のノード３３０を管理する。クライアント３１０がＶＩＰを用いてデータアクセス要請を送信すると、ＶＩＰを有するＬ４スイッチ３２５は、データアクセス要請を受信した後、受信されたデータアクセス要請を１つ以上のノード３３０に適切に分配する。 If the client 310 knows the protocol of the messaging channel 320, the client 310 may send a data access request directly to the messaging channel 320. Alternatively, the client 310 may send a data access request directly to a node 332, 334, or 336. In this case, the access request may pass through the L4 switch 325 located on the network. The L4 switch 325 manages the one or more nodes 330 through a virtual IP (VIP). When the client 310 sends a data access request to the VIP, the L4 switch 325 holding the VIP receives the request and distributes it appropriately among the one or more nodes 330.

下記では、メッセージングチャネル３２０を用いて１つ以上のノード３３０の全てが同一のデータを維持及び提供する方法について説明する。また、下記の方法を用いることによってマルチマスタ複製が実現されることができる。 In the following, a method is described in which all of one or more nodes 330 maintain and provide the same data using messaging channel 320. In addition, multi-master replication can be realized by using the following method.

図４は、本発明の一実施形態に係るデータ管理方法の信号フローチャートである。 FIG. 4 is a signal flowchart of a data management method according to an embodiment of the present invention.

本実施形態において、端末（すなわち、第１ノード３３２、第２ノード３３４及び第３ノード３３６）は、クライアント３１０からのデータアクセス要請を処理する。 In the present embodiment, the terminals (that is, the first node 332, the second node 334, and the third node 336) process the data access request from the client 310.

ステップＳ４１０において、クライアント３１０はメッセージングチャネル３２０にデータアクセス要請を行う。データアクセス要請は、特定のデータ（例えば、オブジェクト）の読み出し、書込み、挿入、削除、または更新のいずれか１つ以上であってもよい。メッセージングチャネル３２０は、クライアント３１０からデータアクセス要請を受信する。 In step S410, the client 310 makes a data access request to the messaging channel 320. The data access request may be any one or more of reading, writing, inserting, deleting, or updating specific data (for example, an object). Messaging channel 320 receives a data access request from client 310.

ステップＳ４２０において、メッセージングチャネル３２０は、データアクセス要請に対して特定データを含むノードで構成されたクラスタから１つのノードを選択する。説明の便宜のために図４に示された１つ以上のノード３３０（すなわち、第１ノード３３２、第２ノード３３４、および第３ノード３３６を含むクラスタ）が、要請されたデータを含むノードの集合であるとする。したがって、メッセージングチャネル３２０は、１つ以上のノード３３０から１つのノードを選択する。メッセージングチャネル３２０は、ラウンド・ロビン（ｒｏｕｎｄ−ｒｏｂｉｎ）方式などを利用したり、ロードバランシングに基づいて選択されたノードを決定してもよい。本実施形態では、第２ノード３３４が選択された場合について説明する。 In step S420, the messaging channel 320 selects, for the data access request, one node from the cluster composed of nodes containing the specific data. For convenience of explanation, the one or more nodes 330 shown in FIG. 4 (i.e., the cluster including the first node 332, the second node 334, and the third node 336) are assumed to be the set of nodes containing the requested data. Accordingly, the messaging channel 320 selects one node from the one or more nodes 330. The messaging channel 320 may determine the selected node using a round-robin scheme or the like, or based on load balancing. In the present embodiment, the case where the second node 334 is selected is described.
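The two selection policies named in step S420 can be sketched as follows. This is only an illustration under assumptions: the function names `round_robin` and `least_loaded`, the node labels, and the load figures are hypothetical, and "load balancing" is approximated here by picking the node with the smallest reported load.

```python
from itertools import cycle


def round_robin(nodes):
    """Return a function that yields the next node in a fixed rotation."""
    rotation = cycle(list(nodes))
    return lambda: next(rotation)


def least_loaded(load_by_node):
    """Return the node with the smallest current load (a simple balancing rule)."""
    return min(load_by_node, key=load_by_node.get)


select = round_robin(["node332", "node334", "node336"])
picks = [select() for _ in range(4)]

balanced = least_loaded({"node332": 5, "node334": 2, "node336": 7})
```

Either policy returns one member of the cluster; the rest of the flow does not depend on which policy was used.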

ステップＳ４３０において、メッセージングチャネル３２０は、選択されたノード（すなわち、第２ノード３３４）にデータアクセス要請を送信する。 In step S430, the messaging channel 320 sends a data access request to the selected node (ie, the second node 334).

選択されたノード（すなわち、第２ノード３３４）は、メッセージングチャネル３２０からクライアント３１０によって送信されたデータアクセス要請を受信する。ここで、データアクセス要請を受信した選択されたノード（第２ノード３３４）は、直ちにデータアクセス要請に対する処理（例えば、データの挿入または削除）を行わない場合がある。 The selected node (ie, second node 334) receives the data access request sent by client 310 from messaging channel 320. Here, the selected node (second node 334) that has received the data access request may not immediately perform processing (for example, insertion or deletion of data) in response to the data access request.

もし、データアクセス要請がデータに対する読み出し要請である場合、下記のステップＳ４４０、Ｓ４５０、Ｓ４６２、Ｓ４６４、及びＳ４６６を行わなくてもよい。この場合、データは選択されたノード（第２ノード３３４）からメッセージングチャネル３２０を経由してクライアント３１０に送信されてもよく、または、クライアント３１０に直接送信されてもよい。 If the data access request is a read request for data, the following steps S440, S450, S462, S464, and S466 may not be performed. In this case, the data may be sent from the selected node (second node 334) to the client 310 via the messaging channel 320 or directly to the client 310.

ステップＳ４４０において、選択されたノード（第２ノード３３４）は、データアクセス要請に対するマルチキャスト要請をメッセージングチャネル３２０に送信する。メッセージングチャネル３２０は、選択されたノード（第２ノード３３４）からマルチキャスト要請を受信する。 In step S440, the selected node (second node 334) transmits a multicast request for the data access request to the messaging channel 320. The messaging channel 320 receives a multicast request from the selected node (second node 334).

ステップＳ４５０において、メッセージングチャネル３２０は、データアクセス要請を１つ以上のノード３３０にマルチキャストする。マルチキャストの対象は１つ以上のノード３３０（すなわち、選択されたノードの全ての複製）である。マルチキャストを要請する選択されたノード（第２ノード３３４）自体にも、マルチキャストを介してデータアクセス要請が送信される。１つ以上のノード３３０それぞれは、マルチキャストされたデータアクセス要請を受信する。 In step S450, the messaging channel 320 multicasts the data access request to one or more nodes 330. The target of multicast is one or more nodes 330 (ie, all replicas of the selected node). A data access request is also transmitted via multicast to the selected node (second node 334) itself requesting multicast. Each of the one or more nodes 330 receives a multicast data access request.

メッセージングチャネル３２０がマルチキャストしたデータアクセス要請は、１つ以上のノード３３０に（論理的に）同時に到着する。また、１つ以上のノード３３０それぞれは、マルチキャストを介してデータアクセス要請を受信したとき、実際にデータアクセス要請を処理してもよい。 Data access requests multicasted by messaging channel 320 arrive at one or more nodes 330 simultaneously (logically). Each of the one or more nodes 330 may actually process the data access request when receiving the data access request via multicast.

ステップＳ４６２、Ｓ４６４、及びＳ４６６において、１つ以上のノード３３０それぞれは、マルチキャストされたデータアクセス要請を処理する。例えば、１つ以上のノード３３０それぞれは、受信されたデータアクセス要請に応じてデータの書込み、挿入、削除、または更新作業などを行う。 In steps S462, S464, and S466, each of the one or more nodes 330 processes the multicast data access request. For example, each of the one or more nodes 330 performs data writing, insertion, deletion, or update operation in response to the received data access request.

したがって、メッセージングチャネル３２０が複数のデータアクセス要請を順次マルチキャストすることによって、１つ以上のノード３３０は、全て同一の順序で複数のデータアクセス要請を処理することができる。システム３００は、アクセス要請の順序を把握するために、システム３００内の全てのノードを時間同期化してもよく、各要請に対するタイムスタンプを用いてもよい。システム３００は、時間同期化のためにＮＴＰ（ＮｅｔｗｏｒｋＴｉｍｅＰｒｏｔｏｃｏｌ）を用いてもよい。したがって、１つ以上のノード３３０が、管理するデータの一貫性を維持することができる。 Accordingly, the messaging channel 320 sequentially multicasts a plurality of data access requests so that one or more nodes 330 can process the plurality of data access requests in the same order. The system 300 may time-synchronize all the nodes in the system 300 in order to grasp the order of access requests, and may use a time stamp for each request. The system 300 may use NTP (Network Time Protocol) for time synchronization. Therefore, one or more nodes 330 can maintain the consistency of the data managed.
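The flow of FIG. 4 can be sketched in a few lines of Python. This is a minimal single-process model under stated assumptions: `Node`, `Channel`, and their methods are hypothetical names, requests are `(operation, key, value)` tuples, and a simple sequence counter in the channel stands in for the timestamp-based ordering mentioned above.

```python
class Node:
    def __init__(self, name, channel):
        self.name = name
        self.channel = channel
        self.data = {}
        self.applied = []  # sequence numbers in the order they were applied

    def on_client_request(self, request):
        # S430/S440: do not apply yet; ask the channel to multicast the request.
        self.channel.request_multicast(request)

    def on_multicast(self, seq, request):
        # S462-S466: apply the request in the order fixed by the channel.
        op, key, value = request
        if op == "delete":
            self.data.pop(key, None)
        else:  # "insert", "update", or "write"
            self.data[key] = value
        self.applied.append(seq)


class Channel:
    def __init__(self):
        self.nodes = []
        self.next_seq = 0
        self.turn = 0

    def handle_client(self, request):
        # S410-S430: pick one replica (round-robin here) and forward the request.
        node = self.nodes[self.turn % len(self.nodes)]
        self.turn += 1
        node.on_client_request(request)

    def request_multicast(self, request):
        # S450: a single sequencer, so every replica sees the same order.
        seq = self.next_seq
        self.next_seq += 1
        for node in self.nodes:  # includes the requesting node itself
            node.on_multicast(seq, request)


channel = Channel()
channel.nodes = [Node(n, channel) for n in ("node332", "node334", "node336")]
channel.handle_client(("insert", "k", 1))
channel.handle_client(("update", "k", 2))
channel.handle_client(("delete", "k", None))
channel.handle_client(("insert", "j", 9))
```

Because every write passes through the channel before any replica applies it, all three replicas end with identical contents even though each request entered through a different node.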

図５は、本発明の一例に係るデータ管理方法の信号フローチャートである。 FIG. 5 is a signal flowchart of a data management method according to an example of the present invention.

ステップＳ５１０において、選択されたノード（すなわち、第２ノード３３４）は、データアクセス要請を受信する。 In step S510, the selected node (ie, the second node 334) receives the data access request.

データアクセス要請は、クライアント３１０から直接に送信されたものであってもよい。 The data access request may be sent directly from the client 310.

データアクセス要請は、Ｌ４スイッチ３２５を経由して送信されたものであってもよい。Ｌ４スイッチ３２５は、クライアント３１０からデータアクセス要請を受信する。Ｌ４スイッチ３２５は、１つ以上のノード３３０のうち１つのノードを選択し、データアクセス要請を選択されたノードに送信する。Ｌ４スイッチ３２５は、ラウンド・ロビン方式などを用いたり、ロードバランシングに基づいて選択されたノードを決定してもよい The data access request may be transmitted via the L4 switch 325. The L4 switch 325 receives a data access request from the client 310. The L4 switch 325 selects one of the one or more nodes 330 and transmits a data access request to the selected node. The L4 switch 325 may use a round robin method or the like, or may determine a selected node based on load balancing.

ここで、データアクセス要請を受信した選択されたノード（第２ノード３３４）は、直ちにデータアクセス要請に対する処理を行わない場合がある。 Here, the selected node (second node 334) that has received the data access request may not immediately process the data access request.

ステップＳ５２０において、選択されたノード（第２ノード３３４）は、データアクセス要請に対するマルチキャストをメッセージングチャネル３２０に要請する。すなわち、選択されたノード（第２ノード３３４）は、データアクセス要請のマルチキャスト要請をメッセージングチャネル３２０に送信する。メッセージングチャネル３２０は、選択されたノード（すなわち、第２ノード３３４）からデータアクセス要請に対するマルチキャスト要請を受信する。 In step S520, the selected node (second node 334) requests the messaging channel 320 for multicast for the data access request. That is, the selected node (second node 334) transmits a multicast request for a data access request to the messaging channel 320. The messaging channel 320 receives a multicast request for a data access request from a selected node (ie, the second node 334).

ステップＳ５４０において、メッセージングチャネル３２０は、データアクセス要請を１つ以上のノード３３０にマルチキャストする。マルチキャストの対象は、１つ以上のノード３３０（すなわち、選択されたノードの全ての複製）である。マルチキャストを要請する選択されたノード（第２ノード３３４）自体にも、マルチキャストを介してデータアクセス要請が送信される。１つ以上のノード３３０それぞれは、マルチキャストされたデータアクセス要請を受信する。 In step S540, the messaging channel 320 multicasts the data access request to one or more nodes 330. The target of the multicast is one or more nodes 330 (ie, all replicas of the selected node). A data access request is also transmitted via multicast to the selected node (second node 334) itself requesting multicast. Each of the one or more nodes 330 receives a multicast data access request.

ステップＳ５５２、ステップＳ５５４、ステップＳ５５６において、１つ以上のノード３３０それぞれは、マルチキャストを介して受信されたデータアクセス要請を処理する。 In step S552, step S554, and step S556, each of the one or more nodes 330 processes a data access request received via multicast.

例えば、１つ以上のノード３３０それぞれは、受信されたデータアクセス要請に応じてデータの挿入、削除、または更新作業を行う。各ノード（例えば、第１ノード３３２、第２ノード３３４、または第３ノード３３６）は、データアクセス要請をマルチキャストを介して受信した後、データアクセス要請を処理する。データアクセス要請をマルチキャストするメッセージングチャネル３２０は、各ノードの立場におけるデータアクセス要請の順序を把握することができる。システム３００はアクセス要請の順序を把握するため、システム３００内の全てのノードを時間同期化してもよく、各要請に対するタイムスタンプを用いてもよい。システム３００は、時間同期化のためにＮＴＰ（ＮｅｔｗｏｒｋＴｉｍｅＰｒｏｔｏｃｏｌ）を用いてもよい。したがって、メッセージングチャネル３２０は、複数のノードが互いに異なるデータを提供する場合、いずれのデータが正しいデータであるかを常に把握できる。したがって、システム３００は、マルチマスタモデルを使用するにもかかわらず、読み出し補正を行うことなくクライアント３１０にデータを提供することができる。 For example, each of the one or more nodes 330 performs a data insertion, deletion, or update operation in response to the received data access request. Each node (eg, the first node 332, the second node 334, or the third node 336) processes the data access request after receiving the data access request via multicast. The messaging channel 320 for multicasting data access requests can grasp the order of data access requests at the standpoint of each node. In order to grasp the order of access requests, the system 300 may synchronize all nodes in the system 300 with time, or may use a time stamp for each request. The system 300 may use NTP (Network Time Protocol) for time synchronization. Therefore, the messaging channel 320 can always know which data is correct data when a plurality of nodes provide different data. Thus, the system 300 can provide data to the client 310 without performing read correction despite using a multi-master model.

１つ以上のノード３３０のうち、特定ノードに障害が発生したり、１つ以上のノード３３０から特定ノードを削除するとき、または、１つ以上のノード３３０に特定ノードを追加するとき、メッセージングチャネル３２０は、障害、削除、または追加に関する情報を把握することができる。 When a failure occurs in a specific node among the one or more nodes 330, when a specific node is removed from the one or more nodes 330, or when a specific node is added to the one or more nodes 330, the messaging channel 320 can obtain information about the failure, removal, or addition.

したがって、図４及び図５を参照して前述した方法は、便宜または目的などに応じてクラスタ内に複製（すなわち、ノード）を自由に追加及び削除できる柔軟な拡張性を提供することができる。また、前述した方法は、クラスタ内のいずれのノードでデータ照会を処理しても同一の結果を提供することができる。したがって、ロードバランシングの効果が提供される。 Therefore, the method described above with reference to FIG. 4 and FIG. 5 can provide flexible extensibility in which replicas (ie, nodes) can be freely added and deleted in the cluster according to convenience or purpose. Also, the above-described method can provide the same result regardless of whether the data query is processed at any node in the cluster. Therefore, a load balancing effect is provided.

図４及び図５を参照して前述した実施形態では、１つ以上のノード３３０それぞれは、他のノードと直接的に通信せず、自身以外にはどのようなノードが存在するかも把握できないまま動作する。 In the embodiments described above with reference to FIGS. 4 and 5, each of the one or more nodes 330 does not communicate directly with the other nodes and operates without knowing which other nodes exist.

したがって、１つ以上のノード３３０に特定ノード（すなわち、複製）が追加または削除される場合、追加または削除の処理は、メッセージングチャネル３２０に対してのみ行われ、他のノードは追加または削除に影響を受けることなく動作することができる。 Accordingly, when a specific node (i.e., a replica) is added to or removed from the one or more nodes 330, the addition or removal is handled only with respect to the messaging channel 320, and the other nodes can operate without being affected by it.

例えば、システム拡張のために１つのノードが追加される場合、該当ノードはメッセージングチャネル３２０に該当ノードに関する情報を送信し、メッセージングチャネル３２０との接続を生成する。その後には、該当ノードが含んでいるデータアクセスの要請を受けてもよい。 For example, when one node is added for system expansion, the node transmits information about itself to the messaging channel 320 and establishes a connection with the messaging channel 320. Thereafter, the node may receive access requests for the data it contains.

一方、メッセージングチャネル３２０は、各ノードの状態をチェックしてもよい。例えば、メッセージングチャネル３２０はハートビート（ｈｅａｒｔｂｅａｔ）メッセージを周期的に受信したり、メッセージ送信に対する応答有無に基づいてシステムの障害有無を判別する。または、システム上、必要に応じて特定ノードを除去する場合、メッセージングチャネル３２０に除去されたノードに関する情報が送信されてもよい。もし、１つのノードに障害が発生したと判断されたり、特定ノードが除去される場合、メッセージングチャネル３２０は、該当ノードにこれ以上のメッセージを送信しない。このようにノードがなくなる場合、メッセージングチャネル３２０は、必要に応じて複製数を維持するための移動作業を指示してもよい。 Meanwhile, the messaging channel 320 may check the state of each node. For example, the messaging channel 320 determines whether a failure has occurred by periodically receiving heartbeat messages or based on whether a response to a transmitted message is received. Alternatively, when a specific node is removed from the system as needed, information about the removed node may be transmitted to the messaging channel 320. If it is determined that a node has failed, or if a specific node is removed, the messaging channel 320 sends no further messages to that node. When a node is lost in this way, the messaging channel 320 may, as needed, direct a migration operation to maintain the number of replicas.
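The heartbeat check can be pictured with the short Python sketch below. It is an assumption-laden illustration, not the patent's implementation: `HeartbeatMonitor`, its methods, and the 3-second timeout are hypothetical, and time is passed in explicitly so that the behavior is deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat per node; names and timeout are illustrative."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # node -> time of the most recent heartbeat

    def beat(self, node, now):
        # Called whenever a heartbeat message arrives from a node.
        self.last_seen[node] = now

    def remove(self, node):
        # Called when a node is deliberately removed from the system.
        self.last_seen.pop(node, None)

    def live_nodes(self, now):
        # Nodes whose last heartbeat is recent enough to keep receiving messages.
        return sorted(n for n, t in self.last_seen.items()
                      if now - t <= self.timeout)


monitor = HeartbeatMonitor(timeout=3.0)
for name in ("node332", "node334", "node336"):
    monitor.beat(name, now=0.0)
monitor.beat("node332", now=4.0)
monitor.beat("node336", now=4.0)  # node334 has gone silent

alive = monitor.live_nodes(now=5.0)
```

A channel built this way would restrict multicast targets to `alive` and could trigger re-replication when the list shrinks.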

このようにシステム３００は、ノードの挿入、削除、または故障を容易に処理することができる。 In this way, the system 300 can easily handle the insertion, removal, or failure of a node.

図６は、本発明の一実施形態に係るメッセージングチャネル３２０のブロック図である。 FIG. 6 is a block diagram of a messaging channel 320 according to one embodiment of the invention.

メッセージングチャネル３２０は、受信部６１０、制御部６２０、及び送信部６３０を備える。 The messaging channel 320 includes a reception unit 610, a control unit 620, and a transmission unit 630.

受信部６１０は、ネットワークを介してデータを受信する。例えば、ステップＳ４１０及びステップＳ５１０で、受信部６１０はクライアント３１０からデータアクセス要請を受信する。また、ステップＳ４４０及びステップＳ５２０で、受信部６１０は、選択されたノードからデータアクセス要請の１つ以上のノード３３０へのマルチキャスト要請を受信する。 The receiving unit 610 receives data via a network. For example, in steps S410 and S510, the receiving unit 610 receives a data access request from the client 310. In steps S440 and S520, the receiving unit 610 receives, from the selected node, a request to multicast the data access request to the one or more nodes 330.

制御部６２０は、例えば、ステップＳ４２０において、１つ以上のノード３３０のうち選択されるノードを決定する。制御部６２０はラウンドロビンまたはロードバランシングに基づいて１つ以上のノード３３０のうち選択されるノードを決定してもよい。 For example, in step S420, the control unit 620 determines a node to be selected from the one or more nodes 330. The controller 620 may determine a node to be selected from the one or more nodes 330 based on round robin or load balancing.

送信部６３０は、ネットワークを介してデータを送信する。例えば、ステップＳ４３０において、送信部６３０は、選択されたノードにデータアクセス要請を送信する。また、ステップＳ４５０及びステップＳ５４０において、送信部６３０は、データアクセス要請を１つ以上のノード３３０にマルチキャストする。 The transmission unit 630 transmits data via the network. For example, in step S430, the transmission unit 630 transmits a data access request to the selected node. In step S450 and step S540, the transmission unit 630 multicasts a data access request to one or more nodes 330.

一方、メッセージングチャネル３２０は、ノード情報を管理する管理部（図示せず）をさらに備えてもよい。管理部は、ノードの情報、同一のデータの複製であるクラスタ情報を管理してもよく、メッセージングチャネル３２０がクラスタを１つの群として取り扱ってブロードキャスティングできるようにする。管理部では、ノードの追加または除去時に新しいノードに関する情報をアップデートしてもよい。 Meanwhile, the messaging channel 320 may further include a management unit (not shown) that manages node information. The management unit may manage information about the nodes and about clusters, each of which is a set of replicas of the same data, so that the messaging channel 320 can treat a cluster as one group and broadcast to it. The management unit may update the information about a new node when a node is added or removed.

先に図１から図５を参照して説明された本発明の一実施形態に係る技術的な内容が本実施形態にそのまま適用されてもよい。したがって、本詳細な説明は以下では省略する。 The technical contents according to the embodiment of the present invention described above with reference to FIGS. 1 to 5 may be directly applied to the present embodiment. Therefore, this detailed description is omitted below.

図７は、本発明の一実施形態に係るデータ格納装置を示す図である。 FIG. 7 is a diagram showing a data storage device according to an embodiment of the present invention.

データ格納装置７００は、１つ以上の格納領域７１０、７２０、７３０、７４０及び７５０を備えてもよい。ここで「格納領域」とは、データを格納する物理的または論理的な空間を意味する。例えば、「格納領域」とは、１つの関係型データベースまたはファイルシステム、あるいは同一のデータに対する複製の集合である分散クラスタであってもよい。 The data storage device 700 may include one or more storage areas 710, 720, 730, 740 and 750. Here, the “storage area” means a physical or logical space for storing data. For example, the “storage area” may be one relational database or file system, or a distributed cluster that is a collection of replicas for the same data.

データ格納装置７００は、図３から図６を参照して説明した第１ノード３３２、第２ノード３３４及び第３ノード３３６の１つ以上に対応してもよい。 The data storage device 700 may correspond to one or more of the first node 332, the second node 334, and the third node 336 described with reference to FIGS.

各格納領域は、論理的にツリー形態の階層的な構造を有してもよい。言い換えれば、各格納領域は、ツリーにおける１つのノードに対応し、任意の２つの格納領域の間には２つの格納領域に対応するノード間の関係によって親−子関係または兄弟関係などが成り立つ。 Each storage area may have a logical tree-like hierarchical structure. In other words, each storage area corresponds to one node in the tree, and a parent-child relationship or a sibling relationship is established between any two storage areas depending on the relationship between the nodes corresponding to the two storage areas.

図７を参照すると、第１格納領域７１０はツリー構造のルートノードに、第２格納領域７２０及び第３格納領域７３０はそれぞれ第１格納領域７１０の左側の子ノード及び右側の子ノードに対応する。第１格納領域７１０は、第２格納領域７２０及び第３格納領域７３０の親格納領域である。第２格納領域７２０は、第１格納領域７１０の（左側）子格納領域である。第２格納領域７２０は、第３格納領域７３０の兄弟格納領域である。第４格納領域７４０及び第５格納領域７５０はそれぞれ第３格納領域７３０の左側の子ノード及び右側の子ノードに対応する。 Referring to FIG. 7, the first storage area 710 corresponds to the root node of the tree structure, and the second storage area 720 and the third storage area 730 correspond to the left child node and the right child node of the first storage area 710, respectively. The first storage area 710 is the parent storage area of the second storage area 720 and the third storage area 730. The second storage area 720 is the (left) child storage area of the first storage area 710. The second storage area 720 is a sibling storage area of the third storage area 730. The fourth storage area 740 and the fifth storage area 750 correspond to the left child node and the right child node of the third storage area 730, respectively.

格納領域７１０、７２０、７３０、７４０及び７５０それぞれには、各格納領域を識別するためのキー（ｋｅｙ）が割り当てられてもよい。このキーは、格納領域との間の階層構造を表すことができる形態で構成される。例えば、１つのキーは、１つ以上のサブキー（ｓｕｂ−ｋｅｙ）及びこれを区分する区分子（ｓｅｐａｒａｔｏｒ）を含んで構成されてもよい。ここで、サブキーは、任意のサブツリー内の全てのノードを代表する値になる。図７を参照すると、「ｋｏｒｅａ」は第１格納領域７１０をルートにするサブツリーを代表するキーであり、「ｓｅｏｕｌ」は第３格納領域７３０をルートにするサブツリーを代表するキーである。図７において、「ｋｏｒｅａ」、「ｇｙｅｏｎｇｇｉ」、「ｓｅｏｕｌ」、「ｋａｎｇｂｕｋ」、「ｋａｎｇｎａｍ」などはそれぞれ１つのサブキーであり、「．」を区分子として用いて「ｋｏｒｅａ．ｓｅｏｕｌ．ｋａｎｇｎａｍ」のような１つのｋｅｙを構成する。 Each of the storage areas 710, 720, 730, 740, and 750 may be assigned a key that identifies it. The key is structured so that it can represent the hierarchy among the storage areas. For example, one key may consist of one or more subkeys and a separator that delimits them. Here, a subkey is a value that represents all nodes in a given subtree. Referring to FIG. 7, “korea” is the key representing the subtree rooted at the first storage area 710, and “seoul” is the key representing the subtree rooted at the third storage area 730. In FIG. 7, “korea”, “gyeonggi”, “seoul”, “kangbuk”, and “kangnam” are each one subkey, and using “.” as the separator they form a single key such as “korea.seoul.kangnam”.

下記の数式（１）はこのように階層的なキーを示す正規式の一例である。 The following formula (1) is an example of a regular expression representing such a hierarchical key.

（１） (1)

ここで、ｋｅｙは階層的なキーを示し、ｋｅｙは英数字（ａｌｐｈａｎｕｍｅｒｉｃ）と区分子「．」で組合わせた文字列（ｓｔｒｉｎｇ）である。 Here, key denotes a hierarchical key; key is a character string (string) composed of alphanumeric characters and the separator “.”.

一般的な英文字、数字及び区分子「．」を用いる場合、キーは数式（１）のように表わすことができるが、本発明はこれに限定されることなく他の文字を含んだり、または、他の形態の区分子を用いてもよい。また、キーが必ず１つの文字列で構成される必要はなく、例えば、リンクリスト（ｌｉｎｋｅｄｌｉｓｔ）のような形態で構成してもよい。以下では区分子などに区分されたサブキーを順に第１サブキー、第２サブキー、．．．、および第ｎサブキーとする。ルートノードのレベルを１とするとき、レベルｎに位置するノードに対応する格納領域を識別するキーはｎ個のサブキーを含む。 When ordinary alphabetic characters, digits, and the separator “.” are used, a key can be expressed as in formula (1); however, the present invention is not limited thereto, and a key may include other characters or use a different form of separator. Moreover, a key need not consist of a single character string and may instead take a form such as a linked list. In the following, the subkeys delimited by the separator or the like are referred to, in order, as the first subkey, the second subkey, ..., and the n-th subkey. When the level of the root node is 1, the key identifying the storage area that corresponds to a node at level n contains n subkeys.
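For illustration, one regular expression consistent with the verbal description of a hierarchical key (alphanumeric subkeys joined by the separator “.”) is sketched below in Python. The exact pattern is an assumption and should not be read as the text of formula (1); `KEY_PATTERN`, `is_valid_key`, and `subkeys` are hypothetical names.

```python
import re

# Assumed pattern: one or more alphanumeric subkeys separated by ".".
KEY_PATTERN = re.compile(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*")


def is_valid_key(key):
    """True if the whole string is a well-formed hierarchical key."""
    return KEY_PATTERN.fullmatch(key) is not None


def subkeys(key):
    """Split a key into its first, second, ..., n-th subkeys."""
    return key.split(".")
```

Under this pattern `korea.seoul.kangnam` is a valid key with three subkeys, while strings with an empty subkey such as `korea..seoul` are rejected. A variant that also admits the empty string would cover the case where the root key is null.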

一方、ルートノードに対応する格納領域のキーは空白（ｎｕｌｌ）であってもよい。このような場合、ルートノードは０個のサブキーを、レベルｎに位置するノードに対応する格納領域を識別するキーはｎ−１個のサブキーを含んでもよい。例えば、階層的なキーは、数式（１）の正規式によって生成された文字列または空白文字列であってもよい。 On the other hand, the key of the storage area corresponding to the root node may be null. In such a case, the root node may include 0 subkeys, and the key for identifying the storage area corresponding to the node located at level n may include n-1 subkeys. For example, the hierarchical key may be a character string or a blank character string generated by the regular expression of Equation (1).

図７において、第１格納領域７１０に割り当てられたキー７１５は、「ｋｏｒｅａ」である。キー７１５は、第１サブキー「ｋｏｒｅａ」のみを有する。 In FIG. 7, the key 715 assigned to the first storage area 710 is “korea”. The key 715 has only the first subkey “korea”.

第２格納領域７２０に割り当てられたキー７２５は、「ｋｏｒｅａ．ｇｙｅｏｎｇｇｉ」である。キー７２５は、第１サブキー「ｋｏｒｅａ」及び第２サブキー「ｇｙｅｏｎｇｇｉ」を有する。第３格納領域７３０に割り当てられたキー７３５は、「ｋｏｒｅａ．ｓｅｏｕｌ」である。キー７３５は、第１サブキー「ｋｏｒｅａ」及び第２サブキー「ｓｅｏｕｌ」を有する。第４格納領域７４０に割り当てられたキー７４５は、「ｋｏｒｅａ．ｓｅｏｕｌ．ｋａｎｇｂｕｋ」である。キー７４５は、第１サブキー「ｋｏｒｅａ」、第２サブキー「ｓｅｏｕｌ」及び第３サブキー「ｋａｎｇｂｕｋ」を有する。第５格納領域７５０に割り当てられたキー７５５は、「ｋｏｒｅａ．ｓｅｏｕｌ．ｋａｎｇｎａｍ」である。キー７５５は、第１サブキー「ｋｏｒｅａ」、第２サブキー「ｓｅｏｕｌ」及び第３サブキー「ｋａｎｇｎａｍ」を有する。 The key 725 assigned to the second storage area 720 is “korea.gyeonggi”. The key 725 has a first subkey “korea” and a second subkey “gyeonggi”. The key 735 assigned to the third storage area 730 is “korea.seoul”. The key 735 has a first subkey “korea” and a second subkey “seoul”. The key 745 assigned to the fourth storage area 740 is “korea.seoul.kangbuk”. The key 745 has a first subkey “korea”, a second subkey “seoul”, and a third subkey “kangbuk”. The key 755 assigned to the fifth storage area 750 is “korea.seoul.kangnam”. The key 755 has a first subkey “korea”, a second subkey “seoul”, and a third subkey “kangnam”.

格納領域に割り当てられたキーは、格納領域の親格納領域のキーに１つ以上のサブキーが連鎖されたキーであってもよい。例えば、第４格納領域７４０に割り当てられたキーは、第４格納領域の親格納領域である第３格納領域７３０のキー「ｋｏｒｅａ．ｓｅｏｕｌ」にサブキー「ｋａｎｇｂｕｋ」が連鎖されたキーである。 The key assigned to a storage area may be the key of its parent storage area with one or more subkeys appended. For example, the key assigned to the fourth storage area 740 is the key “korea.seoul” of the third storage area 730, which is the parent storage area of the fourth storage area, with the subkey “kangbuk” appended.

すなわち、格納領域ｐ及び格納領域ｃが互いに親格納領域及び子格納領域の関係にある場合、親格納領域ｐのキーｋｐが第１サブキーｓｋ１から第ｎサブキーｓｋｎを含むと（ここで、ｎは１以上の整数である）、子格納領域ｃのキーｋｃは第１サブキーｓｋ１から第ｎサブキーｓｋｎを含み、第ｎ＋１サブキーｓｋ（ｎ＋１）から第ｍサブキーｓｋｍを含んでもよい。ここで、ｍはｎ＋１以上の整数である。 That is, when a storage area p and a storage area c are in a parent-child relationship, if the key kp of the parent storage area p contains the first subkey sk1 through the n-th subkey skn (where n is an integer of 1 or more), the key kc of the child storage area c contains the first subkey sk1 through the n-th subkey skn and may further contain the (n+1)-th subkey sk(n+1) through the m-th subkey skm, where m is an integer of n+1 or more.
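The parent-child relation between keys can be checked mechanically. The Python sketch below is illustrative only: `child_key` and `extends` are hypothetical helper names, keys are assumed to be “.”-separated strings, and an empty string stands for a null root key.

```python
def child_key(parent_key, *new_subkeys, sep="."):
    """Build a child key by appending one or more subkeys to the parent key."""
    if not new_subkeys:
        raise ValueError("a child key needs at least one additional subkey")
    parts = ([parent_key] if parent_key else []) + list(new_subkeys)
    return sep.join(parts)


def extends(candidate, parent_key, sep="."):
    """True if candidate is parent_key followed by one or more further subkeys."""
    p = parent_key.split(sep) if parent_key else []
    c = candidate.split(sep) if candidate else []
    return len(c) > len(p) and c[:len(p)] == p
```

For example, `child_key("korea.seoul", "kangbuk")` yields the key of the fourth storage area, and `extends` compares whole subkeys, so `korea.seoulx` is not treated as a descendant of `korea.seoul`.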

本発明の一実施形態によると、データ格納装置７００は、データを各格納領域のキーに応じて分類して格納する。データはキーを有する。データのキーは、データの分類体系に用いられる識別子である。データのキーは、データの分類体系上の位置を表してもよい。データ格納装置７００内のツリー構造が分類体系を表す場合、データはデータのキー値に応じて特定の格納領域（または、格納領域のキー）に対応する。 According to an embodiment of the present invention, the data storage device 700 classifies and stores data according to the key of each storage area. Each piece of data has a key. The data key is an identifier used in the classification system of the data and may represent the position of the data within that classification system. When the tree structure in the data storage device 700 represents the classification system, the data corresponds to a specific storage area (or to the key of a storage area) according to the value of its key.

特定の格納領域を示すノードをルートにしたサブツリーにおいて、サブツリー内の格納領域は特定の格納領域のキーに対応するデータを格納する。 In a subtree rooted at a node indicating a specific storage area, the storage area in the subtree stores data corresponding to the key of the specific storage area.

例えば、第３格納領域７３０をルートにしたサブツリー７６０において、サブツリー７６０内の格納領域７３０、７４０及び７５０は、第３格納領域７３０のキー７３５「ｋｏｒｅａ．ｓｅｏｕｌ」に対応するデータを格納する。 For example, in the subtree 760 rooted at the third storage area 730, the storage areas 730, 740, and 750 in the subtree 760 store data corresponding to the key 735 “korea.seoul” of the third storage area 730.

また、第４格納領域７４０をルートにしたサブツリーは、第４格納領域７４０のみを含む。したがって、第４格納領域７４０は、第４格納領域７４０のキー７４５「ｋｏｒｅａ．ｓｅｏｕｌ．ｋａｎｇｂｕｋ」に対応するデータを格納してもよい。 The subtree rooted at the fourth storage area 740 contains only the fourth storage area 740. Therefore, the fourth storage area 740 may store data corresponding to its key 745 “korea.seoul.kangbuk”.

また、第５格納領域７５０をルートにしたサブツリーは、第５格納領域７５０のみを含む。したがって、第５格納領域７５０は、第５格納領域７５０のキー７５５「ｋｏｒｅａ．ｓｅｏｕｌ．ｋａｎｇｎａｍ」に対応するデータを格納してもよい。 Likewise, the subtree rooted at the fifth storage area 750 contains only the fifth storage area 750. Therefore, the fifth storage area 750 may store data corresponding to its key 755 “korea.seoul.kangnam”.

ここで、格納領域のキーに対応するデータは、データのキーの接頭語（ｐｒｅｆｉｘ）のいずれか１つが格納領域のキーと同一のデータを意味する。 Here, the data corresponding to the storage area key means data in which one of the data key prefixes is the same as the storage area key.

キーｘの接頭語とは、階層的なキーｘがｎ個のサブキーｘ１、ｘ２、ｘ３、．．．、ｘｎを含むとき、ｘのサブキーのうち先頭からｉ個（ｉは１以上ｎ以下）のサブキーを含むキーを意味する。例えば、階層的なキー「ａ．ｂ．ｃ」の接頭語は「ａ」、「ａ．ｂ」及び「ａ．ｂ．ｃ」であってもよい。 A prefix of a key x is defined as follows: when the hierarchical key x contains n subkeys x1, x2, x3, ..., xn, a prefix is a key consisting of the first i subkeys of x (where i is between 1 and n inclusive). For example, the prefixes of the hierarchical key “a.b.c” are “a”, “a.b”, and “a.b.c”.
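The prefix definition translates directly into code. The following Python sketch assumes “.”-separated string keys; the function name `prefixes` is hypothetical.

```python
def prefixes(key, sep="."):
    """Return every prefix of a hierarchical key, shortest first.

    The i-th prefix is made of the first i subkeys, for i = 1 .. n.
    """
    parts = key.split(sep)
    return [sep.join(parts[:i]) for i in range(1, len(parts) + 1)]
```

A key with n subkeys therefore has exactly n prefixes, the last of which is the key itself.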

例えば、データのキーが「ｋｏｒｅａ．ｓｅｏｕｌ．ｋａｎｇｎａｍ」であれば、「ｋｏｒｅａ」、「ｋｏｒｅａ．ｓｅｏｕｌ」は前記キーの接頭語の１つであってもよい。したがって、前記データは、第１格納領域７１０のキー７１５「ｋｏｒｅａ」及び第３格納領域７３０のキー７３５「ｋｏｒｅａ．ｓｅｏｕｌ」に対応して、第４格納領域７４０のキー７４５及び第５格納領域７５０のキー７５５には対応しない。 For example, if the data key is “korea.seoul.kangnam”, “korea” and “korea.seoul” are each a prefix of that key. Accordingly, the data corresponds to the key 715 “korea” of the first storage area 710 and the key 735 “korea.seoul” of the third storage area 730, and does not correspond to the key 745 of the fourth storage area 740 or the key 755 of the fifth storage area 750.

例えば、データのキーが「ｋｏｒｅａ．ｓｅｏｕｌ．ｋａｎｇｎａｍ．ｓｈｉｎｓａ」であれば、キーの接頭語は「ｋｏｒｅａ」、「ｋｏｒｅａ．ｓｅｏｕｌ」、「ｋｏｒｅａ．ｓｅｏｕｌ．ｋａｎｇｎａｍ」であってもよい。データは、第１格納領域７１０のキー７１５、第３格納領域７３０のキー７３５及び第５格納領域７５０のキー７５５に対応する。 For example, if the data key is “korea.seoul.kangnam.shinsa”, its prefixes include “korea”, “korea.seoul”, and “korea.seoul.kangnam”. The data corresponds to the key 715 of the first storage area 710, the key 735 of the third storage area 730, and the key 755 of the fifth storage area 750.

データは、自らのキーに応じてデータ格納装置７００の１つ以上の格納領域７１０、７２０、７３０、７４０及び７５０のいずれか１つの格納領域内に格納されてもよい。 The data may be stored in any one of the one or more storage areas 710, 720, 730, 740, and 750 of the data storage device 700 according to its own key.

例えば、格納領域は、（１）格納領域のキーに対応し、（２）格納領域の子格納領域のキーには対応しないデータを格納してもよい。 For example, a storage area may store data that (1) corresponds to the key of the storage area and (2) does not correspond to the key of any child storage area of the storage area.

親格納領域ｐに対応するデータは、親格納領域ｐの子格納領域ｃにも対応する。データが格納領域ｐ及び格納領域ｃに対応する場合、データは、格納領域ｐではない格納領域ｃをルートにしたサブツリー内の格納領域のいずれか１つの格納領域内に格納されてもよい。 Data corresponding to a parent storage area p may also correspond to a child storage area c of the parent storage area p. When data corresponds to both the storage area p and the storage area c, the data may be stored not in the storage area p but in one of the storage areas of the subtree rooted at the storage area c.

データのキーが「ｋｏｒｅａ．ｓｅｏｕｌ．ｋａｎｇｎａｍ．ｓｈｉｎｓａ」または「ｋｏｒｅａ．ｓｅｏｕｌ．ｋａｎｇｎａｍ．ｓｈｉｎｓａ．１」であれば、データは、第１格納領域７１０、第３格納領域７３０、及び第５格納領域７５０に対応する。データは第５格納領域７５０内に格納されてもよい。 If the data key is “korea.seoul.kangnam.shinsa” or “korea.seoul.kangnam.shinsa.1”, the data corresponds to the first storage area 710, the third storage area 730, and the fifth storage area 750. The data may be stored in the fifth storage area 750.

データのキーが「ｋｏｒｅａ．ｇａｎｇｗｏｎ」であれば、データは第１格納領域７１０に対応する。データは第１格納領域７１０内に格納されてもよい。 If the data key is “korea.gangwon”, the data corresponds to the first storage area 710. The data may be stored in the first storage area 710.
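The placement rule in these examples amounts to choosing the deepest storage area whose key is a prefix of the data key. A minimal Python sketch follows, assuming the storage-area keys of FIG. 7 are held in a flat dictionary; `AREA_KEYS` and `placement` are hypothetical names, and the values are simply the reference numerals of the storage areas.

```python
# Storage-area keys from FIG. 7, mapped to their reference numerals.
AREA_KEYS = {
    "korea": 710,
    "korea.gyeonggi": 720,
    "korea.seoul": 730,
    "korea.seoul.kangbuk": 740,
    "korea.seoul.kangnam": 750,
}


def placement(data_key, area_keys=AREA_KEYS, sep="."):
    """Return the storage area that should hold data_key, or None."""
    parts = data_key.split(sep)
    # Try the longest prefix first, so the deepest matching area wins.
    for i in range(len(parts), 0, -1):
        prefix = sep.join(parts[:i])
        if prefix in area_keys:
            return area_keys[prefix]
    return None
```

With this rule, `korea.seoul.kangnam.shinsa` lands in storage area 750 and `korea.gangwon`, which matches no child key, stays in the root storage area 710.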

When the storage scheme described above is used, the storage area in which the actual data is located can be found from the hierarchical key, that is, from either the entire hierarchical key or a prefix of the key.

The following describes a search for data whose key is "korea.seoul.kangnam.shinsa". Key 715 of the first storage area 710, which corresponds to the root node, is compared with the key of the data. "korea", one of the prefixes of the data key, is identical to key 715 of the first storage area 710. The third storage area 730 and the first storage area 710 are in a child-parent relationship, and "korea.seoul", another prefix of the data key, is identical to key 735 of the third storage area 730. The fifth storage area 750 and the first storage area 710 are in a child-parent relationship, and "korea.seoul.kangnam", a further prefix of the data key, is identical to key 755 of the fifth storage area 750. The data whose key is "korea.seoul.kangnam.shinsa" may therefore be searched for in the fifth storage area 750, reached by way of the first storage area 710 and the third storage area 730.

The following describes a search for data whose key is "korea.jejudo". "korea", one of the prefixes of the data key, is identical to key 715 of the first storage area 710. Neither "korea" nor "korea.jejudo" is identical to key 725 of the second storage area 720 or key 735 of the third storage area 730. The data whose key is "korea.jejudo" may therefore be searched for in the first storage area 710.
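The two searches described above can be sketched as a descent through the tree of storage areas. This is an illustrative sketch under stated assumptions, not the disclosed implementation: the class `StorageArea`, the functions `matches` and `route`, and the rule that a data key equal to a storage-area key also matches are all hypothetical. Only the storage areas whose keys are given in the text (710, 730, 750) are modelled, and the placement of 750 beneath 730 is inferred from the key hierarchy and the stated search path.

```python
from dataclasses import dataclass, field


@dataclass
class StorageArea:
    name: str                      # reference numeral, used only for display
    key: str                       # hierarchical key of this storage area
    children: list["StorageArea"] = field(default_factory=list)


def matches(area_key, data_key):
    # Assumption: the data key corresponds to the area if it equals the
    # area key or extends it by one or more subkeys.
    return data_key == area_key or data_key.startswith(area_key + ".")


def route(root, data_key):
    """Return the storage areas visited, root first; the last one holds the data."""
    if not matches(root.key, data_key):
        return []
    path = [root]
    node = root
    while True:
        nxt = next((c for c in node.children if matches(c.key, data_key)), None)
        if nxt is None:
            return path
        path.append(nxt)
        node = nxt


area_750 = StorageArea("750", "korea.seoul.kangnam")
area_730 = StorageArea("730", "korea.seoul", [area_750])
area_710 = StorageArea("710", "korea", [area_730])

print([a.name for a in route(area_710, "korea.seoul.kangnam.shinsa")])
print([a.name for a in route(area_710, "korea.jejudo")])
```

The descent stops at the deepest storage area whose key still matches, which is the area that stores data corresponding to its own key but to none of its children's keys.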

FIG. 8 is a diagram illustrating a process of adding a storage area to a data storage device according to an example of the present invention.

The data storage device 700 described above with reference to FIG. 7 can be regarded as a device to which storage areas have been added in order to store data corresponding to the storage-area key "korea".

In the initial state 810, the data storage device 700 has only the first storage area 710.

Key 715 of the first storage area 710 is "korea". Data whose key is "korea", data whose key begins with "korea.gyeonggi", and data whose key begins with "korea.seoul" are therefore all stored in the first storage area 710.

As the data storage device 700 operates, data whose keys have a specific prefix (that is, begin with a specific character string such as "korea.seoul") may become numerous. The data storage device 700 may then add a storage area that has that prefix as its key. This addition expands the tree structure of the data storage device 700; that is, a new storage area (or a node representing the new storage area) is added to the tree structure of the data storage device 700.

In the data-stored state 820, the first storage area 710 stores one or more pieces of data 860 corresponding to key 715.

If the first storage area 710 cannot handle all of the data 860, a new storage area needs to be added. For example, the data storage device 700 may be expanded by passing through the states 830, 840, and 850 described below.

As shown in the node creation state 830, the data storage device 700 may create the third storage area 730 as a child storage area of the first storage area 710.

That is, data corresponding to the third key 735 (data whose key begins with "korea.seoul") may be separated out into the third storage area 730 and, after the separation, handled by the third storage area 730.

In the addition notification state 840, the newly created third storage area 730 may notify its parent storage area, the first storage area 710, of the key that it handles (the third key 735). This notification announces the addition of the storage area.

In the data movement state 850, the first storage area 710, having received the notification, moves the data 870 that corresponds to the notified third key 735, from among the data it stores, to the third storage area 730. For example, the first storage area 710 may copy the data 870 to the third storage area 730. Once the copy is complete, the first storage area 710 no longer needs to hold a duplicate of the data held by its child storage area (the third storage area 730). The first storage area 710 may therefore delete, from the data it stores, the data that was copied to the third storage area 730. After the deletion, the first storage area 710 stores the data that corresponds to the first key 715 but not to the third key 735.

Moving the data 870 corresponding to the third key 735 of the third storage area from the first storage area 710 to the third storage area 730 reduces the amount of data stored in the first storage area 710.

When the insertion of new data causes the amount stored in the first storage area 710 to reach a predefined threshold, the data storage device 700 may create the third storage area 730 and move data to the third storage area 730 as described above.
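The expansion sequence described above (threshold reached, child created, parent notified, data copied and then deleted) can be sketched in memory as follows. This is a hedged illustration, not the disclosed implementation: the class `StorageArea`, the `limit` parameter, and the method names are hypothetical, and the rule for choosing the new child key (the most frequent next-level prefix among the stored keys) is an assumption, since the text only states that a prefix shared by many keys becomes the key of the new storage area.

```python
from collections import Counter


class StorageArea:
    """One node of the storage tree; `limit` is a hypothetical threshold."""

    def __init__(self, key, limit=4):
        self.key = key
        self.limit = limit
        self.rows = {}       # data held by this storage area, keyed by data key
        self.children = []   # child storage areas

    def matches(self, data_key):
        # Assumption: a data key corresponds to this area if it equals the
        # area key or extends it by one or more subkeys.
        return data_key == self.key or data_key.startswith(self.key + ".")

    def insert(self, data_key, value):
        # New data that corresponds to a child is forwarded to that child.
        for child in self.children:
            if child.matches(data_key):
                child.insert(data_key, value)
                return
        self.rows[data_key] = value
        if len(self.rows) >= self.limit:
            self.expand()

    def expand(self):
        # Assumption: split on the most frequent next-level prefix.
        counts = Counter()
        for data_key in self.rows:
            rest = data_key[len(self.key) + 1:]
            if rest:
                counts[self.key + "." + rest.split(".")[0]] += 1
        if not counts:
            return
        child_key = counts.most_common(1)[0][0]
        child = StorageArea(child_key, self.limit)   # node creation
        self.on_child_added(child)                   # addition notification

    def on_child_added(self, child):
        # Data movement: copy matching rows to the child, then delete them here.
        self.children.append(child)
        moved = {k: v for k, v in self.rows.items() if child.matches(k)}
        child.rows.update(moved)
        for k in moved:
            del self.rows[k]


root = StorageArea("korea", limit=4)
for k in ["korea.seoul.kangnam", "korea.gangwon",
          "korea.seoul.jongno", "korea.seoul.mapo"]:
    root.insert(k, None)
print(root.children[0].key, sorted(root.rows))
```

In this sketch the child is created once the fourth row arrives, the three "korea.seoul" rows move to it, and later inserts with that prefix bypass the parent.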

During expansion, the data storage device 700 need not be interrupted, either partially or entirely. Data may also be replicated automatically as part of the expansion.

During expansion, newly arriving data corresponding to the third key 735 is sent to the third storage area 730, which corresponds to the child node created by the expansion.

Reduction of the data storage device 700 (that is, deletion of a tree node) may be performed in the reverse order of the expansion of the data storage device 700 described above.

FIG. 9 is a diagram illustrating a range search on a data storage device according to an example of the present invention.

A query may be used to look up data that satisfies a specific condition. A query may be a statement that requests a list of data whose keys fall within a specific search range.

All or some of the storage areas 710, 720, 730, 740, and 750 of the data storage device 700 may search for data in response to a query. By merging the results found by all or some of the storage areas 710, 720, 730, 740, and 750, data that is not stored in one particular storage area 710, 720, 730, 740, or 750 can also be found.

In the query provision state 910, the query may be provided to any of the storage areas 710, 720, 730, 740, or 750.

This embodiment describes the case in which the query is provided to the first storage area 710, which corresponds to the root node.

The first storage area 710 analyzes the received query to determine whether its first key 715 corresponds to the query. Here, the first key 715 corresponds to the query if some of the data corresponding to the first key falls within the search range of the query.

For example, if the search range of the query is from "usa.ar" to "usa.ca", no data corresponding to the first key 715 ("korea") falls within the search range. The first key 715 therefore does not correspond to the query.

If no data corresponds to the query, the first storage area 710 may return null for the query or may perform no further processing.

In the query transmission state 920, the first storage area 710 may send the query to its child storage areas, the second storage area 720 and the third storage area 730. That is, the first storage area 710 may request, from one or more child storage areas (the second storage area 720 and the third storage area 730), a list of data whose keys fall within the search range.

The second storage area 720 and the third storage area 730 may in turn send the query to their own child storage areas (not shown). That is, the query may be transmitted hierarchically.

In the list return state 930, one or more child storage areas of the first storage area 710 (for example, the second storage area 720 and the third storage area 730) may return a list of data whose keys fall within the search range.

When the query has been transmitted hierarchically, the second storage area 720 and the third storage area 730 may likewise each receive, from one or more of their own child storage areas, a list of data whose keys fall within the search range.

In the list merging and search result return state 940, the first storage area 710 may return the merged data list as the result for the search range of the query.

The lists returned by the second storage area 720 and the third storage area 730 may themselves be merged data lists. That is, merged data lists may be returned hierarchically.

The first storage area 710 may merge a second list, consisting of the data it stores that falls within the search range, with the returned first list, and may return the merged list produced by this merge as the result for the search range of the query.
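The hierarchical range search described above can be sketched as a recursive query that merges each storage area's own matches with the lists returned by its children. This is a minimal sketch under stated assumptions: the class `StorageArea`, its methods, and the sample data are hypothetical; keys are compared as plain strings in lexicographic order with an inclusive range; and the pruning test in `may_contain` (a conservative check that relies on "." sorting immediately before "/") is one possible way to decide that a storage-area key does not correspond to the query.

```python
from dataclasses import dataclass, field


@dataclass
class StorageArea:
    key: str
    rows: dict = field(default_factory=dict)      # data key -> value
    children: list = field(default_factory=list)  # child storage areas

    def may_contain(self, low, high):
        # Every key in this subtree is either self.key or starts with
        # self.key + ".", so it lies in [self.key, self.key + "/").
        return high >= self.key and low < self.key + "/"

    def range_query(self, low, high):
        """Return the sorted data keys in [low, high] found in this subtree."""
        if not self.may_contain(low, high):
            return []  # the key of this area does not correspond to the query
        # "Second list": data stored in this area that is within the range.
        merged = [k for k in self.rows if low <= k <= high]
        # "First list": results returned, already merged, by each child.
        for child in self.children:
            merged.extend(child.range_query(low, high))
        return sorted(merged)


kangnam = StorageArea("korea.seoul.kangnam", {"korea.seoul.kangnam.shinsa": 4})
seoul = StorageArea("korea.seoul", {"korea.seoul.jongno": 3}, [kangnam])
root = StorageArea("korea", {"korea.gangwon": 1, "korea.jejudo": 2}, [seoul])

print(root.range_query("korea.j", "korea.seoul.z"))
print(root.range_query("usa.ar", "usa.ca"))
```

Because each level returns an already merged list, the caller at the root never needs to know how deep the tree is.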

As described above, an example of the present invention can support range searches and spatial indexes without using a key-value DB or hashing.

FIG. 10 is a flowchart of a data storage method according to an embodiment of the present invention.

In step S1010, one or more storage areas are organized into a tree structure. Each of the one or more storage areas may be an RDBMS.

In step S1020, a hierarchical key is assigned to each of the one or more storage areas. A hierarchical key may have zero or more subkeys. A hierarchical key may be a character string generated by the regular expression of Equation (1) above, or may be an empty string.

In step S1030, data corresponding to the first key of an arbitrary first storage area, among the one or more storage areas, is stored in a storage area within the subtree rooted at the first node, which represents the first storage area.

The first key is a key formed by concatenating one or more subkeys to the second key of a second storage area, which represents the parent node of the first node.

Data corresponding to the first key means data for which one of the prefixes of the data key is identical to the first key.

If one of the prefixes of the data key is identical to the key of a storage area, the data can be regarded as corresponding to that storage area.

Of the data corresponding to the first storage area, data that corresponds to a third storage area representing a child node of the first node is stored in the third storage area, and data that does not correspond to the third storage area is stored in the first storage area. Step S1030 may therefore include storing, in the first storage area, the data that corresponds to the first storage area but not to any storage area representing a child node of the first node.

The technical content of the embodiments of the present invention described above with reference to FIGS. 7 to 9 applies to this embodiment as is. A more detailed description is therefore omitted.

FIG. 11 is a flowchart of an expansion method in a data storage device according to an embodiment of the present invention.

Steps S1110, S1120, S1130, and S1140 described below may be included in step S1030 described above.

In step S1110, it is determined whether the amount stored in the first storage area has reached a predefined threshold.

If the amount stored in the first storage area has reached the predefined threshold, step S1120 is performed; otherwise, the data is stored and the process ends.

In step S1120, a third storage area is created as a child storage area of the first storage area.

In step S1130, the third storage area notifies the first storage area of its own third key.

In step S1140, data corresponding to the third key of the third storage area is moved from the first storage area to the third storage area.

Step S1140 may include (1) copying the data corresponding to the third key of the third storage area from the first storage area to the third storage area, and (2) deleting, from the data in the first storage area, the data that was copied to the third storage area.

The technical content of the embodiments of the present invention described above with reference to FIGS. 7 to 10 applies to this embodiment as is. A more detailed description is therefore omitted.

FIG. 12 is a flowchart of a range search on a data storage device according to an example of the present invention.

In step S1210, a query is provided to the first storage area.

The query is a statement that requests, from a storage area, data whose keys fall within the search range specified in the query.

In step S1220, the query is sent to the child storage areas of the first storage area. That is, the first storage area requests, from one or more child storage areas, a first list of data whose keys fall within the search range.

Step S1220 may be performed recursively. A child storage area of the first storage area that has received the query may forward the query to one or more of its own child storage areas.

In step S1230, the one or more storage areas return the first list.

In step S1240, a second list, consisting of the data stored in the first storage area whose keys fall within the search range, is merged with the returned first list to produce a merged list.

Steps S1230 and S1240 may be performed recursively. A child storage area of the first storage area may receive, from one or more of its own child storage areas, a third list of data corresponding to the search range. The child storage area may merge the returned third list with the first list and may return the merged first list to the first storage area.

In step S1250, the merged list is returned as the result for the search range.

The technical content of the embodiments of the present invention described above with reference to FIGS. 7 to 11 applies to this embodiment as is. A more detailed description is therefore omitted.

The method according to an embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The recording medium may contain program instructions, data files, data structures, and the like, alone or in combination. The recording medium and the program instructions may be specially designed and configured for the purposes of the present invention, or may be known and available to those skilled in the field of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy (registered trademark) disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that a computer can execute using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

Although the present invention has been described above with reference to limited embodiments and drawings, the present invention is not limited to those embodiments, and a person having ordinary knowledge in the field to which the present invention belongs can make various modifications and variations from them.

Accordingly, the scope of the present invention is not limited to the disclosed embodiments but is defined by the claims and their equivalents.

310 Client
320 Messaging channel
332 First node
334 Second node
336 Third node
700 Data storage device
710 First storage area
715 First key

Claims

Selecting one of a plurality of nodes operating in a multi-master scheme in response to a data access request from a client, the selected node receiving the data access request;
Before the selected node processes the data access request, the selected node sends a multicast request for the data access request to a messaging channel;
The messaging channel receives the multicast request and multicasts the data access request to the plurality of nodes;
Processing the data access request when each of the plurality of nodes receives the multicast;
Including
Each of the plurality of nodes holds a copy of the data requested by the data access request;
Each of the plurality of nodes does not directly communicate with other nodes and does not transmit the data access request to each other.

The messaging channel receives the data access request from a client;
The messaging channel determines the selected node of the plurality of nodes;
The messaging channel sends the data access request to the selected node;
The data management method according to claim 1, further comprising:

Receiving a data access request via a messaging channel;
Transmitting the data access request from the messaging channel to a node selected from a plurality of nodes operating in a multi-master scheme;
Before the selected node processes the data access request, the selected node sends a multicast request for the data access request to the messaging channel;
The messaging channel transmits the data access request to the plurality of nodes;
Each of the plurality of nodes processes the data access request upon receiving the data access request;
The plurality of nodes are nodes that hold a copy of the data requested by the data access request;
Each of the plurality of nodes does not directly communicate with other nodes and does not transmit the data access request to each other.

3. The data management method according to claim 2, wherein the step of determining the node determines the selected node among the plurality of nodes based on a round robin method or load balancing.

In response to a data access request from a client, the data access request is transmitted through a messaging channel to a selected node among a plurality of nodes operating in a multi-master scheme;
Before the selected node processes the data access request, the selected node sends a multicast request for the data access request to the messaging channel;
The plurality of nodes receive the data access request via multicast from the messaging channel;
Processing the data access request when each of the plurality of nodes receives the multicast;
Including
Each of the plurality of nodes does not directly communicate with other nodes and does not transmit the data access request to each other.

Multiple nodes containing duplicates of the same data;
A messaging channel for transmitting a data access request to the plurality of nodes;
Including
After receiving the data access request from the messaging channel, any one of the plurality of nodes transmits a multicast request for the data access request to the messaging channel before processing the data access request. The messaging channel receives the multicast request and multicasts the data access request to the plurality of nodes, and each of the plurality of nodes processes the data access request when receiving the multicast,
Each of the plurality of nodes does not directly communicate with other nodes and does not transmit the data access request to each other.

The data management system according to claim 6, wherein the messaging channel receives a data access request from the client, selects one node of the plurality of nodes, and transmits the data access request to the selected node.

The data management system according to claim 7, wherein the messaging channel selects the one node among the plurality of nodes based on a round robin method or load balancing.