JP6421146B2 - Information processing system, information processing apparatus, program

Info

Publication number: JP6421146B2
Application number: JP2016127069A
Authority: JP
Inventors: 竹本　剛
Original assignee: NEC Personal Computers Ltd
Current assignee: NEC Personal Computers Ltd
Priority date: 2016-06-27
Filing date: 2016-06-27
Publication date: 2018-11-07
Anticipated expiration: 2036-06-27
Also published as: JP2018005305A

Description

The present invention relates to an information processing system, an information processing apparatus, and a program.

In recent years, the Internet and broadcast networks have come to provide enormous volumes of information and data, and the information provided has become increasingly diverse. The number of users seeking information from the Internet and broadcast networks is also growing. Against this background, systems are already known in which a provider that delivers content over the Internet or a broadcast network collects users' access histories, analyzes each user's preferences based on the collected histories, and recommends content that matches the analyzed preferences.

A technique related to such a content recommendation system is disclosed in, for example, Patent Document 1. Patent Document 1 discloses a technique that provides useful information to the user by preparing a table that associates history information with user-specific information and continually reflecting the user's history information in that table, so that changes in the user's preferences can be followed.

Japanese Unexamined Patent Application Publication No. 2009-087155 (JP 2009-087155 A)

However, in conventional techniques such as the one disclosed in Patent Document 1, recommended content is searched for based on predetermined keywords and the retrieved content is acquired. If the predetermined keywords are only those in which society at large has a high interest level, the user's own interest level is not reflected. Moreover, if multiple devices perform search processing and all query the search engine in the same time period, the concentration of access degrades performance. The challenge, therefore, is to build a recommendation system that reflects each user's own interest level without degrading device performance.

The present invention has been made in view of these circumstances, and its object is to provide an information processing system in which devices can share, together with content, the keywords used to search for that content.

An information processing system according to the present invention comprises an information processing apparatus and a server connected via a network. The server includes: a first database that stores terms, which are words contained in documents, in association with the interest level of society at large; first keyword selection means for selecting, based on the interest level of society at large, a first keyword related to a browsed document that a user is viewing; first content acquisition means for acquiring first content related to the selected first keyword; and keyword content transmission means for transmitting the selected first keyword and the acquired first content to the information processing apparatus. The information processing apparatus includes: a second database that stores terms, which are words contained in documents, in association with the user's interest level; second keyword selection means for selecting, based on the user's interest level, a second keyword related to the browsed document; content reception means for transmitting the second keyword and receiving second content related to the second keyword; and display means for displaying the received first content and the second content together with the browsed document.

An information processing apparatus according to the present invention includes: keyword content reception means for receiving first content and a first keyword related to a browsed document that a user is viewing; a database that stores terms, which are words contained in documents, in association with the user's interest level; keyword selection means for selecting, based on the user's interest level, a second keyword related to the browsed document; content reception means for transmitting the second keyword and receiving second content related to the second keyword; and display means for displaying the received first content and the second content together with the browsed document.

A program according to the present invention causes a computer to execute: a step of receiving first content and a first keyword related to a browsed document that a user is viewing; a step of generating a database that stores terms, which are words contained in documents, in association with the user's interest level; a step of selecting, based on the user's interest level, a second keyword related to the browsed document; a step of transmitting the second keyword and receiving second content related to the second keyword; and a step of displaying the received first content and the second content together with the browsed document.

According to the present invention, devices can share, together with content, the keywords used to identify that content.

FIG. 1 is a hardware configuration diagram of the server 1 and the information processing apparatus 2 that make up the information processing system according to the embodiment of the present invention.
FIG. 2 is a functional block diagram of the information processing system according to the embodiment of the present invention.
FIG. 3 is an example of document clustering according to the embodiment of the present invention.
FIG. 4 is an example of a browsed document according to the embodiment of the present invention.
FIG. 5 is an example of a general interest level display according to the embodiment of the present invention.
FIG. 6 is an example of a personal interest level display according to the embodiment of the present invention.
FIG. 7 is an example of extracting general keywords and personal keywords related to a browsed document according to the embodiment of the present invention.
FIG. 8 is an example of content display according to the embodiment of the present invention.
FIG. 9 is an example of keyword ranking according to the embodiment of the present invention.
FIG. 10 is an example of keyword ranking according to the embodiment of the present invention.
FIG. 11 is an example of content display according to the embodiment of the present invention.
FIG. 12 is a flowchart of content display according to the embodiment of the present invention.
FIG. 13 is a flowchart of keyword ranking according to the embodiment of the present invention.

Embodiments of the present invention are described in detail below.

First, the hardware configuration of the server 1 and the information processing apparatus 2 in the information processing system of this embodiment is described with reference to FIG. 1. Here, the server 1 refers to a host computer that issues processing requests to multiple computers through the network 30. The information processing apparatus 2 refers to an information terminal that can connect to the network 30, such as a personal computer, tablet terminal, or smartphone. The configurations of the server 1 and the information processing apparatus 2 need not be identical to those shown in FIG. 1; it is sufficient that they have hardware capable of realizing this embodiment.

The server 1 includes: a CPU 10 that controls the server 1 as a whole by executing predetermined programs; a memory 11 composed of read-only non-volatile memory such as mask ROM, EPROM, or SSD, which stores the program that the CPU 10 reads when the server 1 is powered on, and volatile working memory such as SRAM or DRAM, into which the CPU 10 loads programs and temporarily writes data generated by arithmetic processing and the like; and an HDD 12 that can retain various data records when the server 1 is powered off.

The server 1 further includes a communication I/F 13, through which it is connected to the network 30. The communication I/F 13 accesses various kinds of information reachable via the network 30 in response to user operations on the connected information processing apparatuses 2. Specific examples of the communication I/F 13 include a USB port, a LAN port, and a wireless LAN port; any interface may be used as long as it can exchange data with external devices.

The information processing apparatus 2 includes: a CPU 20 that controls the information processing apparatus 2 as a whole by executing predetermined programs; a memory 21 composed of read-only non-volatile memory such as mask ROM, EPROM, or SSD, which stores the program that the CPU 20 reads when the information processing apparatus 2 is powered on, and volatile working memory such as SRAM or DRAM, into which the CPU 20 loads programs and temporarily writes data generated by arithmetic processing and the like; an HDD 22 that can retain various data records when the information processing apparatus 2 is powered off; an input device 23 consisting of a mouse and input keys; and a display device 24 having a display that uses a panel such as a liquid crystal or organic EL panel.

The information processing apparatus 2 further includes a communication I/F 25, through which it is connected to the network 30. The communication I/F 25 accesses various kinds of information reachable via the network 30 in response to operations by the user of the information processing apparatus 2. Specific examples of the communication I/F include a USB port, a LAN port, and a wireless LAN port; any interface may be used as long as it can exchange data with external devices.

FIG. 2 is a functional block diagram of the information processing system according to the embodiment of the present invention. As shown in FIG. 2, the server 1 of the information processing system includes a first database 100, first keyword selection means 101, first content acquisition means 102, keyword content transmission means 103, and keyword ranking means 104. The information processing apparatus 2 of the information processing system includes a second database 200, second keyword selection means 201, content reception means 202, and display means 203.

The first database 100 of the server 1 is information in which various data are stored in a non-volatile manner. In this embodiment, it refers to information obtained by accumulating information and data periodically acquired from outside via the network 30 and organizing them into a database by a predetermined method. Anything treated as a database in this embodiment may be held in a dedicated non-volatile storage device such as an HDD, or may be stored in the HDD 12. Details of the first database 100 are described later.

The first database 100 of the server 1 is built by taking a database generated by so-called clustering, in which documents accessible via the network 30 are grouped by the category to which each document belongs and by the terms each document contains, and storing in association with it a “general interest level” representing the interest of society at large. In this embodiment, a “document” means any of a wide variety of information that can be viewed by an unspecified large number of people, for example information on a site that distributes news articles on politics and the economy, or on a site that distributes sports articles. A “term” means a word that appears in a document. Hereafter, words appearing in the browsed document viewed by the user and words making up the various databases are uniformly referred to as terms. Details of the general interest level are described later.

There are various methods of clustering documents. One is a two-dimensional database method, in which documents are morphologically analyzed and grouped both by terms with similar appearance tendencies and by the category of the document itself. Another is a one-dimensional database method, in which documents are morphologically analyzed and grouped only into categories of terms with similar appearance tendencies. Clustering in this way allows even an enormous number of documents to be grouped according to the appearance tendencies of the terms in them. In the following description, the first database 100 and the second database 200 are assumed to be clustered by the one-dimensional database method.

FIG. 3 shows an example of a one-dimensional database clustered as described above. Grouping by the terms that appear in an enormous number of documents produces countless categories, but to simplify the explanation FIG. 3 limits them to three: “baseball”, “soccer”, and “politics”. The category names are written out in this way for ease of understanding; when a database is actually generated in a computer, categories are generally managed by category IDs, category numbers, or the like. In the category “baseball”, terms related to baseball have similar appearance tendencies, so terms such as “Giants” and “Hideki Matsui” are grouped as belonging to that category. “Soccer” and “politics” are grouped in the same way.

The appearance frequency in FIG. 3 is the rate at which a given term appears relative to the total number of appearances of all terms making up the one-dimensional database.
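The definition of appearance frequency can be illustrated with a minimal sketch. The function name `appearance_frequencies` and the raw counts are assumptions made for illustration; they are not values taken from FIG. 3.

```python
from collections import Counter


def appearance_frequencies(term_counts):
    """Return each term's share of all term appearances in the database."""
    total = sum(term_counts.values())
    if total == 0:
        return {}
    return {term: count / total for term, count in term_counts.items()}


# Hypothetical raw counts for a tiny database (not the values of FIG. 3).
counts = Counter({"Giants": 30, "Hideki Matsui": 10, "election": 60})
freqs = appearance_frequencies(counts)

for term, freq in freqs.items():
    print(f"{term}: {freq:.2f}")
```

Because every frequency is a share of the same total, the values across the whole database sum to 1.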

The first keyword selection means 101 of the server 1 selects a first keyword related to the browsed document that the user is viewing, based on the interest level of society at large stored in the first database 100. A browsed document is a document accessible via the network 30, such as the one shown in FIG. 4. Consider the case of selecting keywords related to the document of FIG. 4.

First, a method is described for identifying, among the multiple categories in the one-dimensional database, the category to which the document of FIG. 4 belongs. The category can be determined, for example, by focusing on the terms extracted by morphological analysis of the text of the document of FIG. 4 and on their appearance frequencies. More specifically, the similarity between the document of FIG. 4 and each category of FIG. 3 is evaluated based on the appearance frequencies, in the one-dimensional database, of the terms extracted by morphological analysis of the document's text. Similarity can be evaluated based on the correlation between the total of the one-dimensional-database appearance frequencies of the terms appearing in the browsed document and the total of their appearance frequencies in the browsed document itself. The evaluation method is not limited to this. For example, a category can simply be judged more similar the larger the one-dimensional-database appearance frequencies of the terms appearing in the browsed document. Similarity can also be evaluated using Euclidean distance, cosine similarity, or the like.
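Of the evaluation methods mentioned, cosine similarity is the easiest to sketch. The following is an illustration only, assuming that morphological analysis has already produced term frequencies for the browsed document; the function names and all frequency values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors (dicts)."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_category(doc_freqs, categories):
    """Return the name of the category whose term frequencies best match the document."""
    return max(categories, key=lambda name: cosine_similarity(doc_freqs, categories[name]))


# Hypothetical appearance frequencies per category (stand-in for FIG. 3).
categories = {
    "baseball": {"Giants": 0.02, "Hideki Matsui": 0.01, "Hanshin": 0.015},
    "soccer": {"World Cup": 0.02, "goal": 0.03},
    "politics": {"election": 0.04, "cabinet": 0.02},
}

# Hypothetical term frequencies of the browsed document (stand-in for FIG. 4).
doc = {"Giants": 0.10, "Hanshin": 0.05, "goal": 0.01}

best = most_similar_category(doc, categories)
print(best)
```

A category that shares no terms with the document scores 0, so it can never be chosen over a category with at least one overlapping term.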

Suppose the similarity evaluation finds that the category of FIG. 3 most similar to the document of FIG. 4 is “baseball”. The first keyword is then selected from the terms in the category “baseball”. FIG. 5 shows an example in which general interest levels are calculated from the appearance frequencies of terms in the category “baseball”. The general interest level is defined as a feature value of a term appearing in the one-dimensional database of FIG. 3. Besides the documents in diverse categories accessible via the network 30 described above, the information acquired by the server 1 includes social network services such as Twitter (registered trademark) and other SNSs, where an unspecified large number of users can post freely and attach web links. Such services tend to reflect current trends and topics of attention, for example through unspecified users' posts on a topic and the number of accesses to it. In other words, what is popular or attracting attention in society at large shows up readily. Although not illustrated, the server 1 has a “social database” that associates terms, extracted by morphological analysis of the text of documents acquired from such social network services, with their appearance frequencies.

The general interest level can be calculated from the correlation between the appearance frequency of a term in the database generated from documents acquired from social network services and the appearance frequency of the same term in the database of FIG. 3. As one example, with A denoting the appearance frequency in the one-dimensional database of FIG. 5 and B the appearance frequency in the social database, the general interest level can be calculated as log(B/A). In this embodiment the logarithm is taken to base 10, but the base is not limited to 10. The calculated value indicates the interest level of society at large: the lower the appearance frequency in the one-dimensional database and the higher the appearance frequency in the social database, the larger the value, and the more strongly the term can be said to be favored. In FIG. 5, the terms with high general interest levels in the category “baseball” are “Giants”, “Hideki Matsui”, and “Hanshin”. The first database 100 is thus defined as the one-dimensional database stored in association with general interest levels, which serve as feature values of its terms and are derived from the social database, that is, from information on sites attracting attention in society at large.
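The log(B/A) calculation can be sketched as follows, using base 10 as in this embodiment. The function name `general_interest_level` and the sample frequencies are assumptions for illustration and are not the values of FIG. 5.

```python
import math


def general_interest_level(freq_a, freq_b):
    """Compute log10(B / A).

    freq_a: appearance frequency A of the term in the one-dimensional database.
    freq_b: appearance frequency B of the same term in the social database.
    """
    if freq_a <= 0 or freq_b <= 0:
        raise ValueError("appearance frequencies must be positive")
    return math.log10(freq_b / freq_a)


# Hypothetical frequencies: the term is ten times more frequent on social media.
trending = general_interest_level(freq_a=0.001, freq_b=0.01)

# Hypothetical frequencies: the term is ten times less frequent on social media.
fading = general_interest_level(freq_a=0.01, freq_b=0.001)

print(round(trending, 3), round(fading, 3))
```

A positive value means the term is over-represented in the social database relative to the one-dimensional database, a negative value means it is under-represented, and a value of zero means the two frequencies are equal.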

Based on the calculated general interest levels, “Giants”, “Hideki Matsui”, and “Hanshin” from the category “baseball” become candidate keywords. Only the keyword with the highest general interest level may be selected, or multiple keywords may be selected according to a predetermined rule. In this example, the three terms with the highest interest levels, “Giants”, “Hideki Matsui”, and “Hanshin”, are selected as keywords.
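Selecting the top-ranked terms can be sketched in a few lines. The interest values and the extra terms “Koshien” and “pitcher” are hypothetical and serve only to show that lower-ranked terms are dropped; `select_keywords` and its `top_n` parameter are likewise illustrative names.

```python
def select_keywords(interest_by_term, top_n=3):
    """Return the top_n terms ordered by descending interest level."""
    ranked = sorted(interest_by_term.items(), key=lambda item: item[1], reverse=True)
    return [term for term, _ in ranked[:top_n]]


# Hypothetical general interest levels for the category "baseball".
scores = {
    "Giants": 1.2,
    "Hideki Matsui": 0.9,
    "Hanshin": 0.7,
    "Koshien": 0.1,
    "pitcher": -0.3,
}

keywords = select_keywords(scores)
print(keywords)
```

Setting `top_n=1` corresponds to selecting only the keyword with the highest general interest level; any other predetermined rule, such as a minimum threshold, could replace the slice.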

The first keyword selection means 101 of the server 1 can be realized by the CPU 10 running a predetermined keyword selection program stored in the memory 11, reading the databases and other data stored in the HDD 12, and storing the processed data temporarily in the memory 11 or in the HDD 12 or the like.

The first content acquisition means 102 of the server 1 acquires first content related to the selected first keyword. In this embodiment, “content” has, in addition to the ordinary meaning of the word, the meaning of a unit of information that is recorded or transmitted by media and appreciated by people, such as video, music, text, or a combination of these; concrete examples include applications distributed over the Internet and downloadable video or music content.

All acquired content may be treated as valid information, or the content may be narrowed down by morphologically analyzing the text of the documents it contains and using the general interest levels of the terms that appear. Re-evaluating acquired content in this way makes it possible to provide only content that matches the preferences of society at large.
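One possible way to narrow down acquired content is sketched here. The text does not specify how the general interest levels of the terms are combined, so the averaging rule, the threshold, and the record layout are all assumptions; the terms of each content item are assumed to have been extracted by morphological analysis beforehand.

```python
def content_score(terms, interest_by_term):
    """Average the interest levels of the terms that the database knows about."""
    known = [interest_by_term[t] for t in terms if t in interest_by_term]
    if not known:
        return float("-inf")  # no known term, so nothing supports keeping it
    return sum(known) / len(known)


def filter_contents(contents, interest_by_term, threshold=0.5):
    """Keep only contents whose score reaches the (hypothetical) threshold."""
    return [c for c in contents if content_score(c["terms"], interest_by_term) >= threshold]


# Hypothetical general interest levels and already-tokenized contents.
interest = {"Giants": 1.2, "Hanshin": 0.7, "pitcher": -0.3}
contents = [
    {"id": "a", "terms": ["Giants", "Hanshin"]},
    {"id": "b", "terms": ["pitcher", "weather"]},
    {"id": "c", "terms": ["weather"]},
]

kept = filter_contents(contents, interest)
print([c["id"] for c in kept])
```

Other aggregation rules, such as taking the maximum interest level or summing over all terms, would fit the description equally well.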

The first content acquisition means 102 of the server 1 can be realized by the CPU 10 running a predetermined content acquisition program stored in the memory 11, reading the databases and other data stored in the HDD 12, and accessing predetermined websites through the communication I/F 13 and the network 30 to acquire content.

The keyword content transmission means 103 of the server 1 transmits the selected first keyword and the acquired first content to the information processing apparatus 2. When multiple contents have been acquired, either all of them may be transmitted, or only those that the first content acquisition means 102 evaluated as having a high general interest level. Likewise, each first keyword may be linked to the first content acquired for it, so that only the first keywords related to the first content being transmitted are sent.
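The linking of keywords to content can be sketched as a simple mapping. The payload layout, the function name `build_payload`, and the content identifiers are hypothetical; the text does not describe a transmission format.

```python
def build_payload(keyword_to_contents, selected_content_ids):
    """Keep only keywords that still have at least one content selected for sending."""
    payload = {}
    for keyword, content_ids in keyword_to_contents.items():
        kept = [cid for cid in content_ids if cid in selected_content_ids]
        if kept:
            payload[keyword] = kept
    return payload


# Hypothetical links between first keywords and the contents acquired for them.
links = {
    "Giants": ["c1", "c2"],
    "Hideki Matsui": ["c3"],
    "Hanshin": ["c4"],
}

# Hypothetical result of the narrowing step: only these contents are sent.
selected = {"c1", "c4"}

payload = build_payload(links, selected)
print(payload)
```

In this sketch “Hideki Matsui” is dropped because none of its content survived the narrowing, which matches the idea of sending only keywords tied to transmitted content.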

The keyword content transmission means 103 of the server 1 can be realized by the CPU 10 running a predetermined keyword content transmission program stored in the memory 11, reading the databases and other data stored in the HDD 12, and transmitting the first keyword and the first content to the information processing apparatus 2 through the communication I/F 13 and the network 30.

The second database 200 of the information processing apparatus 2 is built by taking a database generated by so-called clustering, in which documents accessible via the network 30 are grouped by the category to which each document belongs and by the terms each document contains, and storing a “user interest level” in association with it. As with the first database 100, the database may be generated by, for example, the two-dimensional database method, in which the text of documents is morphologically analyzed and grouped both by terms with similar appearance tendencies and by the category of the document itself, or by the one-dimensional database method, in which the text is morphologically analyzed and grouped only by terms with similar appearance tendencies. As noted above, the database underlying the second database 200 is a one-dimensional database like that of the first database 100. Details of the user interest level are described later.

FIG. 3 is used as the one-dimensional database, as for the first database 100. Just as the general interest level is defined as the feature value in the first database 100, in the second database 200 of the information processing apparatus 2 the feature value is the user interest level of the user who owns the information processing apparatus 2. Besides the documents in diverse categories accessible via the network 30 described above, the information acquired by the information processing apparatus 2 includes documents acquired via the network 30 according to the user's own preferences. For example, a user who likes baseball will inevitably view baseball-related documents more often, and the documents acquired via the network 30 through the user's own operations are stored. The information processing apparatus 2 thus also has a “user database”, generated from the browsing history of documents acquired via the network 30 according to the user's own preferences, using the same clustering method as the one-dimensional database of FIG. 3.

ユーザ興味度は、ユーザ自身の嗜好に基づいて取得したドキュメントから生成したユーザデータベースに出現するタームの出現頻度と、図３の一次元データベースに出現する同タームの出現頻度と、の相関により算出することが可能である。計算方法の一例として、図５の一般興味度と同様の計算方法で算出が可能である。算出結果は図６のようになっており、ここでは２つの情報処理装置２がサーバ１とネットワーク３０経由で接続されているケースを想定し、「Ａさん」と「Ｂさん」それぞれのユーザ興味度が記載されている。「Ａさん」、および「Ｂさん」それぞれのユーザ興味度は同じカテゴリ「野球」であっても、例えば「Ａさん」は「ソフトバンク」というタームの興味度が高く、「Ｂさん」は「イチロー」というタームの興味度が高いなど、カテゴリ内での興味度はユーザごとで様々である。このように、一次元データベースに対して、ユーザデータベース、つまり、情報処理装置２を保有するユーザが過去に閲覧したドキュメントの履歴情報から一次元データベースを構成するタームの特徴量としてのユーザ興味度を対応付けて記憶したものが第２のデータベース２００として定義される。 The user interest level can be calculated from the correlation between the appearance frequency of a term in the user database, which is generated from the documents acquired based on the user's own preferences, and the appearance frequency of the same term in the one-dimensional database of FIG. 3. As one example, it can be calculated by the same method as the general interest level in FIG. 5. The calculation result is shown in FIG. 6, which assumes a case where two information processing apparatuses 2 are connected to the server 1 via the network 30 and lists the user interest levels of “Mr. A” and “Mr. B”. Even within the same category “baseball”, the interest level varies from user to user: for example, “Mr. A” has a high interest level for the term “Softbank”, while “Mr. B” has a high interest level for the term “Ichiro”. In this way, the second database 200 is defined as the one-dimensional database stored in association with the user interest level, which serves as the feature amount of each term constituting the one-dimensional database and is derived from the user database, that is, from the history information of documents that the user who owns the information processing apparatus 2 has browsed in the past.

以上のように、情報処理装置２を保有するユーザがそれぞれ固有の第２のデータベース２００を備えている。 As described above, each user who has the information processing apparatus 2 has the unique second database 200.

情報処理装置２の第２のキーワード選定手段２０１は、閲覧ドキュメントに関連する第２のキーワードを、第２のデータベース２００でのユーザの興味度に基づいて選定する。前述で定義したユーザ興味度に基づいて「Ａさん」が保有する情報処理装置２から第２のキーワードを選定する場合を考えてみる。図７に示すとおり、ユーザ興味度に着目すると、カテゴリ「野球」から、「ソフトバンク」、「カープ」、「ジャイアンツ」が選定されるキーワードの候補となる。閲覧ドキュメントと類似性が高いカテゴリは、第１のキーワードを選定した時と同様に「野球」であるものとする。選定する第２のキーワードは、ユーザ興味度が最も高いものだけを選定してもよいし、所定の規定に基づいて複数選定してもよい。また、第１のキーワード選定手段１０１により選定されたキーワードと重複するキーワードは選定対象としなくてもよい。本実施例ではユーザ興味度の高い上位３つのターム「ソフトバンク」、「カープ」、「ジャイアンツ」をキーワードとして選定する。 The second keyword selection unit 201 of the information processing apparatus 2 selects the second keyword related to the browsed document based on the user's interest level in the second database 200. Consider a case where the second keyword is selected in the information processing apparatus 2 owned by “Mr. A” based on the user interest level defined above. As shown in FIG. 7, focusing on the user interest level, “Softbank”, “Carp”, and “Giants” from the category “baseball” become the candidate keywords. The category with high similarity to the browsed document is assumed to be “baseball”, as when the first keyword was selected. As the second keyword, only the term with the highest user interest level may be selected, or a plurality of terms may be selected based on a predetermined rule. A keyword that overlaps with a keyword selected by the first keyword selection unit 101 may also be excluded from selection. In the present embodiment, the three terms with the highest user interest level, “Softbank”, “Carp”, and “Giants”, are selected as keywords.
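The top-N selection described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_second_keywords`, the dictionary layout, and every interest value except the 0.3 later given for “Giants” are assumptions made for this example.

```python
def select_second_keywords(user_interest, top_n=3, exclude=()):
    """Return up to top_n terms with the highest user interest level.

    user_interest: mapping of term -> user interest level for one category.
    exclude: terms to skip, e.g. first keywords already selected by the server.
    """
    excluded = set(exclude)
    candidates = [(term, score) for term, score in user_interest.items()
                  if term not in excluded]
    # Highest interest first; the sort is stable, so ties keep input order.
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [term for term, _ in candidates[:top_n]]


# Hypothetical user interest levels for "Mr. A" in the category "baseball".
mr_a_baseball = {
    "Softbank": 0.7,
    "Carp": 0.4,
    "Giants": 0.3,
    "Ichiro": 0.1,
    "Hideki Matsui": 0.05,
}

print(select_second_keywords(mr_a_baseball))
print(select_second_keywords(mr_a_baseball, exclude=["Giants", "Hideki Matsui"]))
```

Passing the first keywords as `exclude` corresponds to the optional behaviour in which keywords overlapping with those selected by the first keyword selection unit 101 are not selected.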

選定された第２のキーワードのユーザ興味度を一般興味度と比較すると、「ジャイアンツ」というキーワードが第１のキーワードと重複しているが、「ソフトバンク」、「カープ」は、一般興味度が低く、ユーザ興味度が高いユーザ固有のキーワードである。このように、一般興味度とユーザ興味度双方の嗜好性の高いキーワードを選定することが可能となる。 Comparing the user interest level of the selected second keywords with the general interest level, the keyword “Giants” overlaps with the first keyword, whereas “Softbank” and “Carp” are user-specific keywords with a low general interest level and a high user interest level. In this way, keywords with high preference can be selected in terms of both the general interest level and the user interest level.

情報処理装置２の第２のキーワード選定手段２０１は、ＣＰＵ２０がメモリ２１に記憶されている所定のキーワード選定プログラムに基づいてＨＤＤ２２に記憶されているデータベース等を読み出して実行し、演算処理等されたデータをメモリ２１に一時的に記憶、もしくはＨＤＤ２２などに記憶することで実現が可能である。 In the second keyword selection unit 201 of the information processing apparatus 2, the CPU 20 reads and executes a database or the like stored in the HDD 22 based on a predetermined keyword selection program stored in the memory 21, and performs arithmetic processing or the like. This can be realized by temporarily storing the data in the memory 21 or by storing it in the HDD 22 or the like.

情報処理装置２のコンテンツ受信手段２０２は、第２のキーワードを送信し、第２のキーワードに関連する第２のコンテンツを受信する。前述した第２のキーワード選定手段２０１で選定された第２のキーワードを第２のコンテンツが取得可能なサーバ１に送信する。尚、第２のキーワードの送信先のサーバとしては、第１のキーワード、および第１のコンテンツの送信元であるサーバ１であってもよいし、または別のサーバやホストコンピュータ等であってもよい。情報処理装置２から受信した第２のキーワードに関連するコンテンツを取得できる機能を有するコンピュータであれば特に制限は設けない。第２のキーワードは前述した「Ａさん」が保有する情報処理装置２から選定した「ソフトバンク」、「カープ」、「ジャイアンツ」であるものとする。第２のコンテンツを受信する際に、例えば一般興味度から選定された「ジャイアンツ」という重複したキーワードに関連するコンテンツは取得しなくてもよい。 The content receiving unit 202 of the information processing apparatus 2 transmits the second keyword and receives the second content related to the second keyword. The second keyword selected by the second keyword selection unit 201 described above is transmitted to the server 1, which can acquire the second content. The destination server for the second keyword may be the server 1 that is the transmission source of the first keyword and the first content, or may be another server, a host computer, or the like. There is no particular limitation as long as the computer has a function of acquiring content related to the second keyword received from the information processing apparatus 2. The second keywords are assumed to be “Softbank”, “Carp”, and “Giants”, selected in the information processing apparatus 2 owned by “Mr. A” as described above. When the second content is received, content related to a duplicate keyword, for example “Giants”, which was also selected based on the general interest level, need not be acquired.

情報処理装置２のコンテンツ受信手段２０２は、ＣＰＵ２０がメモリ２１に記憶されている所定のコンテンツ受信プログラムに基づいて通信Ｉ／Ｆ２５よりネットワーク３０を介して第２のキーワードを送信し、第２のキーワードに関連する第２のコンテンツを外部より受信することで実現が可能である。 The content receiving unit 202 of the information processing apparatus 2 can be realized by the CPU 20 transmitting the second keyword from the communication I/F 25 via the network 30 based on a predetermined content receiving program stored in the memory 21, and receiving the second content related to the second keyword from outside.

情報処理装置２の表示手段２０３は、受信した第１のコンテンツ、および第２のコンテンツを前記閲覧ドキュメントと共に表示する。図８にコンテンツ表示の一例を示す。受信した第１のコンテンツと、ユーザ興味度に基づいて取得された第２のコンテンツを、例えば表示装置２４に出力する。ユーザ固有のキーワードである「ソフトバンク」、「カープ」に関連するコンテンツだけでなく、一般興味度の高い「ジャイアンツ」、「松井秀喜」に関連するコンテンツも閲覧することが可能であり、個人的に嗜好性の高いコンテンツだけでなく、社会一般で嗜好性の高いコンテンツも閲覧することが可能となる。 The display unit 203 of the information processing apparatus 2 displays the received first content and second content together with the browse document. FIG. 8 shows an example of content display. The received first content and the second content acquired based on the user interest level are output to the display device 24, for example. Not only content related to user-specific keywords “Softbank” and “Carp”, but also content related to “Giants” and “Hideki Matsui” with high general interest can be viewed personally. It is possible to browse not only highly-preference content, but also general and high-taste content.

情報処理装置２の表示手段２０３は、ＣＰＵ２０がメモリ２１に記憶されている所定の情報表示プログラムに基づいて、表示装置２４に記憶されている所定の表示形式に従って表示装置２４に情報を表示することで実現が可能である。 The display unit 203 of the information processing apparatus 2 can be realized by the CPU 20 displaying information on the display device 24, based on a predetermined information display program stored in the memory 21 and in accordance with a predetermined display format stored in the display device 24.

サーバ１が更に備えるキーワードランキング手段１０４は、情報処理装置２より受信した第２のキーワードの特徴に基づいてランク付けを行う。尚、ここでは、サーバ１がネットワーク３０を介して複数の情報処理装置２と接続されている場合を考えてみる。全ての情報処理装置２が前述の機能を備えており、更に情報処理装置２で選定された第２のキーワードはサーバ１に送信されるものとする。つまりサーバ１に接続されている「Ａさん」、「Ｂさん」それぞれが保有する情報処理装置２から第２のキーワードを受信する。 The keyword ranking unit 104 further provided in the server 1 performs ranking based on the characteristics of the second keyword received from the information processing apparatus 2. Here, consider a case where the server 1 is connected to a plurality of information processing apparatuses 2 via the network 30. It is assumed that all the information processing apparatuses 2 have the above-described function, and the second keyword selected by the information processing apparatus 2 is transmitted to the server 1. That is, the second keyword is received from the information processing apparatus 2 owned by “Mr. A” and “Mr. B” connected to the server 1.

複数のユーザから受信したキーワードの一覧を図９として示す。尚、各ユーザから受信する第２のキーワードはユーザ興味度が高い上位３つとする。図９を参照すると、「Ａさん」が保有する情報処理装置２から受信したキーワードは「ソフトバンク」、「カープ」、「ジャイアンツ」の３つであり、「Ｂさん」が保有する情報処理装置２から受信したキーワードは「イチロー」、「ジャイアンツ」、「松井秀喜」の３つである。尚、カテゴリ「野球」に属するタームで、キーワードとして選定されなかったタームは「選定外」とし、ユーザ興味度は0とする。 A list of keywords received from a plurality of users is shown in FIG. The second keywords received from each user are the top three having the highest user interest. Referring to FIG. 9, there are three keywords “Softbank”, “Carp”, and “Giants” received from the information processing apparatus 2 possessed by “Mr. A”, and the information processing apparatus 2 possessed by “Mr. B”. The keywords received from are “Ichiro”, “Giants”, and “Hideki Matsui”. A term belonging to the category “baseball” and not selected as a keyword is set to “not selected”, and the user interest level is set to 0.

＜本実施形態におけるキーワードランキング化の第１の実施形態＞ <First embodiment of keyword ranking in this embodiment>
図９に示した情報より、ユーザが保有する情報処理装置２から受信した第２のキーワードをランキング付けする第１の実施形態について説明する。第１の実施形態では、情報処理装置２から受信したキーワードと、キーワードにおけるユーザ興味度の情報からランキング付けを行う。まず、キーワードごとに複数のユーザのユーザ興味度を積算して算出する。例えば、キーワード「ジャイアンツ」は「Ａさん」、「Ｂさん」それぞれの情報処理装置２から受信したキーワードであるため、「Ａさん」のユーザ興味度は0.3、「Ｂさん」のユーザ興味度は0.35とすると、積算結果は0.65になる。このようにキーワードごとにユーザ興味度を積算し、積算結果が大きい順にランキング付けを行う。 A first embodiment in which the second keywords received from the information processing apparatuses 2 owned by the users are ranked based on the information shown in FIG. 9 will be described. In the first embodiment, ranking is performed based on the keywords received from the information processing apparatuses 2 and the user interest level of each keyword. First, the user interest levels of the plurality of users are summed for each keyword. For example, the keyword “Giants” is received from the information processing apparatuses 2 of both “Mr. A” and “Mr. B”, so if the user interest level of “Mr. A” is 0.3 and that of “Mr. B” is 0.35, the sum is 0.65. The user interest levels are summed for each keyword in this way, and the keywords are ranked in descending order of the sum.

積算結果より、上位キーワードは「ソフトバンク」、「イチロー」、「ジャイアンツ」となり、これらのキーワードがユーザに送信されるコンテンツを取得するためのキーワードの候補となる。「Ａさん」のユーザ興味度からは「イチロー」というキーワードは「選定外」であったが、「Ｂさん」のユーザ興味度から「イチロー」というキーワードが選定されている。つまり、自身のユーザ興味度が低いキーワードであっても、他人のユーザ興味度が高ければキーワードとして選定される可能性がある。 From the integration result, the upper keywords are “SoftBank”, “Ichiro”, and “Giants”, and these keywords are candidate keywords for acquiring content to be transmitted to the user. The keyword “Ichiro” is “not selected” from the user interest level of “Mr. A”, but the keyword “Ichiro” is selected from the user interest level of “Mr. B”. That is, even if a keyword has a low degree of user interest, it may be selected as a keyword if the interest of another user is high.
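The summation-based ranking of the first embodiment can be sketched as follows. Only the two “Giants” values (0.3 and 0.35) come from the text; the remaining interest levels, the function name `rank_by_summed_interest`, and the data layout are hypothetical and chosen so that the top three match the keywords named above.

```python
from collections import defaultdict


def rank_by_summed_interest(received):
    """Sum each keyword's user interest level over all users and rank descending.

    received: mapping of user -> {keyword: user interest level}. Keywords a
    user did not send ("not selected") are simply absent, which is equivalent
    to treating their interest level as 0.
    """
    totals = defaultdict(float)
    for keywords in received.values():
        for term, interest in keywords.items():
            totals[term] += interest
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Top-three second keywords per user; values other than "Giants" are assumed.
received = {
    "Mr. A": {"Softbank": 0.7, "Carp": 0.4, "Giants": 0.3},
    "Mr. B": {"Ichiro": 0.68, "Giants": 0.35, "Hideki Matsui": 0.2},
}

for term, total in rank_by_summed_interest(received):
    print(f"{term}: {total:.2f}")
```

With these assumed values, “Ichiro” ranks near the top even though “Mr. A” never sent it, which is the effect the paragraph above describes.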

＜本実施形態におけるキーワードランキング化の第２の実施形態＞ <Second embodiment of keyword ranking in this embodiment>
次に、図１０に示した情報より、ユーザが保有する情報処理装置２から受信した第２のキーワードをランキング付けする第２の実施形態について説明する。第２の実施形態では、受信したキーワードの集計数からランキング付けを行う。本実施形態ではサーバ１とネットワーク３０を介して接続されている情報処理装置２は２台の想定であるが、例えば１０００台の情報処理装置２と接続されている場合などでは、受信するキーワードの重複が多々起きることが想定される。集計方法としては、「Ａさん」が保有する情報処理装置２から受信したキーワードの中に「イチロー」が含まれていれば“１”として集計する。このようにサーバ１とネットワーク３０で接続されている全てのユーザの情報処理装置２より受信したキーワードの集計量を算出する。 Next, a second embodiment in which the second keywords received from the information processing apparatuses 2 owned by the users are ranked based on the information shown in FIG. 10 will be described. In the second embodiment, ranking is performed based on the count of received keywords. The present embodiment assumes two information processing apparatuses 2 connected to the server 1 via the network 30, but when, for example, 1000 information processing apparatuses 2 are connected, the received keywords are expected to overlap frequently. As the counting method, if “Ichiro” is included in the keywords received from the information processing apparatus 2 owned by “Mr. A”, it is counted as “1”. The aggregate amount of the keywords received from the information processing apparatuses 2 of all users connected to the server 1 via the network 30 is calculated in this way.

算出結果により、上位キーワードは「イチロー」、「ジャイアンツ」、「ソフトバンク」となり、これらのキーワードが、ユーザが保有する情報処理装置２に送信されるコンテンツを取得するためのキーワードの候補となる。 According to the calculation result, the top keywords are “Ichiro”, “Giants”, and “Softbank”, and these keywords become the candidate keywords for acquiring the content to be transmitted to the information processing apparatuses 2 owned by the users.
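The count-based ranking of the second embodiment can be sketched as follows. The sample data uses only the two keyword lists of “Mr. A” and “Mr. B” given for FIG. 9 and is not meant to reproduce FIG. 10; the function name `rank_by_count` is an assumption for this example.

```python
from collections import Counter


def rank_by_count(received):
    """Count how many devices sent each keyword and rank by that count.

    received: mapping of user -> list of second keywords. Each device
    contributes at most 1 per keyword, as in the counting rule above.
    """
    counts = Counter()
    for keywords in received.values():
        # dict.fromkeys removes duplicates within one device, keeping order.
        for term in dict.fromkeys(keywords):
            counts[term] += 1
    # most_common() sorts by count, keeping first-seen order among ties.
    return counts.most_common()


received_keywords = {
    "Mr. A": ["Softbank", "Carp", "Giants"],
    "Mr. B": ["Ichiro", "Giants", "Hideki Matsui"],
}

for term, count in rank_by_count(received_keywords):
    print(term, count)
```

With only two devices most keywords tie at 1; the counts become more discriminating as the number of connected devices grows, which is the 1000-device situation the text anticipates.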

本実施形態におけるキーワードランキング化のその他の実施形態としては、例えば一般興味度に基づいて選定されたキーワードと重複するキーワードと、一般興味度に基づいて選定されなかったユーザ固有のキーワードとが共存する場合は、ユーザ固有のキーワードを優先的に上位にランキング化し、一般興味度に基づいて選定されたキーワードと重複するキーワードは除外する。このようにキーワードの数自体を極力減らすことでパフォーマンスの向上が期待できる。また、ユーザ固有のキーワードは、そのユーザごとに優先的にランキング化する方法がある。例えば、一般興味度に基づいて選定されなかった「ソフトバンク」、「イチロー」は、「ソフトバンク」はＡさん固有のキーワードであり、「イチロー」はＢさん固有のキーワードであるため、「Ａさん」に対しては「ソフトバンク」を優先的に上位にランク付けし、「Ｂさん」に対しては「イチロー」を優先的に上位にランク付ける。このように各ユーザの固有のキーワードを、ユーザごとに優先的にランク付けすることで各ユーザ固有の嗜好性の高いキーワードを漏らさず選定することができる。また、社会一般の興味度が高いキーワードにおいて、取得したコンテンツに対するユーザの反響、つまりアクセス回数などに予めしきい値を設けておき、所定の回数以上のアクセスがあったコンテンツに関連するキーワードを優先的に上位にランキング化するなどの方法もある。 As another embodiment of keyword ranking in this embodiment, when, for example, keywords that overlap with the keywords selected based on the general interest level coexist with user-specific keywords that were not selected based on the general interest level, the user-specific keywords are preferentially ranked higher, and the keywords that overlap with those selected based on the general interest level are excluded. Reducing the number of keywords as far as possible in this way can be expected to improve performance. There is also a method in which user-specific keywords are preferentially ranked for each user. For example, of “Softbank” and “Ichiro”, which were not selected based on the general interest level, “Softbank” is a keyword specific to Mr. A and “Ichiro” is a keyword specific to Mr. B, so “Softbank” is preferentially ranked higher for “Mr. A” and “Ichiro” is preferentially ranked higher for “Mr. B”. Preferentially ranking each user's specific keywords for that user in this way makes it possible to select each user's high-preference keywords without omission. In addition, for keywords with a high general interest level, there is a method in which a threshold is set in advance for the users' response to the acquired content, such as the number of accesses, and keywords related to content that has been accessed a predetermined number of times or more are preferentially ranked higher.
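The per-user prioritization described above can be sketched as follows. This is one possible reading of the text, not a definitive implementation: the function name `personalize_ranking`, its parameters, and the choice to keep the remaining keywords in their original rank order are assumptions.

```python
def personalize_ranking(ranked, user_keywords, first_keywords, drop_overlap=False):
    """Reorder a shared keyword ranking for one user.

    ranked: keywords in server-side rank order.
    user_keywords: second keywords sent by this user's device.
    first_keywords: keywords selected from the general interest level.
    drop_overlap: if True, remove keywords that overlap with first_keywords.
    """
    general = set(first_keywords)
    # User-specific keywords: sent by this user and not chosen from general interest.
    own = set(user_keywords) - general
    if drop_overlap:
        ranked = [kw for kw in ranked if kw not in general]
    prioritized = [kw for kw in ranked if kw in own]
    others = [kw for kw in ranked if kw not in own]
    return prioritized + others


ranked = ["Softbank", "Ichiro", "Giants"]
first = ["Giants", "Hideki Matsui"]
mr_a = ["Softbank", "Carp", "Giants"]
mr_b = ["Ichiro", "Giants", "Hideki Matsui"]

print(personalize_ranking(ranked, mr_a, first))
print(personalize_ranking(ranked, mr_b, first))
print(personalize_ranking(ranked, mr_a, first, drop_overlap=True))
```

Setting `drop_overlap=True` corresponds to excluding keywords that duplicate the first keywords, which shortens the list of keywords for which content has to be fetched.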

複数のユーザから数種の第２のキーワードを受信した際に、受信した全ての第２のキーワードに関連するコンテンツを取得するのではなく、所定のランキング手法に基づいてキーワードのランキング化を行うことでキーワードの優劣を規定することができ、興味度が高くないキーワードを予め省けるため全体としてのパフォーマンス向上が期待できる。 When several types of second keywords are received from a plurality of users, the keywords are ranked based on a predetermined ranking method, instead of acquiring content related to all the received second keywords. It is possible to define the superiority or inferiority of keywords, and it is possible to expect the improvement of the performance as a whole because keywords that are not of high interest can be omitted in advance.

サーバ１のキーワードランキング手段１０４は、ＣＰＵ１０がメモリ１１に記憶されている所定のキーワードランキングプログラムに基づいてＨＤＤ１２に記憶されているデータベース等を読み出して実行し、演算処理等されたデータをメモリ１１に一時的に記憶、もしくはＨＤＤ１２などに記憶することで実現が可能である。 The keyword ranking unit 104 of the server 1 reads and executes a database or the like stored in the HDD 12 based on a predetermined keyword ranking program stored in the memory 11 by the CPU 10, and stores the processed data in the memory 11. This can be realized by temporarily storing it in the HDD 12 or the like.

図１１として、キーワードランキング化に基づいて選定されたキーワードに関連するコンテンツの表示例を示す。図１１は「Ａさん」保有の情報処理装置２にコンテンツを表示する表示例である。キーワードランキング化の手法は、前述したユーザ興味度に基づいてもよいし、集計量に基づいてもよく、ユーザ固有のキーワードの漏らさず選定できれば特に限定はしない。図１１では、図８と同様に指定されたドキュメントと共に一般興味度に基づいて取得された第１のコンテンツと、ユーザ興味度に基づいて取得された第２のコンテンツが共に表示されている。 FIG. 11 shows a display example of content related to the keywords selected based on keyword ranking, as displayed on the information processing apparatus 2 owned by “Mr. A”. The keyword ranking method may be based on the user interest level described above or on the aggregate amount, and is not particularly limited as long as user-specific keywords can be selected without omission. In FIG. 11, as in FIG. 8, the first content acquired based on the general interest level and the second content acquired based on the user interest level are displayed together with the designated document.

「Ａさん」が保有する情報処理装置２から受信したキーワードの特徴から「イチロー」というキーワードは「選定外」であったが、他のユーザが保有する情報処理装置２から受信したキーワードの特徴から「イチロー」が選定されたため、「Ａさん」が保有する情報処理装置２で「イチロー」に関連するコンテンツが表示されている。また、キーワードをコンテンツと共に表示することで、指定されたドキュメントに対して社会一般が関心を寄せているフレーズと、自身、もしくは不特定多数のユーザが関心を寄せているフレーズがどのようなものであるかを認識することが可能である。 Based on the characteristics of the keywords received from the information processing apparatus 2 owned by “Mr. A”, the keyword “Ichiro” was “not selected”; however, because “Ichiro” was selected based on the characteristics of the keywords received from the information processing apparatus 2 owned by another user, content related to “Ichiro” is displayed on the information processing apparatus 2 owned by “Mr. A”. In addition, displaying the keywords together with the content lets the user recognize which phrases the general public is interested in for the designated document, and which phrases the user or an unspecified number of other users are interested in.

以上、本発明の実施形態について説明を行った。本発明により、装置間でコンテンツと共にコンテンツを検索するためのキーワードを併せて共有でき、ユーザ自身の嗜好性が高いコンテンツだけでなく、社会一般、また第３者の嗜好性が高いコンテンツも認識することが可能となる。本発明の情報処理システムにより、従来と比較してユーザに提供するレコメンド情報範囲の拡大が期待できる。 The embodiment of the present invention has been described above. According to the present invention, it is possible to share a keyword for searching content together with content between devices, and recognize not only content with high user's own preference but also content with general society and high preference for third parties. It becomes possible. With the information processing system of the present invention, it is possible to expect an expansion of the recommended information range provided to the user as compared with the conventional system.

図１２は、本発明の実施形態にかかるコンテンツ表示のフローチャートである。 FIG. 12 is a flowchart of content display according to the embodiment of the present invention.

まず、サーバ１が、閲覧ドキュメントに関連する第１のキーワードを選定する（ステップ１）。選定された第１のキーワードに関連する第１のコンテンツを取得する（ステップ２）。第１のキーワード、および取得した第１のコンテンツを情報処理装置２に送信する（ステップ３）。次に、情報処理装置２が閲覧ドキュメントに関連する第２のキーワードを選定する（ステップ４）。尚、情報処理装置２が閲覧ドキュメントに関連する第２のキーワードを選定するタイミングは、サーバ１が第１のキーワード選定するタイミングでもよく、また、第１のコンテンツを受信したタイミングでもよく、特に制限は設けない。 First, the server 1 selects a first keyword related to the viewed document (step 1). First content related to the selected first keyword is acquired (step 2). The first keyword and the acquired first content are transmitted to the information processing apparatus 2 (step 3). Next, the information processing apparatus 2 selects a second keyword related to the browse document (step 4). Note that the timing at which the information processing device 2 selects the second keyword related to the viewed document may be the timing at which the server 1 selects the first keyword, or the timing at which the first content is received, and is particularly limited. Is not provided.

情報処理装置２が、選定された第２のキーワードをサーバ１に送信する（ステップ５）。次に、サーバ１が情報処理装置２より受信した第２のキーワードに関連する第２のコンテンツを取得する（ステップ６）。第２のキーワード、および第２のコンテンツを情報処理装置２に送信する（ステップ７）。次に、情報処理装置２がサーバ１より受信した第１コンテンツ、および第２コンテンツを閲覧ドキュメントと共に表示する（ステップ８）。 The information processing device 2 transmits the selected second keyword to the server 1 (step 5). Next, the server 1 acquires the second content related to the second keyword received from the information processing device 2 (step 6). The second keyword and the second content are transmitted to the information processing device 2 (step 7). Next, the information processing apparatus 2 displays the first content and the second content received from the server 1 together with the browsing document (step 8).
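Steps 1 to 8 of FIG. 12 can be sketched as a single in-process exchange. The classes `Server` and `Device`, the helper `top_terms`, the placeholder content strings, and all interest values are hypothetical; network transfer is modelled as plain function calls, and real content retrieval is outside the scope of this sketch.

```python
def top_terms(interest, top_n):
    """Return the top_n terms of an interest mapping, highest level first."""
    return sorted(interest, key=interest.get, reverse=True)[:top_n]


class Server:
    def __init__(self, general_interest):
        self.general_interest = general_interest  # stands in for the first database 100

    def select_first_keywords(self, top_n=2):
        return top_terms(self.general_interest, top_n)

    def fetch_content(self, keywords):
        # Placeholder for acquiring content related to each keyword.
        return {kw: f"content about {kw}" for kw in keywords}


class Device:
    def __init__(self, user_interest):
        self.user_interest = user_interest  # stands in for the second database 200

    def select_second_keywords(self, top_n=3):
        return top_terms(self.user_interest, top_n)


def display_flow(server, device):
    first_keywords = server.select_first_keywords()         # step 1
    first_content = server.fetch_content(first_keywords)    # step 2
    # Step 3: the server sends first keywords and content to the device.
    second_keywords = device.select_second_keywords()       # step 4
    # Step 5: the device sends the second keywords to the server.
    second_content = server.fetch_content(second_keywords)  # step 6
    # Step 7: the server sends second keywords and content to the device.
    # Step 8: the device shows both sets alongside the browsed document.
    return {"first": first_content, "second": second_content}


server = Server({"Giants": 0.5, "Hideki Matsui": 0.4, "Ichiro": 0.2,
                 "Softbank": 0.1, "Carp": 0.05})
device = Device({"Softbank": 0.7, "Carp": 0.4, "Giants": 0.3,
                 "Ichiro": 0.1, "Hideki Matsui": 0.05})
shown = display_flow(server, device)
print(list(shown["first"]), list(shown["second"]))
```

As the text notes, step 4 does not have to wait for step 3; the sketch runs the steps sequentially only for readability.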

図１３は、本発明の実施形態にかかるキーワードランキング化のフローチャートである。 FIG. 13 is a flowchart of keyword ranking according to the embodiment of the present invention.

複数の情報処理装置２が、閲覧ドキュメントに関連する第２のキーワードを選定する（ステップ９、ステップ１１）。選定した第２のキーワードをサーバ１に送信する（ステップ１０、ステップ１２）。サーバ１が複数の情報処理装置２より受信した第２のキーワードを所定の基準に従ってランキング化する（ステップ１３）。ランキング上位となったキーワードに関連する第２のコンテンツを追加で取得する（ステップ１４）。 The plurality of information processing devices 2 select a second keyword related to the browsed document (Steps 9 and 11). The selected second keyword is transmitted to the server 1 (steps 10 and 12). The server 1 ranks the second keyword received from the plurality of information processing devices 2 according to a predetermined standard (step 13). The second content related to the keyword having the highest ranking is additionally acquired (step 14).

本願発明を実現できるような構成であれば、用いる装置の具備する内容、および装置の数量などは本実施例に限定されない。 As long as the present invention can be realized, the contents of the apparatus used, the number of apparatuses, and the like are not limited to the present embodiment.

１００第１のデータベース
１０１第１のキーワード選定手段
１０２第１のコンテンツ取得手段
１０３キーワードコンテンツ送信手段
１０４キーワードランキング手段
２００第２のデータベース
２０１第２のキーワード選定手段
２０２コンテンツ受信手段
２０３表示手段 DESCRIPTION OF SYMBOLS 100 1st database 101 1st keyword selection means 102 1st content acquisition means 103 Keyword content transmission means 104 Keyword ranking means 200 2nd database 201 2nd keyword selection means 202 Content reception means 203 Display means

Claims

An information processing system having an information processing apparatus and a server connected via a network,
The server
A first database for storing a general interest level in association with a term that is a word included in a document;
A first keyword selecting means for selecting a first keyword related to a viewing document that the user browses based on the degree of interest of the general society;
First content acquisition means for acquiring first content related to the selected first keyword;
Keyword content transmission means for transmitting the selected first keyword and the acquired first content to the information processing apparatus;
With
Information processing device
A second database that stores the user's interest level in association with terms that are words included in the document;
Second keyword selection means for selecting a second keyword related to the browsing document based on the degree of interest of the user;
Content receiving means for transmitting the second keyword and receiving second content related to the second keyword;
Display means for displaying the received first content and second content together with the browse document;
Comprising
An information processing system characterized by this.

The second keyword selecting means selects, as the second keyword, a keyword that does not overlap with the received first keyword based on the degree of interest of the user.
The information processing system according to claim 1.

The display means displays the second keyword and the received first keyword together with the first content and the second content on the browsing document;
The information processing system according to claim 1 or 2.

The content receiving means transmits the second keyword to the server, and receives second content related to the second keyword from the server.
The information processing system according to any one of claims 1 to 3.

The server
Keyword ranking means for ranking based on the characteristics of the second keyword received from the information processing device;
Further comprising
The information processing system according to claim 4.

The keyword ranking means calculates the aggregate amount of the second keyword, and ranks the keywords in descending order of the aggregate amount.
The information processing system according to claim 5.

A keyword content receiving means for receiving a first keyword related to a browsing document browsed by a user and a first content related to the first keyword;
A database that stores the user's interest level in association with terms that are words included in the document;
Keyword selection means for selecting a second keyword related to the browse document based on the degree of interest of the user;
Content receiving means for transmitting the second keyword and receiving second content related to the second keyword;
Display means for displaying the received first content and second content together with the browse document;
Comprising
An information processing apparatus characterized by that.

Receiving a first keyword associated with a viewing document viewed by a user and first content associated with the first keyword;
Generating a database for storing a user's interest level in association with a term that is a word included in the document;
Selecting a second keyword associated with the browsed document based on the user's degree of interest;
Transmitting the second keyword and receiving second content related to the second keyword;
Displaying the received first content and second content together with the browsing document;
To run on a computer,
program.