JP7740446B2

JP7740446B2 - Information processing device, information processing method, and program

Info

Publication number: JP7740446B2
Application number: JP2024097707A
Authority: JP
Inventors: 聡田端; 慧吾廣川; 寛樹吉原; 遥前田
Original assignee: Dai Nippon Printing Co Ltd
Current assignee: Dai Nippon Printing Co Ltd
Priority date: 2018-09-28
Filing date: 2024-06-17
Publication date: 2025-09-17
Anticipated expiration: 2039-09-25
Also published as: JP2024107473A; JP2020057381A

Description

The present invention relates to an information processing device, an information processing method, and a program.

Various methods have been proposed to assist in creating layouts for magazines, books, newspapers, and the like. For example, Patent Document 1 discloses an information processing device that calculates the degree of relevance between pieces of content, such as text and images, that serve as layout materials, and determines a layout so that highly related pieces of content are placed close to each other.

Patent Document 1: JP 2009-169536 A

However, the invention of Patent Document 1 merely places highly related pieces of content close to each other, and the resulting layout may not be appropriate when viewed as a whole.

In one aspect, an object is to provide an information processing device and the like that can generate an appropriate layout.

In one aspect, the information processing device includes an acquisition unit that acquires a plurality of pieces of content, a generation unit that generates a layout image in which the pieces of content are arranged within a predetermined area, an evaluation unit that acquires an evaluation of the generated layout image using a classifier that has been trained on a plurality of layout images, and an output unit that outputs the evaluation result.

In one aspect, an appropriate layout can be generated.

FIG. 1 is a schematic diagram showing an example of the configuration of a layout generation system.
FIG. 2 is a block diagram showing an example of the configuration of a server.
FIG. 3 is an explanatory diagram showing an overview of the present embodiment.
FIG. 4 is an explanatory diagram of the layout learning process.
FIG. 5 is an explanatory diagram of the layout generation process.
FIG. 6 is a flowchart showing an example of the procedure of the layout learning process.
FIG. 7 is a flowchart showing an example of the procedure of the layout generation process.
FIG. 8 is an explanatory diagram showing an overview of Embodiment 2.
FIG. 9 is a flowchart showing an example of the procedure of the layout generation process according to Embodiment 2.
FIG. 10 is a functional block diagram showing the operation of the server of the above-described embodiments.
FIG. 11 is an explanatory diagram showing the main part of the scoring model 141.
FIG. 12 is an explanatory diagram showing an example of calculating a total score.
FIG. 13 is an explanatory diagram showing an overview of Embodiment 4.
FIG. 14 is a flowchart showing an example of the procedure of the layout generation process according to Embodiment 4.

The present invention will be described in detail below with reference to the drawings showing embodiments thereof.
(Embodiment 1)
FIG. 1 is a schematic diagram showing an example of the configuration of a layout generation system. This embodiment describes a layout generation system that automatically generates page layouts for documents such as magazines, books, and newspapers. The layout generation system includes an information processing device 1 and a terminal 2. The devices are communicatively connected via a network N such as the Internet.

The information processing device 1 is a device capable of various kinds of information processing and of transmitting and receiving information, such as a server device or a personal computer. In this embodiment, the information processing device 1 is assumed to be a server device and, for brevity, is referred to as the server 1 in the following description. The server 1 acquires content such as images and text, which are the layout materials for a document page, from the terminal 2, and generates a layout image in which each piece of content is arranged within the document page. As described below, the server 1 uses a scoring model 141 (classifier) that has been trained by machine learning on the layouts of existing document pages to calculate a score representing the evaluation value of the generated layout image. The server 1 returns (outputs) information about the generated layout image to the terminal 2 according to the calculated score.

The terminal 2 is a terminal device used by each user of this system, such as a personal computer, a smartphone, or a tablet. A user of this system is, for example, a publisher that produces magazines, books, and the like, and uses this system to create document page layouts.

Note that the layout images generated by the server 1 are not limited to pages of printed materials such as magazines and books, and may also relate to page layouts on the Web, such as Web pages.

FIG. 2 is a block diagram showing an example of the configuration of the server 1. The server 1 includes a control unit 11, a main memory unit 12, a communication unit 13, and an auxiliary storage unit 14. The control unit 11 has one or more arithmetic processing devices such as a CPU (Central Processing Unit), an MPU (Micro-Processing Unit), or a GPU (Graphics Processing Unit), and performs the various information processing and control processing of the server 1 by reading and executing a program P stored in the auxiliary storage unit 14. The main memory unit 12 is a temporary storage area such as SRAM (Static Random Access Memory), DRAM (Dynamic Random Access Memory), or flash memory, and temporarily stores data that the control unit 11 needs to execute arithmetic processing. The communication unit 13 is a communication module that performs communication-related processing and transmits and receives information to and from external devices.

The auxiliary storage unit 14 is a large-capacity memory, a hard disk, or the like, and stores the program P and other data that the control unit 11 needs to execute processing. The auxiliary storage unit 14 also stores the scoring model 141 used to calculate the score of a layout image. The scoring model 141 is a trained model (classifier) for calculating layout image scores and, as described below, is generated using layout images of existing document pages as training data.

Note that the auxiliary storage unit 14 may be an external storage device connected to the server 1. The server 1 may also be a multi-computer system consisting of a plurality of computers, or a virtual machine constructed virtually by software.

Furthermore, in this embodiment, the server 1 is not limited to the above configuration, and may include, for example, a reading unit that reads information stored on a portable storage medium, an input unit that accepts operation input, and a display unit that displays images.

FIG. 3 is an explanatory diagram showing an overview of this embodiment. FIG. 3 schematically shows how the server 1 generates a layout image of a document page in response to a request from the terminal 2. An overview of this embodiment is given below with reference to FIG. 3. First, the server 1 acquires, from the terminal 2, data on a plurality of pieces of content to be placed on a document page, and accepts a request to generate a layout image. For example, the server 1 acquires document data in which the text, images, and the like to be placed on the document page are structured, such as an XML (Extensible Markup Language) file. Note that the content data acquired from the terminal 2 is not limited to structured data and may be unstructured data.

When acquiring the content data from the terminal 2, the server 1 also accepts input specifying the number of pages of the document to be created. As described later, the server 1 assigns the acquired pieces of content to individual pages, arranges them, and generates a layout image for each page. The process of generating layout images for multiple pages is described in detail later.

By executing the program P, the server 1 functions as a layout information generation unit 111, a layout image generation unit 112, and a layout evaluation unit 113. The layout information generation unit 111 generates layout information for arranging the pieces of content acquired from the terminal 2 within a predetermined area corresponding to a document page. Specifically, the layout information generation unit 111 establishes minimum rules for a document layout (aligning content to a grid, ensuring that pieces of content do not overlap, and so on), and then, within the limits of those rules, randomly determines the placement coordinates at which each piece of content is arranged within the predetermined area, or the size of each piece of content (the area it occupies within the page).

The layout information generation unit 111 randomly determines the placement coordinates and other attributes of each piece of content and generates multiple patterns of layout information. FIG. 3 illustrates the generation of N patterns of layout information. The layout information generation unit 111 varies the coordinates, size, and other attributes of each piece of content from pattern to pattern, thereby generating N patterns of layout information whose layouts differ from one another.
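The random generation under minimum rules described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the layout information generation unit 111: the function names, the rectangle representation `(x, y, w, h)`, the rejection-sampling strategy, and the choice to keep content sizes fixed are all assumptions made for the example.

```python
import random

def overlaps(a, b):
    """Return True if rectangles a and b, given as (x, y, w, h), share any area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def random_layout(sizes, page_w, page_h, grid, rng, max_tries=1000):
    """Place each (w, h) box at a random grid-aligned position without overlap.

    Returns a list of (x, y, w, h) tuples, or None if a box could not be placed.
    """
    placed = []
    for w, h in sizes:
        for _ in range(max_tries):
            # Pick grid-aligned coordinates that keep the box inside the page.
            x = rng.randrange(0, (page_w - w) // grid + 1) * grid
            y = rng.randrange(0, (page_h - h) // grid + 1) * grid
            box = (x, y, w, h)
            if not any(overlaps(box, other) for other in placed):
                placed.append(box)
                break
        else:
            return None  # no free position found for this box
    return placed

def generate_layouts(sizes, page_w, page_h, grid, n, seed=0):
    """Generate up to n mutually different candidate layouts (the N patterns)."""
    rng = random.Random(seed)
    layouts = []
    for _ in range(n * 100):  # bounded retries so the loop always terminates
        if len(layouts) == n:
            break
        layout = random_layout(sizes, page_w, page_h, grid, rng)
        if layout is not None and layout not in layouts:
            layouts.append(layout)
    return layouts
```

Rejection sampling is used here only because it is short; a real system could also randomize sizes, which the description permits.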

The layout image generation unit 112 generates a layout image in which each piece of content is arranged within the predetermined area in accordance with the layout information generated by the layout information generation unit 111. Specifically, the layout image generation unit 112 generates N layout images, one for each of the N patterns of layout information generated by the layout information generation unit 111.

The layout evaluation unit 113 calculates, for each of the N generated layout images, a score that evaluates how plausible the layout is. Specifically, the layout evaluation unit 113 calculates the score of each of the N generated layout images using the scoring model 141 (classifier), which has been trained on the layouts of existing document pages. The scoring model 141 is described in detail later.

Based on the scores calculated above, the server 1 determines the ranking of the N layout images. The server 1 outputs information about the generated layout images to the terminal 2 according to the determined ranking. For example, the server 1 identifies the layout images ranked in the top M, where M is predetermined, and outputs the layout information (arrangement information) corresponding to those layout images to the terminal 2. At the terminal 2, the layout information acquired from the server 1 is imported into predetermined document editing software, and the user creates the final document page.

Alternatively, the server 1 may, for example, output information about layout images whose score is equal to or greater than a threshold. The server 1 may also output information about all generated layout images together with their scores and rankings. In short, the server 1 only needs to be able to present layout image information according to the calculated score (evaluation), and the manner of presentation is not limited to one based on ranking.
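The two output policies mentioned above (top M by rank, or all layouts at or above a threshold) can be sketched as follows. The function names and the `(layout_id, score)` pair representation are assumptions made for illustration.

```python
def top_m(scored, m):
    """Return the m highest-scoring (layout_id, score) pairs, best first."""
    return sorted(scored, key=lambda item: item[1], reverse=True)[:m]

def above_threshold(scored, threshold):
    """Return every (layout_id, score) pair whose score is at least threshold."""
    return [item for item in scored if item[1] >= threshold]

def with_ranks(scored):
    """Return (rank, layout_id, score) for all layouts, rank 1 being the best."""
    ordered = sorted(scored, key=lambda item: item[1], reverse=True)
    return [(rank, lid, score) for rank, (lid, score) in enumerate(ordered, start=1)]
```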

FIG. 4 is an explanatory diagram of the layout learning process. In this embodiment, the server 1 calculates scores using, as the scoring model 141, a neural network constructed by deep learning, specifically a CNN (Convolutional Neural Network). FIG. 4 conceptually illustrates how deep learning is performed using layout images of existing document pages as training data to construct (generate) the scoring model 141. The layout learning process for constructing the scoring model 141 is described below with reference to FIG. 4.

Although the scoring model 141 is described as a CNN in this embodiment, it may be another type of trained model, such as a different neural network, an SVM (Support Vector Machine), a Bayesian network, or a decision tree.

In this embodiment, the server 1 generates the scoring model 141 using a rank learning (learning to rank) technique. Rank learning is a technique that learns the ordering of a set of data. Known neural networks that perform rank learning by deep learning include, for example, DeepRank and SiameseNet. Because rank learning is a well-known technique, a detailed description is omitted.

For example, the server 1 acquires layout images of existing, manually created document pages from the terminal 2 and uses them as training data. For convenience, such a layout image is referred to below as an "existing layout image." The server 1 uses each existing layout image as positive (correct) data with a score of "1."

The server 1 further generates a plurality of layout images with altered content placement by randomly swapping the placement coordinates of the pieces of content arranged in the existing layout image. For convenience, such a layout image is referred to below as a "fake layout image." The server 1 uses each fake layout image as negative (incorrect) data with a score of "0."
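One way to produce such fake layouts is to shuffle the placement coordinates among the pieces of content. The sketch below assumes, purely for illustration, that a layout is a dictionary mapping a content identifier to an `(x, y)` coordinate; the function name and labels are hypothetical.

```python
import random

def make_fake_layouts(layout, count, seed=0):
    """Return `count` layouts in which the coordinates of `layout` are permuted.

    `layout` maps a content id to its (x, y) placement coordinate. Each fake
    layout reuses the same coordinates but assigns them to different contents.
    """
    ids = list(layout)
    coords = [layout[i] for i in ids]
    if len(set(coords)) < 2:
        raise ValueError("need at least two distinct coordinates to permute")
    rng = random.Random(seed)
    fakes = []
    for _ in range(count):
        shuffled = coords[:]
        while shuffled == coords:  # make sure the placement actually changes
            rng.shuffle(shuffled)
        fakes.append(dict(zip(ids, shuffled)))
    return fakes

def labelled_examples(layout, fakes):
    """Pair the existing layout with label 1 and every fake layout with label 0."""
    return [(layout, 1)] + [(fake, 0) for fake in fakes]
```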

The server 1 inputs the existing layout image and the fake layout images, in which the content placement has been changed from the existing layout image, into the neural network and performs rank learning. Specifically, the server 1 trains the network so that the score of the existing layout image is higher than the score of a fake layout image. The server 1 compares the score of the existing layout image with the score of each of the fake layout images generated above, and trains the network so that the score of the existing layout image is higher than the score of every fake layout image. The server 1 thereby generates the scoring model 141, and uses the generated scoring model 141 to calculate the scores of layout images.
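The description does not state which loss function is used for rank learning. As one common possibility, a pairwise margin (hinge) loss penalizes every fake layout whose score is not at least a margin below the score of the existing layout. The sketch below only computes that loss from already-computed scores; the function name and the margin value are assumptions, and the CNN itself is not shown.

```python
def pairwise_hinge_loss(real_score, fake_scores, margin=0.5):
    """Mean of max(0, margin - (real_score - fake_score)) over all fake layouts.

    The loss is zero only when the existing layout outscores every fake layout
    by at least `margin`, which matches the training goal in the text.
    """
    if not fake_scores:
        raise ValueError("at least one fake score is required")
    terms = [max(0.0, margin - (real_score - fake)) for fake in fake_scores]
    return sum(terms) / len(terms)
```

Minimizing this quantity with respect to the model parameters pushes the existing layout above each fake layout, which is the ordering the scoring model 141 is trained to reproduce.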

Although the server 1 is described above as generating part of the training data (the fake layout images) itself, all of the training data may instead be created manually. The training data may also be manually labeled with ground-truth values such as scores or rankings.

Furthermore, the layout learning process and the layout generation process need not be performed by the same processing entity (the server 1).

FIG. 5 is an explanatory diagram of the layout generation process. FIG. 5 conceptually illustrates the process of generating layout images. As described above, the server 1 acquires structured data on the content to be placed on the document pages from the terminal 2. When acquiring the content data, the server 1 also acquires from the terminal 2 the number of pages K of the document specified by the user.

The server 1 assigns the pieces of content acquired from the terminal 2 to individual pages and arranges them so that the document has the specified number of pages K, generating K layout images. The server 1 randomly determines the page and the placement coordinates for each piece of content, generates layout images for K pages, and repeats this to generate N layout image groups, as shown in FIG. 5. In other words, the server 1 ultimately generates K × N layout images.
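The random assignment of content to K pages, repeated N times, can be sketched as follows. The function names are hypothetical, and the sketch deliberately ignores placement coordinates and allows empty pages; both are simplifications made for the example.

```python
import random

def assign_to_pages(content_ids, k, rng):
    """Randomly assign each content id to one of k pages; returns k lists."""
    pages = [[] for _ in range(k)]
    for cid in content_ids:
        pages[rng.randrange(k)].append(cid)
    return pages

def generate_candidate_sets(content_ids, k, n, seed=0):
    """Return n candidate documents, each a list of k pages (K x N pages total)."""
    rng = random.Random(seed)
    return [assign_to_pages(content_ids, k, rng) for _ in range(n)]
```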

The server 1 inputs each generated layout image into the scoring model 141 and calculates the score of the layout image. In this case, the server 1 prepares, for example, a different scoring model 141 for each page of the input layout images, and inputs each layout image into the scoring model 141 corresponding to its page to calculate the score. For example, the server 1 may prepare K scoring models 141 according to the number of pages, or may prepare scoring models 141 according to the page category within the document (for a magazine, for example, the cover, table of contents, article pages, and advertisement pages). During training, the server 1 uses the layout images corresponding to each page as training data and generates a scoring model 141 for each page in advance.

For each of the N layout image groups, the server 1 calculates the score of the layout image of each page using the scoring model 141 corresponding to that page. The server 1 then sums the scores of all pages to calculate a design score. Based on the calculated design scores, the server 1 ranks the N layout image groups. In this way, the server 1 obtains the ranking of the generated layout images.
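The per-page scoring and summation can be sketched as follows. Here each page-specific scoring model 141 is stood in for by an arbitrary Python callable that maps a page image to a number; this stand-in, along with the function names, is an assumption made for illustration.

```python
def design_score(page_images, page_models):
    """Sum the per-page scores, using the model that corresponds to each page."""
    if len(page_images) != len(page_models):
        raise ValueError("one scoring model is expected per page")
    return sum(model(image) for image, model in zip(page_images, page_models))

def rank_candidates(candidates, page_models):
    """Return (candidate_index, design_score) pairs sorted from best to worst."""
    scored = [(idx, design_score(pages, page_models))
              for idx, pages in enumerate(candidates)]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

With K pages and N candidate groups, `rank_candidates` evaluates K × N page images and returns N ranked design scores.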

As already explained, the server 1 outputs to the terminal 2 information about the layout image groups whose design scores rank in the top M among the N layout image groups.

As described above, the server 1 evaluates layout images using the scoring model 141, which has been trained on document layouts, and presents highly rated layout images to the user. This saves the user the effort of deciding on a document layout and makes document creation more efficient.

FIG. 6 is a flowchart showing an example of the procedure of the layout learning process. The layout learning process is described below with reference to FIG. 6. The control unit 11 of the server 1 acquires existing layout images to be used as training data (step S11). An existing layout image is a layout image of a manually created document page. The control unit 11 generates fake layout images by rearranging the pieces of content included in the existing layout image acquired in step S11 (step S12). For example, the control unit 11 randomly rearranges the pieces of content included in the existing layout image acquired in step S11 to generate a plurality of fake layout images.

Based on the existing layout image acquired in step S11 and the fake layout images generated in step S12, the control unit 11 generates the scoring model 141, which is trained to output a score (evaluation) of a layout image when that layout image is input (step S13). As described above, the control unit 11 performs training using a rank learning technique and generates the scoring model 141 as a CNN. Specifically, the control unit 11 performs training using the existing layout image acquired in step S11 as positive data and the fake layout images generated in step S12 as negative data. The control unit 11 then ends the series of processes.

FIG. 7 is a flowchart showing an example of the procedure of the layout generation process. The layout generation process is described below with reference to FIG. 7. The control unit 11 of the server 1 acquires, from the terminal 2, the pieces of content to be arranged on the document pages and the specified number of pages (step S31). The control unit 11 generates layout information that assigns the pieces of content to individual pages and arranges them according to the specified number of pages (step S32). Specifically, the control unit 11 randomly determines the placement coordinates of each piece of content within the predetermined area and generates multiple patterns of layout information. In accordance with the generated layout information, the control unit 11 generates a layout image of each page in which the content is arranged within the predetermined area (step S33). Specifically, the control unit 11 generates multiple patterns of layout images in accordance with the multiple patterns of layout information generated in step S32.

Using the scoring model 141, which has been trained on a plurality of layout images, the control unit 11 calculates the score of the layout image of each page generated in step S33 (step S34). Specifically, the control unit 11 calculates the score of each page using a different scoring model 141 for each page. The control unit 11 calculates a score for each of the multiple patterns of layout images generated in step S33.

The control unit 11 sums the scores of the layout images of the individual pages and calculates a design score for each of the patterns (step S35). The control unit 11 ranks the layout images of the patterns according to the calculated design scores, and outputs the layout information of the top-ranked layout images to the terminal 2 (step S36). The control unit 11 then ends the series of processes.

Although the scoring model 141 is described above as outputting a score (evaluation value) for a layout image, it may instead output only a classification result indicating whether the image is suitable as a layout. In other words, the server 1 only needs to be able to obtain an evaluation of a layout image using the trained model, and the evaluation obtained is not limited to a continuous probability value.

As described above, according to Embodiment 1, outputting layout image information according to the score calculated using the scoring model 141 makes it possible to present an appropriate layout to the user.

Furthermore, according to Embodiment 1, determining the ranking of a plurality of layout images using the scoring model 141 and outputting the layout images according to the determined ranking makes it possible to present a more appropriate layout to the user.

Furthermore, according to Embodiment 1, calculating the scores of layout images using a different scoring model 141 for each page makes it possible to evaluate the layout images appropriately, taking the characteristics of each page into account.

(Embodiment 2)
This embodiment describes a form in which the user specifies layout conditions in advance and layouts that follow the specified conditions are output. Content that overlaps with Embodiment 1 is given the same reference numerals, and its description is omitted. FIG. 8 is an explanatory diagram showing an overview of Embodiment 2. Because FIG. 8 is almost the same as FIG. 3, the description of the common matters is omitted. An overview of this embodiment is given below with reference to FIG. 8.

In this embodiment, when the server 1 receives a request to generate layout images from the terminal 2, it acquires, in addition to the data for each piece of content and the specified number of pages, condition information that defines the conditions for arranging each piece of content on the document pages. The condition information consists of layout conditions freely specified by the user creating the document, and represents prior knowledge about the content that only a human (the user) has and that is needed to determine the layout.

In this embodiment, the server 1 accepts input of condition information for images, among content such as images and text. For example, the server 1 accepts input of the importance of each image and the order in which the images are to be arranged within the document pages. The importance is input, for example, as a rank on a multi-level scale. The order is input, for example, as ascending numbers.

For example, when the document is a magazine, images are often laid out so that more important images are larger, for instance by choosing between a double-page spread, a full page, or a half page according to the importance of the image. The server 1 therefore acquires the importance of each image as condition information and determines the size of the image according to its importance.

When multiple images are related to one another, the order in which they should be placed on a page is often determined by their content. For example, when a photo of a subject as a whole and a photo of its details are laid out on a page, it is generally better to place the overall photo first. The server 1 therefore acquires the placement order of the images as condition information.

As in the first embodiment, the server 1 generates layout images in which the pieces of content are randomly arranged and inputs them into the scoring model 141 to calculate design scores. In this embodiment, the server 1 additionally corrects the calculated design score based on the condition information described above. That is, the server 1 changes the evaluation of the layout image according to the condition information.

Specifically, the server 1 compares the size, placement order, etc. of the content (images) in the layout image of each page with the importance, placement order, etc. specified in the condition information, and determines whether each piece of content is arranged with a size, order, etc. that violates the condition information. The server 1 then counts the number of pieces of content arranged in violation of the condition information. That is, the server 1 counts the number of violations of the layout conditions specified by the user.

The server 1 multiplies the counted number of pieces of content (the number of violations) by a predetermined coefficient and subtracts the product from the design score. That is, the server 1 imposes a penalty according to the number of violations. The server 1 ranks the layout images by the design score finally obtained through this processing and outputs information on the top-ranked layout images to the terminal 2.
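The penalty described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `PlacedImage` fields, the pairwise rules used to decide what counts as a violation, and the default coefficient of 0.1 are all assumptions introduced here, since the text only states that violating pieces of content are counted and that the count is multiplied by a predetermined coefficient.

```python
from dataclasses import dataclass


@dataclass
class PlacedImage:
    name: str
    area: float       # area the image occupies in the layout (hypothetical unit)
    position: int     # position in page reading order, 0 = first (assumed)
    importance: int   # user-specified rank, larger = more important (assumed)
    order: int        # user-specified placement order, ascending


def count_violations(images):
    """Count images whose size or position contradicts the condition information.

    Assumed rules: an image violates the conditions if it is more important
    than another image but drawn smaller, or if it is specified to come
    earlier than another image but placed later.
    """
    violating = set()
    for a in images:
        for b in images:
            if a.importance > b.importance and a.area < b.area:
                violating.add(a.name)
            if a.order < b.order and a.position > b.position:
                violating.add(a.name)
    return len(violating)


def corrected_score(design_score, images, coefficient=0.1):
    """Subtract (number of violations x coefficient) from the design score."""
    return design_score - coefficient * count_violations(images)
```

For instance, if an overall photo marked as most important and first in order is drawn smaller and placed after a detail photo, it is counted as one violating image, so a design score of 0.9 would be corrected to about 0.8 under the assumed coefficient.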

As described above, the server 1 acquires in advance the condition information that defines the content placement conditions and evaluates the layout images based on that condition information. This makes it possible to present the layout images that the user wants.

Although the condition information is used above only to evaluate layout images, this embodiment is not limited to that use, and the server 1 may also use the condition information to generate layout images. For example, the server 1 determines the image size, etc. by referring to the importance, placement order, etc. of the images specified in the condition information, and arranges the images on each page. The same effect as above is obtained in this case as well.

Fig. 9 is a flowchart showing an example of the procedure of the layout generation processing according to the second embodiment. The control unit 11 of the server 1 acquires from the terminal 2, in addition to the multiple pieces of content to be placed on the document pages and the specified number of pages, condition information that defines the content placement conditions (step S201). The condition information defines the layout conditions for placing each piece of content on a page and is, for example, information such as the importance and placement order of the content, as described above. The control unit 11 then proceeds to step S32.

After calculating the design score of the layout image (step S34), the control unit 11 corrects (changes) the design score based on the condition information acquired in step S201 (step S202). Specifically, the control unit 11 compares the size, placement order, etc. of each piece of content in the generated layout image with the importance, placement order, etc. of the content specified in the condition information, and determines whether each piece of content is arranged with a size, order, etc. that violates the condition information. The control unit 11 counts the number of pieces of content arranged in violation of the condition information and subtracts a predetermined value from the design score according to the counted number. The control unit 11 then proceeds to step S36.

As described above, according to the second embodiment, evaluating the layout images according to the condition information specified by the user makes it possible to present a more appropriate layout.

Furthermore, according to the second embodiment, layout images can also be generated based on the condition information specified by the user.

(Embodiment 3)
Fig. 10 is a functional block diagram showing the operation of the server 1 of the embodiments described above. When the control unit 11 executes the program P, the server 1 operates as follows. The acquisition unit 101 acquires a plurality of contents. The generation unit 102 generates a layout image in which the plurality of contents are arranged in a predetermined area. The evaluation unit 103 acquires an evaluation of the generated layout image using a classifier that has been trained on a plurality of layout images. The output unit 104 outputs the evaluation result.

The third embodiment is as described above and is otherwise the same as the first and second embodiments, so corresponding parts are given the same reference numerals and their detailed description is omitted.

(Embodiment 4)
The embodiments described above can provide the user with an appropriate layout. When deciding on a layout, particularly at an early stage of consideration, a wide variety of layout candidates may also be desirable. This point is described below.

For each of the N generated layout images, the layout evaluation unit 113 calculates a score that evaluates the plausibility of the layout (as in the first to third embodiments) and also calculates a diversity score that represents diversity, and then calculates an overall score based on both scores. The method for calculating the overall score is described below.

Fig. 11 is an explanatory diagram showing the main part of the scoring model 141. The scoring model 141 can be, for example, a CNN, and has fully connected layers 141a and 141b. The number of fully connected layers is not limited to the example in Fig. 11. In the fully connected layers 141a and 141b, both the input and the output are vectors. Because the fully connected layer 141b is located immediately before the output layer, its vector combines the features of the input layout image and thus corresponds to an identification part for classifying the layout. In the fourth embodiment, the vector of the fully connected layer 141b is used as an identification index for identifying the features of a layout image. Another fully connected layer may be used instead.

Fig. 12 is an explanatory diagram showing an example of calculating the overall score. For convenience, the N layout images are denoted G1, G2, G3, ..., GN. The design score calculated as in the first to third embodiments is denoted STi, where i = 1 to M. For example, the design score of layout image Gi is STi. The diversity score between layout image Gi and layout image Gj is denoted SDi,j (or SD(i,j)), where j = 1 to M and i < j.

In the example of Fig. 12, the design score of layout image G1 is ST1, and its diversity scores with respect to layout images G2, G3, ..., GN are SD(1,2), SD(1,3), ..., SD(1,N). Likewise, the design score of layout image G2 is ST2, and its diversity scores with respect to layout images G1, G3, ..., GN are SD(1,2), SD(2,3), ..., SD(2,N). Because SD(2,1) is equal to SD(1,2), the notation SD(1,2) is used for both. The same applies to the other layout images.

As shown in Fig. 11, among the N layout images, the vector of the fully connected layer 141b obtained when layout image Gi is input to the scoring model 141 is denoted di, and the vector of the fully connected layer 141b obtained when layout image Gj is input to the scoring model 141 is denoted dj, where i ≠ j. The diversity score SDi,j between layout images Gi and Gj can be defined as 1 minus the cosine similarity of vectors di and dj. When the layouts of Gi and Gj are similar, the cosine similarity of di and dj approaches 1, so the diversity score SDi,j approaches 0. Conversely, when the layouts of Gi and Gj are dissimilar, the cosine similarity of di and dj approaches 0, so the diversity score SDi,j approaches 1.
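The diversity score defined above (1 minus the cosine similarity of two feature vectors) can be sketched in a few lines. The function names are hypothetical, and the vectors are passed in as plain lists of numbers; extracting them from the fully connected layer of an actual model is outside the scope of this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def diversity_score(d_i, d_j):
    """SD(i, j) = 1 - cos(d_i, d_j): near 0 for similar layouts."""
    return 1.0 - cosine_similarity(d_i, d_j)
```

Because cosine similarity is symmetric, `diversity_score(d_i, d_j)` equals `diversity_score(d_j, d_i)`, which matches the convention of writing both SD(2,1) and SD(1,2) as SD(1,2).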

The overall score S can be calculated by the formula S = Σ(STi + λ·SDi,j), where Σ denotes the sum over i and j from 1 to N, and λ is a weighting parameter (weighting coefficient) that can be set to a desired value.
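Written in LaTeX notation, the formula reads as in the first line below. The second line is one possible reading, assumed here rather than stated in the text, in which each design score is counted once and each pair of layout images is counted once (i < j) over a selection set of M layout images. The symbol \mathcal{M} for that set is introduced only for illustration.

```latex
S = \sum_{i=1}^{N} \sum_{j=1}^{N} \left( ST_i + \lambda \, SD_{i,j} \right)
\qquad \text{(literal form)}

S(\mathcal{M}) = \sum_{i \in \mathcal{M}} ST_i
  + \lambda \sum_{\substack{i,\, j \in \mathcal{M} \\ i < j}} SD_{i,j}
\qquad \text{(assumed reading)}
```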

Fig. 13 is an explanatory diagram outlining the fourth embodiment. The layout evaluation unit 113 functions as a calculation unit and calculates the similarity between any two layout images among the multiple layout images. The layout evaluation unit 113 obtains an overall score using the score that evaluates design and the score that evaluates diversity, and finds the selection set (subset) for which the overall score is largest. That is, the overall score over all M layout images that make up the selection set is maximized. More specifically, the layout evaluation unit 113 can evaluate the layout images by calculating an overall score that takes the diversity score into account in addition to the design score described in the first to third embodiments.
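Finding the selection set with the largest overall score can be sketched as an exhaustive search over all subsets of size M. This is only an illustration under assumptions: the text does not say how the search is performed, the overall score of a subset is assumed to be the sum of its design scores plus λ times the sum of its pairwise diversity scores, and the function names are hypothetical. Exhaustive search is practical only for small N.

```python
from itertools import combinations


def total_score(subset, design, diversity, lam):
    """Sum of design scores plus lam times the pairwise diversity scores.

    design[i] is ST_i; diversity[i][j] is SD(i, j) for i < j.
    """
    indices = sorted(subset)
    score = sum(design[i] for i in indices)
    score += lam * sum(diversity[i][j] for i, j in combinations(indices, 2))
    return score


def best_selection(design, diversity, m, lam=1.0):
    """Return the size-m tuple of indices with the largest total score."""
    n = len(design)
    return max(
        combinations(range(n), m),
        key=lambda subset: total_score(subset, design, diversity, lam),
    )
```

With λ = 0 the search simply returns the M layouts with the highest design scores. Increasing λ favors subsets whose members differ from one another, even if their individual design scores are somewhat lower.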

The design score STi indicates the appropriateness of a layout image and is, for example, an index of how much the layout resembles a finished work. The diversity score SDi,j, on the other hand, is an index of dissimilarity, that is, of a high degree of variation. By using the overall score S, it is possible to output layout images (the top M) that are highly appropriate and work-like and are also rich in diversity, as illustrated schematically in Fig. 13.

Fig. 14 is a flowchart showing an example of the procedure of the layout generation processing of the fourth embodiment. The layout generation processing is described with reference to Fig. 14. The control unit 11 of the server 1 acquires from the terminal 2 the multiple pieces of content to be placed on the document pages and the specified number of pages (step S31). According to the specified number of pages, the control unit 11 generates layout information that allocates the pieces of content to the pages and arranges them (step S32). Specifically, the control unit 11 randomly determines the coordinates at which each piece of content is placed within a predetermined area and generates multiple patterns of layout information. The control unit 11 generates a layout image of each page, in which the content is arranged within the predetermined area, according to the generated layout information (step S33). Specifically, the control unit 11 generates multiple patterns of layout images according to the multiple patterns of layout information generated in step S32.
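Step S32 (random placement coordinates, multiple patterns) can be sketched as follows. Several details are assumptions made for illustration and are not specified in the text: each piece of content is represented as a `(name, width, height)` tuple, content is allocated to pages in round-robin order, the page is 1000 x 1400 units, and overlap between pieces of content is not checked.

```python
import random


def generate_layout_patterns(contents, num_pages, num_patterns,
                             page_size=(1000, 1400), seed=None):
    """Generate layout information with random placement coordinates.

    contents: list of (name, width, height) tuples; each item is assumed
    to fit within page_size. Returns a list of patterns, where each
    pattern is a list of pages and each page is a list of placements.
    """
    rng = random.Random(seed)
    page_w, page_h = page_size
    patterns = []
    for _ in range(num_patterns):
        pages = [[] for _ in range(num_pages)]
        for k, (name, w, h) in enumerate(contents):
            page = k % num_pages  # assumed round-robin page allocation
            x = rng.randint(0, page_w - w)  # keep the item inside the page
            y = rng.randint(0, page_h - h)
            pages[page].append({"name": name, "x": x, "y": y, "w": w, "h": h})
        patterns.append(pages)
    return patterns
```

Each returned pattern would then be rendered into per-page layout images (step S33) and scored. Rendering and scoring are not covered by this sketch.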

The control unit 11 adds up the scores of the layout images of the individual pages and calculates a design score for each of the multiple patterns (step S35). The control unit 11 calculates a diversity score for the layout image of each page (step S211). Specifically, the control unit 11 can calculate the diversity score of each page using a scoring model 141 that differs from page to page. The control unit 11 calculates a diversity score for each of the multiple patterns of layout images generated in step S33.

The control unit 11 adds up the diversity scores of the layout images of the individual pages and calculates a diversity score for each of the multiple patterns (step S212). The control unit 11 calculates an overall score for each of the multiple patterns based on the design score and the diversity score (step S213). The control unit 11 ranks the layout images of the patterns according to the calculated overall scores and outputs to the terminal 2 the set (combination) of M layout images for which the overall score is largest (step S36). The control unit 11 then ends the series of processes.

As described above, according to the fourth embodiment, outputting layout image information according to the scores calculated using the scoring model 141 makes it possible to present the user with layouts that are both appropriate (closely resembling finished works) and rich in variety.

The embodiments disclosed herein are to be considered in all respects as illustrative and not restrictive. The scope of the present invention is indicated by the claims rather than by the description above, and is intended to include all modifications within the meaning and scope equivalent to the claims.

1 Server (information processing device)
11 Control unit
12 Main memory unit
13 Communication unit
14 Auxiliary memory unit
P Program
141 Scoring model
141a, 141b Fully connected layers
2 Terminal

Claims

1. An information processing apparatus comprising:
an acquisition unit that acquires a plurality of contents;
a generation unit that generates a layout image in which the plurality of contents are arranged in a predetermined area;
an evaluation unit that acquires an evaluation of the generated layout image using a classifier that has been trained on a plurality of layout images; and
an output unit that outputs an evaluation result,
wherein the acquisition unit acquires the plurality of contents and the number of pages of the layout image,
the generation unit generates a plurality of layout images in which the plurality of contents are allocated to and arranged on each page according to the number of pages, and
the evaluation unit uses a different classifier depending on the page to acquire an evaluation of each of the plurality of layout images.

2. The information processing apparatus according to claim 1, wherein
the generation unit generates a plurality of layout images in which the plurality of contents are arranged differently,
the evaluation unit acquires a ranking of the plurality of layout images using the classifier, and
the output unit outputs the ranking.

3. The information processing apparatus according to claim 1, wherein
the acquisition unit acquires condition information that defines a placement condition for the contents, and
the evaluation unit changes the evaluation acquired from the classifier based on the condition information.

4. The information processing apparatus according to claim 1, wherein
the acquisition unit acquires condition information that defines a placement condition for the contents, and
the generation unit generates the layout image based on the condition information.

5. The information processing apparatus according to claim 1, further comprising
a calculation unit that calculates a similarity between any two layout images among the plurality of layout images,
wherein the evaluation unit evaluates the plurality of layout images using the evaluation value from the classifier and the similarity calculated by the calculation unit.

6. An information processing method in which a computer executes a process comprising:
acquiring a plurality of contents and the number of pages of a layout image;
generating a plurality of layout images in which the plurality of contents are allocated to each page according to the number of pages and arranged in a predetermined area;
acquiring an evaluation of each of the generated plurality of layout images using trained classifiers that differ according to the page; and
outputting an evaluation result.

7. A program that causes a computer to execute a process comprising:
acquiring a plurality of contents and the number of pages of a layout image;
generating a plurality of layout images in which the plurality of contents are allocated to each page according to the number of pages and arranged in a predetermined area;
acquiring an evaluation of each of the generated plurality of layout images using trained classifiers that differ according to the page; and
outputting an evaluation result.