JP7696255B2

JP7696255B2 - Learning support device, learning support method, and learning support program

Info

Publication number: JP7696255B2
Application number: JP2021138985A
Authority: JP
Inventors: 真澄石川; 竜之介矢野
Original assignee: Denso Ten Ltd
Current assignee: Denso Ten Ltd
Priority date: 2021-08-27
Filing date: 2021-08-27
Publication date: 2025-06-20
Anticipated expiration: 2041-08-27
Also published as: JP2023032702A

Description

開示の実施形態は、学習支援装置、学習支援方法および学習支援プログラムに関する。 The disclosed embodiments relate to a learning support device, a learning support method, and a learning support program.

従来、深層学習等のアルゴリズムを用いてＡＩ（Artificial Intelligence）モデルの学習を行うに際し、正例となる教師データを作成する作業であるアノテーションが知られている（たとえば、特許文献１参照）。 Conventionally, annotation is known as a process for creating training data that serves as positive examples when training an AI (Artificial Intelligence) model using algorithms such as deep learning (see, for example, Patent Document 1).

かかるアノテーションのうち、画像認識用のＡＩモデルのためのアノテーションでは、画像中の各物体を囲むバウンディングボックス（以下、「ＢＢ」と記載する）と呼ばれる矩形が、人手を介したいわゆるＶＤＴ（Visual Display Terminals）作業により作成される。また、作成された各ＢＢに対し、各物体の名称や属性等を示すメタデータが付与される。 Among these annotations, in annotations for AI models for image recognition, rectangles called bounding boxes (hereafter referred to as "BBs") that surround each object in the image are created manually using so-called VDT (Visual Display Terminals) work. In addition, metadata indicating the name, attributes, etc. of each object is assigned to each created BB.

特開２０１０－００９３３７号公報 Patent Document 1: JP 2010-009337 A

しかしながら、従来技術は、アノテーションの品質を確保するうえで、さらなる改善の余地がある。 However, conventional techniques leave room for further improvement in ensuring annotation quality.

たとえば、ＡＩモデルの精度を高めるには、大量の画像を収集し、これらの各画像に対しアノテーションを行う必要があるが、上述したＶＤＴ作業においては膨大な工数が掛かるうえに、作業者の心身の疲労も大きくなるという問題があった。 For example, to improve the accuracy of an AI model, it is necessary to collect a large number of images and annotate each of these images. However, the VDT work described above requires a huge amount of man-hours and causes great physical and mental fatigue for the workers.

また、特に上述のＢＢは、教師データとしての品質を高めるために、物体以外の余分な領域を極力含めないように作成されることが好ましいが、人手によっては容易ではないうえ、作業者の熟練度によって品質が左右されてしまうおそれがある。 In particular, it is preferable to create the BB described above so as to include as few extraneous areas other than the object as possible in order to improve the quality of the training data. However, this is not easy to do manually, and there is a risk that the quality will be affected by the skill level of the worker.

実施形態の一態様は、上記に鑑みてなされたものであって、画像認識用のＡＩモデルのためのアノテーションの品質を確保することができる学習支援装置、学習支援方法および学習支援プログラムを提供することを目的とする。 One aspect of the embodiment has been made in consideration of the above, and aims to provide a learning support device, a learning support method, and a learning support program that can ensure the quality of annotations for an AI model for image recognition.

実施形態の一態様に係る学習支援装置は、コントローラを備える。前記コントローラは、教師データ作成におけるアノテーションの対象画像における物体のエッジを抽出し、抽出されたエッジのうち、閉曲線として抽出されたエッジである閉エッジに接する閉エッジ矩形を作成し、ユーザにより前記対象画像に対し入力された入力矩形と前記閉エッジ矩形との類似度を算出し、前記類似度が最も大きい前記閉エッジ矩形を選択し、選択された前記閉エッジ矩形を前記アノテーションのためのバウンディングボックスとして採用する。 A learning support device according to one aspect of an embodiment includes a controller. The controller extracts edges of an object in a target image for annotation in creating training data, creates a closed edge rectangle tangent to a closed edge, which is an edge extracted as a closed curve from among the extracted edges, calculates the similarity between an input rectangle input by a user for the target image and the closed edge rectangle, selects the closed edge rectangle with the greatest similarity, and adopts the selected closed edge rectangle as the bounding box for the annotation.

実施形態の一態様によれば、画像認識用のＡＩモデルのためのアノテーションの品質を確保することができる。 According to one aspect of the embodiment, it is possible to ensure the quality of annotations for an AI model for image recognition.

図１は、実施形態に係る学習支援装置の概要構成を示す図である。 FIG. 1 is a diagram showing a schematic configuration of a learning support device according to an embodiment.
図２は、実施形態に係る学習支援方法の概要説明図（その１）である。 FIG. 2 is a schematic explanatory diagram (part 1) of the learning support method according to the embodiment.
図３は、実施形態に係る学習支援方法の概要説明図（その２）である。 FIG. 3 is a schematic explanatory diagram (part 2) of the learning support method according to the embodiment.
図４は、実施形態に係る学習支援方法の概要説明図（その３）である。 FIG. 4 is a schematic explanatory diagram (part 3) of the learning support method according to the embodiment.
図５は、実施形態に係る学習支援方法の概要説明図（その４）である。 FIG. 5 is a schematic explanatory diagram (part 4) of the learning support method according to the embodiment.
図６は、実施形態に係る学習支援装置の構成例を示すブロック図である。 FIG. 6 is a block diagram illustrating an example of the configuration of a learning support device according to an embodiment.
図７は、閉エッジ矩形作成処理の説明図である。 FIG. 7 is an explanatory diagram of the closed edge rectangle creation process.
図８は、最類似矩形選択処理の具体例を示す図（その１）である。 FIG. 8 is a diagram (part 1) showing a specific example of the most similar rectangle selection process.
図９は、最類似矩形選択処理の具体例を示す図（その２）である。 FIG. 9 is a diagram (part 2) showing a specific example of the most similar rectangle selection process.
図１０は、最類似矩形選択処理の具体例を示す図（その３）である。 FIG. 10 is a diagram (part 3) showing a specific example of the most similar rectangle selection process.
図１１は、最類似矩形選択処理の具体例を示す図（その４）である。 FIG. 11 is a diagram (part 4) showing a specific example of the most similar rectangle selection process.
図１２は、最類似矩形選択処理の具体例を示す図（その５）である。 FIG. 12 is a diagram (part 5) showing a specific example of the most similar rectangle selection process.
図１３は、画面ＵＩの具体例を示す図（その１）である。 FIG. 13 is a diagram (part 1) showing a specific example of the screen UI.
図１４は、画面ＵＩの具体例を示す図（その２）である。 FIG. 14 is a diagram (part 2) showing a specific example of the screen UI.
図１５は、画面ＵＩの具体例を示す図（その３）である。 FIG. 15 is a diagram (part 3) showing a specific example of the screen UI.
図１６は、画面ＵＩの具体例を示す図（その４）である。 FIG. 16 is a diagram (part 4) showing a specific example of the screen UI.
図１７は、画面ＵＩの具体例を示す図（その５）である。 FIG. 17 is a diagram (part 5) showing a specific example of the screen UI.
図１８は、画面ＵＩの具体例を示す図（その６）である。 FIG. 18 is a diagram (part 6) showing a specific example of the screen UI.
図１９は、画面ＵＩの具体例を示す図（その７）である。 FIG. 19 is a diagram (part 7) showing a specific example of the screen UI.
図２０は、画面ＵＩの具体例を示す図（その８）である。 FIG. 20 is a diagram (part 8) showing a specific example of the screen UI.
図２１は、画面ＵＩの具体例を示す図（その９）である。 FIG. 21 is a diagram (part 9) showing a specific example of the screen UI.
図２２は、画面ＵＩの具体例を示す図（その１０）である。 FIG. 22 is a diagram (part 10) showing a specific example of the screen UI.
図２３は、画面ＵＩの具体例を示す図（その１１）である。 FIG. 23 is a diagram (part 11) showing a specific example of the screen UI.
図２４は、画面ＵＩの具体例を示す図（その１２）である。 FIG. 24 is a diagram (part 12) showing a specific example of the screen UI.
図２５は、画面ＵＩの具体例を示す図（その１３）である。 FIG. 25 is a diagram (part 13) showing a specific example of the screen UI.
図２６は、画面ＵＩの具体例を示す図（その１４）である。 FIG. 26 is a diagram (part 14) showing a specific example of the screen UI.
図２７は、実施形態に係る学習支援装置が実行する処理シーケンスである。 FIG. 27 shows a processing sequence executed by the learning support device according to the embodiment.

以下、添付図面を参照して、本願の開示する学習支援装置、学習支援方法および学習支援プログラムの実施形態を詳細に説明する。なお、以下に示す実施形態によりこの発明が限定されるものではない。 Below, embodiments of the learning support device, learning support method, and learning support program disclosed in the present application will be described in detail with reference to the attached drawings. Note that the present invention is not limited to the embodiments described below.

まず、実施形態に係る学習支援方法の概要について、図１～図５を用いて説明する。図１は、実施形態に係る学習支援装置１０の概略構成を示す図である。また、図２～図５は、実施形態に係る学習支援方法の概要説明図（その１）～（その４）である。 First, an overview of the learning support method according to the embodiment will be described with reference to Figs. 1 to 5. Fig. 1 is a diagram showing a schematic configuration of a learning support device 10 according to the embodiment. Figs. 2 to 5 are diagrams (part 1) to (part 4) showing an overview of the learning support method according to the embodiment.

学習支援装置１０は、画像認識用のＡＩモデルのためのアノテーションを行うに際して、アノテーションの作業者であるユーザＵによって利用されるコンピュータである。学習支援装置１０は、たとえば、デスクトップ型やノート型のＰＣ（Personal Computer）や、タブレット端末や、スマートフォンや、サーバや、ワークステーション等である。 The learning support device 10 is a computer used by a user U, who is an annotator, when performing annotation for an AI model for image recognition. The learning support device 10 is, for example, a desktop or notebook PC (Personal Computer), a tablet terminal, a smartphone, a server, a workstation, etc.

図１に示すように、実施形態に係る学習支援装置１０は、ＨＭＩ（Human Machine Interface）部３を有する。また、学習支援装置１０は、対象画像ＤＢ１１ａと、矩形情報ＤＢ１１ｅとを有する。 As shown in FIG. 1, the learning support device 10 according to the embodiment has an HMI (Human Machine Interface) unit 3. The learning support device 10 also has a target image DB 11a and a rectangle information DB 11e.

ＨＭＩ部３は、ユーザＵに対するインターフェイス部品を提供する構成要素である。ＨＭＩ部３は、入力部３ａと、出力部３ｂとを含む。 The HMI unit 3 is a component that provides interface components for the user U. The HMI unit 3 includes an input unit 3a and an output unit 3b.

入力部３ａは、ユーザＵからの入力操作を受け付ける入力デバイスであって、たとえばキーボードや、マウスや、ペンタブレットや、タッチパネル等によって実現される。なお、入力部３ａは、ソフトウェア部品によって実現されてもよい。 The input unit 3a is an input device that accepts input operations from the user U, and is realized, for example, by a keyboard, a mouse, a pen tablet, a touch panel, etc. Note that the input unit 3a may also be realized by a software component.

出力部３ｂは、アノテーションの対象画像や、かかる対象画像上に入力されるＢＢ等を表示出力する出力デバイスであって、ディスプレイ等によって実現される。なお、タッチパネルディスプレイにより、入力部３ａと一体に構成されてもよい。 The output unit 3b is an output device that displays and outputs the target image of the annotation and the BB input on the target image, and is realized by a display or the like. Note that the output unit 3b may be configured integrally with the input unit 3a by a touch panel display.

対象画像ＤＢ１１ａは、アノテーションの作業対象となる各画像が格納されたデータベースである。矩形情報ＤＢ１１ｅは、アノテーションにおいて各画像上に作成されたＢＢの位置やサイズ等に関する情報である矩形情報が格納されるデータベースである。 The target image DB 11a is a database that stores each image that is the subject of annotation work. The rectangle information DB 11e is a database that stores rectangle information, which is information about the position, size, etc. of the BB created on each image during annotation.

ユーザＵは、ＨＭＩ部３を介したＶＤＴ作業により、対象画像ＤＢ１１ａに格納された各画像に対するアノテーションを行い、その結果として作成されたＢＢに関する矩形情報が矩形情報ＤＢ１１ｅへ格納される。 The user U performs VDT work via the HMI unit 3 to annotate each image stored in the target image DB 11a, and the rectangle information related to the BB created as a result is stored in the rectangle information DB 11e.

ところで、このようなアノテーションに関する既存技術は、アノテーションの品質を確保するうえで、さらなる改善の余地がある。 However, existing annotation technologies like this leave room for further improvement in terms of ensuring annotation quality.

たとえば、ＡＩモデルの精度を高めるには、大量の画像を収集し、これらの各画像に対しアノテーションを行う必要があるが、ＶＤＴ作業においては膨大な工数が掛かるうえに、ユーザＵの心身の疲労も大きくなるという問題があった。 For example, to improve the accuracy of an AI model, it is necessary to collect a large number of images and annotate each of these images, but this requires a huge amount of man-hours in VDT work and causes great physical and mental fatigue for the user U.

また、特にＢＢは、教師データとしての品質を高めるために、物体以外の余分な領域を極力含めないように作成されることが好ましいが、人手によっては容易ではないうえ、ユーザＵの熟練度によって品質が左右されてしまうおそれがある。 In particular, it is preferable to create the BB so that it includes as little extraneous area as possible other than the object in order to improve the quality of the training data, but this is not easy to do manually, and there is a risk that the quality will be affected by the level of skill of the user U.

そこで、実施形態に係る学習支援方法では、アノテーションの対象画像における物体のエッジを抽出し、抽出されたエッジのうち、閉曲線として抽出されたエッジである閉エッジに接する閉エッジ矩形を作成し、ユーザＵにより対象画像に対し入力された入力矩形と閉エッジ矩形との類似度を算出し、類似度が最も大きい閉エッジ矩形を選択することとした。 Therefore, in the learning support method according to the embodiment, the edges of the object in the target image for annotation are extracted, and from among the extracted edges, a closed edge rectangle is created that is tangent to a closed edge, which is an edge extracted as a closed curve. The similarity between the input rectangle input to the target image by user U and the closed edge rectangle is calculated, and the closed edge rectangle with the greatest similarity is selected.

具体的に、図２に示すように、実施形態に係る学習支援方法ではまず、学習支援装置１０が、アノテーションの対象画像に対し、ユーザＵの入力操作による入力矩形ＩＲを作成する（ステップＳ１）。同図に示すように、ユーザＵは、たとえば対象画像上における始点Ｐ１から終点Ｐ２へ向けたマウス等による領域選択操作により、入力矩形ＩＲを指定する。 Specifically, as shown in FIG. 2, in the learning support method according to the embodiment, the learning support device 10 first creates an input rectangle IR on the target image for annotation in response to an input operation by the user U (step S1). As shown in the figure, the user U specifies the input rectangle IR, for example, by performing an area selection operation with a mouse or the like from a start point P1 to an end point P2 on the target image.

一方で、学習支援装置１０は、図３に示すように、同じ対象画像について、予めエッジを抽出する（ステップＳ２）。かかるエッジの抽出については、キャニー法（Canny edge detector）等の公知のアルゴリズムを用いてもよいし、オリジナルのアルゴリズムを用いてもよい。 On the other hand, the learning support device 10 extracts edges in advance for the same target image, as shown in FIG. 3 (step S2). For such edge extraction, a known algorithm such as the Canny edge detector may be used, or an original algorithm may be used.

そして、学習支援装置１０は、図４に示すように、抽出されたエッジのうち、閉曲線として抽出された、すなわち両端の一致する閉じたエッジ（以下、「閉エッジ」という）に対し、かかる閉エッジに接する矩形を自動的に作成する（ステップＳ３）。同図において破線で示すのが、かかる閉ヘッジに接する矩形（以下、「閉エッジ矩形」という）である。 Then, as shown in FIG. 4, for each extracted edge that was extracted as a closed curve, i.e., a closed edge whose two ends coincide (hereafter referred to as a "closed edge"), the learning support device 10 automatically creates a rectangle tangent to that closed edge (step S3). The rectangles tangent to the closed edges (hereafter referred to as "closed edge rectangles") are shown by dashed lines in the figure.
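As a non-limiting illustration of the closed-edge condition used in step S3 (the two ends of the edge coincide), the check can be sketched as follows. The sketch assumes that each extracted edge is held as an ordered list of (x, y) pixel coordinates. This representation, the function name `is_closed_edge`, and the `tolerance` parameter are hypothetical and are not specified in the embodiment.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def is_closed_edge(edge: List[Point], tolerance: int = 0) -> bool:
    """Return True if the traced edge ends where it starts.

    `edge` is an ordered list of (x, y) pixel coordinates (an assumed
    representation). `tolerance` lets the two ends differ by a few
    pixels; 0 requires an exact match.
    """
    if len(edge) < 3:
        # Fewer than three points cannot enclose any area.
        return False
    (x0, y0), (x1, y1) = edge[0], edge[-1]
    return abs(x0 - x1) <= tolerance and abs(y0 - y1) <= tolerance


square = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
open_line = [(0, 0), (4, 0), (4, 4)]
print(is_closed_edge(square))     # True
print(is_closed_edge(open_line))  # False
```

Only edges for which this check succeeds would be passed on to the rectangle creation of step S3.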

そして、学習支援装置１０は、同図に示すように、閉エッジ矩形のうち、入力矩形ＩＲと最も類似する閉エッジ矩形を自動調整後の矩形ＡＲとして選択する（ステップＳ４）。最も類似する閉エッジ矩形の選択方法については、図８以降を用いた説明で後述する。 Then, as shown in the figure, the learning support device 10 selects, from among the closed edge rectangles, the closed edge rectangle that is most similar to the input rectangle IR as the automatically adjusted rectangle AR (step S4). The method for selecting the most similar closed edge rectangle will be described later using Figure 8 and subsequent figures.

そして、学習支援装置１０は、図５に示すように、出力部３ｂに対し、自動調整後の矩形ＡＲとともに、かかる矩形ＡＲを採用するか否かをユーザＵに問い合わせるダイアログを表示する（ステップＳ５）。 Then, as shown in FIG. 5, the learning support device 10 displays on the output unit 3b the automatically adjusted rectangle AR along with a dialog asking the user U whether or not to adopt the rectangle AR (step S5).

同図に示すように、ここでユーザＵが「Ｙｅｓ」を選択した場合、学習支援装置１０は、ＢＢとして調整後の矩形ＡＲを採用する。一方、ユーザＵが「Ｎｏ」を選択した場合、学習支援装置１０は、ＢＢとして調整前の矩形、すなわち入力矩形ＩＲを採用する。 As shown in the figure, if the user U selects "Yes" here, the learning support device 10 uses the adjusted rectangle AR as BB. On the other hand, if the user U selects "No", the learning support device 10 uses the rectangle before adjustment, i.e., the input rectangle IR, as BB.

このように、実施形態に係る学習支援方法では、アノテーションの対象画像における物体のエッジを抽出し、抽出されたエッジのうち、閉曲線として抽出されたエッジである閉エッジに接する閉エッジ矩形を作成し、ユーザＵにより対象画像に対し入力された入力矩形ＩＲと閉エッジ矩形との類似度を算出し、類似度が最も大きい閉エッジ矩形を選択することとした。 In this way, the learning support method according to the embodiment extracts the edges of an object in an image to be annotated, creates a closed edge rectangle that is tangent to a closed edge, which is an edge extracted as a closed curve from among the extracted edges, calculates the similarity between the input rectangle IR input to the target image by user U and the closed edge rectangle, and selects the closed edge rectangle with the greatest similarity.

したがって、実施形態に係る学習支援方法によれば、既存技術では、大量に収集された各画像に対し、ユーザＵが個人の感覚・判断で逐一行う必要のあったＢＢの作成に掛かる工数を大幅に削減することが可能となる。また、これにより、ＶＤＴ作業におけるユーザＵの心身の疲労も軽減することができる。 Therefore, the learning support method according to the embodiment can significantly reduce the man-hours required to create BBs, which in the existing technology the user U had to create one by one, based on personal sense and judgment, for each of a large number of collected images. This also makes it possible to reduce the mental and physical fatigue of the user U during VDT work.

また、実施形態に係る学習支援方法によれば、エッジ抽出を含む画像解析技術を用いて自動的に作成する、画像中の物体の境界に接する閉エッジ矩形をＢＢとして採用可能であるので、ユーザＵの熟練度によって教師データの品質が左右されることを低減することができる。 In addition, according to the learning support method of the embodiment, a closed edge rectangle that is automatically created using image analysis technology including edge extraction and that is in contact with the boundary of an object in an image can be used as the BB, thereby reducing the influence of the user U's level of proficiency on the quality of the training data.

すなわち、実施形態に係る学習支援方法によれば、画像認識用のＡＩモデルのためのアノテーションの品質を確保することができる。以下、実施形態に係る学習支援方法を適用した学習支援装置１０の構成例について、より具体的に説明する。 In other words, according to the learning support method of the embodiment, it is possible to ensure the quality of annotations for an AI model for image recognition. Below, a more specific description is given of an example configuration of a learning support device 10 to which the learning support method of the embodiment is applied.

図６は、実施形態に係る学習支援装置１０の構成例を示すブロック図である。なお、図６では、実施形態の特徴を説明するために必要な構成要素のみを表しており、一般的な構成要素についての記載を省略している。 Figure 6 is a block diagram showing an example of the configuration of a learning support device 10 according to an embodiment. Note that Figure 6 shows only the components necessary to explain the features of the embodiment, and does not include general components.

換言すれば、図６に図示される各構成要素は機能概念的なものであり、必ずしも物理的に図示の如く構成されていることを要しない。例えば、各ブロックの分散・統合の具体的形態は図示のものに限られず、その全部または一部を、各種の負荷や使用状況などに応じて、任意の単位で機能的または物理的に分散・統合して構成することが可能である。 In other words, each component shown in FIG. 6 is a functional concept, and does not necessarily have to be physically configured as shown. For example, the specific form of distribution and integration of each block is not limited to that shown, and all or part of it can be functionally or physically distributed and integrated in any unit depending on various loads, usage conditions, etc.

また、図６を用いた説明では、既に説明済みの構成要素については、説明を簡略するか、省略する場合がある。 In addition, in the explanation using Figure 6, the explanation of components that have already been explained may be simplified or omitted.

図６に示すように、実施形態に係る学習支援装置１０は、記憶部１１と、制御部１２とを備える。また、学習支援装置１０は、有線または無線を介し、あるいは直接に、ＨＭＩ部３が接続される。 As shown in FIG. 6, the learning support device 10 according to the embodiment includes a storage unit 11 and a control unit 12. The HMI unit 3 is connected to the learning support device 10 via a wired or wireless connection, or directly.

ＨＭＩ部３については説明済みのため、ここでの説明は省略する。記憶部１１は、たとえば、ＲＡＭ（Random Access Memory）、フラッシュメモリ（Flash Memory）等の半導体メモリ素子、または、ハードディスク、光ディスク等の記憶装置によって実現される。記憶部１１は、図６の例では、対象画像ＤＢ１１ａと、抽出アルゴリズム情報１１ｂと、エッジ情報１１ｃと、類似度算出情報１１ｄと、矩形情報ＤＢ１１ｅとを記憶する。 Since the HMI unit 3 has already been described, a description thereof will be omitted here. The storage unit 11 is realized, for example, by a semiconductor memory element such as a random access memory (RAM) or a flash memory, or a storage device such as a hard disk or an optical disk. In the example of FIG. 6, the storage unit 11 stores a target image DB 11a, extraction algorithm information 11b, edge information 11c, similarity calculation information 11d, and a rectangle information DB 11e.

対象画像ＤＢ１１ａおよび矩形情報ＤＢ１１ｅについては説明済みのため、ここでの説明は省略する。抽出アルゴリズム情報１１ｂは、後述するエッジ抽出部１２ｂａが実行するエッジ抽出処理において用いられるアルゴリズムのライブラリ情報である。 The target image DB 11a and rectangle information DB 11e have already been explained, so their explanation will be omitted here. The extraction algorithm information 11b is library information of algorithms used in the edge extraction process executed by the edge extraction unit 12ba, which will be described later.

エッジ情報１１ｃは、エッジ抽出部１２ｂａによって抽出されたエッジに関する情報が格納される。類似度算出情報１１ｄは、入力矩形ＩＲと閉エッジ矩形との類似度の算出基準となるアルゴリズムや各種のパラメータ等が格納される。類似度算出情報１１ｄは、たとえばユーザＵが事前に選定し、静的な情報として予め設定可能である。 The edge information 11c stores information about the edges extracted by the edge extraction unit 12ba. The similarity calculation information 11d stores algorithms and various parameters that serve as the basis for calculating the similarity between the input rectangle IR and the closed edge rectangle. The similarity calculation information 11d can be selected in advance by the user U, for example, and set in advance as static information.

制御部１２は、コントローラ（controller）であり、たとえば、ＣＰＵ（Central Processing Unit）やＭＰＵ（Micro Processing Unit）等によって、記憶部１１に記憶されている図示略の各種プログラムがＲＡＭを作業領域として実行されることにより実現される。また、制御部１２は、たとえば、ＡＳＩＣ（Application Specific Integrated Circuit）やＦＰＧＡ（Field Programmable Gate Array）等の集積回路により実現することができる。 The control unit 12 is a controller, and is realized, for example, by a CPU (Central Processing Unit) or MPU (Micro Processing Unit) executing various programs (not shown) stored in the storage unit 11 using a RAM as a working area. The control unit 12 can also be realized, for example, by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

制御部１２は、画像描写部１２ａと、画像解析部１２ｂと、矩形作成部１２ｃとを有し、以下に説明する情報処理の機能や作用を実現または実行する。 The control unit 12 has an image depiction unit 12a, an image analysis unit 12b, and a rectangle creation unit 12c, and realizes or executes the information processing functions and actions described below.

画像描写部１２ａは、対象画像ＤＢ１１ａに格納されたアノテーションの対象画像を出力部３ｂに対し出力する。画像描写部１２ａは、矩形描画部１２ａａを含む。矩形描画部１２ａａは、入力部３ａを介して入力された入力矩形ＩＲを対象画像上へ描画する。 The image depiction unit 12a outputs the target image of the annotation stored in the target image DB 11a to the output unit 3b. The image depiction unit 12a includes a rectangle drawing unit 12aa. The rectangle drawing unit 12aa draws the input rectangle IR input via the input unit 3a on the target image.

画像解析部１２ｂは、対象画像に対する画像解析処理を実行する。画像解析部１２ｂは、エッジ抽出部１２ｂａを含む。エッジ抽出部１２ｂａは、抽出アルゴリズム情報１１ｂに基づいて、対象画像に対するエッジ抽出処理を実行する。 The image analysis unit 12b performs image analysis processing on the target image. The image analysis unit 12b includes an edge extraction unit 12ba. The edge extraction unit 12ba performs edge extraction processing on the target image based on the extraction algorithm information 11b.

なお、エッジ抽出部１２ｂａは、上述したキャニー法の他、エッジ抽出の公知のアルゴリズムとして、ＳｏｂｅｌやＬａｐｌａｃｉａｎ等を用いることができる。 In addition to the above-mentioned Canny algorithm, the edge extraction unit 12ba can use known algorithms for edge extraction, such as Sobel and Laplacian.
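As a non-limiting illustration of one of the known algorithms named above, a Sobel-based edge extraction can be sketched as follows. The sketch assumes a grayscale image held as a list of rows of integer pixel values and uses |gx| + |gy| as an approximation of the gradient magnitude. The function name `sobel_edges` and the `threshold` parameter are hypothetical; the embodiment does not specify how the edge extraction unit 12ba is implemented.

```python
from typing import List

Image = List[List[int]]

# Standard 3x3 Sobel kernels for the horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edges(image: Image, threshold: int) -> Image:
    """Return a binary edge map (1 = edge pixel) for a grayscale image.

    Border pixels stay 0 because the 3x3 kernel does not fit there.
    """
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for ky in range(3):
                for kx in range(3):
                    pixel = image[y + ky - 1][x + kx - 1]
                    gx += SOBEL_X[ky][kx] * pixel
                    gy += SOBEL_Y[ky][kx] * pixel
            # |gx| + |gy| is a cheap approximation of the gradient magnitude.
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 1
    return edges


# A 4x6 test image: dark on the left, bright on the right.
image = [[0, 0, 0, 255, 255, 255] for _ in range(4)]
edge_map = sobel_edges(image, threshold=500)
for row in edge_map:
    print(row)
```

In this example the interior pixels on either side of the brightness step are marked as edge pixels. A practical implementation would more likely call an image processing library than loop over pixels in this way.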

エッジ抽出部１２ｂａは、これら公知のアルゴリズムおよびオリジナルのアルゴリズムの中から、ユーザＵによって任意に選択されたアルゴリズムを用いてエッジ抽出処理を実行するようにしてもよい。 The edge extraction unit 12ba may perform edge extraction processing using an algorithm arbitrarily selected by the user U from among these publicly known algorithms and original algorithms.

このような画像解析処理により画像中の物体の境界を抽出することで、ユーザＵの感覚・判断に依存せず、一定の品質を保つことができる。また、アルゴリズムを任意に選択できるようにすることで、教師データの特性に依らない機能実現が可能となる。たとえば、物体の色や形状、あるいは画像の明度や彩度によってアルゴリズムを使い分け、物体のエッジを精度よく抽出することが可能となる。 By extracting the boundaries of objects in an image using this type of image analysis processing, it is possible to maintain a certain level of quality without relying on the senses and judgments of the user U. Furthermore, by allowing the algorithm to be selected arbitrarily, it is possible to realize functions that are not dependent on the characteristics of the training data. For example, by using different algorithms depending on the color or shape of the object, or the brightness or saturation of the image, it is possible to extract the edges of the object with high precision.

矩形作成部１２ｃは、閉エッジ矩形作成部１２ｃａと、最類似矩形選択部１２ｃｂとを含む。閉エッジ矩形作成部１２ｃａは、エッジ抽出部１２ｂａによって抽出されたエッジ情報１１ｃに基づいて、上述した閉エッジに接する閉エッジ矩形を作成する。 The rectangle creation unit 12c includes a closed edge rectangle creation unit 12ca and a most similar rectangle selection unit 12cb. The closed edge rectangle creation unit 12ca creates a closed edge rectangle that is in contact with the above-mentioned closed edge based on the edge information 11c extracted by the edge extraction unit 12ba.

ここで、図７は、閉エッジ矩形作成処理の説明図である。図７に示すように、閉エッジ矩形作成部１２ｃａは、エッジ情報１１ｃから、閉エッジＣｅの横位置最小値Ｘ１、横位置最大値Ｘ２、縦位置最小値Ｙ１、縦位置最大値Ｙ２を取得する。 Here, FIG. 7 is an explanatory diagram of the closed edge rectangle creation process. As shown in FIG. 7, the closed edge rectangle creation unit 12ca obtains the minimum horizontal position value X1, maximum horizontal position value X2, minimum vertical position value Y1, and maximum vertical position value Y2 of the closed edge Ce from the edge information 11c.

そして、閉エッジ矩形作成部１２ｃａは、同図に示すように、これらによって規定される４座標（Ｘ１，Ｙ１）、（Ｘ１，Ｙ２）、（Ｘ２，Ｙ１）、（Ｘ２，Ｙ２）を用いて閉エッジ矩形Ｒを作成する。 Then, the closed edge rectangle creation unit 12ca creates a closed edge rectangle R using the four coordinates (X1, Y1), (X1, Y2), (X2, Y1), and (X2, Y2) defined by these, as shown in the same figure.
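As a non-limiting illustration, the closed edge rectangle creation process of FIG. 7 can be sketched as follows. The sketch assumes that a closed edge Ce is available as a list of (x, y) pixel coordinates. The `Rect` type and the function names `closed_edge_rect` and `corners` are hypothetical names introduced only for this example.

```python
from typing import List, NamedTuple, Tuple

Point = Tuple[int, int]


class Rect(NamedTuple):
    x1: int  # minimum horizontal position (X1)
    y1: int  # minimum vertical position (Y1)
    x2: int  # maximum horizontal position (X2)
    y2: int  # maximum vertical position (Y2)


def closed_edge_rect(closed_edge: List[Point]) -> Rect:
    """Build the rectangle tangent to a closed edge from its extreme coordinates."""
    xs = [x for x, _ in closed_edge]
    ys = [y for _, y in closed_edge]
    return Rect(min(xs), min(ys), max(xs), max(ys))


def corners(rect: Rect) -> List[Point]:
    """Return the four coordinates (X1,Y1), (X1,Y2), (X2,Y1), (X2,Y2)."""
    return [(rect.x1, rect.y1), (rect.x1, rect.y2),
            (rect.x2, rect.y1), (rect.x2, rect.y2)]


# A diamond-shaped closed edge (first and last points coincide).
diamond = [(3, 1), (5, 4), (3, 7), (1, 4), (3, 1)]
r = closed_edge_rect(diamond)
print(r)           # Rect(x1=1, y1=1, x2=5, y2=7)
print(corners(r))  # [(1, 1), (1, 7), (5, 1), (5, 7)]
```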

これにより、ユーザＵが、物体の境界に合わせて逐一ＢＢを調整する必要がなくなり、ＶＤＴ作業の工数を大幅に減少することができるとともに、ユーザＵの負担を軽減することが可能となる。 This eliminates the need for user U to adjust BB to match the boundaries of the object one by one, significantly reducing the amount of work required for VDT work and easing the burden on user U.

図６の説明に戻る。最類似矩形選択部１２ｃｂは、類似度算出情報１１ｄに基づいて、閉エッジ矩形作成部１２ｃａによって作成された各閉エッジ矩形Ｒと入力矩形ＩＲとの類似度を算出し、類似度が最大となる閉エッジ矩形Ｒを選択する。 Returning to the explanation of FIG. 6, the most similar rectangle selection unit 12cb calculates the similarity between each closed edge rectangle R created by the closed edge rectangle creation unit 12ca and the input rectangle IR based on the similarity calculation information 11d, and selects the closed edge rectangle R with the maximum similarity.

類似度算出のアルゴリズムは、上述したエッジ抽出の場合と同様に、ユーザＵが任意に選択可能である。最類似矩形選択部１２ｃｂは、たとえば、類似度合いを計る要素として、アスペクト比や、中心位置や、横幅／縦幅等を要素として用いる。そして、最類似矩形選択部１２ｃｂは、入力矩形ＩＲと各閉エッジ矩形Ｒの間の各要素差の二乗和から類似度を求め、類似度が最大となる閉エッジ矩形Ｒを選択する。 As with the edge extraction described above, the user U can freely select the algorithm for calculating the similarity. The most similar rectangle selection unit 12cb uses, for example, the aspect ratio, center position, and width/height as elements for measuring the degree of similarity. The most similar rectangle selection unit 12cb then calculates the similarity from the sum of squares of the differences in each element between the input rectangle IR and each closed edge rectangle R, and selects the closed edge rectangle R with the maximum similarity.
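As a non-limiting illustration, the similarity calculation based on the sum of squares of element differences can be sketched as follows. The embodiment does not state how the sum of squares is converted into a similarity, so the mapping 1 / (1 + sum of squares) used here is an assumption. The names `Rect`, `features`, `similarity`, and `most_similar` are hypothetical, and the candidate coordinates are invented for this example and are not taken from the figures. Rectangles are assumed to have a nonzero height.

```python
from typing import Dict, NamedTuple, Sequence


class Rect(NamedTuple):
    x1: float
    y1: float
    x2: float
    y2: float


def features(rect: Rect) -> Dict[str, float]:
    """Elements used to compare rectangles (height is assumed nonzero)."""
    width = rect.x2 - rect.x1
    height = rect.y2 - rect.y1
    return {
        "aspect": width / height,
        "cx": (rect.x1 + rect.x2) / 2,
        "cy": (rect.y1 + rect.y2) / 2,
        "width": width,
        "height": height,
    }


def similarity(a: Rect, b: Rect, keys: Sequence[str]) -> float:
    """Similarity from the sum of squared element differences over `keys`."""
    fa, fb = features(a), features(b)
    ssd = sum((fa[k] - fb[k]) ** 2 for k in keys)
    # Assumed mapping: 1.0 for identical elements, approaching 0 as they diverge.
    return 1.0 / (1.0 + ssd)


def most_similar(input_rect: Rect, candidates: Dict[str, Rect],
                 keys: Sequence[str]) -> str:
    """Return the name of the candidate with the greatest similarity."""
    return max(candidates,
               key=lambda name: similarity(input_rect, candidates[name], keys))


# Input rectangle with center (4, 3), width 8, height 6.
ir = Rect(0, 0, 8, 6)
# Hypothetical closed edge rectangles.
rects = {
    "R1": Rect(0.5, 0.5, 4.5, 3.5),
    "R2": Rect(3, 2, 5, 4),
    "R3": Rect(2, 0.5, 8, 6),
}
print(most_similar(ir, rects, ["aspect"]))           # R1
print(most_similar(ir, rects, ["cx", "cy"]))         # R2
print(most_similar(ir, rects, ["width", "height"]))  # R3
```

Passing a different list of keys changes which elements are compared, which corresponds to the user U selecting the similarity calculation criteria in advance.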

ここで、最類似矩形選択処理の具体例について、図８～図１２を用いて説明する。図８～図１２は、最類似矩形選択処理の具体例を示す図（その１）～（その５）である。 Here, specific examples of the most similar rectangle selection process will be described with reference to Figs. 8 to 12. Figs. 8 to 12 are diagrams (part 1) to (part 5) showing specific examples of the most similar rectangle selection process.

まず、図８に示すように、中心位置（４，３）、横幅８、縦幅６の入力矩形ＩＲ内に、星形、丸形、五角形の閉エッジが含まれているものとする。 First, as shown in Figure 8, the input rectangle IR has a center position of (4, 3), a width of 8, and a height of 6, and contains closed edges of a star, a circle, and a pentagon.

かかる場合に、図９に示すように、アスペクト比のみを比較する場合、最類似矩形選択部１２ｃｂは、星形の閉エッジに接する閉エッジ矩形Ｒ１、丸形の閉エッジに接する閉エッジ矩形Ｒ２、五角形の閉エッジに接する閉エッジ矩形Ｒ３のうち、アスペクト比が入力矩形ＩＲと略同一の閉エッジ矩形Ｒ１を選択する。 In such a case, as shown in FIG. 9, when comparing only the aspect ratios, the most similar rectangle selection unit 12cb selects the closed edge rectangle R1, whose aspect ratio is approximately the same as that of the input rectangle IR, from among the closed edge rectangle R1 tangent to the star-shaped closed edge, the closed edge rectangle R2 tangent to the circular closed edge, and the closed edge rectangle R3 tangent to the pentagonal closed edge.

また、図１０に示すように、中心位置のみを比較する場合、最類似矩形選択部１２ｃｂは、閉エッジ矩形Ｒ１，Ｒ２，Ｒ３のうち、中心位置が入力矩形ＩＲと略同一の閉エッジ矩形Ｒ２を選択する。 Also, as shown in FIG. 10, when comparing only the center positions, the most similar rectangle selection unit 12cb selects the closed edge rectangle R2, which has a center position that is approximately the same as that of the input rectangle IR, from among the closed edge rectangles R1, R2, and R3.

また、図１１に示すように、横幅／縦幅のみを比較する場合、最類似矩形選択部１２ｃｂは、閉エッジ矩形Ｒ１，Ｒ２，Ｒ３のうち、横幅／縦幅が入力矩形ＩＲに最も類似する閉エッジ矩形Ｒ３を選択する。 Also, as shown in FIG. 11, when comparing only the width/height, the most similar rectangle selection unit 12cb selects the closed edge rectangle R3, which is the one that is most similar in width/height to the input rectangle IR, from among the closed edge rectangles R1, R2, and R3.

また、図１２に示すように、入力矩形ＩＲ１に対し、たとえば閉エッジが一部しか含まれておらず、入力矩形ＩＲ１内に閉エッジが検出されていないものとする。かかる場合、同図に示すように、最類似矩形選択部１２ｃｂは、入力矩形ＩＲ１を一定量拡張し、拡張した入力矩形ＩＲ２内において閉エッジを検出し、閉エッジ矩形Ｒ１を作成する。 Also, as shown in FIG. 12, assume that the input rectangle IR1 contains, for example, only part of a closed edge, so that no closed edge is detected within the input rectangle IR1. In such a case, as shown in the figure, the most similar rectangle selection unit 12cb expands the input rectangle IR1 by a certain amount, detects a closed edge within the expanded input rectangle IR2, and creates the closed edge rectangle R1.
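As a non-limiting illustration, the expansion of the input rectangle in FIG. 12 can be sketched as follows. The sketch assumes that a closed edge counts as detected when its closed edge rectangle lies entirely inside the search rectangle, and that the expansion is a fixed pixel margin clamped to the image bounds. The names `contains`, `expand`, and `candidates_with_expansion` and the `margin` parameter are hypothetical; the embodiment states only that the input rectangle is expanded by a certain amount.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    x1: int
    y1: int
    x2: int
    y2: int


def contains(outer: Rect, inner: Rect) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return (outer.x1 <= inner.x1 and outer.y1 <= inner.y1
            and inner.x2 <= outer.x2 and inner.y2 <= outer.y2)


def expand(rect: Rect, margin: int, image_w: int, image_h: int) -> Rect:
    """Grow `rect` by `margin` pixels on every side, clamped to the image."""
    return Rect(max(rect.x1 - margin, 0), max(rect.y1 - margin, 0),
                min(rect.x2 + margin, image_w - 1),
                min(rect.y2 + margin, image_h - 1))


def candidates_with_expansion(input_rect: Rect, closed_edge_rects: List[Rect],
                              margin: int, image_w: int, image_h: int) -> List[Rect]:
    """Return closed edge rectangles inside IR1, or inside the expanded IR2 if none."""
    found = [r for r in closed_edge_rects if contains(input_rect, r)]
    if found:
        return found
    expanded = expand(input_rect, margin, image_w, image_h)
    return [r for r in closed_edge_rects if contains(expanded, r)]


ir1 = Rect(10, 10, 30, 30)
r1 = Rect(8, 12, 33, 28)  # sticks out of IR1 on the left and right
print(candidates_with_expansion(ir1, [r1], margin=5, image_w=100, image_h=100))
# [Rect(x1=8, y1=12, x2=33, y2=28)]
```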

このように、入力矩形ＩＲ内外から物体の境界判定を補助することで、精密な入力矩形ＩＲの作成が不要となり、ユーザＵの負担を軽減させることができる。 In this way, by assisting in determining the boundaries of an object from inside and outside the input rectangle IR, it becomes unnecessary to create a precise input rectangle IR, thereby reducing the burden on the user U.

ところで、図５を用いた説明では、ユーザＵに対し、入力矩形ＩＲに対する自動調整後の矩形ＡＲを採用するか否かを「Ｙｅｓ」または「Ｎｏ」で問い合わせる画面ＵＩ（User Interface）の例を挙げたが、画面ＵＩはこれに限られるものではない。 In the explanation using Figure 5, an example of a screen UI (User Interface) that asks the user U whether or not to adopt the rectangle AR after automatic adjustment to the input rectangle IR by selecting "Yes" or "No" was given, but the screen UI is not limited to this.

以下、画面ＵＩの各種の具体例について、図１３～図２６を用いて説明する。図１３～図２６は、画面ＵＩの具体例を示す図（その１）～（その１４）である。 Various specific examples of the screen UI will be described below with reference to Figs. 13 to 26. Figs. 13 to 26 are diagrams (part 1) to (part 14) showing specific examples of the screen UI.

図１３に示すように、たとえば学習支援装置１０は、前述の採用の有無を問い合わせることなく、ユーザＵによって入力された入力矩形ＩＲに対し、自動調整後の矩形ＡＲをＢＢとして自動的に採用するようにしてもよい。 As shown in FIG. 13, for example, the learning support device 10 may automatically adopt the automatically adjusted rectangle AR as BB for the input rectangle IR input by the user U, without inquiring about the above-mentioned adoption.

また、図５でも示したが、図１４に示すように、たとえば学習支援装置１０は、ユーザＵによって入力された入力矩形ＩＲに対し、自動調整後の矩形ＡＲの採用の有無を問い合わせるようにしてもよい。 As also shown in FIG. 5, as shown in FIG. 14, for example, the learning support device 10 may inquire whether or not to adopt the automatically adjusted rectangle AR for the input rectangle IR input by the user U.

かかる場合、図１５に示すように、ユーザＵにより「Ｙｅｓ」が選択されたならば、学習支援装置１０は、ＢＢとして自動調整後の矩形ＡＲを採用することとなる。一方、図１６に示すように、ユーザＵにより「Ｎｏ」が選択されたならば、学習支援装置１０は、ＢＢとして入力矩形ＩＲを採用することとなる。 In this case, as shown in FIG. 15, if the user U selects "Yes," the learning support device 10 will adopt the automatically adjusted rectangle AR as BB. On the other hand, as shown in FIG. 16, if the user U selects "No," the learning support device 10 will adopt the input rectangle IR as BB.

また、図１７に示すように、ユーザＵにより入力された入力矩形ＩＲ内に複数の閉エッジが含まれるものとする。なお、かかるケースは、入力矩形ＩＲを拡張した場合を含むものとする。また、ここでは、五角形と丸形の２つの閉エッジが含まれるものとする。 Also, as shown in FIG. 17, assume that the input rectangle IR input by the user U contains multiple closed edges. This case includes the case where the input rectangle IR has been expanded. Here, assume that two closed edges, a pentagon and a circle, are contained.

かかる場合、同図に示すように、たとえば学習支援装置１０は、前述の採用の有無を問い合わせることなく、ユーザＵによって入力された入力矩形ＩＲに対し、類似度が最大となる矩形ＡＲをＢＢとして自動的に採用するようにしてもよい。 In such a case, as shown in the figure, for example, the learning support device 10 may automatically adopt as BB the rectangle AR that has the greatest similarity to the input rectangle IR input by the user U, without inquiring about whether or not to adopt the rectangle AR.

また、同様のケースで、図１８に示すように、たとえば学習支援装置１０は、ユーザＵによって入力された入力矩形ＩＲに対し、まず類似度が最大となる第１候補の矩形ＡＲ１の採用の有無を問い合わせるようにしてもよい。 In a similar case, as shown in FIG. 18, for example, the learning support device 10 may first inquire whether or not to adopt the first candidate rectangle AR1 that has the highest similarity for the input rectangle IR input by the user U.

かかる場合、図１９に示すように、ユーザＵにより「Ｙｅｓ」が選択されたならば、学習支援装置１０は、ＢＢとして第１候補の矩形ＡＲ１を採用することとなる。一方、図２０に示すように、ユーザＵにより「Ｎｏ」が選択されたならば、学習支援装置１０は、その次に類似度が大きい第２候補の矩形ＡＲ２の採用の有無を問い合わせる。 In this case, as shown in FIG. 19, if the user U selects "Yes," the learning support device 10 will adopt the first candidate rectangle AR1 as BB. On the other hand, as shown in FIG. 20, if the user U selects "No," the learning support device 10 will inquire as to whether or not to adopt the second candidate rectangle AR2, which has the next highest similarity.

Then, as shown in FIG. 21, if the user U selects "Yes," the learning support device 10 adopts the second-candidate rectangle AR2 as the BB. Conversely, as shown in FIG. 22, if the user U selects "No," the learning support device 10 adopts the input rectangle IR as the BB.
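The confirmation flow of FIGS. 18 to 22 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation described in the embodiment: the function name `choose_bounding_box`, the `ask_user` callback, the `max_prompts` parameter, and the tuple layout of a rectangle are all hypothetical.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical rectangle layout (not from the source): (x_min, y_min, x_max, y_max)
Rect = Tuple[int, int, int, int]


def choose_bounding_box(
    input_rect: Rect,
    ranked_candidates: Sequence[Rect],
    ask_user: Callable[[Rect], bool],
    max_prompts: int = 2,
) -> Rect:
    """Offer candidates in descending similarity order and return the adopted BB."""
    for candidate in ranked_candidates[:max_prompts]:
        if ask_user(candidate):  # the user answered "Yes"
            return candidate
    return input_rect  # every offer was declined: keep the rectangle as drawn


# Usage: the user declines AR1 and then accepts AR2.
ir = (10, 10, 100, 100)
ar1, ar2 = (12, 14, 60, 70), (50, 50, 98, 96)
answers = iter([False, True])
adopted = choose_bounding_box(ir, [ar1, ar2], lambda rect: next(answers))
print(adopted)
```

Passing the prompt as a callback keeps the decision logic independent of whether the answer comes from a dialog, the keyboard, or a tap.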

The learning support device 10 may also, for example, let the user carry out the flow shown in FIGS. 18 to 22 by selecting a candidate with the keyboard arrow keys and adopting it with the Enter key, as shown in FIG. 23.

Similarly, as shown in FIG. 24, the learning support device 10 may, for example, allow selection and adoption by a mouse click or a finger tap.

Next, consider the case where the input rectangle IR contains rectangles AR that have already been annotated. As shown in FIG. 25, assume that the input rectangle IR contains the annotated rectangles AR1 and AR2.

In this case, as shown in FIG. 25, the learning support device 10 selects and adopts, for example, the not-yet-annotated rectangle AR3 by any of the methods shown in FIGS. 13 to 16.

Also, as shown in FIG. 26, suppose that the input rectangle IR contains the annotated rectangle AR1 together with multiple closed edges that have not yet been annotated.

In this case, as shown in FIG. 26, the learning support device 10 selects and adopts, for example, the rectangle AR2 from among the not-yet-annotated rectangles AR2 and AR3 by any of the methods shown in FIGS. 17 to 24.
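One way to picture the handling of already annotated rectangles is to drop them from the candidate list before ranking. The sketch below assumes this filtering approach; the names `pick_unannotated`, `area`, and `closest_area`, and the area-based score used in the usage lines, are hypothetical stand-ins and not the similarity measure of the embodiment.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple

# Hypothetical rectangle layout (not from the source): (x_min, y_min, x_max, y_max)
Rect = Tuple[int, int, int, int]


def pick_unannotated(
    input_rect: Rect,
    candidates: Sequence[Rect],
    annotated: Iterable[Rect],
    score: Callable[[Rect, Rect], float],
) -> Optional[Rect]:
    """Return the best-scoring candidate that has not been annotated yet."""
    done = set(annotated)
    remaining = [c for c in candidates if c not in done]
    if not remaining:
        return None  # nothing left to offer inside this input rectangle
    return max(remaining, key=lambda c: score(input_rect, c))


def area(rect: Rect) -> int:
    return (rect[2] - rect[0]) * (rect[3] - rect[1])


def closest_area(a: Rect, b: Rect) -> float:
    """Placeholder score: higher when the two areas are closer."""
    return -abs(area(a) - area(b))


# Usage: the first rectangle is already annotated, so only the other two compete.
ir = (0, 0, 100, 100)
r1, r2, r3 = (5, 5, 40, 40), (50, 5, 95, 45), (10, 50, 90, 95)
picked = pick_unannotated(ir, [r1, r2, r3], [r1], closest_area)
print(picked)
```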

Next, the processing sequence executed by the learning support device 10 is described with reference to FIG. 27. FIG. 27 shows the processing sequence executed by the learning support device 10 according to the embodiment. Note that FIG. 27 shows the sequence up to the creation of one BB in one target image.

First, when the user U selects a target image for annotation via the HMI unit 3 (step S101), the image depiction unit 12a displays the target image on the HMI unit 3 (step S102). The image depiction unit 12a also sends the target image to the image analysis unit 12b.

In the image analysis unit 12b, the edge extraction unit 12ba executes edge extraction processing based on the target image and the extraction algorithm information 11b in the storage unit 11 (step S103), and writes the resulting edge information 11c to the storage unit 11.

Then, when a rectangle input is received from the user U via the HMI unit 3 (step S104), the image depiction unit 12a draws the input rectangle IR on the HMI unit 3 (step S105). The image depiction unit 12a also sends input rectangle information describing the input rectangle IR to the rectangle creation unit 12c.

In the rectangle creation unit 12c, the closed edge rectangle creation unit 12ca executes closed edge rectangle creation processing based on the input rectangle IR and the edge information 11c in the storage unit 11 (step S106).

Then, the most similar rectangle selection unit 12cb executes most similar rectangle selection processing based on the result of the closed edge rectangle creation processing (step S107), and sends the resulting most similar rectangle information to the image depiction unit 12a. Based on the received most similar rectangle information, the image depiction unit 12a draws the most similar rectangle on the HMI unit 3 (step S108).

Then, on receiving from the user U, via the HMI unit 3, a selection of the rectangle AR to be adopted (step S109), the image depiction unit 12a draws that rectangle AR on the HMI unit 3 (step S110) and sends adopted rectangle information describing the adopted rectangle AR to the rectangle creation unit 12c (step S111).

Finally, the rectangle creation unit 12c writes the received adopted rectangle information to the rectangle information DB 11e in the storage unit 11 (step S112) and ends the processing.

As described above, the learning support device 10 according to the embodiment includes the edge extraction unit 12ba (an example of an "extraction unit"), the closed edge rectangle creation unit 12ca (an example of a "creation unit"), and the most similar rectangle selection unit 12cb (an example of a "selection unit"). The edge extraction unit 12ba extracts the edges of objects in a target image for annotation. The closed edge rectangle creation unit 12ca creates a closed edge rectangle in contact with a closed edge, that is, an edge extracted as a closed curve from among the edges extracted by the edge extraction unit 12ba. The most similar rectangle selection unit 12cb calculates the similarity between the input rectangle IR entered by the user U for the target image and each closed edge rectangle, and selects the closed edge rectangle with the highest similarity.

Therefore, the learning support device 10 according to the embodiment can ensure the quality of annotations for an AI model for image recognition.

The edge extraction unit 12ba extracts the edges of the object using an edge extraction algorithm freely selected by the user U.

Therefore, the learning support device 10 according to the embodiment can provide its functions regardless of the characteristics of the training data. For example, different algorithms can be used depending on the color or shape of the object, or on the brightness or saturation of the image, so that the edges of the object are extracted accurately.
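A user-selectable extraction algorithm could be organized as a simple registry keyed by name. The source does not name any particular algorithm, so the two toy detectors below (`gradient_edges` and `boundary_edges`), the `ALGORITHMS` table, and the `extract_edges` dispatcher are hypothetical illustrations written in plain Python, not the algorithms held in the extraction algorithm information 11b.

```python
from typing import Callable, Dict, List

Image = List[List[int]]     # grayscale values 0..255, row-major (assumed layout)
EdgeMap = List[List[bool]]  # True where a pixel is treated as an edge


def gradient_edges(img: Image, thresh: int = 50) -> EdgeMap:
    """Mark pixels whose L1 gradient magnitude reaches thresh (clamped at borders)."""
    h, w = len(img), len(img[0])
    edges = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            edges[y][x] = abs(gx) + abs(gy) >= thresh
    return edges


def boundary_edges(img: Image, level: int = 128) -> EdgeMap:
    """Mark bright pixels (>= level) that have at least one dark 4-neighbour."""
    h, w = len(img), len(img[0])
    edges = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x] < level:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and img[ny][nx] < level:
                    edges[y][x] = True
                    break
    return edges


# Hypothetical registry: the user's choice is just a key into this table.
ALGORITHMS: Dict[str, Callable[..., EdgeMap]] = {
    "gradient": gradient_edges,
    "boundary": boundary_edges,
}


def extract_edges(img: Image, algorithm: str, **params) -> EdgeMap:
    """Run the edge extraction algorithm selected by the user."""
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown edge extraction algorithm: {algorithm!r}")
    return ALGORITHMS[algorithm](img, **params)


# Usage: a bright 3x3 block on a dark 5x5 background.
img = [[255 if 1 <= x <= 3 and 1 <= y <= 3 else 0 for x in range(5)] for y in range(5)]
edges = extract_edges(img, "boundary")
```

With the `boundary` detector, the ring of bright pixels around the block is marked while its interior pixel is not, which yields a closed outline of the kind the later steps rely on.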

The closed edge rectangle creation unit 12ca creates the closed edge rectangle whose vertices are the four corner coordinates defined by the minimum horizontal position, maximum horizontal position, minimum vertical position, and maximum vertical position of the closed edge.

Therefore, the learning support device 10 according to the embodiment can automatically and accurately create a closed edge rectangle in contact with the closed edge, and can automatically adjust the input rectangle IR with high accuracy.
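The construction from the four extreme values can be written in a few lines. The sketch assumes a closed edge is available as a list of (x, y) pixel coordinates; that representation and the function name `closed_edge_rectangle` are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]  # (x, y), assumed pixel coordinates


def closed_edge_rectangle(closed_edge: Sequence[Point]) -> List[Point]:
    """Return the four corners of the axis-aligned rectangle touching the closed edge."""
    if not closed_edge:
        raise ValueError("closed_edge must contain at least one point")
    xs = [x for x, _ in closed_edge]
    ys = [y for _, y in closed_edge]
    x_min, x_max = min(xs), max(xs)  # minimum / maximum horizontal position
    y_min, y_max = min(ys), max(ys)  # minimum / maximum vertical position
    # Corners in order: top-left, top-right, bottom-right, bottom-left
    return [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]


# Usage: the vertices of a pentagon-shaped closed edge.
pentagon = [(5, 0), (10, 4), (8, 10), (2, 10), (0, 4)]
corners = closed_edge_rectangle(pentagon)
print(corners)
```

Because each side of the resulting rectangle passes through at least one extreme point of the closed edge, the rectangle is in contact with the edge on all four sides.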

The most similar rectangle selection unit 12cb calculates the similarity between the input rectangle IR and the closed edge rectangle using at least one of the aspect ratio, the center position, or the width and height as a comparison element.

Therefore, the learning support device 10 according to the embodiment can accurately calculate the similarity using at least one of the aspect ratio, the center position, or the width and height as a comparison element, and can appropriately select the most similar rectangle AR based on that similarity.
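The source names the comparison elements but gives no formula, so the scoring below is purely an assumed example: each element is mapped to a value in (0, 1] and the values are combined as a weighted average. The functions `similarity` and `most_similar`, the `weights` parameter, and the rectangle layout are hypothetical; setting a weight to zero drops that element from the comparison.

```python
import math
from typing import Sequence, Tuple

# Hypothetical rectangle layout (not from the source): (x_min, y_min, x_max, y_max)
Rect = Tuple[float, float, float, float]


def _ratio(a: float, b: float) -> float:
    """Ratio of the smaller to the larger value, in (0, 1] for positive inputs."""
    return min(a, b) / max(a, b)


def similarity(ir: Rect, cand: Rect,
               weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Assumed score combining aspect ratio, center position, and width/height."""
    iw, ih = ir[2] - ir[0], ir[3] - ir[1]
    cw, ch = cand[2] - cand[0], cand[3] - cand[1]
    if min(iw, ih, cw, ch) <= 0:
        raise ValueError("rectangles must have positive width and height")
    aspect = _ratio(iw / ih, cw / ch)
    # Center distance, normalized by the diagonal of the input rectangle
    dist = math.hypot((ir[0] + ir[2]) / 2 - (cand[0] + cand[2]) / 2,
                      (ir[1] + ir[3]) / 2 - (cand[1] + cand[3]) / 2)
    center = 1.0 / (1.0 + dist / math.hypot(iw, ih))
    size = _ratio(iw, cw) * _ratio(ih, ch)
    w_a, w_c, w_s = weights
    return (w_a * aspect + w_c * center + w_s * size) / (w_a + w_c + w_s)


def most_similar(ir: Rect, candidates: Sequence[Rect],
                 weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> Rect:
    """Select the candidate with the highest similarity to the input rectangle."""
    return max(candidates, key=lambda c: similarity(ir, c, weights))


# Usage: a tight candidate and a much smaller, differently shaped one.
ir = (10, 10, 110, 60)
tight, small = (12, 11, 108, 58), (40, 20, 70, 50)
best = most_similar(ir, [small, tight])
print(best)
```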

When the input rectangle IR contains only part of the closed edge, the most similar rectangle selection unit 12cb expands the input rectangle IR by a fixed amount so that the closed edge can be detected within the input rectangle IR.

Therefore, the learning support device 10 according to the embodiment can automatically correct the input rectangle IR even if, for example, it was not drawn accurately.
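The expansion step might look like the following. The source only says the rectangle is expanded "by a fixed amount", so the `margin` parameter, the clamping to the image bounds, and the helper names `overlaps`, `contains`, `expand_rect`, and `expand_if_partial` are assumptions for illustration. Here the closed edge is represented by its own bounding rectangle.

```python
from typing import Tuple

# Hypothetical rectangle layout (not from the source): (x_min, y_min, x_max, y_max)
Rect = Tuple[int, int, int, int]


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(outer: Rect, inner: Rect) -> bool:
    """True if inner lies entirely within outer."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def expand_rect(rect: Rect, margin: int, image_size: Tuple[int, int]) -> Rect:
    """Grow rect by margin on every side, clamped to the image (width, height)."""
    width, height = image_size
    return (max(0, rect[0] - margin), max(0, rect[1] - margin),
            min(width - 1, rect[2] + margin), min(height - 1, rect[3] + margin))


def expand_if_partial(input_rect: Rect, edge_rect: Rect,
                      margin: int, image_size: Tuple[int, int]) -> Rect:
    """Expand the input rectangle only when the closed edge is partly inside it."""
    if overlaps(input_rect, edge_rect) and not contains(input_rect, edge_rect):
        return expand_rect(input_rect, margin, image_size)
    return input_rect


# Usage: the closed edge sticks out on the left of the drawn rectangle.
ir = (20, 20, 60, 60)
edge = (15, 25, 50, 58)
expanded = expand_if_partial(ir, edge, margin=10, image_size=(100, 100))
print(expanded, contains(expanded, edge))
```

A single fixed expansion may still fail to enclose the closed edge if the margin is too small; how the embodiment treats that case is not stated in this section.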

In the embodiment described above, the similarity calculation information 11d, which serves as the basis for calculating the similarity, is selected in advance by, for example, the user U and set as static information. However, the setting may instead be changed dynamically. For example, based on a history of the calculated similarities and of the user U's adoption results for the closed edge rectangles presented on the basis of those similarities, the tendencies in the user U's adoption pattern may be learned by machine learning, and the learning result may be dynamically reflected in the similarity calculation information 11d. Such dynamic reflection may also be applied to the extraction algorithm information 11b.
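The source does not specify which machine learning method would be used. As one possible illustration only, the weights applied to the comparison elements could be updated with a stochastic gradient step of logistic regression each time the user adopts or rejects a presented rectangle. The functions `adoption_probability` and `update_weights`, the feature layout, and the learning rate `lr` are all hypothetical.

```python
import math
from typing import List, Sequence


def adoption_probability(weights: Sequence[float], features: Sequence[float]) -> float:
    """Logistic model: estimated probability that the user adopts the candidate."""
    z = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def update_weights(weights: Sequence[float], features: Sequence[float],
                   adopted: bool, lr: float = 0.1) -> List[float]:
    """One SGD step on the log-loss for a single adopt/reject observation."""
    error = (1.0 if adopted else 0.0) - adoption_probability(weights, features)
    return [w + lr * error * f for w, f in zip(weights, features)]


# Usage: features are assumed per-element similarities
# (aspect ratio, center position, width/height) of the presented rectangle.
weights = [0.0, 0.0, 0.0]
weights = update_weights(weights, [1.0, 0.5, 0.0], adopted=True)
print(weights)
```

Elements that tend to score high on adopted rectangles gain weight over time, which is one simple way a learned tendency could feed back into the similarity calculation.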

Further effects and modifications can readily be derived by those skilled in the art. The broader aspects of the present invention are therefore not limited to the specific details and representative embodiments shown and described above. Accordingly, various modifications are possible without departing from the spirit or scope of the general inventive concept as defined by the appended claims and their equivalents.

3 HMI unit
3a Input unit
3b Output unit
10 Learning support device
11 Storage unit
11a Target image DB
11b Extraction algorithm information
11c Edge information
11d Similarity calculation information
11e Rectangle information DB
12 Control unit
12a Image depiction unit
12aa Rectangle drawing unit
12b Image analysis unit
12ba Edge extraction unit
12c Rectangle creation unit
12ca Closed edge rectangle creation unit
12cb Most similar rectangle selection unit
IR Input rectangle
U User

Claims

1. A learning support device comprising a controller that:
extracts edges of objects in a target image for annotation in creating training data;
creates a closed edge rectangle in contact with a closed edge, the closed edge being an edge extracted as a closed curve from among the extracted edges;
calculates a similarity between an input rectangle input by a user for the target image and the closed edge rectangle;
selects the closed edge rectangle with the greatest similarity; and
adopts the selected closed edge rectangle as a bounding box for the annotation.

2. The learning support device according to claim 1, wherein the controller extracts the edges of the objects using an edge extraction algorithm selected by the user.

3. The learning support device according to claim 1 or 2, wherein the controller creates the closed edge rectangle whose vertices are the coordinate positions of the four corners defined by the minimum horizontal position, the maximum horizontal position, the minimum vertical position, and the maximum vertical position of the closed edge.

4. The learning support device according to claim 1, 2, or 3, wherein the controller calculates the similarity between the input rectangle and the closed edge rectangle using at least one of an aspect ratio, a center position, or a width and a height as a comparison element.

5. The learning support device according to any one of claims 1 to 4, wherein, when the closed edge is only partially included in the input rectangle, the controller expands the input rectangle by a fixed amount so that the closed edge can be detected within the input rectangle.

6. A learning support method executed by a controller, the method comprising:
extracting edges of objects in a target image for annotation in creating training data;
creating a closed edge rectangle in contact with a closed edge, the closed edge being an edge extracted as a closed curve from among the extracted edges;
calculating a similarity between an input rectangle input by a user for the target image and the closed edge rectangle;
selecting the closed edge rectangle with the greatest similarity; and
adopting the selected closed edge rectangle as a bounding box for the annotation.

7. A learning support program that causes a controller to execute:
extracting edges of objects in a target image for annotation in creating training data;
creating a closed edge rectangle in contact with a closed edge, the closed edge being an edge extracted as a closed curve from among the extracted edges;
calculating a similarity between an input rectangle input by a user for the target image and the closed edge rectangle;
selecting the closed edge rectangle with the greatest similarity; and
adopting the selected closed edge rectangle as a bounding box for the annotation.