JP7375161B2 - Learning data creation device, method, program, and recording medium

Info

Publication number: JP7375161B2
Application number: JP2022507151A
Authority: JP
Inventors: 一央岩見; 真司羽田
Original assignee: Fujifilm Toyama Chemical Co Ltd
Current assignee: Fujifilm Toyama Chemical Co Ltd
Priority date: 2020-03-13
Filing date: 2021-03-05
Publication date: 2023-11-07
Anticipated expiration: 2041-03-05
Also published as: JPWO2021182345A1; WO2021182345A1

Description

The present invention relates to a learning data creation device, method, program, learning data, and machine learning device, and in particular to a technique for efficiently creating a large amount of learning data.

Conventionally, a visual inspection apparatus has been proposed that learns from a large amount of teaching data stored in a teaching file and performs pattern recognition to determine defects (Patent Document 1).

This visual inspection apparatus includes a teaching data generation device that, for specific teaching data that is scarce among the large amount of teaching data in the teaching file, transforms that specific teaching data to generate new teaching data. By supplementing the teaching file with the teaching data generated by the teaching data generation device and learning from it, the apparatus can inspect for defects for which little data is available.

When the teaching data to be generated is image data, the teaching data generation device generates new teaching data by performing affine transformations, including enlargement, reduction, and rotation of the image, and attribute conversions, including brightness, contrast, and edge strength.

Japanese Patent Application Publication No. 2006-48370

To accurately recognize, with a trained learning model, the region of a drug in a captured image in which the drug is photographed as a target object, it is necessary to create many pairs of a captured image of a drug and region information (correct answer data) indicating the region of the drug in that captured image, and to train the learning model by machine learning on a learning data set consisting of those many pairs.

Conventionally, correct answer data of this kind is created by displaying the captured image on a display and having the user fill in the drug region pixel by pixel while viewing the displayed image, so creating the correct answer data takes considerable time and effort.

On the other hand, the visual inspection apparatus described in Patent Document 1 uses a camera to image a target object such as printed matter or a plain surface (paper, film, metal, etc.), recognizes printing defects from the captured image, and classifies the type of defect ("hole", "stain", "protrusion", "streak", etc.).

Therefore, when the teaching data generation device transforms one piece of scarce data (image data) to generate a plurality of new teaching data, the correct answer data corresponding to the plurality of teaching data generated by transforming the same image data all indicate the same defect type. That is, Patent Document 1 neither describes the problem that creating correct answer data for teaching data (teaching images) takes time and effort, nor discloses a technique for solving it.

The present invention has been made in view of these circumstances, and its object is to provide a learning data creation device, method, program, learning data, and machine learning device that can efficiently create learning data for machine learning of a learning model that recognizes region information corresponding to the region of a drug from a captured image of the drug.

To achieve the above object, the invention according to a first aspect is a learning data creation device comprising a processor and a memory, in which the processor creates learning data for machine learning. The processor performs: an acquisition process of acquiring a captured image in which a drug is photographed; a learning image generation process of generating, from the acquired captured image, a learning image in which the drug is arbitrarily arranged; a correct answer data generation process of generating second region information corresponding to the region of the drug in the generated learning image and using the generated second region information as correct answer data for the learning image; and storage control of storing a pair of the generated learning image and the correct answer data in the memory as learning data.

According to the first aspect of the present invention, a learning image in which a drug is arbitrarily arranged is generated from a captured image in which the drug is photographed. In addition, second region information corresponding to the region of the drug in the generated learning image is generated, and the generated second region information is used as correct answer data for the learning image. Because this correct answer data can be generated by the correct answer data generation process of the processor, creating it requires no manual time or effort.

By using the pairs of learning images and correct answer data generated in this way as learning data, a large amount of learning data can be generated (that is, the learning data can be augmented).

In the learning data creation device according to a second aspect of the present invention, it is preferable that the acquisition process of the processor acquires first region information corresponding to the region of the drug in the acquired captured image, and that the correct answer data generation process generates the second region information based on the acquired first region information.

In the learning data creation device according to a third aspect of the present invention, it is preferable that the acquisition process of the processor acquires a captured image in which a plurality of drugs are photographed, or a plurality of captured images in each of which a drug is photographed, and acquires a plurality of pieces of first region information corresponding to the regions of the plurality of drugs.

In the learning data creation device according to a fourth aspect of the present invention, it is preferable that the first region information is region information in which the region of the drug in the captured image has been set manually, region information in which the region of the drug in the captured image has been extracted automatically by image processing, or region information in which the region of the drug in the captured image has been extracted automatically by image processing and then adjusted manually.

In the learning data creation device according to a fifth aspect of the present invention, it is preferable that the correct answer data includes at least one of a correct answer image corresponding to the region of the drug, bounding box information that encloses the region of the drug in a rectangle, and edge information indicating the edge of the region of the drug. The correct answer image includes a mask image.
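As a non-limiting illustration of the three kinds of correct answer data named in the fifth aspect, the following sketch derives bounding box information and edge information from a mask image. It assumes a binary mask held as a list of rows (1 for drug pixels, 0 for background) and defines the edge as drug pixels with a 4-neighbor outside the region; the function names `bounding_box` and `edge_image` are hypothetical and do not appear in the specification.

```python
from typing import List, Tuple

Mask = List[List[int]]  # assumption: 1 = drug pixel, 0 = background


def bounding_box(mask: Mask) -> Tuple[int, int, int, int]:
    """Return (x_min, y_min, x_max, y_max) enclosing all drug pixels."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        raise ValueError("mask contains no drug pixels")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


def edge_image(mask: Mask) -> Mask:
    """Mark drug pixels that have at least one 4-neighbor outside the region."""
    h, w = len(mask), len(mask[0])

    def is_drug(x: int, y: int) -> bool:
        return 0 <= x < w and 0 <= y < h and bool(mask[y][x])

    edge = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            inside = all(
                is_drug(x + dx, y + dy)
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            )
            if mask[y][x] and not inside:
                edge[y][x] = 1
    return edge


# A 3x3 drug region centered in a 5x5 mask image.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
box = bounding_box(mask)
edge = edge_image(mask)
```

The two diagonal corner points of the bounding box are simply `(box[0], box[1])` and `(box[2], box[3])`.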

In the learning data creation device according to a sixth aspect of the present invention, it is preferable that the learning image generation process generates the learning image by translating, flipping, rotating, or scaling the captured image, and that the correct answer data generation process generates the correct answer data by translating, flipping, rotating, or scaling the first region information in correspondence with the captured image. The learning image and the correct answer data may be generated simultaneously by parallel processing, or either one of them may be generated first and the other afterward.

In the learning data creation device according to a seventh aspect of the present invention, it is preferable that the learning image generation process generates the learning image by combining two or more images obtained by translating, flipping, rotating, or scaling the captured image, and that the correct answer data generation process generates the correct answer data by translating, flipping, rotating, or scaling the first region information corresponding to each of the two or more images in correspondence with the captured image. This makes it possible to generate a learning image composed of images of a plurality of drugs, together with its correct answer data.

In the learning data creation device according to an eighth aspect of the present invention, it is preferable that the processor performs a drug image acquisition process of acquiring a drug image obtained by cutting out the region of the drug from the captured image based on the acquired first region information, that the learning image generation process generates the learning image by translating, flipping, rotating, or scaling the acquired drug image, and that the correct answer data generation process generates the correct answer data by translating, flipping, rotating, or scaling the first region information in correspondence with the drug image.

In the learning data creation device according to a ninth aspect of the present invention, it is preferable that the processor performs a drug image acquisition process of acquiring a drug image obtained by cutting out the region of the drug from the captured image based on the acquired first region information, that the learning image generation process generates the learning image by combining two or more images obtained by translating, flipping, rotating, or scaling the acquired drug image, and that the correct answer data generation process generates the correct answer data by combining two or more pieces of second region information obtained by translating, flipping, rotating, or scaling the first region information in correspondence with the drug image. This makes it possible to generate a learning image composed of images of a plurality of drugs, together with its correct answer data.

In the learning data creation device according to a tenth aspect of the present invention, it is preferable that, when generating a learning image including a plurality of drugs, the learning image generation process generates a learning image in which all or some of the plurality of drugs are in contact with each other at a point or along a line.

In the learning data creation device according to an eleventh aspect of the present invention, it is preferable that the correct answer data includes an edge image showing only the locations where a plurality of drug images are in contact at a point or along a line. For a learning image in which all or some of a plurality of drugs are in contact at a point or along a line, an edge image showing only those contact locations can be included as correct answer data. Such learning data is useful for separating the locations where a plurality of drug images are in contact at a point or along a line.
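One possible way to obtain an edge image showing only the contact locations is sketched below. It assumes that each drug instance is available as a distinct non-zero label in a label image (0 for background) and treats a pixel as a contact location when a 4-neighbor belongs to a different drug; this definition and the name `contact_edge` are assumptions for illustration, not taken from the specification.

```python
from typing import List

LabelImage = List[List[int]]  # assumption: 0 = background, 1, 2, ... = drug instances


def contact_edge(labels: LabelImage) -> List[List[int]]:
    """Mark drug pixels that are 4-adjacent to a pixel of a different drug."""
    h, w = len(labels), len(labels[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            own = labels[y][x]
            if own == 0:
                continue  # background never counts as a contact location
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nx, ny = x + dx, y + dy
                if 0 <= nx < w and 0 <= ny < h:
                    other = labels[ny][nx]
                    if other != 0 and other != own:
                        out[y][x] = 1
                        break
    return out


# Two drugs (labels 1 and 2) touching along a vertical line.
labels = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [0, 0, 0, 0],
]
contact = contact_edge(labels)
```

Only the two columns on either side of the shared boundary are marked; the outer contours of the drugs against the background are left at 0.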

In the learning data creation device according to a twelfth aspect of the present invention, it is preferable that the drug is a drug that is at least partially transparent. A drug that is at least partially transparent is harder to extract than a drug that is entirely opaque, and less learning data is available for it, so generating learning data for such drugs is particularly effective.

In the learning data creation device according to a thirteenth aspect of the present invention, it is preferable that, when generating a learning image including a plurality of drugs, the learning image generation process of the processor arbitrarily arranges the drugs other than transparent drugs. This is because, for a drug that is at least partially transparent, an image in which its drug image is arbitrarily placed differs from an image actually captured with the transparent drug placed at the same position, owing to the positional relationship with the illumination light.

In the learning data creation device according to a fourteenth aspect of the present invention, it is preferable that the captured image is an image of a drug handled by the user's own pharmacy. This is because it is sufficient to be able to create learning data for the drugs that the pharmacy itself handles.

The invention according to a fifteenth aspect is learning data consisting of a pair of a learning image generated by arbitrarily arranging a photographed drug and correct answer data having second region information indicating the region of the drug in the learning image.

A machine learning device according to a sixteenth aspect of the present invention includes a learning model and a learning control unit that trains the learning model by machine learning using the above learning data.

In the machine learning device according to a seventeenth aspect of the present invention, it is preferable that the learning model is constituted by a convolutional neural network.

The invention according to an eighteenth aspect is a learning data creation method in which a processor creates learning data for machine learning by performing the following steps: a step of acquiring a captured image in which a drug is photographed; a step of generating, from the acquired captured image, a learning image in which the drug is arbitrarily arranged; a step of generating second region information corresponding to the region of the drug in the generated learning image and using the generated second region information as correct answer data for the learning image; and a step of storing a pair of the generated learning image and the correct answer data in a memory as learning data.

In the learning data creation method according to a nineteenth aspect of the present invention, it is preferable that the correct answer data includes at least one of a correct answer image corresponding to the region of the drug, bounding box information that encloses the region of the drug in a rectangle, and edge information indicating the edge of the region of the drug.

In the learning data creation method according to a twentieth aspect of the present invention, it is preferable that, when arranging a plurality of drugs, the step of generating the learning image brings all or some of the plurality of drugs into contact at a point or along a line.

In the learning data creation method according to a twenty-first aspect of the present invention, it is preferable that the correct answer data includes an edge image showing only the locations where the plurality of drugs are in contact at a point or along a line.

In the learning data creation method according to a twenty-second aspect of the present invention, it is preferable that the drug is a drug that is at least partially transparent.

In the learning data creation method according to a twenty-third aspect of the present invention, it is preferable that, when generating a learning image including a plurality of drugs, the step of generating the learning image arbitrarily arranges the drugs other than transparent drugs.

The invention according to a twenty-fourth aspect is a learning data creation program that causes a computer to realize: a function of acquiring a captured image in which a drug is photographed; a function of generating, from the acquired captured image, a learning image in which the drug is arbitrarily arranged; a function of generating second region information corresponding to the region of the drug in the generated learning image and using the generated second region information as correct answer data for the learning image; and a function of storing a pair of the generated learning image and the correct answer data in a memory as learning data.

According to the present invention, learning data for machine learning of a learning model that recognizes region information corresponding to the region of a drug from a captured image of the drug can be created efficiently.

FIG. 1 is a diagram showing a captured image input to a trained learning model and the output result to be obtained from the learning model.
FIG. 2 is a diagram showing an example of learning data.
FIG. 3 is a conceptual diagram showing image processing for creating correct answer data automatically.
FIG. 4 is a diagram showing a first embodiment in which learning data is created by simulation.
FIG. 5 is a diagram showing a second embodiment in which learning data is created by simulation.
FIG. 6 is a block diagram showing an example of the hardware configuration of the learning data creation device according to the present invention.
FIG. 7 is a plan view showing a medicine package in which a plurality of drugs are packaged together.
FIG. 8 is a block diagram showing a schematic configuration of the imaging device shown in FIG. 6.
FIG. 9 is a plan view showing a schematic configuration of the imaging device.
FIG. 10 is a side view showing a schematic configuration of the imaging device.
FIG. 11 is a block diagram showing an embodiment of the learning data creation device according to the present invention.
FIG. 12 is a diagram showing an example of a captured image acquired by the image acquisition unit and of first region information, acquired by the first region information acquisition unit, indicating the region of the drug in the captured image.
FIG. 13 is a diagram showing an example of learning data generated from the captured image and mask image shown in FIG. 12.
FIG. 14 is a diagram showing an example of an edge image showing only the locations where a plurality of drugs are in contact at a point or along a line.
FIG. 15 is a block diagram showing an embodiment of the machine learning device according to the present invention.
FIG. 16 is a flowchart showing an embodiment of the learning data creation method according to the present invention.

Hereinafter, preferred embodiments of the learning data creation device, method, program, learning data, and machine learning device according to the present invention will be described with reference to the accompanying drawings.

[Summary of the present invention]
FIG. 1 is a diagram showing a captured image input to a trained learning model and the output result to be obtained from the learning model.

The output result of the learning model shown in FIG. 1(B) is the result of inferring the region of the drug (drug region) in the captured image shown in FIG. 1(A); in this example, it is a mask image in which the drug region and the background region are classified by region. The inference result is not limited to a mask image and may be, for example, a bounding box enclosing the drug region in a rectangular frame, the coordinates of two diagonal corner points of the bounding box, or a combination of these.

To obtain a desired output result (inference result) from an arbitrary input image with a trained learning model, a large amount of learning data must be prepared for machine learning of the untrained learning model.

FIG. 2 is a diagram showing an example of learning data.

In each of FIGS. 2(A) to 2(C), the left side is a captured image of drugs (learning image) and the right side is the correct answer data for that captured image; each pair of a captured image on the left and correct answer data on the right constitutes learning data. The correct answer data in this example is a mask image that distinguishes the region of each drug from the background.

Learning data basically requires captured images such as those on the left side of FIGS. 2(A) to 2(C). However, some drugs, such as new drugs, exist only in small quantities, so there is a problem that many images cannot be collected for them.

Correct answer data for a captured image (learning image) is generally created by displaying the captured image on a display and having the user fill in the drug region pixel by pixel while viewing the displayed image.

When correct answer data is created automatically, it can be obtained, for example, by calculating the position and rotation angle of the drug in the captured image by template matching.

FIG. 3 is a conceptual diagram showing image processing for creating correct answer data automatically.

For a captured image ITP of a drug, a template image I_tpl, which is an image showing that drug, is prepared. If the shape of the drug is not circular, it is preferable to prepare a plurality of template images I_tpl, one for each rotation angle to be searched.

Then, the captured image ITP is searched for the position and the rotation-angle template image that give the highest correlation with the template image I_tpl (template matching). Based on the position of the template image I_tpl and the rotation angle of the template image I_tpl at which the correlation is highest, correct answer data indicating the region of the drug in the captured image ITP can be created.
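The search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the specification: images are grayscale lists of rows, the correlation measure is zero-mean normalized cross-correlation (the specification only says "correlation"), one template is supplied per rotation angle, and the drug region is taken to be the positive pixels of the best-matching template. The names `ncc`, `best_match`, and `mask_from_match` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Image = List[List[float]]


def ncc(patch: Image, template: Image) -> float:
    """Zero-mean normalized cross-correlation of two equally sized images."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    den = math.sqrt(
        sum((p - mean_a) ** 2 for p in a) * sum((q - mean_b) ** 2 for q in b)
    )
    return num / den if den else 0.0  # flat patches carry no correlation


def best_match(image: Image, templates: Dict[int, Image]) -> Tuple[float, int, int, int]:
    """Return (score, x, y, angle) of the highest-correlation placement."""
    best = (-2.0, 0, 0, 0)
    for angle, tpl in templates.items():
        th, tw = len(tpl), len(tpl[0])
        for y in range(len(image) - th + 1):
            for x in range(len(image[0]) - tw + 1):
                patch = [row[x:x + tw] for row in image[y:y + th]]
                score = ncc(patch, tpl)
                if score > best[0]:
                    best = (score, x, y, angle)
    return best


def mask_from_match(image: Image, tpl: Image, x: int, y: int) -> List[List[int]]:
    """Build a mask by stamping the template's positive pixels at (x, y)."""
    mask = [[0] * len(image[0]) for _ in image]
    for ty, row in enumerate(tpl):
        for tx, v in enumerate(row):
            if v > 0:
                mask[y + ty][x + tx] = 1
    return mask


# An oblong "drug" lying vertically in a 5x5 image, with one template per angle.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
templates = {
    0: [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
    90: [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
}
score, x, y, angle = best_match(image, templates)
mask = mask_from_match(image, templates[angle], x, y)
```

An exhaustive search like this is quadratic in image size; it is shown only to make the position-and-angle search concrete.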

Alternatively, the captured image ITP and the correct answer data (for example, a mask image) may be displayed superimposed on the display, and if the mask image contains an error, the user may correct the mask image pixel by pixel.

<First embodiment of creating learning data by simulation>
FIG. 4 is a diagram showing a first embodiment in which learning data is created by simulation.

FIG. 4(A) is a diagram showing a pair of a captured image of a drug and a mask image generated manually or automatically based on that captured image.

In the present invention, learning data is created by simulation from a pair of a captured image and a mask image (that is, the learning data is augmented).

FIG. 4(B) is a diagram showing a pair of a captured image and a mask image obtained by flipping the captured image and the mask image shown in FIG. 4(A), respectively.

The flipped (left-right flipped) mask image on the right side of FIG. 4(B) is a mask image indicating the region of the drug in the flipped captured image on the left side of FIG. 4(B). Therefore, the flipped captured image can be used as a new learning image, and the flipped mask image can be used as correct answer data for the newly generated learning image. Flipping of an image is not limited to left-right flipping and also includes up-down flipping. Alternatively, the left image may be created first, and the right image may then be created by detecting the region of the drug image in it.
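The flipping step can be illustrated with the following sketch, which applies the same flip to the captured image and its mask image so that the flipped mask remains valid correct answer data. Images are assumed to be lists of rows, and the function names are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]


def flip_left_right(img: Image) -> Image:
    """Reverse each row (left-right flip)."""
    return [row[::-1] for row in img]


def flip_up_down(img: Image) -> Image:
    """Reverse the row order (up-down flip)."""
    return [row[:] for row in img[::-1]]


def flipped_pair(image: Image, mask: Image) -> Tuple[Image, Image]:
    """Apply the same left-right flip to a captured image and its mask."""
    return flip_left_right(image), flip_left_right(mask)


# Background pixels are 0 here, so the drug pixels are exactly the non-zero ones.
image = [
    [9, 7, 0],
    [8, 0, 0],
]
mask = [
    [1, 1, 0],
    [1, 0, 0],
]
new_image, new_mask = flipped_pair(image, mask)
```

Because both arrays go through the identical operation, no new annotation is needed for the flipped learning image.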

FIG. 4(C) is a diagram showing images obtained by adding the images shown in FIG. 4(A) and FIG. 4(B).

The captured image on the left side of FIG. 4(C) can be created by combining the captured image shown in FIG. 4(A) with the flipped captured image shown in FIG. 4(B). That is, the captured image shown in FIG. 4(C) can be created by pasting, onto the captured image shown in FIG. 4(A), an image (drug image) obtained by cutting out the region of the drug from the flipped captured image shown in FIG. 4(B). The drug image can be cut out of the flipped captured image by a process of cutting out the region of the drug from the captured image shown in FIG. 4(A) based on the flipped mask image of FIG. 4(B). The method of combining two or more drug images is not limited to the method using mask images described above. For example, using a background image that contains only the background with no drug photographed, only the drug images may be extracted from FIG. 4(A) and FIG. 4(B), and each extracted drug image may be combined with the background image to generate the captured image shown in FIG. 4(C).

The mask image on the right side of FIG. 4(C) can be created by adding the mask image shown in FIG. 4(A) and the flipped mask image shown in FIG. 4(B). When the two mask images are added, for the purpose of instance separation, in this example the pixel value of the drug region in the flipped mask image of FIG. 4(C) is set to, for example, "0.5" and the pixel value of the background is set to "0" before the addition, so that the two drug regions in the generated mask image have different pixel values.
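The combining step can be sketched as below: the drug pixels of the second image are pasted onto the first image, and the two masks are added with the second drug region weighted by 0.5 so that the instances remain distinguishable. The sketch assumes binary masks held as lists of rows and rejects overlapping drug regions, which the specification does not discuss; the function name `composite` and the parameter `value_b` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]


def composite(
    image_a: Image,
    mask_a: Image,
    image_b: Image,
    mask_b: Image,
    value_b: float = 0.5,
) -> Tuple[Image, List[List[float]]]:
    """Paste drug B onto image A and add the masks with distinct pixel values."""
    h, w = len(image_a), len(image_a[0])
    out_image = [row[:] for row in image_a]
    out_mask = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask_a[y][x] and mask_b[y][x]:
                # Assumption of this sketch: the two drugs must not overlap.
                raise ValueError("drug regions overlap")
            if mask_b[y][x]:
                out_image[y][x] = image_b[y][x]  # cut out via mask B and paste
            out_mask[y][x] = float(mask_a[y][x]) + value_b * float(mask_b[y][x])
    return out_image, out_mask


# Image A and its left-right flipped copy B (background value 2, drug value 9).
image_a = [[9, 2, 2, 2], [9, 9, 2, 2]]
mask_a = [[1, 0, 0, 0], [1, 1, 0, 0]]
image_b = [row[::-1] for row in image_a]
mask_b = [row[::-1] for row in mask_a]

new_image, new_mask = composite(image_a, mask_a, image_b, mask_b)
```

In the resulting mask, pixels of the first drug are 1.0, pixels of the second drug are 0.5, and the background is 0.0.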

このようにして、図４（Ａ）に示した撮影画像とマスク画像のペアからなる１つの学習データから、図４（Ｂ）及び（Ｃ）に示した２つの学習データを作成することができる。 In this way, the two learning data shown in FIGS. 4(B) and (C) can be created from one learning data consisting of the pair of photographed image and mask image shown in FIG. 4(A). .

また、上記の第１実施形態では、図４（Ａ）に示した撮影画像及びマスク画像をそれぞれ反転し、図４（Ｂ）に示した新たな撮影画像及びマスク画像のペアからなる学習データを作成するようにしたが、これに限らず、図４（Ａ）に示した撮影画像及びマスク画像をそれぞれ同期して平行移動、回転、又は拡縮させて、新たな撮影画像及びマスク画像のペアからなる学習データを作成してもよい。尚、撮影画像及びマスク画像をそれぞれ同期して平行移動、回転、又は縮小させることで、背景に余白が生じる場合には、背景と同様な画素値で余白を埋めることが好ましい。 In the first embodiment described above, the captured image and the mask image shown in FIG. 4(A) are each inverted to create learning data consisting of the new captured image and mask image pair shown in FIG. 4(B). However, the invention is not limited to this: the captured image and the mask image shown in FIG. 4(A) may be translated, rotated, or scaled in synchronization with each other to create learning data consisting of a new captured image and mask image pair. If a blank area arises in the background as a result of synchronously translating, rotating, or reducing the captured image and the mask image, it is preferable to fill the blank area with pixel values similar to those of the background.
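As a hedged illustration of the synchronized translation with background filling, the following sketch shifts an image and its mask by the same offset. The image is padded with an assumed background value and the mask with 0. The helper names `translate` and `translate_pair` are hypothetical; rotation and scaling would follow the same pattern but need interpolation, which is not shown.

```python
def translate(grid, dx, dy, fill):
    """Shift a 2-D grid by (dx, dy); cells left uncovered receive `fill`."""
    h, w = len(grid), len(grid[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = grid[y][x]
    return out


def translate_pair(image, mask, dx, dy, background):
    """Apply the same shift to image and mask so the pair stays aligned."""
    return (translate(image, dx, dy, background),
            translate(mask, dx, dy, 0))


# Toy pair: background intensity 7, one drug pixel of intensity 5.
image = [[5, 7, 7], [7, 7, 7]]
mask = [[1, 0, 0], [0, 0, 0]]
new_image, new_mask = translate_pair(image, mask, dx=1, dy=1, background=7)
```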

更に、上記の第１実施形態では、１つの薬剤が撮影された撮影画像及びマスク画像から新たな撮影画像及びマスク画像を作成するが、複数の異なる薬剤が別々に撮影された複数の撮影画像及びマスク画像、又は複数の異なる薬剤が同時に撮影された撮影画像及びマスク画像から、新たな撮影画像及びマスク画像を作成するようにしてもよい。 Furthermore, in the first embodiment described above, a new captured image and mask image are created from a captured image and a mask image in which one drug is photographed. However, a new captured image and mask image may also be created from a plurality of captured images and mask images in which a plurality of different drugs are photographed separately, or from a captured image and a mask image in which a plurality of different drugs are photographed together.

＜シミュレーションにより学習データを作成する第２実施形態＞ <Second embodiment of creating learning data through simulation>
図５は、シミュレーションにより学習データを作成する第２実施形態を示す図である。 FIG. 5 is a diagram showing a second embodiment in which learning data is created by simulation.

図５（Ａ）は、薬剤を撮影した撮影画像と、その撮影画像に基づいて手動又は自動で生成したマスク画像とのペアを示す図であり、図４（Ａ）に示したペアと同一である。 FIG. 5(A) is a diagram showing a pair of a captured image of a drug and a mask image generated manually or automatically from that captured image; the pair is identical to the one shown in FIG. 4(A).

図５（Ｂ）は、それぞれ図５（Ａ）に示した撮影画像及びマスク画像からそれぞれ切り出す薬剤領域を示す図である。 FIG. 5(B) is a diagram showing drug regions cut out from the captured image and mask image shown in FIG. 5(A), respectively.

本例では、薬剤領域を囲む矩形の枠内の領域を、画像を切り出す領域（切出領域）としている。尚、マスク画像により薬剤領域は既知であるため、マスク画像に基づいて薬剤領域を囲む矩形の枠内の画像を切り出すことができる。 In this example, the area within the rectangular frame surrounding the drug area is set as the area from which the image is cut out (cutout area). Note that since the drug area is known from the mask image, an image within a rectangular frame surrounding the drug area can be cut out based on the mask image.

図５（Ｃ）は、それぞれ図５（Ａ）に示した撮影画像及びマスク画像から切り出された切出領域の画像を示す。撮影画像からの薬剤画像の切り出しは、図５（Ａ）に示したマスク画像に基づいて、図５（Ａ）に示した撮影画像から薬剤の領域を切り出す処理（薬剤画像取得処理）により行うことができる。尚、図５（Ａ）に示したマスク画像は、薬剤の領域を示す情報（第１領域情報）を有するため、マスク画像から薬剤の領域（以下、「薬剤マスク画像」という）を切り出すことができる。また、薬剤画像取得処理には、切り出された後の状態の画像を、メモリ等から読み出す処理が含まれていてもよい。 FIG. 5(C) shows the images of the cutout areas cut out from the captured image and the mask image shown in FIG. 5(A), respectively. The drug image can be cut out of the captured image by a process (drug image acquisition process) that cuts out the drug region from the captured image shown in FIG. 5(A) on the basis of the mask image shown in FIG. 5(A). Since the mask image shown in FIG. 5(A) carries information indicating the drug region (first area information), the drug region (hereinafter referred to as the "drug mask image") can also be cut out of the mask image. The drug image acquisition process may also include a process of reading an already cut-out image from a memory or the like.
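The cut-out step can be sketched as follows, assuming the mask is a 2-D list with nonzero values inside the drug region and a single drug per image. The rectangular cutout area is taken as the tightest axis-aligned box around the nonzero mask pixels. `bounding_box` and `crop` are hypothetical helper names.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right), inclusive, of the nonzero pixels."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, m in enumerate(row) if m]
    if not ys:
        raise ValueError("mask contains no drug pixels")
    return ys[0], min(xs), ys[-1], max(xs)


def crop(grid, box):
    """Cut the rectangle described by `box` out of a 2-D grid."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in grid[top:bottom + 1]]


image = [[0, 0, 0, 0],
         [0, 8, 9, 0],
         [0, 3, 7, 0],
         [0, 0, 0, 0]]
mask = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0]]

box = bounding_box(mask)        # rectangle enclosing the drug region
drug_image = crop(image, box)   # cut out of the captured image
drug_mask = crop(mask, box)     # cut out of the mask image with the same box
```

Cropping both images with the same box keeps the drug image and the drug mask image pixel-aligned, which the later pasting step relies on.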

図５（Ｄ）は、切り出された薬剤画像及び薬剤マスク画像を任意の位置及び任意の回転角で貼り付けて作成した、新たな撮影画像及びマスク画像を示す図である。 FIG. 5(D) is a diagram showing a new captured image and a mask image created by pasting the cut out drug image and drug mask image at an arbitrary position and an arbitrary rotation angle.

図５（Ｄ）に示す撮影画像及びマスク画像は、図５（Ａ）に示した撮影画像及びマスク画像から上記の画像処理により作成した、新たな学習用画像及び正解データのペアからなる学習データとなる。 The captured image and mask image shown in FIG. 5(D) constitute learning data consisting of a new pair of a learning image and correct answer data, created by the above image processing from the captured image and mask image shown in FIG. 5(A).
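A minimal sketch of the pasting step is given below. For simplicity it assumes rotation in 90-degree steps only; arbitrary rotation angles, as the text allows, would require interpolation and are not shown. The background canvas is assumed to be all zeros, and `rotate90` and `paste` are hypothetical names.

```python
def rotate90(grid):
    """Rotate a 2-D grid 90 degrees clockwise."""
    return [list(col) for col in zip(*grid[::-1])]


def paste(canvas, patch, patch_mask, top, left):
    """Copy patch pixels onto canvas (in place) where patch_mask is nonzero."""
    for y, mask_row in enumerate(patch_mask):
        for x, m in enumerate(mask_row):
            if m:
                canvas[top + y][left + x] = patch[y][x]


# Cut-out drug image and drug mask image (a 1x2 horizontal tablet).
drug_image = [[8, 9]]
drug_mask = [[1, 1]]

# Apply the same rotation to both, then paste both at the same position.
rot_image = rotate90(drug_image)
rot_mask = rotate90(drug_mask)

new_image = [[0] * 3 for _ in range(3)]
new_mask = [[0] * 3 for _ in range(3)]
paste(new_image, rot_image, rot_mask, top=0, left=1)
paste(new_mask, rot_mask, rot_mask, top=0, left=1)
```

Using the rotated mask as the stencil for both paste calls means only drug pixels are written, so background pixels inside the rectangular cutout do not overwrite the canvas.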

図５（Ｅ）は、切り出された薬剤画像及び薬剤マスク画像を任意の位置及び任意に回転角で貼り付けて作成した、新たな撮影画像及びマスク画像を示す図であり、特に複数の薬剤画像が点又は線で接触するように作成されている。 FIG. 5(E) is a diagram showing a new captured image and mask image created by pasting the cut-out drug images and drug mask images at arbitrary positions and arbitrary rotation angles; in particular, they are created so that a plurality of drug images touch at a point or along a line.

学習モデルにおける推論結果を向上させるためには、薬剤同士が点又は線で接触している状態の学習データを大量に作成する必要がある。薬剤同士が点又は線で接触している撮影画像から、各薬剤の領域を精度よく推論するのは、各薬剤が接触せずに孤立している場合に比べて難しいからである。 In order to improve the inference results of the learning model, it is necessary to create a large amount of learning data in which drugs are in contact with each other at points or lines. This is because it is more difficult to accurately infer the area of each drug from a photographed image in which the drugs are in contact with each other at points or lines than when the drugs are isolated without contact.

図５（Ｅ）に示す左側のマスク画像は、各薬剤領域が接しないように画像処理することが好ましい。各薬剤領域が接触する箇所は既知であるため、その接触する箇所を背景色に置換することで、各薬剤領域が接触しないようにできる。 The mask image on the left side shown in FIG. 5(E) is preferably image-processed so that the respective drug regions do not touch each other. Since the locations where each drug area contacts are known, by replacing the contact locations with the background color, it is possible to prevent each drug area from contacting each other.

また、薬剤同士が点又は線で接触する各薬剤が同一薬剤の場合、インスタンス分離のために、薬剤領域の画素値を異ならせることが好ましい。この場合、マスク画像における各薬剤領域は、その画素値の違いで認識できるため、薬剤領域が接触する箇所を背景色に置換しなくてもよい。 Furthermore, if the drugs that are in contact with each other at points or lines are the same drug, it is preferable to make the pixel values of the drug regions different for instance separation. In this case, since each drug region in the mask image can be recognized by the difference in pixel values, there is no need to replace the portions where the drug regions come into contact with the background color.
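The background-replacement option can be sketched as follows, under the assumption that the generator keeps an integer instance-label map (0 for background, 1, 2, ... for each pasted drug) while pasting, so the contact locations are known. A pixel is treated as a contact pixel when one of its 4-neighbours carries a different nonzero label; purely diagonal contacts would need an 8-neighbour check, which is not shown. `carve_contacts` is a hypothetical name.

```python
def carve_contacts(labels):
    """Build a binary mask from a label map, setting contact pixels to 0."""
    h, w = len(labels), len(labels[0])
    out = [[1 if v else 0 for v in row] for row in labels]
    for y in range(h):
        for x in range(w):
            v = labels[y][x]
            if not v:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                inside = 0 <= ny < h and 0 <= nx < w
                if inside and labels[ny][nx] not in (0, v):
                    out[y][x] = 0  # replace the contact location with background
                    break
    return out


# Two drugs (labels 1 and 2) touching along a vertical line.
labels = [[1, 1, 2, 2],
          [1, 1, 2, 2]]
separated = carve_contacts(labels)
```

When the instances are instead given different pixel values, as in the alternative described above, this carving step can be skipped.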

以上のようにして、薬剤が撮影された撮影画像とその撮影画像内の薬剤の領域を示す第１領域情報（マスク画像）とを元に、多くの学習データを作成することができる。 As described above, a large amount of learning data can be created based on a photographed image in which a drug is photographed and the first area information (mask image) indicating the region of the drug in the photographed image.

［学習データ作成装置の構成］ [Configuration of learning data creation device]
図６は、本発明に係る学習データ作成装置のハードウェア構成の一例を示すブロック図である。 FIG. 6 is a block diagram showing an example of the hardware configuration of the learning data creation device according to the present invention.

図６に示す学習データ作成装置１は、例えば、コンピュータにより構成することができ、主として画像取得部２２、ＣＰＵ（Central Processing Unit）２４、操作部２５、ＲＡＭ(Random Access Memory)２６、ＲＯＭ(Read Only Memory)２７、メモリ２８及び表示部２９から構成されている。 The learning data creation device 1 shown in FIG. 6 can be configured by, for example, a computer, and mainly includes an image acquisition unit 22, a CPU (Central Processing Unit) 24, an operation unit 25, a RAM (Random Access Memory) 26, a ROM (Read Only Memory) 27, a memory 28, and a display unit 29.

画像取得部２２は、撮影装置１０により薬剤が撮影された撮影画像を、撮影装置１０から取得する。 The image acquisition unit 22 acquires, from the photographing device 10, a photographed image of the drug taken by the photographing device 10.

撮影装置１０により撮影される薬剤は、例えば、服用１回分の薬剤、又は任意の薬剤であり、薬包に入っているものでもよいし、薬包に入っていないものでもよい。 The medicine photographed by the photographing device 10 is, for example, a single dose of medicine or any medicine, and may be contained in a medicine package or not contained in a medicine package.

図７は、複数の薬剤が一包化された薬包を示す平面図である。 FIG. 7 is a plan view showing a medicine package containing a plurality of medicines.

図７に示す薬包ＴＰは、１回に服用される複数の薬剤が透明な包に収納され、一包ずつパッキングされたものである。薬包ＴＰは、図９及び図１０に示すように帯状に連結されており、各薬包ＴＰを切り離し可能にする切取線が入っている。尚、図７に示す薬包ＴＰには、６個の薬剤Ｔが一包化されている。 In the medicine package TP shown in FIG. 7, a plurality of medicines to be taken at one time are stored in a transparent package and packed one by one. As shown in FIGS. 9 and 10, the medicine packages TP are connected in a band shape, and each medicine package TP has a cutoff line that allows the medicine packages TP to be separated. Note that six medicines T are packaged in the medicine package TP shown in FIG. 7 .

図８は、図６に示した撮影装置の概略構成を示すブロック図である。 FIG. 8 is a block diagram showing a schematic configuration of the imaging device shown in FIG. 6.

図８に示す撮影装置１０は、薬剤を撮影する２台のカメラ１２Ａ、１２Ｂと、薬剤を照明する２台の照明装置１６Ａ，１６Ｂと、撮影制御部１３とから構成されている。 The photographing device 10 shown in FIG. 8 includes two cameras 12A and 12B that photograph the medicine, two illumination devices 16A and 16B that illuminate the medicine, and a photographing control section 13.

図９及び図１０は、それぞれ撮影装置の概略構成を示す平面図及び側面図である。 9 and 10 are a plan view and a side view, respectively, showing the schematic configuration of the photographing device.

薬包ＴＰは、水平（ｘ－ｙ平面）に設置された透明なステージ１４の上に載置される。 The medicine package TP is placed on a transparent stage 14 installed horizontally (xy plane).

カメラ１２Ａ、１２Ｂは、ステージ１４と直交する方向（ｚ方向）に、ステージ１４を挟んで互いに対向して配置される。カメラ１２Ａは、薬包ＴＰの表面に正対し、薬包ＴＰを上方から撮影する。カメラ１２Ｂは、薬包ＴＰの裏面に正対し、薬包ＴＰを下方から撮影する。 The cameras 12A and 12B are arranged facing each other with the stage 14 in between, in a direction (z direction) orthogonal to the stage 14. The camera 12A directly faces the surface of the medicine package TP and photographs the medicine package TP from above. The camera 12B directly faces the back side of the medicine package TP and photographs the medicine package TP from below.

ステージ１４を挟んで、カメラ１２Ａの側には、照明装置１６Ａが備えられ、カメラ１２Ｂの側には、照明装置１６Ｂが備えられる。 A lighting device 16A is provided on the side of the camera 12A with the stage 14 in between, and a lighting device 16B is provided on the side of the camera 12B.

照明装置１６Ａは、ステージ１４の上方に配置され、ステージ１４に載置された薬包ＴＰを上方から照明する。照明装置１６Ａは、放射状に配置された４つの発光部１６Ａ１～１６Ａ４を有し、直交する４方向から照明光を照射する。各発光部１６Ａ１～１６Ａ４の発光は、個別に制御される。 The illumination device 16A is arranged above the stage 14 and illuminates the medicine package TP placed on the stage 14 from above. The illumination device 16A has four light emitting sections 16A1 to 16A4 arranged radially, and emits illumination light from four orthogonal directions. Light emission from each of the light emitting sections 16A1 to 16A4 is individually controlled.

照明装置１６Ｂは、ステージ１４の下方に配置され、ステージ１４に載置された薬包ＴＰを下方から照明する。照明装置１６Ｂは、照明装置１６Ａと同様に放射状に配置された４つの発光部１６Ｂ１～１６Ｂ４を有し、直交する４方向から照明光を照射する。各発光部１６Ｂ１～１６Ｂ４の発光は、個別に制御される。 The illumination device 16B is arranged below the stage 14 and illuminates the medicine package TP placed on the stage 14 from below. The lighting device 16B has four light emitting parts 16B1 to 16B4 arranged radially like the lighting device 16A, and emits illumination light from four orthogonal directions. The light emission of each light emitting section 16B1 to 16B4 is individually controlled.

撮影は、次のように行われる。まず、カメラ１２Ａを用いて、薬包ＴＰを上方から撮影する。撮影の際には、照明装置１６Ａの各発光部１６Ａ１～１６Ａ４を順次発光させ、４枚の画像の撮影を行い、続いて、各発光部１６Ａ１～１６Ａ４を同時に発光させ、１枚の画像の撮影を行う。次に、下方の照明装置１６Ｂの各発光部１６Ｂ１～１６Ｂ４を同時に発光させるとともに、図示しないリフレクタを挿入し、リフレクタを介して薬包ＴＰを下から照明し、カメラ１２Ａを用いて上方から薬包ＴＰの撮影を行う。 Photographing is performed as follows. First, the medicine package TP is photographed from above using the camera 12A. The light emitting sections 16A1 to 16A4 of the illumination device 16A are made to emit light one after another to capture four images, and then all of the light emitting sections 16A1 to 16A4 are made to emit light simultaneously to capture one image. Next, the light emitting sections 16B1 to 16B4 of the lower illumination device 16B are made to emit light simultaneously, a reflector (not shown) is inserted so that the medicine package TP is illuminated from below via the reflector, and the medicine package TP is photographed from above using the camera 12A.

各発光部１６Ａ１～１６Ａ４を順次発光させて撮影される４枚の画像は、それぞれ照明方向が異なっており、薬剤の表面に刻印（凹凸）がある場合に刻印による影の出方が異なるものとなる。これらの４枚の撮影画像は、薬剤Ｔの表面側の刻印を強調した刻印画像を生成するために使用される。 The four images captured while the light emitting sections 16A1 to 16A4 emit light one after another differ in illumination direction, so that when the surface of a drug has an engraving (unevenness), the shadows cast by the engraving appear differently in each image. These four captured images are used to generate an engraving image in which the engraving on the front side of the drug T is emphasized.

各発光部１６Ａ１～１６Ａ４を同時に発光させて撮影される１枚の画像は、輝度ムラのない画像であり、例えば、薬剤Ｔの表面側の画像（薬剤画像）を切り出す場合に使用され、また、刻印画像が重畳される撮影画像である。 One image taken by simultaneously emitting light from each of the light emitting units 16A1 to 16A4 is an image without uneven brightness, and is used, for example, when cutting out an image of the surface side of the drug T (drug image), and This is a photographed image on which a stamped image is superimposed.

また、リフレクタを介して薬包ＴＰを下方から照明し、カメラ１２Ａを用いて上方から薬包ＴＰが撮影される画像は、複数の薬剤Ｔの領域を認識する場合に使用される撮影画像である。 The image obtained by illuminating the medicine package TP from below via the reflector and photographing it from above using the camera 12A is the captured image used for recognizing the regions of the plurality of drugs T.

次に、カメラ１２Ｂを用いて、薬包ＴＰを下方から撮影する。撮影の際には、照明装置１６Ｂの各発光部１６Ｂ１～１６Ｂ４を順次発光させ、４枚の画像の撮影を行い、続いて、各発光部１６Ｂ１～１６Ｂ４を同時に発光させ、１枚の画像の撮影を行う。 Next, the medicine package TP is photographed from below using the camera 12B. The light emitting sections 16B1 to 16B4 of the illumination device 16B are made to emit light one after another to capture four images, and then all of the light emitting sections 16B1 to 16B4 are made to emit light simultaneously to capture one image.

４枚の撮影画像は、薬剤Ｔの裏面側の刻印を強調した刻印画像を生成するために使用され、各発光部１６Ｂ１～１６Ｂ４を同時に発光させて撮影される１枚の画像は、輝度ムラのない画像であり、例えば、薬剤Ｔの裏面側の薬剤画像を切り出す場合に使用され、また、刻印画像が重畳される撮影画像である。 The four captured images are used to generate an engraving image in which the engraving on the back side of the drug T is emphasized. The single image captured while the light emitting sections 16B1 to 16B4 emit light simultaneously is an image free of uneven brightness; it is used, for example, when cutting out the drug image of the back side of the drug T, and it is also the captured image on which the engraving image is superimposed.

図８に示した撮影制御部１３は、カメラ１２Ａ、１２Ｂ、及び照明装置１６Ａ、１６Ｂを制御し、１つの薬包ＴＰに対して１１回の撮影（カメラ１２Ａで６回、カメラ１２Ｂで５回の撮影）を行わせる。 The photographing control unit 13 shown in FIG. 8 controls the cameras 12A and 12B and the illumination devices 16A and 16B so that one medicine package TP is photographed 11 times (six times with the camera 12A and five times with the camera 12B).
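The 11-shot sequence can be written out as a small schedule to make the count explicit. This is only an illustrative sketch: the `Shot` record, its field names, and the purpose strings are hypothetical, while the camera and light emitting section identifiers follow the reference numerals in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    camera: str       # "12A" (from above) or "12B" (from below)
    emitters: tuple   # light emitting sections that are lit for this shot
    purpose: str      # hypothetical label for what the image is used for


def build_schedule():
    """List the shots in the order described: camera 12A first, then 12B."""
    upper = tuple(f"16A{i}" for i in range(1, 5))
    lower = tuple(f"16B{i}" for i in range(1, 5))
    shots = [Shot("12A", (e,), "engraving") for e in upper]    # 4 sequential
    shots.append(Shot("12A", upper, "uniform"))                # 1 simultaneous
    shots.append(Shot("12A", lower, "backlit via reflector"))  # region recognition
    shots += [Shot("12B", (e,), "engraving") for e in lower]   # 4 sequential
    shots.append(Shot("12B", lower, "uniform"))                # 1 simultaneous
    return shots


schedule = build_schedule()
shots_per_camera = {cam: sum(s.camera == cam for s in schedule)
                    for cam in ("12A", "12B")}
```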

また、撮影は暗室の状態で行われ、撮影の際に薬包ＴＰに照射される光は、照明装置１６Ａ、又は照明装置１６Ｂからの照明光のみである。したがって、上記のようにして撮影される１１枚の撮影画像のうち、リフレクタを介して薬包ＴＰを下方から照明し、カメラ１２Ａを用いて上方から薬包ＴＰを撮影した画像は、背景が光源の色（白色）になり、各薬剤Ｔの領域が遮光されて黒くなる。一方、他の１０枚の撮影画像は、背景が黒く、各薬剤の領域が薬剤の色になる。 Furthermore, the photographing is performed in a dark room, and the only light that is irradiated onto the medicine package TP during photographing is the illumination light from the illumination device 16A or the illumination device 16B. Therefore, among the 11 photographed images taken as described above, the image in which the medicine packet TP is illuminated from below through the reflector and the medicine packet TP is photographed from above using the camera 12A has the background as the light source. (white), and the area of each drug T is shielded from light and becomes black. On the other hand, in the other 10 photographed images, the background is black and the area of each drug is the color of the drug.

尚、リフレクタを介して薬包ＴＰを下方から照明し、カメラ１２Ａを用いて上方から薬包ＴＰを撮影した画像であっても、薬剤全体が透明（半透明）な透明薬剤、あるいは一部又は全部が透明なカプセルに粉末又は顆粒状の医薬が充填されたカプセル剤（一部が透明な薬剤）の場合、薬剤の領域から光が透過するため、不透明な薬剤のように真っ黒にならない。 Note that even if the image is obtained by illuminating the drug package TP from below through a reflector and photographing the drug package TP from above using the camera 12A, the drug may be entirely transparent (semi-transparent), or only a portion of the drug may be transparent. In the case of capsules that are completely transparent and filled with powder or granular medicine (partially transparent medicines), light passes through the medicine area, so the capsules do not turn pitch black like opaque medicines.

図６に戻って、学習データ作成装置１は、薬剤が撮影された撮影画像から薬剤を推論（特に撮影画像内に存在する各薬剤Ｔの領域を推論）する学習モデルを、機械学習させるための学習データを作成するものである。 Returning to FIG. 6, the learning data creation device 1 is configured to perform machine learning on a learning model that infers a drug from a captured image in which the drug is captured (in particular, infers the region of each drug T present in the captured image). This is to create learning data.

したがって、学習データ作成装置１の画像取得部２２は、撮影装置１０により撮影される１１枚の撮影画像のうちの、複数の薬剤Ｔの領域を認識する場合に使用される撮影画像（即ち、リフレクタを介して薬包ＴＰを下方から照明し、カメラ１２Ａを用いて上方から薬包ＴＰを撮影した撮影画像）を取得することが好ましい。 Therefore, it is preferable that the image acquisition unit 22 of the learning data creation device 1 acquires, from among the 11 captured images taken by the photographing device 10, the captured image used for recognizing the regions of the plurality of drugs T (that is, the captured image obtained by illuminating the medicine package TP from below via the reflector and photographing it from above using the camera 12A).

メモリ２８は、学習データを記憶する記憶部分であり、例えば、ハードディスク装置、フラッシュメモリ等の不揮発性メモリである。 The memory 28 is a storage part that stores learning data, and is, for example, a nonvolatile memory such as a hard disk device or a flash memory.

ＣＰＵ２４は、ＲＡＭ２６を作業領域とし、ＲＯＭ２７又はメモリ２８に記憶された学習データ作成プログラムを含む各種のプログラムを使用し、プログラムを実行することで本装置の各種の処理を実行する。 The CPU 24 uses the RAM 26 as a work area, uses various programs including a learning data creation program stored in the ROM 27 or the memory 28, and executes various processes of the apparatus by executing the programs.

操作部２５は、キーボード、ポインティングデバイス（マウス等）を含み、ユーザの操作により各種の情報や指示を入力する部分である。 The operation unit 25 includes a keyboard and a pointing device (mouse, etc.), and is a part through which various information and instructions are input by user operations.

表示部２９は、操作部２５での操作に必要な画面を表示し、ＧＵＩ（Graphical User Interface）を実現する部分として機能し、また、撮影画像等を表示することができる。 The display unit 29 displays a screen necessary for operations on the operation unit 25, functions as a part that implements a GUI (Graphical User Interface), and can also display photographed images and the like.

［学習データ作成装置の実施形態］ [Embodiment of learning data creation device]
図１１は、本発明に係る学習データ作成装置の実施形態を示すブロック図である。 FIG. 11 is a block diagram showing an embodiment of a learning data creation device according to the present invention.

図１１に示す学習データ作成装置１は、図６に示した学習データ作成装置１のハードウェア構成により実行される機能を示す機能ブロック図であり、プロセッサ２とメモリ２８とを備えている。 The learning data creation device 1 shown in FIG. 11 is a functional block diagram showing functions executed by the hardware configuration of the learning data creation device 1 shown in FIG. 6, and includes a processor 2 and a memory 28.

プロセッサ２は、図６に示した画像取得部２２、ＣＰＵ２４、ＲＡＭ２６、ＲＯＭ２７、及びメモリ２８等から構成され、以下に示す各種の処理を行う。 The processor 2 includes the image acquisition unit 22, CPU 24, RAM 26, ROM 27, memory 28, etc. shown in FIG. 6, and performs various processes shown below.

プロセッサ２は、取得部２０、学習用画像生成部３０、正解データ生成部３２、及び記憶制御部３４として機能する。 The processor 2 functions as an acquisition unit 20 , a learning image generation unit 30 , a correct data generation unit 32 , and a storage control unit 34 .

取得部２０は、画像取得部２２及び第１領域情報取得部２３を備えている。 The acquisition unit 20 includes an image acquisition unit 22 and a first area information acquisition unit 23.

画像取得部２２は、前述したように撮影装置１０から薬剤Ｔを撮影した撮影画像ＩＴＰを取得する（撮影画像の取得処理を行う）。 The image acquisition unit 22 acquires the photographed image ITP of the drug T from the photographing device 10 as described above (performs the photographed image acquisition process).

第１領域情報取得部２３は、画像取得部２２が取得した撮影画像ＩＴＰ内の薬剤の領域を示す情報（第１領域情報）を取得する。この第１領域情報は、撮影画像を学習モデルの機械学習用の入力画像とした場合、学習モデルが推論する推論結果に対する正解データである。尚、正解データである第１領域情報としては、撮影画像内の薬剤の領域を示す正解画像（例えば、マスク画像）、薬剤の領域を矩形で囲むバウンディングボックス情報、及び薬剤の領域のエッジを示すエッジ情報の少なくとも１つを含むことが好ましい。 The first area information acquisition unit 23 acquires information (first area information) indicating the area of the drug in the captured image ITP acquired by the image acquisition unit 22. This first region information is correct data for the inference result inferred by the learning model when the captured image is used as an input image for machine learning of the learning model. Note that the first area information, which is the correct data, includes a correct image (for example, a mask image) that indicates the area of the drug in the captured image, bounding box information that surrounds the area of the drug with a rectangle, and information that indicates the edges of the area of the drug. It is preferable that at least one piece of edge information is included.

図１２は、画像取得部が取得する撮影画像及び第１領域情報取得部が取得する撮影画像内の薬剤の領域を示す第１領域情報の一例を示す図である。 FIG. 12 is a diagram illustrating an example of a photographed image acquired by the image acquisition unit and first area information indicating a drug area in the photographed image acquired by the first area information acquisition unit.

図１２（Ａ）に示す撮影画像ＩＴＰは、リフレクタを介して薬包ＴＰを下方から照明し、カメラ１２Ａを用いて上方から薬包ＴＰ（図７参照）を撮影した画像である。この薬包ＴＰには、６個の薬剤Ｔ１～Ｔ６が一包化されている。 The captured image ITP shown in FIG. 12(A) is an image obtained by illuminating the medicine package TP from below through a reflector and photographing the medicine package TP (see FIG. 7) from above using the camera 12A. This medicine package TP contains six medicines T1 to T6.

図１２（Ａ）に示す薬剤Ｔ１～Ｔ３は、下方からの照明光を遮光する不透明な薬剤であるため、黒く撮影されている。薬剤Ｔ４は、透明薬剤であるため、下方からの照明光が透過して白く撮影されている。薬剤Ｔ５、Ｔ６は、同一種類のカプセル剤であり、下方からの照明光の一部が漏れるため、部分的に僅かに白く撮影されている。 The drugs T1 to T3 shown in FIG. 12(A) are photographed in black because they are opaque drugs that block illumination light from below. Since the drug T4 is a transparent drug, the illumination light from below is transmitted through it and the drug is photographed as white. Drugs T5 and T6 are the same type of capsules, and some of the illumination light from below leaks, so some parts are photographed slightly white.

図１２（Ｂ）は、撮影画像ＩＴＰ内の各薬剤Ｔ１～Ｔ６の領域を示す第１領域情報であり、本例ではマスク画像ＩＭである。 FIG. 12(B) is first area information indicating the area of each drug T1 to T6 in the photographed image ITP, which is a mask image IM in this example.

マスク画像ＩＭは、例えば、撮影画像ＩＴＰを表示部２９に表示させ、表示部２９に表示された撮影画像ＩＴＰを見ながら、ユーザがマウス等のポインティングデバイスを使用して各薬剤Ｔ１～Ｔ６の領域を画素単位で塗り潰すことで作成することができる。例えば、塗り潰した各薬剤Ｔ１～Ｔ６の領域の画素値を「１」、背景の領域の画素値を「０」とすることで、２値化したマスク画像ＩＭを作成することができる。 The mask image IM is created by, for example, displaying the captured image ITP on the display unit 29, and while viewing the captured image ITP displayed on the display unit 29, the user uses a pointing device such as a mouse to select the area of each drug T1 to T6. It can be created by filling in pixel by pixel. For example, a binarized mask image IM can be created by setting the pixel value of each of the filled-in areas of the drugs T1 to T6 to "1" and the pixel value of the background area to "0".

尚、カプセル状の薬剤Ｔ５、Ｔ６は、同一種類であるが、インスタンス分離のために、両者の薬剤Ｔ５、Ｔ６の領域の画素値を異ならせてことが好ましい。例えば、薬剤Ｔ５の領域の画素値を「１」、薬剤Ｔ６の領域の画素値を「０．５」とすることができる。 Note that although the capsule-shaped medicines T5 and T6 are of the same type, it is preferable that the pixel values of the regions of the two medicines T5 and T6 be made different for instance separation. For example, the pixel value of the drug T5 region can be set to "1", and the pixel value of the drug T6 region can be set to "0.5".

上記の例では、第１領域情報であるマスク画像ＩＭは、撮影画像ＩＴＰ内の各薬剤Ｔ１～Ｔ６の領域をユーザがポインティングデバイスを使用して手動で設定することで生成される領域情報であるが、これに限らず、撮影画像内の薬剤の領域を画像処理により自動で抽出して生成したものでもよいし、撮影画像内の薬剤の領域を画像処理により自動で抽出し、かつ手動で調整することで生成されたものでもよい。 In the above example, the mask image IM serving as the first area information is area information generated by the user manually setting the region of each of the drugs T1 to T6 in the captured image ITP with a pointing device. However, the invention is not limited to this: the mask image may be generated by automatically extracting the drug regions in the captured image through image processing, or by automatically extracting the drug regions through image processing and then adjusting them manually.

図１１に戻って、学習用画像生成部３０は、画像取得部２２から薬剤を撮影した撮影画像ＩＴＰを入力し、入力した撮影画像ＩＴＰから薬剤を任意に配置した学習用画像（Ｉ_Ａ，Ｉ_Ｂ，Ｉ_Ｃ，…）を生成する。即ち、学習用画像生成部３０は、撮影画像ＩＴＰに基づいて複数の学習用画像（Ｉ_Ａ，Ｉ_Ｂ，Ｉ_Ｃ，…）を生成する学習用画像生成処理を行う。 Returning to FIG. 11, the learning image generation unit 30 receives the captured image ITP of the drugs from the image acquisition unit 22 and generates, from the received captured image ITP, learning images (I_A, I_B, I_C, ...) in which the drugs are arbitrarily arranged. That is, the learning image generation unit 30 performs a learning image generation process that generates a plurality of learning images (I_A, I_B, I_C, ...) on the basis of the captured image ITP.

撮影画像ＩＴＰに撮影されている薬剤の任意の配置は、ユーザがポインディンデバイスにより薬剤画像の位置や回転を指示して行うようにしてもよいし、図４を用いて説明したように撮影画像の反転や加算等により行うようにしてもよい。また、乱数を使用してランダムに薬剤画像の位置や回転を決定して、薬剤を任意に配置してもよい。この場合、薬剤画像が重ならないようにする必要がある。 Arbitrary placement of the drug photographed in the photographed image ITP may be performed by the user instructing the position and rotation of the drug image using a pointing device, or as explained using FIG. This may also be done by inverting or adding. Alternatively, the position and rotation of the drug image may be randomly determined using random numbers, and the drug may be arbitrarily placed. In this case, it is necessary to prevent drug images from overlapping.
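Random placement without overlap can be sketched with simple rejection sampling: a position is drawn at random and rejected if any drug pixel would land on an already occupied pixel. This is an assumption about one possible approach, not the method stated in the text; rotation is omitted, and `place_randomly` and its parameters are hypothetical.

```python
import random


def place_randomly(canvas_size, patch_masks, rng, max_tries=1000):
    """Return a (top, left) position for each patch so that no drugs overlap."""
    h, w = canvas_size
    occupied = [[False] * w for _ in range(h)]
    positions = []
    for pm in patch_masks:
        ph, pw = len(pm), len(pm[0])
        for _ in range(max_tries):
            top = rng.randrange(h - ph + 1)
            left = rng.randrange(w - pw + 1)
            cells = [(top + y, left + x)
                     for y in range(ph) for x in range(pw) if pm[y][x]]
            if not any(occupied[y][x] for y, x in cells):
                for y, x in cells:
                    occupied[y][x] = True
                positions.append((top, left))
                break
        else:
            raise RuntimeError("could not place a drug without overlap")
    return positions


# Three 2x2 drug masks placed on an 8x8 canvas with a seeded generator.
square = [[1, 1], [1, 1]]
positions = place_randomly((8, 8), [square, square, square], random.Random(0))
```

Because the overlap test uses the drug mask rather than the bounding rectangle, two drugs may sit in overlapping rectangles as long as their drug pixels do not coincide.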

正解データ生成部３２は、第１領域情報取得部２３から第１領域情報であるマスク画像ＩＭを入力し、入力したマスク画像ＩＭから複数の学習用画像（Ｉ_Ａ，Ｉ_Ｂ，Ｉ_Ｃ，…）に対応する複数の正解データ（Ｉ_ａ，Ｉ_ｂ，Ｉ_ｃ，…）を生成する。即ち、正解データ生成部３２は、マスク画像ＩＭに基づいて複数の学習用画像（Ｉ_Ａ，Ｉ_Ｂ，Ｉ_Ｃ，…）における薬剤の領域を示す第２領域情報を生成し、生成した第２領域情報を複数の学習用画像（Ｉ_Ａ，Ｉ_Ｂ，Ｉ_Ｃ，…）にそれぞれ対する複数の正解データ（Ｉ_ａ，Ｉ_ｂ，Ｉ_ｃ，…）とする正解データ生成処理を行う。 The correct data generation unit 32 receives the mask image IM, which is the first area information, from the first area information acquisition unit 23 and generates, from the received mask image IM, a plurality of pieces of correct answer data (I_a, I_b, I_c, ...) corresponding to the plurality of learning images (I_A, I_B, I_C, ...). That is, the correct data generation unit 32 performs a correct data generation process in which it generates, on the basis of the mask image IM, second area information indicating the drug regions in the plurality of learning images (I_A, I_B, I_C, ...) and uses the generated second area information as the plurality of pieces of correct answer data (I_a, I_b, I_c, ...) for the respective learning images (I_A, I_B, I_C, ...).

尚、複数の学習用画像（Ｉ_Ａ，Ｉ_Ｂ，Ｉ_Ｃ，…）、及び複数の正解データ（Ｉ_ａ，Ｉ_ｂ，Ｉ_ｃ，…）の生成は、シミュレーションにより学習データを作成する第１実施形態、及び第２実施形態で説明したように、薬剤を撮影した撮影画像と、その撮影画像内の薬剤の領域を示す第１領域情報（例えば、マスク画像）とを使用し、撮影画像及びマスク画像をそれぞれ同期して反転、平行移動、回転、又は拡縮等を行い、あるいは撮影画像及びマスク画像から切り出した薬剤画像及び薬剤マスク画像を平行移動、回転、又は拡縮して貼り付けることで生成することができる。 As described in the first and second embodiments of creating learning data by simulation, the plurality of learning images (I_A, I_B, I_C, ...) and the plurality of pieces of correct answer data (I_a, I_b, I_c, ...) can be generated using a captured image of the drugs and first area information (for example, a mask image) indicating the drug regions in that captured image. Specifically, they can be generated either by inverting, translating, rotating, or scaling the captured image and the mask image in synchronization with each other, or by translating, rotating, or scaling the drug images and drug mask images cut out of the captured image and the mask image and pasting them.

記憶制御部３４は、学習用画像生成部３０により生成される学習用画像（Ｉ_Ａ，Ｉ_Ｂ，Ｉ_Ｃ，…）と、正解データ生成部３２により生成される正解データ（Ｉ_ａ，Ｉ_ｂ，Ｉ_ｃ，…）とを入力し、それぞれ対応するペア（学習用画像Ｉ_Ａと正解データＩ_ａ，学習用画像Ｉ_Ｂと正解データＩ_ｂ，学習用画像Ｉ_Ｃと正解データＩ_ｃ，…、）からなる学習データをメモリ２８に記憶させる。 The storage control unit 34 receives the learning images (I_A, I_B, I_C, ...) generated by the learning image generation unit 30 and the correct answer data (I_a, I_b, I_c, ...) generated by the correct data generation unit 32, and stores in the memory 28 learning data consisting of the corresponding pairs (learning image I_A and correct answer data I_a, learning image I_B and correct answer data I_b, learning image I_C and correct answer data I_c, ...).

これにより、メモリ２８には、多くの学習データが記憶、蓄積される。尚、図６には図示されていないが、学習用画像生成部３０及び正解データ生成部３２にそれぞれ入力される撮影画像ＩＴＰとマスク画像ＩＭのペアも学習データとしてメモリ２８に記憶させることが好ましい。 As a result, a large amount of learning data is stored and accumulated in the memory 28. Although not shown in FIG. 6, it is preferable that the pair of the captured image ITP and the mask image IM, which are input to the learning image generation unit 30 and the correct data generation unit 32 respectively, is also stored in the memory 28 as learning data.

図１３は、図１２に示した撮影画像及びマスク画像から生成した学習データの一例を示す図である。 FIG. 13 is a diagram showing an example of learning data generated from the captured image and mask image shown in FIG. 12.

図１３（Ａ）は、学習用画像Ｉ_Ａと正解データ（マスク画像）Ｉ_ａとのペアからなる学習データを示し、図１３（Ｂ）は、学習用画像Ｉ_Ｂとマスク画像Ｉ_ｂとのペアからなる学習データを示す。 FIG. 13(A) shows learning data consisting of a pair of the learning image I_A and the correct answer data (mask image) I_a, and FIG. 13(B) shows learning data consisting of a pair of the learning image I_B and the mask image I_b.

図１３（Ａ）に示す学習用画像Ｉ_Ａでは、カプセル状の薬剤Ｔ５，Ｔ６が線で接触し、薬剤Ｔ２，Ｔ３，Ｔ４が互いに点で接触している。この学習用画像Ｉ_Ａに対応するマスク画像Ｉ_ａは、同一の薬剤である薬剤Ｔ５，Ｔ６の領域の画素値を異ならせることで、薬剤Ｔ５，Ｔ６の領域のインスタンス分離を可能にし、かつ線で接している薬剤Ｔ５，Ｔ６の境界も区別可能にしている。 In the learning image I_A shown in FIG. 13(A), the capsule-shaped drugs T5 and T6 are in contact along a line, and the drugs T2, T3, and T4 are in contact with one another at points. In the mask image I_a corresponding to this learning image I_A, the regions of the drugs T5 and T6, which are the same drug, are given different pixel values, which enables instance separation of the regions of the drugs T5 and T6 and also makes the boundary between the drugs T5 and T6, which touch along a line, distinguishable.

また、マスク画像Ｉ_ａは、薬剤Ｔ２，Ｔ３，Ｔ４の互いに点で接触している箇所を、背景の画素値と同一とすることで、各薬剤Ｔ２，Ｔ３，Ｔ４が互いに接触しないようにし、各薬剤Ｔ２，Ｔ３，Ｔ４の領域が明確になるようにしている。 In addition, in the mask image I_a, the locations where the drugs T2, T3, and T4 touch one another at points are given the same pixel value as the background, so that the drugs T2, T3, and T4 do not touch one another and the region of each of the drugs T2, T3, and T4 is clearly defined.

また、図１３（Ｂ）に示す学習用画像Ｉ_Ｂでは、カプセル状の薬剤Ｔ５，Ｔ６が線で接触し、薬剤Ｔ６と薬剤Ｔ３が点で接触している。この学習用画像Ｉ_Ｂに対応するマスク画像Ｉ_ｂは、同一の薬剤である薬剤Ｔ５，Ｔ６の領域の画素値を異ならせる（例えば、薬剤Ｔ６の領域の画素値を「０．５」とする）ことで、薬剤Ｔ５，Ｔ６の領域のインスタンス分離を可能にし、かつ線で接している薬剤Ｔ５，Ｔ６の境界、及び点で接している薬剤Ｔ６と薬剤Ｔ３の境界を区別可能にしている。 In the learning image I_B shown in FIG. 13(B), the capsule-shaped drugs T5 and T6 are in contact along a line, and the drug T6 and the drug T3 are in contact at a point. In the mask image I_b corresponding to this learning image I_B, the regions of the drugs T5 and T6, which are the same drug, are given different pixel values (for example, the pixel value of the region of the drug T6 is set to "0.5"), which enables instance separation of the regions of the drugs T5 and T6 and also makes distinguishable both the boundary between the drugs T5 and T6, which touch along a line, and the boundary between the drug T6 and the drug T3, which touch at a point.

図１３に示した学習データは一例であり、各薬剤Ｔ１～Ｔ６を示す薬剤画像をそれぞれ任意に平行移動及び回転等を組み合わせて配置し、各薬剤Ｔ１～Ｔ６の領域を示す薬剤マスク画像を同様に配置することで、多くの学習データを作成することができる。 The learning data shown in FIG. 13 is an example. A large amount of learning data can be created by arranging the drug images of the drugs T1 to T6 with arbitrary combinations of translation, rotation, and the like, and arranging the drug mask images indicating the regions of the drugs T1 to T6 in the same way.

この場合、複数の薬剤画像の一部又は全部が点又は線で接触するように配置して学習データを生成することが好ましい。このような学習データにより機械学習された学習済み学習モデルが、点又は線で接触する薬剤を撮影した撮影画像を入力画像とする場合に、各薬剤の領域を正しく推論するためである。 In this case, it is preferable to generate learning data by arranging some or all of the plurality of drug images so that they are in contact with each other at points or lines. This is because the trained learning model, which has been machine learned using such learning data, can correctly infer the area of each drug when the input image is a photographed image of a drug in contact with a point or a line.

また、図１２（Ａ）に示した撮影画像ＩＴＰのように透明な薬剤Ｔ４が撮影されている場合、下方からの照明光が透過して白く撮影されるが、薬剤Ｔ４の位置や角度により照明光の透過状況が変化する。即ち、透明な薬剤Ｔ４の薬剤画像は、撮影領域における透明な薬剤Ｔ４の位置や角度により輝度分布等が異なる画像になる。 Furthermore, when a transparent drug T4 is photographed as in the photographed image ITP shown in FIG. The light transmission situation changes. That is, the drug image of the transparent drug T4 is an image in which the brightness distribution and the like vary depending on the position and angle of the transparent drug T4 in the imaging region.

Therefore, when generating a learning image by arbitrarily arranging drugs taken from a captured image of a plurality of drugs including a transparent drug, it is preferable to leave the transparent drug where it is and arbitrarily arrange only the drugs other than the transparent drug.

In this example, a mask image is generated as the correct data, but the correct data may instead be edge information (an edge image) for each drug image indicating the edge of the region of that drug image. When drugs are in contact at a point or along a line, it is preferable to replace the point or line of contact with the background color so that the edge images of the individual drugs are separated from each other.

Furthermore, when drugs are in contact at a point or along a line, an edge image showing only the point or line of contact may be generated as the correct data.

FIG. 14 is a diagram showing an example of an edge image showing only the locations where a plurality of drugs are in contact at points or along lines.

The edge image IE shown in FIG. 14 shows only the locations E1 and E2 where two or more of the drugs T1 to T6 are in contact at a point or along a line; these are drawn with solid lines in FIG. 14. The regions drawn with dotted lines in FIG. 14 indicate where the drugs T1 to T6 are present.

The edge image of the line-contact location E1 shows where the capsule-shaped drugs T5 and T6 are in contact along a line, and the edge image of the point-contact location E2 shows where the three drugs T2 to T4 are in contact with one another at points.

Because the arrangement of each drug image in the learning image is known, the locations where two or more of the drugs are in contact at a point or along a line are also known. The correct data generation unit 32 shown in FIG. 11 can therefore automatically create, for a learning image generated by the learning image generation unit 30, an edge image (correct data) showing only the locations of point or line contact.
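One way such a contact-only edge image could be derived from known placements is sketched below. It assumes the placements are available as an integer label map (0 for background, a distinct positive label per drug instance), which is an assumption of this illustration and not a data format given in the text; the function name `contact_edge_image` is likewise hypothetical. A pixel is marked when any of its 8 neighbours belongs to a different drug, so both line contact and diagonal (point-like) contact are captured.

```python
def contact_edge_image(labels):
    """Mark pixels of one drug that touch a pixel of a different drug (8-neighbourhood)."""
    h, w = len(labels), len(labels[0])
    neighbours = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    edge = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            a = labels[y][x]
            if a == 0:
                continue  # background never belongs to a contact location
            for dy, dx in neighbours:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    b = labels[ny][nx]
                    if b != 0 and b != a:
                        edge[y][x] = 1
                        break
    return edge


# Hypothetical label map: drugs 1 and 2 touch along a line, drugs 2 and 3 touch at one spot.
labels = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [0, 0, 0, 3],
]
edge = contact_edge_image(labels)
for row in edge:
    print(row)
```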

The edge image IE shown in FIG. 14 can be used as the correct data corresponding to the learning image I_A shown in FIG. 13(A). That is, the learning data can consist of a pair of the learning image I_A shown in FIG. 13(A) and the edge image IE shown in FIG. 14.

Such learning data can be used to train a learning model that takes as its input image a drug image of drugs in contact at points or along lines and outputs, as its inference result, an edge image of only those points or lines of contact.

The edge image of only the points or lines of contact (the inference result) can in turn be used with a learning model that infers the regions of a plurality of drugs: for example, a drug image of a plurality of drugs in contact at points or along lines and the edge image of only those contact locations are supplied together as a multi-channel input image. Because this learning model receives information on the contact locations in addition to the input image, it can infer the region of each drug more accurately.
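A multi-channel input of this kind can be pictured as appending the contact edge image to the colour channels of the drug image. The sketch below assumes an RGB image stored as rows of (R, G, B) tuples and a single-channel edge image of the same size; both the layout and the name `stack_channels` are illustrative assumptions.

```python
def stack_channels(rgb_image, edge_image):
    """Append the edge value of each pixel as a fourth channel after R, G, B."""
    return [
        [tuple(pixel) + (edge,) for pixel, edge in zip(rgb_row, edge_row)]
        for rgb_row, edge_row in zip(rgb_image, edge_image)
    ]


rgb = [[(10, 20, 30), (40, 50, 60)]]
edge = [[0, 1]]
multi = stack_channels(rgb, edge)
print(multi)
```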

[Machine learning device]
FIG. 15 is a block diagram showing an embodiment of a machine learning device according to the present invention.

The machine learning device 50 shown in FIG. 15 comprises a learning model 52 (a convolutional neural network (CNN), which is one type of learning model), a loss value calculation unit 54, and a parameter control unit 56.

This machine learning device 50 trains the CNN 52 using the learning data created by the learning data creation device 1 shown in FIG. 11 and stored in the memory 28.

The CNN 52 is the part that, given a captured image of drugs as its input image, infers the regions of the drugs appearing in that image. It has a multi-layer structure and holds a plurality of weight parameters. The weight parameters include, for example, the coefficients of the filters, called kernels, used for the convolution operations in the convolutional layers.

The CNN 52 changes from an untrained learning model to a trained learning model as its weight parameters are updated from their initial values to optimal values.

The CNN 52 comprises an input layer 52A, an intermediate layer 52B having a plurality of sets each consisting of a convolutional layer and a pooling layer, and an output layer 52C. Each layer has a structure in which a plurality of "nodes" are connected by "edges".

A learning image to be learned is input to the input layer 52A as the input image. This is the learning image of a piece of learning data (a pair of a learning image and correct data) stored in the memory 28.

The intermediate layer 52B has a plurality of sets, each consisting of a convolutional layer and a pooling layer, and is the part that extracts features from the image input from the input layer 52A. A convolutional layer filters nearby nodes in the preceding layer (performs a convolution operation using a filter) to obtain a "feature map". A pooling layer reduces the feature map output from the convolutional layer to produce a new feature map. The convolutional layer performs feature extraction, such as extracting edges from the image, and the pooling layer provides robustness so that the extracted features are not affected by translation and the like.
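The roles of the two layer types can be illustrated with a small sketch. It is not the CNN 52 itself: the 2x2 kernel, the 5x5 toy image, and the function names are assumptions chosen for illustration, and the "convolution" is written without flipping the kernel, as is usual in CNN implementations.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def max_pool_2x2(feature_map):
    """Shrink the feature map by keeping the maximum of each 2x2 block."""
    return [
        [
            max(feature_map[y][x], feature_map[y][x + 1],
                feature_map[y + 1][x], feature_map[y + 1][x + 1])
            for x in range(0, len(feature_map[0]) - 1, 2)
        ]
        for y in range(0, len(feature_map) - 1, 2)
    ]


# Toy image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]  # responds to a dark-to-bright step from left to right
feature_map = conv2d_valid(image, kernel)
pooled = max_pool_2x2(feature_map)
print(feature_map)
print(pooled)
```

The convolution responds only at the edge position, and pooling halves the feature map while keeping that response, which is the sense in which pooling reduces sensitivity to small shifts.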

The intermediate layer 52B is not limited to sets of one convolutional layer and one pooling layer; it may also include consecutive convolutional layers, activation by an activation function, and normalization layers.

The output layer 52C is the part that outputs a feature map showing the features extracted by the intermediate layer 52B. In the trained CNN 52, the output layer 52C outputs, for example, an inference result in which the drug regions and the like appearing in the input image are classified (segmented) pixel by pixel or in units of several pixels grouped together.

The coefficients and offset values of the filters applied to each convolutional layer of the CNN 52 before training are set to arbitrary initial values.

The loss value calculation unit 54 and the parameter control unit 56 function as a learning control unit. Of these, the loss value calculation unit 54 compares the feature map output from the output layer 52C of the CNN 52 with the mask image that is the correct data for the input image (learning image), that is, the mask image read from the memory 28 in correspondence with the learning image, and calculates the error between the two (the loss value, which is the value of the loss function). Possible methods of calculating the loss value include, for example, softmax cross entropy and sigmoid.
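As a rough illustration of a softmax cross-entropy loss value, the sketch below averages the per-pixel loss between a map of class scores and a mask of class labels. Treating the mask as integer labels (0 for background, 1 for drug), the two-class toy data, and the function names are assumptions of this sketch, not details given in the text.

```python
import math


def softmax_cross_entropy(logits, target):
    """Return -log(softmax(logits)[target]) using a numerically stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target]


def mean_pixel_loss(logit_map, label_map):
    """Average the per-pixel softmax cross-entropy over the whole image."""
    losses = [
        softmax_cross_entropy(pixel_logits, label)
        for logit_row, label_row in zip(logit_map, label_map)
        for pixel_logits, label in zip(logit_row, label_row)
    ]
    return sum(losses) / len(losses)


# One row, two pixels; each pixel has scores for (background, drug).
logit_map = [[[2.0, 0.0], [0.0, 2.0]]]
loss_good = mean_pixel_loss(logit_map, [[0, 1]])  # labels agree with the scores
loss_bad = mean_pixel_loss(logit_map, [[1, 0]])   # labels contradict the scores
print(loss_good < loss_bad)
```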

The parameter control unit 56 adjusts the weight parameters of the CNN 52 by error backpropagation based on the loss value calculated by the loss value calculation unit 54. In error backpropagation, the error is propagated backward in order from the final layer, stochastic gradient descent is performed in each layer, and the parameter updates are repeated until the error converges.
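The update-until-convergence loop can be pictured with a deliberately tiny stand-in: a single weight fitted by plain (full-batch) gradient descent on a squared error. This replaces the CNN, backpropagation through layers, and stochastic sampling with the simplest possible case, so the model, learning rate, tolerance, and names below are all assumptions made for illustration.

```python
def gradient_step(weights, grads, learning_rate):
    """Move each weight a small step against its gradient."""
    return [w - learning_rate * g for w, g in zip(weights, grads)]


def train(samples, learning_rate=0.1, tolerance=1e-8, max_iters=1000):
    """Fit y = w * x by repeating updates until the loss stops changing."""
    weights = [0.0]  # arbitrary initial value
    previous_loss = float("inf")
    loss = previous_loss
    for _ in range(max_iters):
        loss = sum((weights[0] * x - y) ** 2 for x, y in samples) / len(samples)
        if abs(previous_loss - loss) < tolerance:
            break  # the error has converged
        grads = [sum(2 * (weights[0] * x - y) * x for x, y in samples) / len(samples)]
        weights = gradient_step(weights, grads, learning_rate)
        previous_loss = loss
    return weights[0], loss


w, final_loss = train([(1.0, 2.0), (2.0, 4.0)])
print(round(w, 3))
```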

This weight parameter adjustment is repeated, and training continues until the difference between the output of the CNN 52 and the mask image that is the correct data becomes small.

By repeating machine learning with the learning data stored in the memory 28, the machine learning device 50 turns the CNN 52 into a trained model. When an unknown input image (a captured image of drugs) is input, the trained CNN 52 outputs an inference result such as a mask image indicating the regions of the drugs in the captured image.

R-CNN (Regions with Convolutional Neural Networks) can be applied as the CNN 52. In R-CNN, bounding boxes of varying sizes are slid over the captured image ITP to detect bounding box regions that contain a drug. The edges of the drug are then detected by evaluating only the image portion inside the bounding box (extracting CNN features). Fast R-CNN, Faster R-CNN, Mask R-CNN, or the like can be used instead of R-CNN.

The inference results of a trained model configured in this way can be used, for example, to cut out the image of each drug from a captured image of a medicine package in which a plurality of drugs are packaged together. The cut-out image of each drug is used when auditing and identifying each drug contained in the medicine package.

As described above, the memory 28 stores a large amount of learning data created by simulation from captured images of drugs and the correct data indicating the regions of the drugs in those captured images. The captured images are preferably images of drugs handled by one's own pharmacy. This is because, when learning data is created from captured images of drugs handled by one's own pharmacy and a learning model is built using that learning data, the learning model can be used effectively for auditing and identifying the drugs handled by that pharmacy.

[Learning data creation method]
FIG. 16 is a flowchart showing an embodiment of the learning data creation method according to the present invention.

The processing of each step shown in FIG. 16 is performed, for example, by the processor 2 of the learning data creation device 1 shown in FIG. 11.

In FIG. 16, the image acquisition unit 22 acquires a captured image ITP of drugs (for example, the captured image ITP shown in FIG. 12(A)) from the imaging device 10 (step S10). The captured image ITP shown in FIG. 12(A) was obtained by illuminating the medicine package from below via a reflector and photographing it from above, but the captured image of drugs is not limited to one taken in this way. The drugs photographed need not be in a medicine package, and there may be only one drug.

The first region information acquisition unit 23 acquires a mask image IM (for example, the mask image IM shown in FIG. 12(B)) as first region information indicating the regions of the drugs in the captured image acquired by the image acquisition unit 22 (step S12). The mask image IM is generated manually or automatically from the captured image ITP and stored in the memory 28 or the like.

Next, the learning image generation unit 30 generates, from the captured image ITP acquired in step S10, a learning image in which the drugs T1 to T6 are arbitrarily arranged (step S14). The learning image can be generated by image processing that translates, flips, rotates, or scales the drug image showing each drug.

The correct data generation unit 32 generates correct data (a mask image) corresponding to the learning image generated in step S14, based on the mask image IM acquired in step S12 (step S16). That is, in step S16, image processing is performed that arranges the region of each drug in the mask image IM in the same way as the corresponding drug in the learning image, generates second region information indicating the arranged drug regions, and uses the generated second region information as the correct data (mask image) for the learning image.
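The key point of steps S14 and S16, that the mask receives exactly the same geometric transform as the image, can be sketched as follows. Images are plain lists of rows here, and the helper names (`flip_horizontal`, `rotate90`, `translate`, `make_pair`) are hypothetical; scaling and arbitrary-angle rotation are omitted to keep the sketch short.

```python
def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def translate(img, dy, dx, fill=0):
    """Shift the image by (dy, dx); pixels moved outside are dropped, gaps get `fill`."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out


def make_pair(image, mask, transform):
    """Apply one and the same transform to the image and to its mask."""
    return transform(image), transform(mask)


image = [[5, 0], [0, 0]]  # one bright "drug" pixel
mask = [[1, 0], [0, 0]]   # its region in the mask
learning_image, correct_mask = make_pair(
    image, mask, lambda a: translate(rotate90(a), 1, 0)
)
print(learning_image, correct_mask)
```

Because both outputs come from the same transform, the mask stays aligned with the drug wherever it ends up.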

The storage control unit 34 stores the pair of the learning image generated in step S14 and the mask image generated in step S16 in the memory 28 as learning data (step S18). FIGS. 13(A) and 13(B) show examples of learning data, each consisting of a pair of a learning image and a mask image, generated as described above and stored in the memory 28.

Next, the processor 2 determines whether to end the generation of learning data (step S20). For example, generation can be determined to have ended when the user inputs an instruction to end it, or when a preset number of pieces of learning data have been created from one pair of a captured image ITP and a mask image.

If it is determined that the generation of learning data has not ended ("No"), the process returns to steps S14 and S16, and the next piece of learning data is created through steps S14 to S20.

If it is determined that the generation of learning data is to end ("Yes"), the creation of learning data based on the captured image ITP and mask image IM acquired in steps S10 and S12 ends.
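The loop of steps S14 to S20 can be summarized in a short sketch, assuming the end condition is a preset number of pairs. Only random translation is used as the arrangement, a Python list stands in for the memory 28, and the names `shift` and `create_learning_data` are hypothetical.

```python
import random


def shift(img, dy, dx, fill=0):
    """Translate the image by (dy, dx), filling uncovered pixels with `fill`."""
    h, w = len(img), len(img[0])
    return [
        [img[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w else fill
         for x in range(w)]
        for y in range(h)
    ]


def create_learning_data(captured, mask, num_samples, seed=0):
    """Generate (learning image, correct data) pairs until the preset count is reached."""
    rng = random.Random(seed)
    memory = []                                        # stands in for memory 28
    while len(memory) < num_samples:                   # S20: end condition
        dy, dx = rng.randint(-1, 1), rng.randint(-1, 1)
        learning_image = shift(captured, dy, dx)       # S14: arrange the drug
        correct_data = shift(mask, dy, dx)             # S16: arrange the mask identically
        memory.append((learning_image, correct_data))  # S18: store the pair
    return memory


captured = [[0, 0, 0], [0, 7, 0], [0, 0, 0]]  # acquired in S10 (toy data)
mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]      # acquired in S12 (toy data)
data = create_learning_data(captured, mask, num_samples=5)
print(len(data))
```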

Needless to say, when another captured image ITP and mask image IM are acquired in steps S10 and S12, a plurality of pieces of learning data are created from that captured image ITP and mask image IM.

[Others]
In the learning data creation device according to the present invention, the hardware structure of a processing unit that executes various kinds of processing, such as the CPU 24, is one of the following kinds of processors: a CPU (Central Processing Unit), which is a general-purpose processor that executes software (programs) to function as various processing units; a programmable logic device (PLD), such as an FPGA (Field Programmable Gate Array), which is a processor whose circuit configuration can be changed after manufacture; and a dedicated electric circuit, such as an ASIC (Application Specific Integrated Circuit), which is a processor having a circuit configuration designed specifically to execute particular processing.

A single processing unit may be configured with one of these processors, or with two or more processors of the same or different types (for example, a plurality of FPGAs, or a combination of a CPU and an FPGA). A plurality of processing units may also be configured with a single processor. As a first example of configuring a plurality of processing units with a single processor, one processor is configured from a combination of one or more CPUs and software, as typified by computers such as clients and servers, and this processor functions as the plurality of processing units. As a second example, a processor is used that implements the functions of an entire system including the plurality of processing units on a single IC (Integrated Circuit) chip, as typified by a system on chip (SoC). In this way, the various processing units are configured, as a hardware structure, using one or more of the processors described above.

More specifically, the hardware structure of these processors is electric circuitry that combines circuit elements such as semiconductor elements.

The present invention also includes a learning data creation program that, when installed on a computer, causes the computer to implement the various functions of the learning data creation device according to the present invention, and a recording medium on which this learning data creation program is recorded.

Furthermore, the present invention is not limited to the embodiments described above, and it goes without saying that various modifications are possible without departing from the spirit of the present invention.

1 Learning data creation device
2 Processor
10 Imaging device
12A, 12B Camera
13 Imaging control unit
14 Stage
16A, 16B Illumination device
16A1 to 16A4, 16B1 to 16B4 Light emitting unit
20 Acquisition unit
22 Image acquisition unit
23 First region information acquisition unit
24 CPU
25 Operation unit
26 RAM
27 ROM
28 Memory
29 Display unit
30 Learning image generation unit
32 Correct data generation unit
34 Storage control unit
50 Machine learning device
52 Learning model (CNN)
52A Input layer
52B Intermediate layer
52C Output layer
54 Loss value calculation unit
56 Parameter control unit
I_A, I_B, I_C Learning image
IE Edge image
IM, I_a, I_b, I_c Mask image (correct data)
ITP Captured image
I_tpl Template image
S10 to S20 Step
T, T1 to T6 Drug
TP Medicine package

Claims

A learning data creation device comprising a processor and a memory, the processor creating learning data for machine learning,
The processor includes:
an acquisition process for acquiring a photographed image of the drug;
a learning image generation process that generates a learning image in which the drug is arbitrarily arranged from the acquired photographed image;
Correct data generation processing that generates second area information corresponding to the area of the drug in the generated learning image, and sets the generated second area information as correct data for the learning image;
performing storage control to store a pair of the generated learning image and the correct answer data in the memory as learning data;
When generating the learning image including a plurality of drugs, the learning image generation process generates the learning image in which all or part of the plurality of drugs are in contact with a point or a line,
The correct answer data includes an edge image showing only the points or lines of the plurality of drugs that are in contact with each other.
Learning data creation device.

A learning data creation device comprising a processor and a memory, the processor creating learning data for machine learning,
The processor includes:
an acquisition process for acquiring a photographed image of the drug;
a learning image generation process that generates a learning image in which the drug is arbitrarily arranged from the acquired photographed image;
Correct data generation processing that generates second area information corresponding to the area of the drug in the generated learning image, and sets the generated second area information as correct data for the learning image;
performing storage control to store a pair of the generated learning image and the correct answer data in the memory as learning data;
When generating the learning image including a plurality of drugs, the learning image generation process generates the learning image in which all or part of the plurality of drugs are in contact with a point or a line,
The correct answer data is a mask image corresponding to the drug area or an edge image showing an edge of the drug area, and the points or lines that touch are replaced with a background color.
Learning data creation device.

A learning data creation device comprising a processor and a memory, the processor creating learning data for machine learning,
The processor includes:
an acquisition process for acquiring a photographed image of the drug;
a learning image generation process that generates a learning image in which the drug is arbitrarily arranged from the acquired photographed image;
Correct data generation processing that generates second area information corresponding to the area of the drug in the generated learning image, and sets the generated second area information as correct data for the learning image;
performing storage control to store a pair of the generated learning image and the correct answer data in the memory as learning data;
When generating the learning image including a plurality of drugs, the learning image generation process generates the learning image in which all or part of the plurality of drugs are in contact with a point or a line,
The correct answer data is a mask image corresponding to the region of the drug, and the mask images corresponding to the plurality of drugs that are in contact with the points or lines have different pixel values.
Learning data creation device.

A learning data creation device comprising a processor and a memory, the processor creating learning data for machine learning,
The processor includes:
an acquisition process for acquiring a photographed image of the drug;
a learning image generation process that generates a learning image in which the drug is arbitrarily arranged from the acquired photographed image;
Correct data generation processing that generates second area information corresponding to the area of the drug in the generated learning image, and sets the generated second area information as correct data for the learning image;
performing storage control to store a pair of the generated learning image and the correct answer data in the memory as learning data;
The drug is at least partially transparent,
The learning image generation process by the processor includes arbitrarily arranging drugs other than the transparent drug when generating the learning image including a plurality of drugs.
Learning data creation device.

The acquisition process of the processor acquires first area information corresponding to a drug area in the acquired captured image;
The correct data generation process generates the second area information based on the acquired first area information.
The learning data creation device according to any one of claims 1 to 4 .

The acquisition process of the processor acquires a photographed image in which a plurality of drugs are photographed, or a plurality of photographed images in which drugs are photographed, and acquires a plurality of pieces of first region information corresponding to the regions of the plurality of drugs,
The learning data creation device according to claim 5 .

The first region information is region information in which the drug region in the photographed image is set manually, region information in which the drug region in the photographed image is extracted automatically by image processing, or region information in which the drug region in the photographed image is extracted automatically by image processing and then adjusted manually,
The learning data creation device according to claim 5 or 6 .

The learning image generation process generates the learning image by translating, inverting, rotating, or scaling the photographed image,
The correct data generation process generates the correct data by translating, inverting, rotating, or scaling the first region information in accordance with the photographed image.
The learning data creation device according to claim 5 or 6 .

The learning image generation process generates the learning image by combining two or more images obtained by translating, inverting, rotating, or scaling the photographed image,
The correct data generation process generates the correct data by translating, inverting, rotating, or scaling the first region information corresponding to each of the two or more images in accordance with the photographed image.
The learning data creation device according to claim 5 or 6 .

The processor includes a drug image acquisition process of acquiring a drug image in which the drug region is cut out from the photographed image based on the acquired first region information,
The learning image generation process involves translating, inverting, rotating, or scaling the acquired drug image to generate the learning image;
The correct data generation process generates the correct data by translating, inverting, rotating, or scaling the first region information in accordance with the drug image.
The learning data creation device according to claim 5 or 6 .

The processor includes a drug image acquisition process of acquiring a drug image in which the drug region is cut out from the photographed image based on the acquired first region information,
The learning image generation process generates the learning image by combining two or more images obtained by translating, inverting, rotating, or scaling the obtained drug image,
The correct data generation process generates the correct data by combining two or more pieces of second area information obtained by translating, inverting, rotating, or scaling the first area information corresponding to the drug image.
The learning data creation device according to any one of claims 5, 6, and 10 .

The photographed image is an image of a drug handled by the own pharmacy;
The learning data creation device according to any one of claims 1 to 11 .

A learning data creation method in which a processor creates learning data for machine learning by processing each of the following steps,
a step of acquiring a photographed image in which the drug is photographed;
generating a learning image in which the drug is arbitrarily arranged from the acquired photographed image;
generating second area information corresponding to the area of the drug in the generated learning image, and using the generated second area information as correct data for the learning image;
a step of storing the pair of the generated learning image and the correct data in a memory as learning data;
In the step of generating the learning image, when arranging the plurality of drugs, all or part of the plurality of drugs are brought into contact with a point or a line,
The correct answer data includes an edge image showing only the points or lines of the plurality of drugs that are in contact with each other.
Learning data creation method.

A learning data creation method in which a processor creates learning data for machine learning by executing the following steps:
a step of acquiring a photographed image in which a drug is photographed;
a step of generating, from the acquired photographed image, a learning image in which the drug is arbitrarily arranged;
a step of generating second area information corresponding to the area of the drug in the generated learning image, and using the generated second area information as correct answer data for the learning image; and
a step of storing the pair of the generated learning image and the correct answer data in a memory as learning data,
wherein, in the step of generating the learning image, when a plurality of drugs are arranged, all or some of the plurality of drugs are brought into contact with one another at a point or along a line, and
the correct answer data is a mask image corresponding to the area of the drug, or an edge image showing an edge of the area of the drug, in which the points or lines of contact are replaced with a background color.
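For the mask-image variant of this claim, a minimal Python sketch follows. It assumes a label map (0 for background, a distinct positive integer per drug) and treats a drug pixel as a contact pixel when a 4-neighbour belongs to a different drug. The function name `mask_with_contacts_cleared` and the values 255 and 0 for foreground and background are hypothetical choices, not values given in the claims.

```python
def mask_with_contacts_cleared(labels, foreground=255, background=0):
    """Build a binary mask of all drugs, then paint contact pixels with the background."""
    height, width = len(labels), len(labels[0])
    mask = [
        [foreground if labels[y][x] else background for x in range(width)]
        for y in range(height)
    ]
    for y in range(height):
        for x in range(width):
            own = labels[y][x]
            if own == 0:
                continue
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width:
                    if labels[ny][nx] not in (0, own):
                        # This pixel touches another drug: replace it with background.
                        mask[y][x] = background
                        break
    return mask


row = mask_with_contacts_cleared([[0, 1, 1, 2, 2, 0]])
```

Clearing the contact pixels leaves a background gap between touching drugs, so a plain binary mask still shows them as separate regions.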

A learning data creation method in which a processor creates learning data for machine learning by executing the following steps:
a step of acquiring a photographed image in which a drug is photographed;
a step of generating, from the acquired photographed image, a learning image in which the drug is arbitrarily arranged;
a step of generating second area information corresponding to the area of the drug in the generated learning image, and using the generated second area information as correct answer data for the learning image; and
a step of storing the pair of the generated learning image and the correct answer data in a memory as learning data,
wherein, in the step of generating the learning image, when a plurality of drugs are arranged, all or some of the plurality of drugs are brought into contact with one another at a point or along a line, and
the correct answer data is a mask image corresponding to the area of the drug, and the mask images corresponding to the plurality of drugs that are in contact at the point or along the line have mutually different pixel values.
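One way to give each touching drug its own pixel value is sketched below in Python. The sketch assumes a label map as input (0 for background, one positive integer per drug) and spreads the drug ids evenly over an 8-bit range. The function name `instance_mask` and the even spacing are illustrative assumptions; the claim only requires that the values differ.

```python
def instance_mask(labels, max_value=255):
    """Map each drug id in a label map to its own distinct pixel value."""
    drug_ids = sorted({v for row in labels for v in row if v != 0})
    if len(drug_ids) > max_value:
        raise ValueError("more drugs than distinct pixel values available")
    # Evenly spaced values keep the drugs easy to tell apart when viewed as an image.
    step = max_value // len(drug_ids) if drug_ids else 0
    value_of = {0: 0}  # background stays 0
    for rank, drug_id in enumerate(drug_ids, start=1):
        value_of[drug_id] = rank * step
    return [[value_of[v] for v in row] for row in labels]


mask = instance_mask([[0, 1, 1, 2, 2, 0]])
```

Because adjacent drugs carry different values, the boundary between them can be recovered from the mask even where they touch.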

A learning data creation method in which a processor creates learning data for machine learning by executing the following steps:
a step of acquiring a photographed image in which a drug is photographed;
a step of generating, from the acquired photographed image, a learning image in which the drug is arbitrarily arranged;
a step of generating second area information corresponding to the area of the drug in the generated learning image, and using the generated second area information as correct answer data for the learning image; and
a step of storing the pair of the generated learning image and the correct answer data in a memory as learning data,
wherein, in the step of generating the learning image, when a plurality of drugs are arranged, all or some of the plurality of drugs are brought into contact with one another at a point or along a line,
the drug is at least partially transparent, and
the step of generating the learning image includes, when generating the learning image including a plurality of drugs, arbitrarily arranging the drugs other than the transparent drug.
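The arrangement rule for transparent drugs can be sketched in Python as follows. The claim states only that drugs other than the transparent drug are arbitrarily arranged; keeping the transparent drug at its photographed position is an assumption of this sketch. The dictionary keys (`name`, `h`, `w`, `top`, `left`, `transparent`) and the function name `plan_positions` are hypothetical, and the sketch does not handle overlap or contact between drugs.

```python
import random


def plan_positions(drugs, canvas_h, canvas_w, rng):
    """Choose a (top, left) position for every drug on the learning image.

    Non-transparent drugs get a random position inside the canvas.
    Transparent drugs keep their photographed position (an assumption of this sketch).
    """
    plan = {}
    for drug in drugs:
        if drug["transparent"]:
            plan[drug["name"]] = (drug["top"], drug["left"])
        else:
            top = rng.randint(0, canvas_h - drug["h"])   # randint is inclusive
            left = rng.randint(0, canvas_w - drug["w"])
            plan[drug["name"]] = (top, left)
    return plan


drugs = [
    {"name": "capsule", "h": 4, "w": 8, "top": 3, "left": 5, "transparent": True},
    {"name": "tablet", "h": 6, "w": 6, "top": 0, "left": 0, "transparent": False},
]
plan = plan_positions(drugs, canvas_h=32, canvas_w=32, rng=random.Random(0))
```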

A learning data creation program that causes a computer to realize:
a function of acquiring a photographed image in which a drug is photographed;
a function of generating, from the acquired photographed image, a learning image in which the drug is arbitrarily arranged, the function generating, when the learning image includes a plurality of drugs, the learning image in which all or some of the plurality of drugs are in contact with one another at a point or along a line;
a function of generating second area information corresponding to the area of the drug in the generated learning image, and using the generated second area information as correct answer data for the learning image; and
a function of storing the pair of the generated learning image and the correct answer data in a memory as learning data,
wherein the correct answer data includes an edge image showing only the points or lines at which the plurality of drugs are in contact with each other.
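The four functions recited in this claim can be sketched end to end in Python under strong simplifying assumptions: drugs are uniform rectangular crops with full masks, they are placed side by side so that neighbours touch along a vertical line, and the correct answer data is kept as a plain label map for brevity (the contact-only edge image recited in the claim would be derived from it). All names (`acquire_drug`, `generate_learning_pair`, `memory`) are hypothetical.

```python
def acquire_drug(value, height, width):
    """Stand-in for acquiring a photographed drug: a uniform crop and its full mask."""
    crop = [[value] * width for _ in range(height)]
    mask = [[1] * width for _ in range(height)]
    return crop, mask


def generate_learning_pair(drugs, height, width, background=0):
    """Paste drugs left to right so that neighbours touch, and build the label map."""
    image = [[background] * width for _ in range(height)]
    labels = [[0] * width for _ in range(height)]
    left = 0
    for label, (crop, mask) in enumerate(drugs, start=1):
        for y, row in enumerate(mask):
            for x, inside in enumerate(row):
                if inside:
                    image[y][left + x] = crop[y][x]
                    labels[y][left + x] = label
        # The next drug starts exactly where this one ends, giving line contact.
        left += len(mask[0])
    return image, labels


memory = []  # stands in for the memory that stores the learning data
drugs = [acquire_drug(5, 2, 2), acquire_drug(9, 2, 2)]
memory.append(generate_learning_pair(drugs, height=2, width=4))
```

Because the learning image and its label map are produced by the same paste operation, the pair stays aligned pixel for pixel without any manual annotation.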

A non-transitory computer-readable recording medium on which the program according to claim 17 is recorded.