JP7737865B2

JP7737865B2 - Motion command generating device and motion command generating method

Info

Publication number: JP7737865B2
Application number: JP2021172279A
Authority: JP
Inventors: 秀行一藁; 洋伊藤; 健次郎山本
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2021-10-21
Filing date: 2021-10-21
Publication date: 2025-09-11
Anticipated expiration: 2041-10-21
Also published as: JP2023062361A; WO2023067972A1

Description

本発明は、ロボットの動作指令を生成する動作指令生成装置および動作指令生成方法に関する。 The present invention relates to a motion command generation device and a motion command generation method for generating motion commands for a robot.

生産効率向上や人件費削減のため、工業製品の組み立て、溶接、搬送などの人が行っていた作業をロボットに代替させる取り組みが増えている。しかし、これまでのロボットシステムは、膨大なプログラミングや高い専門知識が必要であり、ロボット導入の阻害要因になっていた。 In order to improve production efficiency and reduce labor costs, there are increasing efforts to use robots to perform tasks previously performed by humans, such as assembling, welding, and transporting industrial products. However, previous robotic systems required extensive programming and a high level of specialized knowledge, which was an obstacle to the introduction of robots.

このような状況に対応するために、ロボットに取り付けられた各種センサ情報に基づいてロボット自身で動作を決定する自律学習型ロボット制御システムが提案されている。これまでのロボットシステムに比べると膨大なプログラミングや高い専門知識が不要であり、ロボットを容易に導入できることが期待される。さらに、この自律学習型ロボット制御システムは、ロボット自らが動作経験を記憶・学習することで多様な環境変化に対し柔軟な動作生成（動作指令の生成）が可能であると期待されている。 To address this situation, autonomous learning robot control systems have been proposed, in which the robot determines its own motion based on information from various sensors attached to the robot. Compared with previous robot systems, such a system does not require extensive programming or advanced specialized knowledge, and is expected to make robots easy to introduce. Furthermore, because the robot itself memorizes and learns from its own motion experience, such a system is expected to generate motions (generate motion commands) flexibly in response to diverse environmental changes.

ロボットの動作経験とは、例えば、ロボットの操作者／管理者がロボットに動作を直接教えて記憶させる手法や、操作者／管理者や他のロボットの動作を見て真似る手法などがある。また、一般的に自律学習型ロボット制御システムには、学習器と呼ばれる学習装置が備えられており、動作経験時のセンサ情報の記憶と、動作を生成するためのパラメータ調整とが行われている。この記憶された動作は学習データ、パラメータ調整は学習と呼ばれ、学習データを用いて学習器の学習を行う。学習器は、予め入出力の関係を定義し、学習器への入力値に対し期待した出力値が出力されるように学習（パラメータ調整）を繰り返し行う。 A robot's operational experience can be achieved, for example, by the robot's operator/manager directly teaching the robot how to operate and having it memorize it, or by having the operator/manager or another robot observe and imitate the operation. Furthermore, autonomous learning robot control systems are generally equipped with a learning device called a learner, which stores sensor information from operational experience and adjusts parameters to generate operations. These stored operations are called learning data, and parameter adjustment is called learning, and the learner learns using the learning data. The learner defines the input/output relationship in advance and repeatedly learns (adjusts parameters) so that the expected output value is output for the input value to the learner.

学習データの例として、ある動作経験時のロボットの関節角情報と作業の撮像画像との時系列データがある。物体を把持する作業を含む場合、撮影画像には把持の対象物やロボットアーム、ロボットハンドが映っている。この学習データを用いて、学習器に時刻（ｔ）の関節角情報と画像を入力し、時刻（ｔ＋１）の関節角情報と画像を予測するように時系列学習させたとする。すると、学習が完了した学習器にロボット関節角情報と画像を逐次入力することで、自律学習型ロボット制御システムは、作業の状態に応じて自動的に動作を生成することが可能になる。 An example of learning data is time-series data of a robot's joint angle information and captured images of the task when it experiences a certain action. If the task involves grasping an object, the captured images will show the object being grasped, the robot arm, and the robot hand. Using this learning data, joint angle information and images at time (t) are input into a learning device, which is then trained to predict joint angle information and images at time (t+1). Then, by sequentially inputting robot joint angle information and images into the learning device once learning has completed, the autonomous learning robot control system will be able to automatically generate actions according to the task status.
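
The time-series training described above can be illustrated with a minimal sketch that pairs each recorded time step with the next one. The `Frame` container and the function name `make_next_step_pairs` are hypothetical names chosen for illustration; the source does not specify how the learning data is stored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Frame:
    """One time step of a recorded motion experience (hypothetical container)."""
    joint_angles: List[float]   # robot joint angles at this time step
    image: List[List[float]]    # captured image as a 2-D grid of pixel values


def make_next_step_pairs(episode: List[Frame]) -> List[Tuple[Frame, Frame]]:
    """Pair the frame at time t (learner input) with the frame at time t+1 (target)."""
    return [(episode[t], episode[t + 1]) for t in range(len(episode) - 1)]


# Three recorded time steps yield two (input, target) pairs.
episode = [
    Frame(joint_angles=[0.0, 0.1], image=[[0.0]]),
    Frame(joint_angles=[0.1, 0.2], image=[[0.5]]),
    Frame(joint_angles=[0.2, 0.3], image=[[1.0]]),
]
pairs = make_next_step_pairs(episode)
```

At run time the trained learner would be fed the current joint angles and image at every step, and its prediction for the next step would serve as the next motion.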

このような手法は、ある時刻のセンサ情報から物体認識などを陽に介さず、直接ロボットの動作を生成するため、自律学習型のエンドツーエンドの動作生成手法と呼ばれる。これらの手法では、学習時に作業対象とした物体や類似の物体が撮像画像中に得られた場合に、その物体に対して自律的に作業を開始する。しかし、作業対象の物体が同時に複数存在する場合については想定されておらず、作業ができない、もしくはどの物体に作業を行うかが分からない可能性があった。 Such methods generate robot motion directly from sensor information at a given time, without explicitly going through object recognition, and are therefore known as autonomous learning-based end-to-end motion generation methods. With these methods, if an object that was the target of work during learning, or a similar object, appears in the captured image, the robot autonomously begins working on that object. However, these methods do not consider the case where multiple target objects are present at the same time, so the robot may be unable to perform the work, or it may be unclear which object the robot will work on.

一方で、自律化が求められるロボットの作業として、作業対象物体が同時に複数存在するケースがある。このような作業として例えば、廃炉作業における瓦礫撤去作業や、プラントにおけるバルブ開閉作業、段ボールの搬送作業などがある。瓦礫撤去作業では、類似形状・テクスチャの瓦礫が散らばっていることが想定され、順番によらず撤去すればよい場合や、瓦礫が重なるなどしているために適切な順番で撤去しなければならない場合がある。また、バルブ開閉作業では、適切な順番でバルブを開閉する必要がある。このため、自律学習型のエンドツーエンドの動作生成手法において、対象物を指定し、その物体に作業を行うことが求められている。
作業の対象物を指定する技術として、特許文献１に記載のロボット装置があり、ロボット装置が撮像した画像内の対象物の位置や姿勢（向き）を操作者が指定する。 On the other hand, some robot tasks for which autonomy is required involve multiple target objects present at the same time. Examples of such tasks include removing rubble during decommissioning work, opening and closing valves in a plant, and transporting cardboard boxes. In rubble removal, rubble of similar shapes and textures is expected to be scattered about; in some cases it can be removed in any order, while in others the rubble overlaps and must be removed in an appropriate order. Likewise, in valve opening and closing, the valves must be opened and closed in an appropriate order. For this reason, autonomous learning-based end-to-end motion generation methods are required to allow a target object to be specified and the work to be performed on that object.
One technology for specifying the object to be worked on is the robot device described in Patent Document 1, in which the operator specifies the position and orientation of an object in an image captured by the robot device.

特開２０１３－１７３２０９号公報 JP 2013-173209 A

特許文献１に記載のロボット装置は、位置や姿勢が指定された対象物を把持するための腕部の起動を計画し、この軌道に従って動作を制御する。自律学習型のエンドツーエンドの動作生成では、物体認識（物体の位置や姿勢の認識）を陽に介さず、直接ロボットの動作（動作指令）を生成している。このため、特許文献１に記載のロボット装置のように対象物の位置や姿勢を指定することができない。
本発明は、このような背景を鑑みてなされたものであり、自律学習型のエンドツーエンドの動作指令生成において対象物の選択を可能とする動作指令生成装置および動作指令生成方法を提供することを課題とする。 The robot device described in Patent Document 1 plans a trajectory for an arm to grasp an object whose position and orientation are specified, and controls the arm's motion according to this trajectory. In autonomous learning-based end-to-end motion generation, the robot's motion (motion command) is generated directly, without explicit object recognition (recognition of the object's position and orientation). Therefore, unlike with the robot device described in Patent Document 1, the position and orientation of an object cannot be specified.
The present invention has been made in consideration of the above background, and aims to provide a motion command generation device and a motion command generation method that enable selection of an object in autonomous learning-based end-to-end motion command generation.

上記した課題を解決するため、本発明に係る動作指令生成装置は、ロボットが作業する対象物の候補を含む画像およびセンサ情報を取得する取得部と、機械学習モデルを用いて、前記画像および前記センサ情報を入力として前記ロボットの動作指令を出力する指令生成部とを備え、前記機械学習モデルは、前記対象物の候補の位置または領域を算出する際に参照されるパラメータを含む位置抽出ブロックと、前記対象物に対する前記ロボットの作業の動作指令を算出する際に参照されるパラメータを含む動作指令生成ブロックとを含み、説明変数が前記画像および前記センサ情報を含み、目的変数が前記動作指令を含む学習データを用いて生成され、前記指令生成部は、前記対象物の候補のなかで前記ロボットの作業対象となる対象物を指定する作業対象指定情報を取得し、前記位置抽出ブロックを用いて算出された前記対象物の候補の位置または領域の情報を、前記作業対象指定情報が示す対象物の位置または領域の情報に置き換えて前記動作指令を出力する。 In order to solve the above-mentioned problems, the motion command generation device of the present invention includes an acquisition unit that acquires sensor information and an image including candidate objects for a robot to work on, and a command generation unit that uses a machine learning model to output motion commands for the robot with the image and the sensor information as input. The machine learning model includes a position extraction block including parameters that are referenced when calculating the position or area of the candidate objects, and a motion command generation block including parameters that are referenced when calculating a motion command for the robot to work on the object, and is generated using learning data in which the explanatory variables include the image and the sensor information and the objective variable includes the motion command. The command generation unit acquires work object designation information that specifies, from among the candidate objects, the object to be worked on by the robot, replaces the information on the position or area of the candidate objects calculated using the position extraction block with information on the position or area of the object indicated by the work object designation information, and outputs the motion command.

本発明によれば、自律学習型のエンドツーエンドの動作指令生成において対象物の選択を可能とする動作指令生成装置および動作指令生成方法を提供することができる。上記した以外の課題、構成および効果は、以下の実施形態の説明により明らかにされる。 The present invention provides a motion command generation device and a motion command generation method that enable target object selection in autonomous learning-type end-to-end motion command generation. Issues, configurations, and advantages other than those described above will become clear from the description of the following embodiments.

第１実施形態に係るロボット制御システムの全体構成図である。 FIG. 1 is an overall configuration diagram of a robot control system according to the first embodiment.
第１実施形態に係る動作指令生成装置の機能ブロック図である。 FIG. 2 is a functional block diagram of a motion command generating device according to the first embodiment.
第１実施形態に係る指令生成モデルの構成図である。 FIG. 3 is a configuration diagram of a command generation model according to the first embodiment.
第１実施形態に係る動作指令生成処理のフローチャートである。 FIG. 4 is a flowchart of a motion command generation process according to the first embodiment.
第１実施形態の変形例に係る撮影画像である。 A captured image according to a modification of the first embodiment.
第１実施形態の変形例に係るマスク用の画像である。 A mask image according to the modification of the first embodiment.
第１実施形態の変形例に係る撮影画像が画像でマスクされた撮影画像である。 The captured image according to the modification of the first embodiment after being masked with the mask image.
第２実施形態に係る動作指令生成処理のフローチャートである。 A flowchart of a motion command generation process according to the second embodiment.
第３実施形態に係る動作指令生成処理のフローチャートである。 A flowchart of a motion command generation process according to the third embodiment.

≪動作指令生成装置の概要≫
以下に本発明を実施するための形態（実施形態）における動作指令生成装置を説明する。動作指令生成装置は、作業対象を含む撮影画像を入力として、学習済みの機械学習モデルである指令生成モデルを用いてロボットに対する動作指令を生成する。 <<Outline of the motion command generation device>>
The following describes a motion command generation device according to an embodiment of the present invention. The motion command generation device receives a captured image including a work object as input and generates motion commands for a robot using a command generation model, which is a trained machine learning model.

指令生成モデルは、位置抽出ブロックと動作指令生成ブロックとを含む。位置抽出ブロックは、撮影画像にある１つ以上の作業対象（物体）を認識する処理に用いられるパラメータを含む。換言すれば位置抽出ブロックは、作業対象（作業の対象物の候補）の位置または領域を算出する際に参照されるパラメータを含む。動作指令生成ブロックは、１つの作業対象を把持するためのロボットの動作を指示する動作指令を生成する処理に用いられるパラメータを含む。換言すれば動作指令生成ブロックは、対象物に対するロボットの作業の動作指令を算出する際に参照されるパラメータを含む。 The command generation model includes a position extraction block and a motion command generation block. The position extraction block includes parameters used in the process of recognizing one or more work targets (objects) in the captured image. In other words, the position extraction block includes parameters referenced when calculating the position or area of a work target (a candidate work object). The motion command generation block includes parameters used in the process of generating a motion command that instructs the robot to operate to grasp a single work target. In other words, the motion command generation block includes parameters referenced when calculating a motion command for the robot to work on an object.

なお、指令生成モデルや位置抽出ブロック、動作指令生成ブロックは、機械学習技術を用いた認識や推論などの処理に用いるパラメータであるが、以下ではパラメータ自体がパラメータを用いて処理する主体であるかのように記載する場合もある。例えば、「指令生成モデルが動作指令を算出する」、「位置抽出ブロックの入力である撮影画像」などと記す場合がある。 Note that the command generation model, the position extraction block, and the motion command generation block are parameters used in processes such as recognition and inference using machine learning technology, but below they may be described as if the parameters themselves were the entities performing the processing. For example, the description may say "the command generation model calculates the motion command" or "the captured image that is input to the position extraction block."

動作指令生成装置は、位置抽出ブロックと動作指令生成ブロックとを含む構成の機械学習モデルを、学習データを用いて訓練して学習済み機械学習モデルである指令生成モデルを生成する。学習データの説明変数（入力）は撮影画像を含み、目的変数（出力、正解）は撮影画像にある１つの物体（作業対象、対象物）を把持するロボットの動作指令を含む。 The motion command generation device uses learning data to train a machine learning model that includes a position extraction block and a motion command generation block, thereby generating the command generation model, which is a trained machine learning model. The explanatory variables (input) of the learning data include captured images, and the objective variables (output, correct answer) include motion commands for a robot to grasp a single object (work target, target object) in the captured images.

学習データは、人がロボットを操作して物体を把持する様子の撮影画像と、把持するときのロボットの操作を示す動作指令とを含む教示型の学習データであってもよい。人が操作せず、物体に対する最適な把持動作を、例えば機械学習技術を用いて算出して、算出された動作を動作指令として記録した学習データであってもよい。なお教示型学習データの撮影画像に映る作業対象の物体は、１つであってもよい。 The learning data may be teaching-type learning data that includes captured images of a person operating a robot to grasp an object and motion commands that indicate how the robot was operated during the grasp. Alternatively, it may be learning data in which the optimal grasping motion for an object is calculated without human operation, for example using machine learning technology, and the calculated motion is recorded as a motion command. Note that the captured images of the teaching-type learning data may show only one work target object.

動作指令生成装置は、指令生成モデル（の位置抽出ブロック）を用いて撮影画像から１つ以上の作業対象（の位置情報）を取得する。次に動作指令生成装置は、操作者に問い合わせて１つ以上の作業対象から１つの作業対象を選択する。続いて動作指令生成装置は、指令生成モデル（の動作指令生成ブロック）を用いて選択された作業対象を把持するロボットの動作命令を生成する。換言すれば、動作指令生成装置は、指令生成モデル（の位置抽出ブロック）を用いて取得された１つ以上の作業対象を１つの作業対象に置き換えて、当該１つの作業対象を把持する動作指令を生成する。 The motion command generation device acquires one or more work objects (position information) from the captured image using the command generation model (position extraction block). Next, the motion command generation device queries the operator to select one work object from the one or more work objects. The motion command generation device then generates a motion command for the robot to grasp the selected work object using the command generation model (motion command generation block). In other words, the motion command generation device replaces the one or more work objects acquired using the command generation model (position extraction block) with a single work object and generates a motion command to grasp the single work object.

このような動作指令生成装置によれば、取得しやすい教示型／自律学習型の学習データを用いて生成された指令生成モデルを用いて、複数の作業対象のなかで作業者が指定した作業対象を把持する動作指令が生成される。従来までの教示型の学習データを用いて生成された機械学習モデルを用いる場合には、複数の作業対象があった場合の動作が保証できなかった（どのように動作するか不明だった）。動作指令生成装置によれば、作業する作業対象の順番を保証することができる。例えば、作業順序が決まった作業に対してロボットを使用することができるようになる。 This motion command generation device uses a command generation model generated from easily obtainable teaching-type or autonomous-learning-type learning data to generate a motion command for grasping the work object designated by a worker from among multiple work objects. Previously, when a machine learning model generated from teaching-type learning data was used, the robot's behavior could not be guaranteed when multiple work objects were present (it was unclear how the robot would operate). With this motion command generation device, the order in which the work objects are worked on can be guaranteed. For example, a robot can be used for work in which the work order is fixed.

≪ロボット制御システムの全体構成≫
図１は、第１実施形態に係るロボット制御システム１０の全体構成図である。ロボット制御システム１０は、ロボット３００、制御装置３１０、カメラ３７１，３７２、および動作指令生成装置１００を含んで構成される。
ロボット３００は、物体である作業対象３８０のハンドリングが可能であり、部品の組み立てや搬送など所定の作業を行う。ロボット３００の構成は問わず、ロボットアーム単体でもよく、クローラや車輪などの移動装置を備えてもよい。 <Overall configuration of the robot control system>
FIG. 1 is a diagram showing the overall configuration of a robot control system 10 according to the first embodiment. The robot control system 10 includes a robot 300, a control device 310, cameras 371 and 372, and a motion command generating device 100.
The robot 300 is capable of handling a work target 380, which is an object, and performs predetermined tasks such as assembling and transporting parts. The configuration of the robot 300 is not restricted; it may be a robot arm alone, or it may be equipped with a moving device such as crawlers or wheels.

制御装置３１０は、動作指令生成装置１００から入力されたロボット３００の関節角やエンドエフェクタ３０１（ロボットハンド）の姿勢（位置）、力（トルク）などの動作指令を基に、ロボット３００に制御指令を出力してロボット３００の動作を制御する装置である。制御指令は、例えばロボット３００に備わるロボットアームの関節やエンドエフェクタ３０１などに設けられたアクチュエータ（モータなど）に対する電流値や電圧値などを示す信号である。ロボット３００は、制御装置３１０から制御指令を受信すると内蔵の駆動回路が該当するアクチュエータに駆動信号を供給する。 The control device 310 outputs control commands to the robot 300 to control its motion, based on motion commands input from the motion command generating device 100, such as the joint angles of the robot 300 and the posture (position) and force (torque) of the end effector 301 (robot hand). The control commands are signals that indicate, for example, current values and voltage values for actuators (motors, etc.) provided on the joints of the robot arm, the end effector 301, and other parts of the robot 300. When the robot 300 receives a control command from the control device 310, its built-in drive circuit supplies a drive signal to the corresponding actuator.

カメラ３７１，３７２は、ロボット３００の作業環境や周辺環境を撮像するための撮像装置である。図１では、２台のカメラ３７１，３７２が設置されているが、１台であっても３台以上であってもよい。カメラ３７１はロボットアーム３０２に取り付けられており、カメラ３７２はロボット３００の周辺（例えば作業室や建物の壁）に設置されている。ここで作業環境はロボット３００の可動領域（作業エリア）に相当し、周辺環境はロボットの可動領域外の周辺領域に相当する。 Cameras 371 and 372 are imaging devices for capturing images of the working environment and surrounding environment of robot 300. In FIG. 1, two cameras 371 and 372 are installed, but there may be one or three or more. Camera 371 is attached to robot arm 302, and camera 372 is installed around robot 300 (for example, a workroom or a building wall). Here, the working environment corresponds to the movable range (work area) of robot 300, and the surrounding environment corresponds to the surrounding area outside the movable range of the robot.

動作指令生成装置１００は、カメラ３７１，３７２により撮影された画像（撮影画像）、ロボット３００や作業環境、周辺環境に配置されたセンサから得られた情報（以下、センサ情報と記す）に基づいて、ロボット３００の動作を計画し、制御装置３１０にロボット３００の関節角や力などの動作指令を送信する装置である。ここで、ロボット３００やセンサ情報の種類は問わない。例えば、センサ情報は、ロボット３００の関節に備わっているアクチュエータの電流値、ロボット３００に外付けされている触覚センサや慣性センサの出力信号などでもよい。さらにセンサ情報は、作業環境や周辺環境を計測している温度センサなどでもよい。このように、各センサは、ロボット３００の状態や環境の状態を検出し、検出内容に応じた検出信号を出力する。 The motion command generation device 100 plans the motion of the robot 300 based on images (captured images) taken by the cameras 371 and 372 and on information obtained from sensors placed on the robot 300, in the work environment, and in the surrounding environment (hereinafter referred to as sensor information), and transmits motion commands such as the joint angles and forces of the robot 300 to the control device 310. The types of robot 300 and of sensor information are not restricted. For example, the sensor information may be the current value of an actuator provided in a joint of the robot 300, or the output signal of a tactile sensor or inertial sensor externally attached to the robot 300. The sensor information may also be the output of a temperature sensor or the like that measures the work environment or surrounding environment. In this way, each sensor detects the state of the robot 300 or of the environment and outputs a detection signal according to what it detects.

≪動作指令生成装置の構成≫
図２は、第１実施形態に係る動作指令生成装置１００の機能ブロック図である。動作指令生成装置１００はコンピュータであり、制御部１１０、記憶部１２０、および入出力部１８０を備える。
入出力部１８０には、ディスプレイやキーボード、マウスなどのユーザインターフェイス機器が接続される。例えば入出力部１８０にはタッチパネルディスプレイが接続され、カメラ３７１，３７２の撮影画像が表示される。ロボットの操作者は、ロボット３００が把持する作業対象３８０の１つをタッチパネルディスプレイ上で指定する。また、入出力部１８０は通信デバイスを備え、制御装置３１０やカメラ３７１，３７２などの装置とのデータ（信号）の送受信が可能である。 <Configuration of the motion command generating device>
FIG. 2 is a functional block diagram of the motion command generating device 100 according to the first embodiment. The motion command generating device 100 is a computer and includes a control unit 110, a storage unit 120, and an input/output unit 180.
User interface devices such as a display, keyboard, and mouse are connected to the input/output unit 180. For example, a touch panel display is connected to the input/output unit 180, and images captured by the cameras 371 and 372 are displayed on it. The robot operator designates, on the touch panel display, one of the work objects 380 for the robot 300 to grasp. The input/output unit 180 also includes a communication device and can transmit and receive data (signals) to and from devices such as the control device 310 and the cameras 371 and 372.

記憶部１２０は、ＲＯＭ（Read Only Memory）やＲＡＭ（Random Access Memory）、ＳＳＤ（Solid State Drive）などの記憶機器を含んで構成される。記憶部１２０には、学習データ１３０、指令生成モデル１４０、およびプログラム１２８が記憶される。プログラム１２８は、動作指令生成処理（後記する図４参照）の記述を含む。 The storage unit 120 includes storage devices such as ROM (Read Only Memory), RAM (Random Access Memory), and SSD (Solid State Drive). The storage unit 120 stores learning data 130, a command generation model 140, and a program 128. The program 128 includes a description of the motion command generation process (see FIG. 4, described below).

≪動作指令生成装置：学習データ≫
学習データ１３０は、機械学習モデルである指令生成モデル１４０の学習に用いる学習用データである。学習データ１３０の説明変数（入力）は、ロボット３００が作業対象３８０を把持する作業時に時系列に取得した、ロボット３００の作業環境や周辺環境の撮影画像、および、ロボット３００や作業環境、周辺環境のセンサ情報である。学習データ１３０の目的変数（出力）は、ロボット３００が作業対象３８０を把持する作業時に時系列に取得したロボット３００への動作指令である。 <Motion command generation device: learning data>
The learning data 130 is the data used to train the command generation model 140, which is a machine learning model. The explanatory variables (input) of the learning data 130 are captured images of the work environment and surrounding environment of the robot 300 and sensor information of the robot 300, the work environment, and the surrounding environment, all acquired in time series while the robot 300 performs the task of grasping the work object 380. The objective variables (output) of the learning data 130 are the motion commands to the robot 300, acquired in time series during the same task.

この作業は、ロボット３００の操作者が、例えばジョイスティックなどの操作部を入出力部１８０に接続して、ロボット３００を操作（制御）する作業である。操作部は操作者の入力を受け付けて、その内容に応じた動作指令を制御装置３１０に送信する装置である。操作者が操作する替わりに、予め計画されたロボット３００の動作をロボット３００に再生させてもよい。なお、ロボット３００の運用時に作業対象３８０が複数あることが予想されるとしても、学習データ１３０の取得時には、作業対象３８０は１つであってもよい。 In this task, the operator of the robot 300 connects an operating unit, such as a joystick, to the input/output unit 180 and operates (controls) the robot 300. The operating unit is a device that accepts input from the operator and transmits motion commands corresponding to that input to the control device 310. Instead of being operated by the operator, the robot 300 may replay motions planned in advance. Note that even if multiple work targets 380 are expected when the robot 300 is in operation, there may be only one work target 380 when the learning data 130 is acquired.

≪動作指令生成装置：指令生成モデル≫
図３は、第１実施形態に係る指令生成モデル１４０の構成図である。指令生成モデル１４０は機械学習モデルであって、位置抽出ブロック１４１と動作指令生成ブロック１４２とを含む。
位置抽出ブロック１４１は、撮影画像にある１つ以上の作業対象３８０やエンドエフェクタ３０１を認識する処理に用いられるパラメータを含む。位置抽出ブロック１４１は、例えば画像の特徴量を抽出するＣＮＮ（Convolutional Neural Network）、および得られた特徴マップから一番強度の大きい位置の座標情報を抽出するSpatial Softmaxに係るパラメータを含み、撮影画像から作業対象３８０の位置情報を抽出する処理をするときに参照される。Spatial Softmaxは、機械学習で用いられるsoftmax関数やtanh関数、sigmoid関数などの関数の一種であり、soft argmaxとも称される。Spatial Softmaxで抽出される位置座標の数は、直前のＣＮＮのチャンネル数に基づいて決定される。 <Motion command generation device: command generation model>
FIG. 3 is a configuration diagram of the command generation model 140 according to the first embodiment. The command generation model 140 is a machine learning model and includes a position extraction block 141 and a motion command generation block 142.
The position extraction block 141 includes parameters used in the process of recognizing one or more work objects 380 and the end effector 301 in the captured image. The position extraction block 141 includes, for example, parameters of a convolutional neural network (CNN) that extracts image features and of a spatial softmax that extracts the coordinates of the position with the highest intensity from the resulting feature map; these parameters are referenced when extracting the position information of the work object 380 from the captured image. Spatial softmax is one of the functions used in machine learning, alongside the softmax, tanh, and sigmoid functions, and is also called soft argmax. The number of position coordinates extracted by the spatial softmax is determined by the number of channels of the immediately preceding CNN.
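
As an illustration of the spatial softmax step, the following minimal sketch computes one coordinate pair per feature-map channel using only the Python standard library. The names `spatial_softmax`, `extract_positions`, and the `temperature` parameter are assumptions made for this example; the source does not give an implementation, and a practical system would typically compute this inside a deep learning framework.

```python
import math
from typing import List, Tuple

FeatureMap = List[List[float]]  # one CNN channel, indexed as [row][column]


def spatial_softmax(feature_map: FeatureMap, temperature: float = 1.0) -> Tuple[float, float]:
    """Return the expected (x, y) pixel position of one channel's activations.

    A softmax over all pixels turns the map into a probability distribution;
    the result is the probability-weighted mean of the pixel coordinates.
    """
    flat = [v / temperature for row in feature_map for v in row]
    peak = max(flat)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in flat]
    total = sum(weights)
    width = len(feature_map[0])
    x = sum(w * (i % width) for i, w in enumerate(weights)) / total
    y = sum(w * (i // width) for i, w in enumerate(weights)) / total
    return x, y


def extract_positions(feature_maps: List[FeatureMap]) -> List[Tuple[float, float]]:
    """One coordinate pair per channel, so the channel count fixes the number of points."""
    return [spatial_softmax(fm) for fm in feature_maps]


# A channel with a single strong activation at row 1, column 2.
peaked = [[0.0, 0.0, 0.0],
          [0.0, 0.0, 50.0],
          [0.0, 0.0, 0.0]]
x, y = spatial_softmax(peaked)
```

When one activation dominates, the result approaches the location of the maximum, which is why the function is also called soft argmax. Unlike a hard argmax, it is differentiable, so the preceding CNN can be trained through it.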

動作指令生成ブロック１４２は、作業対象を把持するためのロボット３００の動作を指示する動作指令を生成する処理に用いられるパラメータを含む。動作指令生成ブロック１４２は、例えば全結合層やRecurrent Neural Network（ＲＮＮ）などを用いて、位置抽出ブロックから得られた作業対象の位置情報とセンサ情報とから、動作指令を生成する。 The motion command generation block 142 contains parameters used in the process of generating motion commands that instruct the robot 300 to grasp the work object. Using, for example, a fully connected layer or a recurrent neural network (RNN), the motion command generation block 142 generates motion commands from the sensor information and from the position information of the work object obtained from the position extraction block.

≪動作指令生成装置：制御部≫
図２に戻って制御部１１０の説明を続ける。制御部１１０は、ＣＰＵ（Central Processing Unit）を含んで構成され、取得部１１１、学習部１１２、および指令生成部１１３が備わる。取得部１１１は、カメラ３７１，３７２の撮影画像、およびロボット３００や作業環境、周辺環境に備わるセンサのセンサ情報を取得する。 <<Motion command generating device: control unit>>
Returning to FIG. 2, the description of the control unit 110 continues. The control unit 110 includes a CPU (Central Processing Unit) and is provided with an acquisition unit 111, a learning unit 112, and a command generation unit 113. The acquisition unit 111 acquires images captured by the cameras 371 and 372, and sensor information from sensors provided on the robot 300, in the work environment, and in the surrounding environment.

学習部１１２は、学習データ１３０を用いて指令生成モデル１４０を訓練する（指令生成モデル１４０に学習データ１３０を学習させる）。訓練／学習の結果として指令生成モデル１４０に含まれるパラメータが調整され、撮影画像とセンサ情報とに基づいて誤差が最小となるような作業対象３８０を把持する動作指令が出力されるようになる。 The learning unit 112 trains the command generation model 140 using the learning data 130 (has the command generation model 140 learn the learning data 130). As a result of the training/learning, the parameters included in the command generation model 140 are adjusted so that, based on the captured image and the sensor information, the model outputs a motion command for grasping the work object 380 with minimal error.

動作指令の誤差を最小化することは、学習データ１３０に示されている作業を達成することに等しいため、位置抽出ブロック１４１の出力として作業を達成するために重要な位置情報が得られることが期待できる。例えば、作業対象３８０やエンドエフェクタ３０１などの位置情報が得られると考えられる。なお学習データ１３０は、撮影画像とセンサ情報と動作指令とを含むが、位置情報は含んでおらず、陽に訓練／学習されたものではない。 Since minimizing the error in the motion commands is equivalent to accomplishing the task shown in the learning data 130, the output of the position extraction block 141 can be expected to contain the position information that is important for accomplishing the task, for example the positions of the work object 380 and the end effector 301. Note that the learning data 130 includes captured images, sensor information, and motion commands but does not include position information, so the extraction of position information is not explicitly trained/learned.

また、学習の結果として得られたＣＮＮの各チャンネルは、特定の形状に反応する。例えば、第１のチャンネルは作業対象３８０に、第２のチャンネルはエンドエフェクタ３０１に反応するなどである。動作指令生成ブロック１４２は、各チャンネルから得られる座標に基づいて動作指令を予測するため、第１のチャンネルの座標が変わることは作業対象の位置が変わることと等しい。なお、どのチャンネルが作業対象に反応するかは、訓練／学習後に学習データを入力して、作業対象の位置座標を出力しているチャンネルを調べることで決定できる。このように、位置抽出ブロック１４１の出力となる位置情報は、撮影画像に含まれる個々の作業対象３８０やエンドエフェクタ３０１の位置座標を含む。 Furthermore, each channel of the CNN obtained as a result of learning responds to a specific shape. For example, the first channel responds to the work object 380, and the second channel responds to the end effector 301. The motion command generation block 142 predicts motion commands based on the coordinates obtained from each channel, so a change in the coordinates of the first channel is equivalent to a change in the position of the work object. Note that which channel responds to the work object can be determined by inputting the learning data after training/learning and examining the channel that is outputting the position coordinates of the work object. In this way, the position information output by the position extraction block 141 includes the position coordinates of each work object 380 and end effector 301 included in the captured image.
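
One way to carry out the channel check described above is sketched below. It assumes that reference positions of the work object are available for a few frames, for example by hand annotation; this is an assumption for illustration, since the learning data itself contains no position information. The function name `find_object_channel` is hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def find_object_channel(
    channel_tracks: Sequence[Sequence[Point]],
    reference_track: Sequence[Point],
) -> int:
    """Return the index of the channel whose coordinates stay closest to the reference.

    channel_tracks[c][t] is the coordinate output by channel c for frame t.
    reference_track[t] is the object's position in frame t (hypothetical annotation).
    """
    def mean_distance(track: Sequence[Point]) -> float:
        distances = [math.dist(p, q) for p, q in zip(track, reference_track)]
        return sum(distances) / len(distances)

    return min(range(len(channel_tracks)), key=lambda c: mean_distance(channel_tracks[c]))


# Channel 0 follows something else; channel 1 follows the annotated object.
channel_tracks = [
    [(10.0, 10.0), (10.0, 11.0)],
    [(3.0, 4.0), (4.0, 4.0)],
]
reference_track = [(3.0, 4.0), (4.0, 5.0)]
object_channel = find_object_channel(channel_tracks, reference_track)
```

Visual inspection of the per-channel coordinates overlaid on the learning images would serve the same purpose without any annotation.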

指令生成部１１３は、指令生成モデル１４０を用いて、撮影画像と、センサ情報と、操作者が指定した作業対象物の位置情報とに基づいて推論（予測）を行い、ロボット３００の動作指令を出力する。この動作指令は、制御装置３１０に送信され、ロボット３００が動作する。 The command generation unit 113 uses the command generation model 140 to make inferences (predictions) based on the captured image, the sensor information, and the position information of the work object specified by the operator, and outputs motion commands for the robot 300. These motion commands are sent to the control device 310, which causes the robot 300 to move.

操作者が指定した作業対象物の位置情報は、機械学習モデルとしての指令生成モデル１４０の入力（説明変数）ではない。指令生成部１１３は、位置抽出ブロック１４１を用いて算出された作業対象の位置情報を操作者が指定した作業対象物の位置情報に置き換えて、動作指令生成ブロック１４２を用いて動作指令を算出する。 The position information of the work object specified by the operator is not an input (explanatory variable) to the command generation model 140 as a machine learning model. Instead, the command generation unit 113 replaces the position information of the work object calculated using the position extraction block 141 with the position information of the work object specified by the operator, and then calculates a motion command using the motion command generation block 142.
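
The replacement step can be sketched as follows. The two blocks are passed in as plain callables, and all names (`generate_motion_command`, `object_channel`, the stand-in blocks) are hypothetical; the stand-ins below only mimic the interfaces and are not trained models.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def generate_motion_command(
    image: object,
    sensor_info: Sequence[float],
    extract_positions: Callable[[object], List[Point]],
    generate_command: Callable[[List[Point], Sequence[float]], List[float]],
    object_channel: int,
    designated_position: Point,
) -> List[float]:
    """Run the position extraction block, overwrite the work-object coordinates
    with the operator-designated position, then run the motion command block."""
    positions = list(extract_positions(image))        # position extraction block
    positions[object_channel] = designated_position   # replace with the operator's choice
    return generate_command(positions, sensor_info)   # motion command generation block


# Stand-in blocks for illustration only.
def fake_extract(image: object) -> List[Point]:
    return [(5.0, 5.0), (0.0, 0.0)]  # [work object, end effector]


def fake_generate(positions: List[Point], sensor_info: Sequence[float]) -> List[float]:
    (ox, oy), (ex, ey) = positions
    return [ox - ex, oy - ey]  # move the hand toward the object


command = generate_motion_command(None, [], fake_extract, fake_generate, 0, (2.0, 3.0))
```

Because the substitution happens between the two blocks, the model needs no retraining and no extra input: the motion command generation block simply sees the work object at the designated position.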

≪動作指令生成処理≫
図４は、第１実施形態に係る動作指令生成処理のフローチャートである。以下の動作指令生成処理の説明において主作業とは、ロボット３００が１つ以上の作業対象３８０（例えば瓦礫）を１つずつエンドエフェクタ３０１で把持して、所定の容器（不図示）に移す（撤去する）作業である。１つの作業対象３８０を把持して容器に移すのが、１つの副作業である。第１実施形態では、撤去する作業対象３８０をロボット３００の操作者が指示して、生成された動作命令に従ってロボット３００が指示された作業対象３８０を把持して容器に移す。 <<Motion command generation process>>
FIG. 4 is a flowchart of the motion command generation process according to the first embodiment. In the following description of the motion command generation process, the main task is the task in which the robot 300 grasps one or more work objects 380 (e.g., rubble) one by one with the end effector 301 and transfers (removes) them to a predetermined container (not shown). Grasping one work object 380 and transferring it to the container is one sub-task. In the first embodiment, the operator of the robot 300 designates the work object 380 to be removed, and the robot 300 grasps the designated work object 380 and transfers it to the container in accordance with the generated motion commands.

ステップＳ１１において指令生成部１１３は、副作業ごとにステップＳ１２～Ｓ１７を繰り返す処理を開始する。詳しくは、指令生成部１１３は、位置抽出ブロック１４１を用いて、撮影画像から位置情報を算出し、位置情報に作業対象３８０の位置情報がなければ動作指令生成処理を終え、作業対象３８０があればステップＳ１２に進む。 In step S11, the command generation unit 113 begins the process of repeating steps S12 to S17 for each sub-task. In more detail, the command generation unit 113 uses the position extraction block 141 to calculate position information from the captured image, and if the position information does not include position information for the work object 380, the command generation process ends; if the work object 380 is present, the command generation unit 113 proceeds to step S12.

ステップＳ１２において指令生成部１１３は、作業対象３８０が映っているカメラ３７１，３７２の撮影画像をタッチパネルディスプレイに表示し、どの作業対象３８０が把持するかを操作者に問い合わせる。操作者が把持する作業対象３８０の１つをタッチして指示すると、指令生成部１１３はその位置を取得する。換言すれば、指令生成部１１３は作業対象となる対象物の候補のなかでロボット３００の作業対象となる対象物を指定する作業対象指定情報を取得する。以下では指示された１つの作業対象３８０を作業対象物体と記す。 In step S12, the command generation unit 113 displays the images captured by the cameras 371 and 372, which show the work objects 380, on the touch panel display and asks the operator which work object 380 should be grasped. When the operator touches one of the work objects 380 to designate it for grasping, the command generation unit 113 acquires its position. In other words, the command generation unit 113 acquires work object designation information that specifies, from among the candidate objects, the object to be worked on by the robot 300. Hereinafter, the one designated work object 380 is referred to as the work target object.

In step S13, the command generation unit 113 repeats steps S14 to S17 until the sub-task is completed (that is, until the robot grasps the work target object designated in step S12 and transfers it to the container).
In step S14, the command generation unit 113 inputs the captured image to the position extraction block 141 and acquires position information. Specifically, the command generation unit 113 calculates position information from the captured image using the position extraction block 141. The position information includes the position coordinates of each work object 380 and of the end effector 301.

In step S15, the command generation unit 113 replaces the position coordinates of the work target object included in the position information with the position information acquired in step S12. Note that as the command generation unit 113 repeats steps S14 to S17 and the robot 300 grasps and moves the work target object, the position of the work target object changes. The command generation unit 113 therefore tracks the position of the work target object designated in step S12 and uses the tracked position as the replacement. After grasping, the position of the work target object can be obtained from the position of the end effector 301, which is calculated from the motion command acquired in step S16 (described later).
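The choice of replacement position in step S15 can be sketched as follows. The source does not say how the designated object is tracked before it is grasped, so the nearest-neighbour matching used here is purely an assumption for illustration, as are the function name `track_target` and the `grasped` flag.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def track_target(previous: Point, candidates: List[Point],
                 end_effector: Point, grasped: bool) -> Point:
    """Return the position to substitute for the work target object in step S15.

    Assumption: before grasping, the target is the candidate closest to its
    previously known position (nearest-neighbour tracking, not from the source).
    After grasping, the target moves with the end effector, as the text states.
    """
    if grasped:
        return end_effector
    if not candidates:
        return previous  # nothing detected: keep the last known position
    return min(candidates, key=lambda p: math.dist(p, previous))
```

Before the grasp, the function follows small shifts in the detected position; after the grasp, it simply reports the end effector position.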

In step S16, the command generation unit 113 inputs the replaced position information and the sensor information to the motion command generation block 142 and acquires a motion command. Specifically, the command generation unit 113 calculates the motion command from the replaced position information and the sensor information using the motion command generation block 142.
In step S17, the command generation unit 113 transmits the motion command calculated in step S16 to the control device 310.
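The control flow of steps S11 to S17 can be summarized in a short sketch. Everything named here is hypothetical: the callables stand in for the cameras, the sensors, the position extraction block 141, the touch panel query, the motion command generation block 142, and the control device 310, and the `PositionInfo` layout and the `max_steps` safety limit are assumptions that do not come from the source. Tracking of a moved target is omitted for brevity.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class PositionInfo:
    """Output of the position extraction block (hypothetical layout)."""
    end_effector: Point
    work_objects: List[Point]  # candidate work objects found in the image


def run_sub_task(
    capture: Callable[[], Any],
    read_sensors: Callable[[], Any],
    extract_positions: Callable[[Any], PositionInfo],
    ask_operator: Callable[[List[Point]], Point],
    generate_command: Callable[[PositionInfo, Any], Any],
    send_command: Callable[[Any], None],
    sub_task_done: Callable[[], bool],
    max_steps: int = 1000,  # assumed safety limit, not in the source
) -> bool:
    """Run steps S11 to S17 for one sub-task.

    Returns False when no work object is found in S11, True otherwise.
    """
    info = extract_positions(capture())                   # S11: any work object left?
    if not info.work_objects:
        return False
    target = ask_operator(info.work_objects)              # S12: operator designates one
    for _ in range(max_steps):                            # S13: repeat until done
        if sub_task_done():
            break
        info = extract_positions(capture())               # S14: extract positions
        info = replace(info, work_objects=[target])       # S15: overwrite with designated position
        command = generate_command(info, read_sensors())  # S16: compute the motion command
        send_command(command)                             # S17: send to the control device
    return True
```

Calling `run_sub_task` repeatedly until it returns False corresponds to the outer loop over sub-tasks.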

<<Features of the motion command generation device>>
The motion command generation device 100 uses the command generation model 140 to generate motion commands for the robot 300 from the captured image and the sensor information. The command generation model 140 is a machine learning model. Its learning data can be acquired from work in which an operator operates the robot 300, and can therefore be created at low cost.

The command generation model 140 includes the position extraction block 141 and the motion command generation block 142, and is configured to calculate the position of the work object in the course of generating a motion command. When generating a motion command, the motion command generation device 100 replaces that position with the position of the work object designated by the operator. As a result, even when there are multiple candidate objects, the robot 300 works on the object designated by the operator.
Only one work object needs to be present when the learning data is created (that is, when the operator operates the robot 300). Compared with preparing learning data for scenes containing multiple work objects and training on it, the learning data can be created at lower cost and in less time.

<<Variation: Position information>>
In the first embodiment described above, position information (position coordinates) of the work object is calculated when a motion command is calculated using the command generation model 140 (see FIG. 3). Alternatively, by using a sigmoid function instead of the Spatial Softmax function in the position extraction block 141 to obtain a heat map, area information of the work object and the end effector 301 may be calculated instead of position information.
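The difference between the two activation choices can be illustrated with a small sketch. It assumes the commonly used definition of Spatial Softmax (a softmax over all pixels of a feature map, followed by the expected pixel coordinate); the source does not spell out the computation, and the function names, the nested-list feature map, and the 0.5 threshold are illustrative assumptions.

```python
import math
from typing import List, Tuple

FeatureMap = List[List[float]]


def spatial_softmax(feature_map: FeatureMap) -> Tuple[float, float]:
    """Return the expected (x, y) pixel coordinate under a softmax over all pixels."""
    peak = max(max(row) for row in feature_map)  # subtract the max for numerical stability
    weights = [[math.exp(v - peak) for v in row] for row in feature_map]
    total = sum(sum(row) for row in weights)
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y


def sigmoid_heatmap(feature_map: FeatureMap) -> FeatureMap:
    """Apply a per-pixel sigmoid, written with tanh so large inputs cannot overflow."""
    return [[0.5 * (1.0 + math.tanh(v / 2.0)) for v in row] for row in feature_map]


def region_mask(heatmap: FeatureMap, threshold: float = 0.5) -> List[List[bool]]:
    """Mark the pixels whose heat map value exceeds the (assumed) threshold."""
    return [[p > threshold for p in row] for row in heatmap]
```

The softmax variant collapses each feature map to a single coordinate pair, whereas the sigmoid variant keeps a value per pixel, which is why it can describe an area rather than a point.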

<<Variation: Replacing position information>>
In the first embodiment described above, the command generation unit 113 replaces the position information of the work object, which is the output of the position extraction block 141, with the position information of the work object designated by the operator (see step S15 in FIG. 4). Alternatively, the captured image used as input may be made to contain only the work object.

FIG. 5 shows a captured image 510 according to a variation of the first embodiment. Suppose that, of the three work objects at the lower right of the captured image 510, the operator designates the work object 511 on the right to be grasped. The command generation unit 113 then generates a mask image 520 (see FIG. 6, described later), which leaves the area containing the work object 511 and masks everything else, including the work objects that the operator did not designate.

FIG. 6 shows the mask image 520 according to the variation of the first embodiment. The area 521 contains the work object 511 and contains none of the other work objects, which the operator did not designate. In the image 520, everything outside the area 521 is masked.

FIG. 7 shows a captured image 530 obtained by masking the captured image 510 with the image 520 according to the variation of the first embodiment. Instead of replacing the position information, the command generation unit 113 calculates the motion command by inputting the captured image 530, in which the designated work object 511 remains and everything else is masked, to the command generation model 140 (position extraction block 141). Because the input image shows only the work object 511, the position information output by the position extraction block 141 contains only the position information of the work object 511, and a motion command for grasping the work object 511 is generated.
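The masking step that turns the captured image 510 into the masked image 530 can be sketched as below. The rectangular region, the grayscale nested-list image, the fill value of 0, and the name `mask_outside` are all assumptions made for illustration; the source does not specify the shape of the area 521 or how masked pixels are filled.

```python
from typing import List, Tuple

Image = List[List[int]]
Region = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom and right are exclusive


def mask_outside(image: Image, region: Region, fill: int = 0) -> Image:
    """Return a copy of the image in which every pixel outside the region is set to fill."""
    top, left, bottom, right = region
    return [
        [px if top <= r < bottom and left <= c < right else fill
         for c, px in enumerate(row)]
        for r, row in enumerate(image)
    ]
```

The masked copy, rather than the original image, would then be passed to the position extraction block, so that only the designated object contributes to the extracted position.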

<<Variation: Selection of work type>>
In the first embodiment described above, the operator selects the work object (see step S12 in FIG. 4). The operator may additionally select one of a plurality of work types. In rubble removal work, the work type is, for example, which container the rubble selected and grasped by the robot 300 is to be transferred to. In valve opening/closing work, the work type is, for example, whether the selected valve is to be opened or closed.

A separate command generation model 140 is provided for each work type, and each is trained to generate motion commands suited to its work type. In step S12, the command generation unit 113 acquires the work type together with the position information of the work object. In steps S14 to S17, the command generation unit 113 calculates the motion command using the command generation model 140 that corresponds to the work type.
In this way, the robot 300 can perform work of the work type specified by the operator on the work object.
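Selecting a model by work type amounts to a lookup keyed on the type acquired in step S12. The sketch below assumes each trained model can be represented as a callable taking the image and sensor information; the work type strings and the name `generate_for_work_type` are hypothetical.

```python
from typing import Any, Callable, Dict

# (image, sensor information) -> motion command; a stand-in for a trained model 140
CommandModel = Callable[[Any, Any], Any]


def generate_for_work_type(models: Dict[str, CommandModel], work_type: str,
                           image: Any, sensors: Any) -> Any:
    """Compute a motion command with the model trained for the given work type."""
    if work_type not in models:
        raise ValueError(f"no command generation model for work type {work_type!r}")
    return models[work_type](image, sensors)
```

For the valve example, `models` might hold one entry for opening and one for closing, each trained only on demonstrations of that operation.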

<<Second embodiment>>
In the first embodiment described above, the operator is asked which work object to grasp (see step S12 in FIG. 4). When the work object to be grasped can be narrowed down to one, the work may be performed without querying the operator. For example, when the work objects in the captured image can be regarded as being in a single location, the work may be performed without querying the operator. This is the case when the position extraction block 141 calculates one or more pieces of position information (position coordinates) but their spread (variance) is small enough that they can be regarded as a single location.

The functional configuration of the motion command generation device 100 according to the second embodiment is the same as in the first embodiment, except for the command generation unit 113 (motion command generation process). The command generation unit of the second embodiment is referred to as the command generation unit 113A. FIG. 8 is a flowchart of the motion command generation process according to the second embodiment.
In step S31, the command generation unit 113A starts a loop that repeats steps S32 to S40 for each sub-task.
In step S32, the command generation unit 113A inputs the captured image to the position extraction block 141 to acquire position information, and calculates the variance of the position information. The variance is, for example, the sum of the variance of the X coordinates and the variance of the Y coordinates that indicate the positions of the work objects.

In step S33, if the variance calculated in step S32 is greater than a predetermined value (step S33→YES), the command generation unit 113A proceeds to step S34; if it is equal to or less than the predetermined value (step S33→NO), it proceeds to step S35.
Step S34 is the same process as step S12 in FIG. 4.
In step S35, the command generation unit 113A repeats steps S36 to S40 until the robot grasps the work target object designated in step S34 and transfers it to the container. When step S34 has been skipped (step S33→NO), the work target object is the work object located at the position given by the position information whose variance is small enough to be regarded as a single location.
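The decision in steps S32 and S33 can be written down directly. The sketch assumes population variance (the source says only "variance") and uses hypothetical names; the threshold corresponds to the predetermined value, whose magnitude the source does not give.

```python
from statistics import pvariance
from typing import List, Tuple

Point = Tuple[float, float]


def position_variance(points: List[Point]) -> float:
    """Sum of the variance of the X coordinates and the variance of the Y coordinates (S32)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return pvariance(xs) + pvariance(ys)  # population variance is an assumption


def needs_operator_query(points: List[Point], threshold: float) -> bool:
    """S33: query the operator only when the positions are spread out."""
    return position_variance(points) > threshold
```

With a single extracted position the variance is zero, so the operator is never queried in that case.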

Step S36 is the same process as step S14.
In step S37, if the command generation unit 113A acquired the position of the work target object designated by the operator in step S34 (see step S12) (step S37→YES), it proceeds to step S38; if not (step S37→NO), it proceeds to step S39.
Steps S38 to S40 are the same processes as steps S15 to S17, respectively.

<<Features of the second embodiment>>
When the work objects in the captured image can be regarded as being in a single location, the robot 300 performs the work without querying the operator, which improves work efficiency.
Alternatively, instead of branching to NO in step S37 and skipping step S38, step S38 may always be executed after step S36. In this case, in step S38, the command generation unit 113A may replace the position coordinates of the work target object with position information that follows the movement of the grasped work target object (see step S15).

<<Third embodiment>>
In the first embodiment, the order of the work objects is determined. When the order is not determined, the work may be performed on one object after another without querying the operator. For example, in rubble removal work where pieces of rubble of similar shape and appearance are scattered and may be removed in any order, the robot may simply grasp the pieces one after another and transfer them to the container.

The functional configuration of the motion command generation device 100 according to the third embodiment is the same as in the first embodiment, except for the command generation unit (motion command generation process). The command generation unit of the third embodiment is referred to as the command generation unit 113B. FIG. 9 is a flowchart of the motion command generation process according to the third embodiment.
In step S51, the command generation unit 113B starts a loop that repeats steps S52 to S58 for each sub-task.
In step S52, the command generation unit 113B inputs the captured image to the position extraction block 141 to acquire the position information of the work objects.

In step S53, the command generation unit 113B randomly selects one of the work objects in the position information acquired in step S52. Hereinafter, the selected work object is referred to as the work target object.
Steps S54 to S58 are the same as steps S13 to S17, respectively, except that in step S56 the position coordinates of the work target object included in the position information are replaced with the position information of the work target object selected in step S53.
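The random selection in step S53 takes the place of the operator query in step S12. A minimal sketch follows, assuming the candidates are a list of coordinate pairs; the function name and the optional `rng` parameter (added so that runs can be reproduced) are not from the source.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def pick_work_target(candidates: List[Point],
                     rng: Optional[random.Random] = None) -> Optional[Point]:
    """S53: choose one work object uniformly at random, or None if there is none."""
    if not candidates:
        return None
    chooser = random if rng is None else rng
    return chooser.choice(candidates)
```

The returned position then plays the role of the designated position in step S56, exactly as the operator's choice does in step S15.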

<<Features of the third embodiment>>
The robot 300 performs the work without querying the operator, which improves work efficiency.

<<Variations>>
Several embodiments of the present invention have been described above, but these embodiments are merely examples and do not limit the technical scope of the present invention. For example, rubble removal (grasping rubble and moving it to a container) was used as an example of the work of the robot 300, but the work is not limited to this. It may be, for example, opening and closing multiple valves in a plant in a predetermined order, or transporting objects (boxes). In the case of valve opening/closing work, a sub-task is the opening or closing of one valve.

In the first embodiment described above, the operator of the robot 300 selects the object to be worked on, but this is not a limitation. A system that determines the order of the work objects may be provided, and this system may select the work object.
In the first embodiment described above, the command generation unit 113 determines from the position information whether a work object 380 (its position information) is present (see step S11 in FIG. 4), but the operator may instead make this determination from the captured image. The operator may also determine when a sub-task has ended (see step S13).

The motion command generation device 100 in the embodiments described above includes the learning unit 112 and the command generation unit 113, 113A, or 113B; it generates the command generation model 140 and outputs motion commands using that model. The device that generates the command generation model 140 and the device that outputs motion commands may be separate. For example, a model generation device including the learning unit 112 may generate the command generation model 140 and transmit it to multiple motion command devices, and each motion command device may output motion commands to the control device of its own robot. Alternatively, the motion command generation device 100 and the control device 310 may be integrated.

The present invention can take various other embodiments, and various changes such as omissions and substitutions can be made without departing from the gist of the present invention. These embodiments and their variations are included in the scope and gist of the invention described in this specification and elsewhere, and are included in the invention described in the claims and its equivalents.

100 Motion command generation device
111 Acquisition unit
112 Learning unit
113, 113A, 113B Command generation unit
128 Program
130 Learning data
140 Command generation model (machine learning model)
141 Position extraction block
142 Motion command generation block
300 Robot
380 Work object (object)

Claims

A motion command generating device comprising:
an acquisition unit that acquires sensor information and an image including candidates for an object on which a robot is to perform work; and
a command generation unit that uses a machine learning model to output a motion command for the robot with the image and the sensor information as input, wherein
the machine learning model
includes a position extraction block including parameters referenced when calculating a position or area of the candidate objects, and a motion command generation block including parameters referenced when calculating a motion command for the robot to perform work on the object, and
is generated using learning data including an explanatory variable that includes the image and the sensor information and a target variable that includes the motion command, and
the command generation unit
acquires work object designation information that designates, from among the candidate objects, the object on which the robot is to perform work, and
replaces the information on the position or area of the candidate objects calculated using the position extraction block with the information on the position or area of the object indicated by the work object designation information, and outputs the motion command.

The motion command generating device according to claim 1, wherein
the machine learning model comprises a plurality of machine learning models, one for each work type, and
the command generation unit
acquires the work type of the work that the robot is to perform on the object corresponding to the work object designation information, and
outputs the motion command for the robot, with the image and the sensor information as input, using the machine learning model that corresponds to the work type from among the plurality of machine learning models.

The motion command generating device according to claim 1, wherein
the command generation unit uses the machine learning model to output the motion command with, as input, an image in which areas other than the area of the object indicated by the work object designation information are masked.

The motion command generating device according to claim 1, wherein,
when a variance calculated from the information on the position or area of the candidate objects calculated using the position extraction block is greater than a predetermined value,
the command generation unit replaces the information on the position or area of the candidate objects calculated using the position extraction block with the information on the position or area of the object indicated by the work object designation information, and outputs the motion command.

The motion command generating device according to claim 1, wherein
the command generation unit displays the image on a display device and uses, as the work object designation information, information on the position or area in the image of a candidate object designated from among the displayed candidate objects.

The motion command generating device according to claim 1, wherein
the command generation unit
selects, as the work object designation information, one piece of the information on the position or area of the candidate objects calculated using the position extraction block, and
replaces the information on the position or area of the candidate objects calculated using the position extraction block with the information on the position or area of the object indicated by the work object designation information, and outputs the motion command.

A motion command generating method in which a motion command generating device executes:
a step of acquiring sensor information and an image including candidates for an object on which a robot is to perform work; and
a step of outputting, using a machine learning model, a motion command for the robot with the image and the sensor information as input, wherein
the machine learning model
includes a position extraction block including parameters referenced when calculating a position or area of the candidate objects, and a motion command generation block including parameters referenced when calculating a motion command for the robot to perform work on the object, and
is generated using learning data including an explanatory variable that includes the image and the sensor information and a target variable that includes the motion command, and
in the step of outputting the motion command, the motion command generating device
acquires work object designation information that designates, from among the candidate objects, the object on which the robot is to perform work, and
replaces the information on the position or area of the candidate objects calculated using the position extraction block with the information on the position or area of the object indicated by the work object designation information, and outputs the motion command.