JP7575007B2

JP7575007B2 - Hand load estimation device, robot control system, and robot system

Info

Publication number: JP7575007B2
Application number: JP2021064141A
Authority: JP
Inventors: 諒松岡; 清石前川; 秀司梶田; 夏樹藤井; 英樹麻生; 昭太郎赤穂
Original assignee: Mitsubishi Electric Corp; National Institute of Advanced Industrial Science and Technology AIST
Current assignee: Mitsubishi Electric Corp; National Institute of Advanced Industrial Science and Technology AIST
Priority date: 2021-04-05
Filing date: 2021-04-05
Publication date: 2024-10-29
Anticipated expiration: 2041-04-05
Also published as: JP2022159755A

Description

本開示は、ロボットの手先に加えられる負荷についてのパラメータを推定する手先負荷推定装置、ロボット制御システムおよびロボットシステムに関する。 The present disclosure relates to a hand load estimation device that estimates parameters regarding the load applied to a robot's hand, a robot control system, and a robot system.

産業用ロボット等のロボットの制御では、高速な動作および高精度な位置決めといった性能をロボットが発揮できるように、操作対象である手先に加えられる負荷についての質量または重心位置等のパラメータを適切に設定することが求められる。以下、ロボットの手先に加えられる負荷についてのパラメータを、手先負荷パラメータと称する。 When controlling industrial and other robots, it is necessary to appropriately set parameters such as the mass or center of gravity position of the load applied to the hand that is the object of operation so that the robot can perform high-speed operations and highly accurate positioning. Hereinafter, parameters related to the load applied to the robot's hand are referred to as hand load parameters.

特許文献１には、多関節のロボットについて、ロボットを動作させたときにおける各関節の状態量を基に手先負荷パラメータを同定する方法が開示されている。特許文献１の方法では、手先負荷パラメータについて線形化された式と、ロボットモデルに示される駆動トルクとに基づいて、ロボットの各関節の位置、速度、加速度、および、動作に要する電流といった状態量から手先負荷パラメータが同定される。特許文献１の方法によると、負荷の種類またはロボットの動作パターンを限定せずに、手先負荷パラメータを推定することができる。 Patent Document 1 discloses a method for identifying hand load parameters for a multi-joint robot based on the state quantities of each joint when the robot is operated. In the method of Patent Document 1, hand load parameters are identified from state quantities such as the position, speed, acceleration, and current required for operation of each joint of the robot, based on a linearized equation for the hand load parameters and the drive torque shown in the robot model. According to the method of Patent Document 1, hand load parameters can be estimated without limiting the type of load or the robot's operation pattern.

特許第４１３７６７３号公報 Patent Document 1: Japanese Patent No. 4137673

特許文献１の方法では、手先負荷パラメータは、負荷の推定のために取得されたデータのみに基づいて推定される。このため、特許文献１の方法の場合、ロボットモデルのモデル化誤差の影響、または、ロボットモデルに含まれていない特性の影響によって、手先負荷パラメータの推定結果には、実際の手先負荷パラメータからの誤差が生じる場合がある。このため、特許文献１にかかる従来の技術によると、手先負荷パラメータの高精度な推定結果を得ることが困難であるという問題があった。 In the method of Patent Document 1, the hand load parameters are estimated based only on the data acquired for load estimation. For this reason, in the case of the method of Patent Document 1, the estimated hand load parameters may have errors from the actual hand load parameters due to the influence of modeling errors in the robot model or the influence of characteristics not included in the robot model. For this reason, the conventional technology of Patent Document 1 has the problem that it is difficult to obtain highly accurate estimates of the hand load parameters.

本開示は、上記に鑑みてなされたものであって、手先負荷パラメータの高精度な推定結果を得ることができる手先負荷推定装置を得ることを目的とする。 The present disclosure has been made in consideration of the above, and aims to provide a hand load estimation device that can obtain highly accurate estimation results of hand load parameters.

上述した課題を解決し、目的を達成するために、本開示にかかる手先負荷推定装置は、ロボットの手先に加えられる負荷についてのパラメータである手先負荷パラメータを推定する手先負荷推定装置である。本開示にかかる手先負荷推定装置は、ロボットの関節の位置、速度および加速度のうち少なくとも１つのデータである動きデータが逆動力学モデルに入力されることによって、逆動力学モデルを基に計算されるトルクである公称トルクの値を出力する逆動力学モデル部と、動きデータと、関節の実トルクおよび公称トルクの差分とを含む学習用データに基づいて、逆動力学モデルのモデル化誤差の影響と、ロボットの特性のうち逆動力学モデルに含まれていない特性の影響とが除かれた手先負荷パラメータを学習する学習部と、動きデータと、実トルクおよび公称トルクの差分とを含む推論用データが入力され、学習部による学習結果を基に手先負荷パラメータを推論する推論部と、を備える。 In order to solve the above-mentioned problems and achieve the object, the hand load estimation device according to the present disclosure is a hand load estimation device that estimates a hand load parameter that is a parameter regarding the load applied to the hand of a robot. The hand load estimation device according to the present disclosure includes an inverse dynamics model unit that outputs a nominal torque value that is a torque calculated based on an inverse dynamics model by inputting motion data, which is at least one of data on the position, speed, and acceleration of a joint of the robot, into the inverse dynamics model; a learning unit that learns a hand load parameter from which the influence of modeling errors of the inverse dynamics model and the influence of characteristics of the robot that are not included in the inverse dynamics model are removed, based on learning data including the motion data and the difference between the actual torque and the nominal torque of the joint; and an inference unit that receives inference data including the motion data and the difference between the actual torque and the nominal torque, and infers the hand load parameter based on the learning result by the learning unit.

本開示にかかる手先負荷推定装置は、手先負荷パラメータの高精度な推定結果を得ることができるという効果を奏する。 The hand load estimation device disclosed herein has the advantage of being able to obtain highly accurate estimation results of hand load parameters.

実施の形態１にかかる手先負荷推定装置の構成を示す図 FIG. 1 is a diagram showing the configuration of a hand load estimation device according to a first embodiment.
実施の形態１にかかる手先負荷推定装置がＲＮＮを用いて手先負荷パラメータの教師あり学習を行う場合における手先負荷推定装置の構成を示す図 FIG. 2 is a diagram showing the configuration of the hand load estimation device according to the first embodiment when it performs supervised learning of hand load parameters using an RNN.
実施の形態１における学習に使用されるニューラルネットワークの構成を示す概念図 FIG. 3 is a conceptual diagram showing the configuration of a neural network used for learning in the first embodiment.
図３に示すニューラルネットワークを構成するＬＳＴＭユニットの例を示す図 FIG. 4 is a diagram showing an example of an LSTM unit constituting the neural network shown in FIG. 3.
実施の形態１にかかる手先負荷推定装置がトルクセンサによって検出されたトルクの値を実トルクの値として用いる場合における手先負荷推定装置の構成を示す図 FIG. 5 is a diagram showing the configuration of the hand load estimation device according to the first embodiment when it uses a torque value detected by a torque sensor as the actual torque value.
実施の形態２にかかる手先負荷推定装置の構成を示す図 FIG. 6 is a diagram showing the configuration of a hand load estimation device according to a second embodiment.
実施の形態２における学習に使用されるＧＰＲについて説明するための図 FIG. 7 is a diagram for explaining the GPR used for learning in the second embodiment.
実施の形態２における推論に使用されるヒストグラムフィルタについて説明するための図 FIG. 8 is a diagram for explaining the histogram filter used for inference in the second embodiment.
実施の形態３にかかる手先負荷推定装置の構成を示す図 FIG. 9 is a diagram showing the configuration of a hand load estimation device according to a third embodiment.
実施の形態４にかかる手先負荷推定装置の構成を示す図 FIG. 10 is a diagram showing the configuration of a hand load estimation device according to a fourth embodiment.
実施の形態５にかかる手先負荷推定装置の構成を示す図 FIG. 11 is a diagram showing the configuration of a hand load estimation device according to a fifth embodiment.
実施の形態６にかかる手先負荷推定装置の構成を示す図 FIG. 12 is a diagram showing the configuration of a hand load estimation device according to a sixth embodiment.
実施の形態７にかかる手先負荷推定装置の構成を示す図 FIG. 13 is a diagram showing the configuration of a hand load estimation device according to a seventh embodiment.
実施の形態７におけるベイズ線形回帰による確率分布の更新について説明するための概念図 FIG. 14 is a conceptual diagram for explaining the updating of a probability distribution by Bayesian linear regression in the seventh embodiment.
実施の形態８にかかるロボットシステムの構成を示す図 FIG. 15 is a diagram showing the configuration of a robot system according to an eighth embodiment.
実施の形態１から７にかかる手先負荷推定装置を実現するハードウェアの第１の構成例を示す図 FIG. 16 is a diagram showing a first configuration example of hardware for realizing the hand load estimation device according to the first to seventh embodiments.
実施の形態１から７にかかる手先負荷推定装置を実現するハードウェアの第２の構成例を示す図 FIG. 17 is a diagram showing a second configuration example of hardware for realizing the hand load estimation device according to the first to seventh embodiments.

以下に、実施の形態にかかる手先負荷推定装置、ロボット制御システムおよびロボットシステムを図面に基づいて詳細に説明する。 The hand load estimation device, robot control system, and robot system according to the embodiments are described in detail below with reference to the drawings.

実施の形態１． Embodiment 1.
図１は、実施の形態１にかかる手先負荷推定装置１の構成を示す図である。手先負荷推定装置１は、ロボットの手先に加えられる負荷についてのパラメータである手先負荷パラメータを推定する。実施の形態１において、ロボットは、産業用ロボットである。ロボットは、ロボットの可動部を動作させる複数の関節を有する。手先は、可動部の先端部分である。手先には、負荷が取り付けられる。負荷の例は、ハンドまたは工具といった着脱可能な部品である。 FIG. 1 is a diagram showing the configuration of a hand load estimation device 1 according to a first embodiment. The hand load estimation device 1 estimates a hand load parameter, which is a parameter regarding a load applied to a hand of a robot. In the first embodiment, the robot is an industrial robot. The robot has a plurality of joints that operate a movable part of the robot. The hand is the tip part of the movable part. A load is attached to the hand. An example of the load is a detachable part such as a hand or a tool.

手先負荷推定装置１は、入力部１１、逆動力学モデル部１２、学習部１３、学習結果記憶部１４および推論部１５を有する。入力部１１には、ロボットの内部の状態量のデータが、ロボットの制御周期ごとに入力される。これにより、手先負荷推定装置１は、ロボットの内部の状態量の時系列データを取得する。状態量のデータは、制御周期の整数倍ごとに入力されても良い。 The hand load estimation device 1 has an input unit 11, an inverse dynamics model unit 12, a learning unit 13, a learning result storage unit 14, and an inference unit 15. Data on the internal state quantities of the robot are input to the input unit 11 for each control cycle of the robot. In this way, the hand load estimation device 1 obtains time series data on the internal state quantities of the robot. The state quantity data may be input for each integer multiple of the control cycle.

状態量のデータは、位置データと電流データとを含む。位置データは、関節の位置を表すデータである。ここでは、位置データは、関節の位置のフィードバック値とする。関節の位置とは、関節の回転方向における位置であって、例えば回転角により表される。電流データは、関節を回転させるモータに流れる電流を表すデータである。ここでは、電流データは、モータに流れる電流のフィードバック値とする。入力部１１には、各関節の位置フィードバック（ＦＢ）値である位置データと、各関節の電流フィードバック（ＦＢ）値である電流データが入力される。 The state quantity data includes position data and current data. The position data is data that represents the position of the joint. Here, the position data is a feedback value of the joint position. The position of the joint is the position in the rotational direction of the joint, and is represented by, for example, a rotation angle. The current data is data that represents the current flowing through the motor that rotates the joint. Here, the current data is a feedback value of the current flowing through the motor. Position data, which is the position feedback (FB) value of each joint, and current data, which is the current feedback (FB) value of each joint are input to the input unit 11.

学習部１３による学習の際に、入力部１１は、学習時の入力データとして、位置ＦＢ値の時系列データと電流ＦＢ値の時系列データとを取得する。推論部１５による推論の際に、入力部１１は、推論時の入力データとして、位置ＦＢ値の時系列データと電流ＦＢ値の時系列データとを取得する。図１では、学習時における入力部１１の入力データのイメージと、推論時における入力部１１の入力データのイメージとを示す。 When the learning unit 13 is learning, the input unit 11 acquires time series data of the position FB value and time series data of the current FB value as input data during learning. When the inference unit 15 is inferring, the input unit 11 acquires time series data of the position FB value and time series data of the current FB value as input data during inference. Figure 1 shows an image of the input data of the input unit 11 during learning and an image of the input data of the input unit 11 during inference.

各関節の位置θ、速度θ′および加速度θ′′は、入力部１１へ入力された位置データから計算される。位置データから位置θ、速度θ′および加速度θ′′を計算する手段の図示は省略する。逆動力学モデル部１２には、各関節の位置データから求まる各関節の位置θ、速度θ′および加速度θ′′のデータである動きデータが入力される。逆動力学モデルは、運動方程式を解くことによってトルクを求めるモデルである。逆動力学モデル部１２は、各関節の位置θ、速度θ′および加速度θ′′が逆動力学モデルに入力されることによって、各関節の公称トルクτ_ｍｄｌを出力する。公称トルクτ_ｍｄｌは、関節についての運動方程式から求まるトルクである。逆動力学モデルは、事前に手先負荷推定装置１に内蔵される。 The position θ, velocity θ', and acceleration θ'' of each joint are calculated from the position data input to the input unit 11. The means for calculating the position θ, velocity θ', and acceleration θ'' from the position data is not shown in the figure. Motion data, which is data on the position θ, velocity θ', and acceleration θ'' of each joint obtained from the position data of each joint, is input to the inverse dynamics model unit 12. The inverse dynamics model is a model that obtains torque by solving an equation of motion. When the position θ, velocity θ', and acceleration θ'' of each joint are input to the inverse dynamics model, the inverse dynamics model unit 12 outputs the nominal torque τ_mdl of each joint. The nominal torque τ_mdl is the torque obtained from the equation of motion for the joint. The inverse dynamics model is built into the hand load estimation device 1 in advance.
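The description leaves open how velocity and acceleration are derived from the position data. One simple possibility, shown here only as a sketch, is finite differencing of the sampled positions. The function name `finite_differences` and the assumption of a constant control period `dt` are illustrative and do not come from the patent; a real system would likely add filtering against sensor noise.

```python
def finite_differences(theta, dt):
    """Estimate joint velocity and acceleration from sampled positions.

    theta: list of joint positions [rad], one sample per control cycle.
    dt:    control period [s], assumed constant (illustrative assumption).
    Returns (velocity, acceleration), each the same length as theta.
    Central differences are used inside the sequence and one-sided
    differences at the two ends.
    """
    n = len(theta)
    if n < 3:
        raise ValueError("need at least three samples")

    def diff(x):
        d = [0.0] * n
        d[0] = (x[1] - x[0]) / dt
        d[-1] = (x[-1] - x[-2]) / dt
        for k in range(1, n - 1):
            d[k] = (x[k + 1] - x[k - 1]) / (2.0 * dt)
        return d

    velocity = diff(theta)
    acceleration = diff(velocity)
    return velocity, acceleration
```

The end samples are less accurate than the interior ones because only one-sided differences are available there.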

逆動力学モデル部１２は、運動方程式を計算する際に、事前に設定された公称パラメータを参照する。公称パラメータとしては、ロボットの各リンクの質量、重心位置または慣性テンソルといったパラメータが設定される。公称パラメータには、ロボットの設計時の値を使用できる。または、公称パラメータには、位置データおよび電流データが取得されるロボットにより事前に同定された値を使用しても良い。 When calculating the equation of motion, the inverse dynamics model unit 12 refers to pre-set nominal parameters. Parameters such as the mass, center of gravity position, or inertia tensor of each link of the robot are set as the nominal parameters. Values at the time of designing the robot can be used for the nominal parameters. Alternatively, values previously identified by the robot from which position data and current data are acquired can be used for the nominal parameters.

学習部１３による学習の際に、逆動力学モデル部１２は、学習時の入力データから得られた位置θ、速度θ′および加速度θ′′を基に公称トルクτ_ｍｄｌを計算する。推論部１５による推論の際に、逆動力学モデル部１２は、推論時の入力データから得られた位置θ、速度θ′および加速度θ′′を基に公称トルクτ_ｍｄｌを計算する。なお、図１では、学習時のデータフローと推論時のデータフローとを分けて示す都合上、逆動力学モデル部１２を２つに分けて示す。 During learning by the learning unit 13, the inverse dynamics model unit 12 calculates the nominal torque τ_mdl based on the position θ, velocity θ', and acceleration θ'' obtained from the input data during learning. During inference by the inference unit 15, the inverse dynamics model unit 12 calculates the nominal torque τ_mdl based on the position θ, velocity θ', and acceleration θ'' obtained from the input data during inference. Note that in FIG. 1, the inverse dynamics model unit 12 is drawn as two separate blocks so that the data flow during learning and the data flow during inference can be shown separately.

実トルクτ_ｒｅａｌは、入力部１１へ入力された電流データにトルク定数を乗じることによって求まる。すなわち、手先負荷推定装置１は、電流データにトルク定数を乗じた値を実トルクのτ_ｒｅａｌ値として用いる。手先負荷推定装置１は、実トルクτ_ｒｅａｌと、逆動力学モデル部１２から出力された公称トルクτ_ｍｄｌとの差分τ_ｒｅａｌ－τ_ｍｄｌを計算する。 The actual torque τ_real is obtained by multiplying the current data input to the input unit 11 by a torque constant. That is, the hand load estimation device 1 uses the value obtained by multiplying the current data by the torque constant as the value of the actual torque τ_real. The hand load estimation device 1 calculates the difference τ_real - τ_mdl between the actual torque τ_real and the nominal torque τ_mdl output from the inverse dynamics model unit 12.
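The residual computation described above can be sketched in a few lines. The function name `torque_residual` and the per-joint list layout are assumptions made for illustration. Whether the torque constant already accounts for a gear ratio is also not stated in the text, so it is treated here as a single factor per joint.

```python
def torque_residual(current_fb, torque_constant, tau_mdl):
    """Return the per-joint residual tau_real - tau_mdl.

    current_fb:      motor current feedback per joint [A]
    torque_constant: torque constant per joint [N*m/A] (hypothetical values)
    tau_mdl:         nominal torque per joint from the inverse dynamics
                     model [N*m]
    """
    if not (len(current_fb) == len(torque_constant) == len(tau_mdl)):
        raise ValueError("all inputs must have one entry per joint")
    # Actual torque: current multiplied by the torque constant.
    tau_real = [i * kt for i, kt in zip(current_fb, torque_constant)]
    # Residual that is passed on to the learning and inference units.
    return [r - m for r, m in zip(tau_real, tau_mdl)]
```

For example, `torque_residual([2.0, 1.0], [0.5, 0.25], [0.75, 0.25])` yields `[0.25, 0.0]`.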

学習部１３には、学習用データとして、学習時の入力データから得られた動きデータである各関節の位置θ、速度θ′および加速度θ′′と、学習時の入力データに基づく各関節の差分τ_ｒｅａｌ－τ_ｍｄｌとが入力される。学習部１３には、各関節についての学習用データが入力される。学習部１３は、逆動力学モデルのモデル化誤差の影響と、ロボットの特性のうち逆動力学モデルに含まれていない特性の影響とが除かれた手先負荷パラメータを、学習用データを用いて学習する。以下、ロボットの特性のうち逆動力学モデルに含まれていない特性を、モデル外特性と称する。学習部１３は、学習結果を学習結果記憶部１４に保存する。 The learning unit 13 receives as learning data the position θ, velocity θ', and acceleration θ'' of each joint, which is movement data obtained from the input data during learning, and the difference τ _real - τ _mdl of each joint based on the input data during learning. The learning data for each joint is input to the learning unit 13. Using the learning data, the learning unit 13 learns hand load parameters from which the influence of modeling errors in the inverse dynamics model and the influence of characteristics of the robot that are not included in the inverse dynamics model have been removed. Hereinafter, characteristics of the robot that are not included in the inverse dynamics model are referred to as outside-model characteristics. The learning unit 13 stores the learning results in the learning result storage unit 14.

推論部１５には、推論用データとして、推論時の入力データから得られた動きデータである各関節の位置θ、速度θ′および加速度θ′′と、推論時の入力データに基づく各関節の差分τ_ｒｅａｌ－τ_ｍｄｌとが入力される。推論部１５には、各関節についての推論用データが入力される。また、推論部１５は、学習部１３による学習結果を学習結果記憶部１４から読み出す。推論部１５は、学習部１３による学習結果を基に、手先負荷パラメータを推定する。図１に示す構成において、手先負荷推定装置１は、学習部１３における学習方法として任意の方法を適用することができる。手先負荷パラメータは、負荷の質量、負荷の重心位置、および負荷の慣性テンソルの各パラメータのうち少なくとも１つである。なお、逆動力学モデル部１２、学習部１３および推論部１５の各々へ入力される動きデータは、位置θ、速度θ′および加速度θ′′の全てについてのデータに限られず、位置θ、速度θ′および加速度θ′′のうち少なくとも１つのデータであれば良い。 The inference unit 15 receives, as inference data, the position θ, velocity θ', and acceleration θ'' of each joint, which is motion data obtained from the input data at the time of inference, and the difference τ _real -τ _mdl of each joint based on the input data at the time of inference. The inference data for each joint is input to the inference unit 15. The inference unit 15 also reads out the learning result by the learning unit 13 from the learning result storage unit 14. The inference unit 15 estimates hand load parameters based on the learning result by the learning unit 13. In the configuration shown in FIG. 1, the hand load estimation device 1 can apply any method as the learning method in the learning unit 13. The hand load parameter is at least one of the parameters of the mass of the load, the center of gravity position of the load, and the inertia tensor of the load. In addition, the motion data input to each of the inverse dynamics model unit 12, the learning unit 13 and the inference unit 15 is not limited to data regarding all of the position θ, velocity θ' and acceleration θ'', but may be data regarding at least one of the position θ, velocity θ' and acceleration θ''.

推論部１５は、推論用データとして、ロボットの動作開始直後における位置ＦＢ値に基づいた動きデータと、ロボットの動作開始直後における電流ＦＢ値に基づいた差分τ_ｒｅａｌ－τ_ｍｄｌとを使用する。これにより、手先負荷推定装置１は、ロボットが動作を開始したときから早い時点において手先負荷パラメータを推定することができる。また、ロボットが動作を開始したときから早い時点において、各関節を制御するための指令値に手先負荷パラメータの推定結果を反映させることが可能となる。 The inference unit 15 uses, as inference data, the motion data based on the position FB value immediately after the robot starts to move, and the difference τ _real -τ _mdl based on the current FB value immediately after the robot starts to move. This allows the hand load estimation device 1 to estimate the hand load parameter at an early point in time after the robot starts to move. Also, it becomes possible to reflect the estimation result of the hand load parameter in the command value for controlling each joint at an early point in time after the robot starts to move.

次に、学習方法の具体的な例について説明する。ここでは、ＲＮＮ(Recurrent Neural Network)を用いて手先負荷パラメータの教師あり学習を行う場合について説明する。図２は、実施の形態１にかかる手先負荷推定装置１がＲＮＮを用いて手先負荷パラメータの教師あり学習を行う場合における手先負荷推定装置１の構成を示す図である。 Next, a specific example of the learning method will be described. Here, a case where supervised learning of hand load parameters is performed using an RNN (Recurrent Neural Network) will be described. Figure 2 is a diagram showing the configuration of the hand load estimation device 1 according to the first embodiment when the hand load estimation device 1 performs supervised learning of hand load parameters using an RNN.

図２に示す手先負荷推定装置１の学習部１３ａには、学習用データとして、各関節の位置θ、速度θ′および加速度θ′′と、学習時の入力データに基づく各関節の差分τ_ｒｅａｌ－τ_ｍｄｌとが入力される。また、学習部１３ａには、教師信号が入力される。教師信号は、手先負荷パラメータの正解ラベルである。正解ラベルは、逆動力学モデルのモデル化誤差とモデル外特性とによる誤差を含まない手先負荷パラメータである。学習部１３ａは、位置θ、速度θ′および加速度θ′′と差分τ_ｒｅａｌ－τ_ｍｄｌとの入力に対する出力が正解ラベルに一致するようなニューラルネットワークを構築し、学習結果記憶部１４ａに学習結果を保存する。 The learning unit 13a of the hand load estimation device 1 shown in FIG. 2 receives as learning data the position θ, velocity θ', and acceleration θ'' of each joint, and the difference τ _real -τ _mdl of each joint based on the input data during learning. A teacher signal is also input to the learning unit 13a. The teacher signal is a correct answer label of the hand load parameter. The correct answer label is a hand load parameter that does not include errors due to modeling errors of the inverse dynamics model and out-of-model characteristics. The learning unit 13a constructs a neural network in which the output for the input of the position θ, velocity θ', acceleration θ'' and the difference τ _real -τ _mdl matches the correct answer label, and stores the learning result in the learning result storage unit 14a.

ＲＮＮの例としては、Ｅｌｍａｎ／Ｊｏｒｄａｎｎｅｔ、ＬＳＴＭ（Long Short Term Memory）、ＧＲＵ（Gated Recurrent Unit）などがあり、いずれを用いても良い。実施の形態１では、ＬＳＴＭを用いる場合を例に挙げて説明する。 Examples of RNNs include Elman/Jordan net, LSTM (Long Short Term Memory), GRU (Gated Recurrent Unit), etc., and any of them may be used. In the first embodiment, the case where LSTM is used will be explained as an example.

図３は、実施の形態１における学習に使用されるニューラルネットワークの構成を示す概念図である。ニューラルネットワークは、入力層２１、中間層２２および出力層２３を備える。入力層２１は、入力されるデータの要素の数、すなわち入力数に応じた数のニューロン２４を備える。図３では入力層２１における一番上のニューロン２４にのみ符号を付しているが、入力層２１において符号を省略した丸もニューロン２４である。図３ではニューロン２４の数を３としているが、ニューロン２４の数は３に限定されない。 Figure 3 is a conceptual diagram showing the configuration of a neural network used for learning in embodiment 1. The neural network includes an input layer 21, an intermediate layer 22, and an output layer 23. The input layer 21 includes neurons 24, the number of which corresponds to the number of elements of the input data, i.e., the number of inputs. In Figure 3, only the topmost neuron 24 in the input layer 21 is labeled, but circles in the input layer 21 with no label are also neurons 24. In Figure 3, the number of neurons 24 is three, but the number of neurons 24 is not limited to three.

中間層２２は、複数のニューロン２５を備える。各ニューロン２５は、ＬＳＴＭユニットである。図３では中間層２２における一番上のニューロン２５にのみ符号を付しているが、中間層２２において符号を省略した二重丸もニューロン２５である。また、図３では中間層を１層としているが、中間層は２層以上でも良い。図３ではニューロン２５の数を５としているが、ニューロン２５の数は５に限定されない。 The intermediate layer 22 includes a plurality of neurons 25. Each neuron 25 is an LSTM unit. In FIG. 3, only the topmost neuron 25 in the intermediate layer 22 is labeled, but double circles without a label in the intermediate layer 22 are also neurons 25. In FIG. 3, the intermediate layer is one layer, but the intermediate layer may be two or more layers. In FIG. 3, the number of neurons 25 is five, but the number of neurons 25 is not limited to five.

出力層２３は、出力されるデータの要素の数、すなわち出力数に応じた数のニューロン２６を備える。図３では出力層２３における一番上のニューロン２６にのみ符号を付しているが、出力層２３において符号を省略した丸もニューロン２６である。図３ではニューロン２６の数を３としているが、ニューロン２６の数は３に限定されない。 The output layer 23 has neurons 26, the number of which corresponds to the number of data elements to be output, i.e., the number of outputs. In FIG. 3, only the topmost neuron 26 in the output layer 23 is labeled, but circles in the output layer 23 with no label are also neurons 26. In FIG. 3, the number of neurons 26 is three, but the number of neurons 26 is not limited to three.

入力層２１は、入力されたデータを中間層２２へ出力する。詳細には、各ニューロン２４は、入力されたデータを中間層２２の各ニューロン２５へ出力する。各ニューロン２４は、入力された値に重みを乗算してから中間層２２の各ニューロン２５へ出力する。中間層２２の各ニューロン２５は、ＬＳＴＭの処理を行い、処理結果を出力層２３の各ニューロン２６へ出力する。出力層２３の各ニューロン２６は、中間層２２の各ニューロン２５から出力されたデータを加算し、加算結果を出力する。出力層２３の各ニューロン２６は、中間層２２の各ニューロン２５から出力されたデータに重みを乗算し、重みを乗算した結果を加算する。 The input layer 21 outputs the input data to the intermediate layer 22. In detail, each neuron 24 outputs the input data to each neuron 25 in the intermediate layer 22. Each neuron 24 multiplies the input value by a weight and then outputs it to each neuron 25 in the intermediate layer 22. Each neuron 25 in the intermediate layer 22 performs LSTM processing and outputs the processing result to each neuron 26 in the output layer 23. Each neuron 26 in the output layer 23 adds the data output from each neuron 25 in the intermediate layer 22 and outputs the addition result. Each neuron 26 in the output layer 23 multiplies the data output from each neuron 25 in the intermediate layer 22 by a weight and adds the results of multiplication by the weights.

ＲＮＮにおいて、入力値を要素とするベクトルをｘ_ｔ、中間層２２の出力値を要素とするベクトルをｈ_ｔとした場合に、ベクトルｈ_ｔは、次の式（１）で表すことができる。ｔは、制御周期を単位として離散化された時刻であって、何番目の制御周期であるかを示す整数である。ベクトルｘ_ｔの次元をＭ、ベクトルｈ_ｔの次元をＮとするとき、Ｗは、Ｎ×Ｍの次元の線形変換行列であり、Ｒは、Ｎ×Ｎの次元の線形変換行列であり、ｂは、バイアスベクトルである。ｇ（・）は、活性化関数を示す。 In an RNN, if the vector whose elements are the input values is x_t and the vector whose elements are the output values of the intermediate layer 22 is h_t, the vector h_t can be expressed by the following formula (1). Here, t is a time discretized in units of the control cycle and is an integer indicating the index of the control cycle. If the dimension of vector x_t is M and the dimension of vector h_t is N, W is an N×M linear transformation matrix, R is an N×N linear transformation matrix, and b is a bias vector. g(·) denotes an activation function.
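Formula (1) itself is not reproduced in this text. With the symbols defined in the preceding paragraph, the standard simple recurrent update has the form below. It is given as a plausible reconstruction under that assumption, not as the patent's exact wording.

```latex
h_t = g\left(W x_t + R\, h_{t-1} + b\right)
```

The dimensions are consistent with the definitions: W x_t and R h_{t-1} are both N-dimensional, so b must also be N-dimensional.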

ｘ_ｔ ^１，ｘ_ｔ ^２，ｘ_ｔ ^３は、ベクトルｘ_ｔの要素である入力値とする。ｈ_ｔ ^１，ｈ_ｔ ^２，ｈ_ｔ ^３，ｈ_ｔ ^４，ｈ_ｔ ^５は、ベクトルｈ_ｔの要素である出力値とする。ｙ_ｔ ^１，ｙ_ｔ ^２，ｙ_ｔ ^３は、ベクトルｙ_ｔの要素である出力値とする。ベクトルｙ_ｔは、出力層２３の出力値を要素とするベクトルとする。 Let x_t^1, x_t^2, and x_t^3 be the input values that are the elements of vector x_t. Let h_t^1, h_t^2, h_t^3, h_t^4, and h_t^5 be the output values that are the elements of vector h_t. Let y_t^1, y_t^2, and y_t^3 be the output values that are the elements of vector y_t. Vector y_t is the vector whose elements are the output values of the output layer 23.

図４は、図３に示すニューラルネットワークを構成するＬＳＴＭユニットの例を示す図である。図４に示す構成例では、ＬＳＴＭユニットでは、以下の式（２）－（７）に示す計算が行われる。 Figure 4 is a diagram showing an example of an LSTM unit that constitutes the neural network shown in Figure 3. In the configuration example shown in Figure 4, the LSTM unit performs the calculations shown in the following equations (2)-(7).

Ｗ_ｆ，Ｗ_ｉ，Ｗ_ｚ，Ｗ_ｏは、Ｎ×Ｍの次元の線形変換行列であり、Ｒ_ｆ，Ｒ_ｉ，Ｒ_ｚ，Ｒ_ｏは、Ｎ×Ｎの次元の線形変換行列であり、ｂ_ｆ，ｂ_ｉ，ｂ_ｚ，ｂ_ｏは、バイアスベクトルである。これらは、学習部１３ａによる学習により決定される。σ（・）は、活性化関数として使用されるシグモイド関数である。tanh（・）は、活性化関数として使用されるハイパボリックタンジェント関数を示す。 W_f, W_i, W_z, and W_o are N×M linear transformation matrices, R_f, R_i, R_z, and R_o are N×N linear transformation matrices, and b_f, b_i, b_z, and b_o are bias vectors. These are determined through learning by the learning unit 13a. σ(·) is the sigmoid function used as an activation function. tanh(·) is the hyperbolic tangent function used as an activation function.
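The LSTM equations referred to as (2) to (7) are not reproduced in this text. A standard LSTM cell without peephole connections uses exactly the matrices, biases, and activation functions listed above and consists of six update equations, so the following is a likely form. The cell state c_t and the element-wise product ⊙ are not named in the surrounding text and are assumptions here, as is the order of the equations.

```latex
\begin{aligned}
z_t &= \tanh\left(W_z x_t + R_z h_{t-1} + b_z\right) \\
i_t &= \sigma\left(W_i x_t + R_i h_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_f x_t + R_f h_{t-1} + b_f\right) \\
o_t &= \sigma\left(W_o x_t + R_o h_{t-1} + b_o\right) \\
c_t &= i_t \odot z_t + f_t \odot c_{t-1} \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

Here z_t is the candidate cell input, and i_t, f_t, and o_t are the input, forget, and output gates.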

実施の形態１において、入力層２１には、位置θ、速度θ′、加速度θ′′および差分τ_ｒｅａｌ－τ_ｍｄｌの各データが入力される。ベクトルｘ_ｔは、位置θ、速度θ′、加速度θ′′および差分τ_ｒｅａｌ－τ_ｍｄｌの各値を要素とするベクトルである。６軸の垂直多関節ロボットの場合、ベクトルｘ_ｔの次元は２４である。入力されるデータは、ロボットの全ての関節のデータに限られず、一部の関節のデータであっても良い。入力されるデータは、ロボットの全ての関節のうち動作中の関節のデータのみなどとしても良い。学習部１３ａは、入力されるデータを一部の関節のデータのみとすることによって、ベクトルｘ_ｔの次元を減少させても良い。 In the first embodiment, the input layer 21 receives data on the position θ, velocity θ', acceleration θ'', and difference τ_real - τ_mdl. The vector x_t is a vector whose elements are the values of the position θ, velocity θ', acceleration θ'', and difference τ_real - τ_mdl. In the case of a six-axis vertical articulated robot, the dimension of the vector x_t is 24. The input data is not limited to data on all joints of the robot, and may be data on only some of the joints. For example, the input data may be data on only those joints that are in operation. The learning unit 13a may reduce the dimension of the vector x_t by inputting data on only some of the joints.
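As an illustration of how the 24-dimensional input arises (four quantities for each of six joints), the following sketch concatenates the per-joint values into one vector and optionally keeps only a subset of joints. The function name `build_input_vector` and the ordering of the elements are assumptions; the text does not specify an ordering.

```python
def build_input_vector(theta, theta_dot, theta_ddot, residual, joints=None):
    """Concatenate per-joint quantities at time t into the input x_t.

    Each of the first four arguments is a list with one value per joint.
    joints: optional list of joint indices to keep, e.g. only the joints
            that are currently moving (hypothetical selection rule).
    """
    n = len(theta)
    if not (len(theta_dot) == len(theta_ddot) == len(residual) == n):
        raise ValueError("all inputs must have one entry per joint")
    joints = list(range(n)) if joints is None else list(joints)
    x_t = []
    # Assumed ordering: all positions, then velocities, accelerations,
    # and finally the torque residuals.
    for signal in (theta, theta_dot, theta_ddot, residual):
        x_t.extend(signal[j] for j in joints)
    return x_t
```

With six joints the result has 24 elements; selecting two joints reduces it to 8.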

実施の形態１において、出力層２３は、位置θ、速度θ′、加速度θ′′および差分τ_ｒｅａｌ－τ_ｍｄｌのデータから推定された手先負荷パラメータの値を出力する。ベクトルｙ_ｔは、時刻ｔにおけるデータから推定された手先負荷パラメータの各値を要素とするベクトルである。手先負荷パラメータは、負荷の質量、負荷の重心を示すＸＹＺ座標、慣性テンソルの対角成分Ｉ_ｘｘ，Ｉ_ｙｙ，Ｉ_ｚｚ、および慣性テンソルの非対角成分Ｉ_ｘｙ，Ｉ_ｙｚ，Ｉ_ｚｘといったパラメータである。これらのパラメータの全てを推定する場合、ベクトルｙ_ｔの次元は１０である。学習部１３ａは、出力されるデータをこれらのパラメータのうちの一部のパラメータのデータのみとすることによって、ベクトルｙ_ｔの次元を減少させても良い。 In the first embodiment, the output layer 23 outputs values of the hand load parameters estimated from the data on the position θ, velocity θ', acceleration θ'', and difference τ_real - τ_mdl. The vector y_t is a vector whose elements are the values of the hand load parameters estimated from the data at time t. The hand load parameters are parameters such as the mass of the load, the XYZ coordinates indicating the center of gravity of the load, the diagonal components I_xx, I_yy, and I_zz of the inertia tensor, and the off-diagonal components I_xy, I_yz, and I_zx of the inertia tensor. When all of these parameters are estimated, the dimension of the vector y_t is 10. The learning unit 13a may reduce the dimension of the vector y_t by outputting data on only some of these parameters.
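The ten output values (one mass, three center-of-gravity coordinates, three diagonal and three off-diagonal inertia components) can be given names for downstream use. The sketch below assumes a particular element order in y_t, which the text does not specify. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HandLoadParameters:
    mass: float
    center_of_gravity: tuple      # (x, y, z)
    inertia_diagonal: tuple       # (I_xx, I_yy, I_zz)
    inertia_off_diagonal: tuple   # (I_xy, I_yz, I_zx)

    def inertia_tensor(self):
        """Return the symmetric 3x3 inertia tensor as nested lists."""
        ixx, iyy, izz = self.inertia_diagonal
        ixy, iyz, izx = self.inertia_off_diagonal
        return [[ixx, ixy, izx],
                [ixy, iyy, iyz],
                [izx, iyz, izz]]


def unpack_output(y_t):
    """Split a 10-element output vector into named hand load parameters.

    Assumed order: mass, center of gravity, diagonal, off-diagonal.
    """
    if len(y_t) != 10:
        raise ValueError("expected 10 values")
    return HandLoadParameters(
        mass=y_t[0],
        center_of_gravity=tuple(y_t[1:4]),
        inertia_diagonal=tuple(y_t[4:7]),
        inertia_off_diagonal=tuple(y_t[7:10]),
    )
```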

学習部１３ａは、中間層２２にＬＳＴＭユニットを適用することによって、時系列データの過去の値に依存する特徴量を抽出可能とするニューラルネットワークを構築する。また、学習部１３ａは、学習用データの入力に対する出力が手先負荷パラメータの正解値に一致するように、各ニューロン２４，２５，２６の重み係数を調整する。これにより、学習部１３ａは、モデル化誤差とモデル外特性との影響を吸収可能なＲＮＮの学習済モデルを生成することができる。学習結果記憶部１４ａは、学習結果である学習済モデルを記憶する。 The learning unit 13a applies an LSTM unit to the intermediate layer 22 to construct a neural network capable of extracting features that depend on past values of time-series data. The learning unit 13a also adjusts the weighting coefficients of each neuron 24, 25, 26 so that the output for the input of learning data matches the correct value of the hand load parameter. This allows the learning unit 13a to generate a trained model of an RNN that can absorb the effects of modeling errors and characteristics outside the model. The learning result storage unit 14a stores the trained model, which is the result of learning.

推論部１５ａは、学習結果記憶部１４ａからＲＮＮの学習済モデルを読み出す。推論部１５ａでは、推論用データである位置θ、速度θ′、加速度θ′′および差分τ_ｒｅａｌ－τ_ｍｄｌの各データが学習済モデルへ入力される。推論部１５ａは、入力に対する学習済モデルの出力を、手先負荷パラメータの推定結果として出力する。 The inference unit 15a reads out the trained model of the RNN from the learning result storage unit 14a. In the inference unit 15a, each piece of data for inference, namely, the position θ, the velocity θ', the acceleration θ'' and the difference τ _real -τ _mdl, is input to the trained model. The inference unit 15a outputs the output of the trained model in response to the input as an estimation result of the hand load parameter.
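To make the inference step concrete, the following is a minimal sketch of a forward pass through one LSTM layer followed by a linear output layer, written with the Python standard library only. It assumes a standard LSTM cell without peephole connections and no output bias. The parameter dictionary layout, the helper names, and the random placeholder weights are all illustrative; in the device described here the weights would come from the trained model held in the learning result storage unit 14a.

```python
import math
import random


def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def sigmoid(x):
    # Overflow-safe form of 1 / (1 + exp(-x)).
    return 0.5 * (1.0 + math.tanh(0.5 * x))


def lstm_step(p, x, h, c):
    """One LSTM update: returns the new hidden state and cell state."""
    def gate(name, act):
        pre = [wx + rh + b for wx, rh, b in
               zip(matvec(p["W" + name], x), matvec(p["R" + name], h),
                   p["b" + name])]
        return [act(v) for v in pre]

    z = gate("z", math.tanh)   # candidate cell input
    i = gate("i", sigmoid)     # input gate
    f = gate("f", sigmoid)     # forget gate
    o = gate("o", sigmoid)     # output gate
    c_new = [ik * zk + fk * ck for ik, zk, fk, ck in zip(i, z, f, c)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def infer(p, W_out, xs):
    """Feed the inputs x_t in time order and return y_t for every step."""
    n_hidden = len(p["bz"])
    h, c = [0.0] * n_hidden, [0.0] * n_hidden
    ys = []
    for x in xs:
        h, c = lstm_step(p, x, h, c)
        ys.append(matvec(W_out, h))  # output layer: weighted sum of h_t
    return ys


def random_params(m, n, rng):
    """Placeholder weights for m inputs and n LSTM units (not trained)."""
    p = {}
    for g in "zifo":
        p["W" + g] = [[rng.uniform(-0.1, 0.1) for _ in range(m)]
                      for _ in range(n)]
        p["R" + g] = [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                      for _ in range(n)]
        p["b" + g] = [0.0] * n
    return p
```

Because the hidden and cell states are carried from one step to the next, the output for a given input depends on the earlier inputs in the sequence, which is the property the LSTM layer is used for here.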

実施の形態１によると、手先負荷推定装置１は、逆動力学モデルのモデル化誤差の影響とモデル外特性の影響とが除かれた手先負荷パラメータを学習し、学習結果を基に手先負荷パラメータを推論する。手先負荷推定装置１は、モデル化誤差の影響とモデル外特性の影響とによる誤差が低減された手先負荷パラメータを推定可能であることによって、ロボットの動作開始直後における手先負荷パラメータの高精度な推定が可能となる。これにより、手先負荷推定装置１は、手先負荷パラメータの高精度な推定結果を得ることができるという効果を奏する。 According to the first embodiment, the hand load estimation device 1 learns hand load parameters from which the influence of modeling errors in the inverse dynamics model and the influence of characteristics outside the model have been removed, and infers the hand load parameters based on the learning results. The hand load estimation device 1 is capable of estimating hand load parameters from which errors due to the influence of modeling errors and the influence of characteristics outside the model have been reduced, thereby enabling highly accurate estimation of the hand load parameters immediately after the robot starts to operate. This provides the effect that the hand load estimation device 1 can obtain highly accurate estimation results of the hand load parameters.

位置データは、各関節の位置のフィードバック値としたが、これに限定されない。位置データは、各関節を制御するための指令値であっても良い。手先負荷推定装置１は、ロボット制御装置から指令値を取得する。 The position data is a feedback value of the position of each joint, but is not limited to this. The position data may be a command value for controlling each joint. The hand load estimation device 1 obtains the command value from the robot control device.

手先負荷推定装置１は、電流データにトルク定数を乗じた値を実トルクのτ_ｒｅａｌの値として用いることとしたが、これに限定されない。手先負荷推定装置１は、ロボットに備えられたトルクセンサによって検出されたトルクの値を、実トルクτ_ｒｅａｌの値として用いても良い。 In the above description, the hand load estimation device 1 uses the value obtained by multiplying the current data by the torque constant as the value of the actual torque τ_real, but this is not a limitation. The hand load estimation device 1 may instead use the torque value detected by a torque sensor provided in the robot as the value of the actual torque τ_real.

図５は、実施の形態１にかかる手先負荷推定装置１がトルクセンサ２７によって検出されたトルクの値を実トルクの値として用いる場合における手先負荷推定装置１の構成を示す図である。トルクセンサ２７は、ロボットの手先に実際に加わっているトルクを検出し、検出値を出力する。図５に示す手先負荷推定装置１では、図１に示す手先負荷推定装置１の電流データの入力とトルク定数の乗算とに代えて、トルクセンサ２７からの検出値が入力される。図５における「トルク出力」は、トルクセンサ２７から出力される検出値を表す。 Figure 5 is a diagram showing the configuration of the hand load estimation device 1 in the case where the hand load estimation device 1 according to the first embodiment uses the torque value detected by the torque sensor 27 as the actual torque value. The torque sensor 27 detects the torque actually applied to the robot's hand and outputs the detected value. In the hand load estimation device 1 shown in Figure 5, instead of inputting the current data of the hand load estimation device 1 shown in Figure 1 and multiplying it by the torque constant, the detected value from the torque sensor 27 is input. "Torque output" in Figure 5 represents the detected value output from the torque sensor 27.

学習部１３による学習の際に、入力部１１は、学習時の入力データとして、トルクの検出値を取得する。推論部１５による推論の際に、入力部１１は、推論時の入力データとして、トルクの検出値を取得する。手先負荷推定装置１は、トルクの検出値である実トルクτ_ｒｅａｌと、逆動力学モデル部１２から出力された公称トルクτ_ｍｄｌとの差分τ_ｒｅａｌ－τ_ｍｄｌを計算する。この場合も、手先負荷推定装置１は、手先負荷パラメータの高精度な推定結果を得ることができる。図２に示す手先負荷推定装置１においても、トルクの検出値を、実トルクτ_ｒｅａｌの値として用いても良い。 During learning by the learning unit 13, the input unit 11 acquires a detected torque value as input data during learning. During inference by the inference unit 15, the input unit 11 acquires a detected torque value as input data during inference. The hand load estimation device 1 calculates the difference τ_real - τ_mdl between the actual torque τ_real, which is the detected torque value, and the nominal torque τ_mdl output from the inverse dynamics model unit 12. In this case as well, the hand load estimation device 1 can obtain a highly accurate estimation result of the hand load parameters. In the hand load estimation device 1 shown in FIG. 2 as well, the detected torque value may be used as the value of the actual torque τ_real.
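The two ways of obtaining the actual torque described above, and the difference passed on to the learning and inference units, can be sketched as follows. This is a minimal illustration, not the implementation of the disclosure: the function names, the three-joint layout, and all numeric values are assumptions made for the example.

```python
import numpy as np

def actual_torque_from_current(current, torque_constant):
    """Actual torque per joint, taken as current times torque constant."""
    return np.asarray(current, dtype=float) * np.asarray(torque_constant, dtype=float)

def torque_residual(tau_real, tau_mdl):
    """Difference tau_real - tau_mdl used as learning and inference input."""
    return np.asarray(tau_real, dtype=float) - np.asarray(tau_mdl, dtype=float)

# Hypothetical values for a 3-joint robot (not taken from the source).
current = [2.0, 1.5, 0.5]            # current feedback [A]
torque_constant = [0.8, 0.8, 0.4]    # torque constant [N*m/A]
tau_mdl = [1.5, 1.0, 0.25]           # nominal torque from the inverse dynamics model

tau_real = actual_torque_from_current(current, torque_constant)
# With a torque sensor, its detected values would replace tau_real here.
residual = torque_residual(tau_real, tau_mdl)
```

Only the source of `tau_real` changes between the two configurations; the residual computation stays the same.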

学習部１３ａが用いるニューラルネットワークはＲＮＮとしたが、これに限定されない。学習部１３ａは、フィードフォワード型、すなわち順伝搬型のニューラルネットワークを用いても良い。すなわち、学習部１３ａは、時刻ｔの中間層２２の出力を再帰的に次の時刻の中間層２２への入力とするニューラルネットワークではなく、各関節の位置θ、速度θ′加速度θ′′および差分τ_ｒｅａｌ－τ_ｍｄｌの時系列データの入力のみによって出力が決まるニューラルネットワークを用いて、手先負荷パラメータの教師あり学習を行っても良い。この場合、学習部１３ａは、時刻ｔのデータを取得するごとにニューラルネットワークに逐次データを入力するのではなく、あらかじめ定められたバッチサイズのデータが蓄積されてからニューラルネットワークにデータを入力する。 Although the neural network used by the learning unit 13a is an RNN, this is not limiting. The learning unit 13a may use a feedforward type, i.e., forward propagation type, neural network. That is, the learning unit 13a may perform supervised learning of the hand load parameters using a neural network in which the output is determined only by the input of time series data of the position θ, velocity θ′, acceleration θ″, and difference τ _real -τ _mdl of each joint, instead of a neural network in which the output of the intermediate layer 22 at time t is recursively input to the intermediate layer 22 at the next time. In this case, the learning unit 13a inputs data to the neural network after a predetermined batch size of data is accumulated, rather than sequentially inputting data to the neural network every time data is acquired at time t.

実施の形態１では、学習部１３，１３ａは、モデル化誤差の影響とモデル外特性の影響とが除かれた手先負荷パラメータを学習する。すなわち、学習部１３，１３ａは、手先負荷パラメータの学習と、手先負荷パラメータに対応するトルクの誤差の学習とを、１つのニューラルネットワークによってまとめて行う。学習部１３，１３ａは、手先負荷パラメータを学習するニューラルネットワークと、トルクの誤差を学習するニューラルネットワークとを分けても良い。この場合、手先負荷パラメータを学習するニューラルネットワークは、位置θ、速度θ′、加速度θ′′および差分τ_ｒｅａｌ－τ_ｍｄｌが入力されることによって出力される手先負荷パラメータが正解ラベルに一致するように構築される。トルクの誤差を学習するニューラルネットワークは、位置θ、速度θ′、加速度θ′′および手先負荷パラメータが入力されることによって出力されるトルク誤差が正解ラベルである差分τ_ｒｅａｌ－τ_ｍｄｌに一致するように構築される。手先負荷推定装置１は、手先負荷パラメータの推論結果を、トルク誤差の推論結果を基に補正する。この場合も、手先負荷推定装置１は、モデル化誤差の影響とモデル外特性の影響とが除かれた手先負荷パラメータを推定可能とし、手先負荷パラメータの高精度な推定結果を得ることができる。 In the first embodiment, the learning unit 13, 13a learns the hand load parameter from which the influence of the modeling error and the influence of the out-of-model characteristic are removed. That is, the learning unit 13, 13a learns the hand load parameter and the torque error corresponding to the hand load parameter together by using one neural network. The learning unit 13, 13a may separate the neural network for learning the hand load parameter from the neural network for learning the torque error. In this case, the neural network for learning the hand load parameter is constructed so that the hand load parameter output by inputting the position θ, the velocity θ', the acceleration θ'' and the difference τ _real -τ _mdl coincides with the correct label. The neural network for learning the torque error is constructed so that the torque error output by inputting the position θ, the velocity θ', the acceleration θ'' and the hand load parameter coincides with the correct label, the difference τ _real -τ _mdl . The hand load estimation device 1 corrects the inference result of the hand load parameter based on the inference result of the torque error. 
In this case, too, the hand load estimation device 1 can estimate the hand load parameter from which the influence of the modeling error and the influence of the characteristics outside the model are removed, and can obtain a highly accurate estimation result of the hand load parameter.

実施の形態２．
実施の形態２では、ＧＰＲ(Gaussian Process Regression)を用いて、手先に負荷が加えられていないときのデータを用いて学習を行う場合について説明する。以下、手先に負荷が加えられていないことを、無負荷と称する。図６は、実施の形態２にかかる手先負荷推定装置２の構成を示す図である。実施の形態２では、実施の形態１と同一の構成要素には同一の符号を付し、実施の形態１とは異なる構成について主に説明する。 Embodiment 2.
In the second embodiment, a case will be described in which GPR (Gaussian Process Regression) is used and learning is performed on data acquired when no load is applied to the hand. Hereinafter, the state in which no load is applied to the hand is referred to as "no load." Fig. 6 is a diagram showing the configuration of a hand load estimation device 2 according to the second embodiment. In the second embodiment, the same components as those in the first embodiment are given the same reference numerals, and the configuration different from the first embodiment will be mainly described.

手先負荷推定装置２は、入力部１１、逆動力学モデル部１２，１２ｂ、学習部１３ｂ、学習結果記憶部１４ｂおよび推論部１５ｂを有する。学習部１３ｂによる学習の際に、逆動力学モデル部１２は、公称パラメータが設定された逆動力学モデルを基に、無負荷時の位置θ、速度θ′および加速度θ′′から公称トルクτ_ｍｄｌを計算する。学習部１３ｂには、学習用データとして、無負荷時の位置θ、速度θ′および加速度θ′′と、無負荷時の差分τ_ｒｅａｌ－τ_ｍｄｌとが入力される。学習部１３ｂは、無負荷でのモデル化誤差およびモデル外特性の影響が除かれた手先負荷パラメータを、学習用データを用いて学習する。学習部１３ｂは、学習結果を学習結果記憶部１４ｂに保存する。 The hand load estimation device 2 has an input unit 11, inverse dynamics model units 12, 12b, a learning unit 13b, a learning result storage unit 14b, and an inference unit 15b. When the learning unit 13b performs learning, the inverse dynamics model unit 12 calculates a nominal torque τ_mdl from the position θ, velocity θ', and acceleration θ'' at no load, based on an inverse dynamics model in which nominal parameters are set. The position θ, velocity θ', and acceleration θ'' at no load, and the difference τ_real - τ_mdl at no load, are input to the learning unit 13b as learning data. The learning unit 13b learns hand load parameters from which modeling errors at no load and the influence of characteristics outside the model have been removed, using the learning data. The learning unit 13b stores the learning results in the learning result storage unit 14b.

推論部１５ｂによる推論の際に、逆動力学モデル部１２ｂは、補正された逆動力学モデルを基に、推論用データである位置θ、速度θ′および加速度θ′′から、無負荷と仮定した場合における推定トルクτ_ｅｓｔを計算する。推論部１５ｂには、推論用データとして、各関節の位置θ、速度θ′および加速度θ′′と、差分τ_ｒｅａｌ－τ_ｅｓｔとが入力される。また、推論部１５ｂは、学習部１３ｂによる学習結果を学習結果記憶部１４ｂから読み出す。推論部１５ｂは、学習部１３ｂによる学習結果を基に、手先負荷パラメータを推定する。 During inference by the inference unit 15b, the inverse dynamics model unit 12b calculates an estimated torque τ_est, assuming no load, from the position θ, velocity θ', and acceleration θ'', which are inference data, based on the corrected inverse dynamics model. The position θ, velocity θ', and acceleration θ'' of each joint and the difference τ_real - τ_est are input to the inference unit 15b as inference data. The inference unit 15b also reads out the learning results of the learning unit 13b from the learning result storage unit 14b. The inference unit 15b estimates hand load parameters based on the learning results of the learning unit 13b.

図７は、実施の形態２における学習に使用されるＧＰＲについて説明するための図である。ＧＰＲは、非線形な関数ｙ＝ｆ（ｘ）の推定に用いられる手法である。ＧＰＲでは、観測値であるｙが超多次元のガウス分布Ｎ（ｕ，Ｓ）に従うと仮定して、ベイズ推定における尤度ｐ（ｆ｜ｚ）が最大となるようにカーネル関数Ｓの内部パラメータｚを調整する。ガウス分布Ｎ（ｕ，Ｓ）は、ある平均値に相当する平均関数ｕ（ｘ）と、ある分散値に相当するカーネル関数Ｓ（ｘ，ｘ’）とによって決まる。推定された関数ｆ～Ｎ（ｕ，Ｓ）は、関数の分布として出力される。カーネル関数Ｓが大きい領域では、入力ｘに対する出力ｙの不確実性が大きくなる。次の式（８）は、推定された関数ｆ～Ｎ（ｕ，Ｓ）を表したものである。 Figure 7 is a diagram for explaining GPR used for learning in the second embodiment. GPR is a method used to estimate a nonlinear function y = f(x). In GPR, it is assumed that the observed value y follows a multidimensional Gaussian distribution N(u, S), and the internal parameter z of the kernel function S is adjusted so that the likelihood p(f|z) in Bayesian estimation is maximized. The Gaussian distribution N(u, S) is determined by a mean function u(x) corresponding to a certain average value and a kernel function S(x, x') corresponding to a certain variance value. The estimated function f~N(u, S) is output as a distribution of functions. In a region where the kernel function S is large, the uncertainty of the output y for the input x becomes large. The following formula (8) represents the estimated function f~N(u, S).

ＧＰＲを備える学習部１３ｂは、学習用データである無負荷時の位置θ、速度θ′および加速度θ′′を入力とし、差分τ_ｒｅａｌ－τ_ｍｄｌを出力とする関数の分布を学習する。学習部１３ｂは、得られた関数の分布を学習結果記憶部１４ｂに保存する。学習用データは、位置θ、速度θ′および加速度θ′′のうちの一部のデータのみであっても良い。学習部１３ｂには、位置θ、速度θ′および加速度θ′′のうち例えば速度θ′のデータのみが学習用データとして入力されても良い。このように、学習部１３ｂは、入力されるデータを位置θ、速度θ′および加速度θ′′のうちの一部のデータのみに限定することによって、入力されるデータの次元を減少させても良い。 The learning unit 13b equipped with a GPR receives as input the position θ, velocity θ', and acceleration θ" at no load, which are learning data, and learns the distribution of a function that outputs the difference τ _real -τ _mdl . The learning unit 13b stores the obtained distribution of the function in the learning result storage unit 14b. The learning data may be only a portion of the data among the position θ, velocity θ', and acceleration θ". Of the position θ, velocity θ', and acceleration θ", for example, only the data of velocity θ' may be input to the learning unit 13b as learning data. In this way, the learning unit 13b may reduce the dimension of the input data by limiting the input data to only a portion of the data among the position θ, velocity θ', and acceleration θ".
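As an illustration of learning a distribution over functions from no-load data, the following is a minimal NumPy sketch of zero-mean Gaussian process regression with a squared-exponential kernel. The kernel choice, the hyperparameter values, the friction-like synthetic residual, and the use of velocity as the only input are all assumptions for this example; the source does not specify them, and here the kernel parameters are fixed rather than tuned by maximizing the likelihood.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel S(x, x') between the rows of a and b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return signal_var * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gpr_fit(x_train, y_train, noise_var=1e-2, **kern):
    """Precompute the Cholesky factor and weights of a zero-mean GP."""
    k = rbf_kernel(x_train, x_train, **kern) + noise_var * np.eye(len(x_train))
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    return {"x": x_train, "chol": chol, "alpha": alpha, "kern": kern}

def gpr_predict(model, x_new):
    """Posterior mean and variance of the torque residual at new inputs."""
    k_star = rbf_kernel(x_new, model["x"], **model["kern"])
    mean = k_star @ model["alpha"]
    v = np.linalg.solve(model["chol"], k_star.T)
    prior_var = np.diag(rbf_kernel(x_new, x_new, **model["kern"]))
    return mean, np.maximum(prior_var - np.sum(v**2, axis=0), 0.0)

# Synthetic no-load data (hypothetical): input is joint velocity only,
# output is a friction-like residual tau_real - tau_mdl.
rng = np.random.default_rng(0)
velocity = np.linspace(-2.0, 2.0, 40)[:, None]
residual = (0.3 * np.tanh(3.0 * velocity[:, 0]) + 0.05 * velocity[:, 0]
            + 0.01 * rng.standard_normal(40))

model = gpr_fit(velocity, residual, noise_var=1e-3, length_scale=0.5, signal_var=0.1)
mean_in, var_in = gpr_predict(model, np.array([[0.5]]))    # inside the training range
mean_out, var_out = gpr_predict(model, np.array([[6.0]]))  # far outside the training range
```

The predicted variance grows away from the training data, which matches the statement that the uncertainty of the output is large where the kernel-derived variance is large.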

推論部１５ｂは、ベイズフィルタの一種であるヒストグラムフィルタを用いて手先負荷パラメータの推定結果を得ることとしても良い。ヒストグラムフィルタでは、推定対象であるパラメータの存在範囲を特定の刻み幅で区切ることによって各刻み幅に候補点を設定し、時系列データを基に各候補点の存在確率が更新される。 The inference unit 15b may obtain the hand load parameter estimation result using a histogram filter, which is a type of Bayes filter. In the histogram filter, the existence range of the parameter to be estimated is divided into specific intervals, and candidate points are set for each interval, and the existence probability of each candidate point is updated based on the time series data.

図８は、実施の形態２における推論に使用されるヒストグラムフィルタについて説明するための図である。逆動力学モデル部１２ｂは、ＧＰＲの学習結果である関数の分布を学習結果記憶部１４ｂから読み出し、関数の分布を基に逆動力学モデルを補正する。逆動力学モデル部１２ｂは、ＧＰＲの学習結果により補正された逆動力学モデルへ推論用データである位置θ、速度θ′および加速度θ′′が入力されることによって、無負荷と仮定した場合における推定トルクτ_ｅｓｔを出力する。 Figure 8 is a diagram for explaining a histogram filter used for inference in the second embodiment. The inverse dynamics model unit 12b reads out the distribution of the function, which is the learning result of the GPR, from the learning result storage unit 14b, and corrects the inverse dynamics model based on the distribution of the function. When the position θ, velocity θ', and acceleration θ'', which are inference data, are input to the inverse dynamics model corrected by the GPR learning result, the inverse dynamics model unit 12b outputs the estimated torque τ_est for the case where no load is assumed.

手先に負荷を加えてロボットを動作させたときにおける推論用データの実トルクτ_ｒｅａｌと推定トルクτ_ｅｓｔとの差分τ_ｒｅａｌ－τ_ｅｓｔは、負荷によって発生したオフセット分のトルクと考えることができる。以下、オフセット分のトルクを、オフセットトルクΔτと称する。推論部１５ｂは、推定するパラメータの候補点を用意しておき、位置θ、速度θ′および加速度θ′′に対して実際のオフセットトルクΔτのデータが得られる確率が最も大きい候補点を、手先負荷パラメータの推定結果として出力する。 The difference τ _real - τ _est between the real torque τ _real and the estimated torque τ _est of the inference data when the robot is operated with a load applied to the hand can be considered as the offset torque generated by the load. Hereinafter, the offset torque will be referred to as the offset torque Δτ. The inference unit 15b prepares candidate points for the parameters to be estimated, and outputs the candidate point with the highest probability of obtaining the actual offset torque Δτ data for the position θ, velocity θ', and acceleration θ'' as the hand load parameter estimation result.

図８では、推定するパラメータは１種類、例えば手先の質量のみとする。図８では、取り得る質量の値の範囲をｊ等分した候補点ｍ_１，ｍ_２，・・・，ｍ_ｊのうちの１つである候補点ｍ_ｉにおいて、入力されたオフセットトルクΔτのデータが得られる確率が最も大きくなった例を示す。ｊは任意の整数、ｉは１≦ｉ≦ｊを満足する整数である。推定するパラメータがＫ種類である場合、各パラメータの採りうる範囲についてｊ個に分割したｊ^Ｋ通りの候補点を用意しておくことができる。Ｋは２以上の整数とする。 In Fig. 8, only one type of parameter is estimated, for example, the mass of the hand. Fig. 8 shows an example in which the probability of obtaining the input offset torque Δτ data is highest at candidate point m_i, one of the candidate points m_1, m_2, ..., m_j obtained by dividing the range of possible mass values into j equal parts. j is an arbitrary integer, and i is an integer satisfying 1≦i≦j. When there are K types of parameters to be estimated, j^K candidate points can be prepared by dividing the possible range of each parameter into j parts. K is an integer of 2 or more.

候補点の刻み幅は、推定するパラメータの種類ごとに異なっても良い。推論部１５ｂは、時刻ｔの位置θ、速度θ′および加速度θ′′に対するオフセットトルクΔτのデータが入力されることによって各候補点の存在確率を更新する際に、推定する各パラメータの採りうる範囲、または各パラメータの刻み幅を変更しても良い。 The step size of the candidate points may differ depending on the type of parameter to be estimated. When updating the existence probability of each candidate point in response to input of the offset torque Δτ data for the position θ, velocity θ', and acceleration θ'' at time t, the inference unit 15b may change the possible range of each parameter to be estimated or the step size of each parameter.
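The candidate-point update can be sketched for the single-parameter case (mass only) as follows. This is a hedged illustration: the one-joint gravity model in `predicted_offset`, the lever arm, the Gaussian likelihood with standard deviation `sigma`, and the synthetic data are assumptions introduced for the example and do not come from the source.

```python
import numpy as np

G = 9.81  # gravitational acceleration [m/s^2]

def predicted_offset(mass, theta, arm=0.5):
    """Hypothetical 1-joint load model: gravity torque of a point mass at a known lever arm."""
    return mass * G * arm * np.cos(theta)

def histogram_update(prob, candidates, theta, delta_tau, sigma=0.5):
    """One Bayes update of the candidate probabilities from one offset-torque sample."""
    err = delta_tau - predicted_offset(candidates, theta)
    likelihood = np.exp(-0.5 * (err / sigma) ** 2)
    posterior = prob * likelihood
    total = posterior.sum()
    if total <= 0.0:  # sample incompatible with every candidate: keep the prior
        return prob
    return posterior / total

candidates = np.linspace(0.0, 5.0, 51)                  # m_1 ... m_j with j = 51
prob = np.full(candidates.size, 1.0 / candidates.size)  # uniform initial probabilities

rng = np.random.default_rng(1)
true_mass = 2.0
for theta in np.linspace(-1.0, 1.0, 30):
    delta_tau = predicted_offset(true_mass, theta) + 0.2 * rng.standard_normal()
    prob = histogram_update(prob, candidates, theta, delta_tau)

estimate = candidates[np.argmax(prob)]  # candidate with the highest existence probability
```

For K parameters the same update would run over a grid of j^K candidates, so the cost grows quickly with K; that is one reason a variable range or step size can be useful.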

ヒストグラムフィルタの候補点ｍ_ｉの存在確率をｐ_ｉとして、ヒストグラムフィルタ全体の情報量の期待値は、次の式（９）により表される。 If the existence probability of a candidate point m _i of the histogram filter is p _i , the expected value of the amount of information of the entire histogram filter is expressed by the following formula (9).
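Formula (9) is not reproduced in this text. Given that H is described as the expected amount of information over the candidate points and is called entropy, it is presumably the standard Shannon entropy; the following is a reconstruction under that assumption, with the summation index and the logarithm base chosen here for illustration.

```latex
H = -\sum_{i=1}^{j} p_i \log p_i
```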

式（９）に示すＨは、エントロピーと呼ばれる。エントロピーＨは、正の値を持つ。エントロピーＨがゼロに近いほど、ある候補点にヒストグラムフィルタが収束していることを表す。推論用データの外れ値による悪影響を避けるため、推論部１５ｂは、時刻ｔのデータが入力されるたびにエントロピーＨを計算し、エントロピーＨが前回求めたエントロピーＨの値よりも減少している場合のみデータを有効なデータとみなして、各候補点の存在確率を更新しても良い。 H in equation (9) is called entropy. Entropy H has a positive value. The closer entropy H is to zero, the more the histogram filter has converged to a candidate point. To avoid adverse effects due to outliers in the inference data, the inference unit 15b may calculate entropy H each time data at time t is input, and may consider the data to be valid only if entropy H is lower than the previously calculated value of entropy H, and update the existence probability of each candidate point.
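The entropy-gated update described above can be sketched as follows, assuming H is the Shannon entropy of the candidate probabilities. The function names and the example likelihood vectors are hypothetical.

```python
import numpy as np

def entropy(prob):
    """Shannon entropy H = -sum p_i log p_i, treating 0 * log 0 as 0."""
    p = np.asarray(prob, dtype=float)
    nz = p > 0.0
    return float(-np.sum(p[nz] * np.log(p[nz])))

def gated_update(prob, likelihood):
    """Apply a Bayes update only if it lowers the entropy; otherwise keep prob."""
    posterior = prob * likelihood
    total = posterior.sum()
    if total <= 0.0:
        return prob, False
    posterior = posterior / total
    if entropy(posterior) < entropy(prob):
        return posterior, True
    return prob, False

prob = np.array([0.1, 0.6, 0.2, 0.1])
sharpening = np.array([0.1, 1.0, 0.1, 0.1])  # sample that agrees with the current peak
flattening = np.array([1.0, 0.1, 1.0, 1.0])  # outlier-like sample that contradicts the peak

prob_a, used_a = gated_update(prob, sharpening)  # accepted: entropy decreases
prob_b, used_b = gated_update(prob, flattening)  # rejected: entropy would increase
```

One consequence of this gate is that a genuine change of load would also be rejected at first, so a practical implementation might pair it with a reset condition; the source does not discuss this.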

実施の形態２によると、手先負荷推定装置２は、無負荷時における学習用データを基に手先負荷パラメータを学習し、実トルクと無負荷時における推定トルクとに基づいて手先負荷パラメータを推論する。これにより、手先負荷推定装置２は、手先負荷パラメータの高精度な推定結果を得ることができるという効果を奏する。 According to the second embodiment, the hand load estimation device 2 learns the hand load parameters based on the learning data at no load, and infers the hand load parameters based on the actual torque and the estimated torque at no load. This allows the hand load estimation device 2 to obtain highly accurate estimation results of the hand load parameters.

実施の形態２では、手先負荷推定装置２は、無負荷にてロボットを動作させたときのデータを用いて学習を行うこととしたが、これに限定されない。手先負荷推定装置２は、質量または重心位置といった手先負荷パラメータが既知である負荷を手先に取り付けてロボットを動作させたときのデータを用いて学習を行っても良い。すなわち、手先負荷推定装置２は、実トルクτ_ｒｅａｌと、無負荷時のパラメータを基に計算された公称トルクτ_ｍｄｌとの差分τ_ｒｅａｌ－τ_ｍｄｌをＧＰＲで学習する代わりに、実トルクτ_ｒｅａｌと、既知の手先負荷パラメータを基に計算された公称トルクτ_ｍｄｌとの差分τ_ｒｅａｌ－τ_ｍｄｌをＧＰＲで学習する。手先負荷推定装置２は、推論用データの実トルクτ_ｒｅａｌと、学習時と同じ負荷が手先に取り付けられていると仮定した場合の推定トルクτ_ｅｓｔとの差分τ_ｒｅａｌ－τ_ｅｓｔから、ヒストグラムフィルタにより手先負荷パラメータを推定することができる。 In the second embodiment, the hand load estimation device 2 performs learning using data obtained when the robot is operated without a load, but the present invention is not limited to this. The hand load estimation device 2 may perform learning using data obtained when the robot is operated with a load attached to the hand whose hand load parameters, such as the mass or the center of gravity position, are known. That is, instead of using GPR to learn the difference τ_real - τ_mdl between the actual torque τ_real and the nominal torque τ_mdl calculated based on the no-load parameters, the hand load estimation device 2 uses GPR to learn the difference τ_real - τ_mdl between the actual torque τ_real and the nominal torque τ_mdl calculated based on the known hand load parameters. The hand load estimation device 2 can then estimate the hand load parameters using a histogram filter from the difference τ_real - τ_est between the actual torque τ_real of the inference data and the estimated torque τ_est obtained when it is assumed that the same load as that used during learning is attached to the hand.

実施の形態２における学習には、非線形な未知関数を学習するために、ＧＰＲに代わる近似関数の学習方式を適用しても良い。学習には、例えば、ＮｅｕｒａｌＰｒｏｃｅｓｓｅｓまたはＧＰＮｅｔといった手法を適用しても良い。ＮｅｕｒａｌＰｒｏｃｅｓｓｅｓは、データから確率過程のモデルを生成するニューラルネットワークを学習する手法として知られる。ＧＰＮｅｔは、未知関数の事前分布がガウス過程に従うと仮定して、データによる予測分布をベイズニューラルネットワークで近似する手法として知られる。これらの手法においても、ＧＰＲと同様に、関数の平均値と分散値とを出力可能である。また、実施の形態２における推論には、オフセットトルクΔτのデータに合うように手先負荷パラメータの候補点を更新するために、組合せ最適化問題を解く手法として知られる粒子群最適化（Particle Swarm Optimization：ＰＳＯ）、遺伝的アルゴリズム（Genetic Algorithm：ＧＡ）、またはシミュレーテッドアニーリング（Simulated Annealing：ＳＡ）といった手法を適用しても良い。 In the learning in the second embodiment, a learning method of an approximation function instead of GPR may be applied to learn a nonlinear unknown function. For example, a method such as Neural Processes or GP Net may be applied to the learning. Neural Processes is known as a method of learning a neural network that generates a model of a stochastic process from data. GP Net is known as a method of approximating a predicted distribution by data using a Bayesian neural network, assuming that the prior distribution of an unknown function follows a Gaussian process. In these methods, the mean value and variance value of a function can be output, as in the case of GPR. In addition, in the inference in the second embodiment, a method such as Particle Swarm Optimization (PSO), Genetic Algorithm (GA), or Simulated Annealing (SA), which are known as methods for solving combinatorial optimization problems, may be applied to update the candidate points of the hand load parameters to match the data of the offset torque Δτ.

実施の形態２では、非線形な未知関数を学習するために、ＧＰＲによる学習とその他の学習とが組み合わせられても良い。例えば、ＧＰＲによる学習において入力されたデータをニューラルネットワークの入力として、ＧＰＲが出力する関数の平均値と分散値とを教師信号とする学習を実施する。この場合、手先負荷推定装置２は、学習したニューラルネットワークを用いて、推論時の平均値と分散値とを出力することができる。 In the second embodiment, in order to learn a nonlinear unknown function, learning by GPR may be combined with other learning. For example, data input in learning by GPR is used as input to a neural network, and learning is performed using the average value and variance value of the function output by GPR as teacher signals. In this case, the hand load estimation device 2 can output the average value and variance value at the time of inference using the trained neural network.

実施の形態３．
実施の形態３では、関節の温度データを学習に用いる場合について説明する。図９は、実施の形態３にかかる手先負荷推定装置３の構成を示す図である。実施の形態３では、実施の形態１または２と同一の構成要素には同一の符号を付し、実施の形態１または２とは異なる構成について主に説明する。 Embodiment 3.
In the third embodiment, a case where joint temperature data is used for learning will be described. Fig. 9 is a diagram showing the configuration of a hand load estimating device 3 according to the third embodiment. In the third embodiment, the same components as those in the first or second embodiment are given the same reference numerals, and the configuration different from the first or second embodiment will be mainly described.

手先負荷推定装置３は、入力部１１、逆動力学モデル部１２、学習部１３ｃ、学習結果記憶部１４ｃおよび推論部１５ｃを有する。入力部１１には、位置データである位置ＦＢ値と、電流データである電流ＦＢ値と、温度データである温度フィードバック（ＦＢ）値とが入力される。温度ＦＢ値は、ロボットの動作時における各関節の温度である。 The hand load estimation device 3 has an input unit 11, an inverse dynamics model unit 12, a learning unit 13c, a learning result storage unit 14c, and an inference unit 15c. A position FB value, which is position data, a current FB value, which is current data, and a temperature feedback (FB) value, which is temperature data, are input to the input unit 11. The temperature FB value is the temperature of each joint when the robot is operating.

学習部１３ｃには、学習用データとして、学習時の入力データに基づく各関節の位置θ、速度θ′および加速度θ′′と、学習時の入力データに基づく各関節の差分τ_ｒｅａｌ－τ_ｍｄｌと、各関節の温度データとが入力される。学習部１３ｃは、関節における摩擦特性の温度依存性による影響が除かれた手先負荷パラメータを、学習用データを用いて学習する。学習部１３ｃは、学習結果を学習結果記憶部１４ｃに保存する。 The learning unit 13c receives as learning data the position θ, velocity θ', and acceleration θ'' of each joint based on the input data during learning, the difference τ _real -τ _mdl of each joint based on the input data during learning, and temperature data of each joint. Using the learning data, the learning unit 13c learns hand load parameters from which the effects of the temperature dependency of friction characteristics in the joints have been removed. The learning unit 13c stores the learning results in the learning result storage unit 14c.

推論部１５ｃには、推論用データとして、推論時の入力データに基づく各関節の位置θ、速度θ′および加速度θ′′と、推論時の入力データに基づく各関節の差分τ_ｒｅａｌ－τ_ｍｄｌと、各関節の温度データとが入力される。また、推論部１５ｃは、学習部１３ｃによる学習結果を学習結果記憶部１４ｃから読み出す。推論部１５ｃは、学習部１３ｃによる学習結果を基に、手先負荷パラメータを推定する。 The inference unit 15c receives, as inference data, the position θ, velocity θ', and acceleration θ'' of each joint based on the input data at the time of inference, the difference τ _real -τ _mdl of each joint based on the input data at the time of inference, and temperature data of each joint. Furthermore, the inference unit 15c reads out the learning results by the learning unit 13c from the learning result storage unit 14c. The inference unit 15c estimates hand load parameters based on the learning results by the learning unit 13c.

摩擦特性の温度依存性は、モデル化誤差の要因、またはモデル外特性の１つに挙げられる。実施の形態３によると、手先負荷推定装置３は、摩擦特性の温度依存性による影響が除かれた手先負荷パラメータを学習し、学習結果を基に手先負荷パラメータを推論する。手先負荷推定装置３は、摩擦特性の温度依存性による影響が低減された手先負荷パラメータを推論可能であることによって、手先負荷パラメータの高精度な推定結果を得ることができる。なお、実施の形態１または２の手先負荷推定装置１，２に、実施の形態３と同様の学習および推論を適用しても良い。 The temperature dependency of frictional characteristics is one of the factors that cause modeling errors or characteristics outside the model. According to the third embodiment, the hand load estimation device 3 learns hand load parameters from which the influence of the temperature dependency of frictional characteristics has been removed, and infers the hand load parameters based on the learning results. The hand load estimation device 3 is capable of inferring hand load parameters from which the influence of the temperature dependency of frictional characteristics has been reduced, and is therefore able to obtain highly accurate estimation results of the hand load parameters. Note that the same learning and inference as in the third embodiment may be applied to the hand load estimation devices 1 and 2 of the first and second embodiments.

実施の形態４．
実施の形態４では、各関節の位置θ、速度θ′および加速度θ′′のデータである動きデータを変数に変換する場合について説明する。図１０は、実施の形態４にかかる手先負荷推定装置４の構成を示す図である。実施の形態４では、実施の形態１から３と同一の構成要素には同一の符号を付し、実施の形態１から３とは異なる構成について主に説明する。 Embodiment 4.
In the fourth embodiment, a case will be described in which motion data, which is data on the position θ, velocity θ', and acceleration θ'' of each joint, is converted into variables. Fig. 10 is a diagram showing the configuration of a hand end load estimating device 4 according to the fourth embodiment. In the fourth embodiment, the same components as those in the first to third embodiments are given the same reference numerals, and the configuration different from the first to third embodiments will be mainly described.

手先負荷推定装置４は、入力部１１、逆動力学モデル部１２、学習部１３ｄ、学習結果記憶部１４ｄ、推論部１５ｄおよび変数変換部１６を有する。変数変換部１６には、位置データから計算された位置θ、速度θ′および加速度θ′′が入力される。変数変換部１６は、位置θ、速度θ′および加速度θ′′を、手先負荷パラメータに関して線形化されたトルクの式を設定するための変数に変換する。変数変換部１６は、変換後の変数を出力する。 The hand load estimation device 4 has an input unit 11, an inverse dynamics model unit 12, a learning unit 13d, a learning result storage unit 14d, an inference unit 15d, and a variable conversion unit 16. The position θ, velocity θ', and acceleration θ'' calculated from the position data are input to the variable conversion unit 16. The variable conversion unit 16 converts the position θ, velocity θ', and acceleration θ'' into variables for setting a torque equation linearized with respect to the hand load parameters. The variable conversion unit 16 outputs the converted variables.

学習部１３ｄによる学習の際に、変数変換部１６は、学習時の入力データから得られた位置θ、速度θ′および加速度θ′′のデータを変数に変換し、学習部１３ｄへ変数を出力する。推論部１５ｄによる推論の際に、変数変換部１６は、推論時の入力データから得られた位置θ、速度θ′および加速度θ′′のデータを変数に変換し、推論部１５ｄへ変数を出力する。なお、図１０では、学習時のデータフローと推論時のデータフローとを分けて示す都合上、変数変換部１６を２つに分けて示す。 During learning by the learning unit 13d, the variable conversion unit 16 converts the data of position θ, velocity θ', and acceleration θ'' obtained from the input data during learning into variables, and outputs the variables to the learning unit 13d. During inference by the inference unit 15d, the variable conversion unit 16 converts the data of position θ, velocity θ', and acceleration θ'' obtained from the input data during inference into variables, and outputs the variables to the inference unit 15d. Note that in Figure 10, the variable conversion unit 16 is shown divided into two for the convenience of showing the data flow during learning and the data flow during inference separately.

学習部１３ｄには、学習用データとして、各関節の位置θ、速度θ′および加速度θ′′のデータから変換された変数と、各関節の差分τ_ｒｅａｌ－τ_ｍｄｌとが入力される。学習部１３ｄは、逆動力学モデルのモデル化誤差の影響とモデル外特性による影響とが除かれた手先負荷パラメータを、学習用データを用いて学習する。学習部１３ｄは、学習結果を学習結果記憶部１４ｄに保存する。 Variables converted from data on the position θ, velocity θ', and acceleration θ'' of each joint, and the difference τ _real -τ _mdl of each joint are input to the learning unit 13d as learning data. The learning unit 13d uses the learning data to learn hand load parameters from which the effects of modeling errors in the inverse dynamics model and the effects of out-of-model characteristics have been removed. The learning unit 13d stores the learning results in the learning result storage unit 14d.

推論部１５ｄには、推論用データとして、各関節の位置θ、速度θ′および加速度θ′′のデータから変換された変数と、各関節の差分τ_ｒｅａｌ－τ_ｍｄｌとが入力される。推論部１５ｄは、学習部１３ｄによる学習結果を学習結果記憶部１４ｄから読み出す。推論部１５ｄは、学習部１３ｄによる学習結果を基に、手先負荷パラメータを推定する。 The inference unit 15d receives, as inference data, variables converted from the data on the position θ, velocity θ', and acceleration θ'' of each joint, and the difference τ _real -τ _mdl of each joint. The inference unit 15d reads out the learning results of the learning unit 13d from the learning result storage unit 14d. The inference unit 15d estimates hand load parameters based on the learning results of the learning unit 13d.

ロボットの運動方程式は、位置θ、速度θ′および加速度θ′′に関して非線形な式である。ただし、負荷についての質量および重心等のパラメータを（質量）×（重心）といった形にして並べたパラメータベクトルをＭとすると、運動方程式は、次の式（１０）のように線形化が可能となる。Ｋ（θ，θ′，θ′′）は、リグレッサ行列と呼ばれる。Ｋ（θ，θ′，θ′′）は、各関節の位置θ、速度θ′および加速度θ′′から変換された変数である。 The equation of motion for a robot is a nonlinear equation with respect to position θ, velocity θ', and acceleration θ''. However, if a parameter vector M is an arrangement of parameters such as the mass and center of gravity of the load in the form (mass) x (center of gravity), the equation of motion can be linearized as shown in the following equation (10). K(θ, θ', θ'') is called the regressor matrix. K(θ, θ', θ'') is a variable converted from the position θ, velocity θ', and acceleration θ'' of each joint.
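Formula (10) is not reproduced in this text. From the description of the regressor matrix K(θ, θ', θ'') and the parameter vector M, the linearized form is presumably the following; the symbol τ for the joint torque vector is an assumption consistent with the rest of the description.

```latex
\tau = K(\theta, \theta', \theta'')\, M
```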

上記のＲＮＮおよびＧＰＲは、非線形回帰による処理を扱い得る。手先負荷推定装置４は、位置θ、速度θ′および加速度θ′′のデータを変数に変換することによって、非線形回帰により扱われる処理を線形領域での処理として扱うことが可能となる。これにより、手先負荷推定装置４は、簡易な処理による学習と推論とが可能となる。手先負荷推定装置４は、簡易な処理によって、手先負荷パラメータの高精度な推定結果を得ることができる。実施の形態１から３の手先負荷推定装置１，２，３に、実施の形態４と同様の変数変換を適用しても良い。なお、後述する実施の形態７では、手先負荷パラメータの逐次同定と確率分布との組み合わせによる推論にリグレッサ行列を適用する例について説明する。 The RNN and GPR described above can handle processing by nonlinear regression. By converting the data of the position θ, velocity θ', and acceleration θ'' into variables, the hand load estimation device 4 can handle processing handled by nonlinear regression as processing in a linear domain. This enables the hand load estimation device 4 to learn and infer using simple processing. The hand load estimation device 4 can obtain highly accurate estimation results of hand load parameters using simple processing. The same variable conversion as in embodiment 4 may be applied to the hand load estimation devices 1, 2, and 3 of embodiments 1 to 3. Note that in embodiment 7 described later, an example of applying a regressor matrix to inference by combining sequential identification of hand load parameters and probability distribution will be described.
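To show why the regressor form turns the estimation into a linear problem, the following sketch identifies the load parameters of a hypothetical one-joint link by ordinary least squares. The one-joint model, the parameter vector M = [m c^2, m c], the trajectory, and the noise level are assumptions for illustration; the source does not specify how the learning units consume the converted variables.

```python
import numpy as np

G = 9.81  # gravitational acceleration [m/s^2]

def regressor(theta, dtheta, ddtheta):
    """Regressor rows for a hypothetical 1-joint link carrying a point load.

    tau = (m*c^2) * ddtheta + (m*c) * G * cos(theta), so the parameter vector
    is M = [m*c^2, m*c]. dtheta is unused in this simple model.
    """
    return np.column_stack([ddtheta, G * np.cos(theta)])

# Sinusoidal joint trajectory with analytic derivatives.
rng = np.random.default_rng(2)
t = np.linspace(0.0, 2.0, 200)
w = 2.0 * np.pi
theta = 0.8 * np.sin(w * t)
dtheta = 0.8 * w * np.cos(w * t)
ddtheta = -0.8 * w**2 * np.sin(w * t)

mass, com = 2.0, 0.3  # true load mass [kg] and center-of-gravity distance [m]
true_params = np.array([mass * com**2, mass * com])

k = regressor(theta, dtheta, ddtheta)
tau = k @ true_params + 0.01 * rng.standard_normal(t.size)  # noisy load torque

# tau is linear in M, so M follows from linear least squares.
params, *_ = np.linalg.lstsq(k, tau, rcond=None)
com_est = params[0] / params[1]
mass_est = params[1] / com_est
```

The torque is nonlinear in θ but linear in M, so a single `lstsq` call recovers the parameters; physical quantities such as mass are then obtained from the identified products.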

実施の形態５．
実施の形態５では、各関節の位置θ、速度θ′または加速度θ′′に関係する摩擦特性の情報を学習に用いる場合について説明する。図１１は、実施の形態５にかかる手先負荷推定装置５の構成を示す図である。実施の形態５では、実施の形態１から４と同一の構成要素には同一の符号を付し、実施の形態１から４とは異なる構成について主に説明する。 Embodiment 5.
In the fifth embodiment, a case will be described in which information on friction characteristics related to the position θ, velocity θ', or acceleration θ'' of each joint is used for learning. FIG. 11 is a diagram showing the configuration of a hand end load estimating device 5 according to the fifth embodiment. In the fifth embodiment, the same components as those in the first to fourth embodiments are given the same reference numerals, and the configuration different from the first to fourth embodiments will be mainly described.

手先負荷推定装置５は、入力部１１、逆動力学モデル部１２、学習部１３ｅ、学習結果記憶部１４ｅおよび推論部１５ｅを有する。学習部１３ｅには、学習用データとして、学習時の入力データに基づく各関節の位置θ、速度θ′および加速度θ′′と、学習時の入力データに基づく各関節の差分τ_ｒｅａｌ－τ_ｍｄｌとが入力される。 The hand load estimation device 5 has an input unit 11, an inverse dynamics model unit 12, a learning unit 13e, a learning result storage unit 14e, and an inference unit 15e. The position θ, velocity θ', and acceleration θ'' of each joint based on the input data at the time of learning, and the difference τ _real -τ _mdl of each joint based on the input data at the time of learning are input to the learning unit 13e as learning data.

また、学習部１３ｅには、摩擦特性の情報が入力される。学習部１３ｅに入力される摩擦特性の情報は、学習用データとは別にあらかじめ同定された情報であって、各関節の位置θ、速度θ′および加速度θ′′の少なくとも１つに関係する情報である。なお、摩擦特性が位置θ、速度θ′および加速度θ′′の少なくとも１つに関係するとは、位置θ、速度θ′および加速度θ′′の少なくとも１つの変化に対応して摩擦特性が変化することを指す。 In addition, information on friction characteristics is input to the learning unit 13e. The information on friction characteristics input to the learning unit 13e is information identified in advance separately from the learning data, and is information related to at least one of the position θ, velocity θ', and acceleration θ'' of each joint. Note that the friction characteristics being related to at least one of the position θ, velocity θ', and acceleration θ'' refers to the friction characteristics changing in response to a change in at least one of the position θ, velocity θ', and acceleration θ''.

学習部１３ｅは、学習用データと摩擦特性の情報とを用いた学習によって、各関節の位置θ、速度θ′または加速度θ′′に関係する摩擦特性による影響が除かれた手先負荷パラメータを学習する。学習部１３ｅは、学習結果を学習結果記憶部１４ｅに保存する。推論部１５ｅは、学習部１３ｅによる学習結果を学習結果記憶部１４ｅから読み出す。推論部１５ｅは、学習部１３ｅによる学習結果を基に、手先負荷パラメータを推定する。 The learning unit 13e learns hand load parameters from which the influence of friction characteristics related to the position θ, velocity θ', or acceleration θ'' of each joint has been removed by learning using the learning data and information on friction characteristics. The learning unit 13e stores the learning results in the learning result storage unit 14e. The inference unit 15e reads out the learning results by the learning unit 13e from the learning result storage unit 14e. The inference unit 15e estimates the hand load parameters based on the learning results by the learning unit 13e.

位置θ、速度θ′または加速度θ′′に関係する摩擦特性は、モデル化誤差の要因、またはモデル外特性の１つに挙げられる。実施の形態５によると、手先負荷推定装置５は、各関節の位置θ、速度θ′または加速度θ′′に関係する摩擦特性による影響が除かれた手先負荷パラメータを学習し、学習結果を基に手先負荷パラメータを推論する。手先負荷推定装置５は、位置θ、速度θ′または加速度θ′′に関係する摩擦特性の影響が低減された手先負荷パラメータを推論可能であることによって、手先負荷パラメータの高精度な推定結果を得ることができる。 Friction characteristics related to position θ, velocity θ', or acceleration θ'' are cited as one of the causes of modeling error or characteristics outside the model. According to embodiment 5, the hand load estimation device 5 learns hand load parameters from which the influence of friction characteristics related to position θ, velocity θ', or acceleration θ'' of each joint has been removed, and infers the hand load parameters based on the learning results. The hand load estimation device 5 is capable of inferring hand load parameters from which the influence of friction characteristics related to position θ, velocity θ', or acceleration θ'' has been reduced, thereby making it possible to obtain highly accurate estimation results of the hand load parameters.

なお、学習部１３ｅには、各関節の温度に関係する摩擦特性の情報が入力されても良い。各関節の温度に関係する摩擦特性の情報は、学習用データとは別にあらかじめ同定された情報である。この場合、学習部１３ｅは、各関節の温度に関係する摩擦特性による影響が除かれた手先負荷パラメータを学習する。摩擦特性が温度に関係するとは、温度の変化に対応して摩擦特性が変化することを指す。この場合も、手先負荷推定装置５は、手先負荷パラメータの高精度な推定結果を得ることができる。なお、実施の形態１から４の手先負荷推定装置１，２，３，４に、実施の形態５と同様の学習を適用しても良い。 Information on friction characteristics related to the temperature of each joint may be input to the learning unit 13e. The information on friction characteristics related to the temperature of each joint is information that is identified in advance separately from the learning data. In this case, the learning unit 13e learns hand load parameters from which the influence of the friction characteristics related to the temperature of each joint has been removed. The friction characteristics being related to temperature refers to the friction characteristics changing in response to a change in temperature. In this case as well, the hand load estimation device 5 can obtain highly accurate estimation results of the hand load parameters. In addition, learning similar to that of the fifth embodiment may be applied to the hand load estimation devices 1, 2, 3, and 4 of the first to fourth embodiments.

実施の形態６．
実施の形態６では、手先負荷パラメータの信頼度を示す指標を計算および出力する場合について説明する。図１２は、実施の形態６にかかる手先負荷推定装置６の構成を示す図である。実施の形態６では、実施の形態１から５と同一の構成要素には同一の符号を付し、実施の形態１から５とは異なる構成について主に説明する。 Embodiment 6.
In the sixth embodiment, a case where an index indicating the reliability of a hand load parameter is calculated and output will be described. Fig. 12 is a diagram showing the configuration of a hand load estimating device 6 according to the sixth embodiment. In the sixth embodiment, the same components as those in the first to fifth embodiments are given the same reference numerals, and the configuration different from the first to fifth embodiments will be mainly described.

手先負荷推定装置６は、入力部１１、逆動力学モデル部１２、学習部１３ｆ、学習結果記憶部１４ｆおよび推論部１５ｆを有する。学習部１３ｆは、関節における摩擦特性の温度依存性による影響が除かれた手先負荷パラメータを、学習用データを用いて学習する。学習部１３ｆは、学習結果を学習結果記憶部１４ｆに保存する。推論部１５ｆは、学習部１３ｆによる学習結果を学習結果記憶部１４ｆから読み出す。推論部１５ｆは、学習部１３ｆによる学習結果を基に、手先負荷パラメータを推論する。さらに、学習部１３ｆは、信頼度の指標を計算し、計算結果を学習結果記憶部１４ｆに保存する。推論部１５ｆは、信頼度の指標を学習結果記憶部１４ｆから読み出し、手先負荷パラメータの推論結果とともに信頼度の指標を出力する。 The hand load estimation device 6 has an input unit 11, an inverse dynamics model unit 12, a learning unit 13f, a learning result storage unit 14f, and an inference unit 15f. The learning unit 13f uses learning data to learn hand load parameters from which the effects of temperature dependency of friction characteristics in joints have been removed. The learning unit 13f stores the learning results in the learning result storage unit 14f. The inference unit 15f reads out the learning results by the learning unit 13f from the learning result storage unit 14f. The inference unit 15f infers hand load parameters based on the learning results by the learning unit 13f. Furthermore, the learning unit 13f calculates a reliability index and stores the calculation result in the learning result storage unit 14f. The inference unit 15f reads out the reliability index from the learning result storage unit 14f, and outputs the reliability index together with the inference result of the hand load parameter.

ニューラルネットワークを用いた推論を行う場合、信頼度の指標は、学習時の確率分布と推論時の確率分布との差異を表す尺度を用いることによって求めることができる。尺度としては、ＪＳダイバージェンス(Jensen-Shannon divergence)、ＫＬダイバージェンス(Kullback-Leibler divergence)などがあり、いずれを用いても良い。学習部１３ｆは、ニューラルネットワークの出力としてパラメータごとの平均値と分散とを設定する。ここで、差異を表す尺度は、正解ラベルの確率分布と、ニューラルネットワークの出力として得られる平均値および分散で表現される推論値の確率分布との差であって、ＫＬダイバージェンス等で計算して得られる値である。正解ラベルは、平均値に対する教師信号である。学習部１３ｆは、ニューラルネットワークの構築によって、各パラメータの平均値と分散とを正解ラベルの確率分布に近づけるような学習を行う。手先負荷推定装置６は、各パラメータについての平均値を推定値、分散を信頼度の指標として扱う。分散が小さいほど、推定値の信頼度が高いと評価される。 When inference is performed using a neural network, the reliability index can be obtained from a measure of the difference between the probability distribution at the time of learning and the probability distribution at the time of inference. Candidate measures include the Jensen-Shannon (JS) divergence and the Kullback-Leibler (KL) divergence, and either may be used. The learning unit 13f configures the neural network to output a mean value and a variance for each parameter. Here, the measure is the difference, computed with KL divergence or the like, between the probability distribution of the correct label and the probability distribution of the inferred value expressed by the mean value and variance output by the neural network. The correct label is the teacher signal for the mean value. By constructing the neural network, the learning unit 13f learns to bring the distribution described by the mean value and variance of each parameter closer to the probability distribution of the correct label. The hand load estimation device 6 treats the mean value of each parameter as the estimate and the variance as the reliability index. The smaller the variance, the more reliable the estimate is judged to be.
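
As a rough illustration of the measure described above, the following sketch computes the KL divergence between two univariate Gaussian distributions, one standing in for the distribution of the correct label and the other for the distribution given by the mean and variance output by the network. The function name `gaussian_kl` and the assumption that both distributions are Gaussian are introduced here for illustration only and are not taken from the specification.

```python
import math


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """Return KL(p || q) for univariate Gaussians p = N(mu_p, var_p), q = N(mu_q, var_q).

    Hypothetical helper: p plays the role of the correct-label distribution,
    q the distribution described by the network's mean and variance outputs.
    """
    if var_p <= 0.0 or var_q <= 0.0:
        raise ValueError("variances must be positive")
    return 0.5 * (
        math.log(var_q / var_p)
        + (var_p + (mu_p - mu_q) ** 2) / var_q
        - 1.0
    )
```

The value is zero when the two distributions coincide and grows as the predicted mean or variance departs from the label distribution, so it could serve as a training loss in a setup of this kind.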

ニューラルネットワークを用いた推論を行う場合、Ｄｒｏｐｏｕｔによる出力結果を信頼度指標とすることもできる。Ｄｒｏｐｏｕｔは、ニューラルネットワークのユニットごとにある割合でランダムにユニットの利用および非利用を設定して学習する手法として知られる。この場合には、推論時にも同様の割合でユニットをランダムに利用または非利用としたいくつかのニューラルネットワークを用意してそれぞれで推定を行うことで、推定のばらつきを確保する。複数の推定結果の平均値を推定値、分散を信頼度指標として扱い、分散が小さいほど、推定値の信頼度が高いと評価される。 When making inference using a neural network, the output result from Dropout can also be used as a reliability index. Dropout is known as a technique for learning by randomly setting the use and non-use of units at a certain ratio for each unit of a neural network. In this case, several neural networks are prepared in which units are randomly used or not used at the same ratio during inference, and estimation is performed with each of them, ensuring variation in the estimation. The average value of multiple estimation results is treated as the estimate, and the variance is treated as a reliability index, and the smaller the variance, the higher the reliability of the estimate is evaluated.
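
The Dropout-based index can be sketched as follows, assuming a toy network with a single scalar input, one hidden ReLU layer, and fixed weights. The network shape, the weights, and the names `forward_with_dropout` and `mc_dropout_estimate` are hypothetical; the sketch only shows how repeated randomized passes yield an estimate (mean) and a reliability index (variance).

```python
import random
import statistics


def forward_with_dropout(x, w_hidden, w_out, keep_prob, rng):
    """One stochastic forward pass; each hidden unit is kept with probability keep_prob."""
    total = 0.0
    for w_h, w_o in zip(w_hidden, w_out):
        if rng.random() < keep_prob:
            activation = max(0.0, w_h * x)         # ReLU hidden unit
            total += w_o * activation / keep_prob  # inverted-dropout scaling
    return total


def mc_dropout_estimate(x, w_hidden, w_out, keep_prob=0.8, passes=200, seed=0):
    """Return (mean, variance) of the outputs over several randomized passes."""
    rng = random.Random(seed)
    samples = [
        forward_with_dropout(x, w_hidden, w_out, keep_prob, rng)
        for _ in range(passes)
    ]
    return statistics.mean(samples), statistics.pvariance(samples)
```

With `keep_prob=1.0` every pass is identical and the variance is zero; lowering `keep_prob` spreads the outputs, and the resulting variance plays the role of the reliability index.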

ヒストグラムフィルタを用いた推論を行う場合、ヒストグラムフィルタの確信度の大きさを信頼度指標とすることができる。この場合には、すべての候補点での存在確率の和が１となるように正規化した場合に最大の存在確率を示す候補点の確率が大きいほど、推定値の信頼度が高いと評価される。上記の式（９）で計算されたヒストグラムフィルタのエントロピーＨの値が小さいほど推定値の信頼度が高いとする評価もできる。 When making inference using a histogram filter, the magnitude of the certainty of the histogram filter can be used as a reliability index. In this case, the greater the probability of the candidate point that shows the greatest existence probability when normalized so that the sum of the existence probabilities at all candidate points is 1, the higher the reliability of the estimated value is evaluated to be. It can also be evaluated that the smaller the value of the entropy H of the histogram filter calculated by the above formula (9), the higher the reliability of the estimated value.
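
The two histogram-filter indices can be sketched as below. Formula (9) is not reproduced in this part of the text, so the sketch assumes it is the Shannon entropy of the normalized candidate probabilities; the function name `histogram_confidence` is likewise hypothetical.

```python
import math


def histogram_confidence(weights):
    """Return (peak probability, entropy) of a histogram filter.

    weights: non-negative, unnormalized existence weights of the candidate points.
    A larger peak probability or a smaller entropy suggests a more reliable estimate.
    """
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("at least one weight must be positive")
    probs = [w / total for w in weights]  # normalize so the probabilities sum to 1
    peak = max(probs)
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return peak, entropy
```

A uniform histogram gives the lowest peak and the highest entropy, while a histogram concentrated on one candidate point gives a peak of 1 and an entropy of 0.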

ＧＰＲを用いた学習を行う場合、学習結果として得られている関数の分布の分散値を信頼度指標とすることができる。この場合には、推論時の位置θ、速度θ′、加速度θ′′の領域に対して、ＧＰＲが示すトルク誤差関数の分布の分散値が小さいほど、推定値の信頼度が高いと評価される。 When learning using GPR, the variance of the distribution of the function obtained as a result of learning can be used as a reliability index. In this case, the smaller the variance of the distribution of the torque error function shown by GPR for the region of position θ, velocity θ', and acceleration θ'' at the time of inference, the higher the reliability of the estimated value is evaluated to be.
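
A minimal sketch of a GPR variance used as a reliability index is given below. It assumes a one-dimensional input, an RBF kernel with unit prior variance, and a fixed noise variance, none of which are specified in the text; the actual device would use the position θ, velocity θ', and acceleration θ'' of each joint as inputs. All function names are hypothetical.

```python
import math


def rbf(a, b, length=1.0):
    """RBF kernel with unit prior variance (an assumed kernel choice)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_posterior_variance(train_x, query_x, noise_var=1e-2, length=1.0):
    """Posterior variance k(x*, x*) - k*^T (K + noise_var * I)^-1 k* at query_x."""
    gram = [
        [rbf(a, b, length) + (noise_var if i == j else 0.0)
         for j, b in enumerate(train_x)]
        for i, a in enumerate(train_x)
    ]
    k_star = [rbf(a, query_x, length) for a in train_x]
    alpha = solve(gram, k_star)
    return rbf(query_x, query_x, length) - sum(k * a for k, a in zip(k_star, alpha))
```

In this sketch the variance is small near the training inputs and approaches the prior variance far from them, which matches the idea that an estimate made in a region well covered by the learning data is more reliable.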

実施の形態６によると、手先負荷推定装置６は、手先負荷パラメータの信頼度を示す指標を計算し、手先負荷パラメータの推論結果とともに信頼度の指標を出力することによって、推定された手先負荷パラメータの信頼度の評価が可能となる。 According to the sixth embodiment, the hand load estimation device 6 calculates an index indicating the reliability of the hand load parameter, and outputs the reliability index together with the inference result of the hand load parameter, thereby making it possible to evaluate the reliability of the estimated hand load parameter.

実施の形態７．
実施の形態７では、推論部が手先負荷パラメータの推定値とともに信頼度の指標を出力する例について説明する。図１３は、実施の形態７にかかる手先負荷推定装置７の構成を示す図である。実施の形態７では、実施の形態１から６と同一の構成要素には同一の符号を付し、実施の形態１から６とは異なる構成について主に説明する。 Embodiment 7.
In the seventh embodiment, an example will be described in which the inference unit outputs a reliability index together with the estimated value of the hand load parameter. Fig. 13 is a diagram showing the configuration of a hand load estimation device 7 according to the seventh embodiment. In the seventh embodiment, the same components as those in the first to sixth embodiments are given the same reference numerals, and the configuration different from the first to sixth embodiments will be mainly described.

手先負荷推定装置７は、入力部１１、逆動力学モデル部１２，１２ｇ、学習部１３ｇ、学習結果記憶部１４ｇ、推論部１５ｇおよび変数変換部１６を有する。学習部１３ｇによる学習の際に、変数変換部１６は、学習時の入力データから得られた動きデータである位置θ、速度θ′および加速度θ′′を変数に変換し、学習部１３ｇへ変数を出力する。推論部１５ｇによる推論の際に、変数変換部１６は、推論時の入力データから得られた位置θ、速度θ′および加速度θ′′を変数に変換し、推論部１５ｇへ変数を出力する。 The hand load estimation device 7 has an input unit 11, inverse dynamics model units 12, 12g, a learning unit 13g, a learning result storage unit 14g, an inference unit 15g, and a variable conversion unit 16. When the learning unit 13g is learning, the variable conversion unit 16 converts the position θ, velocity θ', and acceleration θ'', which are movement data obtained from the input data during learning, into variables, and outputs the variables to the learning unit 13g. When the inference unit 15g is making an inference, the variable conversion unit 16 converts the position θ, velocity θ', and acceleration θ'', which are obtained from the input data during inference, into variables, and outputs the variables to the inference unit 15g.

学習部１３ｇには、学習用データとして、Ｋ（θ，θ′，θ′′）と、差分τ_ｒｅａｌ－τ_ｍｄｌとが入力される。リグレッサ行列であるＫ（θ，θ′，θ′′）は、上記実施の形態４にて説明するように、位置θ、速度θ′および加速度θ′′から変換された変数である。さらに、学習部１３ｇには、公称パラメータの情報が入力される。ＧＰＲを備える学習部１３ｇは、学習用データである無負荷時の位置θ、速度θ′および加速度θ′′を入力とし、差分τ_ｒｅａｌ－τ_ｍｄｌを出力とする関数の分布を学習する。学習部１３ｇは、学習結果である関数の分布を学習結果記憶部１４ｇに保存する。 The learning unit 13g receives K(θ, θ', θ'') and the difference τ_real - τ_mdl as learning data. The regressor matrix K(θ, θ', θ'') is a variable converted from the position θ, velocity θ', and acceleration θ'', as described in the fourth embodiment. Furthermore, information on nominal parameters is input to the learning unit 13g. The learning unit 13g, which is equipped with GPR, learns the distribution of a function that takes the no-load position θ, velocity θ', and acceleration θ'' in the learning data as input and outputs the difference τ_real - τ_mdl. The learning unit 13g stores the distribution of the function, which is the learning result, in the learning result storage unit 14g.

逆動力学モデル部１２ｇは、ＧＰＲの学習結果である関数の分布を学習結果記憶部１４ｇから読み出し、関数の分布を基に逆動力学モデルを補正する。逆動力学モデル部１２ｇは、ＧＰＲの学習結果により補正された逆動力学モデルへ、推論時の位置データに基づく位置θ、速度θ′および加速度θ′′が入力されることによって、無負荷と仮定した場合における推定トルクτ_ｅｓｔを出力する。 The inverse dynamics model unit 12g reads out the distribution of the function, which is the learning result of the GPR, from the learning result storage unit 14g, and corrects the inverse dynamics model based on the distribution of the function. When the position θ, velocity θ', and acceleration θ'' based on the position data at the time of inference are input to the inverse dynamics model corrected by the GPR learning result, the inverse dynamics model unit 12g outputs the estimated torque τ_est under the assumption of no load.

推論部１５ｇには、推論用データとして、Ｋ（θ，θ′，θ′′）とオフセットトルクΔτとが入力される。オフセットトルクΔτは、実トルクτ_ｒｅａｌと推定トルクτ_ｅｓｔとの差分τ_ｒｅａｌ－τ_ｅｓｔである。推論部１５ｇは、学習部１３ｇによる学習結果を学習結果記憶部１４ｇから読み出す。推論部１５ｇは、学習部１３ｇによる学習結果を基に、手先負荷パラメータを推定する。 The inference unit 15g receives K(θ, θ', θ'') and the offset torque Δτ as inference data. The offset torque Δτ is the difference τ_real - τ_est between the actual torque τ_real and the estimated torque τ_est. The inference unit 15g reads out the learning result of the learning unit 13g from the learning result storage unit 14g. The inference unit 15g estimates the hand load parameters based on the learning result of the learning unit 13g.

上記の式（１０）におけるＫ（θ，θ′，θ′′）とパラメータベクトルＭは、摩擦トルクに関する項を含む形に拡張できる。学習部１３ｇは、学習用の無負荷時のデータに含まれる各時刻の実トルクτ_ｒｅａｌを並べたベクトルであるＹと、変数変換部１６で各時刻の位置θ、速度θ′および加速度θ′′から計算されたリグレット行列であるＫと、逆動力学モデルに含まれる公称パラメータとを入力として、無負荷時に上記式（１０）により計算される公称トルクが実トルクに合うようなパラメータベクトルＭの最適解を計算する。パラメータベクトルＭの最適解は、次の式（１１）により表される。なお、式（１１）において、Ｍの上部に「＾」を付したものは、パラメータベクトルＭの最適解を表す。Ｍの上部に「￣」を付したものは、公称パラメータを表す。Ｔは転置行列を表す。Ｗは重み行列を表す。 K(θ, θ', θ'') and the parameter vector M in the above formula (10) can be extended to include terms related to friction torque. The learning unit 13g takes three inputs: Y, a vector in which the actual torque τ_real at each time in the no-load learning data is arranged; K, the regressor matrix calculated by the variable conversion unit 16 from the position θ, velocity θ', and acceleration θ'' at each time; and the nominal parameters included in the inverse dynamics model. From these, it calculates the optimal solution of the parameter vector M such that the nominal torque calculated by the above formula (10) at no load matches the actual torque. The optimal solution of the parameter vector M is expressed by the following formula (11). In formula (11), M with "^" above it represents the optimal solution of the parameter vector M. M with "￣" above it represents the nominal parameters. T denotes the matrix transpose. W represents a weighting matrix.

オフセットトルクΔτと、推定対象の手先負荷パラメータベクトルｍとについて、次の式（１２）が成り立つ。これにより、推論部１５ｇは、任意の線形回帰手法によって手先負荷パラメータベクトルｍを推定することができる。 The following equation (12) holds for the offset torque Δτ and the hand load parameter vector m to be estimated. This allows the inference unit 15g to estimate the hand load parameter vector m using any linear regression method.

線形回帰手法としてベイズ線形回帰が用いられる場合、推論部１５ｇは、手先負荷パラメータの推定結果とともに、推定結果の信頼度についての指標を出力することができる。 When Bayesian linear regression is used as the linear regression method, the inference unit 15g can output an indicator of the reliability of the estimation result together with the estimation result of the hand load parameters.

図１４は、実施の形態７におけるベイズ線形回帰による確率分布の更新について説明するための概念図である。事前確率分布であるＮ（ｍ｜ｕ_０，Ｓ_０）は、観測データΔτ，Ｋ（θ，θ′，θ′′）により事後確率分布であるＮ（ｍ｜ｕ，Ｓ）へ更新される。かかる更新は繰り返される。ベイズ線形回帰では、時刻ｔのデータを入力としたときの手先負荷パラメータベクトルｍが確率分布Ｎ（ｍ｜ｕ，Ｓ）に従うとして、平均ｕおよび分散Ｓは、それぞれ次の式（１３）、式（１４）に従って更新される。ｕ_０は、平均の事前確率分布を表す。Ｓ_０は、分散の事前確率分布を表す。βは、トルク誤差の共分散の逆数を表すパラメータであって、トルクデータに含まれるノイズ分布に応じて設定される。 FIG. 14 is a conceptual diagram for explaining the update of the probability distribution by Bayesian linear regression in the seventh embodiment. The prior probability distribution N(m|u_0, S_0) is updated to the posterior probability distribution N(m|u, S) by the observation data Δτ and K(θ, θ', θ''). This update is repeated. In Bayesian linear regression, the hand load parameter vector m given the data at time t is assumed to follow the probability distribution N(m|u, S), and the mean u and variance S are updated according to the following formulas (13) and (14), respectively. u_0 is the mean of the prior probability distribution. S_0 is the variance of the prior probability distribution. β is a parameter representing the inverse of the covariance of the torque error, and is set according to the noise distribution included in the torque data.

推論部１５ｇは、手先負荷パラメータの推定値として確率分布の平均ｕを出力する。また、推論部１５ｇは、信頼度の指標として確率分布の分散Ｓを出力する。分散Ｓが小さいほど、推定値の信頼度が高いと評価される。 The inference unit 15g outputs the mean u of the probability distribution as the estimate of the hand load parameters. The inference unit 15g also outputs the variance S of the probability distribution as the reliability index. The smaller the variance S, the more reliable the estimate is judged to be.
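
Formulas (13) and (14) are not reproduced in this text. The sketch below therefore assumes the standard Bayesian linear regression update with a Gaussian prior and noise precision β, written in its sequential (one scalar torque sample at a time) form using the Sherman-Morrison identity. The function name `bayes_linreg_update` and the two-parameter example are hypothetical.

```python
def bayes_linreg_update(mean, cov, k_row, delta_tau, beta):
    """Update N(m | mean, cov) with one observation delta_tau = k_row . m + noise.

    mean: list of length n (u), cov: n x n symmetric list of lists (S),
    k_row: one row of the regressor matrix K, beta: assumed noise precision.
    """
    n = len(mean)
    s_k = [sum(cov[i][j] * k_row[j] for j in range(n)) for i in range(n)]  # S k
    denom = 1.0 + beta * sum(k_row[i] * s_k[i] for i in range(n))          # 1 + beta k^T S k
    new_cov = [
        [cov[i][j] - beta * s_k[i] * s_k[j] / denom for j in range(n)]
        for i in range(n)
    ]
    residual = delta_tau - sum(k_row[i] * mean[i] for i in range(n))
    gain = [beta * sum(new_cov[i][j] * k_row[j] for j in range(n)) for i in range(n)]
    new_mean = [mean[i] + gain[i] * residual for i in range(n)]
    return new_mean, new_cov


# Usage: recover a hypothetical two-element parameter vector from noise-free samples.
true_m = [2.0, 0.5]
u, S = [0.0, 0.0], [[10.0, 0.0], [0.0, 10.0]]
for row in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]):
    observed = sum(r * m for r, m in zip(row, true_m))
    u, S = bayes_linreg_update(u, S, row, observed, beta=100.0)
```

After the loop, `u` is close to `true_m` and the diagonal of `S` has shrunk well below the prior value, which illustrates how the variance can act as a reliability index that improves as data accumulate.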

実施の形態７によると、手先負荷推定装置７は、手先負荷パラメータの推論結果とともに信頼度の指標を出力することによって、推定された手先負荷パラメータの信頼度の評価が可能となる。 According to the seventh embodiment, the hand load estimation device 7 outputs an index of reliability together with the inference result of the hand load parameter, thereby making it possible to evaluate the reliability of the estimated hand load parameter.

実施の形態８．
図１５は、実施の形態８にかかるロボットシステム８の構成を示す図である。ロボットシステム８は、ロボット制御システム９とロボット４０とを有する。ロボット制御システム９は、手先負荷推定装置１と、ロボット４０を制御するロボット制御装置３０とを有する。実施の形態８において、ロボット制御装置３０は、ロボット４０の動作中において、手先負荷推定装置１から出力される手先負荷パラメータを基に、ロボット４０の加減速度を調整する。実施の形態８では、実施の形態１から７と同一の構成要素には同一の符号を付し、実施の形態１から７とは異なる構成について主に説明する。なお、ロボットシステム８は、実施の形態１から７のいずれか１つに記載の手先負荷推定装置を有するものであれば良い。 Embodiment 8.
FIG. 15 is a diagram showing a configuration of a robot system 8 according to an eighth embodiment. The robot system 8 includes a robot control system 9 and a robot 40. The robot control system 9 includes a hand load estimation device 1 and a robot control device 30 that controls the robot 40. In the eighth embodiment, the robot control device 30 adjusts the acceleration/deceleration of the robot 40 based on the hand load parameters output from the hand load estimation device 1 during the operation of the robot 40. In the eighth embodiment, the same components as those in the first to seventh embodiments are denoted by the same reference numerals, and the configuration different from the first to seventh embodiments will be mainly described. The robot system 8 may include the hand load estimation device according to any one of the first to seventh embodiments.

ロボット制御装置３０は、指令生成部３１を有する。ロボット制御装置３０は、手先負荷推定装置１から出力される手先負荷パラメータを受信する。指令生成部３１は、手先負荷推定装置１から出力される手先負荷パラメータを基に、ロボット４０へ送る指令を生成する。ロボット制御装置３０は、生成された指令をロボット４０へ送る。 The robot controller 30 has a command generator 31. The robot controller 30 receives the hand load parameters output from the hand load estimation device 1. The command generator 31 generates a command to be sent to the robot 40 based on the hand load parameters output from the hand load estimation device 1. The robot controller 30 sends the generated command to the robot 40.

手先負荷推定装置１は、ロボット制御装置３０と通信可能に接続されている。手先負荷推定装置１は、ロボット制御装置３０の外部の装置に限定されない。手先負荷推定装置１は、ロボット制御装置３０に内蔵されても良い。実施の形態１から７のいずれか１つにおける学習および推論は、ロボット制御装置３０の内部で実施されても良い。 The hand load estimation device 1 is communicatively connected to the robot control device 30. The hand load estimation device 1 is not limited to being an external device to the robot control device 30. The hand load estimation device 1 may be built into the robot control device 30. The learning and inference in any one of the first to seventh embodiments may be performed inside the robot control device 30.

または、手先負荷推定装置１の学習部１３および推論部１５のうち、学習部１３がロボット制御装置３０の外部の装置によって実現され、推論部１５がロボット制御装置３０の内部の手先負荷推定装置１に内蔵されても良い。この場合、入力部１１、逆動力学モデル部１２、学習部１３および学習結果記憶部１４を有する学習装置がロボット制御装置３０の外部に設けられる。また、かかる学習装置と同様の入力部１１および逆動力学モデル部１２と、学習装置の学習結果記憶部１４から読み出された学習済モデルに基づいた推論を行う推論部１５とが、ロボット制御装置３０の内部の手先負荷推定装置１に設けられる。このように、実施の形態１から７のいずれか１つにおける学習はロボット制御装置３０の外部で実施され、かつ、実施の形態１から７のいずれか１つにおける推論はロボット制御装置３０の内部で実施されても良い。 Alternatively, of the learning unit 13 and the inference unit 15 of the hand load estimation device 1, the learning unit 13 may be realized by a device external to the robot control device 30, and the inference unit 15 may be built into the hand load estimation device 1 inside the robot control device 30. In this case, a learning device having an input unit 11, an inverse dynamics model unit 12, a learning unit 13, and a learning result storage unit 14 is provided outside the robot control device 30. Also, an input unit 11 and an inverse dynamics model unit 12 similar to the learning device, and an inference unit 15 that performs inference based on a learned model read from the learning result storage unit 14 of the learning device are provided in the hand load estimation device 1 inside the robot control device 30. In this way, the learning in any one of the embodiments 1 to 7 may be performed outside the robot control device 30, and the inference in any one of the embodiments 1 to 7 may be performed inside the robot control device 30.

次に、ロボット制御装置３０によるロボット４０の加減速度の変更について説明する。ロボット制御装置３０は、各関節を動作させるトルクの許容値を超えない範囲で最短となる加速時間および減速時間を算出する機能を有する。ロボット制御装置３０は、手先負荷パラメータを用いて、ロボット４０の加速開始地点、加速終了地点、減速開始地点、減速終了地点の各々についての運動方程式を計算することによって、加速時間および減速時間を求める。ロボット制御装置３０は、加速時間および減速時間を基に、加減速度を変更する。 Next, the change in acceleration/deceleration of the robot 40 by the robot controller 30 will be described. The robot controller 30 has a function of calculating the shortest acceleration time and deceleration time that do not exceed the allowable value of the torque that operates each joint. The robot controller 30 determines the acceleration time and deceleration time by using the hand load parameters to calculate the equation of motion for each of the acceleration start point, acceleration end point, deceleration start point, and deceleration end point of the robot 40. The robot controller 30 changes the acceleration/deceleration speed based on the acceleration time and deceleration time.

ロボット制御装置３０は、加減速終了地点までのデータによって推定された手先負荷パラメータを用いて、減速開始地点と減速終了地点とについての計算を行う。この場合、実トルクが許容値を超えないように、いわば保守的な固定値に設定された手先負荷パラメータを基に減速時間が計算される場合に比べて、減速時間の短縮が可能となる。ロボット制御装置３０は、トルクの許容値を超えない範囲で減速時間を短縮可能であることによって、トルクを過大とさせること無くロボット４０の高速化を実現できる。 The robot control device 30 calculates the deceleration start point and deceleration end point using the hand load parameters estimated from the data up to the acceleration/deceleration end point. In this case, it is possible to shorten the deceleration time compared to when the deceleration time is calculated based on the hand load parameters that are set to a conservative fixed value so that the actual torque does not exceed the allowable value. By being able to shorten the deceleration time without exceeding the allowable torque value, the robot control device 30 can achieve an increase in the speed of the robot 40 without excessive torque.

ロボット制御装置３０は、手先負荷パラメータの推定結果とともに、推定結果の信頼度を示す指標が得られる場合、信頼度の指標に応じて加減速度をさらに調整しても良い。例えば、ロボット制御装置３０は、最適加減速の演算に用いられる手先負荷パラメータの値に信頼度を反映させることができる。この場合、ロボット制御装置３０は、手先負荷パラメータが固定値である場合よりも減速時間が長くならない範囲で、信頼度に応じて手先負荷パラメータを保守的な方向に補正することができる。 When an index indicating the reliability of the estimation result is obtained along with the hand load parameter estimation result, the robot controller 30 may further adjust the acceleration/deceleration rate according to the reliability index. For example, the robot controller 30 may reflect the reliability in the value of the hand load parameter used in the calculation of the optimal acceleration/deceleration. In this case, the robot controller 30 may correct the hand load parameter in a conservative direction according to the reliability, within a range in which the deceleration time is not longer than when the hand load parameter is a fixed value.

ロボット制御装置３０は、推定結果である手先負荷パラメータの値をそのまま用いて計算された減速時間に信頼度を反映させても良い。この場合、ロボット制御装置３０は、手先パラメータの推定結果をそのまま用いて計算された減速時間に、信頼度に応じた補正係数ａ（ただし、ａ＞１とする）を乗じた値をロボット４０の減速時間として用いる。 The robot controller 30 may reflect the reliability in the deceleration time calculated using the estimated hand load parameter values as they are. In this case, the robot controller 30 multiplies the deceleration time calculated directly from the estimated hand load parameters by a correction coefficient a (where a > 1) that depends on the reliability, and uses the product as the deceleration time of the robot 40.
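
The correction by the coefficient a might look like the sketch below. The text only states that a > 1 and that a depends on the reliability; the linear mapping `a = 1 + gain * variance`, the optional cap at the deceleration time obtained with fixed conservative parameters, and the function name are all assumptions made for illustration. In this sketch a equals 1 when the variance is exactly zero.

```python
def corrected_deceleration_time(nominal_time, variance, gain=1.0, fixed_value_time=None):
    """Scale a deceleration time by a coefficient a that grows with the variance.

    nominal_time: deceleration time computed from the estimated parameters as they are.
    variance: reliability index (a larger value means a less reliable estimate).
    fixed_value_time: optional upper bound, e.g. the time obtained with fixed
        conservative parameters (hypothetical cap).
    """
    if nominal_time <= 0.0 or variance < 0.0:
        raise ValueError("nominal_time must be positive and variance non-negative")
    a = 1.0 + gain * variance  # hypothetical mapping from reliability to a
    corrected = a * nominal_time
    if fixed_value_time is not None:
        corrected = min(corrected, fixed_value_time)
    return corrected
```

For example, a nominal time of 0.4 s with a variance of 0.5 gives 0.6 s, or 0.5 s if the fixed-value time is 0.5 s.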

ロボット制御装置３０は、手先負荷パラメータの推定に用いられるデータを取得するための１つの動作が終了した後において負荷が加えられた状態で続けられる次の動作の加速時間および減速時間を計算する際に、推定された当該手先負荷パラメータを用いても良い。 The robot control device 30 may use the estimated hand load parameters when calculating the acceleration and deceleration times of the next operation that is continued under load after one operation for obtaining data used to estimate the hand load parameters is completed.

実施の形態８によると、ロボットシステム８は、手先負荷パラメータを基に、ロボットの加減速度を変更することによって、ロボット４０の高速化を実現できる。 According to the eighth embodiment, the robot system 8 can increase the speed of the robot 40 by changing the acceleration/deceleration of the robot based on the hand load parameters.

実施の形態９．
実施の形態９では、図１５を参照して、ロボットシステム８においてロボット４０の衝突検知のためのパラメータを調整する例について説明する。ロボット制御装置３０は、ロボット４０の動作中において、手先負荷パラメータを基に、ロボット４０の衝突を検知するためのパラメータである閾値を調整する。閾値は、ロボット４０と障害物との衝突の有無を判断するための閾値である。 Embodiment 9.
In the ninth embodiment, an example of adjusting parameters for detecting a collision of the robot 40 in the robot system 8 will be described with reference to Fig. 15. The robot control device 30 adjusts a threshold value, which is a parameter for detecting a collision of the robot 40, based on the hand load parameter while the robot 40 is in operation. The threshold value is a threshold value for determining whether or not the robot 40 has collided with an obstacle.

ロボット制御装置３０は、各関節の実トルクと推定トルクとの比較によって衝突を検知する機能を有する。実トルクが推定トルクよりも大きく、かつ実トルクと推定トルクとの差があらかじめ設定された閾値よりも大きい場合、ロボット制御装置３０は、障害物にロボット４０が衝突したと判断する。閾値には、正常にロボット４０が動作しているときの実トルクと推定トルクとの誤差による誤検知を防ぐためのマージンがあらかじめ設けられる。これに加えて、ロボット制御装置３０は、手先負荷パラメータを基に閾値を調整することによって、閾値を固定とする場合に比べて衝突の検知感度を高めることができる。 The robot controller 30 has a function of detecting a collision by comparing the actual torque and estimated torque of each joint. If the actual torque is greater than the estimated torque and the difference between the actual torque and the estimated torque is greater than a preset threshold, the robot controller 30 determines that the robot 40 has collided with an obstacle. A margin is set in advance for the threshold to prevent erroneous detection due to an error between the actual torque and the estimated torque when the robot 40 is operating normally. In addition, the robot controller 30 can increase the collision detection sensitivity by adjusting the threshold based on the hand load parameters compared to when the threshold is fixed.

ロボット制御装置３０は、手先負荷パラメータの推定結果とともに、推定結果の信頼度を示す指標が得られる場合、信頼度の指標に応じて閾値をさらに調整しても良い。例えば、ロボット制御装置３０は、推定トルクの演算に用いられる手先負荷パラメータの値に信頼度を反映させることができる。この場合、ロボット制御装置３０は、信頼度に応じて補正された手先負荷パラメータを基に閾値を調整することで、いわば保守的な方向に閾値を調整することができる。 When an index indicating the reliability of the estimation result is obtained together with the hand load parameter estimation result, the robot control device 30 may further adjust the threshold value according to the reliability index. For example, the robot control device 30 can reflect the reliability in the value of the hand load parameter used to calculate the estimated torque. In this case, the robot control device 30 can adjust the threshold value in a conservative direction, so to speak, by adjusting the threshold value based on the hand load parameter corrected according to the reliability.

ロボット制御装置３０は、推定結果である手先負荷パラメータの値をそのまま用いて計算された閾値の信頼度を反映させても良い。この場合、ロボット制御装置３０は、手先パラメータの推定結果をそのまま用いて推定トルクを計算し、推定トルクを基に求めた閾値に、信頼度に応じた補正係数ｂ（ただし、ｂ＞１とする）を乗じることによって閾値を調整する。 The robot controller 30 may reflect the reliability in the threshold calculated using the estimated hand load parameter values as they are. In this case, the robot controller 30 calculates the estimated torque directly from the estimated hand load parameters, and adjusts the threshold by multiplying the threshold obtained from that estimated torque by a correction coefficient b (where b > 1) that depends on the reliability.
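
The collision check and the threshold correction of the ninth embodiment can be sketched as follows. The per-joint comparison follows the description above; the mapping `b = 1 + gain * variance` and both function names are hypothetical, since the text only states that b > 1 and depends on the reliability.

```python
def adjusted_threshold(base_threshold, variance, gain=1.0):
    """Widen a collision threshold by a coefficient b that grows with the variance."""
    if base_threshold < 0.0 or variance < 0.0:
        raise ValueError("base_threshold and variance must be non-negative")
    b = 1.0 + gain * variance  # hypothetical mapping from reliability to b
    return b * base_threshold


def collision_detected(actual_torques, estimated_torques, thresholds):
    """Report a collision if, at any joint, the actual torque exceeds the
    estimated torque by more than that joint's threshold."""
    return any(
        actual > estimated and (actual - estimated) > threshold
        for actual, estimated, threshold in zip(
            actual_torques, estimated_torques, thresholds
        )
    )
```

A less reliable estimate widens the threshold, which trades some detection sensitivity for fewer false detections during normal operation.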

実施の形態８と同様に、手先負荷推定装置１は、ロボット制御装置３０の外部の装置に限定されない。手先負荷推定装置１は、ロボット制御装置３０に内蔵されても良い。実施の形態１から７のいずれか１つにおける学習および推論は、ロボット制御装置３０の内部で実施されても良い。または、実施の形態１から７のいずれか１つにおける学習はロボット制御装置３０の外部で実施され、かつ、実施の形態１から７のいずれか１つにおける推論はロボット制御装置３０の内部で実施されても良い。 As in embodiment 8, the hand load estimation device 1 is not limited to being a device external to the robot control device 30. The hand load estimation device 1 may be built into the robot control device 30. The learning and inference in any one of embodiments 1 to 7 may be performed inside the robot control device 30. Alternatively, the learning in any one of embodiments 1 to 7 may be performed outside the robot control device 30, and the inference in any one of embodiments 1 to 7 may be performed inside the robot control device 30.

実施の形態９によると、ロボットシステム８は、手先負荷パラメータを基に、衝突検知のためのパラメータを調整することによって、障害物へのロボット４０の衝突を低減させ、ロボット４０の安全性を向上させることができる。 According to the ninth embodiment, the robot system 8 can reduce collisions of the robot 40 with obstacles and improve the safety of the robot 40 by adjusting the parameters for collision detection based on the hand load parameters.

次に、実施の形態１から７にかかる手先負荷推定装置１，２，３，４，５，６，７を実現するハードウェア構成について説明する。図１６は、実施の形態１から７にかかる手先負荷推定装置１，２，３，４，５，６，７を実現するハードウェアの第１の構成例を示す図である。第１の構成例は、手先負荷推定装置１，２，３，４，５，６，７の要部である逆動力学モデル部１２，１２ｂ，１２ｇ、学習部１３，１３ａ，１３ｂ，１３ｃ，１３ｄ，１３ｅ，１３ｆ，１３ｇ、推論部１５，１５ａ，１５ｂ，１５ｃ，１５ｄ，１５ｅ，１５ｆ，１５ｇおよび変数変換部１６を、プロセッサ５３とメモリ５４とを有する処理回路５１によって実現する場合の構成例である。 Next, a hardware configuration for realizing the hand load estimation device 1, 2, 3, 4, 5, 6, 7 according to the first to seventh embodiments will be described. FIG. 16 is a diagram showing a first example of the hardware configuration for realizing the hand load estimation device 1, 2, 3, 4, 5, 6, 7 according to the first to seventh embodiments. The first example of the configuration is a configuration example in which the inverse dynamics model unit 12, 12b, 12g, the learning unit 13, 13a, 13b, 13c, 13d, 13e, 13f, 13g, the inference unit 15, 15a, 15b, 15c, 15d, 15e, 15f, 15g, and the variable conversion unit 16, which are the main parts of the hand load estimation device 1, 2, 3, 4, 5, 6, 7, are realized by a processing circuit 51 having a processor 53 and a memory 54.

入力部５２は、手先負荷推定装置１，２，３，４，５，６，７に対する入力信号を外部から受信する回路である。図１等に示す入力部１１は、入力部５２により実現される。出力部５５は、手先負荷推定装置１，２，３，４，５，６，７で生成した信号を外部へ出力する回路である。出力部５５は、推定結果である手先負荷パラメータと、実施の形態６および７における信頼度の指標とを出力する。 The input unit 52 is a circuit that receives an input signal for the hand load estimation device 1, 2, 3, 4, 5, 6, 7 from the outside. The input unit 11 shown in FIG. 1 etc. is realized by the input unit 52. The output unit 55 is a circuit that outputs a signal generated by the hand load estimation device 1, 2, 3, 4, 5, 6, 7 to the outside. The output unit 55 outputs the hand load parameters that are the estimation results and the reliability index in the sixth and seventh embodiments.

プロセッサ５３は、ＣＰＵ（Central Processing Unit）である。プロセッサ５３は、演算装置、マイクロプロセッサ、マイクロコンピュータ、またはＤＳＰ（Digital Signal Processor）でも良い。メモリ５４は、例えば、ＲＡＭ（Random Access Memory）、ＲＯＭ（Read Only Memory）、フラッシュメモリ、ＥＰＲＯＭ（Erasable Programmable Read Only Memory）、ＥＥＰＲＯＭ（登録商標）（Electrically Erasable Programmable Read Only Memory）等の不揮発性または揮発性の半導体メモリ、磁気ディスク、フレキシブルディスク、光ディスク、コンパクトディスク、ミニディスクまたはＤＶＤ（Digital Versatile Disk）等である。 The processor 53 is a CPU (Central Processing Unit). The processor 53 may be an arithmetic unit, a microprocessor, a microcomputer, or a DSP (Digital Signal Processor). The memory 54 is, for example, a non-volatile or volatile semiconductor memory such as a RAM (Random Access Memory), a ROM (Read Only Memory), a flash memory, an EPROM (Erasable Programmable Read Only Memory), or an EEPROM (registered trademark) (Electrically Erasable Programmable Read Only Memory), a magnetic disk, a flexible disk, an optical disk, a compact disk, a mini disk, or a DVD (Digital Versatile Disk).

プロセッサ５３は、手先負荷推定プログラムを実行する。手先負荷推定プログラムは、手先負荷推定装置１，２，３，４，５，６，７の要部を構成する各部として動作するための処理が記述されたプログラムである。手先負荷推定プログラムは、メモリ５４にあらかじめ格納される。プロセッサ５３は、メモリ５４に格納されている手先負荷推定プログラムを読み出して実行することにより、手先負荷推定装置１，２，３，４，５，６，７の要部を構成する各部として動作する。また、メモリ５４は、各種情報を記憶する。学習結果記憶部１４，１４ａ，１４ｂ，１４ｃ，１４ｄ，１４ｅ，１４ｆ，１４ｇは、メモリ５４により実現される。メモリ５４は、プロセッサ５３が各種処理を実行する際の一時メモリにも使用される。 The processor 53 executes the hand load estimation program. The hand load estimation program is a program that describes the processing for each part constituting the main part of the hand load estimation device 1, 2, 3, 4, 5, 6, 7. The hand load estimation program is stored in advance in the memory 54. The processor 53 reads out and executes the hand load estimation program stored in the memory 54, thereby operating as each part constituting the main part of the hand load estimation device 1, 2, 3, 4, 5, 6, 7. The memory 54 also stores various information. The learning result storage parts 14, 14a, 14b, 14c, 14d, 14e, 14f, 14g are realized by the memory 54. The memory 54 is also used as a temporary memory when the processor 53 executes various processes.

手先負荷推定プログラムは、メモリ５４にあらかじめ格納されているものとしたがこれに限定されない。手先負荷推定プログラムは、コンピュータシステムによる読み取りが可能とされた記憶媒体に書き込まれた状態で手先負荷推定装置１，２，３，４，５，６，７のユーザに提供され、ユーザによってメモリ５４にインストールされても良い。記憶媒体は、フレキシブルディスクである可搬型記憶媒体、あるいは半導体メモリであるフラッシュメモリでも良い。手先負荷推定プログラムは、他のコンピュータあるいはサーバ装置から通信ネットワークを介してメモリ５４へインストールされても良い。 The hand load estimation program has been described as stored in advance in the memory 54, but the configuration is not limited to this. The hand load estimation program may be provided to a user of the hand load estimation device 1, 2, 3, 4, 5, 6, 7 written on a storage medium that can be read by a computer system, and may be installed in the memory 54 by the user. The storage medium may be a portable storage medium such as a flexible disk, or a flash memory, which is a semiconductor memory. The hand load estimation program may also be installed in the memory 54 from another computer or server device via a communication network.

実施の形態１から７にかかる手先負荷推定装置１，２，３，４，５，６，７の要部は、専用のハードウェアによって実現しても良い。図１７は、実施の形態１から７にかかる手先負荷推定装置１，２，３，４，５，６，７を実現するハードウェアの第２の構成例を示す図である。第２の構成例は、図１６に示す処理回路５１の機能を、専用のハードウェアである処理回路５６により実現する場合の構成例である。 The main parts of the hand load estimation devices 1, 2, 3, 4, 5, 6, and 7 according to the first to seventh embodiments may be realized by dedicated hardware. FIG. 17 is a diagram showing a second configuration example of the hardware that realizes the hand load estimation devices 1, 2, 3, 4, 5, 6, and 7 according to the first to seventh embodiments. The second configuration example is a configuration example in which the function of the processing circuit 51 shown in FIG. 16 is realized by the processing circuit 56, which is dedicated hardware.

処理回路５６は、例えば、ＡＳＩＣ（Application Specific Integrated Circuit）、ＦＰＧＡ（Field Programmable Gate Array）、またはこれらを組み合わせた回路である。なお、図１７に示す例では、手先負荷推定装置１，２，３，４，５，６，７の要部を単一の処理回路５６で実現するものとしたがこれに限定されない。ハードウェアが複数の処理回路５６を備え、手先負荷推定装置１，２，３，４，５，６，７の要部をそれぞれ異なる処理回路５６で実現しても良い。記憶装置５７は、ＨＤＤ（Hard Disk Drive）あるいはＳＳＤ（Solid State Drive）であって、各種情報を記憶する。学習結果記憶部１４，１４ａ，１４ｂ，１４ｃ，１４ｄ，１４ｅ，１４ｆ，１４ｇは、記憶装置５７により実現される。 The processing circuit 56 is, for example, an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array), or a circuit that combines these. In the example shown in FIG. 17, the main parts of the hand load estimation devices 1, 2, 3, 4, 5, 6, and 7 are realized by a single processing circuit 56, but the configuration is not limited to this. The hardware may include multiple processing circuits 56, and the main parts of the hand load estimation devices 1, 2, 3, 4, 5, 6, and 7 may each be realized by a different processing circuit 56. The storage device 57 is an HDD (Hard Disk Drive) or an SSD (Solid State Drive) and stores various information. The learning result storage units 14, 14a, 14b, 14c, 14d, 14e, 14f, and 14g are realized by the storage device 57.

Some of the main parts of the hand load estimation devices 1, 2, 3, 4, 5, 6, and 7 may be realized by the processor 53 and the memory 54 shown in FIG. 16, and the remainder may be realized by dedicated hardware similar to the processing circuit 56.

The configurations shown in the above embodiments are examples of the contents of the present disclosure. The configuration of each embodiment can be combined with other known technologies, and the configurations of the embodiments may be combined with one another as appropriate. Part of the configuration of each embodiment can be omitted or modified without departing from the gist of the present disclosure.

1, 2, 3, 4, 5, 6, 7: hand load estimation device; 8: robot system; 9: robot control system; 11, 52: input unit; 12, 12b, 12g: inverse dynamics model unit; 13, 13a, 13b, 13c, 13d, 13e, 13f, 13g: learning unit; 14, 14a, 14b, 14c, 14d, 14e, 14f, 14g: learning result storage unit; 15, 15a, 15b, 15c, 15d, 15e, 15f, 15g: inference unit; 16: variable conversion unit; 21: input layer; 22: intermediate layer; 23: output layer; 24, 25, 26: neuron; 27: torque sensor; 30: robot control device; 31: command generation unit; 40: robot; 51, 56: processing circuit; 53: processor; 54: memory; 55: output unit; 57: storage device.

Claims

A hand load estimation device that estimates a hand load parameter, which is a parameter regarding a load applied to a hand of a robot, the device comprising:
an inverse dynamics model unit that receives motion data, which is at least one of data on a position, a velocity, and an acceleration of a joint of the robot, as input to an inverse dynamics model and outputs a nominal torque, which is a torque calculated based on the inverse dynamics model;
a learning unit that learns, based on learning data including the motion data and a difference between an actual torque of the joint and the nominal torque, the hand load parameter from which an influence of a modeling error of the inverse dynamics model and an influence of a characteristic of the robot not included in the inverse dynamics model are removed; and
an inference unit to which inference data including the motion data and the difference between the actual torque and the nominal torque is input, and which infers the hand load parameter based on a learning result of the learning unit.
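
As an illustration of the quantity fed to the learning unit and the inference unit, the following minimal sketch computes a nominal torque from an inverse dynamics model and then the difference between the actual and nominal torque. It assumes a hypothetical single joint rotating in a vertical plane with a point-mass payload; the function names, link constants, and payload values are illustrative assumptions and do not come from this publication.

```python
import numpy as np

G = 9.81  # gravitational acceleration [m/s^2]

def nominal_torque(q, qdd, inertia=0.5, mass=2.0, com=0.3):
    """Inverse dynamics of a hypothetical 1-DOF link without payload.

    inertia [kg*m^2], mass [kg] and com [m] are assumed link constants.
    """
    return inertia * qdd + mass * G * com * np.cos(q)

def torque_difference(actual_torque, q, qdd):
    """Difference between the actual torque and the nominal torque."""
    return actual_torque - nominal_torque(q, qdd)

# Synthetic motion data: joint position and acceleration of a sinusoidal motion.
t = np.linspace(0.0, 2.0, 200)
omega = 2.0 * np.pi
q = 0.8 * np.sin(omega * t)
qdd = -0.8 * omega**2 * np.sin(omega * t)

# Simulated actual torque: nominal torque plus the torque caused by an
# assumed point-mass payload (1.5 kg at 0.6 m from the joint axis).
payload_mass, reach = 1.5, 0.6
payload_torque = payload_mass * (reach**2 * qdd + G * reach * np.cos(q))
actual = nominal_torque(q, qdd) + payload_torque

# The difference isolates what the inverse dynamics model does not explain.
residual = torque_difference(actual, q, qdd)

# One row per sample: motion data plus torque difference, i.e. the kind of
# learning or inference data described in the claim.
features = np.column_stack([q, qdd, residual])
```

In this idealized setting the difference equals the payload torque exactly. On a real robot it would also contain modeling errors and unmodeled characteristics such as friction, which is the part the learning unit is described as removing.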

The hand load estimation device according to claim 1, wherein position data representing a position of the joint is input to the hand load estimation device, and the motion data obtained from the position data is input to the inverse dynamics model unit, the learning unit, and the inference unit.

The hand load estimation device according to claim 2, characterized in that the position data is a command value for controlling the joint or a feedback value of the position of the joint.

The hand load estimation device according to any one of claims 1 to 3, characterized in that the actual torque value for calculating the difference is a value obtained by multiplying the current data of the joint by a torque constant.
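
The conversion from current data to an actual torque value can be sketched as follows. The function name and the numeric values are hypothetical, and any gear ratio is assumed to be already folded into the torque constant.

```python
import numpy as np

def actual_torque_from_current(current, torque_constant):
    """Return joint torque [N*m] as motor current [A] times torque constant [N*m/A]."""
    return np.asarray(current, dtype=float) * torque_constant

currents = [0.0, 1.5, -2.0]  # hypothetical sampled joint currents [A]
KT = 0.8                     # hypothetical torque constant [N*m/A]
actual_torque = actual_torque_from_current(currents, KT)
```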

The hand load estimation device according to any one of claims 1 to 3, characterized in that the actual torque value for calculating the difference is a torque value detected by a torque sensor provided on the robot.

The hand load estimation device according to any one of claims 1 to 5, characterized in that the motion data when a load is applied to the hand, the difference calculated based on the actual torque and the nominal torque when the load is applied to the hand, and a correct answer label of the hand load parameter are input to the learning unit.

The hand load estimation device according to claim 1, wherein the motion data when no load is applied to the hand and the difference calculated based on the actual torque and the nominal torque when no load is applied to the hand are input to the learning unit, and the inference unit infers the hand load parameter based on a difference between the actual torque and an estimated torque, which is the nominal torque when no load is applied to the hand.

The hand load estimation device according to any one of claims 1 to 5, wherein the motion data when a load having a known hand load parameter is applied to the hand and the difference calculated based on the actual torque and the nominal torque when the load having the known hand load parameter is applied to the hand are input to the learning unit, and the inference unit infers the hand load parameter based on a difference between the actual torque and an estimated torque, which is the nominal torque when the load having the known hand load parameter is applied to the hand.

The hand load estimation device according to any one of claims 1 to 8, characterized in that the learning unit receives input of information on friction characteristics related to at least one of the position, velocity, and acceleration of the joint.

The hand load estimation device according to any one of claims 1 to 5, characterized in that the learning unit learns the hand load parameters that do not include errors due to temperature dependency of friction characteristics in the joint by inputting temperature data indicating the temperature of the joint to the learning unit.

The hand load estimation device according to claim 10, characterized in that the learning unit receives input of information on friction characteristics related to at least one of the position, velocity, acceleration, and temperature of the joint.

The hand load estimation device according to claim 1, further comprising a variable conversion unit that converts the motion data into variables for setting up a torque equation linearized with respect to the hand load parameter, wherein the variables are input to the learning unit and the inference unit.
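
The idea of converting motion data into variables of a torque equation that is linear in the load parameters can be sketched for a hypothetical single joint in a vertical plane. For a point-mass payload of mass m at distance r from the joint axis, the payload torque is (m r) g cos(q) + (m r^2) qdd, which is linear in theta = [m r, m r^2]. The model, the names, and the least-squares solve below are illustrative assumptions; the least-squares step only stands in for the learning and inference units to show the linear structure.

```python
import numpy as np

G = 9.81  # gravitational acceleration [m/s^2]

def to_regressors(q, qdd):
    """Convert motion data into variables Y such that tau_payload = Y @ theta,
    with theta = [m*r, m*r**2] for the assumed point-mass payload model."""
    return np.column_stack([G * np.cos(q), qdd])

# Assumed ground-truth payload: mass [kg] and distance from the joint axis [m].
m, r = 1.5, 0.6
theta_true = np.array([m * r, m * r**2])

# Synthetic motion data and noise-free torque difference.
rng = np.random.default_rng(0)
q = rng.uniform(-1.0, 1.0, 100)
qdd = rng.uniform(-5.0, 5.0, 100)
Y = to_regressors(q, qdd)
tau_diff = Y @ theta_true

# Because the equation is linear in theta, ordinary least squares recovers it.
theta_hat, *_ = np.linalg.lstsq(Y, tau_diff, rcond=None)

# Recover the physical parameters from the linear ones.
m_hat = theta_hat[0] ** 2 / theta_hat[1]
r_hat = theta_hat[1] / theta_hat[0]
```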

The hand load estimation device according to claim 1, wherein the learning unit calculates an index indicating a reliability of the hand load parameter, and the inference unit outputs the index together with an inference result of the hand load parameter.

The hand load estimation device according to claim 13, wherein the estimated value of the hand load parameter obtained by the inference unit is a mean of a probability distribution updated by sequential identification from the motion data and the difference, and the index indicating the reliability of the estimated value is a variance of the probability distribution.
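
One way to obtain both an estimate and a reliability index from sequential identification is recursive Bayesian linear regression, where the posterior mean serves as the estimate and the posterior variance as the index. The sketch below assumes a Gaussian prior, Gaussian torque noise, and a torque difference that is linear in the parameters; these choices, along with all names and values, are assumptions for illustration and are not stated in this publication.

```python
import numpy as np

def sequential_update(mean, cov, y_row, tau, noise_var):
    """One Bayesian update for the model tau = y_row @ theta + Gaussian noise.

    mean, cov: current Gaussian belief over theta.
    Returns the updated mean and covariance.
    """
    y_row = np.asarray(y_row, dtype=float)
    s = y_row @ cov @ y_row + noise_var        # predictive variance of tau
    gain = cov @ y_row / s                     # update gain
    mean = mean + gain * (tau - y_row @ mean)  # correct mean by the innovation
    cov = cov - np.outer(gain, y_row @ cov)    # shrink the covariance
    return mean, cov

rng = np.random.default_rng(1)
theta_true = np.array([0.9, 0.54])  # assumed true linear load parameters
noise_var = 0.01                    # assumed torque noise variance

mean = np.zeros(2)                  # prior mean
cov = np.eye(2) * 10.0              # broad prior covariance

for _ in range(500):
    # Hypothetical regressors built from motion data (gravity and inertia terms).
    y_row = np.array([9.81 * np.cos(rng.uniform(-1.0, 1.0)),
                      rng.uniform(-5.0, 5.0)])
    tau = y_row @ theta_true + rng.normal(0.0, np.sqrt(noise_var))
    mean, cov = sequential_update(mean, cov, y_row, tau, noise_var)

estimate = mean             # estimated parameters: mean of the distribution
reliability = np.diag(cov)  # reliability index: variance per parameter
```

A small variance indicates that the collected motion data constrains the parameter well, so a controller could, for example, wait until the variance falls below a chosen threshold before using the estimate.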

A robot control system comprising:
the hand load estimation device according to any one of claims 1 to 14; and
a robot control device that generates a command to be sent to a robot based on the hand load parameter output from the hand load estimation device.

The robot control system according to claim 15, characterized in that the robot control device adjusts the acceleration and deceleration of the robot based on the hand load parameters while the robot is in operation.

The robot control system according to claim 15 or 16, characterized in that the robot control device adjusts parameters for detecting a collision of the robot based on the hand load parameters while the robot is in operation.

The robot control system according to any one of claims 15 to 17, characterized in that the hand load estimation device is provided inside the robot control device.

The robot control system according to any one of claims 15 to 17, characterized in that an inference unit of the hand load estimation device that infers the hand load parameter is provided inside the robot control device.

A robot system comprising:
the robot control system according to any one of claims 15 to 19; and
a robot that operates according to commands from the robot control system.