JP7781997B2 - system

Info

Publication number: JP7781997B2
Application number: JP2024163047A
Authority: JP
Inventors: 青戸誠
Original assignee: SoftBank Group Corp
Current assignee: SoftBank Group Corp
Priority date: 2023-09-22
Filing date: 2024-09-19
Publication date: 2025-12-08
Anticipated expiration: 2044-09-19
Also published as: JP2025051740A

Description

本開示の技術は、システムに関する。 The technology disclosed herein relates to a system.

特許文献１には、少なくとも一つのプロセッサにより遂行される、ペルソナチャットボット制御方法であって、ユーザ発話を受信するステップと、前記ユーザ発話を、チャットボットのキャラクターに関する説明と関連した指示文を含むプロンプトに追加するステップと前記プロンプトをエンコードするステップと、前記エンコードしたプロンプトを言語モデルに入力して、前記ユーザ発話に応答するチャットボット発話を生成するステップ、を含む、方法が開示されている。 Patent document 1 discloses a persona chatbot control method executed by at least one processor, the method including the steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to a description of the chatbot's character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.

特開２０２２－１８０２８２号公報 Japanese Patent Application Laid-Open No. 2022-180282

従来の技術では、オンライン通話において感情的な声やネガティブな発言が業務効率の低下を招くおそれがある。 With conventional technology, emotional voices and negative comments during online calls can lead to reduced work efficiency.

実施形態に係るシステムは、オンライン通話において感情的な声やネガティブな発言を理性的な音声に変換することを目的とする。 The system according to the embodiment aims to convert emotional voices and negative comments into rational voices during online conversations.

実施形態に係るシステムは、取得部と、解析部と、フィルタリング部と、提供部とを備える。取得部は、音声データを取得する。解析部は、取得部によって取得された音声データを解析し、話者の感情を分類する。フィルタリング部は、解析部によって分類された感情に基づいて、負の感情的な音声を理性的な音声に変換する。提供部は、フィルタリング部によって変換された音声をユーザに提供する。 The system according to the embodiment includes an acquisition unit, an analysis unit, a filtering unit, and a provision unit. The acquisition unit acquires voice data. The analysis unit analyzes the voice data acquired by the acquisition unit and classifies the speaker's emotion. The filtering unit converts negative emotional voice into rational voice based on the emotion classified by the analysis unit. The provision unit provides the user with the voice converted by the filtering unit.

実施形態に係るシステムは、オンライン通話において感情的な声やネガティブな発言を理性的な音声に変換することができる。 The system according to the embodiment can convert emotional voices and negative comments into rational voices during online calls.

第１実施形態に係るデータ処理システムの構成の一例を示す概念図である。 FIG. 1 is a conceptual diagram illustrating an example of the configuration of a data processing system according to a first embodiment.
第１実施形態に係るデータ処理装置およびスマートデバイスの要部機能の一例を示す概念図である。 FIG. 2 is a conceptual diagram showing an example of main functions of a data processing device and a smart device according to the first embodiment.
第２実施形態に係るデータ処理システムの構成の一例を示す概念図である。 FIG. 3 is a conceptual diagram illustrating an example of the configuration of a data processing system according to a second embodiment.
第２実施形態に係るデータ処理装置およびスマート眼鏡の要部機能の一例を示す概念図である。 FIG. 4 is a conceptual diagram showing an example of main functions of a data processing device and smart glasses according to the second embodiment.
第３実施形態に係るデータ処理システムの構成の一例を示す概念図である。 FIG. 5 is a conceptual diagram illustrating an example of the configuration of a data processing system according to a third embodiment.
第３実施形態に係るデータ処理装置およびヘッドセット型端末の要部機能の一例を示す概念図である。 FIG. 6 is a conceptual diagram showing an example of main functions of a data processing device and a headset-type terminal according to the third embodiment.
第４実施形態に係るデータ処理システムの構成の一例を示す概念図である。 FIG. 7 is a conceptual diagram showing an example of the configuration of a data processing system according to a fourth embodiment.
第４実施形態に係るデータ処理装置およびロボットの要部機能の一例を示す概念図である。 FIG. 8 is a conceptual diagram showing an example of main functions of a data processing device and a robot according to the fourth embodiment.
複数の感情がマッピングされる感情マップを示す。 FIG. 9 shows an emotion map onto which multiple emotions are mapped.
複数の感情がマッピングされる感情マップを示す。 FIG. 10 shows an emotion map onto which multiple emotions are mapped.

以下、添付図面に従って本開示の技術に係るシステムの実施形態の一例について説明する。 Below, an example of an embodiment of a system relating to the technology disclosed herein will be described with reference to the accompanying drawings.

先ず、以下の説明で使用される文言について説明する。 First, the terminology used in the following description will be explained.

以下の実施形態において、符号付きのプロセッサ（以下、単に「プロセッサ」と称する）は、１つの演算装置であってもよいし、複数の演算装置の組み合わせであってもよい。また、プロセッサは、１種類の演算装置であってもよいし、複数種類の演算装置の組み合わせであってもよい。演算装置の一例としては、ＣＰＵ（Central Processing Unit）、ＧＰＵ（Graphics Processing Unit）、ＧＰＧＰＵ（General-Purpose computing on Graphics Processing Units）、ＡＰＵ（Accelerated Processing Unit）、またはＴＰＵ（Tensor Processing Unit）などが挙げられる。 In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as a "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. Furthermore, a processor may be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose Computing on Graphics Processing Units), an APU (Accelerated Processing Unit), and a TPU (Tensor Processing Unit).

以下の実施形態において、符号付きのＲＡＭ（Random Access Memory）は、一時的に情報が格納されるメモリであり、プロセッサによってワークメモリとして用いられる。 In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is memory in which information is temporarily stored and is used as work memory by the processor.

以下の実施形態において、符号付きのストレージは、各種プログラムおよび各種パラメータなどを記憶する１つまたは複数の不揮発性の記憶装置である。不揮発性の記憶装置の一例としては、フラッシュメモリ（ＳＳＤ（Solid State Drive））、磁気ディスク（例えば、ハードディスク）、または磁気テープなどが挙げられる。 In the following embodiments, a storage denoted by a reference numeral refers to one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.

以下の実施形態において、符号付きの通信Ｉ／Ｆ（Interface）は、通信プロセッサおよびアンテナなどを含むインタフェースである。通信Ｉ／Ｆは、複数のコンピュータ間での通信を司る。通信Ｉ／Ｆに対して適用される通信規格の一例としては、５Ｇ（5th Generation Mobile Communication System）、Ｗｉ－Ｆｉ（登録商標）、またはＢｌｕｅｔｏｏｔｈ（登録商標）などを含む無線通信規格が挙げられる。 In the following embodiments, a communication I/F (Interface) is an interface that includes a communication processor, an antenna, and the like. The communication I/F controls communication between multiple computers. Examples of communication standards that can be applied to the communication I/F include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), or Bluetooth (registered trademark).

以下の実施形態において、「Ａおよび／またはＢ」は、「ＡおよびＢのうちの少なくとも１つ」と同義である。つまり、「Ａおよび／またはＢ」は、Ａだけであってもよいし、Ｂだけであってもよいし、ＡおよびＢの組み合わせであってもよい、という意味である。また、本明細書において、３つ以上の事柄を「および／または」で結び付けて表現する場合も、「Ａおよび／またはＢ」と同様の考え方が適用される。 In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." In other words, "A and/or B" means that it may be A alone, B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and/or B" also applies when three or more things are expressed connected by "and/or."

［第１実施形態］ [First embodiment]
図１には、第１実施形態に係るデータ処理システム１０の構成の一例が示されている。 FIG. 1 shows an example of the configuration of a data processing system 10 according to the first embodiment.

図１に示すように、データ処理システム１０は、データ処理装置１２およびスマートデバイス１４を備えている。データ処理装置１２の一例としては、サーバが挙げられる。 As shown in FIG. 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

データ処理装置１２は、コンピュータ２２、データベース２４、および通信Ｉ／Ｆ２６を備えている。コンピュータ２２は、プロセッサ２８、ＲＡＭ３０、およびストレージ３２を備えている。プロセッサ２８、ＲＡＭ３０、およびストレージ３２は、バス３４に接続されている。また、データベース２４および通信Ｉ／Ｆ２６も、バス３４に接続されている。通信Ｉ／Ｆ２６は、ネットワーク５４に接続されている。ネットワーク５４の一例としては、ＷＡＮ（Wide Area Network）および／またはＬＡＮ（Local Area Network）などが挙げられる。 The data processing device 12 includes a computer 22, a database 24, and a communication I/F 26. The computer 22 includes a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and communication I/F 26 are also connected to the bus 34. The communication I/F 26 is connected to a network 54. Examples of the network 54 include a WAN (Wide Area Network) and/or a LAN (Local Area Network).

スマートデバイス１４は、コンピュータ３６、受付装置３８、出力装置４０、カメラ４２、および通信Ｉ／Ｆ４４を備えている。コンピュータ３６は、プロセッサ４６、ＲＡＭ４８、およびストレージ５０を備えている。プロセッサ４６、ＲＡＭ４８、およびストレージ５０は、バス５２に接続されている。また、受付装置３８、出力装置４０、およびカメラ４２も、バス５２に接続されている。 The smart device 14 includes a computer 36, a reception device 38, an output device 40, a camera 42, and a communication I/F 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

受付装置３８は、タッチパネル３８Ａおよびマイクロフォン３８Ｂなどを備えており、ユーザ入力を受け付ける。タッチパネル３８Ａは、指示体（例えば、ペンまたは指など）の接触を検出することにより、指示体の接触によるユーザ入力を受け付ける。マイクロフォン３８Ｂは、ユーザの音声を検出することにより、音声によるユーザ入力を受け付ける。制御部４６Ａは、タッチパネル３８Ａおよびマイクロフォン３８Ｂによって受け付けたユーザ入力を示すデータをデータ処理装置１２に送信する。データ処理装置１２では、特定処理部２９０（図２参照）が、ユーザ入力を示すデータを取得する。 The reception device 38 includes a touch panel 38A and a microphone 38B, and receives user input. The touch panel 38A detects contact with a pointer (e.g., a pen or a finger) to receive user input via the pointer. The microphone 38B detects the user's voice to receive user input via voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 (see Figure 2) acquires the data indicating the user input.

出力装置４０は、ディスプレイ４０Ａおよびスピーカ４０Ｂなどを備えており、データをユーザが知覚可能な表現形（例えば、音声および／またはテキスト）で出力することでデータをユーザに対して提示する。ディスプレイ４０Ａは、プロセッサ４６からの指示に従ってテキストおよび画像などの可視情報を表示する。スピーカ４０Ｂは、プロセッサ４６からの指示に従って音声を出力する。カメラ４２は、レンズ、絞り、およびシャッタなどの光学系と、ＣＭＯＳ（Complementary Metal-Oxide-Semiconductor）イメージセンサまたはＣＣＤ（Charge Coupled Device）イメージセンサなどの撮像素子とが搭載された小型デジタルカメラである。 The output device 40 includes a display 40A and a speaker 40B, and presents data to the user by outputting the data in a form that the user can perceive (e.g., audio and/or text). The display 40A displays visible information such as text and images in accordance with instructions from the processor 46. The speaker 40B outputs audio in accordance with instructions from the processor 46. The camera 42 is a compact digital camera equipped with an optical system including a lens, aperture, and shutter, and an imaging element such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

通信Ｉ／Ｆ４４は、ネットワーク５４に接続されている。通信Ｉ／Ｆ４４および２６は、ネットワーク５４を介してプロセッサ４６とプロセッサ２８との間の各種情報の授受を司る。 The communication I/F 44 is connected to the network 54. The communication I/Fs 44 and 26 are responsible for the exchange of various information between the processor 46 and the processor 28 via the network 54.

図２には、データ処理装置１２およびスマートデバイス１４の要部機能の一例が示されている。 Figure 2 shows an example of the main functions of the data processing device 12 and smart device 14.

図２に示すように、データ処理装置１２では、プロセッサ２８によって特定処理が行われる。ストレージ３２には、特定処理プログラム５６が格納されている。特定処理プログラム５６は、本開示の技術に係る「プログラム」の一例である。プロセッサ２８は、ストレージ３２から特定処理プログラム５６を読み出し、読み出した特定処理プログラム５６をＲＡＭ３０上で実行する。特定処理は、プロセッサ２８がＲＡＭ３０上で実行する特定処理プログラム５６に従って特定処理部２９０として動作することによって実現される。 As shown in FIG. 2, in the data processing device 12, specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" according to the technology of the present disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

ストレージ３２には、データ生成モデル５８および感情特定モデル５９が格納されている。データ生成モデル５８および感情特定モデル５９は、特定処理部２９０によって用いられる。特定処理部２９０は、感情特定モデル５９を用いてユーザの感情を推定し、ユーザの感情を用いた特定処理を行うことができる。感情特定モデル５９を用いた感情推定機能（感情特定機能）では、ユーザの感情の推定や予測などを含め、ユーザの感情に関する種々の推定や予測などが行われるが、かかる例に限定されない。また、感情の推定や予測には、例えば、感情の分析（解析）なども含まれる。 The storage 32 stores a data generation model 58 and an emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290. The specific processing unit 290 can estimate the user's emotion using the emotion identification model 59 and perform specific processing using the user's emotion. The emotion estimation function (emotion identification function) using the emotion identification model 59 performs various estimations and predictions regarding the user's emotions, including estimation and prediction of the user's emotion, but is not limited to these examples. Furthermore, emotion estimation and prediction also include, for example, emotion analysis.

スマートデバイス１４では、プロセッサ４６によって特定処理が行われる。ストレージ５０には、特定処理プログラム６０が格納されている。特定処理プログラム６０は、データ処理システム１０によって特定処理プログラム５６と併用される。プロセッサ４６は、ストレージ５０から特定処理プログラム６０を読み出し、読み出した特定処理プログラム６０をＲＡＭ４８上で実行する。特定処理は、プロセッサ４６がＲＡＭ４８上で実行する特定処理プログラム６０に従って、制御部４６Ａとして動作することによって実現される。なお、スマートデバイス１４には、データ生成モデル５８および感情特定モデル５９と同様のデータ生成モデルおよび感情特定モデルを有し、これらモデルを用いて特定処理部２９０と同様の処理を行うこともできる。 In the smart device 14, the specific processing is performed by the processor 46. The storage 50 stores the specific processing program 60. The specific processing program 60 is used in conjunction with the specific processing program 56 by the data processing system 10. The processor 46 reads the specific processing program 60 from the storage 50 and executes the read specific processing program 60 on the RAM 48. The specific processing is realized by the processor 46 operating as the control unit 46A in accordance with the specific processing program 60 executed on the RAM 48. The smart device 14 also has a data generation model and emotion identification model similar to the data generation model 58 and emotion identification model 59, and can use these models to perform processing similar to that of the specific processing unit 290.

なお、データ処理装置１２以外の他の装置がデータ生成モデル５８を有してもよい。例えば、サーバ装置（例えば、生成サーバ）がデータ生成モデル５８を有してもよい。この場合、データ処理装置１２は、データ生成モデル５８を有するサーバ装置と通信を行うことで、データ生成モデル５８が用いられた処理結果（予測結果など）を得る。また、データ処理装置１２は、サーバ装置であってもよいし、ユーザが保有する端末装置（例えば、携帯電話、ロボット、家電など）であってもよい。次に、第１実施形態に係るデータ処理システム１０による処理の一例について説明する。 Note that a device other than the data processing device 12 may have the data generation model 58. For example, a server device (e.g., a generation server) may have the data generation model 58. In this case, the data processing device 12 communicates with the server device having the data generation model 58 to obtain processing results (prediction results, etc.) using the data generation model 58. The data processing device 12 may also be a server device, or a terminal device owned by a user (e.g., a mobile phone, robot, home appliance, etc.). Next, an example of processing by the data processing system 10 according to the first embodiment will be described.

（形態例１） (Example 1)
本発明の実施形態に係るオンライン通話フィルタリングシステムは、オンライン通話中の感情的な声やネガティブな発言をフィルタリングするシステムである。このシステムは、オンライン通話の音声データを取得し、感情認識ＡＩが音声データを解析して話者の感情を分類する。例えば、怒り、悲しみ、喜びなどの感情を識別する。この際、感情認識ＡＩは、音声のトーンやピッチ、速度などの特徴を基に感情を判定する。次に、負の感情的な音声を理性的な音声にフィルタリングする。例えば、怒りの感情が含まれる音声を冷静なトーンに変換する。このフィルタリング処理は、感情認識ＡＩが判定した感情に基づいて行われる。さらに、ネガティブな発言に対しては生成ＡＩを用いて、議論の本筋に無関係な発言を省いた音声を提示する。例えば、感情的な発言や無関係な話題を除去し、重要な情報だけを抽出して提示する。この処理により、ユーザはストレスを感じることなく、効率的に業務を進めることができる。この仕組みにより、オンライン通話中に感情的な声やネガティブな発言が耳元で聞こえることがなくなり、業務効率が向上する。また、ストレスフリーな業務環境が整うことで、従業員の満足度も向上する。これにより、オンライン通話フィルタリングシステムは、オンライン通話中の感情的な声やネガティブな発言をフィルタリングし、業務効率を向上させることができる。 An online call filtering system according to an embodiment of the present invention filters emotional voices and negative comments during online calls. This system acquires audio data from online calls, and an emotion recognition AI analyzes the audio data and classifies the speaker's emotions. For example, emotions such as anger, sadness, and joy are identified. The emotion recognition AI determines the emotion based on characteristics such as the tone, pitch, and speed of the voice. Next, negative emotional voices are filtered into rational voices. For example, voices containing anger are converted into a calm tone. This filtering process is performed based on the emotions determined by the emotion recognition AI. Furthermore, for negative comments, a generation AI is used to present audio that omits comments unrelated to the main topic of the discussion. For example, emotional comments and irrelevant topics are removed, and only important information is extracted and presented. This process allows users to work efficiently without feeling stressed. With this mechanism, users no longer hear emotional voices and negative comments directly in their ears during online calls, which improves work efficiency. Furthermore, a stress-free work environment is created, improving employee satisfaction. This allows the online call filtering system to filter out emotional voices and negative comments during online calls, improving work efficiency.

実施形態に係るオンライン通話フィルタリングシステムは、取得部と、解析部と、フィルタリング部と、提供部とを備える。取得部は、オンライン通話の音声データを取得する。取得部は、例えば、マイクロフォンを用いてリアルタイムで音声データを収集する。また、取得部は、録音された音声ファイルを読み込むこともできる。例えば、取得部は、音声ファイル形式、サンプリングレート、ビットレートなどの音声データの形式をサポートする。解析部は、感情認識ＡＩを用いて、取得部によって取得された音声データを解析し、話者の感情を分類する。解析部は、例えば、音声のトーン、ピッチ、速度などの特徴を基に感情を判定する。感情認識ＡＩは、音声データを入力とし、感情の分類結果を出力する。例えば、解析部は、音声データのトーンを解析し、怒り、悲しみ、喜びなどの感情を識別する。フィルタリング部は、解析部によって分類された感情に基づいて、負の感情的な音声を理性的な音声に変換する。フィルタリング部は、例えば、怒りの感情が含まれる音声を冷静なトーンに変換する。フィルタリング部は、感情認識ＡＩの判定結果を入力とし、変換された音声を出力する。例えば、フィルタリング部は、音声のトーンを調整し、冷静なトーンに変換する。提供部は、フィルタリング部によって変換された音声をユーザに提供する。提供部は、例えば、スピーカーを用いて音声を再生する。また、提供部は、変換された音声をテキストデータとして表示することもできる。例えば、提供部は、音声データをテキストに変換し、画面に表示する。これにより、実施形態に係るオンライン通話フィルタリングシステムは、オンライン通話中の感情的な声やネガティブな発言をフィルタリングし、業務効率を向上させることができる。 An online call filtering system according to an embodiment includes an acquisition unit, an analysis unit, a filtering unit, and a providing unit. The acquisition unit acquires audio data from online calls. The acquisition unit collects audio data in real time using, for example, a microphone. The acquisition unit can also read recorded audio files. For example, the acquisition unit supports a range of audio data formats, including different file formats, sampling rates, and bit rates. The analysis unit uses emotion recognition AI to analyze the audio data acquired by the acquisition unit and classify the speaker's emotions. The analysis unit determines emotions based on features such as the tone, pitch, and speed of the audio. The emotion recognition AI receives audio data as input and outputs an emotion classification result. For example, the analysis unit analyzes the tone of the audio data and identifies emotions such as anger, sadness, and joy. The filtering unit converts negative emotional voice into rational voice based on the emotions classified by the analysis unit. For example, the filtering unit converts voice containing anger into a calm tone. The filtering unit receives the determination result of the emotion recognition AI as input and outputs the converted voice. For example, the filtering unit adjusts the tone of the voice and converts it to a calm tone. The providing unit provides the voice converted by the filtering unit to the user. The providing unit plays the voice using, for example, a speaker. The providing unit can also display the converted voice as text data. For example, the providing unit converts voice data into text and displays it on a screen. In this way, the online call filtering system according to the embodiment can filter out emotional voices and negative comments during online calls, improving work efficiency.
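
The four units described above can be sketched as a simple pipeline. This is a minimal illustration only, not the implementation of the embodiment: text strings stand in for audio, and the names `Utterance`, `toy_classifier`, `toy_converter`, and the `NEGATIVE_EMOTIONS` set are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: which emotion labels are treated as "negative".
NEGATIVE_EMOTIONS = {"anger", "sadness"}


@dataclass
class Utterance:
    speaker: str
    text: str               # stand-in for audio data in this sketch
    emotion: str = "neutral"
    converted: bool = False


def acquire(raw: List[Tuple[str, str]]) -> List[Utterance]:
    """Acquisition unit: wrap raw (speaker, content) pairs."""
    return [Utterance(speaker, text) for speaker, text in raw]


def analyze(utts: List[Utterance], classifier: Callable[[Utterance], str]) -> List[Utterance]:
    """Analysis unit: attach an emotion label to each utterance."""
    for u in utts:
        u.emotion = classifier(u)
    return utts


def filter_negative(utts: List[Utterance], converter: Callable[[str], str]) -> List[Utterance]:
    """Filtering unit: convert only utterances classified as negative."""
    for u in utts:
        if u.emotion in NEGATIVE_EMOTIONS:
            u.text = converter(u.text)
            u.converted = True
    return utts


def provide(utts: List[Utterance]) -> List[str]:
    """Providing unit: render the result for the user (here, as text lines)."""
    return [f"{u.speaker}: {u.text}" for u in utts]


# Toy stand-ins for the emotion recognition AI and the voice conversion.
def toy_classifier(u: Utterance) -> str:
    return "anger" if u.text.isupper() or u.text.endswith("!") else "neutral"


def toy_converter(text: str) -> str:
    return text.rstrip("!").capitalize() + "."


raw_call = [("A", "THIS IS LATE AGAIN!"), ("B", "The report is due Friday.")]
output = provide(filter_negative(analyze(acquire(raw_call), toy_classifier), toy_converter))
```

Passing the classifier and converter in as functions keeps the unit boundaries visible, so either could be replaced by a model without changing the pipeline.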

取得部は、オンライン通話の音声データを取得する。取得部は、例えば、マイクロフォンを用いてリアルタイムで音声データを収集する。具体的には、取得部は高感度マイクロフォンを使用し、周囲の雑音を低減するノイズキャンセリング技術を組み込むことで、クリアな音声データを取得する。また、取得部は、録音された音声ファイルを読み込むこともできる。例えば、取得部は、音声ファイル形式、サンプリングレート、ビットレートなどの音声データの形式をサポートする。これにより、取得部は、さまざまな形式の音声データを柔軟に取り扱うことができる。さらに、取得部は、音声データのメタデータも取得し、解析部やフィルタリング部に提供する。メタデータには、録音日時、話者の識別情報、通話のコンテキストなどが含まれる。これにより、取得部は、音声データのコンテキストを理解しやすくし、後続の解析やフィルタリングの精度を向上させることができる。取得部は、クラウドストレージやローカルストレージに音声データを保存し、必要に応じてアクセスできるようにする。これにより、取得部は、リアルタイムの音声データだけでなく、過去の音声データも効率的に管理し、解析やフィルタリングに活用することができる。 The acquisition unit acquires audio data from online calls. The acquisition unit collects audio data in real time using, for example, a microphone. Specifically, the acquisition unit acquires clear audio data by using a high-sensitivity microphone and incorporating noise-canceling technology to reduce ambient noise. The acquisition unit can also read recorded audio files. For example, the acquisition unit supports audio data formats such as audio file formats, sampling rates, and bit rates. This allows the acquisition unit to flexibly handle audio data in various formats. The acquisition unit also acquires metadata for the audio data and provides it to the analysis and filtering units. The metadata includes the recording date and time, speaker identification information, and call context. This makes it easier for the acquisition unit to understand the context of the audio data and improve the accuracy of subsequent analysis and filtering. The acquisition unit stores the audio data in cloud storage or local storage and makes it accessible as needed. This allows the acquisition unit to efficiently manage not only real-time audio data but also past audio data and use it for analysis and filtering.
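
As one possible illustration of reading a recorded file together with its format information and metadata, the following sketch uses Python's standard `wave` module. The function names, the dictionary layout, and the restriction to 16-bit PCM are assumptions made for the example; the patent does not specify a file format or an API.

```python
import io
import math
import struct
import wave


def write_test_tone(buf, rate=16000, seconds=0.5, freq=220.0):
    """Write a mono 16-bit sine tone so the sketch needs no external file."""
    n = int(rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(n)
    )
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(frames)


def acquire_wav(buf, speaker_id, context):
    """Read a 16-bit PCM WAV stream; return samples, format info, and metadata."""
    with wave.open(buf, "rb") as w:
        channels = w.getnchannels()
        width = w.getsampwidth()
        rate = w.getframerate()
        n_frames = w.getnframes()
        raw = w.readframes(n_frames)
    if width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")
    samples = struct.unpack("<%dh" % (n_frames * channels), raw)
    return {
        "samples": samples,
        "sample_rate": rate,
        "bit_rate": rate * width * 8 * channels,  # bits per second
        "duration_s": n_frames / rate,
        # Hypothetical metadata fields passed on to later units.
        "metadata": {"speaker_id": speaker_id, "context": context},
    }


buf = io.BytesIO()
write_test_tone(buf)
buf.seek(0)
clip = acquire_wav(buf, speaker_id="speaker-1", context="weekly meeting")
```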

解析部は、感情認識ＡＩを用いて、取得部によって取得された音声データを解析し、話者の感情を分類する。解析部は、例えば、音声のトーン、ピッチ、速度などの特徴を基に感情を判定する。感情認識ＡＩは、音声データを入力とし、感情の分類結果を出力する。具体的には、感情認識ＡＩは、ディープラーニング技術を用いて音声データの特徴を抽出し、事前に学習したモデルを基に感情を分類する。例えば、解析部は、音声データのトーンを解析し、怒り、悲しみ、喜びなどの感情を識別する。さらに、解析部は、音声データの時間的な変化を考慮し、感情の変動をリアルタイムで追跡することができる。これにより、解析部は、話者の感情の変化を迅速に検出し、適切なフィルタリングを行うための情報を提供する。解析部は、感情認識ＡＩの精度を向上させるために、定期的にモデルの再学習を行い、新しいデータを取り入れる。これにより、解析部は、常に最新の感情認識技術を活用し、高精度な感情分類を実現することができる。 The analysis unit uses emotion recognition AI to analyze the voice data acquired by the acquisition unit and classify the speaker's emotions. The analysis unit determines emotions based on voice characteristics such as tone, pitch, and speed. The emotion recognition AI takes voice data as input and outputs emotion classification results. Specifically, the emotion recognition AI uses deep learning technology to extract features of the voice data and classify emotions based on a pre-trained model. For example, the analysis unit analyzes the tone of the voice data and identifies emotions such as anger, sadness, and joy. Furthermore, the analysis unit takes into account temporal changes in the voice data and can track emotional fluctuations in real time. This allows the analysis unit to quickly detect changes in the speaker's emotions and provide information for appropriate filtering. To improve the accuracy of the emotion recognition AI, the analysis unit regularly retrains the model and incorporates new data. This allows the analysis unit to always utilize the latest emotion recognition technology and achieve highly accurate emotion classification.
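
The real-time tracking of emotional fluctuations could, for instance, be approximated by smoothing per-frame labels with a sliding window. The sketch below assumes that some classifier has already produced one label per audio frame; the majority-vote rule and the window length are illustrative choices, not details from the patent.

```python
from collections import Counter, deque


def track_emotion(frame_labels, window=3):
    """Return (frame_index, emotion) pairs where the smoothed emotion changes.

    The smoothed emotion is the majority label among the last `window` frames,
    which suppresses single-frame misclassifications.
    """
    recent = deque(maxlen=window)
    current = None
    changes = []
    for i, label in enumerate(frame_labels):
        recent.append(label)
        majority, _count = Counter(recent).most_common(1)[0]
        if majority != current:
            current = majority
            changes.append((i, current))
    return changes


labels = ["neutral", "neutral", "anger", "anger", "anger",
          "neutral", "neutral", "neutral"]
changes = track_emotion(labels)
```

A longer window gives more stable output at the cost of detecting a change a few frames later, which is the usual trade-off when the filtering has to react during a live call.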

フィルタリング部は、解析部によって分類された感情に基づいて、負の感情的な音声を理性的な音声に変換する。フィルタリング部は、例えば、怒りの感情が含まれる音声を冷静なトーンに変換する。具体的には、フィルタリング部は、音声のトーン、ピッチ、速度を調整し、感情の強度を和らげる。例えば、怒りの感情が強い場合、フィルタリング部は、音声のピッチを下げ、速度を遅くし、トーンを穏やかにすることで、冷静な音声に変換する。フィルタリング部は、感情認識ＡＩの判定結果を入力とし、変換された音声を出力する。さらに、フィルタリング部は、音声の内容を保持しつつ、感情の表現を調整するための自然言語処理技術を活用する。これにより、フィルタリング部は、話者の意図を損なうことなく、感情的な表現を理性的な表現に変換することができる。フィルタリング部は、ユーザの設定に応じて、フィルタリングの強度や基準を調整することができる。例えば、特定の感情のみをフィルタリングする設定や、感情の強度に応じてフィルタリングの度合いを変える設定などが可能である。これにより、フィルタリング部は、ユーザのニーズに合わせた柔軟なフィルタリングを提供することができる。 The filtering unit converts negative emotional speech into rational speech based on the emotions classified by the analysis unit. For example, the filtering unit converts speech containing anger into a calm tone. Specifically, the filtering unit adjusts the tone, pitch, and speed of the speech to soften the intensity of the emotion. For example, if the anger is strong, the filtering unit converts the speech into calm speech by lowering the pitch, slowing the speed, and softening the tone. The filtering unit takes the determination results of the emotion recognition AI as input and outputs the converted speech. Furthermore, the filtering unit utilizes natural language processing technology to adjust the emotional expression while preserving the content of the speech. This allows the filtering unit to convert emotional expressions into rational expressions without compromising the speaker's intention. The filtering unit can adjust the filtering strength and criteria according to user settings. For example, it is possible to set it to filter only specific emotions or to change the degree of filtering depending on the intensity of the emotion. This allows the filtering unit to provide flexible filtering tailored to the user's needs.
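
One way to picture how the classified emotion, its intensity, and the user settings might combine is as a small mapping to conversion parameters. The sketch below only computes a pitch shift and a tempo factor; the actual signal processing is not shown. `FilterSettings`, the 2-semitone limit, and the 15% slowdown are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FilterSettings:
    # Emotions the user wants filtered (assumed default: anger only).
    target_emotions: set = field(default_factory=lambda: {"anger"})
    # Overall filtering strength: 0.0 disables filtering, 1.0 applies it fully.
    strength: float = 1.0


def _clamp01(x):
    return max(0.0, min(1.0, x))


def conversion_params(emotion, intensity, settings):
    """Return (pitch_shift_semitones, tempo_factor) for one utterance.

    Stronger emotion and higher user strength give a lower pitch and a
    slower tempo; untargeted emotions are passed through unchanged.
    """
    if emotion not in settings.target_emotions:
        return (0.0, 1.0)
    level = _clamp01(intensity) * _clamp01(settings.strength)
    pitch_shift = -2.0 * level   # assumption: at most 2 semitones down
    tempo = 1.0 - 0.15 * level   # assumption: at most 15% slower
    return (pitch_shift, tempo)


full = conversion_params("anger", 1.0, FilterSettings())
half = conversion_params("anger", 1.0, FilterSettings(strength=0.5))
untouched = conversion_params("joy", 0.9, FilterSettings())
```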

提供部は、フィルタリング部によって変換された音声をユーザに提供する。提供部は、例えば、スピーカーを用いて音声を再生する。具体的には、提供部は、高品質なスピーカーを使用し、変換された音声をクリアに再生する。また、提供部は、変換された音声をテキストデータとして表示することもできる。例えば、提供部は、音声データをテキストに変換し、画面に表示する。これにより、ユーザは、音声だけでなく、テキストとしても情報を確認することができる。さらに、提供部は、ユーザインタフェースを通じて、音声の再生やテキスト表示の設定をカスタマイズすることができる。例えば、音声の再生速度や音量を調整したり、テキストのフォントサイズや表示位置を変更したりすることが可能である。これにより、提供部は、ユーザの好みに合わせた柔軟な情報提供を実現することができる。提供部は、変換された音声やテキストデータを保存し、後で再生や参照ができるようにすることもできる。これにより、ユーザは、過去の通話内容を確認し、必要に応じて再利用することができる。提供部は、他のシステムやデバイスと連携し、変換された音声やテキストデータを共有することもできる。例えば、ビジネスチャットツールやメールシステムと連携し、通話内容を自動的に記録し、共有することが可能である。これにより、提供部は、業務効率を向上させるための多様な情報提供手段を提供することができる。 The providing unit provides the user with the audio converted by the filtering unit. The providing unit plays the audio using, for example, a speaker. Specifically, the providing unit uses a high-quality speaker to clearly play the converted audio. The providing unit can also display the converted audio as text data. For example, the providing unit converts audio data into text and displays it on a screen. This allows the user to view information not only as audio but also as text. Furthermore, the providing unit allows audio playback and text display settings to be customized through a user interface. For example, it is possible to adjust the audio playback speed and volume, and change the font size and display position of the text. This allows the providing unit to flexibly provide information tailored to the user's preferences. The providing unit can also save the converted audio and text data so that it can be played back or referenced later. This allows the user to review past call content and reuse it as needed. The providing unit can also link with other systems and devices to share the converted audio and text data. For example, it can link with business chat tools or email systems to automatically record and share call content. This allows the providing unit to offer a variety of means of providing information to improve work efficiency.

解析部は、音声のトーン、ピッチ、速度の特徴を基に感情を判定することができる。解析部は、例えば、音声のトーンを解析し、感情を判定する。例えば、トーンの高さや強さを基に感情を識別する。また、解析部は、音声のピッチを解析し、感情を判定することもできる。例えば、ピッチの周波数を基に感情を識別する。また、解析部は、音声の速度を解析し、感情を判定することもできる。例えば、発話速度を基に感情を識別する。これにより、解析部は、音声の特徴を基に感情を正確に判定することができる。解析部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、解析部は、音声データをＡＩに入力し、感情の判定をＡＩに実行させることができる。 The analysis unit can determine emotions based on the tone, pitch, and speed characteristics of the voice. The analysis unit, for example, analyzes the tone of the voice to determine emotions. For example, it identifies emotions based on the pitch and intensity of the tone. The analysis unit can also analyze the pitch of the voice to determine emotions. For example, it can identify emotions based on the frequency of the pitch. The analysis unit can also analyze the speed of the voice to determine emotions. For example, it can identify emotions based on the speaking rate. This allows the analysis unit to accurately determine emotions based on the voice characteristics. Some or all of the above-mentioned processing in the analysis unit may be performed using AI, for example, or may be performed without using AI. For example, the analysis unit can input voice data into AI and have the AI perform emotion determination.
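
Since the paragraph above allows the determination to be made without AI, a rule-based variant can be sketched as follows. The zero-crossing pitch estimate is deliberately crude (adequate for a clean tone, not for real speech), the speaking rate is supplied by the caller rather than measured, and all thresholds are illustrative assumptions.

```python
import math


def rms(samples):
    """Root-mean-square level, used here as a simple loudness measure."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_pitch(samples, rate):
    """Crude pitch estimate in Hz: rising zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * rate / len(samples)


def classify(pitch_hz, loudness, syllables_per_sec):
    """Map three voice features to an emotion label using fixed thresholds."""
    # All thresholds below are assumptions chosen for the example.
    if pitch_hz > 250 and loudness > 0.5 and syllables_per_sec > 5:
        return "anger"
    if pitch_hz < 150 and loudness < 0.2 and syllables_per_sec < 3:
        return "sadness"
    return "neutral"


RATE = 8000
# One second of a loud 300 Hz tone standing in for a raised voice.
tone = [0.9 * math.sin(2 * math.pi * 300 * i / RATE) for i in range(RATE)]
pitch = zero_crossing_pitch(tone, RATE)
label = classify(pitch, rms(tone), syllables_per_sec=6.0)
```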

フィルタリング部は、怒りの感情が含まれる音声を冷静なトーンに変換することができる。フィルタリング部は、例えば、怒りの感情が含まれる音声を冷静なトーンに変換する。例えば、音声のトーンを調整し、冷静なトーンに変換する。また、フィルタリング部は、音声のピッチを調整し、冷静なトーンに変換することもできる。例えば、ピッチの周波数を安定させる。また、フィルタリング部は、音声の速度を調整し、冷静なトーンに変換することもできる。例えば、発話速度を一定に保つ。これにより、フィルタリング部は、怒りの感情を冷静なトーンに変換することで、感情的な影響を軽減することができる。フィルタリング部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、フィルタリング部は、音声データをＡＩに入力し、トーンの変換をＡＩに実行させることができる。 The filtering unit can convert voice containing angry emotions into a calm tone. The filtering unit, for example, converts voice containing angry emotions into a calm tone. For example, it adjusts the tone of the voice to convert it into a calm tone. The filtering unit can also adjust the pitch of the voice to convert it into a calm tone. For example, it stabilizes the pitch frequency. The filtering unit can also adjust the speed of the voice to convert it into a calm tone. For example, it keeps the speaking rate constant. In this way, the filtering unit can reduce the emotional impact by converting angry emotions into a calm tone. Some or all of the above-mentioned processing in the filtering unit may be performed using AI, or may be performed without using AI. For example, the filtering unit can input voice data into AI and have the AI perform the tone conversion.
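
Stabilizing the pitch frequency can be illustrated on a pitch contour, that is, a sequence of per-frame fundamental frequency values. The sketch below compresses the contour toward its median; it assumes unvoiced frames are marked with 0 and it does not resynthesize audio, which would require a separate vocoder or pitch-shifting step. The `keep` parameter is a hypothetical control.

```python
from statistics import median


def stabilize_pitch(f0_contour, keep=0.3):
    """Compress a pitch contour (Hz per frame) toward its median.

    keep=1.0 leaves the contour unchanged and keep=0.0 flattens it entirely.
    Frames with a value of 0 are treated as unvoiced and left at 0.
    """
    voiced = [f for f in f0_contour if f > 0]
    if not voiced:
        return list(f0_contour)
    center = median(voiced)
    return [center + keep * (f - center) if f > 0 else 0.0 for f in f0_contour]


contour = [200.0, 0.0, 300.0, 400.0]
calmer = stabilize_pitch(contour, keep=0.5)
```

The same idea could be applied to the speaking rate by compressing per-segment durations toward their median.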

生成部を備え、生成部は、否定的な発言を省き、重要な情報だけを抽出して提示することができる。生成部は、生成ＡＩを用いて、否定的な発言を省き、重要な情報だけを抽出して提示する。生成部は、例えば、感情的な発言や無関係な話題を除去し、重要な情報を抽出する。生成ＡＩは、音声データを入力とし、重要な情報を抽出した音声データを出力する。例えば、生成部は、感情的な発言を検出し、それを除去する。また、生成部は、無関係な話題を検出し、それを除去することもできる。例えば、生成部は、議論の本筋に無関係な発言を検出し、それを除去する。これにより、生成部は、ネガティブな発言を省くことで、重要な情報だけを効率的に取得することができる。生成部における上述した処理の一部または全部は、例えば、生成ＡＩを用いて行われてもよく、生成ＡＩを用いずに行われてもよい。例えば、生成部は、音声データを生成ＡＩに入力し、重要な情報の抽出を生成ＡＩに実行させることができる。 The system is equipped with a generation unit that can eliminate negative comments and extract and present only important information. The generation unit uses a generation AI to eliminate negative comments and extract and present only important information. The generation unit, for example, removes emotional comments and irrelevant topics and extracts important information. The generation AI receives audio data as input and outputs audio data from which important information has been extracted. For example, the generation unit detects emotional comments and removes them. The generation unit can also detect irrelevant topics and remove them. For example, the generation unit detects comments unrelated to the main topic of the discussion and removes them. This allows the generation unit to efficiently obtain only important information by omitting negative comments. Some or all of the above-mentioned processing in the generation unit may be performed using, or without, the generation AI. For example, the generation unit may input audio data to the generation AI and have the generation AI extract important information.
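
The paragraph above notes that this processing may also be performed without the generation AI. A minimal non-AI sketch, operating on transcribed segments rather than audio, is shown below. The keyword sets and the function name `extract_key_points` are assumptions for illustration; a keyword match is a much weaker criterion than a generative model would apply.

```python
import re


def extract_key_points(segments, topic_keywords, emotional_markers):
    """Keep segments that mention the topic and contain no emotional markers."""
    kept = []
    for seg in segments:
        words = set(re.findall(r"[a-z']+", seg.lower()))
        if words & emotional_markers:
            continue  # emotional comment: drop
        if not (words & topic_keywords):
            continue  # unrelated to the main topic: drop
        kept.append(seg)
    return kept


segments = [
    "The release is scheduled for Monday.",
    "This is ridiculous, nobody listens!",
    "By the way, lunch was great.",
    "We still need the budget approved.",
]
key_points = extract_key_points(
    segments,
    topic_keywords={"release", "budget", "scheduled"},
    emotional_markers={"ridiculous", "stupid"},
)
```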

提供部は、変換された音声をユーザに提供することができる。提供部は、例えば、変換された音声をスピーカーを用いてユーザに提供する。例えば、提供部は、変換された音声をリアルタイムで再生する。また、提供部は、変換された音声をテキストデータとして表示することもできる。例えば、提供部は、音声データをテキストに変換し、画面に表示する。これにより、提供部は、変換された音声をユーザに提供することで、ストレスフリーな業務環境を実現することができる。提供部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、提供部は、変換された音声データをＡＩに入力し、テキストデータへの変換をＡＩに実行させることができる。 The providing unit can provide the converted voice to the user. The providing unit, for example, provides the converted voice to the user using a speaker. For example, the providing unit plays the converted voice in real time. The providing unit can also display the converted voice as text data. For example, the providing unit converts voice data into text and displays it on a screen. In this way, the providing unit can provide the converted voice to the user, thereby realizing a stress-free work environment. Some or all of the above-mentioned processing in the providing unit may be performed using AI, for example, or may be performed without using AI. For example, the providing unit can input the converted voice data to AI and have the AI convert it into text data.

取得部は、ユーザの過去の通話履歴を分析し、最適な取得方法を選定することができる。取得部は、例えば、ユーザの過去の通話履歴を分析し、最適な取得方法を選定する。例えば、取得部は、ユーザが過去に頻繁に使用した通話アプリを優先的に取得する。また、取得部は、ユーザの過去の通話履歴から、特定の時間帯に取得する音声データを予測し、最適な取得方法を選定することもできる。例えば、取得部は、ユーザの過去の通話履歴を分析し、重要な通話のみを優先的に取得する。これにより、取得部は、過去の通話履歴を分析することで、最適な取得方法を選定することができる。取得部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、取得部は、通話履歴データをＡＩに入力し、最適な取得方法の選定をＡＩに実行させることができる。 The acquisition unit can analyze the user's past call history and select the optimal acquisition method. The acquisition unit, for example, analyzes the user's past call history and selects the optimal acquisition method. For example, the acquisition unit prioritizes acquisition of calling apps that the user has used frequently in the past. The acquisition unit can also predict the audio data to be acquired during a specific time period from the user's past call history and select the optimal acquisition method. For example, the acquisition unit analyzes the user's past call history and prioritizes acquisition of only important calls. In this way, the acquisition unit can select the optimal acquisition method by analyzing the past call history. Some or all of the above-mentioned processing in the acquisition unit may be performed using AI, for example, or may be performed without using AI. For example, the acquisition unit can input call history data into AI and have the AI select the optimal acquisition method.
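
A non-AI version of the call history analysis could be as simple as counting past usage. In the sketch below, the record fields `app` and `important` are hypothetical; the patent does not define the structure of the call history.

```python
from collections import Counter


def preferred_app(call_history):
    """Return the calling app that appears most often in the history."""
    counts = Counter(call["app"] for call in call_history)
    return counts.most_common(1)[0][0]


def prioritize(calls):
    """Order calls so that those flagged as important are acquired first."""
    # sorted() is stable, so the original order is kept within each group.
    return sorted(calls, key=lambda call: not call["important"])


history = [
    {"app": "AppA", "important": False},
    {"app": "AppB", "important": True},
    {"app": "AppA", "important": True},
    {"app": "AppA", "important": False},
]
best_app = preferred_app(history)
ordered = prioritize(history)
```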

取得部は、音声データの取得時に、ユーザの現在の業務状況または関心分野に基づいてフィルタリングを行うことができる。取得部は、例えば、ユーザが会議中の場合、会議に関連する音声データのみを取得する。例えば、取得部は、会議の議題に関連する音声データを優先的に取得する。また、取得部は、ユーザが特定のプロジェクトに集中している場合、そのプロジェクトに関連する音声データを優先的に取得することもできる。例えば、取得部は、プロジェクトの進行状況に基づいて関連する音声データをフィルタリングする。また、取得部は、ユーザの関心分野に基づいて、関連する音声データをフィルタリングして取得することもできる。例えば、取得部は、ユーザの専門分野や興味のあるトピックに関連する音声データを優先的に取得する。これにより、取得部は、業務状況や関心分野に基づいてフィルタリングすることで、関連性の高いデータを取得することができる。取得部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、取得部は、業務状況や関心分野のデータをＡＩに入力し、フィルタリングをＡＩに実行させることができる。 When acquiring voice data, the acquisition unit can filter the voice data based on the user's current work situation or areas of interest. For example, if the user is in a meeting, the acquisition unit acquires only voice data related to the meeting. For example, the acquisition unit prioritizes acquiring voice data related to the meeting agenda. Furthermore, if the user is concentrating on a particular project, the acquisition unit can also prioritize acquiring voice data related to that project. For example, the acquisition unit filters relevant voice data based on the progress of the project. Furthermore, the acquisition unit can filter and acquire relevant voice data based on the user's areas of interest. For example, the acquisition unit prioritizes acquiring voice data related to the user's area of expertise or topics of interest. In this way, the acquisition unit can acquire highly relevant data by filtering based on the work situation or areas of interest. Some or all of the above-mentioned processing in the acquisition unit may be performed using, or without, AI. For example, the acquisition unit can input data on the work situation and areas of interest into AI and have the AI perform the filtering.
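
The agenda-based filtering described above could be sketched, without AI, as a simple keyword-overlap score. This example assumes that a text transcript is already available for each piece of voice data. The function names, the clip tuples, and the 0.5 threshold are illustrative assumptions, not part of the specification.

```python
from typing import Iterable, List, Tuple


def relevance(transcript: str, keywords: Iterable[str]) -> float:
    """Fraction of the keywords that occur as whole words in the transcript."""
    keyword_set = {k.lower() for k in keywords}
    if not keyword_set:
        return 0.0
    # Naive whitespace tokenization; punctuation is not stripped in this sketch.
    words = set(transcript.lower().split())
    return len(keyword_set & words) / len(keyword_set)


def filter_relevant(
    clips: Iterable[Tuple[str, str]],
    keywords: Iterable[str],
    threshold: float = 0.5,
) -> List[str]:
    """Return the IDs of clips whose relevance meets the threshold.

    Each clip is a (clip_id, transcript) pair.
    """
    keywords = list(keywords)
    return [
        clip_id
        for clip_id, transcript in clips
        if relevance(transcript, keywords) >= threshold
    ]
```

For example, with the agenda keywords `["budget", "review"]`, a clip transcribed as "budget review today" is kept, while one transcribed as "lunch plans" is dropped.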

取得部は、音声データの取得時に、ユーザの地理的位置情報を基に関連性の高いデータを優先的に取得することができる。取得部は、例えば、ユーザが特定の地域にいる場合、その地域に関連する音声データを優先的に取得する。例えば、取得部は、地域のイベントやニュースに関連する音声データを取得する。また、取得部は、ユーザが移動中の場合、現在地に基づいて関連性の高い音声データを取得することもできる。例えば、取得部は、移動中のユーザに関連する情報を含む音声データを取得する。また、取得部は、ユーザの地理的位置情報を基に、最適な音声データをフィルタリングして取得することもできる。例えば、取得部は、地理的な関連性に基づいて音声データをフィルタリングする。これにより、取得部は、地理的位置情報を基に関連性の高いデータを取得することで、効率的なデータ収集が可能となる。取得部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、取得部は、地理的位置情報をＡＩに入力し、関連性の高いデータの取得をＡＩに実行させることができる。 When acquiring voice data, the acquisition unit can prioritize acquiring highly relevant data based on the user's geographical location information. For example, if the user is in a specific area, the acquisition unit prioritizes acquiring voice data related to that area. For example, the acquisition unit acquires voice data related to local events and news. Furthermore, if the user is on the move, the acquisition unit can also acquire highly relevant voice data based on the user's current location. For example, the acquisition unit acquires voice data containing information relevant to the user while on the move. Furthermore, the acquisition unit can filter and acquire optimal voice data based on the user's geographical location information. For example, the acquisition unit filters voice data based on geographical relevance. This allows the acquisition unit to acquire highly relevant data based on geographical location information, enabling efficient data collection. Some or all of the above-mentioned processing in the acquisition unit may be performed using, or without, AI. For example, the acquisition unit may input geographical location information to AI and cause the AI to acquire highly relevant data.
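
One way to picture filtering by geographical relevance is a distance cut-off around the user's current position. The sketch below uses the standard haversine great-circle formula. The assumption that each piece of voice data carries a latitude and longitude, the clip tuples, and the radius parameter are all hypothetical and introduced only for illustration.

```python
import math
from typing import Iterable, List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def clips_near(
    user_pos: Tuple[float, float],
    clips: Iterable[Tuple[str, float, float]],
    radius_km: float,
) -> List[str]:
    """Return IDs of clips tagged with a location within radius_km of the user.

    Each clip is a (clip_id, latitude, longitude) tuple.
    """
    user_lat, user_lon = user_pos
    return [
        clip_id
        for clip_id, lat, lon in clips
        if haversine_km(user_lat, user_lon, lat, lon) <= radius_km
    ]
```

For a user who is on the move, the same function could simply be called again with the updated position.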

取得部は、音声データの取得時に、ユーザのSNS活動を分析し、関連するデータを取得することができる。取得部は、例えば、ユーザがソーシャルメディアで頻繁に言及するトピックに関連する音声データを取得する。例えば、取得部は、ユーザのSNS投稿内容を分析し、関連する音声データを取得する。また、取得部は、ユーザのソーシャルメディア活動から、関心のあるトピックを分析し、関連する音声データを取得することもできる。例えば、取得部は、ユーザのSNS活動を基に、関心のあるトピックに関連する音声データを取得する。また、取得部は、ユーザのソーシャルメディアでの発言内容を基に、最適な音声データをフィルタリングして取得することもできる。例えば、取得部は、SNSでのトレンドや話題に基づいて音声データをフィルタリングする。これにより、取得部は、ソーシャルメディア活動を分析することで、関連性の高いデータを取得することができる。取得部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、取得部は、SNSデータをＡＩに入力し、関連するデータの取得をＡＩに実行させることができる。 When acquiring voice data, the acquisition unit can analyze the user's social media activity and acquire related data. The acquisition unit, for example, acquires voice data related to topics frequently mentioned by the user on social media. For example, the acquisition unit analyzes the content of the user's social media posts and acquires related voice data. The acquisition unit can also analyze topics of interest from the user's social media activity and acquire related voice data. For example, the acquisition unit acquires voice data related to topics of interest based on the user's social media activity. The acquisition unit can also filter and acquire optimal voice data based on the content of the user's social media posts. For example, the acquisition unit filters voice data based on trends and topics on social media. In this way, the acquisition unit can acquire highly relevant data by analyzing social media activity. Some or all of the above-mentioned processing by the acquisition unit may be performed using, or without, AI. For example, the acquisition unit can input social media data into AI and have the AI acquire related data.

解析部は、解析時に、音声データの重要度に基づいて解析の詳細度を調整することができる。解析部は、例えば、重要な音声データに対しては、詳細な解析を行う。例えば、解析部は、音声データの内容を詳細に解析し、重要な情報を抽出する。また、解析部は、重要度の低い音声データに対しては、簡略化された解析を行うこともできる。例えば、解析部は、音声データの概要を解析し、簡潔な情報を提供する。また、解析部は、音声データの重要度に応じて、解析の詳細度を動的に調整することもできる。例えば、解析部は、音声データの重要度に基づいて解析の深さや範囲を調整する。これにより、解析部は、音声データの重要度に応じて解析の詳細度を調整することで、効率的な解析が可能となる。解析部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、解析部は、音声データの重要度をＡＩに入力し、解析の詳細度の調整をＡＩに実行させることができる。 During analysis, the analysis unit can adjust the level of detail of the analysis based on the importance of the audio data. For example, the analysis unit performs a detailed analysis of important audio data. For example, the analysis unit analyzes the content of the audio data in detail and extracts important information. The analysis unit can also perform a simplified analysis of audio data with low importance. For example, the analysis unit analyzes the overview of the audio data and provides concise information. The analysis unit can also dynamically adjust the level of detail of the analysis based on the importance of the audio data. For example, the analysis unit adjusts the depth and scope of the analysis based on the importance of the audio data. This allows the analysis unit to adjust the level of detail of the analysis based on the importance of the audio data, enabling efficient analysis. Some or all of the above-mentioned processing in the analysis unit may be performed using AI, for example, or may be performed without AI. For example, the analysis unit can input the importance of the audio data to AI and have the AI adjust the level of detail of the analysis.
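
The adjustment of analysis detail according to importance can be sketched as a mapping from an importance score to a detail level. The specification does not say how importance is represented. This example assumes a score between 0.0 and 1.0, and the thresholds and level names are illustrative assumptions.

```python
from enum import Enum


class DetailLevel(Enum):
    SUMMARY = "summary"      # overview only, for low-importance data
    STANDARD = "standard"    # default depth
    DETAILED = "detailed"    # full analysis, for important data


def choose_detail_level(
    importance: float, low: float = 0.3, high: float = 0.7
) -> DetailLevel:
    """Map an importance score in [0.0, 1.0] to an analysis detail level."""
    if not 0.0 <= importance <= 1.0:
        raise ValueError("importance must be between 0.0 and 1.0")
    if importance >= high:
        return DetailLevel.DETAILED
    if importance < low:
        return DetailLevel.SUMMARY
    return DetailLevel.STANDARD
```

Because the thresholds are parameters, they can be changed at run time, which is one way to read the "dynamic" adjustment mentioned in the paragraph.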

解析部は、解析時に、音声データのカテゴリに応じて異なる解析アルゴリズムを適用することができる。解析部は、例えば、会議音声データに対しては、議事録生成アルゴリズムを適用する。例えば、解析部は、会議の内容を自動的に要約し、議事録を生成する。また、解析部は、カスタマーサポート音声データに対しては、顧客満足度解析アルゴリズムを適用することもできる。例えば、解析部は、顧客の発言内容を解析し、満足度を評価する。また、解析部は、教育音声データに対しては、学習進捗解析アルゴリズムを適用することもできる。例えば、解析部は、教育内容を解析し、学習の進捗状況を評価する。これにより、解析部は、音声データのカテゴリに応じて適切な解析アルゴリズムを適用することで、解析精度が向上する。解析部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、解析部は、音声データのカテゴリをＡＩに入力し、解析アルゴリズムの適用をＡＩに実行させることができる。 During analysis, the analysis unit can apply different analysis algorithms depending on the category of audio data. For example, the analysis unit applies a minutes generation algorithm to conference audio data. For example, the analysis unit automatically summarizes the contents of the conference and generates minutes. The analysis unit can also apply a customer satisfaction analysis algorithm to customer support audio data. For example, the analysis unit analyzes the content of customer comments and evaluates satisfaction. The analysis unit can also apply a learning progress analysis algorithm to educational audio data. For example, the analysis unit analyzes educational content and evaluates learning progress. In this way, the analysis unit can apply an appropriate analysis algorithm depending on the category of audio data, thereby improving analysis accuracy. Some or all of the above-mentioned processing in the analysis unit may be performed using AI, for example, or may be performed without using AI. For example, the analysis unit can input the category of audio data into AI and have the AI apply the analysis algorithm.
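
Applying a different analysis algorithm per category is commonly structured as a dispatch table. In the sketch below, the three analyzers are deliberately trivial placeholders that stand in for the minutes generation, customer satisfaction, and learning progress algorithms named in the text. Their names, the category keys, and the assumption that a transcript is available are hypothetical.

```python
from typing import Callable, Dict


def generate_minutes(transcript: str) -> str:
    # Placeholder: keeps only the first sentence as a stand-in for a summary.
    return "minutes: " + transcript.split(".")[0].strip()


def score_satisfaction(transcript: str) -> str:
    # Placeholder: counts a few positive words instead of a real satisfaction model.
    positive = {"thanks", "great", "resolved"}
    hits = sum(1 for w in transcript.lower().split() if w.strip(".,!?") in positive)
    return f"satisfaction_hits: {hits}"


def assess_progress(transcript: str) -> str:
    # Placeholder: reports the word count instead of a real progress measure.
    return f"words_covered: {len(transcript.split())}"


ANALYZERS: Dict[str, Callable[[str], str]] = {
    "meeting": generate_minutes,
    "customer_support": score_satisfaction,
    "education": assess_progress,
}


def analyze(category: str, transcript: str) -> str:
    """Run the analyzer registered for the given category."""
    try:
        analyzer = ANALYZERS[category]
    except KeyError:
        raise ValueError(f"no analyzer registered for category {category!r}") from None
    return analyzer(transcript)
```

A new category can be supported by adding one entry to `ANALYZERS`, without touching `analyze` itself.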

解析部は、解析時に、音声データの取得時期に基づいて解析の優先順位を決定することができる。解析部は、例えば、最新の音声データを優先的に解析する。例えば、解析部は、最新の音声データを迅速に解析し、リアルタイムで結果を提供する。また、解析部は、重要なイベントの直後に取得された音声データを優先的に解析することもできる。例えば、解析部は、イベントの直後に取得された音声データを優先的に解析し、重要な情報を抽出する。また、解析部は、ユーザのスケジュールに基づいて、解析の優先順位を動的に調整することもできる。例えば、解析部は、ユーザのスケジュールに基づいて、重要なタスクに関連する音声データを優先的に解析する。これにより、解析部は、音声データの取得時期に基づいて解析の優先順位を決定することで、効率的な解析が可能となる。解析部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、解析部は、音声データの取得時期をＡＩに入力し、解析の優先順位の決定をＡＩに実行させることができる。 During analysis, the analysis unit can determine analysis priorities based on the time the audio data was acquired. The analysis unit, for example, prioritizes analysis of the most recent audio data. For example, the analysis unit quickly analyzes the most recent audio data and provides results in real time. The analysis unit can also prioritize analysis of audio data acquired immediately after an important event. For example, the analysis unit prioritizes analysis of audio data acquired immediately after the event and extracts important information. The analysis unit can also dynamically adjust analysis priorities based on the user's schedule. For example, the analysis unit prioritizes analysis of audio data related to important tasks based on the user's schedule. By determining analysis priorities based on the time the audio data was acquired, the analysis unit can perform analysis efficiently. Some or all of the above-mentioned processing in the analysis unit may be performed using AI, for example, or may be performed without using AI. For example, the analysis unit can input the time the audio data was acquired into AI and have the AI determine the analysis priorities.
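
The prioritization by acquisition time described above maps naturally onto a priority queue. The following sketch, which is only one possible reading, uses Python's standard `heapq` module. Clips flagged as acquired right after an important event come first, and within each group the most recently acquired clip comes first. The tuple layout and the event flag are assumptions made for this example.

```python
import heapq
from typing import Iterable, List, Tuple

# A clip is described here as (clip_id, acquired_at, after_important_event),
# where acquired_at is a Unix timestamp. This layout is hypothetical.
Clip = Tuple[str, float, bool]


def analysis_order(clips: Iterable[Clip]) -> List[str]:
    """Return clip IDs in the order in which they would be analyzed."""
    heap: list = []
    for clip_id, acquired_at, after_event in clips:
        # heapq pops the smallest key first: event-flagged clips get 0,
        # and negating the timestamp makes newer clips sort earlier.
        key = (0 if after_event else 1, -acquired_at, clip_id)
        heapq.heappush(heap, (key, clip_id))
    order = []
    while heap:
        _, clip_id = heapq.heappop(heap)
        order.append(clip_id)
    return order


clips = [("a", 100.0, False), ("b", 200.0, False), ("c", 50.0, True)]
order = analysis_order(clips)
```

A schedule-based adjustment, as mentioned in the paragraph, could be added as a further component of the key.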

解析部は、解析時に、音声データの関連性に基づいて解析の順序を調整することができる。解析部は、例えば、関連性の高い音声データを優先的に解析する。例えば、解析部は、関連性の高い音声データを迅速に解析し、重要な情報を抽出する。また、解析部は、関連性の低い音声データを後回しにすることもできる。例えば、解析部は、関連性の低い音声データを後回しにし、重要度の高いデータを優先的に解析する。また、解析部は、音声データの関連性に応じて、解析の順序を動的に調整することもできる。例えば、解析部は、音声データの関連性に基づいて、解析の順序を動的に調整し、効率的な解析を行う。これにより、解析部は、音声データの関連性に基づいて解析の順序を調整することで、効率的な解析が可能となる。解析部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、解析部は、音声データの関連性をＡＩに入力し、解析の順序の調整をＡＩに実行させることができる。 During analysis, the analysis unit can adjust the order of analysis based on the relevance of the audio data. For example, the analysis unit prioritizes analysis of highly relevant audio data. For example, the analysis unit quickly analyzes highly relevant audio data and extracts important information. The analysis unit can also postpone analysis of less relevant audio data. For example, the analysis unit postpones analysis of less relevant audio data and prioritizes analysis of more important data. The analysis unit can also dynamically adjust the order of analysis based on the relevance of the audio data. For example, the analysis unit dynamically adjusts the order of analysis based on the relevance of the audio data and performs efficient analysis. In this way, the analysis unit can adjust the order of analysis based on the relevance of the audio data, enabling efficient analysis. Some or all of the above-mentioned processing in the analysis unit may be performed using AI, for example, or may be performed without using AI. For example, the analysis unit can input the relevance of the audio data to AI and have the AI adjust the order of analysis.

フィルタリング部は、フィルタリング時に、音声データの相互関係に基づいてフィルタリングの精度を向上させることができる。フィルタリング部は、例えば、音声データの前後関係を考慮して、関連する音声を一括でフィルタリングする。例えば、フィルタリング部は、音声データの前後関係を解析し、関連する情報を抽出する。また、フィルタリング部は、音声データの相互関係を分析し、重要な部分を抽出してフィルタリングすることもできる。例えば、フィルタリング部は、音声データの相互関係を基に、重要な情報を抽出する。また、フィルタリング部は、音声データの相互関係に基づいて、フィルタリングの精度を動的に調整することもできる。例えば、フィルタリング部は、音声データの相互関係を基に、フィルタリングの精度を向上させる。これにより、フィルタリング部は、音声データの相互関係を考慮することで、フィルタリングの精度が向上する。フィルタリング部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、フィルタリング部は、音声データの相互関係をＡＩに入力し、フィルタリングの精度の向上をＡＩに実行させることができる。 The filtering unit can improve the accuracy of filtering based on the interrelationships between audio data during filtering. The filtering unit, for example, takes into account the context of the audio data and filters related audio collectively. For example, the filtering unit analyzes the context of the audio data and extracts related information. The filtering unit can also analyze the interrelationships between audio data and extract and filter important parts. For example, the filtering unit extracts important information based on the interrelationships between audio data. The filtering unit can also dynamically adjust the accuracy of filtering based on the interrelationships between audio data. For example, the filtering unit improves the accuracy of filtering based on the interrelationships between audio data. In this way, the filtering unit improves the accuracy of filtering by taking the interrelationships between audio data into consideration. Some or all of the above-mentioned processing in the filtering unit may be performed using AI, for example, or may be performed without using AI. For example, the filtering unit can input the interrelationships between audio data to AI and have the AI improve the accuracy of filtering.
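
The specification does not state how the context of the audio data is determined. As one hedged illustration, consecutive utterances separated by only a short pause could be grouped so that each group is filtered as a unit. The utterance tuples, the function name, and the two-second gap are assumptions made for this sketch.

```python
from typing import List, Sequence, Tuple

# An utterance is (start_sec, end_sec, text); this layout is hypothetical.
Utterance = Tuple[float, float, str]


def group_by_gap(
    utterances: Sequence[Utterance], max_gap_sec: float = 2.0
) -> List[List[Utterance]]:
    """Group utterances (sorted by start time) whose pauses are short.

    An utterance joins the current group when it starts no more than
    max_gap_sec after the previous utterance in that group ended.
    """
    groups: List[List[Utterance]] = []
    for utterance in utterances:
        if groups and utterance[0] - groups[-1][-1][1] <= max_gap_sec:
            groups[-1].append(utterance)
        else:
            groups.append([utterance])
    return groups
```

Each returned group can then be passed to the conversion step as a whole, so that a sentence split across several utterances is treated consistently.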

フィルタリング部は、フィルタリング時に、音声データの提出者の属性情報を考慮してフィルタリングを行うことができる。フィルタリング部は、例えば、提出者の役職に基づいて、重要な音声データを優先的にフィルタリングする。例えば、フィルタリング部は、提出者の役職に基づいて、重要な情報を含む音声データを優先的にフィルタリングする。また、フィルタリング部は、提出者の専門分野に基づいて、関連する音声データをフィルタリングすることもできる。例えば、フィルタリング部は、提出者の専門分野に関連する情報を含む音声データをフィルタリングする。また、フィルタリング部は、提出者の属性情報を基に、フィルタリングの基準を動的に調整することもできる。例えば、フィルタリング部は、提出者の属性情報を基に、フィルタリングの基準を調整する。これにより、フィルタリング部は、提出者の属性情報を考慮することで、適切なフィルタリングが可能となる。フィルタリング部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、フィルタリング部は、提出者の属性情報をＡＩに入力し、フィルタリングの実行をＡＩに行わせることができる。 When filtering, the filtering unit can take into account the attribute information of the person who submitted the audio data. The filtering unit, for example, prioritizes filtering of important audio data based on the submitter's job title. For example, the filtering unit prioritizes filtering of audio data containing important information based on the submitter's job title. The filtering unit can also filter related audio data based on the submitter's field of expertise. For example, the filtering unit filters audio data containing information related to the submitter's field of expertise. The filtering unit can also dynamically adjust the filtering criteria based on the submitter's attribute information. For example, the filtering unit adjusts the filtering criteria based on the submitter's attribute information. This enables the filtering unit to perform appropriate filtering by taking the submitter's attribute information into consideration. Some or all of the above-mentioned processing in the filtering unit may be performed using AI, for example, or may be performed without using AI. For example, the filtering unit can input the submitter's attribute information into AI and have the AI perform the filtering.

フィルタリング部は、フィルタリング時に、音声データの地理的分布を考慮してフィルタリングを行うことができる。フィルタリング部は、例えば、特定の地域に関連する音声データを優先的にフィルタリングする。例えば、フィルタリング部は、地域のイベントやニュースに関連する音声データをフィルタリングする。また、フィルタリング部は、地理的分布に基づいて、関連性の高い音声データをフィルタリングすることもできる。例えば、フィルタリング部は、地理的な関連性に基づいて音声データをフィルタリングする。また、フィルタリング部は、音声データの地理的分布を考慮して、フィルタリングの精度を向上させることもできる。例えば、フィルタリング部は、地理的分布に基づいてフィルタリングの精度を向上させる。これにより、フィルタリング部は、地理的分布を考慮することで、関連性の高い音声データを効率的にフィルタリングすることができる。フィルタリング部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、フィルタリング部は、地理的分布データをＡＩに入力し、フィルタリングの実行をＡＩに行わせることができる。 The filtering unit can perform filtering while taking into account the geographical distribution of the audio data. For example, the filtering unit prioritizes filtering of audio data related to a specific region. For example, the filtering unit filters audio data related to local events or news. The filtering unit can also filter highly relevant audio data based on the geographical distribution. For example, the filtering unit filters audio data based on geographical relevance. The filtering unit can also improve the accuracy of filtering by taking the geographical distribution of the audio data into account. For example, the filtering unit improves the accuracy of filtering based on the geographical distribution. In this way, the filtering unit can efficiently filter highly relevant audio data by taking the geographical distribution into account. Some or all of the above-mentioned processing in the filtering unit may be performed using AI, for example, or may be performed without using AI. For example, the filtering unit can input the geographical distribution data to AI and have the AI perform the filtering.

フィルタリング部は、フィルタリング時に、音声データの関連文献を参照してフィルタリングの精度を向上させることができる。フィルタリング部は、例えば、関連文献を基に、重要な音声データを優先的にフィルタリングする。例えば、フィルタリング部は、関連文献を参照して、音声データのフィルタリング基準を調整する。また、フィルタリング部は、音声データの関連文献を分析し、フィルタリングの精度を動的に向上させることもできる。例えば、フィルタリング部は、関連文献を基にフィルタリングの精度を向上させる。これにより、フィルタリング部は、関連文献を参照することで、フィルタリングの精度が向上する。フィルタリング部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、フィルタリング部は、関連文献データをＡＩに入力し、フィルタリングの精度の向上をＡＩに実行させることができる。 When filtering, the filtering unit can improve the accuracy of filtering by referring to literature related to the audio data. For example, the filtering unit prioritizes filtering of important audio data based on related literature. For example, the filtering unit adjusts the filtering criteria for the audio data by referring to related literature. The filtering unit can also analyze literature related to the audio data and dynamically improve the accuracy of filtering. For example, the filtering unit improves the accuracy of filtering based on related literature. In this way, the filtering unit improves the accuracy of filtering by referring to related literature. Some or all of the above-mentioned processing in the filtering unit may be performed using, or without, AI, for example. For example, the filtering unit can input related literature data into AI and have the AI improve the accuracy of filtering.

提供部は、提供時に、ユーザの過去の操作履歴を参照して最適な表示方法を選定することができる。提供部は、例えば、ユーザが過去に使用した表示方法を優先的に提供する。例えば、提供部は、ユーザの過去の操作履歴を基に、最適な表示方法を選定する。また、提供部は、ユーザの過去の操作履歴から、最適な表示方法を予測し、提供することもできる。例えば、提供部は、過去の操作パターンを分析し、最適な表示方法を提供する。また、提供部は、ユーザの過去の操作履歴を分析し、視認性の高い表示方法を提供することもできる。例えば、提供部は、過去の操作履歴を基に、視認性の高い表示方法を選定する。これにより、提供部は、過去の操作履歴を参照することで、最適な表示方法を提供することができる。提供部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、提供部は、操作履歴データをＡＩに入力し、最適な表示方法の選定をＡＩに実行させることができる。 When providing the converted voice, the providing unit can select the optimal display method by referring to the user's past operation history. For example, the providing unit preferentially offers display methods that the user has used in the past. For example, the providing unit selects the optimal display method based on the user's past operation history. The providing unit can also predict the optimal display method from the user's past operation history and provide it. For example, the providing unit analyzes past operation patterns and provides the optimal display method. The providing unit can also analyze the user's past operation history and provide a display method with high visibility. For example, the providing unit selects a display method with high visibility based on the past operation history. In this way, the providing unit can provide the optimal display method by referring to the past operation history. Some or all of the above-mentioned processing in the providing unit may be performed using AI, for example, or may be performed without using AI. For example, the providing unit can input operation history data to AI and have the AI select the optimal display method.

提供部は、提供時に、ユーザのデバイス情報を考慮して最適な表示方法を選定することができる。提供部は、例えば、ユーザがスマートフォンを使用している場合、画面サイズに合わせた表示方法を提供する。例えば、提供部は、スマートフォンの画面サイズに最適化された表示方法を提供する。また、提供部は、ユーザがタブレットを使用している場合、大きな画面に最適化された表示方法を提供することもできる。例えば、提供部は、タブレットの画面サイズに最適化された表示方法を提供する。また、提供部は、ユーザがスマートウォッチを使用している場合、簡潔で視認性の高い表示方法を提供することもできる。例えば、提供部は、スマートウォッチの画面サイズに最適化された表示方法を提供する。これにより、提供部は、デバイス情報を考慮することで、最適な表示方法を提供することができる。提供部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、提供部は、デバイス情報をＡＩに入力し、最適な表示方法の選定をＡＩに実行させることができる。 When providing the converted voice, the providing unit can select the optimal display method by taking into account the user's device information. For example, if the user is using a smartphone, the providing unit provides a display method that matches the screen size. For example, the providing unit provides a display method optimized for the smartphone screen size. If the user is using a tablet, the providing unit can also provide a display method optimized for a large screen. For example, the providing unit provides a display method optimized for the tablet screen size. If the user is using a smartwatch, the providing unit can also provide a display method that is concise and highly visible. For example, the providing unit provides a display method optimized for the smartwatch screen size. In this way, the providing unit can provide the optimal display method by taking the device information into account. Some or all of the above-mentioned processing in the providing unit may be performed using AI, for example, or may be performed without using AI. For example, the providing unit can input device information into AI and have the AI select the optimal display method.
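
The device-dependent selection of a display method could look like the following sketch, which picks a preset by device type and shortens the displayed text to fit. The preset names, the character limits, and the fallback to the smartphone preset are illustrative assumptions and are not values taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisplayConfig:
    layout: str      # hypothetical layout name
    max_chars: int   # hypothetical limit on the characters shown at once


PRESETS = {
    "smartwatch": DisplayConfig(layout="single_line", max_chars=40),
    "smartphone": DisplayConfig(layout="compact", max_chars=200),
    "tablet": DisplayConfig(layout="full", max_chars=800),
}


def select_display(device_type: str) -> DisplayConfig:
    """Pick a preset for the device, falling back to the smartphone preset."""
    return PRESETS.get(device_type, PRESETS["smartphone"])


def render(text: str, config: DisplayConfig) -> str:
    """Shorten text to the preset's limit, marking the cut with '...'."""
    if len(text) <= config.max_chars:
        return text
    return text[: config.max_chars - 3] + "..."
```

The same transcript can thus be shown in full on a tablet and as a single shortened line on a smartwatch.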

生成部は、生成時に、音声データの相互関係を考慮して生成の精度を向上させることができる。生成部は、例えば、音声データの前後関係を考慮して、関連する音声を一括で生成する。例えば、生成部は、音声データの前後関係を解析し、関連する情報を抽出して生成する。また、生成部は、音声データの相互関係を分析し、重要な部分を抽出して生成することもできる。例えば、生成部は、音声データの相互関係を基に、重要な情報を抽出して生成する。また、生成部は、音声データの相互関係に基づいて、生成の精度を動的に調整することもできる。例えば、生成部は、音声データの相互関係を基に、生成の精度を向上させる。これにより、生成部は、音声データの相互関係を考慮することで、生成の精度が向上する。生成部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、生成部は、音声データの相互関係をＡＩに入力し、生成の精度の向上をＡＩに実行させることができる。 The generation unit can improve the accuracy of generation by taking into account the interrelationships between audio data during generation. For example, the generation unit generates related audio all at once, taking into account the context of the audio data. For example, the generation unit analyzes the context of the audio data and extracts and generates related information. The generation unit can also analyze the interrelationships between audio data and extract and generate important parts. For example, the generation unit extracts and generates important information based on the interrelationships between audio data. The generation unit can also dynamically adjust the accuracy of generation based on the interrelationships between audio data. For example, the generation unit improves the accuracy of generation based on the interrelationships between audio data. In this way, the generation unit improves the accuracy of generation by taking the interrelationships between audio data into account. Some or all of the above-mentioned processing in the generation unit may be performed using, or without, AI, for example. For example, the generation unit can input the interrelationships between audio data into AI and have the AI improve the accuracy of generation.

生成部は、生成時に、音声データの提出者の属性情報を考慮して生成を行うことができる。生成部は、例えば、提出者の役職に基づいて、重要な音声データを優先的に生成する。例えば、生成部は、提出者の役職に基づいて、重要な情報を含む音声データを優先的に生成する。また、生成部は、提出者の専門分野に基づいて、関連する音声データを生成することもできる。例えば、生成部は、提出者の専門分野に関連する情報を含む音声データを生成する。また、生成部は、提出者の属性情報を基に、生成の基準を動的に調整することもできる。例えば、生成部は、提出者の属性情報を基に、生成の基準を調整する。これにより、生成部は、提出者の属性情報を考慮することで、適切な音声生成が可能となる。生成部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、生成部は、提出者の属性情報をＡＩに入力し、生成の実行をＡＩに行わせることができる。 When generating audio data, the generation unit can take into consideration the attribute information of the person who submitted the audio data. For example, the generation unit prioritizes the generation of important audio data based on the submitter's job title. For example, the generation unit prioritizes the generation of audio data containing important information based on the submitter's job title. The generation unit can also generate related audio data based on the submitter's field of expertise. For example, the generation unit generates audio data containing information related to the submitter's field of expertise. The generation unit can also dynamically adjust the generation criteria based on the submitter's attribute information. For example, the generation unit adjusts the generation criteria based on the submitter's attribute information. This enables the generation unit to generate appropriate audio by taking the submitter's attribute information into consideration. Some or all of the above-mentioned processing in the generation unit may be performed using AI, for example, or may be performed without using AI. For example, the generation unit can input the submitter's attribute information into AI and have the AI perform the generation.

生成部は、生成時に、音声データの地理的分布を考慮して生成を行うことができる。生成部は、例えば、特定の地域に関連する音声データを優先的に生成する。例えば、生成部は、地域のイベントやニュースに関連する音声データを生成する。また、生成部は、地理的分布に基づいて、関連性の高い音声データを生成することもできる。例えば、生成部は、地理的な関連性に基づいて音声データを生成する。また、生成部は、音声データの地理的分布を考慮して、生成の精度を向上させることもできる。例えば、生成部は、地理的分布に基づいて生成の精度を向上させる。これにより、生成部は、地理的分布を考慮することで、関連性の高い音声データを効率的に生成することができる。生成部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、生成部は、地理的分布データをＡＩに入力し、生成の実行をＡＩに行わせることができる。 The generation unit can generate the audio data while taking into account the geographical distribution of the audio data. For example, the generation unit prioritizes the generation of audio data related to a specific region. For example, the generation unit generates audio data related to local events or news. The generation unit can also generate highly relevant audio data based on the geographical distribution. For example, the generation unit generates audio data based on geographical relevance. The generation unit can also improve the accuracy of generation by taking the geographical distribution of the audio data into account. For example, the generation unit improves the accuracy of generation based on the geographical distribution. In this way, the generation unit can efficiently generate highly relevant audio data by taking the geographical distribution into account. Some or all of the above-mentioned processing in the generation unit may be performed using AI, for example, or may be performed without using AI. For example, the generation unit can input geographical distribution data to AI and have the AI perform the generation.

生成部は、生成時に、音声データの関連文献を参照して生成の精度を向上させることができる。生成部は、例えば、関連文献を基に、重要な音声データを優先的に生成する。例えば、生成部は、関連文献を参照して、音声データの生成基準を調整する。また、生成部は、音声データの関連文献を分析し、生成の精度を動的に向上させることもできる。例えば、生成部は、関連文献を基に生成の精度を向上させる。これにより、生成部は、関連文献を参照することで、生成の精度が向上する。生成部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、生成部は、関連文献データをＡＩに入力し、生成の精度の向上をＡＩに実行させることができる。 During generation, the generation unit can improve the accuracy of generation by referring to literature related to the audio data. For example, the generation unit prioritizes the generation of important audio data based on related literature. For example, the generation unit adjusts the generation criteria for the audio data by referring to related literature. The generation unit can also analyze literature related to the audio data and dynamically improve the accuracy of generation. For example, the generation unit improves the accuracy of generation based on related literature. In this way, the generation unit improves the accuracy of generation by referring to related literature. Some or all of the above-mentioned processing in the generation unit may be performed using, or without, AI, for example. For example, the generation unit can input related literature data into AI and have the AI improve the accuracy of generation.

実施形態に係るシステムは、上述した例に限定されず、例えば、以下のように、種々の変更が可能である。 The system according to the embodiment is not limited to the example described above, and various modifications are possible, for example, as follows:

取得部は、ユーザの過去の通話履歴を分析し、最適な取得方法を選定することができる。例えば、取得部は、ユーザが過去に頻繁に使用した通話アプリを優先的に取得する。また、取得部は、ユーザの過去の通話履歴から、特定の時間帯に取得する音声データを予測し、最適な取得方法を選定することもできる。例えば、取得部は、ユーザの過去の通話履歴を分析し、重要な通話のみを優先的に取得する。これにより、取得部は、過去の通話履歴を分析することで、最適な取得方法を選定することができる。取得部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、取得部は、通話履歴データをＡＩに入力し、最適な取得方法の選定をＡＩに実行させることができる。 The acquisition unit can analyze the user's past call history and select the optimal acquisition method. For example, the acquisition unit preferentially acquires voice data from the calling apps that the user has used frequently in the past. The acquisition unit can also predict, from the user's past call history, the voice data to be acquired during a specific time period and select the optimal acquisition method accordingly. For example, the acquisition unit analyzes the user's past call history and preferentially acquires only important calls. By analyzing the past call history in this way, the acquisition unit can select the optimal acquisition method. Some or all of the above-mentioned processing in the acquisition unit may be performed using AI, for example, or may be performed without using AI. For example, the acquisition unit can input call history data into AI and have the AI select the optimal acquisition method.

解析部は、音声データの取得時期に基づいて解析の優先順位を決定することができる。例えば、解析部は、最新の音声データを優先的に解析する。例えば、解析部は、最新の音声データを迅速に解析し、リアルタイムで結果を提供する。また、解析部は、重要なイベントの直後に取得された音声データを優先的に解析することもできる。例えば、解析部は、イベントの直後に取得された音声データを優先的に解析し、重要な情報を抽出する。また、解析部は、ユーザのスケジュールに基づいて、解析の優先順位を動的に調整することもできる。例えば、解析部は、ユーザのスケジュールに基づいて、重要なタスクに関連する音声データを優先的に解析する。これにより、解析部は、音声データの取得時期に基づいて解析の優先順位を決定することで、効率的な解析が可能となる。解析部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、解析部は、音声データの取得時期をＡＩに入力し、解析の優先順位の決定をＡＩに実行させることができる。 The analysis unit can determine analysis priorities based on the time when the audio data was acquired. For example, the analysis unit prioritizes analysis of the most recent audio data. For example, the analysis unit quickly analyzes the most recent audio data and provides results in real time. The analysis unit can also prioritize analysis of audio data acquired immediately after an important event. For example, the analysis unit prioritizes analysis of audio data acquired immediately after the event and extracts important information. The analysis unit can also dynamically adjust analysis priorities based on the user's schedule. For example, the analysis unit prioritizes analysis of audio data related to important tasks based on the user's schedule. By determining analysis priorities based on the time when the audio data was acquired, the analysis unit can perform analysis efficiently. Some or all of the above-mentioned processing in the analysis unit may be performed using AI, for example, or may be performed without using AI. For example, the analysis unit can input the time when the audio data was acquired into AI and have the AI determine the analysis priorities.

フィルタリング部は、音声データの相互関係に基づいてフィルタリングの精度を向上させることができる。例えば、フィルタリング部は、音声データの前後関係を考慮して、関連する音声を一括でフィルタリングする。例えば、フィルタリング部は、音声データの前後関係を解析し、関連する情報を抽出する。また、フィルタリング部は、音声データの相互関係を分析し、重要な部分を抽出してフィルタリングすることもできる。例えば、フィルタリング部は、音声データの相互関係を基に、重要な情報を抽出する。また、フィルタリング部は、音声データの相互関係に基づいて、フィルタリングの精度を動的に調整することもできる。例えば、フィルタリング部は、音声データの相互関係を基に、フィルタリングの精度を向上させる。これにより、フィルタリング部は、音声データの相互関係を考慮することで、フィルタリングの精度が向上する。フィルタリング部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、フィルタリング部は、音声データの相互関係をＡＩに入力し、フィルタリングの精度の向上をＡＩに実行させることができる。 The filtering unit can improve the accuracy of filtering based on the interrelationships between audio data. For example, the filtering unit filters related audio collectively, taking into account the context of the audio data. For example, the filtering unit analyzes the context of the audio data and extracts related information. The filtering unit can also analyze the interrelationships between audio data, extract important parts, and filter them. For example, the filtering unit extracts important information based on the interrelationships between audio data. The filtering unit can also dynamically adjust the accuracy of filtering based on the interrelationships between audio data. For example, the filtering unit improves the accuracy of filtering based on the interrelationships between audio data. In this way, the filtering unit improves the accuracy of filtering by taking the interrelationships between audio data into consideration. Some or all of the above-mentioned processing in the filtering unit may be performed using AI, for example, or may be performed without using AI. For example, the filtering unit can input the interrelationships between audio data to AI and have the AI improve the accuracy of filtering.

提供部は、ユーザの過去の操作履歴を参照して最適な表示方法を選定することができる。例えば、提供部は、ユーザが過去に使用した表示方法を優先的に提供する。例えば、提供部は、ユーザの過去の操作履歴を基に、最適な表示方法を選定する。また、提供部は、ユーザの過去の操作履歴から、最適な表示方法を予測し、提供することもできる。例えば、提供部は、過去の操作パターンを分析し、最適な表示方法を提供する。また、提供部は、ユーザの過去の操作履歴を分析し、視認性の高い表示方法を提供することもできる。例えば、提供部は、過去の操作履歴を基に、視認性の高い表示方法を選定する。これにより、提供部は、過去の操作履歴を参照することで、最適な表示方法を提供することができる。提供部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、提供部は、操作履歴データをＡＩに入力し、最適な表示方法の選定をＡＩに実行させることができる。 The providing unit can select the optimal display method by referring to the user's past operation history. For example, the providing unit prioritizes providing display methods that the user has used in the past. For example, the providing unit selects the optimal display method based on the user's past operation history. The providing unit can also predict and provide the optimal display method from the user's past operation history. For example, the providing unit analyzes past operation patterns and provides the optimal display method. The providing unit can also analyze the user's past operation history and provide a display method with high visibility. For example, the providing unit selects a display method with high visibility based on the past operation history. In this way, the providing unit can provide the optimal display method by referring to the past operation history. Some or all of the above-mentioned processing in the providing unit may be performed using AI, for example, or may be performed without using AI. For example, the providing unit can input operation history data to AI and have the AI select the optimal display method.
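
As a minimal non-AI illustration of selecting a display method from the operation history, the sketch below simply picks the method the user has chosen most often. The function name `select_display_method`, the method labels, and the default value are assumptions made for this example.

```python
from collections import Counter
from typing import List


def select_display_method(history: List[str], default: str = "text") -> str:
    """Return the most frequently used display method, or a default."""
    if not history:
        return default
    # most_common(1) yields the (method, count) pair with the highest count.
    return Counter(history).most_common(1)[0][0]
```

A prediction-based variant could weight recent operations more heavily, or the history could be passed to an AI model as described above.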

生成部は、音声データの相互関係を考慮して生成の精度を向上させることができる。例えば、生成部は、音声データの前後関係を考慮して、関連する音声を一括で生成する。例えば、生成部は、音声データの前後関係を解析し、関連する情報を抽出して生成する。また、生成部は、音声データの相互関係を分析し、重要な部分を抽出して生成することもできる。例えば、生成部は、音声データの相互関係を基に、重要な情報を抽出して生成する。また、生成部は、音声データの相互関係に基づいて、生成の精度を動的に調整することもできる。例えば、生成部は、音声データの相互関係を基に、生成の精度を向上させる。これにより、生成部は、音声データの相互関係を考慮することで、生成の精度が向上する。生成部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、生成部は、音声データの相互関係をＡＩに入力し、生成の精度の向上をＡＩに実行させることができる。 The generation unit can improve the accuracy of generation by taking into account the interrelationships between audio data. For example, the generation unit generates related audio collectively, taking into account the context of the audio data. For example, the generation unit analyzes the context of the audio data, extracts related information, and generates audio from it. The generation unit can also analyze the interrelationships between audio data, extract important parts, and generate audio from them. For example, the generation unit extracts important information based on the interrelationships between audio data and generates audio from it. The generation unit can also dynamically adjust the accuracy of generation based on the interrelationships between audio data. For example, the generation unit improves the accuracy of generation based on the interrelationships between audio data. In this way, the generation unit improves the accuracy of generation by taking the interrelationships between audio data into account. Some or all of the above-mentioned processing in the generation unit may be performed using AI, for example, or may be performed without using AI. For example, the generation unit can input the interrelationships between audio data into AI and have the AI improve the accuracy of generation.

以下に、形態例１の処理の流れについて簡単に説明する。 The processing flow for Example 1 is briefly explained below.

ステップ１：取得部は、オンライン通話の音声データを取得する。取得部は、例えば、マイクロフォンを用いてリアルタイムで音声データを収集する。また、取得部は、録音された音声ファイルを読み込むこともできる。例えば、取得部は、音声ファイル形式、サンプリングレート、ビットレートなどの音声データの形式をサポートする。
ステップ２：解析部は、感情認識ＡＩを用いて、取得部によって取得された音声データを解析し、話者の感情を分類する。解析部は、例えば、音声のトーン、ピッチ、速度などの特徴を基に感情を判定する。感情認識ＡＩは、音声データを入力とし、感情の分類結果を出力する。例えば、解析部は、音声データのトーンを解析し、怒り、悲しみ、喜びなどの感情を識別する。
ステップ３：フィルタリング部は、解析部によって分類された感情に基づいて、負の感情的な音声を理性的な音声に変換する。フィルタリング部は、例えば、怒りの感情が含まれる音声を冷静なトーンに変換する。フィルタリング部は、感情認識ＡＩの判定結果を入力とし、変換された音声を出力する。例えば、フィルタリング部は、音声のトーンを調整し、冷静なトーンに変換する。
ステップ４：提供部は、フィルタリング部によって変換された音声をユーザに提供する。提供部は、例えば、スピーカーを用いて音声を再生する。また、提供部は、変換された音声をテキストデータとして表示することもできる。例えば、提供部は、音声データをテキストに変換し、画面に表示する。
Step 1: The acquisition unit acquires audio data of an online call. For example, the acquisition unit collects audio data in real time using a microphone. The acquisition unit can also read recorded audio files. For example, the acquisition unit supports audio data formats such as audio file formats, sampling rates, and bit rates.
Step 2: The analysis unit uses emotion recognition AI to analyze the voice data acquired by the acquisition unit and classify the speaker's emotion. The analysis unit determines the emotion based on features such as voice tone, pitch, and speed. The emotion recognition AI takes the voice data as input and outputs an emotion classification result. For example, the analysis unit analyzes the tone of the voice data and identifies emotions such as anger, sadness, and joy.
Step 3: The filtering unit converts negative emotional speech into rational speech based on the emotion classified by the analysis unit. For example, the filtering unit converts speech containing anger into a calm tone. The filtering unit takes the classification result of the emotion recognition AI as input and outputs the converted speech. For example, the filtering unit adjusts the tone of the voice to make it calm.
Step 4: The providing unit provides the user with the voice converted by the filtering unit. The providing unit, for example, plays the voice using a speaker. The providing unit can also display the converted voice as text data. For example, the providing unit converts the voice data into text and displays it on a screen.
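
Steps 1 to 4 can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: a toy threshold rule stands in for the emotion recognition AI, the tone conversion is represented only by a `tone` field rather than actual signal processing, and all names, thresholds, and the choice of "anger" and "sadness" as negative emotions are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List

NEGATIVE_EMOTIONS = {"anger", "sadness"}  # assumed set of "negative" labels


@dataclass(frozen=True)
class Utterance:
    text: str
    mean_pitch_hz: float
    words_per_second: float
    emotion: str = "unknown"
    tone: str = "original"


def acquire(raw: Iterable[Utterance]) -> List[Utterance]:
    # Step 1: stand-in for microphone capture or loading a recorded file.
    return list(raw)


def analyze(u: Utterance) -> Utterance:
    # Step 2: toy rule in place of the emotion recognition AI (thresholds are
    # arbitrary illustrative values).
    if u.mean_pitch_hz > 250 and u.words_per_second > 3.5:
        label = "anger"
    elif u.mean_pitch_hz < 120 and u.words_per_second < 1.5:
        label = "sadness"
    else:
        label = "neutral"
    return replace(u, emotion=label)


def filter_utterance(u: Utterance) -> Utterance:
    # Step 3: mark negative utterances as converted to a calm tone.
    if u.emotion in NEGATIVE_EMOTIONS:
        return replace(u, tone="calm")
    return u


def provide(u: Utterance) -> str:
    # Step 4: text display in place of speaker playback.
    return f"[{u.tone}] {u.text}"


def run_pipeline(raw: Iterable[Utterance]) -> List[str]:
    return [provide(filter_utterance(analyze(u))) for u in acquire(raw)]
```

Keeping each step as a separate function mirrors the division into acquisition, analysis, filtering, and provision units, so that any one step could be replaced by an AI-based implementation without changing the others.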

（形態例２） (Example 2)
本発明の実施形態に係るオンライン通話フィルタリングシステムは、オンライン通話中の感情的な声やネガティブな発言をフィルタリングするシステムである。このシステムは、オンライン通話の音声データを取得し、感情認識ＡＩが音声データを解析して話者の感情を分類する。例えば、怒り、悲しみ、喜びなどの感情を識別する。この際、感情認識ＡＩは、音声のトーンやピッチ、速度などの特徴を基に感情を判定する。次に、負の感情的な音声を理性的な音声にフィルタリングする。例えば、怒りの感情が含まれる音声を冷静なトーンに変換する。このフィルタリング処理は、感情認識ＡＩが判定した感情に基づいて行われる。さらに、ネガティブな発言に対しては生成ＡＩを用いて、議論の本筋に無関係な発言を省いた音声を提示する。例えば、感情的な発言や無関係な話題を除去し、重要な情報だけを抽出して提示する。この処理により、ユーザはストレスを感じることなく、効率的に業務を進めることができる。この仕組みにより、オンライン通話中に感情的な声やネガティブな発言が耳元で聞こえることがなくなり、業務効率が向上する。また、ストレスフリーな業務環境が整うことで、従業員の満足度も向上する。これにより、オンライン通話フィルタリングシステムは、オンライン通話中の感情的な声やネガティブな発言をフィルタリングし、業務効率を向上させることができる。
An online call filtering system according to an embodiment of the present invention filters emotional voices and negative remarks during online calls. The system acquires audio data from an online call, and an emotion recognition AI analyzes the audio data and classifies the speaker's emotion, identifying, for example, anger, sadness, or joy. The emotion recognition AI determines the emotion based on features such as the tone, pitch, and speed of the voice. Next, negative emotional speech is filtered into rational speech. For example, speech containing anger is converted into a calm tone. This filtering is performed based on the emotion determined by the emotion recognition AI. Furthermore, for negative remarks, generative AI is used to present audio that omits remarks unrelated to the main thread of the discussion. For example, emotional remarks and irrelevant topics are removed, and only the important information is extracted and presented. This processing allows users to work efficiently without feeling stressed. With this mechanism, users no longer hear emotional voices or negative remarks directly in their ears during online calls, which improves work efficiency. A stress-free work environment also improves employee satisfaction. In this way, the online call filtering system filters emotional voices and negative remarks during online calls and improves work efficiency.
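
The removal of remarks unrelated to the main thread of the discussion can be sketched at the transcript level as follows. The sketch assumes that the utterances have already been transcribed and that the generative AI is reachable through a caller-supplied function `generate` that maps a prompt string to a reply string. That function, the prompt wording, and the reply format are all hypothetical; no particular AI service or API is implied.

```python
from typing import Callable, List


def build_filter_prompt(topic: str, utterances: List[str]) -> str:
    """Build a prompt asking which numbered utterances are on topic."""
    numbered = "\n".join(f"{i}: {u}" for i, u in enumerate(utterances, start=1))
    return (
        f"Meeting topic: {topic}\n"
        "Return the numbers of the utterances that are relevant to the topic, "
        "comma-separated, and nothing else.\n"
        f"{numbered}"
    )


def keep_relevant(
    topic: str,
    utterances: List[str],
    generate: Callable[[str], str],  # hypothetical wrapper around a generative AI
) -> List[str]:
    reply = generate(build_filter_prompt(topic, utterances))
    # Parse the comma-separated numbers, ignoring anything that is not a number.
    keep = {int(tok) for tok in reply.split(",") if tok.strip().isdigit()}
    return [u for i, u in enumerate(utterances, start=1) if i in keep]
```

Passing the model in as a function keeps the sketch independent of any specific service and lets it be exercised with a stub, for example `keep_relevant(topic, utterances, lambda prompt: "1, 3")`.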

取得部は、オンライン通話の音声データを取得する。取得部は、例えば、マイクロフォンを用いてリアルタイムで音声データを収集する。具体的には、取得部は高感度マイクロフォンを使用し、周囲の雑音を低減するノイズキャンセリング技術を組み込むことで、クリアな音声データを取得する。また、取得部は、録音された音声ファイルを読み込むこともできる。例えば、取得部は、音声ファイル形式、サンプリングレート、ビットレートなどの音声データの形式をサポートする。これにより、取得部は、さまざまな形式の音声データを柔軟に取り扱うことができる。さらに、取得部は、音声データのメタデータも取得し、解析部やフィルタリング部に提供する。メタデータには、録音日時、話者の識別情報、通話のコンテキストなどが含まれる。これにより、取得部は、音声データのコンテキストを理解しやすくし、後続の解析やフィルタリングの精度を向上させることができる。取得部は、クラウドストレージやローカルストレージに音声データを保存し、必要に応じてアクセスできるようにする。これにより、取得部は、リアルタイムの音声データだけでなく、過去の音声データも効率的に管理し、解析やフィルタリングに活用することができる。 The acquisition unit acquires audio data from online calls. The acquisition unit collects audio data in real time using, for example, a microphone. Specifically, the acquisition unit uses a high-sensitivity microphone and incorporates noise-canceling technology to reduce ambient noise, thereby acquiring clear audio data. The acquisition unit can also read recorded audio files. For example, the acquisition unit supports audio data formats such as audio file formats, sampling rates, and bit rates. This allows the acquisition unit to flexibly handle audio data in various formats. The acquisition unit also acquires metadata for the audio data and provides it to the analysis and filtering units. The metadata includes the recording date and time, speaker identification information, and call context. This makes it easier for the acquisition unit to understand the context of the audio data and improve the accuracy of subsequent analysis and filtering. The acquisition unit stores the audio data in cloud storage or local storage, allowing it to be accessed as needed. This allows the acquisition unit to efficiently manage not only real-time audio data but also past audio data and use it for analysis and filtering.
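
Reading the format of a recorded audio file can be illustrated with Python's standard `wave` module, which handles uncompressed WAV files. The sketch below assumes WAV input only; the `AudioFormat` type and `read_wav_format` function are hypothetical names, and other file formats or call metadata such as speaker identification would need additional handling.

```python
import wave
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioFormat:
    sample_rate_hz: int
    channels: int
    bits_per_sample: int
    duration_s: float

    @property
    def bit_rate_bps(self) -> int:
        # Bit rate of uncompressed PCM audio.
        return self.sample_rate_hz * self.channels * self.bits_per_sample


def read_wav_format(source) -> AudioFormat:
    """Read format information from a WAV file path or binary file object."""
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        return AudioFormat(
            sample_rate_hz=rate,
            channels=wav.getnchannels(),
            bits_per_sample=wav.getsampwidth() * 8,
            duration_s=wav.getnframes() / rate,
        )
```

The returned values could be passed to the analysis and filtering units alongside the metadata described above.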

取得部は、ユーザの感情を推定し、推定されたユーザの感情に基づいて音声データの取得タイミングを調整することができる。取得部は、例えば、ユーザの感情を推定し、推定されたユーザの感情に基づいて音声データの取得タイミングを調整する。例えば、ユーザがストレスを感じている場合、音声データの取得頻度を減らし、重要な部分のみを取得する。また、取得部は、ユーザがリラックスしている場合、音声データの取得頻度を増やし、詳細な情報を取得することもできる。例えば、取得部は、ユーザが急いでいる場合、音声データの取得タイミングを迅速にし、リアルタイムで取得する。これにより、取得部は、ユーザの感情に応じて音声データの取得タイミングを調整することで、効率的なデータ取得が可能となる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。取得部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、取得部は、ユーザの感情データを生成ＡＩに入力し、感情の推定を生成ＡＩに実行させることができる。 The acquisition unit can estimate the user's emotions and adjust the timing of voice data acquisition based on the estimated user emotions. The acquisition unit, for example, estimates the user's emotions and adjusts the timing of voice data acquisition based on the estimated user emotions. For example, if the user is feeling stressed, the acquisition unit reduces the frequency of voice data acquisition and acquires only important parts. The acquisition unit can also increase the frequency of voice data acquisition and acquire more detailed information if the user is relaxed. For example, if the user is in a hurry, the acquisition unit speeds up the timing of voice data acquisition and acquires it in real time. This allows the acquisition unit to adjust the timing of voice data acquisition according to the user's emotions, enabling efficient data acquisition. Emotion estimation is achieved using an emotion estimation function, for example, an emotion engine or generative AI. Generative AI includes, but is not limited to, text generation AI (e.g., LLM) and multimodal generation AI. Some or all of the above-mentioned processing in the acquisition unit may be performed using AI, for example, or without AI. For example, the acquisition unit can input the user's emotion data into the generation AI and have the generation AI perform emotion estimation.

取得部は、ユーザの感情を推定し、推定したユーザの感情に基づいて取得する音声データの優先順位を決定することができる。取得部は、例えば、ユーザがストレスを感じている場合、重要な音声データのみを優先的に取得する。例えば、取得部は、業務に関連する重要な音声データを優先的に取得する。また、取得部は、ユーザがリラックスしている場合、詳細な音声データを優先的に取得することもできる。例えば、取得部は、業務に関連する詳細な情報を含む音声データを優先的に取得する。また、取得部は、ユーザが急いでいる場合、迅速に取得できる音声データを優先的に取得することもできる。例えば、取得部は、リアルタイムで取得できる音声データを優先的に取得する。これにより、取得部は、ユーザの感情に応じて音声データの優先順位を決定することで、重要なデータを優先的に取得することができる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。取得部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、取得部は、ユーザの感情データを生成ＡＩに入力し、感情の推定を生成ＡＩに実行させることができる。 The acquisition unit can estimate the user's emotions and prioritize the voice data to be acquired based on the estimated user emotions. For example, when the user is stressed, the acquisition unit prioritizes acquiring only important voice data. For example, the acquisition unit prioritizes acquiring important voice data related to work. Furthermore, when the user is relaxed, the acquisition unit can prioritize acquiring detailed voice data. For example, the acquisition unit prioritizes acquiring voice data containing detailed information related to work. Furthermore, when the user is in a hurry, the acquisition unit can prioritize acquiring voice data that can be acquired quickly. For example, the acquisition unit prioritizes acquiring voice data that can be acquired in real time. Thus, the acquisition unit prioritizes the acquisition of important data by prioritizing the voice data according to the user's emotions. Emotion estimation is achieved using an emotion estimation function, such as an emotion engine or generative AI. Generative AI can be, but is not limited to, text generation AI (e.g., LLM) or multimodal generation AI. Some or all of the above-described processing in the acquisition unit may be performed using AI, or may be performed without AI. 
For example, the acquisition unit can input the user's emotion data into the generative AI and have the generative AI perform emotion estimation.

解析部は、ユーザの感情を推定し、推定したユーザの感情に基づいて解析の表現方法を調整することができる。解析部は、例えば、ユーザが緊張している場合、シンプルで視認性の高い解析結果を提供する。例えば、解析部は、グラフやチャートを用いて視覚的に分かりやすい形式で解析結果を表示する。また、解析部は、ユーザがリラックスしている場合、詳細な解析結果を提供することもできる。例えば、解析部は、詳細なテキストやデータを含む解析結果を表示する。また、解析部は、ユーザが急いでいる場合、要点を押さえた解析結果を提供することもできる。例えば、解析部は、重要なポイントを強調した簡潔な解析結果を表示する。これにより、解析部は、ユーザの感情に応じて解析の表現方法を調整することで、視認性の高い解析結果を提供することができる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。解析部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、解析部は、ユーザの感情データを生成ＡＩに入力し、解析の表現方法の調整を生成ＡＩに実行させることができる。 The analysis unit can estimate the user's emotions and adjust the presentation method of the analysis based on the estimated user emotions. For example, if the user is nervous, the analysis unit provides simple, highly visible analysis results. For example, the analysis unit displays the analysis results in a visually easy-to-understand format using graphs and charts. The analysis unit can also provide detailed analysis results if the user is relaxed. For example, the analysis unit displays analysis results including detailed text and data. The analysis unit can also provide analysis results that focus on the main points if the user is in a hurry. For example, the analysis unit displays concise analysis results that highlight important points. This allows the analysis unit to adjust the presentation method of the analysis according to the user's emotions and provide highly visible analysis results. Emotion estimation is achieved using an emotion estimation function, such as an emotion engine or generative AI. Generative AI can include, but is not limited to, text generation AI (e.g., LLM) or multimodal generation AI. Some or all of the above-described processing in the analysis unit may be performed using AI, or without AI. For example, the analysis unit can input the user's emotional data into the generation AI and have the generation AI adjust the way the analysis is expressed.

解析部は、解析時に、音声データの重要度に基づいて解析の詳細度を調整することができる。解析部は、例えば、重要な音声データに対しては、詳細な解析を行う。例えば、解析部は、音声データの内容を詳細に解析し、重要な情報を抽出する。また、解析部は、重要度の低い音声データに対しては、簡略化された解析を行うこともできる。例えば、解析部は、音声データの概要を解析し、簡潔な情報を提供する。また、解析部は、音声データの重要度に応じて、解析の詳細度を動的に調整することもできる。例えば、解析部は、音声データの重要度に基づいて解析の深さや範囲を調整する。これにより、解析部は、音声データの重要度に応じて解析の詳細度を調整することで、効率的な解析が可能となる。解析部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、解析部は、音声データの重要度をＡＩに入力し、解析の詳細度の調整をＡＩに実行させることができる。 During analysis, the analysis unit can adjust the level of detail of the analysis based on the importance of the audio data. For example, the analysis unit performs a detailed analysis of important audio data. For example, the analysis unit analyzes the content of the audio data in detail and extracts important information. The analysis unit can also perform a simplified analysis of audio data with low importance. For example, the analysis unit analyzes an overview of the audio data and provides concise information. The analysis unit can also dynamically adjust the level of detail of the analysis based on the importance of the audio data. For example, the analysis unit adjusts the depth and scope of the analysis based on the importance of the audio data. This allows the analysis unit to adjust the level of detail of the analysis based on the importance of the audio data, enabling efficient analysis. Some or all of the above-mentioned processing in the analysis unit may be performed using AI, or may be performed without AI. For example, the analysis unit can input the importance of the audio data to AI and have the AI adjust the level of detail of the analysis.

解析部は、解析時に、音声データのカテゴリに応じて異なる解析アルゴリズムを適用することができる。解析部は、例えば、会議音声データに対しては、議事録生成アルゴリズムを適用する。例えば、解析部は、会議の内容を自動的に要約し、議事録を生成する。また、解析部は、カスタマーサポート音声データに対しては、顧客満足度解析アルゴリズムを適用することもできる。例えば、解析部は、顧客の発言内容を解析し、満足度を評価する。また、解析部は、教育音声データに対しては、学習進捗解析アルゴリズムを適用することもできる。例えば、解析部は、教育内容を解析し、学習の進捗状況を評価する。これにより、解析部は、音声データのカテゴリに応じて適切な解析アルゴリズムを適用することで、解析精度が向上する。解析部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、解析部は、音声データのカテゴリをＡＩに入力し、解析アルゴリズムの適用をＡＩに実行させることができる。 During analysis, the analysis unit can apply different analysis algorithms depending on the category of audio data. For example, the analysis unit applies a minutes generation algorithm to conference audio data. For example, the analysis unit automatically summarizes the contents of the conference and generates minutes. The analysis unit can also apply a customer satisfaction analysis algorithm to customer support audio data. For example, the analysis unit analyzes the content of customer comments and evaluates satisfaction. The analysis unit can also apply a learning progress analysis algorithm to educational audio data. For example, the analysis unit analyzes educational content and evaluates learning progress. In this way, the analysis unit can apply an appropriate analysis algorithm depending on the category of audio data, thereby improving analysis accuracy. Some or all of the above-mentioned processing in the analysis unit may be performed using AI, for example, or may be performed without using AI. For example, the analysis unit can input the category of audio data to AI and have the AI apply the analysis algorithm.
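
Applying a different analysis algorithm per category can be sketched as a lookup table from category to function. The three analyzers below are deliberately trivial placeholders for the minutes generation, customer satisfaction analysis, and learning progress analysis algorithms; the category keys, function names, and word lists are assumptions made for this example.

```python
from typing import Callable, Dict


def generate_minutes(transcript: str) -> str:
    # Placeholder: keep only the first sentence as a "summary".
    return transcript.split(".")[0].strip() + "."


def score_customer_satisfaction(transcript: str) -> str:
    # Placeholder: compare counts of a few hand-picked words.
    words = transcript.lower().split()
    positive = sum(w in {"thanks", "great", "resolved"} for w in words)
    negative = sum(w in {"angry", "broken", "refund"} for w in words)
    return "satisfied" if positive >= negative else "dissatisfied"


def estimate_learning_progress(transcript: str) -> str:
    # Placeholder: count answers marked "correct" or "incorrect".
    words = transcript.lower().split()
    correct, wrong = words.count("correct"), words.count("incorrect")
    return f"{correct}/{correct + wrong} correct"


ANALYZERS: Dict[str, Callable[[str], str]] = {
    "meeting": generate_minutes,
    "customer_support": score_customer_satisfaction,
    "education": estimate_learning_progress,
}


def analyze_by_category(category: str, transcript: str) -> str:
    """Dispatch the transcript to the analyzer registered for its category."""
    try:
        analyzer = ANALYZERS[category]
    except KeyError:
        raise ValueError(f"no analyzer registered for category {category!r}") from None
    return analyzer(transcript)
```

New categories can be supported by adding an entry to `ANALYZERS`, and the category itself could be determined by an AI model as described above.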

解析部は、ユーザの感情を推定し、推定したユーザの感情に基づいて解析の長さを調整することができる。解析部は、例えば、ユーザが急いでいる場合、短くて要点を押さえた解析を行う。例えば、解析部は、重要なポイントを強調した簡潔な解析結果を提供する。また、解析部は、ユーザがリラックスしている場合、詳細な解析を行うこともできる。例えば、解析部は、詳細なテキストやデータを含む解析結果を提供する。また、解析部は、ユーザが興奮している場合、視覚的に刺激的なエフェクトを加えた解析を行うこともできる。例えば、解析部は、グラフやチャートに視覚的なエフェクトを加えて解析結果を表示する。これにより、解析部は、ユーザの感情に応じて解析の長さを調整することで、効率的な解析が可能となる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。解析部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、解析部は、ユーザの感情データを生成ＡＩに入力し、解析の長さの調整を生成ＡＩに実行させることができる。 The analysis unit can estimate the user's emotions and adjust the length of the analysis based on the estimated user emotions. For example, if the user is in a hurry, the analysis unit can perform a short, to-the-point analysis. For example, the analysis unit can provide concise analysis results that emphasize important points. The analysis unit can also perform a detailed analysis if the user is relaxed. For example, the analysis unit can provide analysis results that include detailed text and data. The analysis unit can also perform an analysis that adds visually stimulating effects if the user is excited. For example, the analysis unit can display the analysis results with visual effects in graphs and charts. This allows the analysis unit to adjust the length of the analysis according to the user's emotions, enabling efficient analysis. Emotion estimation is achieved using an emotion estimation function, such as an emotion engine or generative AI. Generative AI can include, but is not limited to, text generation AI (e.g., LLM) or multimodal generation AI. Some or all of the above-described processing in the analysis unit can be performed using AI, or without AI. For example, the analysis unit can input the user's emotion data into the generation AI and have the generation AI adjust the length of the analysis.

フィルタリング部は、ユーザの感情を推定し、推定したユーザの感情に基づいてフィルタリングの基準を調整することができる。フィルタリング部は、例えば、ユーザがストレスを感じている場合、厳格なフィルタリング基準を適用し、ネガティブな音声を除去する。例えば、フィルタリング部は、感情的な発言や無関係な話題を除去する。また、フィルタリング部は、ユーザがリラックスしている場合、緩やかなフィルタリング基準を適用し、詳細な音声を提供することもできる。例えば、フィルタリング部は、詳細な情報を含む音声データを提供する。また、フィルタリング部は、ユーザが急いでいる場合、迅速にフィルタリングを行い、重要な音声のみを提供することもできる。例えば、フィルタリング部は、重要な情報を含む音声データを迅速に提供する。これにより、フィルタリング部は、ユーザの感情に応じてフィルタリングの基準を調整することで、適切なフィルタリングが可能となる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。フィルタリング部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、フィルタリング部は、ユーザの感情データを生成ＡＩに入力し、フィルタリングの基準の調整を生成ＡＩに実行させることができる。 The filtering unit can estimate the user's emotions and adjust the filtering criteria based on the estimated user emotions. For example, if the user is stressed, the filtering unit applies strict filtering criteria to remove negative voices. For example, the filtering unit removes emotional comments and irrelevant topics. The filtering unit can also apply lenient filtering criteria to provide detailed voices when the user is relaxed. For example, the filtering unit provides voice data containing detailed information. The filtering unit can also quickly filter and provide only important voices when the user is in a hurry. For example, the filtering unit quickly provides voice data containing important information. This allows the filtering unit to adjust the filtering criteria according to the user's emotions, enabling appropriate filtering. Emotion estimation is achieved using an emotion estimation function, such as an emotion engine or generative AI. Generative AI can include, but is not limited to, text generation AI (e.g., LLM) and multimodal generation AI. Some or all of the above-described processing in the filtering unit may be performed using AI, or without AI. For example, the filtering unit can input the user's emotional data into the generation AI and have the generation AI adjust the filtering criteria.
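
Adjusting the filtering criteria to the listener's estimated state can be sketched as a threshold lookup. The sketch assumes that each segment already has a negativity score between 0 and 1 (for example, from the emotion recognition AI); the state labels, threshold values, and names are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# Assumed mapping from the listener's estimated state to the highest
# negativity score that is still passed through.
MAX_NEGATIVITY: Dict[str, float] = {
    "stressed": 0.2,  # strict criteria
    "relaxed": 0.8,   # lenient criteria
}
DEFAULT_MAX_NEGATIVITY = 0.5


def filter_segments(segments: List[Tuple[str, float]], user_state: str) -> List[str]:
    """Keep the segments whose negativity score is within the limit."""
    limit = MAX_NEGATIVITY.get(user_state, DEFAULT_MAX_NEGATIVITY)
    return [text for text, negativity in segments if negativity <= limit]
```

A stricter limit removes more of the negative speech for a stressed listener, while a lenient limit lets a relaxed listener hear more of the original detail.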

フィルタリング部は、ユーザの感情を推定し、推定したユーザの感情に基づいてフィルタリングの結果を表示する順序を調整することができる。フィルタリング部は、例えば、ユーザがストレスを感じている場合、重要な音声データを優先的に表示する。例えば、フィルタリング部は、重要な情報を含む音声データを優先的に表示する。また、フィルタリング部は、ユーザがリラックスしている場合、詳細な音声データを優先的に表示することもできる。例えば、フィルタリング部は、詳細な情報を含む音声データを優先的に表示する。また、フィルタリング部は、ユーザが急いでいる場合、迅速にフィルタリング結果を表示することもできる。例えば、フィルタリング部は、重要な情報を迅速に表示する。これにより、フィルタリング部は、ユーザの感情に応じてフィルタリング結果の表示順序を調整することで、重要な情報を優先的に提供することができる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。フィルタリング部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、フィルタリング部は、ユーザの感情データを生成ＡＩに入力し、フィルタリング結果の表示順序の調整を生成ＡＩに実行させることができる。 The filtering unit can estimate the user's emotions and adjust the display order of the filtering results based on the estimated user emotions. For example, if the user is feeling stressed, the filtering unit can prioritize displaying important audio data. For example, the filtering unit can prioritize displaying audio data containing important information. Furthermore, if the user is relaxed, the filtering unit can prioritize displaying detailed audio data. For example, the filtering unit can prioritize displaying audio data containing detailed information. Furthermore, if the user is in a hurry, the filtering unit can quickly display the filtering results. For example, the filtering unit can quickly display important information. Thus, the filtering unit can prioritize providing important information by adjusting the display order of the filtering results according to the user's emotions. Emotion estimation is achieved using an emotion estimation function, such as an emotion engine or generative AI. Generative AI can include, but is not limited to, text generation AI (e.g., LLM) or multimodal generation AI. Some or all of the above-described processing in the filtering unit may be performed using AI, or without AI. 
For example, the filtering unit can input the user's emotion data into the generative AI and have the generative AI adjust the display order of the filtering results.

提供部は、ユーザの感情を推定し、推定したユーザの感情に基づいて提供の表示方法を調整することができる。提供部は、例えば、ユーザが緊張している場合、シンプルで視認性の高い表示方法を提供する。例えば、提供部は、グラフやチャートを用いて視覚的に分かりやすい形式で情報を表示する。また、提供部は、ユーザがリラックスしている場合、詳細な情報を含む表示方法を提供することもできる。例えば、提供部は、詳細なテキストやデータを含む表示方法を提供する。また、提供部は、ユーザが急いでいる場合、要点を押さえた表示方法を提供することもできる。例えば、提供部は、重要なポイントを強調した簡潔な表示方法を提供する。これにより、提供部は、ユーザの感情に応じて表示方法を調整することで、視認性の高い情報提供が可能となる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。提供部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、提供部は、ユーザの感情データを生成ＡＩに入力し、表示方法の調整を生成ＡＩに実行させることができる。 The providing unit can estimate the user's emotions and adjust the display method based on the estimated user emotions. For example, if the user is nervous, the providing unit can provide a simple, highly visible display method. For example, the providing unit can display information in a visually easy-to-understand format using graphs and charts. Furthermore, if the user is relaxed, the providing unit can provide a display method including detailed information. For example, the providing unit can provide a display method including detailed text and data. Furthermore, if the user is in a hurry, the providing unit can provide a display method that focuses on the main points. For example, the providing unit can provide a concise display method that emphasizes important points. This allows the providing unit to adjust the display method according to the user's emotions, thereby providing highly visible information. Emotion estimation is achieved using an emotion estimation function, such as an emotion engine or generation AI. Generation AI can include, but is not limited to, text generation AI (e.g., LLM) or multimodal generation AI. Some or all of the above-described processing in the providing unit can be performed using AI, for example, or without AI. For example, the providing unit can input the user's emotion data into the generation AI and have the generation AI adjust the display method.

提供部は、ユーザの感情を推定し、推定したユーザの感情に基づいて提供の操作手順を調整することができる。提供部は、例えば、ユーザが緊張している場合、簡単な操作手順を提供する。例えば、提供部は、ステップバイステップの手順を提供し、ユーザが簡単に操作できるようにする。また、提供部は、ユーザがリラックスしている場合、詳細な操作手順を提供することもできる。例えば、提供部は、詳細な説明やガイドを含む操作手順を提供する。また、提供部は、ユーザが急いでいる場合、迅速に操作できる手順を提供することもできる。例えば、提供部は、ショートカットや簡略化された手順を提供する。これにより、提供部は、ユーザの感情に応じて操作手順を調整することで、効率的な操作が可能となる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。提供部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、提供部は、ユーザの感情データを生成ＡＩに入力し、操作手順の調整を生成ＡＩに実行させることができる。 The providing unit can estimate the user's emotions and adjust the provided operating procedures based on the estimated user emotions. For example, if the user is nervous, the providing unit can provide simple operating procedures. For example, the providing unit can provide step-by-step procedures to allow the user to operate easily. The providing unit can also provide detailed operating procedures if the user is relaxed. For example, the providing unit can provide operating procedures including detailed explanations and guides. The providing unit can also provide procedures that allow the user to operate quickly if the user is in a hurry. For example, the providing unit can provide shortcuts or simplified procedures. This allows the providing unit to adjust the operating procedures according to the user's emotions, enabling efficient operation. Emotion estimation is achieved using an emotion estimation function, for example, using an emotion engine or generative AI. Generative AI can be, but is not limited to, text generation AI (e.g., LLM) or multimodal generation AI. Some or all of the above-mentioned processing in the providing unit can be performed using AI, for example, or without AI. For example, the providing unit can input the user's emotional data into the generation AI and have the generation AI adjust the operating procedure.

生成部は、ユーザの感情を推定し、推定したユーザの感情に基づいて生成する音声の優先順位を決定することができる。生成部は、例えば、ユーザがストレスを感じている場合、重要な音声を優先的に生成する。例えば、生成部は、業務に関連する重要な音声データを優先的に生成する。また、生成部は、ユーザがリラックスしている場合、詳細な音声を優先的に生成することもできる。例えば、生成部は、業務に関連する詳細な情報を含む音声データを優先的に生成する。また、生成部は、ユーザが急いでいる場合、迅速に生成できる音声を優先的に生成することもできる。例えば、生成部は、リアルタイムで生成できる音声データを優先的に生成する。これにより、生成部は、ユーザの感情に応じて音声の優先順位を決定することで、重要な音声を優先的に生成することができる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。生成部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、生成部は、ユーザの感情データを生成ＡＩに入力し、音声の優先順位の決定を生成ＡＩに実行させることができる。 The generation unit can estimate the user's emotions and prioritize the audio to be generated based on the estimated user emotions. For example, when the user is stressed, the generation unit prioritizes generating important audio. For example, the generation unit prioritizes generating important audio data related to work. Furthermore, when the user is relaxed, the generation unit can prioritize generating detailed audio. For example, the generation unit prioritizes generating audio data containing detailed information related to work. Furthermore, when the user is in a hurry, the generation unit can prioritize generating audio that can be generated quickly. For example, the generation unit prioritizes generating audio data that can be generated in real time. Thus, the generation unit can prioritize generating important audio by prioritizing audio according to the user's emotions. Emotion estimation is achieved using an emotion estimation function, such as an emotion engine or generation AI. Generation AI can include, but is not limited to, text generation AI (e.g., LLM) and multimodal generation AI. Some or all of the above-described processing in the generation unit may be performed using AI, or without AI. For example, the generation unit can input the user's emotional data into the generation AI and have the generation AI determine the priority of the voices.

生成部は、生成時に、音声データの相互関係を考慮して生成の精度を向上させることができる。生成部は、例えば、音声データの前後関係を考慮して、関連する音声を一括で生成する。例えば、生成部は、音声データの前後関係を解析し、関連する情報を抽出して生成する。また、生成部は、音声データの相互関係を分析し、重要な部分を抽出して生成することもできる。例えば、生成部は、音声データの相互関係を基に、重要な情報を抽出して生成する。また、生成部は、音声データの相互関係に基づいて、生成の精度を動的に調整することもできる。例えば、生成部は、音声データの相互関係を基に、生成の精度を向上させる。これにより、生成部は、音声データの相互関係を考慮することで、生成の精度が向上する。生成部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、生成部は、音声データの相互関係をＡＩに入力し、生成の精度の向上をＡＩに実行させることができる。 The generation unit can improve the accuracy of generation by taking into account the interrelationships between audio data during generation. For example, the generation unit generates related audio collectively, taking into account the context of the audio data. For example, the generation unit analyzes the context of the audio data, extracts related information, and generates audio from it. The generation unit can also analyze the interrelationships between audio data, extract important parts, and generate audio from them. For example, the generation unit extracts important information based on the interrelationships between audio data and generates audio from it. The generation unit can also dynamically adjust the accuracy of generation based on the interrelationships between audio data. For example, the generation unit improves the accuracy of generation based on the interrelationships between audio data. In this way, the generation unit improves the accuracy of generation by taking the interrelationships between audio data into account. Some or all of the above-mentioned processing in the generation unit may be performed using AI, for example, or may be performed without using AI. For example, the generation unit can input the interrelationships between audio data into AI and have the AI improve the accuracy of generation.

生成部は、ユーザの感情を推定し、推定したユーザの感情に基づいて生成する音声の表示方法を調整することができる。生成部は、例えば、ユーザが緊張している場合、シンプルで視認性の高い表示方法を提供する。例えば、生成部は、グラフやチャートを用いて視覚的に分かりやすい形式で情報を表示する。また、生成部は、ユーザがリラックスしている場合、詳細な情報を含む表示方法を提供することもできる。例えば、生成部は、詳細なテキストやデータを含む表示方法を提供する。また、生成部は、ユーザが急いでいる場合、要点を押さえた表示方法を提供することもできる。例えば、生成部は、重要なポイントを強調した簡潔な表示方法を提供する。これにより、生成部は、ユーザの感情に応じて表示方法を調整することで、視認性の高い情報提供が可能となる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。生成部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、生成部は、ユーザの感情データを生成ＡＩに入力し、表示方法の調整を生成ＡＩに実行させることができる。 The generation unit can estimate the user's emotions and adjust the display method of the generated audio based on the estimated user emotions. For example, if the user is nervous, the generation unit provides a simple, highly visible display method. For example, the generation unit displays information in a visually easy-to-understand format using graphs and charts. The generation unit can also provide a display method including detailed information if the user is relaxed. For example, the generation unit provides a display method including detailed text and data. The generation unit can also provide a display method that focuses on the main points if the user is in a hurry. For example, the generation unit provides a concise display method that emphasizes important points. This allows the generation unit to adjust the display method according to the user's emotions, thereby providing highly visible information. Emotion estimation is achieved using an emotion estimation function, such as an emotion engine or generation AI. Generation AI can include, but is not limited to, text generation AI (e.g., LLM) and multimodal generation AI. Some or all of the above-described processing in the generation unit may be performed using AI, or without AI. For example, the generation unit can input the user's emotion data into the generation AI and have the generation AI adjust the display method.

取得部は、ユーザの感情を推定し、推定されたユーザの感情に基づいて音声データの取得タイミングを調整することができる。例えば、ユーザがストレスを感じている場合、音声データの取得頻度を減らし、重要な部分のみを取得する。また、取得部は、ユーザがリラックスしている場合、音声データの取得頻度を増やし、詳細な情報を取得することもできる。例えば、取得部は、ユーザが急いでいる場合、音声データの取得タイミングを迅速にし、リアルタイムで取得する。これにより、取得部は、ユーザの感情に応じて音声データの取得タイミングを調整することで、効率的なデータ取得が可能となる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。取得部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、取得部は、ユーザの感情データを生成ＡＩに入力し、感情の推定を生成ＡＩに実行させることができる。 The acquisition unit can estimate the user's emotions and adjust the timing of voice data acquisition based on the estimated user emotions. For example, if the user is feeling stressed, the acquisition unit can reduce the frequency of voice data acquisition and acquire only important parts. Furthermore, if the user is relaxed, the acquisition unit can increase the frequency of voice data acquisition to acquire more detailed information. For example, if the user is in a hurry, the acquisition unit can speed up the timing of voice data acquisition and acquire it in real time. This allows the acquisition unit to adjust the timing of voice data acquisition according to the user's emotions, enabling efficient data acquisition. Emotion estimation is achieved using an emotion estimation function, for example, using an emotion engine or generative AI. Generative AI includes, but is not limited to, text generation AI (e.g., LLM) and multimodal generation AI. Some or all of the above-mentioned processing in the acquisition unit may be performed using AI, or may be performed without AI. For example, the acquisition unit can input the user's emotion data into the generative AI and have the generative AI perform emotion estimation.

解析部は、ユーザの感情を推定し、推定したユーザの感情に基づいて解析の表現方法を調整することができる。例えば、ユーザが緊張している場合、シンプルで視認性の高い解析結果を提供する。例えば、解析部は、グラフやチャートを用いて視覚的に分かりやすい形式で解析結果を表示する。また、解析部は、ユーザがリラックスしている場合、詳細な解析結果を提供することもできる。例えば、解析部は、詳細なテキストやデータを含む解析結果を表示する。また、解析部は、ユーザが急いでいる場合、要点を押さえた解析結果を提供することもできる。例えば、解析部は、重要なポイントを強調した簡潔な解析結果を表示する。これにより、解析部は、ユーザの感情に応じて解析の表現方法を調整することで、視認性の高い解析結果を提供することができる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。解析部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、解析部は、ユーザの感情データを生成ＡＩに入力し、解析の表現方法の調整を生成ＡＩに実行させることができる。 The analysis unit can estimate the user's emotions and adjust how the analysis is presented based on the estimated emotions. For example, if the user is nervous, the analysis unit provides simple, highly visible analysis results, such as results displayed in a visually easy-to-understand format using graphs and charts. If the user is relaxed, the analysis unit can instead provide detailed analysis results, for example results that include detailed text and data. If the user is in a hurry, the analysis unit can provide analysis results that focus on the main points, for example concise results that highlight the important points. By adjusting the presentation of the analysis according to the user's emotions in this way, the analysis unit can provide highly visible analysis results. Emotion estimation is achieved by an emotion estimation function that uses, for example, an emotion engine or generative AI. The generative AI is, for example, text-generating AI (e.g., an LLM) or multimodal generative AI, but is not limited to these examples. Some or all of the above-described processing in the analysis unit may be performed using AI or without AI. For example, the analysis unit can input the user's emotion data into the generative AI and have the generative AI adjust how the analysis is presented.

フィルタリング部は、ユーザの感情を推定し、推定したユーザの感情に基づいてフィルタリングの基準を調整することができる。例えば、ユーザがストレスを感じている場合、厳格なフィルタリング基準を適用し、ネガティブな音声を除去する。例えば、フィルタリング部は、感情的な発言や無関係な話題を除去する。また、フィルタリング部は、ユーザがリラックスしている場合、緩やかなフィルタリング基準を適用し、詳細な音声を提供することもできる。例えば、フィルタリング部は、詳細な情報を含む音声データを提供する。また、フィルタリング部は、ユーザが急いでいる場合、迅速にフィルタリングを行い、重要な音声のみを提供することもできる。例えば、フィルタリング部は、重要な情報を含む音声データを迅速に提供する。これにより、フィルタリング部は、ユーザの感情に応じてフィルタリングの基準を調整することで、適切なフィルタリングが可能となる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。フィルタリング部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、フィルタリング部は、ユーザの感情データを生成ＡＩに入力し、フィルタリングの基準の調整を生成ＡＩに実行させることができる。 The filtering unit can estimate the user's emotions and adjust the filtering criteria based on the estimated emotions. For example, if the user is stressed, the filtering unit applies strict filtering criteria and removes negative speech, such as emotional comments and irrelevant topics. If the user is relaxed, the filtering unit can apply looser filtering criteria and provide more detailed speech, for example voice data containing detailed information. If the user is in a hurry, the filtering unit can filter quickly and provide only important speech, for example by promptly providing voice data that contains important information. By adjusting the filtering criteria according to the user's emotions in this way, the filtering unit can perform appropriate filtering. Emotion estimation is achieved by an emotion estimation function that uses, for example, an emotion engine or generative AI. The generative AI is, for example, text-generating AI (e.g., an LLM) or multimodal generative AI, but is not limited to these examples. Some or all of the above-described processing in the filtering unit may be performed using AI or without AI. For example, the filtering unit can input the user's emotion data into the generative AI and have the generative AI adjust the filtering criteria.
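The adjustment of filtering criteria according to the user's emotion might look like the following minimal sketch. It assumes each utterance already carries a negativity score and an importance flag from some upstream classifier; the `Utterance` type, the emotion labels, and the threshold values are hypothetical and not taken from the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Utterance:
    text: str
    negativity: float  # 0.0 (neutral) to 1.0 (strongly negative), assumed scale
    important: bool    # whether upstream analysis marked it as important


# Hypothetical limits: a lower limit means stricter filtering.
NEGATIVITY_LIMIT = {"stressed": 0.3, "relaxed": 0.8}
DEFAULT_LIMIT = 0.5


def filter_utterances(utterances: List[Utterance],
                      user_emotion: str) -> List[Utterance]:
    """Keep the utterances that pass the criterion for this emotion."""
    if user_emotion == "hurried":
        # Fast path: a single cheap check that keeps only important speech.
        return [u for u in utterances if u.important]
    limit = NEGATIVITY_LIMIT.get(user_emotion, DEFAULT_LIMIT)
    return [u for u in utterances if u.negativity <= limit]
```

This sketch only removes utterances. Converting a negative utterance into a rational one, as the filtering unit is described as doing elsewhere in the specification, would replace the drop with a rewriting step.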

提供部は、ユーザの感情を推定し、推定したユーザの感情に基づいて提供の表示方法を調整することができる。例えば、ユーザが緊張している場合、シンプルで視認性の高い表示方法を提供する。例えば、提供部は、グラフやチャートを用いて視覚的に分かりやすい形式で情報を表示する。また、提供部は、ユーザがリラックスしている場合、詳細な情報を含む表示方法を提供することもできる。例えば、提供部は、詳細なテキストやデータを含む表示方法を提供する。また、提供部は、ユーザが急いでいる場合、要点を押さえた表示方法を提供することもできる。例えば、提供部は、重要なポイントを強調した簡潔な表示方法を提供する。これにより、提供部は、ユーザの感情に応じて表示方法を調整することで、視認性の高い情報提供が可能となる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。提供部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、提供部は、ユーザの感情データを生成ＡＩに入力し、表示方法の調整を生成ＡＩに実行させることができる。 The providing unit can estimate the user's emotions and adjust the display method based on the estimated user emotions. For example, if the user is nervous, the providing unit can provide a simple, highly visible display method. For example, the providing unit can display information in a visually easy-to-understand format using graphs and charts. Furthermore, if the user is relaxed, the providing unit can provide a display method that includes detailed information. For example, the providing unit can provide a display method that includes detailed text and data. Furthermore, if the user is in a hurry, the providing unit can provide a display method that focuses on the main points. For example, the providing unit can provide a concise display method that highlights important points. This allows the providing unit to adjust the display method according to the user's emotions, thereby providing highly visible information. Emotion estimation is achieved using an emotion estimation function, such as an emotion engine or generation AI. Generation AI can include, but is not limited to, text generation AI (e.g., LLM) or multimodal generation AI. Some or all of the above-described processing in the providing unit can be performed using AI, for example, or without AI. For example, the providing unit can input the user's emotion data into the generation AI and have the generation AI adjust the display method.

生成部は、ユーザの感情を推定し、推定したユーザの感情に基づいて生成する音声の優先順位を決定することができる。例えば、ユーザがストレスを感じている場合、重要な音声を優先的に生成する。例えば、生成部は、業務に関連する重要な音声データを優先的に生成する。また、生成部は、ユーザがリラックスしている場合、詳細な音声を優先的に生成することもできる。例えば、生成部は、業務に関連する詳細な情報を含む音声データを優先的に生成する。また、生成部は、ユーザが急いでいる場合、迅速に生成できる音声を優先的に生成することもできる。例えば、生成部は、リアルタイムで生成できる音声データを優先的に生成する。これにより、生成部は、ユーザの感情に応じて音声の優先順位を決定することで、重要な音声を優先的に生成することができる。感情の推定は、例えば、感情エンジンまたは生成ＡＩなどを用いて感情推定機能を用いて実現される。生成ＡＩは、テキスト生成ＡＩ（例えば、ＬＬＭ）やマルチモーダル生成ＡＩなどであるが、かかる例に限定されない。生成部における上述した処理の一部または全部は、例えば、ＡＩを用いて行われてもよく、ＡＩを用いずに行われてもよい。例えば、生成部は、ユーザの感情データを生成ＡＩに入力し、音声の優先順位の決定を生成ＡＩに実行させることができる。 The generation unit can estimate the user's emotions and prioritize the audio to be generated based on the estimated user emotions. For example, if the user is stressed, it prioritizes the generation of important audio. For example, the generation unit prioritizes the generation of important work-related audio data. Furthermore, if the user is relaxed, the generation unit can prioritize the generation of detailed audio. For example, the generation unit prioritizes the generation of audio data containing detailed work-related information. Furthermore, if the user is in a hurry, the generation unit can prioritize the generation of audio that can be generated quickly. For example, the generation unit prioritizes the generation of audio data that can be generated in real time. Thus, the generation unit can prioritize the generation of important audio by prioritizing audio according to the user's emotions. Emotion estimation is achieved using an emotion estimation function, such as an emotion engine or generation AI. Generation AI can include, but is not limited to, text generation AI (e.g., LLM) and multimodal generation AI. Some or all of the above-described processing in the generation unit may be performed using AI, or without AI. For example, the generation unit can input the user's emotional data into the generation AI and have the generation AI determine the priority of the voices.
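The priority decision described above can be sketched as choosing a sort key from the estimated emotion. The `Candidate` fields, their scales, and the emotion labels are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    name: str
    importance: float  # higher means more important (assumed score)
    detail: float      # higher means more detailed (assumed score)
    latency_ms: int    # estimated time needed to generate the audio


def prioritize(candidates: List[Candidate], user_emotion: str) -> List[Candidate]:
    """Order generation candidates according to the user's emotion."""
    if user_emotion == "stressed":
        return sorted(candidates, key=lambda c: -c.importance)
    if user_emotion == "relaxed":
        return sorted(candidates, key=lambda c: -c.detail)
    if user_emotion == "hurried":
        return sorted(candidates, key=lambda c: c.latency_ms)
    return list(candidates)  # unknown emotion: keep the given order
```

Because `sorted` is stable, candidates with equal scores keep their original relative order.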

以下に、形態例２の処理の流れについて簡単に説明する。 The processing flow for Example 2 is briefly explained below.

特定処理部２９０は、特定処理の結果をスマートデバイス１４に送信する。スマートデバイス１４では、制御部４６Ａが、出力装置４０に対して特定処理の結果を出力させる。マイクロフォン３８Ｂは、特定処理の結果に対するユーザ入力を示す音声を取得する。制御部４６Ａは、マイクロフォン３８Ｂによって取得されたユーザ入力を示す音声データをデータ処理装置１２に送信する。データ処理装置１２では、特定処理部２９０が音声データを取得する。 The specific processing unit 290 transmits the results of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the results of the specific processing. The microphone 38B acquires audio indicating the user input regarding the results of the specific processing. The control unit 46A transmits audio data indicating the user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

データ生成モデル５８は、いわゆる生成ＡＩ（Artificial Intelligence）である。データ生成モデル５８の一例としては、ＣｈａｔＧＰＴ（登録商標）（インターネット検索＜URL: https://openai.com/blog/chatgpt＞）などの生成ＡＩが挙げられる。データ生成モデル５８は、ニューラルネットワークに対して深層学習を行わせることによって得られる。データ生成モデル５８には、指示を含むプロンプトが入力され、かつ、音声を示す音声データ、テキストを示すテキストデータ、および画像を示す画像データ（例えば、静止画のデータまたは動画のデータ）などの推論用データが入力される。データ生成モデル５８は、入力された推論用データをプロンプトにより示される指示に従って推論し、推論結果を音声データ、テキストデータ、および画像データなどのうちの１以上のデータ形式で出力する。データ生成モデル５８は、例えば、テキスト生成ＡＩ、画像生成ＡＩ、マルチモーダル生成ＡＩなどを含む。ここで、推論とは、例えば、分析、分類、予測、および／または要約などを指す。特定処理部２９０は、データ生成モデル５８を用いながら、上述した特定処理を行う。データ生成モデル５８は、指示を含まないプロンプトから推論結果を出力するように、ファインチューニングされたモデルであってもよく、この場合、データ生成モデル５８は、指示を含まないプロンプトから推論結果を出力することができる。データ処理装置１２などにおいて、データ生成モデル５８は複数種類含まれており、データ生成モデル５８は、生成ＡＩ以外のＡＩを含む。生成ＡＩ以外のＡＩは、例えば、線形回帰、ロジスティック回帰、決定木、ランダムフォレスト、サポートベクターマシン（ＳＶＭ）、ｋ－ｍｅａｎｓクラスタリング、畳み込みニューラルネットワーク（ＣＮＮ）、リカレントニューラルネットワーク（ＲＮＮ）、生成的敵対的ネットワーク（ＧＡＮ）、またはナイーブベイズなどであり、種々の処理を行うことができるが、かかる例に限定されない。また、ＡＩは、ＡＩエージェントであってもよい。また、上述した各部の処理がＡＩで行われる場合、その処理は、ＡＩで一部または全部が行われるが、かかる例に限定されない。また、生成ＡＩを含むＡＩで実施される処理は、ルールベースでの処理に置き換えてもよく、ルールベースの処理は、生成ＡＩを含むＡＩで実施される処理に置き換えてもよい。 The data generation model 58 is what is known as generative AI (artificial intelligence). An example of a data generation model 58 is generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>). The data generation model 58 is obtained by performing deep learning on a neural network. A prompt containing an instruction is input to the data generation model 58, and inference data such as voice data indicating speech, text data indicating text, and image data indicating an image (e.g., still image data or video data) is also input. The data generation model 58 performs inference on the input inference data in accordance with the instructions indicated by the prompt and outputs the inference results in one or more data formats, such as voice data, text data, and image data. The data generation model 58 includes, for example, text generation AI, image generation AI, and multimodal generation AI. 
Here, inference refers to, for example, analysis, classification, prediction, and/or summarization. The specific processing unit 290 performs the above-described specific processing using the data generation model 58. The data generation model 58 may be a model fine-tuned to output inference results from prompts that do not include instructions; in this case, the data generation model 58 can output inference results from such prompts. The data processing device 12 or the like includes multiple types of data generation models 58, and the data generation models 58 include AI other than generative AI. Examples of AI other than generative AI include linear regression, logistic regression, decision trees, random forests, support vector machines (SVMs), k-means clustering, convolutional neural networks (CNNs), recurrent neural networks (RNNs), generative adversarial networks (GANs), and naive Bayes; such AI can perform various kinds of processing and is not limited to these examples. The AI may also be an AI agent. When the processing of each of the above-described units is performed by AI, the processing may be performed partly or wholly by AI, but is not limited to this. In addition, processing performed by AI, including generative AI, may be replaced with rule-based processing, and rule-based processing may be replaced with processing performed by AI, including generative AI.
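The interchangeability of AI-based and rule-based processing can be illustrated by placing both behind the same callable interface. This is a minimal sketch under stated assumptions: the word list, the two labels, and the `model` callable (which stands in for any client of a generative model) are hypothetical, and no real model API is invoked.

```python
from typing import Callable

# Any emotion classifier is a function from text to a label.
EmotionClassifier = Callable[[str], str]

NEGATIVE_WORDS = {"terrible", "useless", "hate"}  # illustrative word list
LABELS = {"negative", "neutral"}


def rule_based_classifier(text: str) -> str:
    """Rule-based variant: flag text containing a listed word."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return "negative" if words & NEGATIVE_WORDS else "neutral"


def make_model_classifier(model: Callable[[str], str]) -> EmotionClassifier:
    """AI-based variant: wrap a model callable (hypothetical) and
    normalize its free-form answer to one of the known labels."""
    def classify(text: str) -> str:
        label = model(text).strip().lower()
        return label if label in LABELS else "neutral"
    return classify


def analyze(text: str, classifier: EmotionClassifier) -> str:
    """The caller does not need to know which variant is in use."""
    return classifier(text)
```

Swapping one variant for the other then changes only the argument passed to `analyze`, not the code that consumes the result.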

また、上述したデータ処理システム１０による処理は、データ処理装置１２の特定処理部２９０またはスマートデバイス１４の制御部４６Ａによって実行されるが、データ処理装置１２の特定処理部２９０とスマートデバイス１４の制御部４６Ａとによって実行されてもよい。また、データ処理装置１２の特定処理部２９０は、処理に必要な情報をスマートデバイス１４または外部の装置などから取得したり収集したりし、スマートデバイス１４は、処理に必要な情報をデータ処理装置１２または外部の装置などから取得したり収集したりする。 Furthermore, the processing by the data processing system 10 described above is executed by the specific processing unit 290 of the data processing device 12 or the control unit 46A of the smart device 14, but may also be executed by the specific processing unit 290 of the data processing device 12 and the control unit 46A of the smart device 14. Furthermore, the specific processing unit 290 of the data processing device 12 acquires or collects information necessary for processing from the smart device 14 or an external device, etc., and the smart device 14 acquires or collects information necessary for processing from the data processing device 12 or an external device, etc.

上述した取得部、解析部、フィルタリング部、提供部、および生成部を含む複数の要素の各々は、例えば、スマートデバイス１４およびデータ処理装置１２のうちの少なくとも一方で実現される。例えば、取得部は、スマートデバイス１４のマイクロフォンを用いて音声データを収集し、データ処理装置１２の特定処理部２９０によって解析される。解析部は、例えば、データ処理装置１２の特定処理部２９０によって実現され、感情認識ＡＩを用いて音声データを解析する。フィルタリング部は、例えば、データ処理装置１２の特定処理部２９０によって実現され、負の感情的な音声を理性的な音声に変換する。提供部は、例えば、スマートデバイス１４のスピーカーを用いて変換された音声を再生する。生成部は、例えば、データ処理装置１２の特定処理部２９０によって実現され、否定的な発言を省き、重要な情報だけを抽出して提示する。取得部は、例えば、スマートデバイス１４の制御部４６Ａによってユーザの感情を推定し、音声データの取得タイミングを調整する。各部と装置や制御部との対応関係は、上述した例に限定されず、種々の変更が可能である。 Each of the multiple elements, including the acquisition unit, analysis unit, filtering unit, provision unit, and generation unit, described above, is realized, for example, by at least one of the smart device 14 and the data processing device 12. For example, the acquisition unit collects voice data using the microphone of the smart device 14, which is then analyzed by the specific processing unit 290 of the data processing device 12. The analysis unit, realized, for example, by the specific processing unit 290 of the data processing device 12, analyzes the voice data using emotion recognition AI. The filtering unit, realized, for example, by the specific processing unit 290 of the data processing device 12, converts negative emotional voice into rational voice. The provision unit, for example, plays the converted voice using the speaker of the smart device 14. The generation unit, realized, for example, by the specific processing unit 290 of the data processing device 12, omits negative comments and extracts and presents only important information. The acquisition unit, for example, estimates the user's emotions using the control unit 46A of the smart device 14 and adjusts the timing of voice data acquisition. The correspondence between each part and the device or control unit is not limited to the example described above, and various modifications are possible.
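The division of roles among the acquisition, analysis, filtering, and provision units can be sketched as a pipeline of four injected callables, each of which could run on either device. Transcripts stand in for audio here, and `Segment`, `run_pipeline`, and the `"negative"` label are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Segment:
    text: str              # transcript standing in for the audio payload
    emotion: str = "unknown"


def run_pipeline(acquire: Callable[[], Iterable[str]],
                 analyze: Callable[[str], str],
                 convert: Callable[[str], str],
                 provide: Callable[[List[Segment]], None]) -> List[Segment]:
    """Acquisition -> analysis -> filtering (conversion) -> provision."""
    segments = [Segment(text) for text in acquire()]   # acquisition unit
    for seg in segments:
        seg.emotion = analyze(seg.text)                # analysis unit
        if seg.emotion == "negative":
            seg.text = convert(seg.text)               # filtering unit
    provide(segments)                                  # provision unit
    return segments
```

Passing the stages in as callables mirrors the statement that the mapping between units and devices is not fixed: the same pipeline works whether `analyze` runs locally or calls out to the data processing device.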

［第２実施形態］ [Second Embodiment]
図３には、第２実施形態に係るデータ処理システム２１０の構成の一例が示されている。 FIG. 3 shows an example of the configuration of a data processing system 210 according to the second embodiment.

図３に示すように、データ処理システム２１０は、データ処理装置１２およびスマート眼鏡２１４を備えている。データ処理装置１２の一例としては、サーバが挙げられる。 As shown in FIG. 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

データ処理装置１２は、コンピュータ２２、データベース２４、および通信Ｉ／Ｆ２６を備えている。コンピュータ２２は、プロセッサ２８、ＲＡＭ３０、およびストレージ３２を備えている。プロセッサ２８、ＲＡＭ３０、およびストレージ３２は、バス３４に接続されている。また、データベース２４および通信Ｉ／Ｆ２６も、バス３４に接続されている。通信Ｉ／Ｆ２６は、ネットワーク５４に接続されている。ネットワーク５４の一例としては、ＷＡＮおよび／またはＬＡＮなどが挙げられる。 The data processing device 12 includes a computer 22, a database 24, and a communication I/F 26. The computer 22 includes a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and communication I/F 26 are also connected to the bus 34. The communication I/F 26 is connected to a network 54. Examples of the network 54 include a WAN and/or a LAN.

スマート眼鏡２１４は、コンピュータ３６、マイクロフォン２３８、スピーカ２４０、カメラ４２、および通信Ｉ／Ｆ４４を備えている。コンピュータ３６は、プロセッサ４６、ＲＡＭ４８、およびストレージ５０を備えている。プロセッサ４６、ＲＡＭ４８、およびストレージ５０は、バス５２に接続されている。また、マイクロフォン２３８、スピーカ２４０、およびカメラ４２も、バス５２に接続されている。 The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication I/F 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, the speaker 240, and the camera 42 are also connected to the bus 52.

マイクロフォン２３８は、ユーザが発する音声を受け付けることで、ユーザから指示などを受け付ける。マイクロフォン２３８は、ユーザが発する音声を捕捉し、捕捉した音声を音声データに変換してプロセッサ４６に出力する。スピーカ２４０は、プロセッサ４６からの指示に従って音声を出力する。 The microphone 238 receives instructions and other information from the user by receiving voice uttered by the user. The microphone 238 captures the voice uttered by the user, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio in accordance with instructions from the processor 46.

カメラ４２は、レンズ、絞り、およびシャッタなどの光学系と、ＣＭＯＳ（Complementary Metal-Oxide-Semiconductor）イメージセンサまたはＣＣＤ（Charge Coupled Device）イメージセンサなどの撮像素子とが搭載された小型デジタルカメラであり、ユーザの周囲（例えば、一般的な健常者の視界の広さに相当する画角で規定された撮像範囲）を撮像する。 Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an imaging element such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the user's surroundings (for example, an imaging range defined by an angle of view equivalent to the field of vision of a typical healthy person).

通信Ｉ／Ｆ４４は、ネットワーク５４に接続されている。通信Ｉ／Ｆ４４および２６は、ネットワーク５４を介してプロセッサ４６とプロセッサ２８との間の各種情報の授受を司る。通信Ｉ／Ｆ４４および２６を用いたプロセッサ４６とプロセッサ２８との間の各種情報の授受はセキュアな状態で行われる。 The communication I/F 44 is connected to the network 54. The communication I/Fs 44 and 26 are responsible for the exchange of various information between the processor 46 and the processor 28 via the network 54. The exchange of various information between the processor 46 and the processor 28 using the communication I/Fs 44 and 26 is carried out in a secure manner.

図４には、データ処理装置１２およびスマート眼鏡２１４の要部機能の一例が示されている。図４に示すように、データ処理装置１２では、プロセッサ２８によって特定処理が行われる。ストレージ３２には、特定処理プログラム５６が格納されている。 Figure 4 shows an example of the main functions of the data processing device 12 and smart glasses 214. As shown in Figure 4, in the data processing device 12, specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32.

プロセッサ２８は、ストレージ３２から特定処理プログラム５６を読み出し、読み出した特定処理プログラム５６をＲＡＭ３０上で実行する。特定処理は、プロセッサ２８がＲＡＭ３０上で実行する特定処理プログラム５６に従って、特定処理部２９０として動作することによって実現される。 The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as the specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

スマート眼鏡２１４では、プロセッサ４６によって特定処理が行われる。ストレージ５０には、特定処理プログラム６０が格納されている。プロセッサ４６は、ストレージ５０から特定処理プログラム６０を読み出し、読み出した特定処理プログラム６０をＲＡＭ４８上で実行する。特定処理は、プロセッサ４６がＲＡＭ４８上で実行する特定処理プログラム６０に従って、制御部４６Ａとして動作することによって実現される。なお、スマート眼鏡２１４には、データ生成モデル５８および感情特定モデル５９と同様のデータ生成モデルおよび感情特定モデルを有し、これらモデルを用いて特定処理部２９０と同様の処理を行うこともできる。 In the smart glasses 214, specific processing is performed by the processor 46. A specific processing program 60 is stored in the storage 50. The processor 46 reads the specific processing program 60 from the storage 50 and executes the read specific processing program 60 on the RAM 48. The specific processing is realized by the processor 46 operating as the control unit 46A in accordance with the specific processing program 60 executed on the RAM 48. The smart glasses 214 also have a data generation model and emotion identification model similar to the data generation model 58 and emotion identification model 59, and can use these models to perform processing similar to that of the specific processing unit 290.

なお、データ処理装置１２以外の他の装置がデータ生成モデル５８を有してもよい。例えば、サーバ装置がデータ生成モデル５８を有してもよい。この場合、データ処理装置１２は、データ生成モデル５８を有するサーバ装置と通信を行うことで、データ生成モデル５８が用いられた処理結果（予測結果など）を得る。また、データ処理装置１２は、サーバ装置であってもよいし、ユーザが保有する端末装置（例えば、携帯電話、ロボット、家電など）であってもよい。 Note that a device other than the data processing device 12 may have the data generation model 58. For example, a server device may have the data generation model 58. In this case, the data processing device 12 communicates with the server device having the data generation model 58 to obtain processing results (such as prediction results) using the data generation model 58. The data processing device 12 may also be a server device, or a terminal device owned by a user (for example, a mobile phone, robot, home appliance, etc.).

特定処理部２９０は、特定処理の結果をスマート眼鏡２１４に送信する。スマート眼鏡２１４では、制御部４６Ａが、スピーカ２４０に対して特定処理の結果を出力させる。マイクロフォン２３８は、特定処理の結果に対するユーザ入力を示す音声を取得する。制御部４６Ａは、マイクロフォン２３８によって取得されたユーザ入力を示す音声データをデータ処理装置１２に送信する。データ処理装置１２では、特定処理部２９０が音声データを取得する。 The specific processing unit 290 transmits the results of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the results of the specific processing. The microphone 238 acquires audio indicating the user input regarding the results of the specific processing. The control unit 46A transmits audio data indicating the user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

データ生成モデル５８は、いわゆる生成ＡＩである。データ生成モデル５８の一例としては、ＣｈａｔＧＰＴなどの生成ＡＩが挙げられる。データ生成モデル５８は、ニューラルネットワークに対して深層学習を行わせることによって得られる。データ生成モデル５８には、指示を含むプロンプトが入力され、かつ、音声を示す音声データ、テキストを示すテキストデータ、および画像を示す画像データ（例えば、静止画のデータまたは動画のデータ）などの推論用データが入力される。データ生成モデル５８は、入力された推論用データをプロンプトにより示される指示に従って推論し、推論結果を音声データ、テキストデータ、および画像データなどのうちの１以上のデータ形式で出力する。データ生成モデル５８は、例えば、テキスト生成ＡＩ、画像生成ＡＩ、マルチモーダル生成ＡＩなどを含む。ここで、推論とは、例えば、分析、分類、予測、および／または要約などを指す。特定処理部２９０は、データ生成モデル５８を用いながら、上述した特定処理を行う。データ生成モデル５８は、指示を含まないプロンプトから推論結果を出力するように、ファインチューニングされたモデルであってもよく、この場合、データ生成モデル５８は、指示を含まないプロンプトから推論結果を出力することができる。データ処理装置１２などにおいて、データ生成モデル５８は複数種類含まれており、データ生成モデル５８は、生成ＡＩ以外のＡＩを含む。生成ＡＩ以外のＡＩは、例えば、線形回帰、ロジスティック回帰、決定木、ランダムフォレスト、サポートベクターマシン（ＳＶＭ）、ｋ－ｍｅａｎｓクラスタリング、畳み込みニューラルネットワーク（ＣＮＮ）、リカレントニューラルネットワーク（ＲＮＮ）、生成的敵対的ネットワーク（ＧＡＮ）、またはナイーブベイズなどであり、種々の処理を行うことができるが、かかる例に限定されない。また、ＡＩは、ＡＩエージェントであってもよい。また、上述した各部の処理がＡＩで行われる場合、その処理は、ＡＩで一部または全部が行われるが、かかる例に限定されない。また、生成ＡＩを含むＡＩで実施される処理は、ルールベースでの処理に置き換えてもよく、ルールベースの処理は、生成ＡＩを含むＡＩで実施される処理に置き換えてもよい。 The data generation model 58 is what is known as generative AI. An example of a data generation model 58 is generative AI such as ChatGPT. The data generation model 58 is obtained by performing deep learning on a neural network. A prompt containing an instruction is input to the data generation model 58, and inference data such as voice data indicating speech, text data indicating text, and image data indicating an image (e.g., still image data or video data) is input. The data generation model 58 performs inference on the input inference data in accordance with the instruction indicated by the prompt, and outputs the inference result in one or more data formats, such as voice data, text data, and image data. The data generation model 58 includes, for example, text generation AI, image generation AI, and multimodal generation AI. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization. 
The specific processing unit 290 performs the above-described specific processing using the data generation model 58. The data generation model 58 may be a model fine-tuned to output inference results from prompts that do not include instructions; in this case, the data generation model 58 can output inference results from such prompts. The data processing device 12 or the like includes multiple types of data generation models 58, and the data generation models 58 include AI other than generative AI. Examples of AI other than generative AI include linear regression, logistic regression, decision trees, random forests, support vector machines (SVMs), k-means clustering, convolutional neural networks (CNNs), recurrent neural networks (RNNs), generative adversarial networks (GANs), and naive Bayes; such AI can perform various kinds of processing and is not limited to these examples. The AI may also be an AI agent. When the processing of each of the above-described units is performed by AI, the processing may be performed partly or wholly by AI, but is not limited to this. In addition, processing performed by AI, including generative AI, may be replaced with rule-based processing, and rule-based processing may be replaced with processing performed by AI, including generative AI.

第２実施形態に係るデータ処理システム２１０は、第１実施形態に係るデータ処理システム１０と同様の処理を行う。データ処理システム２１０による処理は、データ処理装置１２の特定処理部２９０またはスマート眼鏡２１４の制御部４６Ａによって実行されるが、データ処理装置１２の特定処理部２９０とスマート眼鏡２１４の制御部４６Ａとによって実行されてもよい。また、データ処理装置１２の特定処理部２９０は、処理に必要な情報をスマート眼鏡２１４または外部の装置などから取得したり収集したりし、スマート眼鏡２１４は、処理に必要な情報をデータ処理装置１２または外部の装置などから取得したり収集したりする。 The data processing system 210 according to the second embodiment performs processing similar to that of the data processing system 10 according to the first embodiment. Processing by the data processing system 210 is executed by the specific processing unit 290 of the data processing device 12 or the control unit 46A of the smart glasses 214, but may also be executed by the specific processing unit 290 of the data processing device 12 and the control unit 46A of the smart glasses 214. In addition, the specific processing unit 290 of the data processing device 12 acquires or collects information necessary for processing from the smart glasses 214 or an external device, etc., and the smart glasses 214 acquires or collects information necessary for processing from the data processing device 12 or an external device, etc.

上述した取得部、解析部、フィルタリング部、提供部、および生成部を含む複数の要素の各々は、例えば、スマート眼鏡２１４およびデータ処理装置１２のうちの少なくとも一方で実現される。例えば、取得部は、スマート眼鏡２１４のマイクロフォンを用いて音声データを収集し、データ処理装置１２の特定処理部２９０によって解析される。解析部は、例えば、データ処理装置１２の特定処理部２９０によって実現され、感情認識ＡＩを用いて音声データを解析する。フィルタリング部は、例えば、データ処理装置１２の特定処理部２９０によって実現され、負の感情的な音声を理性的な音声に変換する。提供部は、例えば、スマート眼鏡２１４のスピーカーを用いて変換された音声を再生する。生成部は、例えば、データ処理装置１２の特定処理部２９０によって実現され、否定的な発言を省き、重要な情報だけを抽出して提示する。取得部は、例えば、スマート眼鏡２１４の制御部４６Ａによってユーザの感情を推定し、音声データの取得タイミングを調整する。各部と装置や制御部との対応関係は、上述した例に限定されず、種々の変更が可能である。 Each of the multiple elements, including the acquisition unit, analysis unit, filtering unit, provision unit, and generation unit, described above, is realized, for example, by at least one of the smart glasses 214 and the data processing device 12. For example, the acquisition unit collects voice data using the microphone of the smart glasses 214, which is analyzed by the specific processing unit 290 of the data processing device 12. The analysis unit, realized, for example, by the specific processing unit 290 of the data processing device 12, analyzes the voice data using emotion recognition AI. The filtering unit, realized, for example, by the specific processing unit 290 of the data processing device 12, converts negative emotional voice into rational voice. The provision unit, for example, plays the converted voice using the speaker of the smart glasses 214. The generation unit, realized, for example, by the specific processing unit 290 of the data processing device 12, omits negative comments and extracts and presents only important information. The acquisition unit, for example, estimates the user's emotions using the control unit 46A of the smart glasses 214 and adjusts the timing of voice data acquisition. The correspondence between each part and the device or control unit is not limited to the example described above, and various modifications are possible.

［第３実施形態］ [Third Embodiment]
図５には、第３実施形態に係るデータ処理システム３１０の構成の一例が示されている。 FIG. 5 shows an example of the configuration of a data processing system 310 according to the third embodiment.

図５に示すように、データ処理システム３１０は、データ処理装置１２およびヘッドセット型端末３１４を備えている。データ処理装置１２の一例としては、サーバが挙げられる。 As shown in FIG. 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

ヘッドセット型端末３１４は、コンピュータ３６、マイクロフォン２３８、スピーカ２４０、カメラ４２、通信Ｉ／Ｆ４４、およびディスプレイ３４３を備えている。コンピュータ３６は、プロセッサ４６、ＲＡＭ４８、およびストレージ５０を備えている。プロセッサ４６、ＲＡＭ４８、およびストレージ５０は、バス５２に接続されている。また、マイクロフォン２３８、スピーカ２４０、カメラ４２、およびディスプレイ３４３も、バス５２に接続されている。 The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication I/F 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, the speaker 240, the camera 42, and the display 343 are also connected to the bus 52.

図６には、データ処理装置１２およびヘッドセット型端末３１４の要部機能の一例が示されている。図６に示すように、データ処理装置１２では、プロセッサ２８によって特定処理が行われる。ストレージ３２には、特定処理プログラム５６が格納されている。 Figure 6 shows an example of the main functions of the data processing device 12 and headset terminal 314. As shown in Figure 6, in the data processing device 12, specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32.

ヘッドセット型端末３１４では、プロセッサ４６によって特定処理が行われる。ストレージ５０には、特定プログラム６０が格納されている。プロセッサ４６は、ストレージ５０から特定プログラム６０を読み出し、読み出した特定プログラム６０をＲＡＭ４８上で実行する。特定処理は、プロセッサ４６がＲＡＭ４８上で実行する特定プログラム６０に従って、制御部４６Ａとして動作することによって実現される。なお、ヘッドセット型端末３１４には、データ生成モデル５８および感情特定モデル５９と同様のデータ生成モデルおよび感情特定モデルを有し、これらモデルを用いて特定処理部２９０と同様の処理を行うこともできる。 In the headset terminal 314, the specific processing is performed by the processor 46. A specific program 60 is stored in the storage 50. The processor 46 reads the specific program 60 from the storage 50 and executes the read specific program 60 on the RAM 48. The specific processing is realized by the processor 46 operating as the control unit 46A in accordance with the specific program 60 executed on the RAM 48. The headset terminal 314 may also have a data generation model and an emotion identification model similar to the data generation model 58 and the emotion identification model 59, and can use these models to perform processing similar to that of the specific processing unit 290.

特定処理部２９０は、特定処理の結果をヘッドセット型端末３１４に送信する。ヘッドセット型端末３１４では、制御部４６Ａが、スピーカ２４０およびディスプレイ３４３に対して特定処理の結果を出力させる。マイクロフォン２３８は、特定処理の結果に対するユーザ入力を示す音声を取得する。制御部４６Ａは、マイクロフォン２３８によって取得されたユーザ入力を示す音声データをデータ処理装置１２に送信する。データ処理装置１２では、特定処理部２９０が音声データを取得する。 The specific processing unit 290 transmits the results of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the results of the specific processing. The microphone 238 acquires audio indicating the user input regarding the results of the specific processing. The control unit 46A transmits audio data indicating the user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
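The exchange described above, in which the control unit routes a result to both the speaker and the display and forwards microphone audio back to the data processing device, can be sketched as follows. The `ControlUnit` class and its method names are hypothetical, and plain callables stand in for the speaker 240, the display 343, and the network link.

```python
from typing import Callable, List


class ControlUnit:
    """Hypothetical stand-in for the control unit 46A on the terminal."""

    def __init__(self,
                 outputs: List[Callable[[str], None]],
                 send_to_server: Callable[[bytes], None]) -> None:
        self._outputs = outputs              # e.g. speaker and display sinks
        self._send_to_server = send_to_server

    def on_result(self, result: str) -> None:
        # Fan the result of the specific processing out to every output device.
        for output in self._outputs:
            output(result)

    def on_microphone_audio(self, audio: bytes) -> None:
        # Forward the user's spoken response to the data processing device.
        self._send_to_server(audio)
```

With a single output sink, the same class would also cover a terminal that has a speaker but no display.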

The data processing system 310 according to the third embodiment performs the same processing as the data processing system 10 according to the first embodiment. Processing by the data processing system 310 is executed by the specific processing unit 290 of the data processing device 12 or by the control unit 46A of the headset-type terminal 314, but may also be executed by both together. In addition, the specific processing unit 290 of the data processing device 12 acquires or collects information required for processing from the headset-type terminal 314, an external device, or the like, and the headset-type terminal 314 acquires or collects information required for processing from the data processing device 12, an external device, or the like.

Each of the multiple elements described above, including the acquisition unit, the analysis unit, the filtering unit, the provision unit, and the generation unit, is realized, for example, by at least one of the headset-type terminal 314 and the data processing device 12. For example, the acquisition unit collects voice data using the microphone of the headset-type terminal 314, and the collected voice data is analyzed by the specific processing unit 290 of the data processing device 12. The analysis unit is realized, for example, by the specific processing unit 290 of the data processing device 12 and analyzes the voice data using emotion recognition AI. The filtering unit is realized, for example, by the specific processing unit 290 of the data processing device 12 and converts negative emotional voice into rational voice. The provision unit plays the converted voice using, for example, the speaker of the headset-type terminal 314. The generation unit is realized, for example, by the specific processing unit 290 of the data processing device 12, omits negative statements, and extracts and presents only important information. The acquisition unit also, for example, estimates the user's emotion by means of the control unit 46A of the headset-type terminal 314 and adjusts the timing of voice data acquisition. The correspondence between each unit and the devices or control units is not limited to the examples described above, and various modifications are possible.
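The division of roles among the acquisition, analysis, filtering, and provision units can be illustrated with a minimal sketch. It is not the implementation of the embodiment: text stands in for voice data, and every name here (`Utterance`, `NEGATIVE`, `toy_classify`, `toy_rewrite`, `run_pipeline`) is a hypothetical introduced only for illustration. In the embodiment, the classifier and the rewriter would correspond to the emotion recognition AI and the voice conversion, whose internals this sketch does not attempt to reproduce.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Utterance:
    text: str          # stand-in for a voice segment (assumption: already transcribed)
    emotion: str = ""  # label attached by the analysis unit


NEGATIVE = {"anger"}   # hypothetical set of labels treated as negative


def acquire(raw: str) -> Utterance:
    """Acquisition unit: wraps input captured, e.g., by the terminal microphone."""
    return Utterance(text=raw)


def analyze(u: Utterance, classify: Callable[[str], str]) -> Utterance:
    """Analysis unit: attaches the emotion label returned by a pluggable classifier."""
    return Utterance(text=u.text, emotion=classify(u.text))


def filter_utterance(u: Utterance, rewrite: Callable[[str], str]) -> Utterance:
    """Filtering unit: rewrites only utterances classified as negative."""
    if u.emotion in NEGATIVE:
        return Utterance(text=rewrite(u.text), emotion="neutral")
    return u


def provide(u: Utterance) -> str:
    """Provision unit: returns what would be played back on the terminal speaker."""
    return u.text


def toy_classify(text: str) -> str:
    # Toy rule for the example only: shouting or an exclamation mark counts as anger.
    return "anger" if text.endswith("!") or text.isupper() else "neutral"


def toy_rewrite(text: str) -> str:
    # Toy rule for the example only: drop exclamation marks and all-caps emphasis.
    return text.rstrip("!").capitalize() + "."


def run_pipeline(raw: str) -> str:
    u = analyze(acquire(raw), toy_classify)
    return provide(filter_utterance(u, toy_rewrite))
```

Because each stage is a separate function, the analysis and filtering stages could run on the data processing device 12 while acquisition and provision run on the terminal, which mirrors the flexible correspondence described above.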

[Fourth embodiment]
FIG. 7 shows an example of the configuration of a data processing system 410 according to the fourth embodiment.

As shown in FIG. 7, the data processing system 410 includes the data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication I/F 44, and a control target 443. The computer 36 includes a processor 46, a RAM 48, and a storage 50. The processor 46, the RAM 48, and the storage 50 are connected to a bus 52. The microphone 238, the speaker 240, the camera 42, and the control target 443 are also connected to the bus 52.

The camera 42 is a small digital camera equipped with an optical system including a lens, an aperture, and a shutter, and with an imaging element such as a CMOS image sensor or a CCD image sensor. It captures images of the user's surroundings (for example, an imaging range defined by an angle of view equivalent to the field of view of a typical healthy person).

The control target 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, legs, and the like. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, legs, and the like. Some of the emotions of the robot 414 can be expressed by controlling these motors. The facial expression of the robot 414 can also be expressed by controlling the light emission state of the LEDs in its eyes.

FIG. 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in FIG. 8, in the data processing device 12, the specific processing is performed by the processor 28. The specific processing program 56 is stored in the storage 32.

In the robot 414, the specific processing is performed by the processor 46. The specific program 60 is stored in the storage 50. The processor 46 reads the specific program 60 from the storage 50 and executes it on the RAM 48. The specific processing is realized by the processor 46 operating as the control unit 46A in accordance with the specific program 60 executed on the RAM 48. The robot 414 may also have a data generation model and an emotion identification model similar to the data generation model 58 and the emotion identification model 59, and may use these models to perform processing similar to that of the specific processing unit 290.

The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the control target 443 to output the result of the specific processing. The microphone 238 acquires voice indicating the user input in response to the result of the specific processing. The control unit 46A transmits voice data indicating the user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the voice data.

The data processing system 410 according to the fourth embodiment performs the same processing as the data processing system 10 according to the first embodiment. Processing by the data processing system 410 is executed by the specific processing unit 290 of the data processing device 12 or by the control unit 46A of the robot 414, but may also be executed by both together. In addition, the specific processing unit 290 of the data processing device 12 acquires or collects information required for processing from the robot 414, an external device, or the like, and the robot 414 acquires or collects information required for processing from the data processing device 12, an external device, or the like.

Each of the multiple elements described above, including the acquisition unit, the analysis unit, the filtering unit, the provision unit, and the generation unit, is realized, for example, by at least one of the robot 414 and the data processing device 12. For example, the acquisition unit collects voice data using the microphone of the robot 414, and the collected voice data is analyzed by the specific processing unit 290 of the data processing device 12. The analysis unit is realized, for example, by the specific processing unit 290 of the data processing device 12 and analyzes the voice data using emotion recognition AI. The filtering unit is realized, for example, by the specific processing unit 290 of the data processing device 12 and converts negative emotional voice into rational voice. The provision unit plays the converted voice using, for example, the speaker of the robot 414. The generation unit is realized, for example, by the specific processing unit 290 of the data processing device 12, omits negative statements, and extracts and presents only important information. The acquisition unit also, for example, estimates the user's emotion by means of the control unit 46A of the robot 414 and adjusts the timing of voice data acquisition. The correspondence between each unit and the devices or control units is not limited to the examples described above, and various modifications are possible.

The emotion identification model 59, which serves as an emotion engine, may determine the user's emotion according to a specific mapping. Specifically, the emotion identification model 59 may determine the user's emotion according to an emotion map (see FIG. 9), which is the specific mapping. The emotion identification model 59 may likewise determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

FIG. 9 shows an emotion map 400 on which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions arranged there. Emotions representing states and actions that arise from a state of mind are arranged toward the outside of the concentric circles. Here, emotion is a concept that also includes affect and mental states. Emotions that are generally generated from reactions occurring in the brain are arranged on the left side of the concentric circles. Emotions that are generally induced by situational judgment are arranged on the right side. Emotions that are generally generated from reactions occurring in the brain and also induced by situational judgment are arranged in the upward and downward directions of the concentric circles. Furthermore, "pleasant" emotions are arranged on the upper side of the concentric circles, and "unpleasant" emotions on the lower side. In this way, the emotion map 400 maps multiple emotions based on the structure by which emotions arise, and emotions that tend to occur simultaneously are mapped close to each other.
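One way to hold such a map in memory is to give each emotion a polar placement and derive the left/right and upper/lower properties from it. The following is a minimal sketch under stated assumptions: the `Placement` type, the helper functions, and all coordinates in `EMOTION_MAP` are hypothetical values chosen only to be consistent with the qualitative layout described here, not positions taken from FIG. 9.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    radius: float     # 0 = center (primitive state); larger = closer to visible behavior
    angle_deg: float  # 0 = 3 o'clock (right), 90 = top, 180 = left, 270 = bottom

    def xy(self):
        a = math.radians(self.angle_deg)
        return (self.radius * math.cos(a), self.radius * math.sin(a))


def valence(p: Placement) -> str:
    """Upper half of the map is pleasant, lower half is unpleasant."""
    _, y = p.xy()
    if abs(y) < 1e-9:
        return "neutral"
    return "pleasant" if y > 0 else "unpleasant"


def origin(p: Placement) -> str:
    """Left half is driven by reaction, right half by situational judgment."""
    x, _ = p.xy()
    if abs(x) < 1e-9:
        return "both"
    return "situation" if x > 0 else "reaction"


def distance(p: Placement, q: Placement) -> float:
    """Map distance; emotions that tend to co-occur should be close together."""
    (x1, y1), (x2, y2) = p.xy(), q.xy()
    return math.hypot(x1 - x2, y1 - y2)


# Hypothetical placements, for illustration only.
EMOTION_MAP = {
    "relief": Placement(1.0, 20.0),
    "anxiety": Placement(1.0, 340.0),
    "desire": Placement(2.0, 135.0),
    "remorse": Placement(2.0, 315.0),
}
```

Storing an angle and a radius rather than raw x/y keeps the two design axes of the map (how primitive an emotion is, and where it sits between reaction and situation) directly readable from the data.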

These emotions are distributed in the 3 o'clock direction of the emotion map 400, and usually move back and forth between relief and anxiety. In the right half of the emotion map 400, situational awareness takes precedence over internal sensations, which gives a calm impression.

The inner part of the emotion map 400 represents what is happening in the mind, and the outer part represents behavior. The further outward an emotion lies on the emotion map 400, the more visible it becomes (that is, the more it is expressed in behavior).

Here, human emotions are based on various balances, such as posture and blood sugar level. When these balances move away from the ideal, the state is one of discomfort, and when they approach the ideal, the state is one of pleasure. Emotions can likewise be created for robots, automobiles, motorcycles, and the like, based on various balances such as posture and remaining battery level, so that moving away from the ideal indicates discomfort and approaching the ideal indicates pleasure. The emotion map may be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on speech emotion recognition and a system for analyzing brain physiological signals of emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "reaction," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
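The balance-based rule can be sketched as a comparison of how far a set of measured quantities lies from their ideal values before and after a change. This is an illustrative assumption about how such a rule might be computed, not a method stated in the disclosure; the function names, the use of summed absolute deviation, and the example quantities (`battery`, `tilt`) are all hypothetical.

```python
def comfort_delta(previous: dict, current: dict, ideal: dict) -> float:
    """Positive when the balances moved closer to the ideal, negative when they moved away.

    Assumption: total deviation is the sum of absolute differences from the ideal.
    """
    prev_dev = sum(abs(previous[k] - ideal[k]) for k in ideal)
    cur_dev = sum(abs(current[k] - ideal[k]) for k in ideal)
    return prev_dev - cur_dev


def comfort_label(delta: float, eps: float = 1e-9) -> str:
    """Map the change in deviation to the pleasure/discomfort state described above."""
    if delta > eps:
        return "pleasure"
    if delta < -eps:
        return "discomfort"
    return "unchanged"


# Hypothetical robot state: battery level in [0, 1] and body tilt in radians.
IDEAL = {"battery": 1.0, "tilt": 0.0}
BEFORE = {"battery": 0.5, "tilt": 0.1}
DRAINED = {"battery": 0.3, "tilt": 0.1}
CHARGED = {"battery": 0.9, "tilt": 0.1}
```

For example, `comfort_label(comfort_delta(BEFORE, DRAINED, IDEAL))` evaluates to `"discomfort"` because the battery moved further from its ideal, while the same call with `CHARGED` evaluates to `"pleasure"`.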

The emotion map defines two emotions that encourage learning. One is the negative emotion around the middle of "repentance" and "reflection" on the situation side. That is, it arises when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the positive emotion around "desire" on the reaction side. That is, it arises when the robot has positive feelings such as "I want more" or "I want to know more."

The emotion identification model 59 inputs the user input into a pre-trained neural network, obtains emotion values indicating each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is trained in advance on multiple pieces of training data, each of which is a combination of a user input and emotion values indicating each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions arranged close to each other have similar values, as in the emotion map 900 shown in FIG. 10. FIG. 10 shows an example in which multiple emotions, "relieved," "calm," and "reassuring," have similar emotion values.
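The disclosure does not state how the network is made to give neighboring emotions similar values. One possible approach, shown here purely as an assumption, is to build the training targets themselves from map distance, so that the labelled emotion receives 1.0 and other emotions receive values that decay with their distance on the map. The function `soft_targets`, the Gaussian kernel, the `sigma` parameter, and the coordinates in `POSITIONS` are all hypothetical.

```python
import math


def soft_targets(labelled: str, positions: dict, sigma: float = 1.0) -> dict:
    """Return one target emotion value per emotion.

    The labelled emotion gets 1.0; every other emotion gets a value that decays
    with its squared map distance from the labelled one (Gaussian kernel).
    """
    lx, ly = positions[labelled]
    return {
        name: math.exp(-((x - lx) ** 2 + (y - ly) ** 2) / (2 * sigma ** 2))
        for name, (x, y) in positions.items()
    }


# Hypothetical map coordinates, for illustration only.
POSITIONS = {
    "relieved": (1.0, 0.3),
    "calm": (1.2, 0.4),
    "reassuring": (1.1, 0.6),
    "anger": (-1.5, -1.0),
}

TARGETS = soft_targets("relieved", POSITIONS)
```

With these coordinates, the targets for "calm" and "reassuring" stay above 0.9 while the target for "anger" falls below 0.1, so a network fitted to such targets would tend to output similar values for emotions that sit close together.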

In the above embodiments, an example was given in which the specific processing is performed by a single computer 22, but the technology of the present disclosure is not limited to this, and the specific processing may be distributed across multiple computers including the computer 22.

In the above embodiments, an example was described in which the specific processing program 56 is stored in the storage 32, but the technology of the present disclosure is not limited to this. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed on the computer 22 of the data processing device 12. The processor 28 executes the specific processing in accordance with the specific processing program 56.

Alternatively, the specific processing program 56 may be stored in a storage device, such as a server, connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

It is not necessary to store the entire specific processing program 56 in a storage device, such as a server, connected to the data processing device 12 via the network 54, or to store the entire specific processing program 56 in the storage 32; only a part of the specific processing program 56 may be stored.

The following types of processors can be used as hardware resources for executing the specific processing. One example is a CPU, a general-purpose processor that functions as a hardware resource for executing the specific processing by executing software, that is, a program. Another example is a dedicated electrical circuit, which is a processor having a circuit configuration designed specifically to execute particular processing, such as an FPGA (Field-Programmable Gate Array), a PLD (Programmable Logic Device), or an ASIC (Application Specific Integrated Circuit). Each of these processors has a built-in or connected memory, and each executes the specific processing by using that memory.

The hardware resource that executes the specific processing may consist of one of these various processors, or of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). The hardware resource that executes the specific processing may also be a single processor.

There are two examples of a configuration using a single processor. First, a single processor may be formed from a combination of one or more CPUs and software, with this processor functioning as the hardware resource that executes the specific processing. Second, as typified by an SoC (System-on-a-chip), a processor may be used that implements, on a single IC chip, the functions of an entire system including multiple hardware resources that execute the specific processing. In this way, the specific processing is realized using one or more of the various processors described above as hardware resources.

More specifically, the hardware structure of these various processors can be an electrical circuit that combines circuit elements such as semiconductor elements. The specific processing described above is merely an example. It therefore goes without saying that unnecessary steps may be deleted, new steps may be added, and the processing order may be changed without departing from the gist of the disclosure.

In the above examples, the first to fourth embodiments were described separately, but some or all of these embodiments may be combined. The smart device 14, the smart glasses 214, the headset-type terminal 314, and the robot 414 are only examples; they may be combined with one another, or other devices may be used. In addition, Example 1 and Example 2 were described separately in the above examples, but these may also be combined.

The descriptions and illustrations given above are a detailed explanation of the parts related to the technology of the present disclosure and are merely one example of that technology. For example, the above explanation of the configuration, functions, actions, and effects is an explanation of one example of the configuration, functions, actions, and effects of the parts related to the technology of the present disclosure. It therefore goes without saying that unnecessary parts may be deleted from, and new elements may be added to or substituted in, the descriptions and illustrations given above without departing from the gist of the technology of the present disclosure. Furthermore, to avoid confusion and to facilitate understanding of the parts related to the technology of the present disclosure, the descriptions and illustrations given above omit explanations of common technical knowledge and the like that do not require particular explanation to enable implementation of the technology of the present disclosure.

All publications, patent applications, and technical standards mentioned in this specification are incorporated herein by reference to the same extent as if each individual publication, patent application, and technical standard were specifically and individually indicated to be incorporated by reference.

(Appendix 1)
A system comprising:
an acquisition unit that acquires voice data;
an analysis unit that analyzes the voice data acquired by the acquisition unit and classifies the emotion of the speaker;
a filtering unit that converts negative emotional voice into rational voice based on the emotion classified by the analysis unit; and
a provision unit that provides the voice converted by the filtering unit to a user.
(Appendix 2)
The system according to Appendix 1, wherein the analysis unit determines emotion based on characteristics of the tone, pitch, and speed of the voice.
(Appendix 3)
The system according to Appendix 1, wherein the filtering unit converts voice containing the emotion of anger into a calm tone.
(Appendix 4)
The system according to Appendix 1, further comprising a generation unit, wherein the generation unit omits negative statements and extracts and presents only important information.
(Appendix 5)
The system according to Appendix 1, wherein the provision unit provides the converted voice to the user.
(Appendix 6)
The system according to Appendix 1, wherein the acquisition unit estimates the user's emotion and adjusts the timing of acquiring voice data based on the estimated emotion.
(Appendix 7)
The system according to Appendix 1, wherein the acquisition unit analyzes the user's past call history and selects an optimal acquisition method.
(Appendix 8)
The system according to Appendix 1, wherein the acquisition unit, when acquiring voice data, performs filtering based on the user's current work situation or field of interest.
(Appendix 9)
The system according to Appendix 1, wherein the acquisition unit estimates the user's emotion and determines the priority of the voice data to be acquired based on the estimated emotion.
(Appendix 10)
The system according to Appendix 1, wherein the acquisition unit, when acquiring voice data, preferentially acquires highly relevant data based on the user's geographical location information.
(Appendix 11)
The system according to Appendix 1, wherein the acquisition unit, when acquiring voice data, analyzes the user's SNS activity and acquires related data.
(Appendix 12)
The system according to Appendix 1, wherein the analysis unit estimates the user's emotion and adjusts how the analysis is expressed based on the estimated emotion.
(Appendix 13)
The system according to Appendix 1, wherein the analysis unit, during analysis, adjusts the level of detail of the analysis based on the importance of the voice data.
(Appendix 14)
The system according to Appendix 1, wherein the analysis unit, during analysis, applies different analysis algorithms depending on the category of the voice data.
(Appendix 15)
The system according to Appendix 1, wherein the analysis unit estimates the user's emotion and adjusts the length of the analysis based on the estimated emotion.
(Appendix 16)
The system according to Appendix 1, wherein the analysis unit, during analysis, determines the priority of analysis based on when the voice data was acquired.
(Appendix 17)
The system according to Appendix 1, wherein the analysis unit, during analysis, adjusts the order of analysis based on the relevance of the voice data.
(Appendix 18)
The system according to Appendix 1, wherein the filtering unit estimates the user's emotion and adjusts the filtering criteria based on the estimated emotion.
(Appendix 19)
The system according to Appendix 1, wherein the filtering unit, during filtering, improves the accuracy of filtering based on the interrelationships among the voice data.
(Appendix 20)
The system according to Appendix 1, wherein the filtering unit, during filtering, takes into account attribute information of the person who submitted the voice data.
(Appendix 21)
The system according to Appendix 1, wherein the filtering unit estimates the user's emotion and adjusts the order in which filtering results are displayed based on the estimated emotion.
(Appendix 22)
The system according to Appendix 1, wherein the filtering unit, during filtering, takes into account the geographical distribution of the voice data.
(Appendix 23)
The system according to Appendix 1, wherein the filtering unit, during filtering, improves the accuracy of filtering based on literature related to the voice data.
(Appendix 24)
The system according to Appendix 1, wherein the provision unit estimates the user's emotion and adjusts the display method of the provision based on the estimated emotion.
(Appendix 25)
The system according to Appendix 1, wherein the provision unit, at the time of provision, refers to the user's past operation history and selects an optimal display method.
(Appendix 26)
The system according to Appendix 1, wherein the provision unit estimates the user's emotion and adjusts the operation procedure of the provision based on the estimated emotion.
(Appendix 27)
The system according to Appendix 1, wherein the provision unit, at the time of provision, takes into account the user's device information and selects an optimal display method.
(Appendix 28)
The system according to Appendix 1, wherein the generation unit estimates the user's emotion and determines the priority of the voice to be generated based on the estimated emotion.
(Appendix 29)
The system according to Appendix 1, wherein the generation unit, during generation, takes into account the interrelationships among the voice data to improve the accuracy of generation.
(Appendix 30)
The system according to Appendix 1, wherein the generation unit, during generation, takes into account attribute information of the person who submitted the voice data.
(Appendix 31)
The system according to Appendix 1, wherein the generation unit estimates the user's emotion and adjusts the display method of the generated voice based on the estimated emotion.
(Appendix 32)
The system according to Appendix 1, wherein the generation unit, during generation, takes into account the geographical distribution of the voice data.
(Appendix 33)
The system according to Appendix 1, wherein the generation unit, during generation, refers to literature related to the voice data to improve the accuracy of generation.
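The determination of emotion from tone, pitch, and speed (Appendix 2) could take many forms; the disclosure refers to emotion recognition AI without fixing a method. As a deliberately simple, rule-based sketch under stated assumptions, the example below flags an utterance as angry when at least two of three prosodic features exceed a speaker baseline. The `Prosody` fields, the use of signal energy as a rough stand-in for tone, the `ratio` threshold, and the two labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prosody:
    mean_pitch_hz: float     # average fundamental frequency of the segment
    energy: float            # rough stand-in for tone (assumption), arbitrary units
    words_per_minute: float  # speaking rate


def classify_prosody(sample: Prosody, baseline: Prosody, ratio: float = 1.2) -> str:
    """Return "anger" when at least two features exceed the baseline by `ratio`."""
    raised = sum([
        sample.mean_pitch_hz > baseline.mean_pitch_hz * ratio,
        sample.energy > baseline.energy * ratio,
        sample.words_per_minute > baseline.words_per_minute * ratio,
    ])
    return "anger" if raised >= 2 else "neutral"


# Hypothetical measurements for one speaker.
BASELINE = Prosody(mean_pitch_hz=180.0, energy=0.5, words_per_minute=150.0)
HEATED = Prosody(mean_pitch_hz=240.0, energy=0.8, words_per_minute=200.0)
STEADY = Prosody(mean_pitch_hz=185.0, energy=0.5, words_per_minute=155.0)
```

Comparing against a per-speaker baseline rather than fixed thresholds avoids labelling a naturally fast or high-pitched speaker as angry, although a trained model would normally replace such hand-set rules.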

10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot

Claims

1. A system comprising:
an acquisition unit that acquires voice data;
an analysis unit that analyzes the voice data acquired by the acquisition unit and classifies the emotion of the speaker;
a filtering unit that converts negative emotional voice into rational voice based on the emotion classified by the analysis unit; and
a provision unit that provides the voice converted by the filtering unit to a user,
wherein the acquisition unit estimates the user's emotion and adjusts the timing of acquiring voice data based on the estimated emotion.

2. The system according to claim 1, wherein the analysis unit determines emotion based on characteristics of the tone, pitch, and speed of the voice.

3. The system according to claim 1, wherein the filtering unit converts voice containing the emotion of anger into a calm tone.

4. The system according to claim 1, further comprising a generation unit, wherein the generation unit omits negative statements and extracts and presents only important information.
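The claims do not specify how the acquisition timing depends on the estimated emotion. As one hypothetical reading, the sketch below lengthens the wait before the next voice segment is acquired as the user's estimated stress rises. The function name, the `user_stress` score in the range 0 to 1, and the linear scaling are assumptions made only for illustration.

```python
def next_capture_delay(base_seconds: float, user_stress: float) -> float:
    """Return the wait, in seconds, before the next voice segment is acquired.

    user_stress is a hypothetical score in [0, 1] from some emotion estimator;
    values outside that range are clamped. The delay grows linearly from
    base_seconds (no stress) to twice base_seconds (maximum stress).
    """
    stress = min(1.0, max(0.0, user_stress))
    return base_seconds * (1.0 + stress)
```

For instance, with a base interval of 2.0 seconds, a stress score of 0.5 yields a 3.0-second wait, and any score of 1.0 or above yields 4.0 seconds.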