JP4036966B2

JP4036966B2 - Navigation system and method, and recording medium recording navigation software

Info

Publication number: JP4036966B2
Application number: JP14865498A
Authority: JP
Inventors: 順大塚; 丘並木
Original assignee: Clarion Co Ltd
Current assignee: Faurecia Clarion Electronics Co Ltd
Priority date: 1998-05-29
Filing date: 1998-05-29
Publication date: 2008-01-23
Anticipated expiration: 2018-05-29
Also published as: JPH11337363A

Description

【０００１】
【発明の属する技術分野】
この発明は、ナビゲーションにかかわる技術の改良に関するもので、より具体的には、メニューの階層や発話順序と関係なく、したい操作を音声認識で容易に指示できるようにしたものである。
【０００２】
【従来の技術】
近年、自動車に代表される移動体の道案内を自動的に行う電子機器として、ナビゲーションシステムが知られている。ナビゲーションシステムは、人工衛星からの電波やジャイロなどを使って、搭載している自動車（自車）の現在位置（自車位置）を計算し、液晶表示パネルなどの表示画面で、自車位置を地図上でコンピュータグラフィックス表示しながら、次にどこをどちらへ曲がればよいといった道案内をするものである。
【０００３】
また、このようなナビゲーションシステムに音声認識の技術を適用し、あらかじめ決められたいろいろな語句を発話（発声）することで、ナビゲーションシステムを操作するようにした例も知られている。このような音声認識では、関連する技術もいろいろ提案されているが、典型的には、認識しようとする語句ごとの特徴を表す特徴パラメータや波形などのデータ、すなわち認識用辞書をあらかじめ用意しておく。そして、マイクロホンから入力される音声をデジタル信号に変換し、上に述べた認識用辞書とパタンマッチングすることで、発話された語句を認識する。
【０００４】
そして、このような音声認識でナビゲーションシステムを制御する場合、従来では、いろいろな操作を階層構造のメニューから呼び出す構成となっていた。このような構成では、メニューの階層構造を辿りながら、ナビゲーションシステムからのアナウンスなどに従って順次、段階を踏んで決められた語句を発話することで、必要とする機能に到達しなければならない。
【０００５】
また、ナビゲーションシステムのこのような操作を実現させるにあたって、上に述べた認識用辞書は、メニューの階層構造に応じた複数のテーブルから構成されていた。そして、メニューのどの位置にいるかに対応したテーブルが読み込まれ、そのテーブル中の語句のみが音声認識の対象となっていた。ここで、図１１は、従来の認識用辞書におけるテーブルの構成を示す概念図である。
【０００６】
この例では、自車のあたりの地図が表示画面に表示されているようないわゆる基本画面では、テーブルＴ１０に登録されている「メニュー」などの語句だけが認識の対象となる。そして、例えば「ヘッディングアップ」という操作をするためには、基本画面でまず「メニュー」と発話することで、メニューの内容を大まかに分けたテーブルＴ２０を認識対象とし、その状態で今度は「セッテイ」と発話する。同様にテーブルＴ３０、テーブルＴ４０と辿ってテーブルＴ５０が認識対象になったときに初めて「ヘッディングアップ」と発話することができる。
【０００７】
つまり、従来では、このように基本画面から１つ目の発話をした後、その後どのような順序でどのような語句を発話できるかという流れは、メニュー階層に応じて固定されていた。また、どのような語句が認識対象になるかという点でも、例えば基本画面から発話出来る言葉はあらかじめ決まっており、「Ａ」→「Ｂ」という順番で語句を発話する事が出来ても、「Ｂ」→「Ａ」という逆の順番で発話する事は出来なかった。
【０００８】
【発明が解決しようとする課題】
しかし、上に述べたような従来技術では、１つの操作をするためにも複数の語句を決まった順番で発話しなければならないため、効率良く操作できないという問題があった。すなわち、ある機能を利用するために、アナウンスなどに従って複数の語句を発話してゆくような操作の仕方は、例えば、初心者にはわかりやすいが、慣れるに従ってその中間の操作は必要性が少なくなる。このため、むしろ早く次の操作をしたくなるものであり、もっと効率良く操作できるインタフェースが潜在的に待ち望まれていた。
【０００９】
さらに、上に述べたような従来技術は、１つ目のコマンドとなる語句が認識されることで、それに続くある固定された操作の流れ（フロー）が実行され、このフローを辿ることで目的の操作や機能に辿りつくものである。このため、いろいろな操作について、その操作に辿り着くための入り口となる１つ目のコマンドはユーザー自身が覚えていなければならず、このような負担の軽減も潜在的に求められていた。
【００１０】
この発明は、上に述べたような従来技術の問題点を解決するために提案されたもので、その目的は、メニューの階層や発話順序と関係なく、したい操作を音声認識で容易に指示できるナビゲーションの技術を提供することである。
【００１１】
【課題を解決するための手段】
上に述べた目的を達成するため、請求項１のナビゲーションシステムは、メニュー階層にしたがってあらかじめ決められた語句を順番に音声認識することで、どの操作をするかを選ぶための第１の選択手段と、前記メニュー階層の最下層で使う語句をその最下層以外で音声認識することで、その語句に応じた操作を選ぶための第２の選択手段と、選ばれた操作に応じた処理を行う処理手段と、を備え、前記第２の選択手段は、前記最下層で使う語句をその最下層以外で音声認識した場合には、現在の階層より一つ下の階層のメニューを表示する操作を選択するものであり、前記第１の選択手段は、前記最下層で使う語句をその最下層で音声認識した場合には、当該認識した語句に対応した操作を選択するものであり、前記処理手段は、前記選ばれた操作に応じた処理に加え、前記第１の選択手段において現在の階層より一つ下の階層のメニューを表示する操作が選択された場合に、当該操作に対応するメッセージを表示器に表示することを特徴とする。
【００１２】
請求項２のナビゲーション方法は、請求項１の発明を方法という見方からとらえたもので、メニュー階層にしたがってあらかじめ決められた語句を順番に音声認識することで、どの操作をするかを選ぶための第１の選択ステップと、前記メニュー階層の最下層で使う語句をその最下層以外で音声認識することで、その語句に応じた操作を選ぶための第２の選択ステップと、選ばれた操作に応じた処理を行う処理ステップと、を含み、前記第２の選択ステップは、前記最下層で使う語句をその最下層以外で音声認識した場合には、現在の階層より一つ下の階層のメニューを表示する操作を選択する処理を含み、前記第１の選択ステップは、前記最下層で使う語句をその最下層で音声認識した場合には、当該認識した語句に対応した操作を選択する処理を含み、前記処理ステップは、前記選ばれた操作に応じた処理に加え、前記第１の選択ステップにおいて現在の階層より一つ下の階層のメニューを表示する操作が選択された場合に、当該操作に対応するメッセージを表示器に表示する処理を含むことを特徴とする。
【００１３】
請求項３のナビゲーション用ソフトウェアを記録した記録媒体は、請求項１及び２の発明を記録媒体に記憶されたソフトウェアという見方からとらえたもので、コンピュータを使ってナビゲーションを行うためのナビゲーション用ソフトウェアを記録した記録媒体において、そのソフトウェアは前記コンピュータに、メニュー階層にしたがってあらかじめ決められた語句を順番に音声認識することで、どの操作をするかを選ぶための第１の選択させる処理と、前記メニュー階層の最下層で使う語句をその最下層以外で音声認識することで、その語句に応じた操作を選ぶための第２の選択させる処理と、選ばれた操作に応じた処理とを、実行させ、前記第２の選択させる処理には、前記最下層で使う語句をその最下層以外で音声認識した場合には、現在の階層より一つ下の階層のメニューを表示する操作を選択させる処理を含み、前記第１の選択させる処理は、前記最下層で使う語句をその最下層で音声認識した場合には、当該認識した語句に対応した操作を選択させる処理を含み、前記選ばれた操作に応じた処理は、前記第１の選択させる処理において現在の階層より一つ下の階層のメニューを表示する操作が選択された場合に、当該操作に対応するメッセージを表示器に表示させる処理を含むことを特徴とする。
【００１４】
以上のような態様では、メニュー階層の最下層で使う語句について、本来のその最下層で言われたときと、それ以外から直接名指しされたときとで、違った処理を行う。例えば、ある画面表示の仕方を指す語句について、メニュー階層の最下層で言われたときは、他の選択肢と一緒に表示することで他の選択肢も検討する機会を与えるが、地図表示の画面などから直接名指しされたときは、それにしたがって画面表示を切り換えたうえ、そのことをメッセージ出力する、といった例が考えられる。このように、同じ語句でもどのような状態で言われたかによって違った処理が行われるので、使い勝手をきめ細かく改善できる。
【００１７】
【発明の実施の形態】
次に、この発明のナビゲーションシステムの実施の形態（以下「実施形態」という）について、図面を参照して具体的に説明する。なお、以下の説明で使うそれぞれの図について、それより前で説明した図と同じ部材や同じ種類の部材については同じ符号をつけ、説明は省略する。
【００１８】
また、この実施形態は、いろいろなハードウェア装置と、ソフトウェアによって制御されるコンピュータとを使って実現される。この場合、そのソフトウェアは、この明細書の記載にしたがった命令を組み合わせることで作られ、上に述べた従来技術と共通の部分には従来技術で説明した手法も使われる。また、そのソフトウェアは、プログラムコードだけでなく、プログラムコードの実行のときに使うために予め用意されたデータも含む。そして、そのソフトウェアは、ナビゲーションシステムに組み込まれたＣＰＵ、各種チップセットといった物理的な処理装置を活用することでこの発明の作用効果を実現する。
【００１９】
但し、この発明を実現する具体的なハードウェアやソフトウェアの構成はいろいろ変更することができる。例えば、回路の構成やＣＰＵの処理能力に応じて、ある機能を、ＬＳＩなどの物理的な電子回路で実現する場合も、ソフトウェアによって実現する場合も考えられる。また、ソフトウェアを使う部分についても、ソフトウェアの形式には、コンパイラ、アセンブラ、マイクロプログラムなどいろいろ考えられる。また、この発明を実現するソフトウェアを記録した記録媒体は、それ単独でもこの発明の一態様である。
【００２０】
以上のように、コンピュータを使ってこの発明を実現する態様はいろいろ考えられるので、以下では、この発明や実施形態に含まれる個々の機能を実現する仮想的回路ブロックを使って、この発明と実施形態とを説明する。
【００２１】
〔１．第１実施形態の構成〕
第１実施形態は、メニュー階層を辿って順次発話する操作と、メニュー階層とは関係なく、したい操作を表す語句を直接発話する操作の両方が可能なものである。すなわち、この第１実施形態では、図３に示すように、地図を表示している画面から、「メニュー」→「画面表示」→「進行方向」→「ヘディングアップ」というように順次、階層を追って操作することもできるし、また、図４に示すように、地図を表示している画面から直接「ヘッディングアップ」と言ってショートカットすることもできる。
【００２２】
〔１−１．全体の構成〕
まず、図１は、この実施形態をどのようなハードウェアの上に構成することができるかの例を示す概念図である。この例は、音声認識ユニットＶと、ナビゲーションユニットＮとを互いに接続するもので、この実施形態の実現に使うハードウェアの一部を示すものである。
【００２３】
このうち音声認識ユニットＶ内のＲＯＭ（Ｖ１）には認識用辞書Ｖ２が記憶されている。また、ナビゲーションユニットＮ内にはＣＤ−ＲＯＭドライブがあり、このＣＤ−ＲＯＭドライブに挿入されているＣＤ−ＲＯＭ（Ｎ１）には、ナビゲーションシステムの機能を実現するアプリケーションプログラムのほか、各種データを含むデータベースＮ３が記録されている。また、この実施形態のナビゲーションシステムの機能を実現するためのアプリケーションプログラムＮ４は、ＣＤ−ＲＯＭ（Ｎ１）から読み出され、フラッシュメモリＮ２に記録される。
【００２４】
また、図２は、図１に示したようなハードウェアを使って実現されるこの実施形態について、具体的な構成を示す機能ブロック図である。すなわち、この実施形態は、絶対位置・方位検出部１と、相対方位検出部２と、車速検出部３と、メインＣＰＵ及びその周辺回路４と、メモリ群Ｍと、表示部１０と、入力部１１と、ＣＤ−ＲＯＭ制御部１２と、ＦＭ多重受信及び処理部１３と、音声認識部１４と、を備えている。
【００２５】
このうち、絶対位置・方位検出部１は、ＧＰＳ衛星から送られてくるＧＰＳ電波を受信することで、自動車（自車）の現在位置について地表での絶対的な位置座標や方位を計算する手段である。また、相対方位検出部２は、ジャイロなどを使って自動車の相対的な方位を検出する部分である。また、車速検出部３は、自動車より得られる車速パルスを処理することで、車の速度を計算する部分である。
【００２６】
また、ＣＤ−ＲＯＭ制御部１２は、ナビゲーションシステム用のプログラムや道路地図データなど各種の情報をＣＤ−ＲＯＭから読み出す部分である。また、ＦＭ多重受信及び処理部１３は、複数のアンテナを受信状態に応じて切り換えることで、ラジオのＦＭ放送を受信する部分である。また、音声認識部１４は、認識しようとする語句ごとの特徴を表す特徴パラメータや波形のデータ、すなわち認識用辞書４３を備えた音声認識ユニットであり、マイクロホンから入力される音声をデジタル信号に変換し、認識用辞書４３とパタンマッチングすることで、発声された語句を認識する部分である。
【００２７】
〔１−２．メインＣＰＵ及びその周辺回路の役割〕
また、メインＣＰＵ及びその周辺回路４は、ナビゲーションシステム全体を制御する制御回路の役割を果たす部分であり、上に述べたソフトウェアの作用によって、第１の選択部４１と、第２の選択部４４と、処理部４５と、の役割を果たす。
【００２８】
このうち、第１の選択部４１は、メニュー階層にしたがってあらかじめ決められた語句を順番に音声認識することで、どの操作をするかを選ぶための第１の選択手段である。なお、ここでは音声認識は音声認識部１４が行い、上に述べたメニュー階層の内容は、メニュー階層データ４２として用意されているものとする。
【００２９】
また、第２の選択部４４は、上に述べたメニュー階層の最下層で使う語句を、その最下層以外で音声認識することで、その語句に応じた操作を選ぶための第２の選択手段である。また、処理部４５は、第１の選択部４１又は第２の選択部４４によって選ばれた操作に応じた処理を行う処理手段である。なお、この処理部４５は、上に述べた最下層で使う語句については、その最下層で音声認識されたか、その最下層以外で音声認識されたかに応じて、異なった処理を行うように構成されている。
【００３０】
なお、図示はしないが、メインＣＰＵ及びその周辺回路４は上に述べたほかに、計算された自車位置を利用して、自動車の現在位置をコンピュータグラフィックスの地図上で表示したり、ユーザの指示に応じて道案内を行ったりといったナビゲーションの各種処理を行う他の処理部を備えていて、また、Ｉ／Ｏ制御回路やドライバなどを含む図示しないユーザインタフェースを使って、いろいろな情報の入出力を行うように構成されている。また、メインＣＰＵ及びその周辺回路４は、このような各種処理を行う際、メモリ群Ｍと、表示部１０と、入力部１１とを利用する。
【００３１】
〔１−３．その他の構成〕
すなわち、メモリ群Ｍは、この実施形態のナビゲーションシステムが動作するのに必要な各種のメモリ、すなわち、ＢＩＯＳやブートアッププログラムなどを格納しているＲＯＭ５、ワークエリアなどに使うＤＲＡＭ６、キャッシュやバッファなどに使うＳＲＡＭ７、ビデオ表示などに使うＶＲＡＭ８を含んでいる。
【００３２】
また、表示部１０は、地図や操作メニューなど各種の情報を液晶表示パネルや音声合成装置などを使って出力するための部分であり、入力部１１は、ユーザが命令や目的地などさまざまな情報を入力するための部分である。
【００３３】
〔２．第１実施形態の作用〕
上に述べたように構成された第１実施形態は、次のように作用する。
（１）まず、認識用辞書４３にはメニューの各階層で使う語句のデータを全て含ませておき、メニューの階層を辿って操作しているときも、常に全ての語句が認識可能な状態にしておく。
【００３４】
（２）ユーザーからの発声があると、音声認識部１４は認識用辞書４３を使って発話された語句を認識し、その語句の番号を第１の選択部４１及び第２の選択部４４に渡す。ここで、第１の選択部４１及び第２の選択部４４の動作は、ナビゲーションシステムの機能を実現するアプリケーションプログラムによって実現される。
【００３５】
（３）すなわち、これらアプリケーションプログラムには、語句番号として何番を受け取ったかに応じた処理をあらかじめプログラミングしておき、その動きを実行する。この場合、語句の番号ごとに、どのような動作をプログラミングしておくかは、基本的には、番号と１対１にしておく。
【００３６】
例えば、メニュー階層のなかほどで「システム」という語句を発話できるメニューがあり、その語句を発話すると１つ下の階層のメニューが表示されるとする。このような場合、この「システム」という語句が上に述べたその「なかほど」のメニュー階層で発話されたときも、それ以外の例えば自車周辺の地図を表示している状態で発話されたときも、いずれも上に述べた同じ「１つ下の階層のメニュー」が表示されるようにしておく。
【００３７】
但し、メニュー階層の最下層の語句は、具体的な操作を行うものであるから、地図上から直接言われたか、またはメニュー上の他の言葉の次に言われたかによって動きを変える様にプログラミングしておく。
【００３８】
例えば、図３（ａ）〜（ｅ）は、メニュー階層にしたがって、決められた語句を順次発話することで「ヘッディングアップ」という操作に辿り着く流れを示す概念図であり、「進行方向」を発話することで表示された画面（ｄ）では、「ヘッディングアップ」の他に「ノースアップ」という選択肢があることが表示されている。
【００３９】
一方、図４（ａ），（ｂ）は、自車周辺の道路地図を表示している画面から、直接「ヘッディングアップ」と発話するだけで、「ヘッディングアップ」という操作を容易に行う例を示すもので、この例では「ヘッディングアップ」と発話した後の画面（ｂ）では、上に述べた他の選択肢は表示されず、進行方向をヘッディングアップにした旨のメッセージが表示されている。
【００４０】
（４）なお、データベースＮ３には、上に述べたアプリケーションプログラムが与えられた語句の番号に対応する処理を実行するときに必要な情報、例えば画面上に出力するメッセージなどを語句の番号ごとに対応させて格納しておき、アプリケーションプログラムは、ＣＤ−ＲＯＭ制御部１２を使ってこのデータを適宜、読み込みながら語句に応じた処理を実行する。
【００４１】
ここで、図５は、認識用辞書４３を使って認識された語句の番号が、上に述べたアプリケーションプログラムによって処理される例を示す概念図である。この例では、認識用辞書４３を使って認識された「ガメンヒョウジ」という語句は８番、「ヘッディングアップ」という語句は１９番という語句の番号でアプリケーションプログラムに渡される。
【００４２】
そして、このアプリケーションプログラムの働きによって、８番の「ガメンヒョウジ」についてはメニューの次の階層が表示される。一方、１９番の「ヘッディングアップ」については、上に述べたように、地図上から直接発声されたかどうかに応じて違った画面が表示されることになるが、このような場合に使う「進行方向をヘッディングアップにしました。」といったメッセージなどの情報は、語句の番号ごとに対応させて、データベースＮ３に格納されている。
【００４３】
なお、図５において８番の「ガメンヒョウジ」について示したように、メニュー階層の途中の状態から次の状態を呼び出すための語句を認識させ、その語句に応じた処理を行わせることもできる。この場合、メニューのある部分を呼び出すためのある語句は、メニュー階層のなかでは、ある特定の状態で発話されるべきという対応関係がある。そして、このような語句が、本来対応している状態で発話されれば、第１の選択部４１がその発話を音声認識部１４を使って音声認識するが、その他の状態、例えば自車周辺の地図を表示しているような状態からそのような語句が発話された場合は、第２の選択部４４がその語句を音声認識部１４を使って音声認識する。
【００４４】
〔３．第１実施形態の効果〕
以上のように、第１実施形態では、不慣れな初心者でも、メニューの階層構造を順番に辿るという基本的な操作の仕方で、語句の発話順序に戸惑うことなく容易に操作することができる。一方、慣れたユーザは、このようなメニューの階層構造を順番に辿る必要はない。すなわち、メニューの途中かどうかといった画面状態などに関係なく、最下層で最終的に具体的な操作を選ぶときの語句を、１つのコマンドのように直接発話することで、所望の機能を簡単に使える。このため、階層的なメニューをショートカット（省略／飛ばすこと）でき、使い勝手が向上する。
【００４５】
また、第１実施形態では、メニュー階層の途中のある状態から次の状態を呼び出すための語句を、その状態とは別の状態や、メニュー階層以外の画面表示の状態などで使うことができる。このため、例えばメニュー階層のなかで使いたい途中の部分を、上位階層を通らずに直接呼び出したり、メニュー階層のある部分から並列する別の部分に、上位階層に戻らずに移動したりできるので、いろいろな操作の仕方が可能になる。
【００４６】
また、第１実施形態では、メニュー階層の最下層で使う語句について、本来のその最下層で言われたときと、それ以外から直接名指しされたときとで、違った処理を行う。すなわち、同じ語句でもどのような状態で言われたかによって違った処理が行われるので、使い勝手をきめ細かく改善できる。
【００４７】
〔４．第２実施形態〕
第２実施形態は、第１実施形態とおおむね同じ構成のナビゲーションシステムにおいて、あらかじめ決められた１つの語句を言うだけで、メニュー階層であれば複数の語句を順番に発話しなければならないのと同じ処理をするようにしたものである。ここでは、目的地を検索する処理の１つとして、自車の周辺に存在する施設をリストアップする例を示す。
【００４８】
まず、この第２実施形態では、図６に例示するように、複数のテーブルを含む構成の認識用辞書４３１を用意する。そして、一例として、自車「シュウヘン」の「カーヨウヒンテン」を検索して表示させるなどの処理を行う場合、例えば、メニュー階層にしたがう通常の操作フローでは、図６のテーブルＴ１１の「シュウヘン」と言う語句が認識されると、その番号が、音声認識ユニットＶからナビゲーションユニットＮのアプリケーションプログラムＮ４に渡される（図１）。
【００４９】
そして、ＣＤ−ＲＯＭ内のデータベースＮ３には、認識された語句の番号に対応させて次に認識対象にするテーブル名称（この場合はテーブルＴ２１）や表示すべきメッセージ文字列などを格納させておき、アプリケーションプログラムＮ４は、それらの情報を読み込んで、画面上もしくは音声でメッセージを出力したり、音声認識ユニットＶに次の認識対象テーブル（この場合はテーブルＴ２１）を知らせる。音声認識ユニットＶは、テーブルＴ２１を次の認識対象テーブルとしてメモリに読み込み、コマンド待ちをする。
【００５０】
このように新しく認識対象とされたテーブルＴ２１には施設などの項目名が列挙されている。この場合、上に述べたようにアプリケーションプログラムＮ４から出力される画面上もしくは音声のメッセーとしては、次に「カーヨウヒンテン」などの項目名の発話を誘導するためのメッセージ、例えば、「次に項目名を言ってください。」などを出力し、ユーザーは、そのメッセージに従い、「カーヨウヒンテン」の様に項目名を発話する。
【００５１】
このように「カーヨウヒンテン」が認識されると、アプリケーションプログラムＮ４は、先程と同様にＣＤ−ＲＯＭ（Ｎ１）上のデータベースＮ３から必要な情報を取得して、メッセージなどを出力し、同時に周辺のカー用品店をリストアップする。なお、このようにメニュー階層にしたがう処理の流れを図７の左側に示す。
【００５２】
また、この第２実施形態では、上に述べたようなメニュー階層にしたがう操作をショートカットするため、図６に示すように、テーブルＴ１１内には「シュウヘンノカーヨウヒンテン」といった語句も用意しておく。この語句は、メニュー階層にしたがう場合は複数の語句を順次発話しなければならない一連の処理を、１つの語句にまとめたものである。
【００５３】
そして、例えば、ユーザがこの「シュウヘンノカーヨウヒンテン」を発話して認識されると、アプリケーションプログラムＮ４は、先程と同様にＣＤ−ＲＯＭ上のデータベースＮ３から必要な情報を取得し、周辺のカー用品店をリストアップするなどの処理を行う。
【００５４】
また、図７の左側に示した手順は、周辺のカー用品店を検索した（ステップ７１〜７３）結果のリストから（ステップ７４）、現在地に一番近い店をユーザが選んで（ステップ７５）、その店を目的地として設定する例である（ステップ７６〜７８）。そこで、これと同様の処理を、現在地に一番近い店を自動的に選ぶことで行うための語句として、「イチバンチカイカーヨウヒンテン」という語句が認識用辞書４３１のテーブル１に格納されているものとする。
【００５５】
この語句を使う場合、図７の右側に示すように、「イチバンチカイカーヨウヒンテン」と発話するだけで（ステップ８１）、自車に一番近いカー用品店が検索され（ステップ８３）、その店を目的地として経路が設定される（ステップ８８）。この結果、図７左側に示したメニュー階層にしたがう例と比べた場合、ステップ７１，７２，７５，７７はわずか１つのステップ８１によって済み、ステップ７４，７６は処理の内容からいって不要となる。
【００５６】
以上のように、第２実施形態では、いくつもの語句を発話して行うような処理をしたいとき、あらかじめ決めた１つの語句を発話することで済ませることができるので、ナビゲーションシステムを操作が効率化される。
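The single-phrase shortcut of the second embodiment can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the facility data, the function names, and the romanized phrase spellings are assumptions, and the step-by-step table switching of FIG. 6 and FIG. 7 is left out.

```python
# Hypothetical facility data: (name, category, distance from the vehicle in km).
FACILITIES = [
    ("Shop A", "car accessory shop", 2.4),
    ("Shop B", "car accessory shop", 0.8),
    ("Station C", "gas station", 0.3),
]

def nearby(category):
    """List facilities of one category, nearest first."""
    matches = [f for f in FACILITIES if f[1] == category]
    return sorted(matches, key=lambda f: f[2])

def set_destination(facility):
    """Stand-in for route setting; returns a description of the result."""
    return f"route set to {facility[0]}"

# Single phrases that stand for a whole multi-step menu flow.
# The keys are romanizations of the phrases quoted in the description.
COMPOSITE_PHRASES = {
    "shuuhen no kaa youhinten": lambda: nearby("car accessory shop"),
    "ichiban chikai kaa youhinten":
        lambda: set_destination(nearby("car accessory shop")[0]),
}

def on_phrase(phrase):
    """Run the action bound to a composite phrase, or return None if unknown."""
    action = COMPOSITE_PHRASES.get(phrase)
    return action() if action else None
```

With this arrangement, one utterance replaces the sequence of utterances and the manual selection from the result list that the menu flow would require.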
【００５７】
〔５．第３実施形態〕
第３実施形態は、上に述べた第１実施形態及び第２実施形態とおおむね同じ構成を持ったナビゲーションシステムにおいて、どの語句とどの語句とが相前後して認識されたかに応じて、あらかじめ決められた操作を選んだりエラー処理を行うようにしたものである。この第３実施形態では、、第２実施形態と同じように自車周辺の施設を検索する処理を例にとって、語句を発声する順序に柔軟性を持たせた例を示す。
【００５８】
ここで、図８は、第３実施形態における認識用辞書４３２を構成するテーブルを示す概念図であり、図６に示した第２実施形態における２つのテーブルに、それぞれ「カーヨウヒンテン」「シュウヘン」という語句Ｇ１，Ｇ２が追加されたものである。なお、これらのテーブルのうち、ナビゲーションシステムが起動された初期状態では、第１のテーブルＴ１２が音声認識の対象となる。
【００５９】
そしてこの第１のテーブルＴ１２に登録された語句のうちあらかじめ決められた語句、すなわちここでは「シュウヘン」が音声認識されると、音声認識の対象を第２のテーブルＴ２２に切り換え、次に音声認識された語句についてはこの第２のテーブルＴ２２に基づいてあらかじめ決められた操作を選び又はエラー処理を行う。
【００６０】
ここで、第３実施形態において、テーブルを使ってこのような処理を行う処理手順を一般化して図９に示す。すなわち、図８の例ではテーブルの数は２つであるが、テーブルの数は３つ以上でもよく、このような場合も含めて図９で説明する。なお、ここではテーブルの記号を抽象化し、仮に１，２，３で示す。
【００６１】
すなわち、テーブル１の語句をユーザが発話し（ステップ９１）、それがある語句Ｗ１であれば音声認識の対象をテーブル２に切り換え（ステップ９３）、ユーザがこのテーブル２の語句を発声するのに応じて（ステップ９４）、処理が行われる（ステップ９５）。そして、例えばテーブル２の他にテーブル３もある場合は、別の語句Ｗ２が発話されたときにテーブル３を使った処理が行われる（ステップ９７〜）。
【００６２】
次に、図１０は、図９に一般化して示した手順を、第３実施形態における実例にあてはめた場合の具体的な手順を示すフローチャートである。すなわち、この手順では、最初にテーブルＴ１２内の「シュウヘン」という語句が認識されることで（ステップ１０１）、テーブルＴ２２が音声認識の対象となるが（ステップ１０２，１０３）、この状態で、次の語句が例えば「カーヨウヒンテン」であれば（ステップ１０４）、自車周辺のカー用品店の検索といった処理が行われるが（ステップ１０５）、次の語句として例えば「シュウヘン」という語句がもう１度続けて認識されると、アプリケーションプログラムがでビープ音を鳴らすなどして警報処理（ステップ１０６）してステップ１０４へ戻る。
【００６３】
また、図１０に例示したのとは逆に、例えば図８に示したテーブルＴ１２内の「カーヨウヒンテン」という語句が先に認識されると、続く語句として、テーブルＴ２２内の「シュウヘン」という語句は意味が通じる。このため、「シュウヘン」と発話された場合は周辺のカー用品店をリストアップするなどの処理が行われるが、これとは逆に、テーブルＴ１２内から発話したのと同じ語句「カーヨウヒン」が再び発話された場合はエラー処理が行われる。つまり、第３実施形態におけるプログラミングでは、一つ前に認識された語句をアプリケーションプログラムが監視し、それによって次に認識された語句に基づく動作を変えるように、語句毎にプログラミングされている。
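The idea of watching the previously recognized phrase, as described for the third embodiment, can be sketched as below. This is a simplification under stated assumptions: the patent switches recognition tables (T12 and T22), whereas this sketch uses a single lookup of unordered phrase pairs; the class name, the pair table, and the return strings are hypothetical.

```python
# Hypothetical phrase pairs that make sense together, in either order.
VALID_PAIRS = {
    frozenset({"shuuhen", "kaa youhinten"}): "list nearby car accessory shops",
}

class OrderFreeCommands:
    def __init__(self):
        self.previous = None  # phrase recognized one step earlier

    def hear(self, phrase):
        if self.previous is None:
            self.previous = phrase
            return "waiting for next phrase"
        action = VALID_PAIRS.get(frozenset({self.previous, phrase}))
        if action is None:
            # For example the same phrase twice: warn and keep waiting.
            return "beep"
        self.previous = None
        return action
```

Because the pair is stored as an unordered set, "shuuhen" followed by "kaa youhinten" and the reverse order select the same operation, while a repeated phrase falls through to the warning.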
【００６４】
また、各テーブルに登録されたそれぞれの単語には、その単語が発話された後の処理に使うメッセージが関連づけられているので、例えば、先行する単語に応じてテーブルを切り換えたが、次の単語が認識できないとき、先行する単語の内容に応じて、違った内容の助言や催促をユーザに対して出力することができる。
【００６５】
以上のように、第３実施形態では、１つ前に認識されたのがどの語句かに応じて、次に認識された語句に対してどのような処理を行うか、また、エラーにするかなどが判断される。このため、何をしたいかの意味が通じれば、メニュー階層にしたがった発話順序にこだわらず、逆の発話順序や一部省略した発話順序などもあらかじめ決めておくことができるので、発話順序の自由度が増え、操作が容易になる。
【００６６】
また、第３実施形態では、認識された語句に応じて、認識対象となるテーブルを切り換えるという簡単な構成で、相前後して発話された語句同士の関係を判断できるので、実装が容易である。
【００６７】
また、このような発声順序にとらわれない操作の仕方によって、ユーザが発話順序について迷うことがなくなり、音声認識のユーザーインターフェースにおいて、より人間的に、融通の利く自然な操作が実現される。このため、いろいろな操作がすぐに済み、運転の安全性も一層向上する
〔６．他の実施の形態〕
なお、この発明は上に述べた実施形態に限定されるものではなく、次に例示するような他の実施の形態も含むものである。例えば、この発明のナビゲーションシステムは、自動車に搭載するいわゆるカーナビゲーションシステムだけでなく、二輪車など他の種類の移動体に使うこともできる。
【００６８】
また、メニュー階層の構造や階層数などは自由であり、例えば、メニュー階層をどの方向に進んでも「最下層」に辿り着くが、この「最下層」は、どの部分の最下層かによって、メニュー階層の入り口から数えて何階層目に当たるかが違っていてよい。また、最下層で使う語句について、その最下層で音声認識されたか、その最下層以外で音声認識されたかに応じて、異なった処理を行うように構成する場合、必ずしも全ての最下層の語句についてそのように構成する必要はない。
【００６９】
また、「あらかじめ決められた単一の語句」は、メニュー階層にしたがって複数の語句を順番に発話したときと同じ処理を単一の語句にまとめたという意味で、そのような単一の語句を同時に複数登録しておくことはもちろん可能である。また、上に述べた実施形態では、２つの語句を相前後して認識する例を示したが、３語以上の語句を認識する順序に基づいてナビゲーションシステムを制御することもできる。
【００７０】
【発明の効果】
以上のように、この発明によれば、メニューの階層や発話順序と関係なく、したい操作を音声認識で容易に指示することができる。
【図面の簡単な説明】
【図１】この発明の第１実施形態のハードウェア構成を示すブロック図。
【図２】この発明の第１実施形態の具体的な構成を示す機能ブロック図。
【図３】この発明の第１実施形態において、メニュー階層を辿って語句を順次発話するときの操作手順を示す概念図。
【図４】この発明の第１実施形態において、地図を表示している画面から、メニュー階層の最下層の語句を直接発話するときの操作手順を示す概念図。
【図５】この発明の第１実施形態において、認識された語句の番号に応じた処理が行われることを示す概念図。
【図６】この発明の第２実施形態における認識用辞書の構成を示す概念図。
【図７】この発明の第２実施形態における処理手順を示すフローチャート。
【図８】この発明の第３実施形態における認識用辞書の構成を示す概念図。
【図９】この発明の第３実施形態における処理手順を一般的に示すフローチャート。
【図１０】この発明の第３実施形態における処理手順を具体的に示すフローチャート。
【図１１】従来の認識用辞書の一例を示す概念図。
【符号の説明】
Ｖ…音声認識ユニット
Ｖ１…ＲＯＭ
Ｖ２…認識用辞書
Ｎ…ナビゲーションユニット
Ｎ１…ＣＤ−ＲＯＭ
Ｎ２…フラッシュメモリ
Ｎ３…データベース
Ｎ４…アプリケーションプログラム
１…絶対位置・方位検出部
２…相対方位検出部
３…車速検出部
４…メインＣＰＵ及びその周辺回路
Ｍ…メモリ群
５…ＲＯＭ
６…ＤＲＡＭ
７…ＳＲＡＭ
８…ＶＲＡＭ
１０…表示部
１１…入力部
１２…ＣＤ−ＲＯＭ制御部
１３…ＦＭ多重受信及び処理部
１４…音声認識部
４１…第１の選択部
４２…メニュー階層データ
４３，４３０，４３１，４３２…認識用辞書
Ｔ…テーブル
４４…第２の選択部
４５…処理部
[0001]
[Technical field of the invention]
The present invention relates to an improvement in navigation technology. More specifically, it allows the user to specify a desired operation easily by voice recognition, regardless of the menu hierarchy or the utterance order.
[0002]
[Prior art]
In recent years, navigation systems have become known as electronic devices that automatically provide route guidance for moving bodies, typically automobiles. A navigation system calculates the current position (vehicle position) of the automobile in which it is mounted (the own vehicle) using radio waves from satellites, a gyroscope, and the like. While showing the vehicle position on a computer-graphics map on a display screen such as a liquid crystal display panel, it gives route guidance such as where to turn next and in which direction.
[0003]
There is also a known example in which voice recognition technology is applied to such a navigation system, so that the system is operated by uttering (speaking) various predetermined phrases. Various related techniques have been proposed for such voice recognition, but typically a recognition dictionary, that is, data such as feature parameters and waveforms representing the features of each phrase to be recognized, is prepared in advance. The voice input from a microphone is then converted into a digital signal and pattern-matched against the recognition dictionary to recognize the spoken phrase.
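The dictionary lookup described above can be illustrated with a toy sketch. It assumes each phrase is stored as one fixed-length feature vector and uses plain Euclidean distance; the feature values, the `threshold`, and the names `RECOGNITION_DICTIONARY` and `recognize` are hypothetical, since the description does not specify the matching algorithm.

```python
import math

# Hypothetical recognition dictionary: phrase -> stored feature vector.
# The numbers are made-up placeholders, not real acoustic features.
RECOGNITION_DICTIONARY = {
    "menu": [0.9, 0.1, 0.3],
    "settei": [0.2, 0.8, 0.5],
    "heading up": [0.4, 0.4, 0.9],
}

def recognize(features, dictionary=RECOGNITION_DICTIONARY, threshold=0.5):
    """Return the phrase whose template is closest to `features`,
    or None when even the best match is farther than `threshold`."""
    best_phrase, best_distance = None, math.inf
    for phrase, template in dictionary.items():
        distance = math.dist(features, template)
        if distance < best_distance:
            best_phrase, best_distance = phrase, distance
    return best_phrase if best_distance <= threshold else None
```

Restricting recognition to one table, as in the conventional scheme, would correspond to passing a smaller `dictionary` to `recognize`.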
[0004]
When a navigation system is controlled by such voice recognition, the various operations have conventionally been called from a hierarchical menu. In such a configuration, the user must reach the required function by following the hierarchical structure of the menu and uttering the prescribed phrases one step at a time, in accordance with announcements and the like from the navigation system.
[0005]
To realize such operation of the navigation system, the recognition dictionary described above consisted of a plurality of tables corresponding to the hierarchical structure of the menu. The table corresponding to the current position in the menu was read, and only the phrases in that table were subject to voice recognition. FIG. 11 is a conceptual diagram showing the structure of the tables in a conventional recognition dictionary.
[0006]
In this example, on the so-called basic screen, where a map of the area around the vehicle is displayed, only phrases such as “menu” registered in table T10 are subject to recognition. To perform the “heading up” operation, for example, the user first says “menu” on the basic screen, which makes table T20, a rough division of the menu contents, the recognition target, and in that state then says “settei” (settings). Tables T30 and T40 are traced in the same way, and only when table T50 has become the recognition target can the user say “heading up”.
[0007]
That is, conventionally, once the first utterance had been made from the basic screen, the flow of which phrases could be uttered in which order was fixed by the menu hierarchy. The phrases subject to recognition were also fixed: for example, the words that could be uttered from the basic screen were determined in advance, and even if phrases could be uttered in the order “A” → “B”, they could not be uttered in the reverse order “B” → “A”.
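The conventional table-gated behavior can be sketched as a small state machine. This is a hypothetical reconstruction for illustration only: the table names follow FIG. 11 as described, but the phrases for tables T30 and T40 are not given in the text and appear here as placeholders, and the class and return strings are assumptions.

```python
# Hypothetical reconstruction of the conventional scheme of FIG. 11.
# Only "menu", "settei", and "heading up" appear in the text; the
# phrases for tables T30 and T40 are placeholders.
NEXT_TABLE = {
    ("T10", "menu"): "T20",
    ("T20", "settei"): "T30",
    ("T30", "<phrase in T30>"): "T40",
    ("T40", "<phrase in T40>"): "T50",
}
ACTIONS = {("T50", "heading up"): "set heading-up display"}

class ConventionalRecognizer:
    def __init__(self):
        self.active_table = "T10"  # basic (map) screen

    def hear(self, phrase):
        """Accept a phrase only if it belongs to the currently active table."""
        key = (self.active_table, phrase)
        if key in NEXT_TABLE:
            self.active_table = NEXT_TABLE[key]
            return f"now listening with table {self.active_table}"
        if key in ACTIONS:
            return ACTIONS[key]
        return "not recognized"
```

Saying “heading up” on the basic screen is rejected in this sketch, because that phrase is only reachable after the four preceding utterances.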
[0008]
[Problems to be solved by the invention]
However, in the conventional technology described above, several phrases must be uttered in a fixed order even to perform a single operation, so the system cannot be operated efficiently. A way of operating in which the user utters several phrases in accordance with announcements and the like in order to use a function is easy for beginners to understand, for example, but the intermediate steps become less necessary as the user gains experience. An experienced user instead wants to move on to the next operation quickly, and there has been a latent demand for an interface that can be operated more efficiently.
[0009]
Further, in the conventional technology described above, recognition of the phrase that serves as the first command triggers a fixed flow of operations that follows it, and the desired operation or function is reached by following this flow. For each operation, therefore, the user must remember the first command that serves as the entry point to that operation, and there has also been a latent demand to reduce this burden.
[0010]
The present invention has been proposed to solve the problems of the prior art described above, and its object is to provide navigation technology in which a desired operation can easily be specified by voice recognition regardless of the menu hierarchy and the utterance order.
[0011]
[Means for Solving the Problems]
To achieve the above object, the navigation system of claim 1 comprises: first selection means for selecting which operation to perform by voice-recognizing predetermined phrases in order according to a menu hierarchy; second selection means for selecting an operation corresponding to a phrase used in the lowest layer of the menu hierarchy by voice-recognizing that phrase outside the lowest layer; and processing means for performing processing corresponding to the selected operation. The second selection means selects an operation of displaying the menu of the layer one below the current layer when a phrase used in the lowest layer is voice-recognized outside the lowest layer. The first selection means selects the operation corresponding to the recognized phrase when a phrase used in the lowest layer is voice-recognized in the lowest layer. The processing means, in addition to the processing corresponding to the selected operation, displays a message corresponding to the operation on a display when the operation of displaying the menu of the layer one below the current layer is selected by the first selection means.
[0012]
The navigation method of claim 2 expresses the invention of claim 1 as a method, and includes: a first selection step for selecting which operation to perform by voice-recognizing predetermined phrases in order according to a menu hierarchy; a second selection step for selecting an operation corresponding to a phrase used in the lowest layer of the menu hierarchy by voice-recognizing that phrase outside the lowest layer; and a processing step of performing processing corresponding to the selected operation. The second selection step includes a process of selecting an operation of displaying the menu of the layer one below the current layer when a phrase used in the lowest layer is voice-recognized outside the lowest layer. The first selection step includes a process of selecting the operation corresponding to the recognized phrase when a phrase used in the lowest layer is voice-recognized in the lowest layer. The processing step includes, in addition to the processing corresponding to the selected operation, a process of displaying a message corresponding to the operation on a display when the operation of displaying the menu of the layer one below the current layer is selected in the first selection step.
[0013]
The recording medium recording navigation software of claim 3 expresses the inventions of claims 1 and 2 as software stored on a recording medium. In a recording medium recording navigation software for performing navigation using a computer, the software causes the computer to execute: a first selection process for selecting which operation to perform by voice-recognizing predetermined phrases in order according to a menu hierarchy; a second selection process for selecting an operation corresponding to a phrase used in the lowest layer of the menu hierarchy by voice-recognizing that phrase outside the lowest layer; and processing corresponding to the selected operation. The second selection process includes a process of selecting an operation of displaying the menu of the layer one below the current layer when a phrase used in the lowest layer is voice-recognized outside the lowest layer. The first selection process includes a process of selecting the operation corresponding to the recognized phrase when a phrase used in the lowest layer is voice-recognized in the lowest layer. The processing corresponding to the selected operation includes a process of displaying a message corresponding to the operation on a display when the operation of displaying the menu of the layer one below the current layer is selected in the first selection process.
[0014]
In the aspects described above, a phrase used in the lowest layer of the menu hierarchy is processed differently depending on whether it is spoken in that lowest layer, where it belongs, or is named directly from elsewhere. For example, when a phrase denoting a way of displaying the screen is spoken in the lowest layer of the menu hierarchy, it is displayed together with the other options, giving the user a chance to consider them; when it is named directly from a map display screen or the like, the screen display is switched accordingly and a message to that effect is output. Because the same phrase is thus processed differently depending on the state in which it is spoken, usability can be improved in a fine-grained way.
[0017]
DETAILED DESCRIPTION OF THE INVENTION
Next, an embodiment of the navigation system of the present invention (hereinafter, the “embodiment”) will be described specifically with reference to the drawings. In each figure used in the following description, members that are the same as, or of the same kind as, those in a previously described figure are given the same reference numerals, and their description is omitted.
[0018]
This embodiment is realized using various hardware devices and a computer controlled by software. The software is created by combining instructions in accordance with the description in this specification, and for parts shared with the prior art described above, the techniques described for the prior art are also used. The software includes not only program code but also data prepared in advance for use when the program code is executed. The software achieves the effects of the present invention by making use of physical processing devices such as the CPU and various chipsets built into the navigation system.
[0019]
However, the specific hardware and software configurations for realizing the present invention can be varied in many ways. For example, depending on the circuit configuration and the processing capability of the CPU, a given function may be realized by a physical electronic circuit such as an LSI or by software. For the parts that use software, various forms of software are also conceivable, such as compiler-based, assembler-based, and microprogram implementations. Further, a recording medium on which software for realizing the present invention is recorded is, by itself, also one aspect of the present invention.
[0020]
As described above, there are many conceivable ways of realizing the present invention using a computer. The present invention and the embodiments are therefore described below in terms of virtual circuit blocks that realize the individual functions included in the present invention and the embodiments.
[0021]
[1. Configuration of First Embodiment]
The first embodiment allows both an operation in which phrases are uttered one after another while tracing the menu hierarchy and an operation in which a phrase representing the desired operation is uttered directly, regardless of the menu hierarchy. That is, in the first embodiment, as shown in FIG. 3, the user can operate step by step through the hierarchy from the map display screen, in the order “menu” → “screen display” → “traveling direction” → “heading up”. Alternatively, as shown in FIG. 4, the user can take a shortcut by saying “heading up” directly from the map display screen.
[0022]
[1-1. Overall configuration]
First, FIG. 1 is a conceptual diagram showing an example of the kind of hardware on which this embodiment can be built. In this example, a voice recognition unit V and a navigation unit N are connected to each other; the figure shows part of the hardware used to realize this embodiment.
[0023]
Of these, the ROM (V1) in the voice recognition unit V stores a recognition dictionary V2. The navigation unit N contains a CD-ROM drive, and the CD-ROM (N1) inserted in this drive holds an application program that realizes the functions of the navigation system, as well as a database N3 containing various data. The application program N4 for realizing the functions of the navigation system of this embodiment is read from the CD-ROM (N1) and recorded in the flash memory N2.
[0024]
FIG. 2 is a functional block diagram showing a specific configuration of this embodiment realized by using hardware as shown in FIG. That is, this embodiment includes an absolute position / orientation detection unit 1, a relative orientation detection unit 2, a vehicle speed detection unit 3, a main CPU and its peripheral circuit 4, a memory group M, a display unit 10, and an input unit. 11, a CD-ROM control unit 12, an FM multiplex reception and processing unit 13, and a voice recognition unit 14.
[0025]
Of these, the absolute position / orientation detection unit 1 receives GPS radio waves transmitted from GPS satellites and thereby calculates the absolute position coordinates and orientation on the earth's surface for the current position of the automobile (own vehicle). The relative orientation detection unit 2 detects the relative orientation of the automobile using a gyroscope or the like. The vehicle speed detection unit 3 calculates the vehicle speed by processing the vehicle speed pulses obtained from the automobile.
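As a rough illustration of how a speed could be derived from vehicle speed pulses, the following sketch counts pulses over a fixed interval. The function name and the calibration constant `pulses_per_km` are assumptions for illustration; the description does not explain the calculation or give any constants.

```python
def vehicle_speed_kmh(pulse_count, interval_s, pulses_per_km):
    """Estimate speed from the number of speed pulses counted in an interval.

    pulses_per_km is a vehicle-specific calibration constant (an assumption
    here; the source gives no value for it).
    """
    if interval_s <= 0 or pulses_per_km <= 0:
        raise ValueError("interval_s and pulses_per_km must be positive")
    distance_km = pulse_count / pulses_per_km   # distance covered in the interval
    return distance_km / (interval_s / 3600.0)  # km divided by hours
```

For example, 100 pulses in 3.6 seconds at an assumed 1000 pulses per kilometre corresponds to 0.1 km in 0.001 h, or about 100 km/h.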
[0026]
The CD-ROM control unit 12 reads various information, such as the navigation system program and road map data, from the CD-ROM. The FM multiplex reception and processing unit 13 receives FM radio broadcasts while switching among a plurality of antennas according to reception conditions. The voice recognition unit 14 is a voice recognition unit equipped with a recognition dictionary 43, that is, feature parameters and waveform data representing the features of each phrase to be recognized; it converts voice input from a microphone into a digital signal and recognizes the uttered phrase by pattern matching against the recognition dictionary 43.
[0027]
[1-2. Role of main CPU and its peripheral circuits]
The main CPU and its peripheral circuit 4 serve as a control circuit that controls the entire navigation system. Through the action of the software described above, they serve as a first selection unit 41, a second selection unit 44, and a processing unit 45.
[0028]
Of these, the first selection unit 41 is the first selection means for selecting which operation to perform by voice-recognizing predetermined phrases in order according to the menu hierarchy. Here, the voice recognition itself is performed by the voice recognition unit 14, and the contents of the menu hierarchy are prepared as menu hierarchy data 42.
[0029]
The second selection unit 44 is the second selection means for selecting an operation corresponding to a phrase used in the lowest layer of the menu hierarchy by voice-recognizing that phrase outside the lowest layer. The processing unit 45 is the processing means that performs processing corresponding to the operation selected by the first selection unit 41 or the second selection unit 44. For a phrase used in the lowest layer, the processing unit 45 is configured to perform different processing depending on whether the phrase was voice-recognized in the lowest layer or outside it.
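The division of roles among the first selection unit 41, the second selection unit 44, and the processing unit 45 can be sketched as follows. This is a minimal sketch under assumptions: the menu states, the function names, and the result strings are hypothetical, and the menu path simply follows the order “menu” → “screen display” → “traveling direction” → “heading up” given for FIG. 3.

```python
from dataclasses import dataclass
from typing import Optional

MAP_SCREEN = "map"
# Hypothetical menu hierarchy data (42): state -> {accepted phrase: next menu}.
# A next menu of None marks a lowest-layer phrase (a concrete operation).
MENU_HIERARCHY = {
    MAP_SCREEN: {"menu": "top menu"},
    "top menu": {"screen display": "screen display menu"},
    "screen display menu": {"traveling direction": "direction menu"},
    "direction menu": {"heading up": None, "north up": None},
}
LOWEST_LAYER_PHRASES = {"heading up", "north up"}

@dataclass
class Selection:
    operation: str
    next_menu: Optional[str]
    direct: bool  # True when chosen by the second selection unit

def first_selection(state, phrase):
    """Select an operation when the phrase is valid at the current menu position."""
    options = MENU_HIERARCHY.get(state, {})
    if phrase in options:
        return Selection(phrase, options[phrase], direct=False)
    return None

def second_selection(state, phrase):
    """Select an operation for a lowest-layer phrase spoken outside its own layer."""
    if phrase in LOWEST_LAYER_PHRASES and phrase not in MENU_HIERARCHY.get(state, {}):
        return Selection(phrase, None, direct=True)
    return None

def process(selection):
    """Processing unit (45): the same phrase is handled differently by origin."""
    if selection.next_menu is not None:
        return f"show {selection.next_menu}"
    if selection.direct:
        return f"apply '{selection.operation}' and show a confirmation message"
    return f"apply '{selection.operation}' chosen from the displayed options"

def handle(state, phrase):
    selection = first_selection(state, phrase) or second_selection(state, phrase)
    return process(selection) if selection else "not recognized"
```

Here `handle("map", "heading up")` takes the shortcut path, while `handle("direction menu", "heading up")` takes the ordinary menu path for the same phrase.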
[0030]
Although not shown, in addition to the above, the main CPU and its peripheral circuit 4 include other processing units that perform the various navigation processes, such as displaying the current position of the automobile on a computer-graphics map using the calculated vehicle position and providing route guidance in response to user instructions. They are also configured to input and output various information using a user interface (not shown) that includes an I/O control circuit, drivers, and the like. When performing these processes, the main CPU and its peripheral circuit 4 use the memory group M, the display unit 10, and the input unit 11.
[0031]
[1-3. Other configurations]
The memory group M comprises the various memories needed for the navigation system of this embodiment to operate: the ROM 5, which stores the BIOS and the boot-up program; the DRAM 6, used as work area, cache, buffers, and the like; the SRAM 7; and the VRAM 8, used for video display.
[0032]
The display unit 10 outputs various kinds of information, such as maps and operation menus, using a liquid crystal display panel, a voice synthesizer, and the like. The input unit 11 is used by the user to enter various kinds of information, such as instructions and destinations.
[0033]
[2. Operation of First Embodiment]
The first embodiment configured as described above operates as follows.
(1) First, the recognition dictionary 43 is prepared in advance with the data for all the phrases used at every level of the menu, so that every phrase can always be recognized no matter where in the menu hierarchy the user is.
[0034]
(2) When the user speaks, the voice recognition unit 14 recognizes the spoken phrase using the recognition dictionary 43 and passes the number of that phrase to the first selection unit 41 and the second selection unit 44. Here, the operations of the first selection unit 41 and the second selection unit 44 are realized by an application program that implements the functions of the navigation system.
[0035]
(3) That is, the application program is written in advance with the processing to perform for each phrase number it may receive, and it executes the corresponding operation. In general, the operation programmed for a phrase corresponds one-to-one with its number.
[0036]
For example, suppose there is a menu partway through the menu hierarchy in which the phrase “system” can be spoken, and that speaking it displays the menu one level lower. In that case, whether “system” is spoken at that intermediate level of the menu hierarchy or spoken while, for example, a map around the vehicle is displayed, the same menu one level lower is displayed in either case.
[0037]
However, the phrases at the bottom of the menu hierarchy denote specific operations, so the program is written so that their behavior changes depending on whether they are spoken directly from the map or after other phrases while following the menu.
[0038]
For example, FIGS. 3(a) to 3(e) are conceptual diagrams showing the flow of reaching the “heading up” operation by uttering predetermined phrases in order according to the menu hierarchy. Screen (d), displayed as a result of these utterances, shows that “North Up” is available as an option in addition to “Heading Up”.
[0039]
In contrast, FIGS. 4(a) and 4(b) show an example in which the “heading up” operation is performed simply by speaking “heading up” directly from a screen displaying a road map around the vehicle. In this example, screen (b), shown after “heading up” is spoken, does not display the other options described above; instead it displays a message indicating that the traveling direction is set to heading up.
[0040]
(4) The database N3 stores, in correspondence with each phrase number, the information that the application program described above needs in order to execute the processing for a given phrase number, such as the message to be output on the screen. The application program uses the CD-ROM control unit 12 to read this data as needed while executing the processing for the phrase.
[0041]
Here, FIG. 5 is a conceptual diagram showing an example of how the phrase numbers recognized using the recognition dictionary 43 are processed by the application program described above. In this example, the phrase “GAMEN HYOUJI” recognized using the recognition dictionary 43 is passed to the application program as phrase number 8, and the phrase “Heading Up” as phrase number 19.
[0042]
The application program then displays the next level of the menu for No. 8, “GAMEN HYOUJI”. For No. 19, “Heading Up”, on the other hand, a different screen is displayed depending on whether or not the phrase was spoken directly from the map, as described above. Information such as the message “heading up the direction” is stored in the database N3 in correspondence with each phrase number.
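The dispatch on phrase numbers described for FIG. 5 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the application program itself: the `Screen` type, the `handle_phrase` function, the contents of `MESSAGES`, and the screen returned on the menu path are all hypothetical; only the phrase numbers 8 and 19 and the quoted message come from the text.

```python
from dataclasses import dataclass

# Phrase numbers taken from the FIG. 5 example.
GAMEN_HYOUJI = 8
HEADING_UP = 19

# Stand-in for the per-number messages held in database N3 (layout assumed).
MESSAGES = {HEADING_UP: "heading up the direction"}

# Assumed set of phrase numbers that belong to the lowermost menu layer.
LOWERMOST_PHRASES = {HEADING_UP}


@dataclass
class Screen:
    kind: str      # "menu" or "message" (hypothetical categories)
    content: str


def handle_phrase(number: int, in_lowermost_layer: bool) -> Screen:
    """Choose what to display for a phrase number passed from the recognizer."""
    if number not in LOWERMOST_PHRASES:
        # Intermediate phrases behave the same in every state:
        # the menu one level lower is displayed.
        return Screen("menu", f"menu one level below phrase {number}")
    if in_lowermost_layer:
        # Reached by tracing the menu. Keeping the sibling options
        # visible here is an assumption made for this sketch.
        return Screen("menu", "Heading Up (selected) / North Up")
    # Spoken directly, e.g. from the map: only the message is shown.
    return Screen("message", MESSAGES[number])
```

With these definitions, `handle_phrase(8, ...)` gives the same screen in any state, while `handle_phrase(19, ...)` differs depending on whether the phrase was spoken in the lowermost layer.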
[0043]
In addition, as shown for “No. 8” in FIG. 5, a phrase that calls the next state from a state partway through the menu hierarchy can also be recognized and processed accordingly. Such a phrase has a corresponding state, namely the specific state in the menu hierarchy in which it is meant to be uttered in order to call a certain part of the menu. If the phrase is uttered in the state to which it originally corresponds, the first selection unit 41 recognizes the utterance using the voice recognition unit 14; if it is uttered in any other state, for example while a map around the vehicle is displayed, the second selection unit 44 recognizes it using the voice recognition unit 14.
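As a rough illustration of this division of roles, the routing between the two selection units might look like the sketch below. The state names, the `ORIGINAL_STATE` table, and the `select_unit` function are assumptions made for the example; the text does not say how the correspondence between phrases and states is stored.

```python
# Hypothetical mapping from each phrase to the menu state in which it is
# originally meant to be spoken. State names are invented for this sketch.
ORIGINAL_STATE = {
    "GAMEN HYOUJI": "settings_menu",
    "Heading Up": "display_menu",
}


def select_unit(phrase: str, current_state: str) -> str:
    """Return which selection unit handles the recognized phrase."""
    if phrase not in ORIGINAL_STATE:
        raise KeyError(f"phrase not in recognition dictionary: {phrase}")
    if ORIGINAL_STATE[phrase] == current_state:
        # Spoken where the menu hierarchy expects it.
        return "first selection unit 41"
    # Spoken from any other state, such as the map display.
    return "second selection unit 44"
```

For example, `select_unit("Heading Up", "map")` routes to the second selection unit, while the same phrase spoken in its original menu state routes to the first.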
[0044]
[3. Effects of the first embodiment]
As described above, in the first embodiment, the basic operation method of tracing the hierarchical structure of the menu step by step lets even an inexperienced beginner operate the system easily without being confused about the order in which to utter phrases. An experienced user, on the other hand, does not have to follow the menu hierarchy in order. Regardless of the state of the screen, such as being partway through a menu, the user can select a specific lowermost-layer operation by speaking its phrase directly, like a single command, and so use the desired function easily. The hierarchical menu can thus be shortcut (omitted or skipped), which improves usability.
[0045]
In the first embodiment, a phrase that calls the next state from a state partway through the menu hierarchy can also be used in states other than that one, including screen display states outside the menu hierarchy. A variety of operations therefore become possible: for example, the part of the menu hierarchy the user wants can be called directly without passing through the upper levels, or the user can move sideways from one part of the menu hierarchy to another without returning to an upper level.
[0046]
Further, in the first embodiment, a phrase used in the lowermost layer of the menu hierarchy is processed differently depending on whether it is spoken in its original lowermost layer or spoken directly from elsewhere. Because the same phrase is handled differently according to the state, usability can be fine-tuned.
[0047]
[4. Second Embodiment]
The second embodiment uses a navigation system with the same configuration as the first embodiment and, when a single predetermined phrase is spoken, performs the same processing as when several phrases are spoken in order according to the menu hierarchy. The example shown here lists facilities around the host vehicle as one of the processes for searching for a destination.
[0048]
First, in the second embodiment, a recognition dictionary 431 made up of a plurality of tables is prepared, as illustrated in FIG. 6. Consider, as an example, searching for and displaying “Kaa Youhinten” (car accessory stores) in the “Shuhen” (surroundings) of the vehicle. In the normal operation flow that follows the menu hierarchy, when the phrase “Shuhen” in table T11 of FIG. 6 is recognized, its number is passed from the voice recognition unit V to the application program N4 of the navigation unit N (FIG. 1).
[0049]
The database N3 on the CD-ROM stores, for each recognized phrase number, the name of the table to be recognized next (here, table T21) and the message string to be displayed. The application program N4 reads this information, outputs the message on the screen or by voice, and notifies the voice recognition unit V of the next recognition target table (here, table T21). The voice recognition unit V reads table T21 into memory as the next recognition target table and waits for a command.
[0050]
Table T21, which thus becomes the new recognition target, lists item names such as facility types. In this case, as described above, the application program N4 outputs on the screen or by voice a message that prompts the user to utter an item name such as “Kaa Youhinten”, for example a message asking for the next item name, and the user utters an item name such as “Kaa Youhinten” in response.
[0051]
When “Kaa Youhinten” is recognized in this way, the application program N4 obtains the necessary information from the database N3 on the CD-ROM (N1), outputs a message and so on as before, and at the same time lists the car accessory stores. This flow of processing according to the menu hierarchy is shown on the left side of FIG. 7.
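The table switching driven by database N3 in this menu-following flow can be sketched as below. It is only an illustration under assumptions: the phrase numbers, the message strings, and the `Recognizer` and `step` names are hypothetical, and the phrases for "surroundings" and "car accessory store" are romanized here as "Shuhen" and "Kaa Youhinten". Only the table names T11 and T21 and the idea that each phrase number maps to a next table and a message come from the text.

```python
from typing import Optional

# Recognition dictionary 431, reduced to the two phrases discussed.
# The phrase numbers 1 and 2 are made up for this sketch.
TABLES = {
    "T11": {"Shuhen": 1},
    "T21": {"Kaa Youhinten": 2},
}

# Stand-in for database N3: phrase number -> (next table, message).
# Message wording is assumed.
DATABASE_N3 = {
    1: ("T21", "Please say the next item name"),
    2: (None, "Listing nearby car accessory stores"),
}


class Recognizer:
    """Minimal stand-in for voice recognition unit V."""

    def __init__(self) -> None:
        self.active = "T11"  # table currently targeted for recognition

    def recognize(self, utterance: str) -> Optional[int]:
        # Only phrases in the active table can be recognized.
        return TABLES[self.active].get(utterance)

    def load_table(self, name: str) -> None:
        self.active = name


def step(recognizer: Recognizer, utterance: str) -> Optional[str]:
    """Handle one utterance the way application program N4 is described to."""
    number = recognizer.recognize(utterance)
    if number is None:
        return None  # phrase not in the active table
    next_table, message = DATABASE_N3[number]
    if next_table is not None:
        recognizer.load_table(next_table)  # notify the next recognition target
    return message
```

In this sketch, speaking "Kaa Youhinten" first is not recognized, because only table T11 is active at that point; speaking "Shuhen" and then "Kaa Youhinten" reaches the listing.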
[0052]
Further, in this second embodiment, to provide a shortcut for the operation that follows the menu hierarchy as described above, a phrase such as “Shuhen no Kaa Youhinten” (car accessory stores in the surroundings) is also prepared in table T11, as shown in FIG. 6. This phrase bundles into one the series of processes that would otherwise require several phrases to be uttered in order according to the menu hierarchy.
[0053]
For example, when the user utters “Shuhen no Kaa Youhinten” and it is recognized, the application program N4 obtains the necessary information from the database N3 on the CD-ROM as before and carries out processing such as listing car accessory stores.
[0054]
Further, in the procedure shown on the left side of FIG. 7, nearby car accessory stores are searched for and listed (steps 71 to 73), the user selects the store closest to the current location from the list (steps 74 and 75), and that store is set as the destination (steps 76 to 78). The phrase “Ichiban Chikai Kaa Youhinten” (nearest car accessory store) is therefore stored in table 1 of the recognition dictionary 431 as a phrase that performs the same processing by automatically selecting the store closest to the current location.
[0055]
When this phrase is used, as shown on the right side of FIG. 7, simply speaking “Ichiban Chikai Kaa Youhinten” (step 81) causes the car accessory store closest to the vehicle to be searched for (step 83) and a route to be set with that store as the destination (step 88). Compared with the menu-hierarchy example on the left side of FIG. 7, the single step 81 takes the place of steps 71, 72, 75, and 77, and steps 74 and 76 become unnecessary given the content of the processing.
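The relationship between the two sides of FIG. 7 can be sketched as follows: one function follows the menu flow with an explicit user choice, and a shortcut phrase is mapped to a function that selects the nearest store automatically. The `Store` type, the sample data, and all function names are assumptions for illustration; the step numbers in the comments refer to FIG. 7 as described in the text, and the phrase is romanized here as "Ichiban Chikai Kaa Youhinten".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Store:
    name: str
    distance_km: float


def search_nearby(stores: List[Store]) -> List[Store]:
    """List nearby stores, closest first."""
    return sorted(stores, key=lambda s: s.distance_km)


def set_destination(store: Store) -> str:
    return f"route set to {store.name}"


def menu_flow(stores: List[Store], chosen_index: int) -> str:
    """Left side of FIG. 7: search, let the user choose, then set the route."""
    listed = search_nearby(stores)        # steps 71 to 73
    chosen = listed[chosen_index]         # steps 74 and 75 (user's choice)
    return set_destination(chosen)        # steps 76 to 78


def nearest_store_shortcut(stores: List[Store]) -> str:
    """Right side of FIG. 7: the nearest store is selected automatically."""
    nearest = search_nearby(stores)[0]    # step 83
    return set_destination(nearest)       # step 88


# One composite phrase stands for the whole sequence (step 81).
SHORTCUTS: Dict[str, Callable[[List[Store]], str]] = {
    "Ichiban Chikai Kaa Youhinten": nearest_store_shortcut,
}

# Made-up search results for the example.
STORES = [Store("Store A", 2.4), Store("Store B", 0.8), Store("Store C", 5.1)]
```

Calling `SHORTCUTS["Ichiban Chikai Kaa Youhinten"](STORES)` gives the same route as `menu_flow(STORES, 0)`, but with a single utterance instead of several.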
[0056]
As described above, in the second embodiment, processing that would otherwise require uttering several phrases can be performed by speaking a single predetermined phrase, so the navigation system can be operated more efficiently.
[0057]
[5. Third Embodiment]
The third embodiment uses a navigation system with the same configuration as the first and second embodiments described above and, according to which phrases are recognized in succession, either selects a predetermined operation or performs error handling. As in the second embodiment, processing that searches for facilities around the host vehicle is used as the example, here to illustrate the order in which phrases are spoken.
[0058]
Here, FIG. 8 is a conceptual diagram showing the tables that make up the recognition dictionary 432 in the third embodiment. The phrases G1 and G2, “Kaa Youhinten” and “Shuhen” respectively, are added to the two tables of the second embodiment shown in FIG. 6. Of these tables, the first table T12 is the target of voice recognition in the initial state when the navigation system is started.
[0059]
Then, when a predetermined phrase among those registered in the first table T12, here “Shuhen”, is recognized, the recognition target is switched to the second table T22. For the phrase recognized next, either an operation predetermined on the basis of the second table T22 is selected or error processing is performed.
[0060]
In the third embodiment, the procedure for performing such processing with tables is shown in generalized form in FIG. 9. The example of FIG. 8 has two tables, but there may be three or more, and FIG. 9 is used to describe that case. In FIG. 9 the table symbols are abstracted as 1, 2, and 3.
[0061]
That is, the user speaks a phrase in table 1 (step 91). If it is the phrase W1, the recognition target is switched to table 2 (step 93), and processing is performed (step 95) in response to the phrase the user then speaks from table 2 (step 94). If, for example, there is a table 3 in addition to table 2, processing using table 3 is performed when another phrase W2 is uttered (step 97 onward).
[0062]
Next, FIG. 10 is a flowchart showing the specific procedure when the procedure generalized in FIG. 9 is applied to the example of the third embodiment. In this procedure, the phrase “Shuhen” in table T12 is recognized first (step 101), and table T22 becomes the recognition target (steps 102 and 103). If the next phrase is, for example, “Kaa Youhinten” (step 104), processing such as searching for car accessory stores around the vehicle is performed (step 105). If any other phrase is recognized, the application program performs error processing, for example by sounding a beep (step 106), and the process returns to step 104.
[0063]
Conversely to the example of FIG. 10, when the phrase “Kaa Youhinten” in table T12 of FIG. 8 is recognized first, the phrase “Shuhen” in table T22 makes sense as the following phrase. So if “Shuhen” is spoken next, processing such as listing nearby car accessory stores is performed, whereas if the same phrase “Kaa Youhinten” that was spoken from table T12 is spoken again, error processing is performed. In other words, in the third embodiment the application program is programmed, phrase by phrase, to keep track of the phrase recognized immediately before and to change its behavior for the next recognized phrase accordingly.
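The behavior of monitoring the immediately preceding phrase can be sketched as a small state machine. This is an illustrative reading of FIGS. 8 and 10 under assumptions, not the patented program: the `Dialogue` class and its return strings are hypothetical, the tables are reduced to the two phrases discussed (romanized here as "Shuhen" and "Kaa Youhinten"), and the rule that a repeated phrase is an error is generalized from the single case the text describes.

```python
from typing import Optional

# Tables of recognition dictionary 432, reduced to the two phrases discussed.
T12 = {"Shuhen", "Kaa Youhinten"}
T22 = {"Kaa Youhinten", "Shuhen"}


class Dialogue:
    """Tracks the phrase recognized immediately before (hypothetical class)."""

    def __init__(self) -> None:
        self.previous: Optional[str] = None  # None means T12 is the target

    def hear(self, phrase: str) -> str:
        if self.previous is None:
            if phrase in T12:
                # First phrase accepted: switch the recognition target to T22.
                self.previous = phrase
                return "waiting for next phrase"
            return "error"
        if phrase in T22 and phrase != self.previous:
            # The pair is meaningful in either order.
            self.previous = None
            return "list nearby car accessory stores"
        # Repeated or unknown phrase: error (e.g. a beep), keep waiting on T22.
        return "error"
```

Both "Shuhen" then "Kaa Youhinten" and the reverse order lead to the same listing, while repeating the first phrase produces an error and leaves the system waiting for the second phrase.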
[0064]
In addition, each phrase registered in each table is associated with the messages used in processing after that phrase is spoken. Because the table is switched according to the preceding phrase, advice or prompts with different contents can be output to the user according to the preceding phrase when, for example, an utterance cannot be recognized.
[0065]
As described above, according to the third embodiment, the phrase recognized immediately before determines what processing is performed for the next recognized phrase, or whether it is treated as an error. For this reason, as long as the meaning of what the user wants to do is clear, not only the utterance order that follows the menu hierarchy but also a reversed or partially omitted utterance order can be defined in advance as acceptable, which increases the degree of freedom and makes operation easier.
[0066]
In the third embodiment, the relationship between phrases spoken in succession can be judged with the simple mechanism of switching the recognition target table according to the recognized phrase, so implementation is easy.
[0067]
In addition, because the operation method is not tied to a fixed utterance order, the user is not confused about the order of utterances, and the speech recognition user interface supports more flexible and natural operation. Various operations can therefore be performed immediately, which further improves driving safety.
[6. Other Embodiments]
The present invention is not limited to the embodiment described above, and includes other embodiments as exemplified below. For example, the navigation system of the present invention can be used not only for a so-called car navigation system mounted on an automobile but also for other types of mobile objects such as a two-wheeled vehicle.
[0068]
In addition, the structure of the menu hierarchy and the number of levels are not restricted. For example, the “lowermost layer” reached by following different branches of the menu hierarchy may lie at different depths counted from the entrance of the hierarchy. Further, where phrases used in the lowermost layer are processed differently depending on whether they are recognized in the lowermost layer or elsewhere, this need not apply to every phrase in the lowermost layer.
[0069]
The “predetermined single phrase” is a single phrase that bundles the same processing as when several phrases are uttered in order according to the menu hierarchy; more than one such phrase can of course be registered at the same time. The embodiments above describe examples in which two phrases are recognized in succession, but the navigation system can also be controlled on the basis of the order in which three or more phrases are recognized.
[0070]
[Effects of the Invention]
As described above, according to the present invention, an operation to be performed can be easily instructed by voice recognition regardless of the menu hierarchy and the utterance order.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a hardware configuration of a first embodiment of the present invention.
FIG. 2 is a functional block diagram showing a specific configuration of the first embodiment of the present invention.
FIG. 3 is a conceptual diagram showing an operation procedure when a phrase is sequentially uttered by tracing a menu hierarchy in the first embodiment of the present invention.
FIG. 4 is a conceptual diagram showing an operation procedure when directly speaking a word at the lowest level of a menu hierarchy from a screen displaying a map in the first embodiment of the present invention.
FIG. 5 is a conceptual diagram showing that processing according to a recognized word number is performed in the first embodiment of the present invention.
FIG. 6 is a conceptual diagram showing the configuration of a recognition dictionary in the second embodiment of the present invention.
FIG. 7 is a flowchart showing a processing procedure in the second embodiment of the present invention.
FIG. 8 is a conceptual diagram showing the configuration of a recognition dictionary in the third embodiment of the invention.
FIG. 9 is a flowchart generally showing a processing procedure in the third embodiment of the present invention.
FIG. 10 is a flowchart specifically showing a processing procedure in the third embodiment of the present invention.
FIG. 11 is a conceptual diagram showing an example of a conventional recognition dictionary.
[Explanation of symbols]
V ... Voice recognition unit
V1 ... ROM
V2 ... Recognition dictionary
N ... Navigation unit
N1 ... CD-ROM
N2 ... Flash memory
N3 ... Database
N4 ... Application program
1 ... Absolute position / azimuth detector
2 ... Relative bearing detector
3 ... Vehicle speed detector
4 ... Main CPU and its peripheral circuits
M ... Memory group
5 ... ROM
6 ... DRAM
7 ... SRAM
8 ... VRAM
10 ... Display unit
11 ... Input unit
12 ... CD-ROM control unit
13 ... FM multiplex reception and processing unit
14 ... Voice recognition unit
41 ... First selection unit
42 ... Menu hierarchy data
43, 430, 431, 432 ... Recognition dictionary
T ... Table
44 ... Second selection unit
45 ... Processing unit

Claims

A navigation system comprising:
first selection means for selecting which operation to perform by recognizing predetermined phrases by voice, in order, according to a menu hierarchy;
second selection means for selecting an operation corresponding to a phrase used in the lowermost layer of the menu hierarchy when that phrase is recognized by voice somewhere other than the lowermost layer; and
processing means for performing processing according to the selected operation,
wherein the second selection means selects an operation for displaying a menu of the level one below the current level when a phrase used in the lowermost layer is recognized by voice somewhere other than the lowermost layer,
the first selection means selects the operation corresponding to the recognized phrase when a phrase used in the lowermost layer is recognized by voice in the lowermost layer, and
the processing means, in addition to the processing according to the selected operation, displays on a display a message corresponding to the operation when the first selection means has selected an operation for displaying a menu of the level one below the current level.

A navigation method comprising:
a first selection step of selecting which operation to perform by recognizing predetermined phrases by voice, in order, according to a menu hierarchy;
a second selection step of selecting an operation corresponding to a phrase used in the lowermost layer of the menu hierarchy when that phrase is recognized by voice somewhere other than the lowermost layer; and
a processing step of performing processing according to the selected operation,
wherein the second selection step includes selecting an operation for displaying a menu of the level one below the current level when a phrase used in the lowermost layer is recognized by voice somewhere other than the lowermost layer,
the first selection step includes selecting the operation corresponding to the recognized phrase when a phrase used in the lowermost layer is recognized by voice in the lowermost layer, and
the processing step includes, in addition to the processing according to the selected operation, displaying on a display a message corresponding to the operation when an operation for displaying a menu of the level one below the current level has been selected in the first selection step.

A recording medium on which navigation software for performing navigation using a computer is recorded, the software causing the computer to execute:
a first selection process of selecting which operation to perform by recognizing predetermined phrases by voice, in order, according to a menu hierarchy;
a second selection process of selecting an operation corresponding to a phrase used in the lowermost layer of the menu hierarchy when that phrase is recognized by voice somewhere other than the lowermost layer; and
processing according to the selected operation,
wherein the second selection process includes selecting an operation for displaying a menu of the level one below the current level when a phrase used in the lowermost layer is recognized by voice somewhere other than the lowermost layer,
the first selection process includes selecting the operation corresponding to the recognized phrase when a phrase used in the lowermost layer is recognized by voice in the lowermost layer, and
the processing according to the selected operation includes displaying on a display a message corresponding to the operation when an operation for displaying a menu of the level one below the current level has been selected in the first selection process.