JPH0576065B2

Info

Publication number: JPH0576065B2
Application number: JP1255491A
Authority: JP
Inventors: Eiichiro Toshima
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1989-09-29
Filing date: 1989-09-29
Publication date: 1993-10-21
Also published as: JPH03116366A

Description

【発明の詳細な説明】［産業上の利用分野］本発明は仮名漢字変換により漢字仮名混り文を
入力する文字処理装置に関する。DETAILED DESCRIPTION OF THE INVENTION [Field of Industrial Application] The present invention relates to a character processing device for inputting a sentence containing kanji and kana through kana-kanji conversion.

［従来の技術］現在、日本ワードプロセツサなどの文字処理装
置は漢字仮名混り文の入力を仮名漢字変換を使つ
て行なうことが一般的である。[Prior Art] Currently, character processing devices such as Japanese word processors generally input sentences containing kanji and kana using kana-kanji conversion.

従来、仮名漢字変換用辞書は外部メモリ（フロ
ツピーデイスク、ハードデイスク）に格納するこ
とが多かつた。ところが、一括変換等の変換方式
面での改良が進むことにより辞書アクセスの回数
が増大し、また、そうでなくても、変換のスピー
ドアツプ要求が高まつてきたことから、辞書が高
速アクセス可能な内部メモリに格納する方式が主
流となつている。 Conventionally, kana-kanji conversion dictionaries were often stored in external memory (floppy disk, hard disk). However, as conversion methods such as batch conversion have improved, the number of dictionary accesses has increased, and demand for faster conversion has grown in any case, so storing the dictionary in fast-access internal memory has become the mainstream approach.

内部メモリには、読込／書込ともに可能で揮発
性（電源を切ると記憶内容が消えてしまう）の
RAMと、書込が不可能であるが不揮発性（電源
を切つても記憶内容が消えない）のROMの２種
類が一般に広く使用されている。 Two types of internal memory are in wide general use: RAM, which is both readable and writable but volatile (its contents are lost when the power is turned off), and ROM, which cannot be written but is non-volatile (its contents survive power-off).

辞書をRAMに持つ場合、仮名漢字変換する前
の準備として辞書を外部メモリからRAMにロー
ドすることになるが、ロード時間がかかるという
欠点があり、更に、RAMの方がROMよりも高
価であるためコストが高くなるという欠点もあ
る。このため、辞書はROMに記憶するのが一般
的である。 When storing a dictionary in RAM, the dictionary must be loaded from external memory to RAM as a preparation before kana-kanji conversion, but this has the disadvantage of taking time to load, and in addition, RAM is more expensive than ROM. This also has the disadvantage of increasing costs. For this reason, dictionaries are generally stored in ROM.

また、最近は、仮名漢字変換の変換率に対する
要求も高度化しているため、辞書が大容量化する
傾向がある。 Furthermore, recently, demands on the conversion rate of kana-kanji conversion have become more sophisticated, so there is a tendency for dictionaries to have larger capacities.

このように辞書が大容量化されているため、個
人個人にとつてみれば、絶対に使用しないと思わ
れる単語が相当多く辞書に記憶されていることに
なる。このような無駄な単語は、単に無駄でメモ
リが勿体ないのは仕方がないことであるが、誤変
換の原因ともなり、オペレータによつては存在し
ない方が有難い単語もある。 As dictionaries have grown in this way, the dictionary contains a considerable number of words that any individual user will probably never use. That such unneeded words merely waste memory may be unavoidable, but they can also cause conversion errors, and for some operators there are words that would be better absent.

例えば、単語数が増えた結果、「神戸（こう
べ）」という姓が辞書に格納され、また、「功（こ
う）」という名前も辞書に格納されたとする。そ
のような辞書でオペレータが「神戸港」を変換し
ようとして「こうべこう」と打鍵したとすると、
仮名漢字変換には通常「姓＋名前」のパターンを
優先して変換する処理が組み込まれているので、
オペレータの意図に反し「神戸功」と変換する可
能性が高い。 For example, suppose that, as a result of the increased word count, the surname "神戸" (こうべ, Kobe) and the given name "功" (こう, Kou) are both stored in the dictionary. If an operator using such a dictionary types "こうべこう" intending "神戸港" (Kobe Port), the result is likely to be "神戸功", contrary to the operator's intention, because kana-kanji conversion usually includes processing that gives priority to the "surname + given name" pattern.

このような場合、「功」を「港」に変更すれば、
学習が働き、次回から「神戸港」は正しく変換さ
れる。ところが、「神戸」という姓が辞書上に存
在する限り、「神戸市」のつもりが「神戸氏」に
なり、「神戸産」のつもりが「神戸さん」になる
など、あちこちで誤変換が生じ、操作性を阻害す
る。 In such a case, if the operator changes "功" to "港", learning takes effect and "神戸港" is converted correctly from the next time on. However, as long as the surname "神戸" exists in the dictionary, conversion errors occur in many places and impair operability: an intended "神戸市" (Kobe City) becomes "神戸氏" (Mr. Kobe), an intended "神戸産" (made in Kobe) becomes "神戸さん" (Mr./Ms. Kobe), and so on.

そもそも、「神戸」という姓は普通の人にとつ
ては馴染の薄い姓であり、辞書から削除するとい
う解決策が手間がなく最も効果が高い。ところ
が、通常、辞書はROM上に存在するので、直接
的に単語の削除を行なうことはできない。 To begin with, the surname "Kobe" is a surname that is not familiar to ordinary people, and the easiest and most effective solution is to delete it from the dictionary. However, since the dictionary normally exists on the ROM, words cannot be deleted directly.

ROM上の単語を削除する方法として、辞書
ROM上の単語の存在アドレスを記憶し、変換時
にその位置の単語を無視し使用しないようにする
方式、削除すべき単語の読み、表記、品詞などの
単語情報を記憶し、変換時にそれと一致する単語
がROM上の辞書に存在すれば、その単語を無視
し使用しないようにする方式、などが考えられ
る。 Conceivable ways of deleting a word on ROM include: a method that stores the address at which the word resides in the dictionary ROM and, at conversion time, ignores the word at that position; and a method that stores word information such as the reading, spelling, and part of speech of the word to be deleted and, at conversion time, ignores any word in the ROM dictionary that matches it.

［発明が解決しようとしている問題点］しかし、上記の単語情報を記憶する方式による
単語の削除は、１単語削除する度にかなりのメモ
リを必要とするため、コストが高くなり、また、
変換時に単語情報とマツチングを取るのでは処理
も複雑になり、変換時間も長くなるという欠点が
ある。[Problems to be Solved by the Invention] However, deleting words by the above method of storing word information requires a considerable amount of memory for every word deleted, which raises cost; moreover, matching against word information at conversion time complicates the processing and lengthens conversion time.

また、アドレスを記憶する方式による単語の削
除では、削除単語辞書は、辞書の内容が更新され
た時に使用できなくなつてしまう。そのため、複
数の種類の構成を使用するオペレータはその機械
語とに削除単語辞書を作成しなければならず、デ
ータ互換性の点で問題がある。 With the address-storing method, on the other hand, the deleted word dictionary becomes unusable when the contents of the dictionary are updated. An operator who uses several types of machine must therefore create a deleted word dictionary for each machine, which is a problem for data compatibility.

以上のようにどちらの方式を採用しても何らか
の欠点がある。 As mentioned above, no matter which method is adopted, there are some drawbacks.

［問題点を解決するための手段（及び作用）］上記問題点を解決するために、本発明によれ
ば、文字処理装置に、仮名文字列を入力するため
の入力手段と、単語の読みと、表記を含む当該単
語の単語情報とを対応させて、単語を記憶した書
き換え不可能な第１の辞書手段と、該第１の辞書
手段を参照して、前記入力手段より入力された仮
名文字列を、当該仮名文字列を読みとする単語の
表記に変換する変換手段と、前記第１の辞書手段
に記憶された単語のうち、前記変換手段による変
換において無効とすべき単語の読みと、表記を含
む当該無効とすべき単語の単語情報とを対応させ
て、無効とすべき単語を記憶する第２の辞書手段
と、該第２の辞書手段に無効とすべき単語を登録
する無効単語登録手段と、起動時に、前記第１の
辞書手段を検索して前記第の２辞書手段に記憶さ
れている単語と読み及び単語情報が一致する単語
の当該第１の辞書手段におけるポインタを求める
検索手段と、該検索手段により検索されたポイン
タを記憶する無効単語ポインタ記憶手段と、前記
変換手段による変換において、前記入力された仮
名文字列を読みとする単語の前記第１の辞書手段
におけるポインタを求め、求められた当該ポイン
タと一致するポインタが前記無効単語ポインタ記
憶手段に記憶されているかを判定し、記憶されて
いると判定された場合には、当該ポインタに対応
する単語を変換対象から除外するように制御する
制御手段とを具えたことにより、書き換え不可能
な第１の辞書手段に記憶された単語のうち、無効
とすべき単語を、変換手段による変換において、
変換対象から除外するようにしたものである。[Means for Solving the Problems (and Effects)] To solve the above problems, according to the present invention a character processing device comprises: input means for inputting a kana character string; non-rewritable first dictionary means that stores words, associating the reading of each word with word information of that word including its spelling; conversion means that, referring to the first dictionary means, converts the kana character string input from the input means into the spelling of a word whose reading is that kana character string; second dictionary means that stores the words to be invalidated in conversion by the conversion means, among the words stored in the first dictionary means, associating the reading of each such word with its word information including its spelling; invalid word registration means for registering a word to be invalidated in the second dictionary means; search means that, at startup, searches the first dictionary means and obtains the pointer, in the first dictionary means, of each word whose reading and word information match a word stored in the second dictionary means; invalid word pointer storage means for storing the pointers found by the search means; and control means that, in conversion by the conversion means, obtains the pointer in the first dictionary means of a word whose reading is the input kana character string, determines whether a matching pointer is stored in the invalid word pointer storage means, and, if so, excludes the word corresponding to that pointer from the conversion candidates. As a result, among the words stored in the non-rewritable first dictionary means, the words to be invalidated are excluded from the conversion candidates in conversion by the conversion means.
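The arrangement described above can be illustrated with a minimal sketch. It is not the patented implementation: the names `Word`, `ROM_DICT`, `resolve_pointers`, and `candidates` are hypothetical, the sample entries are invented, and a list index stands in for a pointer into the first dictionary.

```python
from typing import NamedTuple


class Word(NamedTuple):
    reading: str   # kana reading
    spelling: str  # written form
    pos: str       # part of speech


# Stand-in for the non-rewritable first dictionary (assumed sample data).
# The tuple index plays the role of a pointer into ROM.
ROM_DICT = (
    Word("こう", "功", "名"),
    Word("こう", "港", "名詞"),
    Word("こうべ", "神戸", "姓"),
    Word("こうべ", "神戸", "地名"),
)


def resolve_pointers(invalid_words):
    """At startup: find the pointer of every first-dictionary word that
    matches an entry of the second (invalid word) dictionary."""
    return {ptr for ptr, word in enumerate(ROM_DICT) if word in invalid_words}


def candidates(reading, invalid_pointers):
    """At conversion time: look up words by reading and drop any word
    whose pointer is stored as invalid."""
    return [
        word
        for ptr, word in enumerate(ROM_DICT)
        if word.reading == reading and ptr not in invalid_pointers
    ]


invalid = resolve_pointers({Word("こうべ", "神戸", "姓")})
print(candidates("こうべ", invalid))
```

The point of the split is that the full comparison of reading, spelling, and part of speech happens once at startup, while each conversion only compares pointers.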

［実施例］以下図面を参照しながら本発明を詳細に説明す
る。[Example] The present invention will be described in detail below with reference to the drawings.

第１図は本発明の全体構成の一例である。 FIG. 1 is an example of the overall configuration of the present invention.

図示の構成において、CPUは、マイクロプロ
セツサであり、文字処理のための演算、論理判断
等を行ない、アドレスバスAB、コントロールバ
スCB、データバスDBを介して、それらのバスに
接続された各構成要素を制御する。 In the illustrated configuration, the CPU is a microprocessor that performs arithmetic operations, logical decisions, and the like for character processing, and controls each component connected to the address bus AB, control bus CB, and data bus DB via those buses.

アドレスバスABはマイクロプロセツサCPUの
制御の対象とする構成要素を指示するアドレス信
号を転送する。コントロールバスCBはマイクロ
プロセツサCPUの制御の対象とする各構成要素
のコントロール信号を転送して印加する。データ
バスDBは各構成機器相互間のデータの転送を行
なう。 Address bus AB transfers address signals indicating the components to be controlled by the microprocessor CPU. The control bus CB transfers and applies control signals for each component to be controlled by the microprocessor CPU. The data bus DB transfers data between each component.

つぎにROMは、読出し専用の固定メモリであ
り、第１０図〜第１４図につき後述するマイクロ
プロセツサCPUによる制御の手順、及び、仮名
漢字変換用辞書DICを記憶させておく。 Next, the ROM is a read-only fixed memory, and stores the control procedure by the microprocessor CPU, which will be described later with reference to FIGS. 10 to 14, and the kana-kanji conversion dictionary DIC.

また、RAMは、１ワード16ビツトの構成の書
込み可能のランダムアクセスメモリであつて、各
構成要素からの各種データの一時記憶に用いる。
DELDは削除単語辞書であり、辞書DICから削除
した単語を記憶する。SWTBLはサーチ単語テー
ブルであり、仮名漢字変換中に必要な単語の存在
位置を一時的に記憶するためのテーブルである。 The RAM is a writable random access memory organized in 16-bit words, used for temporary storage of various data from each component. DELD is the deleted word dictionary, which stores the words deleted from the dictionary DIC. SWTBL is the search word table, which temporarily stores the locations of the words needed during kana-kanji conversion.

KBはキーボードであつて、アルフアベツトキ
ー、ひらかなキー、カタカナキー等の文字記号入
力キー、及び、変換キー、単語削除キー、実行キ
ー等の本文字処理装置に対する各種機能を指示す
るための各種のフアンクシヨンキーを備えてい
る。 KB is a keyboard equipped with character and symbol input keys such as alphabet keys, hiragana keys, and katakana keys, and with various function keys, such as the conversion key, word deletion key, and execution key, for instructing the character processing device to perform its functions.

DISKは文書データ、削除単語辞書DELDを記
憶するための外部メモリである。文書、削除単語
辞書DELDは必要に応じて保管され、また、保管
されたデータはキーボードの指示により必要な時
呼び出される。 DISK is an external memory for storing document data and the deleted word dictionary DELD. The documents and deleted word dictionary DELD are saved as needed, and the saved data can be called up when needed by instructions from the keyboard.

CRはカーソルレジスタである。CPUにより、
カーソルレジスタの内容を読み書きできる。後述
するCRTコントローラCRTCは、ここに蓄えら
れたアドレスに対応する表示装置CRT上の位置
にカーソルを表示する。 CR is the cursor register. The CPU can read and write its contents. The CRT controller CRTC, described later, displays a cursor at the position on the display device CRT corresponding to the address stored in it.

DBUFは表示用バツフアメモリで、表示すべ
きデータのパターンを蓄える。 DBUF is display buffer memory that stores data patterns to be displayed.

CRTCはカーソルレジスタCR及びバツフア
DBUFに蓄えられた内容を表示器CRTに表示す
る役割を担う。 CRTC is responsible for displaying the contents stored in the cursor register CR and the buffer DBUF on the display CRT.

またCRTは陰極線管等を用いた表示装置であ
り、その表示装置CRTにおけるドツト構成の表
示パターンおよびカーソルの表示をCRTコント
ローラで制御する。 Further, a CRT is a display device using a cathode ray tube or the like, and a CRT controller controls the dot-configured display pattern and cursor display on the display device CRT.

さらに、CGはキヤラクタジエネレータであつ
て、表示装置CRTに表示する文字、記号のパタ
ーンを記憶するものである。 Furthermore, CG is a character generator that stores patterns of characters and symbols to be displayed on the display device CRT.

かかる各構成要素からなる本発明文字処理装置
においては、キーボードKBからの各種の入力に
応じて作動するものであつて、キーボードKBか
らの入力が供給されると、まず、インタラプト信
号がマイクロプロセツサCPUに送られ、そのマ
イクロプロセツサCPUがROM内に記憶してある
各種の制御信号が読出し、それらの制御信号に従
つて各種の制御が行なわれる。 The character processing device of the present invention, composed of these components, operates in response to various inputs from the keyboard KB. When input arrives from the keyboard KB, an interrupt signal is first sent to the microprocessor CPU, which reads out the various control signals stored in the ROM, and various controls are performed in accordance with those control signals.

第２図は本発明装置による変換操作の例を示し
た図である。２−１はまず、読み列「こうべこ
う」を入力した時の画面を示している。カーソル
は入力読み列の次に表示されている。ここで変換
キーを打鍵すると２−２の画面になる。２−２で
は読み列「こうべこう」が「神戸功」と変換され
ている。これは望む変換ではないので、「神戸」
（姓）を辞書から単語削除するという操作を行な
つて、その後もう一度「こうべこう」と入力する
と２−３の画面になる。ここで変換キーは再度打
鍵すると２−４の画面になり、今度は正しく「神
戸港」と変換されている。 FIG. 2 shows an example of a conversion operation with the device of the present invention. 2-1 shows the screen when the reading "こうべこう" has been input; the cursor is displayed just after the input reading. Pressing the conversion key here gives screen 2-2, in which the reading "こうべこう" has been converted to "神戸功". This is not the desired conversion, so the operator deletes the word "神戸" (surname) from the dictionary and then inputs "こうべこう" once more, giving screen 2-3. Pressing the conversion key again gives screen 2-4, where the reading is now correctly converted to "神戸港".

第３図は単語削除の操作を説明した図である。
３−１は初期画面を示しており、この状態で単語
削除キーを打鍵すると３−２の画面になる。３−
２で単語削除が起動され、削除単語の入力ウイン
ドウが表示されている。ここで削除すべき単語の
読み「こうべ」をオペレータが入力すると、ウイ
ンドウ中に削除単語の読みが表示され、３−３の
画面になる。更に変換キーを打鍵すると、「こう
べ」が「神戸」に変換されてウインドウ中に表示
され、３−４の画面となる。ここで実行キーを打
鍵すると「神戸」の表記と読みが取り込まれ、品
詞の入力ウインドウが開き、３−５の画面にな
る。ここで品詞「姓」をオペレータが入力すると
３−６の画面になる。ここで実行キーを打鍵する
と、読み「こうべ」表記「神戸」品詞「姓」の単
語が辞書より削除され、終了メツセージが３−７
に示すように表示される。 FIG. 3 illustrates the word deletion operation. 3-1 shows the initial screen; pressing the word deletion key in this state gives screen 3-2, in which word deletion has been started and an input window for the word to be deleted is displayed. When the operator inputs "こうべ", the reading of the word to be deleted, the reading is displayed in the window, giving screen 3-3. Pressing the conversion key then converts "こうべ" to "神戸", which is displayed in the window, giving screen 3-4. Pressing the execution key here takes in the spelling and reading of "神戸" and opens a part-of-speech input window, giving screen 3-5. When the operator inputs the part of speech "姓" (surname), screen 3-6 appears. Pressing the execution key here deletes from the dictionary the word with reading "こうべ", spelling "神戸", and part of speech "姓", and a completion message is displayed as shown in 3-7.

第４図は辞書DICの構成を示した図である。辞
書はROM上に存在する。従つて、内容を変更す
ることはできない。 FIG. 4 is a diagram showing the configuration of the dictionary DIC. The dictionary exists on ROM. Therefore, the contents cannot be changed.

辞書は先頭に「辞書バージヨン」が格納され
る。これは辞書の内容に変更があつた時に更新さ
れるようなデータであり、全く同じ内容を持つ辞
書は同じ辞書バージヨンを持つ。後述するように
削除単語のアドレス部を更新する必要があるかど
うかはこの辞書バージヨンで管理される。 A "dictionary version" is stored at the beginning of the dictionary. This is data that is updated when the contents of the dictionary change, and dictionaries with exactly the same contents have the same dictionary version. As will be described later, whether or not the address part of the deleted word needs to be updated is managed by this dictionary version.

辞書バージヨンに引き続いて、単語データが格
納される。各単語データは「読み」「表記」「品
詞」からなる。 Following the dictionary version, word data is stored. Each word data consists of "pronunciation", "spelling", and "part of speech".

「読み」には単語の読み情報、例えば、「神戸」
であれば「こうべ」が記憶される。コードはJIS
Ｘ 0208コードの下位バイトなどを使用し、１文
字１バイドで格納される。 The "pronunciation" (reading) field holds the reading of the word; for "神戸", for example, "こうべ" is stored. It is stored at one byte per character, using a code such as the low byte of the JIS X 0208 code.

「表記」には単語の表記情報、例えば、「神戸」
であれば、「神戸」という字面が１文字２バイト
でJIS Ｘ 0208コード等を使用して格納される。 The "spelling" (notation) field holds the written form of the word; for "神戸", for example, the characters "神戸" are stored at two bytes per character, using the JIS X 0208 code or the like.

「品詞」は単語の品詞、例えば、「神戸」であ
れば、「姓」「地名」などが格納される。 "Part of speech" stores the part of speech of a word, for example, in the case of "Kobe", "surname", "place name", etc. are stored.
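The byte layout of the reading and spelling fields can be illustrated as follows. This is a sketch under assumptions: the text only says the low byte of the JIS X 0208 code is used "for example", the function names are hypothetical, and readings are assumed to consist solely of hiragana (JIS X 0208 row 4). Python's `euc_jp` codec is used because EUC-JP is the JIS X 0208 code with the high bit set on both bytes.

```python
HIRAGANA_LEAD = 0xA4  # EUC-JP lead byte of JIS X 0208 row 4 (hiragana)


def encode_reading(reading: str) -> bytes:
    """One byte per kana: the low byte of its JIS X 0208 code."""
    out = bytearray()
    for ch in reading:
        euc = ch.encode("euc_jp")
        if len(euc) != 2 or euc[0] != HIRAGANA_LEAD:
            raise ValueError(f"not a hiragana character: {ch!r}")
        out.append(euc[1] & 0x7F)  # strip the EUC high bit
    return bytes(out)


def decode_reading(data: bytes) -> str:
    """Inverse of encode_reading: restore the implied hiragana row."""
    return "".join(
        bytes([HIRAGANA_LEAD, b | 0x80]).decode("euc_jp") for b in data
    )


def encode_spelling(spelling: str) -> bytes:
    """Two bytes per character: the full JIS X 0208 code."""
    return bytes(b & 0x7F for b in spelling.encode("euc_jp"))


print(encode_reading("こうべ").hex(), encode_spelling("神戸").hex())
```

Dropping the high byte works for readings because every hiragana shares the same row, so the row number carries no information.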

第５図は辞書DICに格納されるデータの例を示
した図である。図に示すように辞書の単語データ
が読みの昇順（辞書式配列）で格納される。 FIG. 5 is a diagram showing an example of data stored in the dictionary DIC. As shown in the figure, word data in the dictionary is stored in ascending order of pronunciation (lexicographical arrangement).
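Storing the entries in ascending order of reading makes lookup by reading cheap. The patent does not say which search algorithm is used; the sketch below assumes a binary search, with invented sample data and a hypothetical `pointers_for` helper that returns list indices as stand-ins for dictionary addresses.

```python
from bisect import bisect_left, bisect_right

# Assumed sample data, kept in ascending order of reading.
READINGS = ["こう", "こう", "こうべ", "こうべ", "ないかく"]
SPELLINGS = ["功", "港", "神戸", "神戸", "内閣"]


def pointers_for(reading):
    """Return the positions of all entries with exactly this reading."""
    lo = bisect_left(READINGS, reading)
    hi = bisect_right(READINGS, reading)
    return list(range(lo, hi))


for ptr in pointers_for("こうべ"):
    print(ptr, SPELLINGS[ptr])
```

Because homophones are adjacent in a sorted dictionary, one search yields the whole run of entries sharing a reading.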

第６図は削除単語辞書DELDの構成を示した図
である。 FIG. 6 is a diagram showing the structure of the deletion word dictionary DELD.

削除単語辞書は外部記憶に保存されている。電
源立ち上げ時に必要部分が外部メモリから読み込
まれ、RAMにロードされる。 The deleted word dictionary is stored in external memory. When the power is turned on, the necessary parts are read from external memory and loaded into RAM.

削除単語辞書は３つの部分に分かれる。 The deletion word dictionary is divided into three parts.

６−１は「辞書バージヨン」を記憶する部分で
ある。辞書バージヨンは削除単語辞書のアドレス
部が作成された時の辞書DICの辞書バージヨンが
そのまま記憶される。 6-1 is a part that stores a "dictionary version". The dictionary version of the dictionary DIC when the address part of the deleted word dictionary was created is stored as is.

６−２は実体部であり、第７図に詳述するよう
に削除単語の読み、表記、品詞が記憶される。 Reference numeral 6-2 is the entity part, which stores the pronunciation, notation, and part of speech of the deleted word, as detailed in FIG.

６−３はアドレス部であり、第８図に詳述する
ように削除単語が辞書DIC上のどこに存在するか
を示すポインタを記憶する。 Reference numeral 6-3 is an address field, which stores a pointer indicating where the deleted word exists on the dictionary DIC, as detailed in FIG.

なお、RAM上に常に存在するのは６−１の辞
書バージヨンと６−３のアドレス部のみである。
６−２の実体部は普段は外部メモリにのみ存在す
る。アドレス部を再作成する必要が生じた時に６
−２の実体部が外部メモリより一時的にRAMに
ロードされる。それ以外の状況では、実体部に相
当するメモリは開放されており、別の目的のため
に有効利用されている。 Note that only the dictionary version 6-1 and the address section 6-3 are always present in RAM. The entity section 6-2 normally exists only in external memory. When the address section needs to be recreated, the entity section 6-2 is temporarily loaded from external memory into RAM. At all other times the memory corresponding to the entity section is released and put to use for other purposes.
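The three-part layout and its memory residency can be modelled roughly as below. The class and method names are hypothetical, and `None` is used here only as a convenient way to represent "the entity section is not currently in RAM".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Entry = Tuple[str, str, str]  # (reading, spelling, part of speech)


@dataclass
class DeletedWordDictionary:
    version: int                                        # 6-1: always in RAM
    addresses: List[int] = field(default_factory=list)  # 6-3: always in RAM
    entity: Optional[List[Entry]] = None                # 6-2: usually only on disk

    def load_entity(self, stored: List[Entry]) -> None:
        """Bring the entity section into RAM temporarily."""
        self.entity = list(stored)

    def release_entity(self) -> None:
        """Free the RAM used by the entity section."""
        self.entity = None


deld = DeletedWordDictionary(version=1)
deld.load_entity([("こうべ", "神戸", "姓")])
print(deld)
deld.release_entity()
```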

第７図は削除単語辞書実体部の詳細構成を示し
た図である。 FIG. 7 is a diagram showing the detailed configuration of the deletion word dictionary entity section.

「読み」「表記」「品詞」の３つのフイールドか
らなり、削除単語の読み、表記、品詞が記憶され
る。 It consists of three fields, "pronunciation", "notation", and "part of speech", which store the reading, spelling, and part of speech of each deleted word.

削除単語は、通常、辞書DICに存在するはずで
あり、存在する時は辞書DIC上の読み、表記、品
詞がそのまま記憶されることになる。 The deleted word should normally exist in the dictionary DIC, and when it exists, the pronunciation, spelling, and part of speech in the dictionary DIC will be stored as is.

図中では削除単語１は「神戸」（姓）、削除単語
２は「内閣」（名詞）となつている。 In the figure, deleted word 1 is "Kobe" (surname), and deleted word 2 is "cabinet" (noun).

なお、削除単語として辞書DICに存在しない単
語が記述されていても一向に差し支えない。存在
しない削除単語は単に無視されるだけである。 Note that there is no problem even if a word that does not exist in the dictionary DIC is written as a deleted word. Deletion words that do not exist are simply ignored.

第８図は削除単語辞書アドレス部の詳細構成を
示した図である。 FIG. 8 is a diagram showing the detailed structure of the deletion word dictionary address section.

削除単語辞書アドレス部には削除単語実体部に
格納されている削除単語一つ一つに対して、その
単語が辞書DICのどこに存在するかを記憶してい
る。 The deleted word dictionary address section stores, for each deleted word stored in the deleted word entity section, where that word exists in the dictionary DIC.

削除単語辞書アドレス部は仮名漢字変換処理の
高速化のために存在するデータであり、このデー
タがなくても、実体部と、辞書DICとからいつで
も再作成できる。 The deleted word dictionary address part is data that exists to speed up the kana-kanji conversion process, and even if this data is not present, it can be recreated at any time from the entity part and the dictionary DIC.

例えば、削除単語１は第７図によると「神戸」
（姓）であるので、辞書DIC上の「神戸」（姓）の
存在するアドレスを第１エントリーとして格納す
る。同様に削除単語２については「内閣」（名詞）
の存在するアドレスを第２エントリーとして格納
する。 For example, since deleted word 1 is "神戸" (surname) according to FIG. 7, the address at which "神戸" (surname) resides in the dictionary DIC is stored as the first entry. Likewise, for deleted word 2, the address at which "内閣" (noun) resides is stored as the second entry.

削除単語辞書アドレス部は、辞書DICのバージ
ヨンに依存するデータであり、作成された時の辞
書DICの辞書バージヨンが削除単語辞書の先頭に
格納される。また、辞書DICのバージヨンが変更
した時は、削除単語辞書実体部のデータを参照し
て再作成される。 The deleted word dictionary address field is data that depends on the version of the dictionary DIC, and the dictionary version of the dictionary DIC at the time of creation is stored at the beginning of the deleted word dictionary. Furthermore, when the version of the dictionary DIC changes, it is re-created by referring to the data in the deleted word dictionary entity section.

第９図はサーチ単語テーブルSWTBLの構成を
示した図である。 FIG. 9 is a diagram showing the structure of the search word table SWTBL.

サーチ単語テーブルは仮名漢字変換処理を行な
う過程において、入力読み列の解析を行なうのに
必要な単語が辞書DICのどこに存在するかを一時
的に記憶したテーブルである。 The search word table is a table that temporarily stores where in the dictionary DIC the words necessary to analyze the input pronunciation sequence exist in the process of performing the kana-kanji conversion process.

例えば、入力読み列が「こうべこう」であつた
ときは、その解析のために「こ」「こう」「こう
べ」「う」「うべ」「べこ」「こう」などの単語が必
要であり、それらの単語の辞書DIC上の存在位置
がフイールド「ポインタ」に記憶される。 For example, when the input reading is "こうべこう", words such as "こ", "こう", "こうべ", "う", "うべ", "べこ", and "こう" are needed for its analysis, and the positions of those words in the dictionary DIC are stored in the "pointer" field.
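How the needed words arise from an input reading can be sketched by enumerating its substrings and keeping those that occur in the dictionary. The dictionary contents, the `build_search_table` name, and the table row format `(start, reading, pointer)` are assumptions for illustration; the patent only specifies that a pointer is stored.

```python
# Assumed sample dictionary: (reading, spelling); index = pointer.
DIC = [
    ("う", "鵜"),
    ("うべ", "宇部"),
    ("こ", "子"),
    ("こう", "功"),
    ("こう", "港"),
    ("こうべ", "神戸"),
    ("べこ", "べこ"),
]


def build_search_table(reading):
    """Collect (start, substring, pointer) for every dictionary word
    that matches some substring of the input reading."""
    table = []
    for start in range(len(reading)):
        for end in range(start + 1, len(reading) + 1):
            sub = reading[start:end]
            for ptr, (word_reading, _spelling) in enumerate(DIC):
                if word_reading == sub:
                    table.append((start, sub, ptr))
    return table


for row in build_search_table("こうべこう"):
    print(row)
```

A linear scan over `DIC` keeps the sketch short; a real device would presumably exploit the sorted order of the dictionary instead.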

上述の実施例の動作をフローに従つて説明す
る。 The operation of the above embodiment will be explained according to the flow.

第１０図はキー入力を取り込み、処理を行なう
部分のフローチヤートである。 FIG. 10 is a flowchart of the part in which key input is received and processed.

ステツプ10−１はアドレス作成処理であり、第
１１図に示すように削除単語辞書の初期設定を行
なう。この処理は通常、電源ON直後に１回だけ
実行される。 Step 10-1 is address creation processing, in which the deletion word dictionary is initialized as shown in FIG. This process is normally executed only once immediately after the power is turned on.

ステツプ10−２はキーボードからのデータを取
り込む処理である。ステツプ10−３で取り込まれ
たキーの種別を判定し、各キーの処理ルーチンに
分岐する。 Step 10-2 is a process of taking in data from the keyboard. In step 10-3, the type of key taken in is determined, and the process branches to a processing routine for each key.

変換キーが入力されたときはステツプ10−４に
分岐し、ステツプ10−４において第１２図に詳述
するように仮名漢字変換の変換処理が行なわれ
る。その後ステツプ10−２に分岐する。 When the conversion key is input, the process branches to step 10-4, and in step 10-4, conversion processing for kana-kanji conversion is performed as detailed in FIG. Thereafter, the process branches to step 10-2.

単語削除キーが入力されたときはステツプ10−
５に分岐し、ステツプ10−５において第１４図に
詳述する単語削除補処理が行なわれる。その後ス
テツプ10−２に分岐する。 When the word deletion key is input, the process branches to step 10-5, where the word deletion processing detailed in FIG. 14 is performed. The process then branches to step 10-2.

その他のキーのときはステツプ10−６に分岐
し、挿入、削除等の通常の文字処理装置において
行なわれるその他の処理が行なわれる。その後ス
テツプ10−２に分岐する。 If any other key is pressed, the process branches to step 10-6, and other processing such as insertion, deletion, etc. performed in a normal character processing device is performed. Thereafter, the process branches to step 10-2.
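The control flow of FIG. 10 amounts to one initialization step followed by a key dispatch loop. In the sketch below the handlers are stubs that only return their step number, and the key names `"CONVERT"` and `"DELETE_WORD"` are hypothetical.

```python
def address_creation():   # step 10-1, detailed in FIG. 11
    return "10-1"


def convert():            # step 10-4, detailed in FIG. 12
    return "10-4"


def delete_word():        # step 10-5, detailed in FIG. 14
    return "10-5"


def other():              # step 10-6: insertion, deletion, and so on
    return "10-6"


HANDLERS = {"CONVERT": convert, "DELETE_WORD": delete_word}


def main_loop(keys):
    """Run initialization once, then dispatch each key in turn."""
    trace = [address_creation()]
    for key in keys:                        # step 10-2: take in a key
        handler = HANDLERS.get(key, other)  # step 10-3: judge key type
        trace.append(handler())
    return trace


print(main_loop(["CONVERT", "a", "DELETE_WORD"]))
```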

第１１図はステツプ10−１の「アドレス作成処
理」を詳細化したフローチヤートである。 FIG. 11 is a detailed flowchart of the "address creation process" in step 10-1.

ステツプ11−１において削除単語辞書の「辞書
バージヨン」「アドレス部」を外部メモリから
RAMにロードする。 In step 11-1, the "dictionary version" and "address section" of the deleted word dictionary are loaded from external memory into RAM.

ステツプ11−２において削除単語辞書の辞書バ
ージヨンと辞書DICの辞書バージヨンを比較す
る。一致した時はそのままリターンするが、一致
しない時は削除単語辞書アドレス部を再作成する
必要があるので、ステツプ11−３に進む。 In step 11-2, the dictionary version of the deleted word dictionary is compared with the dictionary version of the dictionary DIC. If they match, the process returns as is, but if they do not match, it is necessary to recreate the address section of the deletion word dictionary, so proceed to step 11-3.

ステツプ11−３において、再作成のためにまず
アドレス部を初期化し、辞書DICの辞書バージヨ
ンを辞書バージヨンとして削除単語辞書に設定す
る。また、削除単語辞書実体部を外部メモリから
RAMに読み込む。 In step 11-3, to prepare for re-creation, the address section is first initialized and the dictionary version of the dictionary DIC is set as the dictionary version of the deleted word dictionary. The entity section of the deleted word dictionary is also read from external memory into RAM.

ステツプ11−４において、削除単語を実体部か
ら１単語取り出す。 In step 11-4, one word to be deleted is extracted from the entity part.

ステツプ11−５において全ての削除単語につい
て処理が終了したかどうか判定し、処理が終了し
ている時はリターンする。処理が終了していない
時はステツプ11−６の削除単語アドレス決定に進
む。 In step 11-5, it is determined whether the processing has been completed for all deleted words, and if the processing has been completed, the process returns. If the processing has not been completed, the process advances to step 11-6 to determine the address of the word to be deleted.

ステツプ11−６において、取り出された削除単
語と同じものが辞書DIC上のどこに存在するかサ
ーチし、そのアドレスを求める。 In step 11-6, a search is made to see where in the dictionary DIC the same word as the extracted deleted word exists, and its address is obtained.

ステツプ11−７において、上記求めたアドレス
を削除単語辞書アドレス部に設定する。 In step 11-7, the address obtained above is set in the deletion word dictionary address field.

ついで、次の削除単語の処理を行なわないとい
けないのでステツプ11−４に分岐する。 Then, since the next deleted word must be processed, the process branches to step 11-4.
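Steps 11-2 through 11-7 can be sketched as follows. The dictionary contents, the version numbers, and the names `create_addresses` and `load_entity` are assumptions; `load_entity` stands in for reading the entity section from external memory, and list indices stand in for addresses. Deleted words that are absent from the dictionary are skipped, as the description of the entity section allows.

```python
DIC_VERSION = 2
DIC = [  # assumed sample data: (reading, spelling, part of speech)
    ("こう", "功", "名"),
    ("こう", "港", "名詞"),
    ("こうべ", "神戸", "姓"),
    ("ないかく", "内閣", "名詞"),
]


def create_addresses(deld, load_entity):
    """Rebuild deld['addresses'] only when the stored version is stale."""
    if deld["version"] == DIC_VERSION:       # 11-2: versions match
        return deld
    deld["addresses"] = []                   # 11-3: initialize
    deld["version"] = DIC_VERSION
    for word in load_entity():               # 11-4 / 11-5: each deleted word
        if word in DIC:                      # 11-6: search the dictionary
            deld["addresses"].append(DIC.index(word))  # 11-7: store address
    return deld


entity = [("こうべ", "神戸", "姓"), ("ないかく", "内閣", "名詞"), ("ほげ", "歩下", "名詞")]
stale = {"version": 1, "addresses": [7]}
print(create_addresses(stale, lambda: entity))
```

When the versions match, the function returns without calling `load_entity`, which mirrors the fact that the entity section stays on disk in the common case.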

第１２図はステツプ10−４の「変換処理」を詳
細化したフローチヤートである。 FIG. 12 is a detailed flowchart of the "conversion process" in step 10-4.

ステツプ12−１において入力読み列の解析に必
要な単語をサーチ単語テーブルSWTBLに登録す
るために第13に詳述する単語サーチ処理を行な
う。 In step 12-1, the word search processing detailed in FIG. 13 is performed in order to register the words needed for analyzing the input reading in the search word table SWTBL.

ステツプ12−２において、形態素解析、構文解
析等を行なつて入力読み列を解析し、文節候補を
作成する。 In step 12-2, the input pronunciation sequence is analyzed by morphological analysis, syntactic analysis, etc., and clause candidates are created.

ステツプ12−３において、各文節候補の尤度を
計算し、どの文節を変換するのが最も尤もらしい
かを判断し、第１候補として決定する。 In step 12-3, the likelihood of each clause candidate is calculated, the most plausible conversion is determined, and it is selected as the first candidate.

ステツプ12−４において、決定された第１候補
に基づいて変換結果を作成し、出力する。 In step 12-4, a conversion result is created and output based on the determined first candidate.

第１３図はステツプ12−１の「単語サーチ処
理」を詳細化したフローチヤートである。 FIG. 13 is a detailed flowchart of the "word search process" in step 12-1.

ステツプ13−１において、辞書DICより入力読
み列の解析に必要な単語の読み（サーチすべき読
み）を１つ決定する。 In step 13-1, one pronunciation (pronunciation to be searched) of a word necessary for analysis of the input pronunciation sequence is determined from the dictionary DIC.

ステツプ13−２においてサーチすべき読みがな
くなつたかどうか判定し、なくなつた時はリター
ンする。 In step 13-2, it is determined whether there are no more readings to be searched, and if there are no more readings to be searched, the process returns.

ステツプ13−３においてサーチすべき読みにつ
いて実際に辞書DICをサーチし、アドレスを求め
る。 In step 13-3, the dictionary DIC is actually searched for the reading to be searched, and the address is obtained.

ステツプ13−４において見つかつたアドレスが
削除単語辞書アドレス部に記載されているかどう
かを判定するため削除単語辞書アドレス部をサー
チする。 The deletion word dictionary address section is searched to determine whether the address found in step 13-4 is written in the deletion word dictionary address section.

ステツプ13−５において一致するアドレスがあ
つたかどうか判定し、もし、存在すれば、その単
語は削除されたと見なされるから、そのまま、ス
テツプ13−１に分岐し、次のサーチ読みの処理に
移る。存在しない時はその単語は削除されていな
いからステツプ13−６に進み、サーチ単語テーブ
ルにそのアドレスを登録する。 In step 13-5, it is determined whether a matching address was found. If one exists, the word is regarded as deleted, so the process branches directly to step 13-1 and moves on to the next reading to be searched. If none exists, the word has not been deleted, so the process proceeds to step 13-6 and registers its address in the search word table.
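The filtering in steps 13-1 through 13-6 can be sketched as below. The function name, sample dictionary, and use of a Python set for the address section are assumptions; list indices again stand in for dictionary addresses.

```python
def word_search(readings, dic, deleted_addresses):
    """Build the search word table, skipping deleted words by address."""
    swtbl = []
    for reading in readings:                     # 13-1 / 13-2: next reading
        for addr, (word_reading, _spelling) in enumerate(dic):  # 13-3
            if word_reading != reading:
                continue
            if addr in deleted_addresses:        # 13-4 / 13-5: deleted?
                continue
            swtbl.append(addr)                   # 13-6: register address
    return swtbl


DIC = [("こう", "功"), ("こう", "港"), ("こうべ", "神戸"), ("こうべ", "頭")]
print(word_search(["こう", "こうべ"], DIC, deleted_addresses={2}))
```

The check against the deleted set compares only addresses, so no reading, spelling, or part-of-speech comparison is needed during conversion.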

第１４図はステツプ10−５の「単語削除処理」
を詳細化したフローチヤートである。 FIG. 14 is a detailed flowchart of the "word deletion processing" in step 10-5.

ステツプ14−１において、画面上に単語削除の
ための表記入力のウインドウを表示する。 In step 14-1, a notation entry window for word deletion is displayed on the screen.

ステツプ14−２において、削除単語の表記をオ
ペレータから受付ける処理を行なう。オペレータ
が入力した読み、選択した表記は内部メモリに一
時的に取り込まれる。オペレータが表記を入力
し、実行キーを打鍵した時にステツプ14−３に移
ることになる。 In step 14-2, processing is performed to receive the notation of the deleted word from the operator. The reading input by the operator and the notation selected are temporarily stored in the internal memory. When the operator inputs the notation and presses the execution key, the process moves to step 14-3.

ステツプ14−３において、品詞入力のウインド
ウが表示される。ここでオペレータの入力する品
詞が内部メモリに取り込まれることになる。オペ
レータが品詞を入力し、実行キーを打鍵すると次
のステツプ14−４に移る。 In step 14-3, a window for inputting the part of speech is displayed. At this point, the part of speech input by the operator is imported into the internal memory. When the operator inputs the part of speech and presses the execution key, the process moves to the next step 14-4.

ステツプ14−４において、これまでの処理の結
果得られた読み、表記、品詞を削除単語データ実
体部に登録する。実体部は通常RAM上に存在し
ないから外部メモリから読み込まれ登録されるこ
とになる。 In step 14-4, the reading, spelling, and part of speech obtained by the processing so far are registered in the entity section of the deleted word data. Since the entity section is not normally in RAM, it is read in from external memory before the registration.

ステツプ14−５において、今削除された単語の
辞書DIC上でのアドレスを求める。 In step 14-5, the address of the word just deleted on the dictionary DIC is obtained.

ステツプ14−６において、求められたアドレス
を削除単語データアドレスにも登録する。 In step 14-6, the obtained address is also registered as a deletion word data address.

ステツプ14−７において単語削除の終了処理を
行なう。すなわち、削除単語辞書全体を外部メモ
リに保存し、RAM上にある実体部の領域を開放
し、終了メツセージをウインドウ上に表示する。
適当なタイミングでウインドウを消去してからリ
ターンする。 In step 14-7, word deletion is finalized: the entire deleted word dictionary is saved to external memory, the RAM area holding the entity section is released, and a completion message is displayed in the window. The window is cleared at an appropriate time and the process returns.
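Steps 14-4 through 14-7 can be sketched as follows, leaving out the window handling of steps 14-1 to 14-3. The `external` dictionary is a stand-in for the external memory DISK, and all names and sample data are assumptions.

```python
DIC = [  # assumed sample data: (reading, spelling, part of speech)
    ("こう", "港", "名詞"),
    ("こうべ", "神戸", "姓"),
    ("こうべ", "神戸", "地名"),
]


def delete_word(word, deld_ram, external):
    """Register a deleted word in both the entity and address sections."""
    entity = list(external["entity"])     # 14-4: read entity section into RAM
    entity.append(word)                   #       and register the word
    if word in DIC:                       # 14-5: find its dictionary address
        deld_ram["addresses"].append(DIC.index(word))  # 14-6: register it
    # 14-7: save the whole deleted word dictionary to external memory.
    external["version"] = deld_ram["version"]
    external["entity"] = entity
    external["addresses"] = list(deld_ram["addresses"])
    # The local entity copy is dropped on return, which models releasing
    # the RAM area of the entity section.


ram = {"version": 2, "addresses": []}
disk = {"version": 2, "entity": [], "addresses": []}
delete_word(("こうべ", "神戸", "姓"), ram, disk)
print(ram, disk)
```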

［他の実施例］以上の説明において、辞書の格納されるメモリ
としてROMの場合を説明したが、書込不可なメ
モリであれば事情は全て同じであり、本発明を適
用可能である。例えば、光デイスク、CDROMな
どであつても、書込ができないため直接単語を削
除することはできないが、本発明の原理で削除す
ることはできる。[Other Embodiments] In the above description, the case where a ROM is used as the memory in which the dictionary is stored has been explained, but the situation is the same as long as it is a non-writable memory, and the present invention can be applied. For example, even with optical discs, CDROMs, etc., words cannot be directly deleted because they cannot be written to, but they can be deleted using the principles of the present invention.

また、削除単語辞書の持ち方としてアドレスを
持つようにしたが、アドレス以外であつても辞書
のバージヨンに依存するような持ち方であれば、
やはり事情が同じであるので、本発明を適用でき
る。例えば、辞書の先頭からの単語の連番で記憶
するようにしても同様の構成で処理することがで
きる。 Although the deleted word dictionary above holds addresses, the present invention applies equally to any other representation that depends on the dictionary version. For example, storing each word's serial number counted from the head of the dictionary can be handled with the same configuration.

また、削除単語辞書の持ち方として単語の読
み、表記、品詞を持つようにしたが、他にも記憶
すべき単語情報があれば、記憶する必要があり、
また、マツチングに関係のない情報であれば、記
憶を省略することができる。例えば、辞書中に、
頻度が異なり、読み、表記、品詞が一致する単語
が存在するなら、頻度も削除単語辞書に記憶する
必要がある。ところが、読み、表記、品詞が一致
し、頻度のみ異なる単語が存在しないのであれ
ば、無理をして頻度を削除単語辞書に記憶する必
要はない。 In addition, the deleted word dictionary was described as holding the pronunciation, spelling, and part of speech of each word, but if there is other word information that should be stored, it must be stored as well; conversely, information unrelated to matching may be omitted. For example, if the dictionary contains words that differ in frequency but have the same pronunciation, spelling, and part of speech, the frequency must also be stored in the deleted word dictionary. However, if no words exist that match in pronunciation, spelling, and part of speech and differ only in frequency, there is no need to go out of the way to store the frequency in the deleted word dictionary.
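The effect of including or omitting the frequency in the matching key can be sketched as follows. The entries, field names, and the function `find_matches` are hypothetical and serve only to illustrate the point.

```python
def find_matches(dic, target, fields):
    """Return the indices of dictionary entries that equal target on every given field."""
    return [
        i for i, entry in enumerate(dic)
        if all(entry[f] == target[f] for f in fields)
    ]


# Hypothetical dictionary in which two entries differ only in frequency.
DIC = [
    {"pronunciation": "はし", "spelling": "橋", "part_of_speech": "noun", "frequency": 5},
    {"pronunciation": "はし", "spelling": "橋", "part_of_speech": "noun", "frequency": 2},
    {"pronunciation": "はし", "spelling": "箸", "part_of_speech": "noun", "frequency": 3},
]

BASE_FIELDS = ("pronunciation", "spelling", "part_of_speech")
target = DIC[1]  # the word the operator wants to delete

ambiguous = find_matches(DIC, target, BASE_FIELDS)                # frequency ignored
exact = find_matches(DIC, target, BASE_FIELDS + ("frequency",))   # frequency included
```

With only the three base fields the match is ambiguous (two entries), while adding the frequency singles out one entry. If the dictionary never contains such near-duplicates, the three base fields already suffice.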

［発明の効果］以上の説明から明らかなように本発明によれ
ば、仮名漢字変換用辞書が書込不可なメモリ、例
えば、ROMに存在しても、削除すべき単語の辞
書存在アドレスとその読み、表記及び品詞などの
単語情報の両方を記憶することにより、辞書のバ
ージヨンに依存しないデータ互換性の高い削除単
語辞書を実現し、なおかつ変換時間にも負担を欠
けることがないので、単語削除したいオペレータ
にとつて使い易い文字処理装置を実現することが
できる。[Effects of the Invention] As is clear from the above description, according to the present invention, even when the kana-kanji conversion dictionary resides in a non-writable memory such as a ROM, storing both the address at which each word to be deleted exists in the dictionary and its word information, such as pronunciation, spelling, and part of speech, realizes a deleted word dictionary that has high data compatibility and does not depend on the version of the dictionary, without placing any burden on conversion time. A character processing device that is easy to use for an operator who wants to delete words can thus be realized.
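The division of work between startup and conversion can be sketched in Python. This is a rough illustration, not the device's actual implementation: the data, the names `search_invalid_pointers` and `convert`, and the use of a Python set for the stored pointers are all assumptions.

```python
# Hypothetical read-only dictionary DIC: (pronunciation, spelling, part of speech).
DIC = (
    ("かんじ", "漢字", "noun"),
    ("かんじ", "感じ", "noun"),
    ("かんじ", "幹事", "noun"),
)

# Hypothetical deleted word dictionary: word information only, independent of DIC's version.
DELD = [("かんじ", "幹事", "noun")]


def search_invalid_pointers(dic, deld):
    """At startup, search DIC once for every deleted word and collect the pointers."""
    wanted = set(deld)
    return {i for i, entry in enumerate(dic) if entry in wanted}


def convert(dic, invalid_pointers, kana):
    """Return candidate spellings for kana, skipping entries whose pointer is invalidated."""
    return [
        spelling
        for i, (pronunciation, spelling, _pos) in enumerate(dic)
        if pronunciation == kana and i not in invalid_pointers
    ]


INVALID = search_invalid_pointers(DIC, DELD)   # done once, at startup
candidates = convert(DIC, INVALID, "かんじ")    # done for every conversion
```

In this sketch the comparison of pronunciation, spelling, and part of speech happens only once at startup; during conversion each candidate needs only a pointer membership test, which is why the extra cost per conversion stays small.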

[Brief Description of the Drawings]

第１図は本発明の全体構成のブロツク図、第２
図は本発明における仮名漢字変換の操作例を示し
た図、第３図は本発明における単語削除の操作の
例を示した図、第４図は本発明における辞書DIC
の構成を示した図、第５図は本発明における辞書
DICに格納される単語の例を示した図、第６図は
本発明における削除単語辞書の全体構成を示した
図、第７図は本発明における削除単語辞書実体部
の構成を示した図、第８図は本発明における削除
単語辞書アドレス部の構成を示した図、第９図は
本発明におけるサーチ単語テーブルの構成を示し
た図、第１０図〜第１４図は本発明文字処理装置
の動作を示すフローチヤート。 DISK……外部メモリ、CPU……マイクロプロ
セツサ、ROM……読出し専用メモリ、RAM…
…ランダムアクセスメモリ、DIC……仮名漢字変
換用辞書、DELD……削除単語辞書、SWTBL…
…サーチ単語テーブル。 FIG. 1 is a block diagram of the overall configuration of the present invention; FIG. 2 is a diagram showing an example of the kana-kanji conversion operation in the present invention; FIG. 3 is a diagram showing an example of the word deletion operation in the present invention; FIG. 4 is a diagram showing the configuration of the dictionary DIC in the present invention; FIG. 5 is a diagram showing examples of words stored in the dictionary DIC in the present invention; FIG. 6 is a diagram showing the overall configuration of the deleted word dictionary in the present invention; FIG. 7 is a diagram showing the configuration of the entity section of the deleted word dictionary in the present invention; FIG. 8 is a diagram showing the configuration of the address section of the deleted word dictionary in the present invention; FIG. 9 is a diagram showing the configuration of the search word table in the present invention; and FIGS. 10 to 14 are flowcharts showing the operation of the character processing device of the present invention.
DISK: external memory; CPU: microprocessor; ROM: read-only memory; RAM: random access memory; DIC: kana-kanji conversion dictionary; DELD: deleted word dictionary; SWTBL: search word table.

Claims

[Scope of Claims] 1. A character processing device characterized by comprising:
input means for inputting a kana character string;
non-rewritable first dictionary means for storing words by associating the reading of each word with word information of the word including its notation;
conversion means for referring to the first dictionary means and converting the kana character string input from the input means into the notation of a word whose reading is that kana character string;
second dictionary means for storing, for those words stored in the first dictionary means that are to be invalidated in the conversion by the conversion means, the reading of each word to be invalidated in association with word information of that word including its notation;
invalid word registration means for registering words to be invalidated in the second dictionary means;
search means for searching the first dictionary means at startup to obtain the pointer, in the first dictionary means, of each word whose reading and word information match those of a word stored in the second dictionary means;
invalid word pointer storage means for storing the pointers obtained by the search means; and
control means for, in the conversion by the conversion means, obtaining the pointer in the first dictionary means of a word whose reading is the input kana character string, determining whether a pointer matching the obtained pointer is stored in the invalid word pointer storage means, and, when it is determined to be stored, performing control so as to exclude the word corresponding to that pointer from the conversion targets.