JP4032775B2

JP4032775B2 - Encoding system and program

Info

Publication number: JP4032775B2
Application number: JP2002056731A
Authority: JP
Inventors: 裕之栗山; 伴　　秀行; 邦明南; 武志折出
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2002-03-04
Filing date: 2002-03-04
Publication date: 2008-01-16
Anticipated expiration: 2022-03-04
Also published as: JP2003256462A

Description

【０００１】
【発明の属する技術分野】
本発明は、表記にゆらぎのある語句のコード化を容易にするコード化処理技術、特に医療機関が疾病動向等の分析を行うために傷病名をコード管理する電子カルテシステム、もしくは保険者や審査支払機関が診療報酬明細書に記載された傷病名をコード化して統計処理を行うレセプト処理システムに関する。
【０００２】
【従来の技術】
医師は患者を診察する際に病状、病態に基づき、その患者の傷病名を記録する。傷病名の表記方法は個々の医師の裁量に任されており、標準的な傷病名以外にも慣用的な傷病名、仮名遣いが異なるなど、同じ病状、病態を表現するにもさまざまな傷病名が発生する。加えて患者の病態、病状をより正確に表現するさまざまな修飾語句が付加されるため、傷病名の表記はさらに多岐にわたり、これらの表記ゆらぎが傷病名をコード管理する観点からは大きな障害となっている。
たとえば健康保険組合等の保険者は、各医療機関より発行された診療報酬明細書（以下、レセプトと言う）の傷病名欄に記載されている傷病名を集計し、効果的な保健事業展開の目的で疾病動向の統計処理を行なっている。
疾病動向を分析するためには傷病名をコード化して集計する必要があるが、従来この作業はレセプトに記載された傷病名を知識、経験のある作業者が対応するコードに手作業で変換、集計していた。しかし保険者によっては月に数十万〜数百万件ものレセプトが届き、毎月集計を行うには時間的にもコスト的にも負担が重く、機械化されることが望まれている。
従来の技術によってこの傷病名読み取りの機械化を実現するには、たとえばＯＣＲ装置（文字読取装置）やパンチ入力によりレセプト記載の傷病名文字列を入力し、傷病名およびコードが格納された記録媒体である傷病名マスタと照合して該当傷病名をコード化する方法がある。ところがレセプト記載の傷病名文字列は前述した表記ゆらぎがあるため、傷病名マスタと照合しても確実にコードを得ることは必ずしも容易ではない。そのため、傷病名マスタとは別にユーザ辞書を設け、検出されなかったレセプト記載の傷病名文字列は対応するコードとともに登録しておく必要があった。
【０００３】
【発明が解決しようとする課題】
従来の技術によって多くの傷病名を自動的にコード化するためには、より多くの表記ゆらぎ傷病名をユーザ辞書に登録しておかなくてはならない。
そのため従来の技術では傷病名マスタ照合時にコード化できなかった傷病名を目視により確認し、必要に応じてユーザ辞書に追加する必要があった。
ユーザ辞書への登録作業は、コード化できなかった傷病名に最も近いと思われる傷病名マスタ上の傷病名を探し出し、そのコードを対応付けることになる。この作業は手間がかかると同時に、登録するユーザによってばらつきや登録ミスが生じる可能性がある。
またコード管理に使用するコード体系を変更する際には、すべてのユーザ辞書上の傷病名をもう一度新たなコード体系を用いてコードを対応付け直さなくてはならない。
このようにユーザ辞書を用いたコード化技術はユーザにとって大変な労力を強いるため、極力標準的な傷病名マスタだけを用いてコード管理したほうが作業効率は各段に向上するとともに、ばらつきや登録ミスの防止の観点からも安全であるといえる。
したがって本発明の目的は、少ない負担でより多くの表記ゆらぎを傷病名マスタに照合しやすいコード化方法を提供することにある。
また本発明のもうひとつの目的は、傷病名マスタ照合の結果、完全に一致する傷病名がなかった場合には、適切と思われる傷病名とその一致度合いを評価し、ユーザに提示する環境を提供することにある。
また本発明のもうひとつの目的は、評価の結果ユーザが別の傷病名のコードに対応付ける場合、近似していると思われる傷病名マスタ上の傷病名を複数提示し、ユーザが最適な傷病名を選択できる手段を提供することにある。
【０００４】
【課題を解決するための手段】
本発明の傷病名のコード化方法は上記目的を達成するために、
入力された傷病名を傷病名マスタに照合する前に、傷病名上の表記ゆらぎ部分を標準的な表記に変換してマスタ傷病名が検出されやすくする。
また、本発明の傷病名のコード化方法は上記目的を達成するために、
入力された傷病名文字列内に複数の傷病名や修飾語句が含まれている場合、それらがどのような構造で配置されているかを解析し、その結果を元に傷病名文字列内に含まれている傷病名を用いてコード化可能か否かをユーザに提示する。
また、本発明の傷病名のコード化方法は上記目的を達成するために、
ユーザが入力された傷病名文字列に対して、他のコードを対応付けるべきであると判断した場合、入力された傷病名の近似傷病名を複数検出してユーザに提示し、その中から適切なコードを選択できるようにする。
【０００５】
【発明の実施の形態】
以下に本発明のコード化方法における第一の実施形態であるコード化システムについて説明する。
図１は本実施形態の基本構成図で、コンピュータ装置１に記録媒体に本発明のコード化方法を実行するプログラムが格納されている。図示されていないＯＣＲ装置、パンチ入力、フレキシブルディスクなどの外部記録媒体、他のコンピュータ装置とのネットワーク接続などを介して入力される傷病名文字列データ２はコンピュータ装置１内でコード化処理された結果、傷病名コード化結果データ３として出力される。傷病名コード化結果データ３の出力はフレキシブルディスクなどの外部記録媒体へ記録、ネットワーク接続などを介して他のコンピュータ装置に出力、あるいはプリンタによって帳票上に印刷されてもよい。
コンピュータ装置１は処理プログラムとして表記ゆらぎ標準化手段４、傷病名マスタ照合手段６、文字列構造解析手段８、近似傷病名検索手段１０、照合結果修正手段１２、制御手段１４から構成される。また記録領域として表記ゆらぎテーブル５、傷病名マスタテーブル７、構造式テーブル９、近似傷病名テーブル１１、コード化結果保存テーブル１３、照合テーブル１５を有し、処理結果を操作者が管理、修正するための周辺機器としてディスプレイ等の表示手段１６とキーボード、マウス等の入力手段１７を備える。
以下に図２を用いて本実施形態の大まかな動作について説明する。
まず外部より取得した傷病名文字列データ２である入力文字列は、表記ゆらぎ標準化手段４によって表記ゆらぎテーブル５に照合される。表記ゆらぎテーブル５にはあらかじめ慣用的表現や仮名遣いのゆらぎ語句と、その標準的な表記である標準語句が格納されており、ゆらぎ標準化手段４は表記ゆらぎテーブル５より入力文字列内のゆらぎ語句を検出し、対応する標準語句に変換する。
図２の例では傷病名文字列データ２として入力文字列「右前腕接触性皮フ炎」が入力されている。「皮膚」という語句は、慣用的に「皮フ」という表記で記されることがあり、表記ゆらぎテーブル５上には、あらかじめゆらぎ語句として「皮フ」、その標準語句として「皮膚」が登録してある。そのため入力文字列「右前腕接触性皮フ炎」は「右前腕接触性皮膚炎」という標準化文字列に変換される。このように表記ゆらぎ標準化手段４と表記ゆらぎテーブル５を用いることにより、入力文字列の中に含まれている慣用的な表現や、仮名遣いのばらつきによる表記ゆらぎを標準的な表記に変換することができる。したがって表記にゆらぎのある入力文字列も標準的な文字列としてコード化が可能になる。同様の効果を従来の技術のようにゆらぎ標準化手段４と表記ゆらぎテーブル５を用いずに実現する場合、ユーザ辞書には「皮膚炎」、「皮膚裂傷」のように「皮膚」という語句の含まれるすべての傷病名を「皮フ」と書き換えた語句である「皮フ炎」、「皮フ裂傷」をコードとともに登録しておかなくてはならない。「皮膚」がつく標準的な傷病名だけでも４００種類以上あり、すべての表記ゆらぎに対して登録を行えば、ユーザ辞書は巨大なものになる。このためユーザ辞書への登録作業の負担、照合時間が著しく増大するとともに、登録時の入力ミスによるコード化変換の危険性も増大する。
よって本実施携形態によるコード化方法のように、傷病名全体をユーザ辞書に登録するのではなく、表記ゆらぎ部分のみを変換した上で標準的な傷病名マスタに照合したほうが効率的で、登録ミスの危険性が低減する。
次に生成された標準化文字列は傷病名マスタ照合手段６によって傷病名マスタテーブル７に照合される。傷病名マスタテーブル７には、標準的な傷病名および傷病名に付加される修飾語句がマスタ語句として登録されており、またそれぞれのマスタ語句に対応するコードおよび語句属性が格納されている。傷病名マスタ照合手段６は標準化文字列の中に含まれているすべてのマスタ語句を検出、その語句のコードおよび語句属性を抽出する。
ここでいう語句属性とはマスタ語句の種類を示すパラメータであり、傷病名なのか修飾語なのか、修飾語句であるならばどのような種類の修飾語句なのかを分類するために使用する。図２の例では標準化文字列「右前腕接触性皮膚炎」を傷病名マスタテーブル７に照合した結果、マスタ語句「右」、「前腕」、「接触性皮膚炎」が検出され、それぞれの語句属性は位置を示す修飾語句である「位置」、身体の一部を示す修飾語句である「部位」、傷病名であることを示す「傷病」であるという結果を得ている。
マスタ照合の結果より、入力された傷病名文字列は位置、部位に関する修飾語句を含んだ傷病名「接触性皮膚炎」であり、そのコードは「６９２９１９６」であると判断できる。しかし一般に入力文字列に傷病名と修飾語句が含まれている場合、その傷病名が入力文字列に対応する傷病名であると断定することはできない。たとえば入力文字列が「皮膚炎（接触性）」のように記載されていることもあり、この場合傷病名マスタに照合した結果、「皮膚炎」という傷病名と「接触性」という「病因」を示す修飾語句として検出され、連続した「接触性皮膚炎」という傷病名は検出できない。したがってコードは「皮膚炎」を意味する「６８６９０４３」となる。このように傷病名文字列に含まれている傷病名をそのままコード化しても必ずしも適切なコード化結果が得られないこともある。
そこで本実施形態では入力文字列に含まれる傷病名、修飾語句などのマスタ語句がどのように配置されているかを示す構造式という概念を用いて、コード化結果が正当なものであるか否かを評価する。
構造式によるコード化評価には文字列構造解析手段８と構造式テーブル９を用いる。文字列構造解析手段８は傷病名マスタ照合手段６の生成した照合結果のうち語句属性に着目し、傷病名文字列内に検出されたマスタ語句の順に対応する語句属性を並べ、これを構造式として扱う。文字列構造解析手段８は構造式テーブル９にこの構造式を照合し、同一の構造式が検出された場合には対応する構造評価より、入力された傷病名文字列に含まれる傷病名のコードを採択できるか否か判断する。また同一の構造式が発見されなかった場合には採択不可と判断する。
図２の例では「右前腕接触性皮膚炎」の場合であれば、その構造式は「位置部位傷病」である。この構造式は構造式テーブル９においてその構造評価を「良」として登録されており、したがって傷病名文字列内に含まれている傷病名「接触性皮膚炎」のコードを採択可能であると判断する。
傷病名と修飾語句によって構成される傷病名文字列は、その組み合わせが無限にあるため、一つ一つの傷病名文字列に対して適切なコード化が行われているかどうか評価するのは困難である。しかし本実施形態のように傷病名文字列を構造式に置換えれば、そのほとんどが数百種類程度に収束できるので、構造式毎に採択可能であるかどうか評価すれば、ほとんどの傷病名文字列は自動的にコード化可能か否か判定できる。したがって入力された傷病名が傷病名マスタに一致する傷病名がなかった場合でも、ユーザは目視による確認を最小限に絞り込むことができる。
構造式毎の評価方法は以下のとおりである。一般に傷病名文字列に傷病名と「位置」や身体の「部位」を示す修飾語句が含まれていても、それらの修飾語句が傷病名に対する影響は少ないと考えられる。たとえば「右前腕接触性皮膚炎」が「左下腿接触性皮膚炎」であっても、「接触性皮膚炎」であることに変わりはない。しかし含まれている修飾語句が「病因」や「状態」を示すものであると、これらの傷病名は別の標準的な傷病名で言い表せる可能性があるため、傷病名文字列内に含まれている傷病名がそのまま採用できるとは限らない。また傷病名文字列内に複数の傷病名が検出された場合も、それが単に傷病名の羅列なのか、あるいは二つの傷病名で一つの意味を示しているか判断が難しく、最終的にはユーザによる目視確認を要する。
構造式によるコード化結果評価の提示、ユーザによる目視確認は照合結果修正手段１２によって行う。照合結果修正手段１２はここまでの処理によって求められた入力文字列、標準化文字列、マスタ語句として抽出された傷病名およびそのコード、構造評価をユーザに提示する。ユーザはこれらの情報をもとにこのコード化結果が採択可能であるか否かを判断し、必要に応じて修正を加える。これらのコード化結果はコード化結果保存テーブル１３に格納し、必要に応じて出力される。
コード化結果の修正には近似傷病名検索手段１０および近似傷病名テーブル１１を用いる。近似傷病名検索手段１０は表記ゆらぎ標準化手段４によってすでに生成されている標準化文字列を近似傷病名テーブル１１にあいまい検索を行い、標準化文字列と近似している複数の傷病名およびそのコードを抽出する。このあいまい検索結果は再び照合結果修正１２によってユーザに提示され、ユーザは提示されたいくつかの傷病名の中でもっとも適切であると思われる傷病名を選択することにより、そのコードがコード化結果保存テーブル１３に格納される。
従来の技術によればこの近似傷病名検索作業はユーザの手によって行われ、ユーザは傷病名とコードの対応表などをもとに適切な傷病名を探し出し、そのコードを入力することによりコード化していた。しかしこの方法ではユーザの検索作業の負担は大きく、またより適切な傷病名があるにもかかわらず他の傷病名のコードを登録してしまったり、コード入力時の入力ミスにより誤ったコードを登録してしまう危険性があった。本実施形態では事前に構造評価を行い、目視確認を要する必要最小限の入力文字列を抽出し、ユーザによるコードの変更は複数の近似傷病名の中からもっとも適切な傷病名を選択する作業によって実現するため、ユーザのコード化作業の負担は著しく低減するとともに、入力ミス等によるコード化の誤りを最大限抑制することが可能である。
以上のように、本実施形態によれば表記にゆらぎのある入力文字列を、より少ないユーザ作業負担により的確にコード化することが可能なコード化システムを提供することができる。
以下に図３、図４を用い、本実施形態の表記ゆらぎ標準化手段４と表記ゆらぎテーブル５の詳細な説明を行う。
図３は本実施形態の表記ゆらぎテーブル５の具体例を示す図である。表記ゆらぎテーブル５には、少なくとも慣用的な表記や仮名遣いのばらつきであるゆらぎ語句３０１と、それに対応する標準的な表記である標準語句３０２が格納されている。
表記ゆらぎテーブル５の目的は医師の自由裁量によって記載される様々な傷病名の表記ゆらぎを、傷病名マスタテーブル７に格納されているマスタ語句の表記に収束させることにある。よってゆらぎ語句には傷病名によく見受けられる表記のばらつき、たとえば慣用的な表記（「皮膚」と「皮フ」）、ひらがな／カタカナ／漢字表記の違い（「びらん」、「ビラン」、「糜爛」）、漢字の違い（「頸」と「頚」）、発音の違い（「ウィルス」と「ウイルス」）、長音記号の有無（「カタール」と「カタル」）などを、それぞれ傷病名マスタテーブル７に登録されている標準的な表記である標準語句とともに格納する。
図４は表記ゆらぎ標準化手段４の詳細な動作を示すフローチャート図である。
まずステップ４０１で制御手段１４により傷病名文字列データ２として取りこまれた入力文字列を表記ゆらぎ標準化手段４に取りこむ。
次にステップ４０２で表記ゆらぎテーブル５のゆらぎ語句３０１を参照し、入力文字列の一部に該当する語句がないかどうかを検索する。
もし該当するゆらぎ語句が発見された場合には、ステップ４０３でそのゆらぎ語句の部分を表記ゆらぎテーブル５の標準語句３０２に格納された語句に置換する。ステップ４０２、４０３によりすべてのゆらぎ語句が標準語句に変換されたら、ステップ４０４にてその文字列を標準化文字列として制御手段１４に返す。
このとき変換されるのは必ずしも入力文字列の一部とは限らず、入力文字列全体がゆらぎ語句として標準語句に変換されたり、入力文字列にゆらぎ語句が検出されずにそのまま標準化語句として制御手段１４に返されることもある。
以上、表記ゆらぎ標準化手段４と表記ゆらぎテーブル５により、入力文字列に含まれる表記ゆらぎが標準化され、次の工程である傷病名マスタテーブル６への照合の際に、より多くの傷病名文字列がコード化可能になる。
従来技術のようにユーザ辞書に表記ゆらぎを含む傷病名文字列とその対応するコードを格納する方法と比較し、本実施形態ではあくまで文字列上で表記ゆらぎを標準化するだけなので、次の工程で用いる傷病名マスタテーブル７に依存することがない。
つまり傷病名マスタテーブル７で用いるコード体系を変更するような場合、ユーザ辞書はすべての登録傷病名文字列のコードを新たなコード体系に基づき変更しなくてはならないが、本実施形態では文字列を置換えるだけなので、コード体系には依存しない。したがってユーザ辞書のコードをすべて書きかえるような膨大な作業負荷を回避することができる。
以下に図５、図６、図７、図８、図９を用い、本実施形態の傷病名マスタ照合手段６と傷病名マスタテーブル７の詳細な説明を行う。
図５は本実施形態における傷病名マスタテーブル７の具体例を示す図である。傷病名マスタテーブル７には少なくとも傷病名、修飾語句などのマスタ語句５０１とそれに対応するコードであるコードＡ５０２、コードＢ５０３及びマスタ語句の種類を表す語句属性５０４が格納される。格納されるコードは必ずしも二種類である必要はなく、一種類もしくは三種類以上格納してもかまわない。
一般に広く使われているコードとしては、レセプトに記載する傷病名のコード化を主な目的とするレセプト電算処理システムの病名マスタと修飾語マスタがあり、数多くの傷病名や修飾語がコード付けされている。また傷病名をもとに疾病動向分析を行う際など、あまりコードの種類が多すぎると分類が細かくなりすぎる場合には、膨大な数の傷病名を１１９項目に分類してコード付けしてある社会保険表章用疾病分類（中分類コード）がよく用いられる。このようにコードはその利用目的によってユーザが選択できることが好ましく、傷病名マスタテーブル７に複数のコード体系を格納しておけば、ユーザの利用目的に合わせて適切なコード体系を選択することができる。
図６は傷病名マスタテーブル７に格納される語句属性５０４の例を示す表である。本実施形態では傷病名マスタテーブル７に格納されるすべてのマスタ語句に語句属性が対応付けられている。しかし傷病名文字列として自由に記載される可能性のあるすべての語句をマスタ語句として登録することはできないので、マスタ語句として登録されていない文字列は語句属性が「未知」であるとして扱う。
図７は傷病名マスタ照合手段６の詳細な動作を示すフローチャート図である。
まずステップ７０１で制御手段１４により標準化文字列を入力文字列として傷病名マスタ照合手段６に取りこむ。
ステップ７０２では傷病名マスタテーブル７のマスタ語句５０１を参照し、入力文字列の一部に該当する語句がないかどうかを検索する。
ステップ７０３にて該当する語句が検出された場合、ステップ７０４でそのマスタ語句５０１、コードＡ５０２、コードＢ５０３、語句属性５０４を照合テーブル１５に格納する。すべての該当するマスタ語句が照合テーブル１５に格納されたら、ステップ７０５で必要なマスタ語句だけを選択し、残りのマスタ語句は照合テーブル１５より消去する。これはたとえば「前腕」という語句に検出されるマスタ語句は「前」、「腕」、「前腕」と３種類あるが、必要なのは「前腕」だけであり、他のマスタ語句は不用だからである。
ステップ７０６では最終的に選択されたマスタ語句と、そのもとになる入力文字列を比較し、入力文字列内にマスタ語句が検出されなかった連続する部分文字列を語句属性「未知」として照合テーブル１５に格納する。このときコードＡ、コードＢには何も格納されない。
図８は図７のステップ７０５において必要なマスタ語句だけを選択するためのルールを説明する表である。マスタ照合により複数のマスタ語句が検出された場合、検出されたマスタ語句Ａ、マスタ語句Ｂの入力文字列内での位置、文字数からマスタ語句Ａ、Ｂいずれが選択されるべきかを調べている。
図８の１）から３）のようにマスタ語句Ａとマスタ語句Ｂの関係がいずれか一方が他方の語句に内含されてしまう関係にあるならば、長いほうのマスタ語句が選択される。たとえば「前腕」と「前」とでは「前腕」が選択される。
図８の４）のようにマスタ語句Ａとマスタ語句Ｂが入力文字列内でまったく重ならない関係であれば、両者とも独立したマスタ語句であるため、いずれも選択する。
図８の６）から８）のようにマスタ語句Ａとマスタ語句Ｂが前後にずれて入力文字列内に配置されている場合、マスタ語句Ａとマスタ語句Ｂいずれか文字数の多いほうを選択するが、文字数の同一である場合に限って６）のように後方にずれているほうを選択する。
図９は図７のステップ７０５において必要なマスタ語句が選択され、ステップ７０６で未知の文字列が登録される前後の照合テーブル１５の状態を示す図である。
図９では例として「右前腕接触性皮膚炎の可能性」という傷病名文字列が傷病名マスタ照合手段６に入力されたとする。マスタ照合の結果、マスタ語句として傷病名マスタテーブルより検出されたのは「右」、「前」、「前腕」、「腕」、「接触性皮膚炎」の５個のマスタ語句であり、図９の上の表に示すように照合テーブル１５にはそれぞれのマスタ語句、コードＡ、コードＢおよび語句属性が格納される。
その後、図７のステップ７０５により、図８に示した選択ルールに則り必要なマスタ語句だけを選択した結果、「右」、「前腕」、「接触性皮膚炎」だけが残る。次にステップ７０６が入力文字列との比較を行った結果、「の可能性」という文字列がマスタ語句として検出されなかったことがわかるため、この「の可能性」を語句属性「未知」であるマスタ語句として照合テーブル１５に登録した結果、図９の下の表に示す通りになる。
以上、傷病名マスタ照合手段６と傷病名マスタテーブル７により、入力された傷病名文字列に含まれる各マスタ語句のコードが取得される。また語句属性の取得により次の工程である文字列構造解析手段８で構造解析を行い、このコード化結果の評価を行うことができる。
以下に図１０、図１１を用い、本実施形態の文字列構造解析手段８と構造式テーブル９の詳細な説明を行う。
図１０は本実施形態における構造式テーブル９の具体例を示す図である。構造式テーブル９には少なくとも語句属性の組み合わせである構造式１００１と、その構造式を有する傷病名文字列を検出された傷病名のコードで採択可能であるか否かを示す構造評価１００２が格納されている。またその構造式がどのような理由で採択可否であるかの説明文であるコメント１００３も格納しておけば、ユーザによる目視確認の際にコメント１００３もユーザに提示し、的確な判断を促す補足説明を行うことができる。
構造評価１００２は必ずしも採択可否の二値である必要はなく、図１０に示すように「優」、「良」、「可」、「不可」などと数段階に分類して登録しておくことにより、ユーザは目視確認の際に構造評価１００２を参照し、厳密にコード化したい場合には「優」以外すべて、さほど厳密さが要求されない場合には「不可」のみ目視確認するなど、ニーズや状況に応じてコード化の精度を使い分けることが可能になる。
図１１は文字列構造解析手段８の詳細な動作を示すフローチャート図である。
まずステップ１１０１で照合テーブル１５にある傷病名文字列内に検出されたマスタ語句を傷病名文字列内に出現した順番に並べ替え、各マスタ語句に対応する語句属性を並べ替えられたマスタ語句の順番通りに並べて連結し、構造式を得る。たとえば「左前腕接触性皮膚炎」の構造式は「位置部位傷病」になる。
ステップ１１０２にて文字列構造解析手段８は構造式テーブル９の構造式１００１を参照し、ステップ１１０１で得られた構造式と同一の構造式を検索する。
ステップ１１０３で構造式テーブル９内に同一の構造式が検出された場合は、ステップ１１０４で該当する構造式の構造評価１００２とコメント１００３を制御手段１４に返す。
またステップ１１０３で同一の構造式は検出されなかった場合には、構造評価を「不可」、コメントを「構造式が登録されていません」として制御手段１４に返す。
以上、文字列構造解析手段８と構造式テーブル９により、入力された傷病名文字列に含まれるマスタ語句の語句属性の配列である構造式を用いて、その傷病名文字列に含まれる傷病名のコードを傷病名文字列のコードとして用いることが可能であるか否かの構造評価ができる。よってユーザは傷病名文字列が完全にマスタ語句と一致しなかった場合でも、その傷病名文字列とコード化結果を目視確認する必要があるか、構造評価を元に判断することができるため、目視確認作業を必要最小限に絞り込むことができる。
以上のように入力された傷病名文字列の表記ゆらぎの標準化、マスタ照合によるコードの取得、構造評価によるコード化結果の評価の一連の処理は自動的に行うことが可能である。したがってユーザは大量の傷病名文字列が入力された場合でも、すべての傷病名文字列の自動処理終了後に表示画面を閲覧し、構造評価により自動コード化できない傷病名文字列だけを目視確認し、必要に応じて他にコードに書き換えることができる。
以下に図１２、図１３を用い、本実施形態の照合結果修正手段１２とコード化結果保存テーブル１３の詳細な説明を行う。
図１２は本実施形態におけるコード化結果保存テーブル１３の具体例を示す図である。
次々と入力された傷病名文字列は各工程を経てコード化され、コード化結果保存テーブル１３に蓄積されていく。コード化結果保存テーブル１３に格納される項目は、まず入力されたそのままの文字列である入力文字列１２０１、次に表記ゆらぎ標準化手段４により標準化された標準化文字列１２０２、傷病名マスタ照合手段６により抽出されたマスタ語句のうち、語句属性が「傷病」である採用傷病名１２０３、採用傷病名１２０３のコードであるコードＡ１２０４、採用傷病名の他のコード体系によるコードであるコードＢ１２０５と続く。その次のコードＡ列１２０６とは傷病名マスタ照合手段６により抽出されたマスタ語句すべてのコードを出現順に並べたものであり、傷病名や各種修飾語句を示すコードの羅列である。
続いて文字列構造解析手段８により求められた構造式１２０７、および構造評価１２０８、コメント１２０９も格納されている。
図１２では構造式１２０７の見読性を高めるために、語句属性が「未知」である語句や、「括弧」、「接続」である語句を元の文字列を用いて表示してある。すなわち、番号４の入力文字列「喘息様気管支炎」は、「喘息」および「気管支炎」はマスタ語句として検出されたが、中間の「様」は「未知」である語句であるため、ここでは「傷病[様]傷病」と表示する。また同様に番号７の「扁桃炎（慢性）」の括弧の語句属性は「括弧」であるが、ここでは「傷病（経過）」としている。
図１３は照合結果修正手段１２により表示手段１６に表示された画面の具体例である。
この画面はコード化結果保存テーブル１３に蓄積されたコード化結果とその評価をユーザに提示し、ユーザの手による修正が必要であるか否か判断を促すためのものである。
リスト１３０１はコード化結果保存テーブル１３に蓄積された各入力文字列、およびユーザが修正の必要性を判断するために必要な主な項目の一覧である。
ボタン１３０２およびボタン１３０３はリスト１３０１に表示する内容の切り替えを行うボタンで、全コード化結果を表示するか、構造評価に基づき要目視確認であるコード化結果のみ表示するかを選択できる。このときコード化結果を表示する構造評価はユーザが前もって設定しておく。たとえばユーザによっては構造評価が「不可」、他のユーザにおいては「可」と「不可」というように設定を変えることにより、コード化修正作業の作業量を自由に増減することが可能である。したがってユーザは入力文字列の数、必要なコード化精度、費やせる作業量により要目視確認傷病名の量を自在に制御することが可能である。
テキストボックス１３０４から１３１２は、リスト１３０１で選択された番号に関するコード化結果保存テーブル１３に蓄積されたデータが表示されている。ユーザはこれらの情報をもとにより精細な判断をすることが可能になる。
ボタン１３１３はリストから選択されたこの入力文字列をもう一度自動コード化の一連の工程に流す場合、ボタン１３１４はこのコード化結果を採用する場合、そしてボタン１３１５はこのコード化結果が満足のいかないものであり、後述する近似傷病名検索によってコード化したいとユーザが判断した場合、近似傷病名検索を実行するためのボタンである。ユーザがこのコード化結果を採用すると判断した場合、かならずしもボタン１３１４を押さなくても、ボタン１３１５により近似傷病名検索が行われなければ、採用されたものと判断しても構わない。
このようにユーザは照合結果修正手段１２により表示された画面をもとに修正が必要と思われる入力文字列だけを閲覧できる。またそのコード化結果をよく吟味した結果、修正が必要と判断したのであれば、この画面から後述する近似傷病名検索画面を表示し、そこでさらに適切なコードを検索することができる。
本実施形態における近似文字列検索の例として、自然言語処理によく用いられるＮグラムを用いたあいまい検索手法がある。一般に語句や文章内で隣接する長さＮの文字列をＮグラムと呼び、たとえば文字列「心筋梗塞」の１グラムは「心」、「筋」、「梗」、「塞」であり、２グラムは「心筋」、「筋梗」、「梗塞」である。このように文字列をグラムに分解し、同様にマスタテーブルにもマスタ語句をグラムに分解したものを格納しておけば、両者に存在するグラムの数が多いほど両者の文字列は近似しているということがわかる。当然同一の文字列であれば、すべてのグラムが一致するが、たとえば文字がいくつか欠けていたり、語順が逆になっている場合でも多くのグラムは一致するため、両者はかなり近い文字列であるなどと判断する。
以下に図１４、図１５、図１６を用い、本実施形態における近似傷病名検索手段１０と近似傷病名テーブル１１の詳細な説明を行う。
図１４は本実施形態の近似傷病名テーブル１１の具体例を示す図である。
近似傷病名テーブル１１には傷病名マスタテーブル７に格納されているマスタ語句がグラムに分解された状態で格納されており、傷病名マスタ番号１４０１には傷病名マスタテーブル７において各マスタ語句を管理している番号、そしてグラム１４０２にはそのマスタ語句のグラムが格納されている。
図１４の例ではマスタ語句番号１の「心筋梗塞」と２の「アレルギー性鼻炎」の１グラムと２グラムが格納されている。
Ｎグラムにおいては文字列の長さＮをいくつにするかにより、あいまい検索の精度が変化するが、傷病名の検索においては１と２の両方を用いるのが検索精度とテーブルの容量のバランスからちょうどいい。もちろん３グラム以上を用いてもかまわない。
また近似傷病名テーブル１１にグラムを格納するマスタ語句は、傷病名マスタテーブル７に格納されたすべてのマスタ語句である必要はなく、目的が傷病名の検索であるから語句属性が「傷病」である傷病名、もしくはその中でも代表的な傷病名だけに限り、選択肢を減らして選択しやすいようにすることも可能である。図１５は近似傷病名検索手段１０の詳細な動作を示すフローチャート図である。図１３に示した画面でユーザが選択した入力文字列が近似傷病名検索するよう命じられると、ステップ１５０１はその傷病名文字列の標準化文字列の１および２グラムを生成する。このときグラム分解する文字列は入力文字列そのものでもよいが、標準化文字列を用いたほうが検索精度は向上する。
ステップ１５０２では生成したグラムを近似傷病名テーブル１１に格納された部分文字列１４０２に照合する。一致するグラムがある場合、その傷病名マスタ番号１４０１ごとに一致したグラム数を集計する。
ここでより多くのグラムが一致するほど近似した傷病名といえるが、本実施形態では両者の近似度合いを定量的に示す適合率として、
（一致したグラム数）÷（マスタ語句のグラム数）
を用いる。
グラムの一致の見られたマスタ語句は照合結果修正手段１２に渡され、ステップ１５０４で表示手段１６を用いて適合率の高い順にマスタ語句、コードＡ、コードＢとともにユーザに提示される。
ステップ１５０５でユーザ入力手段１７を用いて、これらのマスタ語句の中からもっとも適切を思われるマスタ語句を選択することにより、ステップ１５０６でそのマスタ語句の各種データが入力文字列の新たなコード化結果としてコード化結果保存テーブル１３に格納される。
図１６はステップ１５０４で照合結果修正手段１２により表示手段１６に表示される画面の具体例である。図１３の画面より近似傷病名検索の実行を命ぜられるとこの図１６の画面が表示されるが、上半分の入力文字列リストは図１３と同じものである。画面下半分が近似傷病名検索のための表示であり、テキストボックスにリストで選択された入力文字列１５０１とその標準化文字列１５０２が表示されている。
グラムを用いたあいまい検索の結果である近似傷病名検索結果は画面右下のリストに表示されており、それぞれグラムの一致が見られたマスタ語句である候補傷病名１５０３、そのコードＡ１５０４、コードＢ１５０５、そして適合率１５０６（パーセント表示）である。
リストに表示されるもののうち、コードＡ１５０４、コードＢ１５０５は必ずしも必須ではないが、他の候補傷病名と見比べることにより、どの候補傷病名が適切であるかより判断が容易になる。
また各候補傷病名は適合率の高い順に上から表示することにより、ユーザは一致度の高い候補傷病名から順に吟味できるため、選択作業が容易になる。
ここで適合率として
（一致したグラム数）÷（マスタ語句のグラム数）
を用い、
（一致したグラム数）÷（標準化文字列のグラム数）
を用いないのは、両者の一致度の順番が異なるからである。
前者ではたとえば「接触性皮膚炎」を検索した場合、「接触性皮膚炎」とともに「皮膚炎」もそれぞれ適合率１としてリストの最上位に表示される。しかし後者を適合率として用いると「接触性皮膚炎」は同様に適合率１としてリスト最上位に表示されるが、「皮膚炎」はおおよそ半分のグラムしか一致してないことになるので、かなり下方に表示されることになる。
一般に傷病名の検索を行う場合、完全に一致する傷病名が見つからないのであれば、その傷病名をもっと広く捉えた傷病名、すなわち「接触性皮膚炎」なら「皮膚炎」、「急性気管支炎」なら「気管支炎」を選択することが自然である。
したがって前者の適合率を用いることにより、その傷病名を広く捉えた傷病名をより適合率が高いと評価でき、ユーザの選択作業に適した優先順位で候補傷病名を提示することができる。
ユーザは近似傷病名検索により、自動コード化処理より適切な傷病名をリストから見つけた場合、その候補傷病名を選択し、「採用」ボタン１５０７を押すことにより、その結果をコード化結果保存テーブル１３に格納できる。また選択しない場合には「構造解析」ボタン１５０８を押すことにより図１３の画面に戻り、自動コード化によるコード化結果を採用することも可能である。
近似傷病名検索による傷病名文字列のコード取得は、従来の技術のようにユーザが幾万とある標準的な傷病名の中からもっとも適切なものを選び出し、そのコードをパンチ入力等で入力するのに対し、はるかに容易でかつ入力ミスによる誤登録を防ぐ効果が高い。
また適切なコードが見つからない場合でも、容易にその傷病名文字列を広義に捉えた傷病名を抽出することができるので、ユーザの知識や経験の違いによるコード化結果のばらつきが少なく、だれでもおおよそ同じようなコード化結果を取得することが可能である。コード化結果の一貫性は、別の疾病動向分析結果と比較するような際に、解釈の相違による結果の違いを防ぐことができ、統計解析上重要である。
図１7は本実施形態における近似傷病名検索後のコード化結果保存テーブル１３の具体例を示す図である。
ここでは図１２の時点で構造評価が「不可」であった入力文字列１７０１の「喘息様気管支炎」、「臭覚障害」、「扁桃炎(慢性)」が近似傷病名検索を行われた結果、より適切なコードに更新されている。また構造評価１７０８が近似傷病名検索による結果であることを示す「近似」に変わり、コメント１７０９にもその旨表示されている。近似傷病名検索ではコードＡ列１７０６、構造式１７０７は生成されないため、表示は空欄である。
近似傷病名検索では様々な傷病名文字列をコード化することができ、たとえば図１７では「喘息様気管支炎」（喘息のような症状の気管支炎）と「喘息性気管支炎」（喘息による気管支炎）にように意味の近い傷病名、「臭覚障害」と「感覚障害」のようにより広義な傷病名、「扁桃炎（慢性）」と「慢性扁桃炎」のように語順違いの表記なども容易にコード化できる。
コード化結果保存テーブル１３に蓄積されたコード化結果は任意にコンピュータ装置1外に出力することができ、ユーザはこの情報を元に集計、統計処理を行い、疾病動向分析などに活用することができる。
以下に図１８、図１９を用い、本実施形態における各種テーブルの保守画面について説明する。
図１８は表記ゆらぎテーブル５の保守画面の具体例を示す図である。
保守画面には表記ゆらぎテーブル５に格納された語句を一覧するリストがあり、少なくともゆらぎ語句１８０１、標準語句１８０２が表示されている。変換回数１８０３はそのゆらぎ語句と標準語句が表記ゆらぎ標準化手段４によって利用された回数の累積である。
ユーザはこのリストを見ながらボタン１８０４により新たなゆらぎ語句と標準語句を追加したり、あるいはボタン１８０５により不要なゆらぎ語句と標準語句を削除、ボタン１８０６により語句の内容を変更したり、あるいはボタン１８０７により編集しないで保守画面を終了することができる。
随時入力される傷病名文字列を効率的にコード化するには、常に表記ゆらぎテーブルの最適な状態に保つことが好ましい。したがって新たに発生した表記ゆらぎによるコード化結果の不良があれば、新たなゆらぎ語句、標準語句を追加したり、あるいは以前登録したゆらぎ語句と標準語句がその後ほとんど利用されていないようであれば、削除したほうが照合負荷が低減し、より高速な変換が可能になる。変換回数１８０３はゆらぎ語句と標準語句の利用状況を把握するための目安として表示されている。
図１９は構造式テーブル９の保守画面の具体例を示す図である。
保守画面には構造式テーブル９に格納された構造式を一覧するリストがあり、少なくとも構造式１９０１、構造評価１９０２が表示されている。また構造評価をユーザに提示する際に、より詳細な情報を付加するためにコメント１９０３も表示されている。
ユーザはこのリストを見ながらボタン１９０４により新たな構造式を追加したり、あるいはボタン１９０５により不要な構造式を削除、ボタン１９０６により構造式や構造評価を変更したり、あるいはボタン１９０７により編集しないで保守画面を終了することができる。
構造式は本実施形態においてコード化結果の評価に用いる指標であり、新たな構造式を追加する場合、あるいは構造評価がユーザの意図しないものであった場合、この保守画面で構造評価の調整を行う必要がある。
以下に図２０、図２１を用いて、本発明のコード化方法における第二の実施形態であるコード化システムについて説明する。
前述した本発明の実施形態では、マスタ照合による自動コード化の結果が適切でないとユーザが判断した場合、そのつどユーザは近似傷病名検索を行い、適切なコードに変換する必要がある。この場合、同じ傷病名文字列が入力された場合、また同じ処理をユーザに負担させることになる。
図２０に示す本実施形態の構成図では、図１の前述した実施形態に加えて変換歴記録手段１８、変換歴検索手段１９、変換歴テーブル２０を設けている。
自動コード化されてかつ、ユーザによって採用と判断された結果、もしくはユーザによって近似傷病名検索された結果、コード化結果保存テーブル１３に格納された入力傷病名文字列、標準化文字列、コードＡ、コードＢ、コードＡ列、構造式、構造評価、コメントは変換歴記録手段１８によって変換歴テーブルに格納される。
本実施形態において傷病名文字列が入力された場合、制御手段１４はまず変換歴検索手段１９にその傷病名文字列を渡し、変換歴検索手段１９は変換歴テーブルに同一の入力文字列を検索する。
同一の入力文字列が検出された場合、変換歴検索手段１９はそのテーブル内の情報を制御手段１４に返し、制御手段１４は照合結果修正手段１２を介し、コード化結果保存テーブル１３に記録する。
本実施形態によれば、一度ユーザにより採用されたうえでコード化結果保存テーブル１３に格納されたコード化結果は変換歴テーブル２０に格納され、次回同じ傷病名文字列が入力された場合は、変換歴テーブル２０からコード化結果保存テーブル１３に登録されるため、途中の様々な処理を省略することが可能になる。したがって特に大量の傷病名文字列が入力され、その中に同一の傷病名文字列が多く含まれている場合、著しい処理速度の向上が可能になる。
図２１は本実施形態における照合結果修正手段１２により表示手段１６に表示される画面の具体例である。
前述した実施形態の図１６と同様の近似傷病名検索結果の表示画面であるが、図２１では加えてボタン２１０１が設けられ、ユーザはこの近似傷病名検索結果を変換歴テーブル２０に格納するか否かを選択することが可能である。したがってユーザは任意のコード化結果のみを次回も同様なコード化結果を得るように調整することが可能である。
以下に図２２、図２３を用いて、本発明のコード化方法における第三の実施形態であるコード化システムについて説明する。
図２２は本実施形態の構成図で、第一の実施例の構成図である図１の入力である傷病名文字列データ２に対し、本実施例ではコード化処理を行うコンピュータ装置１は診療情報データベース１８に接続され、ここから入力情報を得ている。診療情報データベース２１とは電子カルテとかレセプト処理システムなどの診療情報を管理するデータベースで、傷病名以外にも患者情報、診療行為情報、会計情報など、さまざまな情報が蓄積されている。
本実施形態ではコード化を行う際に傷病名文字列以外にもさまざまな診療情報が得られるため、これらの情報を目視確認、ユーザによるコード修正の際に参照することが可能になる。
図２３は本実施形態における表示手段１８に表示される画面の具体例である。表示手段１７には第一の実施形態で図１３、図１５を用いて説明した照合結果表示画面２３０１が表示され、加えて診療情報データベース２１より取得された診療情報表示画面２３０２も表示されている。診療情報表示画面２３０２は文字情報であったり、画像情報でもかまわない。
ユーザは照合結果表示画面２３０１を用いてコード化結果を確認し、必要に応じて修正を加える際、診療情報表示画面２３０２を参照することにより、傷病名文字列が診断結果として採用された経緯を知ることができ、より精度良くコードの修正をすることが可能になる。
本発明の要点をまとめると以下のようになる。
入力した文字列をマスタ照合によりコード化する際、あらかじめ文字列の中の表記ゆらぎのある語句を標準的な語句に変換する。
さらに、入力した文字列の中の表記ゆらぎのある語句を標準的な語句に変換するために、表記ゆらぎの語句と標準的な語句を対にして記憶領域に格納する。
さらに、記憶領域に格納された表記ゆらぎのある語句と標準的な語句の対を編集する表示画面を有する。
入力した文字列をマスタ照合によりコード化する際、文字列に含まれているひとつ、あるいは複数のマスタに登録された語句を検出し、その語句属性の配列を生成し、配列ごとにコード化可能か否か評価する。
さらに、語句とコードおよび属性を記憶領域に格納する。
さらに、語句属性とコード化可能か否かを示す構造評価を記憶領域に格納する。さらに、記憶領域に格納された語句属性の配列と構造評価を編集する表示画面を有する。
さらに、入力された文字列とコード化した結果を表示する際、評価の結果、修正する必要のある入力文字列の分だけ表示できる。
さらに、評価の結果、修正の必要があればこの画面からあいまい検索を行う画面を表示する。
入力された文字列とコード化した結果を記憶領域に格納し、次回同一の文字列が入力された際には、そのコード化結果を出力する。
さらに、入力された文字列とコード化した結果を記憶領域に格納するか否かを選択できる表示画面を有する。
入力された文字列とコード化した結果を表示する際、入力された文字列の元になる帳票のデータもしくは画像を合わせて表示する。
【０００６】
【発明の効果】
以上のように本発明のコード化方法は、傷病名をコード化する際に、従来技術のように様々な傷病名の表記ゆらぎを含む傷病名をすべてユーザ辞書に登録するといった手間がかかり、登録ミスの起こりやすい作業を行わなくてすむようになるため、作業効率およびコード出力の信頼性が大幅に向上するという著しい効果がある。
また本発明によれば、従来技術のように傷病名マスタ照合時に未検出であった傷病名はすべて人の目視により確認し、必要に応じてユーザ辞書に追加する必要があったのに対し、自動的にコード化結果を評価してユーザに提示することにより、ユーザのニーズや用途に応じて最小限の目視確認、修正作業が可能になるという効果がある。
また本発明によれば、従来技術のように傷病名マスタ照合時に未検出であった傷病名をコード化する際に、パンチ入力等で直接コードを入力しなくてはならなかったのに対し、ユーザはもっとも適切な近似傷病名を選択することにより、そのコードが記録されるため、作業効率およびコード出力の信頼性が大幅に向上するという著しい効果がある。
【図面の簡単な説明】
【図１】本発明による一実施形態を示す基本構成図。
【図２】本発明による一実施形態を示す処理流れ図。
【図３】表記ゆらぎテーブルの一例を示す図。
【図４】表記ゆらぎ標準化手段のフローチャート図。
【図５】傷病名マスタテーブルの一例を示す図。
【図６】語句属性の種類を示す説明図。
【図７】傷病名マスタ照合手段のフローチャート図。
【図８】複数のマスタ語句が検出されたときの選択ルールを示す説明図。
【図９】照合テーブルの挙動を示す説明図。
【図１０】構造式テーブルの一例を示す図。
【図１１】文字列構造解析手段のフローチャート図。
【図１２】自動処理後のコード化結果保存テーブルの一例を示す図。
【図１３】自動処理後の照合結果修正手段におけるコード化結果表示画面の一例を示す図。
【図１４】近似傷病名テーブル一例を示す図。
【図１５】近似傷病名検索手段のフローチャート図。
【図１６】近似傷病名検索時の照合結果修正手段におけるコード化結果表示画面の一例を示す図。
【図１７】近似傷病名検索後のコード化結果保存テーブルの一例を示す図。
【図１８】表記ゆらぎテーブルの保守画面例の一例を示す図。
【図１９】構造式テーブルの保守画面例の一例を示す図。
【図２０】本発明による第二の実施形態を示す基本構成図。
【図２１】本発明による第二の実施形態における近似傷病名検索後の照合結果修正手段におけるコード化結果表示画面の一例を示す図。
【図２２】本発明による第三の実施形態を示す基本構成図。
【図２３】本発明による第三の実施形態における表示画面の一例を示す図。
【符号の説明】
１…コンピュータ装置
２…傷病名文字列データ
３…傷病名コード化結果データ
４…表記ゆらぎ標準化手段
５…表記ゆらぎテーブル
６…傷病名マスタ照合手段
７…傷病名マスタテーブル
８…文字列構造解析手段
９…構造式テーブル
１０…近似傷病名検索手段
１１…近似傷病名テーブル
１２…照合結果修正手段
１３…コード化結果保存テーブル
１４…制御手段
１５…照合テーブル
１６…表示手段
１７…入力手段
１８…変換歴記録手段
１９…変換歴検索手段
２０…変換歴テーブル
２１…診療情報データベース。[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a coding processing technology that facilitates coding of words with fluctuations in notation, in particular, an electronic medical record system for managing the names of wounds and diseases for medical institutions to analyze disease trends, etc., or insurers and examinations The present invention relates to a receipt processing system in which a payment institution encodes an injury / illness name described in a medical fee description and performs statistical processing.
[0002]
[Prior art]
When a doctor examines a patient, he or she records the name of the patient's injury based on the medical condition and condition. The notation method of injury and illness names is left to the discretion of individual doctors, and in addition to the standard injury and illness names, there are various injury and illness names to express the same medical condition and pathology, such as different idiomatic injury and illness names and kana names. appear. In addition, various modifiers that more accurately describe the patient's condition and medical condition are added, so the notation of wound names is more diverse, and these notation fluctuations are a major obstacle from the viewpoint of code management of wound names. ing.
For example, an insurer such as a health insurance association aggregates the names of injuries and sicknesses listed in the injuries and sickness names column of the medical treatment remuneration statement (hereinafter referred to as “receipt”) issued by each medical institution to effectively develop health services. Statistical processing of disease trends is performed for the purpose.
In order to analyze the disease trend, it is necessary to encode and summarize the names of wounds, but conventionally this work is done by manually converting the names of wounds and diseases described in the receipt into codes corresponding to experienced workers, I was counting. However, some insurers receive hundreds of thousands to millions of receipts per month, and it is desired that the monthly summation is heavy and burdensome in terms of time and cost.
In order to realize the mechanization of reading the name of sickness by the conventional technique, for example, the character string of the name of the sickness described in the receipt is input by an OCR device (character reading device) or punch input, and the recording medium storing the name of the sickness and the disease is stored. There is a method of coding a corresponding disease name by collating with a certain disease name master. However, since the injury / illness name character string described in the receipt has the above-described fluctuation, it is not always easy to reliably obtain a code even if it is checked against the injury / illness name master. For this reason, it is necessary to provide a user dictionary separately from the disease name master, and to register the disease name character strings described in the receipt that have not been detected together with the corresponding codes.
[0003]
[Problems to be solved by the invention]
In order to automatically code many names of injuries and diseases according to the conventional technique, it is necessary to register more notation fluctuation names in the user dictionary.
For this reason, it has been necessary to visually check the names of wounds and diseases that could not be coded at the time of collation of the names of the disease and disease by the conventional technique, and add them to the user dictionary as necessary.
In the registration work in the user dictionary, the wound name on the wound name master that seems to be the closest to the wound name that could not be coded is searched for and associated with the code. This operation is time-consuming and may cause variations and registration errors depending on the registered user.
In addition, when changing the code system used for code management, it is necessary to re-associate the codes with all the names of wounds in the user dictionary by using a new code system.
In this way, coding techniques using user dictionaries impose a great deal of effort on the part of the user, so code management using only the standard injury name master as much as possible improves the work efficiency as well as variations and registration errors. It can be said that it is safe from the viewpoint of prevention.
Accordingly, an object of the present invention is to provide a coding method that can easily collate more notational fluctuations with a disease name master with less burden.
Another object of the present invention is to provide an environment for evaluating the wound name and the degree of matching that are considered appropriate and presenting them to the user when there is no perfectly matched wound name as a result of the wound name master comparison. It is to provide.
Another object of the present invention is to present a plurality of wound names on the wound name master that is considered to be approximate when the user associates with a code of another wound name as a result of the evaluation, so that the user can obtain the optimal wound name. It is to provide a means for selecting.
[0004]
[Means for Solving the Problems]
In order to achieve the above object, the method of encoding a disease name according to the present invention provides:
Before collating the entered wound name with the wound name master, the notation fluctuation part on the wound name is converted into a standard notation so that the master wound name is easily detected.
In addition, in order to achieve the above object, the method of encoding a disease name according to the present invention,
When multiple wound names and modifiers are included in the input disease name string, the structure of them is analyzed and included in the wound name string based on the result. It is shown to the user whether or not encoding is possible using the name of the wound or illness.
In addition, in order to achieve the above object, the method of encoding a disease name according to the present invention,
When the user determines that another code should be associated with the input character string of the disease name, the user detects a plurality of approximate disease names of the input disease name and presents them to the user. Lets you select a code.
[0005]
DETAILED DESCRIPTION OF THE INVENTION
The encoding system which is the first embodiment of the encoding method of the present invention will be described below.
FIG. 1 is a basic configuration diagram of this embodiment. A computer apparatus 1 stores a program for executing the encoding method of the present invention on a recording medium. The disease name character string data 2 input through an OCR device, punch input, an external recording medium such as a flexible disk, a network connection with another computer device, etc., not shown in the figure, is encoded in the computer device 1 As a result, it is output as injury / illness name encoding result data 3. The output of the wound name encoding result data 3 may be recorded on an external recording medium such as a flexible disk, output to another computer device via a network connection, or printed on a form by a printer.
The computer apparatus 1 includes a notation fluctuation standardization means 4, wound name master matching means 6, character string structure analysis means 8, approximate wound name search means 10, matching result correction means 12, and control means 14 as processing programs. In addition, the recording area includes a notation fluctuation table 5, a disease name master table 7, a structural formula table 9, an approximate disease name table 11, an encoded result storage table 13, and a collation table 15, and the operator manages and corrects the processing results. As a peripheral device, a display unit 16 such as a display and an input unit 17 such as a keyboard and a mouse are provided.
The general operation of this embodiment will be described below with reference to FIG.
First, the input character string, which is the disease name character string data 2 acquired from the outside, is collated with the notation fluctuation table 5 by the notation fluctuation standardization means 4. The notation fluctuation table 5 stores in advance fluctuation phrases of idiomatic expressions and kana words and standard phrases that are standard notations. The fluctuation standardization means 4 stores fluctuation phrases in the input character string from the notation fluctuation table 5. Detect and convert to the corresponding standard phrase.
In the example of FIG. 2, the input character string “right forearm contact dermatitis” is input as the disease name character string data 2. The phrase “skin” is sometimes written in the notation “skinfu”, and “skinfu” is registered in advance in the notation fluctuation table 5 as “skinfu” and “skin” as its standard phrase. It is. Therefore, the input character string “right forearm contact dermatitis” is converted into a standardized character string “right forearm contact dermatitis”. By using the notation fluctuation standardization means 4 and the notation fluctuation table 5 in this way, it is possible to convert idiomatic expressions contained in the input character string and notation fluctuations due to variations in kana usage into standard notations. it can. Therefore, an input character string with fluctuation in notation can be encoded as a standard character string. When the same effect is realized without using the fluctuation standardization means 4 and the notation fluctuation table 5 as in the prior art, the user dictionary includes the phrase “skin” such as “dermatitis” and “skin laceration”. It is necessary to register the words “skin fever” and “skin fracture”, which are rewritten as “skin fu”, with the code. There are more than 400 standard names of injuries with “skin” alone, and the user dictionary will become huge if all the notation fluctuations are registered. For this reason, the burden of registration work in the user dictionary and the verification time are remarkably increased, and the risk of coding conversion due to an input error at the time of registration is also increased.
Therefore, as in the coding method according to the present embodiment, it is more efficient to register not only the entire wound name in the user dictionary, but also to check the standard wound name master after converting only the notation fluctuation part. The risk of mistakes is reduced.
Next, the generated standardized character string is collated with the wound name master collating means 6 by the wound name master collating means 6. In the wound name master table 7, standard wound names and modifiers added to the wound names are registered as master words, and codes and phrase attributes corresponding to the respective master words are stored. The wound name master collating means 6 detects all the master words / phrases included in the standardized character string, and extracts codes and word attributes of the words / phrases.
The phrase attribute here is a parameter indicating the type of master phrase, and is used to classify whether the name is a wound name or a modifier, or what type of modifier is a modifier. In the example of FIG. 2, as a result of collating the standardized character string “right forearm contact dermatitis” with the wound name master table 7, master words “right”, “forearm”, “contact dermatitis” are detected. The attributes are “position” which is a modifier indicating position, “part” which is a modifier indicating a part of the body, and “wound and disease” indicating the name of the wound.
From the result of the master collation, it is possible to determine that the input character string of injury / illness name is the injury / illness name “contact dermatitis” including a modifier for the position and part, and the code is “6929196”. However, in general, when an input character string includes a wound name and a modifier, it cannot be determined that the wound name is a wound name corresponding to the input character string. For example, the input character string may be described as “dermatitis (contact)”. In this case, as a result of collation with the wound name master, the wound name “dermatitis” and “pathogenesis” “contact” It is detected as a modifier that indicates that it is impossible to detect the name of a consecutive “contact dermatitis”. Therefore, the code is “6869043” which means “dermatitis”. As described above, even if the disease name included in the disease name character string is encoded as it is, an appropriate encoding result may not always be obtained.
Therefore, in this embodiment, whether or not the coding result is valid by using the concept of the structural formula indicating how the master words such as the names of sicknesses and modifiers included in the input character string are arranged. To evaluate.
The character string structure analyzing means 8 and the structural formula table 9 are used for the coding evaluation by the structural formula. The character string structure analyzing means 8 pays attention to the word attribute among the collation results generated by the disease name master collating means 6, arranges the word attributes corresponding to the order of the master words detected in the disease name character string, Treat as. The character string structure analyzing means 8 collates this structural formula against the structural formula table 9, and when the same structural formula is detected, the code of the wound name included in the input wound name character string from the corresponding structural evaluation. Judge whether or not can be adopted. If the same structural formula is not found, it is determined that it cannot be adopted.
In the example of FIG. 2, in the case of “right forearm contact dermatitis”, the structural formula is “positional site injury”. This structural formula is registered as “good” in the structural formula table 9 as the structural evaluation, and therefore, it is determined that the code of the wound name “contact dermatitis” included in the wound name string can be adopted. To do.
Since there are an infinite number of combinations of injury and disease name strings composed of injury and disease names and modifiers, it is difficult to evaluate whether or not each character string is properly coded. is there. However, if the disease name character strings are replaced with structural formulas as in this embodiment, most of them can converge to about several hundred types, so if you evaluate whether it can be adopted for each structural formula, most of the sick name characters It can be determined whether the column can be automatically encoded. Therefore, even when there is no wound name whose wound name matches the wound name master, the user can narrow down the visual confirmation to the minimum.
The evaluation method for each structural formula is as follows. In general, even if a wound name and character name string includes modifiers indicating the wound name and the “position” or “part” of the body, it is considered that these modifiers have little influence on the wound name. For example, even if “right forearm contact dermatitis” is “left lower leg contact dermatitis”, it is still “contact dermatitis”. However, if the included modifiers indicate "etiology" or "condition", these wound names may be expressed in another standard wound name, so they are included in the wound name string. It is not always possible to adopt the names of wounds and diseases. Also, if multiple wound names are detected in the wound name string, it is difficult to determine whether it is just an enumeration of wound names or whether two wound names represent a single meaning. Requires visual confirmation.
Presentation of the coding result evaluation by the structural formula and visual confirmation by the user are performed by the collation result correcting means 12. The collation result correcting means 12 presents the input character string, standardized character string, wound name extracted as a master word / phrase, its code, and structure evaluation obtained by the processing so far to the user. The user determines whether or not the encoding result can be adopted based on these pieces of information, and makes corrections as necessary. These encoding results are stored in the encoding result storage table 13 and output as necessary.
For the correction of the encoded result, the approximate wound name search means 10 and the approximate wound name table 11 are used. The approximate wound name search means 10 performs a fuzzy search in the approximate wound name table 11 for the standardized character string already generated by the notation fluctuation standardization means 4, and extracts a plurality of wound names and codes that approximate the standardized character string. To do. This fuzzy search result is again presented to the user by the collation result modification 12, and the user selects the name of the disease / sickness that seems to be most appropriate from the presented names of the disease / sickness, and the code is encoded. Stored in the save table 13.
According to the prior art, this approximate disease name search operation is performed by the user, and the user searches for an appropriate disease name based on the correspondence table between the disease name and the code, and encodes it by inputting the code. It was. However, this method places a heavy burden on the user's search work, and even if there is a more appropriate name for the disease, the code for another disease name is registered, or an incorrect code is registered due to an input error when entering the code. There was a risk of doing so. In this embodiment, structural evaluation is performed in advance, the minimum necessary input character string that requires visual confirmation is extracted, and the change of the code by the user is performed by selecting the most appropriate wound name from a plurality of approximate wound names. Therefore, it is possible to significantly reduce the burden of coding work by the user and to suppress coding errors due to input mistakes to the maximum.
As described above, according to the present embodiment, it is possible to provide an encoding system that can accurately encode an input character string with fluctuation in notation with less user work load.
Hereinafter, the notation fluctuation standardization means 4 and the notation fluctuation table 5 of this embodiment will be described in detail with reference to FIGS.
FIG. 3 is a diagram showing a specific example of the notation fluctuation table 5 of the present embodiment. The notation fluctuation table 5 stores at least a fluctuation word / phrase 301 that is a variation of conventional notation and kana usage, and a standard word / phrase 302 that is a standard notation corresponding thereto.
The purpose of the notation fluctuation table 5 is to converge the notation fluctuations of various wound names described at the discretion of the doctor to the notation of the master word / phrase stored in the wound name master table 7. Thus, fluctuations in phrases often include variations in notation commonly used in sickness names, such as idiomatic notation (“skin” and “skin-fu”), hiragana / katakana / kanji notation (“biran”, “bilan”, “糜爛” )), Differences in kanji (“neck” and “neck”), pronunciation differences (“virus” and “virus”), presence / absence of long sound symbols (“Qatar” and “catar”), etc. 7 together with a standard word / phrase which is a standard notation registered in 7.
FIG. 4 is a flowchart showing the detailed operation of the notation fluctuation standardization means 4.
First, in step 401, the input character string fetched as the disease name character string data 2 by the control means 14 is fetched into the notation fluctuation standardization means 4.
Next, in step 402, the fluctuation word / phrase 301 in the notation fluctuation table 5 is referenced to search for a word / phrase corresponding to a part of the input character string.
If a corresponding fluctuation word is found, the fluctuation word part is replaced with a word stored in the standard word / phrase 302 of the notation fluctuation table 5 in step 403. When all the fluctuation phrases are converted into standard phrases in steps 402 and 403, the character strings are returned to the control means 14 as standardized character strings in step 404.
The conversion is not necessarily part of the input character string, and the entire input character string is converted into a standard phrase as a fluctuation phrase, or a fluctuation phrase is not detected in the input string and is controlled as a standardized phrase as it is It may be returned to the means 14.
As described above, the notation fluctuation standardization means 4 and the notation fluctuation table 5 standardize the notation fluctuation included in the input character string, and more collation name character strings are used when collating with the wound name master table 6 as the next step. Can be coded.
Compared with the method of storing the disease name character string including notation fluctuation in the user dictionary and the corresponding code as in the conventional technique, in this embodiment, only the notation fluctuation is standardized on the character string, so in the next step It does not depend on the wound name master table 7 to be used.
In other words, when changing the code system used in the disease name master table 7, the user dictionary must change the codes of all registered disease name character strings based on the new code system. It is not dependent on the coding system. Therefore, it is possible to avoid a huge work load that rewrites all the codes in the user dictionary.
Hereinafter, the wound name master collating unit 6 and the wound name master table 7 of the present embodiment will be described in detail with reference to FIGS. 5, 6, 7, 8, and 9.
FIG. 5 is a diagram showing a specific example of the wound name master table 7 in the present embodiment. The wound name / master name table 7 stores at least a master word / phrase 501 such as a wound / illness name and a modifier, a code A 502 and a code B 503 corresponding to the code, and a word / phrase attribute 504 indicating the type of the master word / phrase. There are not necessarily two types of codes to be stored, and one type or three or more types may be stored.
Commonly used codes are the disease name master and modifier master of the receipt computer processing system whose main purpose is to encode the names of wounds and diseases described in the receipt, and many disease names and modifiers are coded. ing. If too many types of codes are used, such as when analyzing disease trends based on the names of wounds and sicknesses, the classification is too fine and 119 items are coded with a large number of wound names. The disease classification (medium classification code) for social insurance charts is often used. As described above, it is preferable that the user can select the code depending on the purpose of use. If a plurality of code systems are stored in the sickness name master table 7, an appropriate code system can be selected according to the user's purpose of use. .
FIG. 6 is a table showing an example of the phrase attribute 504 stored in the wound name master table 7. In the present embodiment, phrase attributes are associated with all master phrases stored in the wound name master table 7. However, since all words that can be freely described as a disease name character string cannot be registered as a master word, a character string that is not registered as a master word is treated as having an unknown word attribute.
FIG. 7 is a flowchart showing the detailed operation of the wound name master collating means 6.
First, in step 701, the standardized character string is input as an input character string by the control means 14 to the wound name master matching means 6.
In step 702, the master word / phrase 501 in the injury / patient name master table 7 is referenced to search for a corresponding word / phrase as a part of the input character string.
If a corresponding word is detected in step 703, the master word 501, code A 502, code B 503, and word attribute 504 are stored in the matching table 15 in step 704. When all corresponding master phrases are stored in the collation table 15, only necessary master phrases are selected in step 705, and the remaining master phrases are deleted from the collation table 15. This is because, for example, there are three types of master phrases that are detected in the phrase “forearm”: “fore”, “arm”, and “forearm”, but only “forearm” is required, and other master phrases are unnecessary. .
In step 706, the master phrase finally selected is compared with the input character string that is the basis thereof, and consecutive partial character strings in which the master phrase is not detected in the input character string are collated as the phrase attribute “unknown”. Store in table 15. At this time, nothing is stored in the codes A and B.
FIG. 8 is a table for explaining rules for selecting only a master word / phrase necessary in step 705 of FIG. When a plurality of master phrases are detected by the master collation, it is checked which one of the master phrases A and B should be selected from the positions and the number of characters of the detected master phrases A and B in the input character string. .
If the relationship between the master word / phrase A and the master word / phrase B is included in the other word / phrase as in 1) to 3) in FIG. 8, the longer master word / phrase is selected. For example, “forearm” is selected for “forearm” and “front”.
If the master phrase A and the master phrase B do not overlap at all in the input character string as shown in 4) of FIG. 8, both are independent master phrases and are selected.
As shown in 6) to 8) of FIG. 8, when the master phrase A and the master phrase B are shifted in the front and back and are arranged in the input character string, the master phrase A or the master phrase B having the larger number of characters is selected. However, only when the number of characters is the same, the one shifted backward as in 6) is selected.
FIG. 9 is a diagram illustrating the state of the collation table 15 before and after the necessary master word / phrase is selected in step 705 of FIG. 7 and the unknown character string is registered in step 706.
In FIG. 9, as an example, it is assumed that a wound name character string “possibility of right forearm contact dermatitis” is input to the wound name master verification unit 6. As a result of the master collation, five master words / phrases “right”, “front”, “forearm”, “arm”, and “contact dermatitis” are detected as master words from the wound name master table. As shown in the table above 9, the collation table 15 stores the respective master words / phrases, code A, code B, and word / phrase attributes.
After that, in step 705 of FIG. 7, only the necessary master word / phrase is selected according to the selection rule shown in FIG. 8, so that only “right”, “forearm”, and “contact dermatitis” remain. Next, as a result of the comparison with the input character string in step 706, it can be seen that the character string “possibility” was not detected as the master word / phrase. As a result of registration in the collation table 15 as a master phrase, the result is as shown in the lower table of FIG.
As described above, the code of each master word / phrase included in the input character string of injury / illness name is acquired by the injury / patient name master collating means 6 and the injury / illness name master table 7. In addition, by acquiring the phrase attribute, the character string structure analyzing means 8 which is the next step can perform the structure analysis, and the encoding result can be evaluated.
Hereinafter, the character string structure analyzing means 8 and the structural formula table 9 of this embodiment will be described in detail with reference to FIGS.
FIG. 10 is a diagram showing a specific example of the structural formula table 9 in the present embodiment. The structural formula table 9 stores at least a structural formula 1001 that is a combination of phrase attributes and a structural evaluation 1002 that indicates whether or not a wound name string having the structural formula can be adopted by the detected wound name code. Has been. In addition, if a comment 1003 which is an explanation of why the structural formula is accepted or not is stored, the comment 1003 is also presented to the user at the time of visual confirmation by the user, and a supplement is made to promote an accurate determination. Can explain.
The structure evaluation 1002 does not necessarily have to be a binary value indicating whether or not it is accepted, and as shown in FIG. Thus, the user refers to the structural evaluation 1002 at the time of visual confirmation, and when it is desired to strictly code, all other than “excellent” is visually confirmed, and when strictness is not required, only “impossible” is visually confirmed. It is possible to use different coding accuracy depending on the situation.
FIG. 11 is a flowchart showing the detailed operation of the character string structure analyzing means 8.
First, in step 1101, the master words / phrases detected in the disease name string in the collation table 15 are rearranged in the order of appearance in the disease name string, and the word / phrase attributes corresponding to each master word / phrase are rearranged. Arrange them in order and connect them to get the structural formula. For example, the structural formula of “left forearm contact dermatitis” is “positional site injury”.
In step 1102, the character string structure analyzing means 8 refers to the structural formula 1001 in the structural formula table 9 and searches for the same structural formula as the structural formula obtained in step 1101.
If the same structural formula is detected in the structural formula table 9 in step 1103, the structural evaluation 1002 and the comment 1003 of the corresponding structural formula are returned to the control means 14 in step 1104.
If the same structural formula is not detected in step 1103, the structural evaluation is returned to the control means 14 as “impossible” and the comment “structural formula is not registered”.
As described above, by using the structural formula that is an array of the phrase attributes of the master phrase included in the input disease name character string by the character string structure analyzing unit 8 and the structural formula table 9, the disease name included in the disease name character string. It is possible to evaluate the structure of whether or not the above code can be used as the code of the disease name character string. Therefore, even if the disease name string does not completely match the master phrase, the user can determine whether it is necessary to visually check the disease name string and the encoding result, based on the structure evaluation, Visual confirmation work can be reduced to the minimum necessary.
It is possible to automatically perform a series of processes for standardizing the fluctuation of the notation of the character string of the disease name input as described above, obtaining a code by master verification, and evaluating a coded result by structure evaluation. Therefore, even if a large number of disease name strings are entered, the user browses the display screen after the automatic processing of all the disease name strings, and visually checks only the disease name character strings that cannot be automatically encoded by the structure evaluation, Other codes can be rewritten as needed.
The collation result correcting unit 12 and the encoded result storage table 13 of this embodiment will be described in detail below with reference to FIGS.
FIG. 12 is a diagram showing a specific example of the encoding result storage table 13 in the present embodiment.
The disease name character strings inputted one after another are coded through each process and accumulated in the coded result storage table 13. The items stored in the encoding result storage table 13 are an input character string 1201 that is an inputted character string as it is, a standardized character string 1202 that is standardized by the notation fluctuation standardization means 4, and a disease name master collating means 6 Among the extracted master phrases, the adopted wound name 1203 whose phrase attribute is “wound and sickness”, the code A1204 which is a code of the adopted wound and sick name 1203, and the code B1205 which is a code according to another code system of the adopted wound and sickness name are continued. The next code A column 1206 is an enumeration of codes indicating the disease name and various modifiers arranged in the order of appearance of the codes of all the master words extracted by the disease name master collating means 6.
Subsequently, the structural formula 1207 obtained by the character string structure analyzing means 8, the structural evaluation 1208, and the comment 1209 are also stored.
In FIG. 12, in order to improve the legibility of the structural formula 1207, a phrase with a phrase attribute “unknown”, a phrase with “parentheses”, and “connection” is displayed using the original character string. That is, in the input character string “asthma-like bronchitis” of number 4, “asthma” and “bronchitis” are detected as master words, but the middle “like” is a word that is “unknown”. Then, “Wound and sickness [like] injury and sickness” is indicated. Similarly, the parenthesis of the parenthesis of “tonsillitis (chronic)” of the number 7 is “parenthesis”, but here is “injury (progress)”.
FIG. 13 is a specific example of a screen displayed on the display unit 16 by the collation result correcting unit 12.
This screen is for presenting the encoded result stored in the encoded result storage table 13 and its evaluation to the user and prompting the user to determine whether or not correction by the user is necessary.
A list 1301 is a list of input character strings accumulated in the encoding result storage table 13 and main items necessary for the user to determine the necessity of correction.
A button 1302 and a button 1303 are used to switch the content displayed in the list 1301 and can select whether to display all the coding results or only the coding results that are visually confirmed based on the structural evaluation. At this time, the structure evaluation for displaying the coding result is set in advance by the user. For example, it is possible to freely increase / decrease the amount of coding correction work by changing the setting so that structural evaluation is “impossible” for some users and “possible” and “impossible” for other users. Therefore, the user can freely control the amount of the name of the visually confirmed disease and sickness according to the number of input character strings, the required encoding accuracy, and the amount of work that can be spent.
Text boxes 1304 to 1312 display data accumulated in the encoding result storage table 13 relating to the numbers selected in the list 1301. The user can make a fine determination based on the information.
Button 1313 causes this input string selected from the list to flow again through a series of steps of automatic encoding, button 1314 adopts this encoding result, and button 1315 does not satisfy this encoding result. This is a button for executing an approximate disease name search when the user decides to code by an approximate disease name search described later. When it is determined that the user adopts this encoded result, it is possible to determine that the coded result is adopted as long as the button 1315 does not perform an approximate wound name search without pressing the button 1314.
In this way, the user can browse only the input character string that seems to be corrected based on the screen displayed by the collation result correcting means 12. Moreover, if it is determined that correction is necessary as a result of careful examination of the coding result, an approximate wound name search screen described later is displayed from this screen, and a more appropriate code can be searched there.
As an example of the approximate character string search in the present embodiment, there is an ambiguous search method using N-grams often used in natural language processing. In general, a character string of length N adjacent in a phrase or sentence is called an N-gram. For example, one gram of the character string “myocardial infarction” is “heart”, “muscle”, “infarction”, and “block”. Gram is “myocardium”, “muscle infarction”, “infarct”. In this way, if the character string is decomposed into grams and the master table is also stored with the master word / phrase decomposed into grams, the larger the number of grams in both, the closer the character strings of both. You can see that Of course, all the grams match if they are the same string. However, even if some letters are missing or the word order is reversed, many grams match. Judge that there is.
The approximate wound name search means 10 and the approximate wound name table 11 in this embodiment will be described in detail below with reference to FIGS.
FIG. 14 is a diagram showing a specific example of the approximate disease name table 11 of the present embodiment.
The approximate word / phrase name table 11 stores master words / phrases stored in the disease / disease name master table 7 in a state of being decomposed into grams, and each word / phrase name master number 1401 manages each master word / phrase in the disease / disease name master table 7. And the gram 1402 stores the gram of the master phrase.
In the example of FIG. 14, 1 gram and 2 gram of “myocardial infarction” of master word / phrase number 1 and “allergic rhinitis” of 2 are stored.
In N-grams, the accuracy of the fuzzy search varies depending on the length N of the character string, but in searching for sickness names, both 1 and 2 are used because of the balance between the search accuracy and the capacity of the table. Just right. Of course, you can use more than 3 grams.
The master phrase for storing the gram in the approximate disease name table 11 does not have to be all the master words stored in the disease name master table 7, and the purpose of the search is to search for the disease name. It is also possible to reduce the number of options and make it easy to select only a certain wound name or a typical wound name. FIG. 15 is a flowchart showing the detailed operation of the approximate disease name search means 10. When the input character string selected by the user on the screen shown in FIG. 13 is instructed to search for an approximate disease name, step 1501 generates 1 and 2 grams of standardized character strings of the disease name character string. At this time, the character string to be gram-decomposed may be the input character string itself, but the search accuracy is improved by using the standardized character string.
In step 1502, the generated gram is collated with a partial character string 1402 stored in the approximate disease name table 11. If there is a matching gram, the number of matched grams is counted for each wound name master number 1401.
Here, it can be said that the more grams match, it can be said that the name of the wound is approximate, but in this embodiment, as the matching rate that quantitatively indicates the degree of approximation of both,
(Number of matched grams) ÷ (number of grams of master phrase)
Is used.
The master word / phrase in which the gram match is found is passed to the collation result correcting means 12, and is displayed to the user together with the master word / phrase, code A, and code B in the descending order of precision using the display means 16 in step 1504.
In step 1505, the user input means 17 is used to select the most appropriate master phrase from these master phrases, and in step 1506, various data of the master phrase is converted into a new encoding result of the input character string. Is stored in the encoding result storage table 13 as follows.
FIG. 16 is a specific example of a screen displayed on the display unit 16 by the collation result correcting unit 12 in step 1504. When the execution of the approximate disease name search is instructed from the screen of FIG. 13, the screen of FIG. 16 is displayed, but the upper half of the input character string list is the same as FIG. The lower half of the screen is a display for searching for an approximate disease name, and an input character string 1501 selected from the list and a standardized character string 1502 are displayed in a text box.
The approximate wound name search result that is the result of the fuzzy search using the gram is displayed in the list on the lower right of the screen, and the candidate wound name 1503 that is the master word / phrase in which the gram match is found, its code A 1504 and code B 1505 , And the precision 1506 (expressed as a percentage).
Of the items displayed in the list, code A 1504 and code B 1505 are not necessarily required, but by comparing with other candidate disease names, it is easier to determine which candidate disease name is appropriate.
Also, by displaying the candidate wound names from the top in the order of high relevance rate, the user can examine the candidate wound names in descending order of coincidence, thereby facilitating selection.
Here as the precision
(Number of matched grams) ÷ (number of grams of master phrase)
Use
(Number of matched grams) ÷ (Number of grams in standardized character string)
The reason for not using is that the order of coincidence between the two is different.
In the former case, for example, when “contact dermatitis” is searched, “contact dermatitis” and “dermatitis” are also displayed at the top of the list with a precision 1 respectively. However, if the latter is used as the precision, “contact dermatitis” is similarly displayed at the top of the list with precision 1; however, “dermatitis” only matches approximately half of the gram. It will be displayed below.
In general, when searching for a wound name is not found, if the exact wound name is not found, the name of the wound that broadly captures the wound name, that is, “dermatitis” for “contact dermatitis”, “acute bronchitis” "It is natural to choose" bronchitis ".
Accordingly, by using the former relevance rate, it is possible to evaluate that the relevance rate is higher for a disease name that broadly captures the name of the disease, and it is possible to present candidate disease names with priority orders suitable for the user's selection work.
When the user finds an appropriate wound name from the list by the approximate wound name search by the automatic coding process, the candidate wound name is selected and the “adopt” button 1507 is pressed, and the result is stored in the coded result storage table. 13 can be stored. If not selected, a “structure analysis” button 1508 is pressed to return to the screen shown in FIG. 13 and a coding result obtained by automatic coding can be adopted.
In order to obtain the code of the disease name string by the approximate disease name search, the user selects the most appropriate one from among tens of thousands of standard injury names as in the conventional technique, and inputs the code by punch input or the like. On the other hand, it is much easier and the effect of preventing erroneous registration due to an input error is high.
In addition, even if an appropriate code cannot be found, it is possible to easily extract the name of the disease and sickness that captures the character string of the disease and disease in a broad sense, so there is little variation in the coding results due to differences in user knowledge and experience, and anyone can It is possible to obtain roughly similar coding results. The consistency of the encoded results is important for statistical analysis because it can prevent differences in results due to differences in interpretation when compared with other disease trend analysis results.
FIG. 17 is a diagram showing a specific example of the coding result storage table 13 after the approximate injury / illness name search in this embodiment.
Here, as a result of an approximate wound name search for “asthmatic bronchitis”, “olfactory dysfunction”, and “tonsillitis (chronic)” in the input character string 1701 whose structural evaluation was “impossible” at the time of FIG. The code has been updated to be more appropriate. Also, the structural evaluation 1708 is changed to “approximate” indicating that it is a result of the approximate injury / illness name search, and this is also displayed in the comment 1709. Since the code A string 1706 and the structural formula 1707 are not generated in the approximate disease name search, the display is blank.
In the approximate wound name search, various character strings of wound names can be encoded. For example, in FIG. 17, "asthmatic bronchitis" (bronchitis with asthma-like symptoms) and "asthmatic bronchitis" (bronchi due to asthma) Injury and disease names that have similar meanings such as “flame”, broader names such as “olfactory dysfunction” and “sensory disturbance”, and word order notation such as “tonsillitis (chronic)” and “chronic tonsillitis” Can be easily coded.
The coding results stored in the coding result storage table 13 can be output to the outside of the computer device 1 arbitrarily, and the user can perform aggregation and statistical processing based on this information and use it for disease trend analysis and the like. it can.
The maintenance screens for various tables in this embodiment will be described below with reference to FIGS. 18 and 19.
FIG. 18 is a diagram showing a specific example of the maintenance screen of the notation fluctuation table 5.
The maintenance screen includes a list that lists words stored in the notation fluctuation table 5, and at least a fluctuation word 1801 and a standard word 1802 are displayed. The conversion count 1803 is an accumulation of the number of times the fluctuation phrase and standard phrase are used by the notation fluctuation standardization means 4.
While viewing this list, the user adds a new fluctuation phrase and standard phrase using the button 1804, deletes an unnecessary fluctuation phrase and standard phrase using the button 1805, changes the contents of the phrase using the button 1806, or changes a button 1807. The maintenance screen can be terminated without editing.
In order to efficiently encode the disease name character string input at any time, it is preferable to always maintain the optimum state of the notation fluctuation table. Therefore, if there is a poor coding result due to a new notation fluctuation, add a new fluctuation phrase, standard phrase, or if the previously registered fluctuation phrase and standard phrase are rarely used thereafter, Deleting it reduces the verification load and enables faster conversion. The number of conversions 1803 is displayed as a guide for grasping the usage status of the fluctuation phrase and the standard phrase.
FIG. 19 is a diagram showing a specific example of the maintenance screen of the structural formula table 9.
On the maintenance screen, there is a list that lists the structural formulas stored in the structural formula table 9, and at least the structural formula 1901 and the structural evaluation 1902 are displayed. A comment 1903 is also displayed to add more detailed information when the structural evaluation is presented to the user.
While viewing this list, the user can add a new structural formula with the button 1904, delete an unnecessary structural formula with the button 1905, change the structural formula or structural evaluation with the button 1906, or do not edit it with the button 1907. The maintenance screen can be terminated.
The structural formula is an index used for evaluating the coding result in this embodiment. When a new structural formula is added or the structural evaluation is not intended by the user, the structural evaluation is adjusted on this maintenance screen. There is a need to do.
The encoding system according to the second embodiment of the encoding method of the present invention will be described below with reference to FIGS.
In the above-described embodiment of the present invention, when the user determines that the result of the automatic coding by the master verification is not appropriate, the user needs to perform an approximate wound name search and convert it to an appropriate code each time. In this case, when the same disease name character string is input, the user is burdened with the same processing.
In the configuration diagram of this embodiment shown in FIG. 20, in addition to the above-described embodiment of FIG. 1, conversion history recording means 18, conversion history search means 19, and conversion history table 20 are provided.
As a result of being automatically encoded and determined to be adopted by the user, or as a result of a search for an approximate disease name by the user, an input disease name character string, standardized character string, code A, stored in the encoded result storage table 13 Code B, code A string, structural formula, structural evaluation, and comment are stored in the conversion history table by the conversion history recording means 18.
In the present embodiment, when a disease name character string is input, the control unit 14 first passes the disease name character string to the conversion history search unit 19, and the conversion history search unit 19 searches the conversion history table for the same input character string. To do.
When the same input character string is detected, the conversion history search means 19 returns the information in the table to the control means 14, and the control means 14 records it in the encoded result storage table 13 via the matching result correction means 12. .
According to the present embodiment, the encoding result once adopted by the user and stored in the encoding result storage table 13 is stored in the conversion history table 20, and when the same injury name character string is input next time, Since the conversion history table 20 is registered in the encoding result storage table 13, various processes in the middle can be omitted. Therefore, particularly when a large number of injury / sickness name character strings are input and many of the same injury / sickness name character strings are included therein, the processing speed can be remarkably improved.
FIG. 21 is a specific example of a screen displayed on the display unit 16 by the collation result correcting unit 12 in the present embodiment.
FIG. 21 is a display screen of an approximate injury / illness name search result similar to FIG. 16 of the above-described embodiment. In FIG. 21, a button 2101 is additionally provided, and the user stores the approximate injury / disorder name search result in the conversion history table 20. It is possible to select whether or not. Therefore, the user can adjust only an arbitrary coding result so as to obtain a similar coding result next time.
The encoding system according to the third embodiment of the encoding method of the present invention will be described below with reference to FIGS.
FIG. 22 is a block diagram of the present embodiment, and in this embodiment, the computer apparatus 1 that performs coding processing on the disease name character string data 2 that is the input of FIG. Connected to the information database 18 to obtain input information. The medical information database 21 is a database for managing medical information such as an electronic medical record or a receipt processing system, and stores various information such as patient information, medical practice information, and accounting information in addition to the names of wounds.
In the present embodiment, since various medical information other than the disease name character string is obtained at the time of encoding, it is possible to refer to such information at the time of visual confirmation and code correction by the user.
FIG. 23 is a specific example of a screen displayed on the display means 18 in the present embodiment. The display unit 17 displays the matching result display screen 2301 described with reference to FIGS. 13 and 15 in the first embodiment, and also displays a medical information display screen 2302 acquired from the medical information database 21. . The medical information display screen 2302 may be text information or image information.
The user confirms the encoding result using the collation result display screen 2301, and refers to the medical treatment information display screen 2302 when making corrections as necessary. It is possible to know and to correct the code with higher accuracy.
The main points of the present invention are summarized as follows.
When an input character string is encoded by master collation, a phrase having a notation fluctuation in the character string is converted into a standard phrase in advance.
Further, in order to convert a phrase having notation fluctuation in the input character string into a standard phrase, the notation fluctuation phrase and the standard phrase are stored in the storage area as a pair.
Furthermore, a display screen for editing a pair of a phrase having a notation fluctuation and a standard phrase stored in the storage area is provided.
When the input character string is encoded by master collation, one or more words / phrases registered in the character string are detected, and an array of the word / phrase attributes is generated. Evaluate whether or not.
Further, the phrase, code and attribute are stored in the storage area.
Further, the phrase attribute and the structure evaluation indicating whether or not encoding is possible are stored in the storage area. Furthermore, it has a display screen for editing the array of phrase attributes and the structure evaluation stored in the storage area.
Furthermore, when the input character string and the encoded result are displayed, only the input character string that needs to be corrected can be displayed as a result of the evaluation.
Further, as a result of the evaluation, if correction is necessary, a screen for performing a fuzzy search from this screen is displayed.
The input character string and the encoded result are stored in the storage area, and the next time the same character string is input, the encoded result is output.
Furthermore, it has a display screen which can select whether to store the input character string and the encoded result in the storage area.
When the input character string and the encoded result are displayed, the form data or image that is the basis of the input character string is displayed together.
[0006]
【The invention's effect】
As described above, the encoding method of the present invention takes time and effort to register all names of wounds and sickness including notation fluctuations of various wound names as in the prior art when coding the names of wounds and sickness. Since there is no need to perform work that is prone to mistakes, the work efficiency and the reliability of code output are greatly improved.
In addition, according to the present invention, as in the prior art, all the names of wounds and sickness that were not detected at the time of matching with the name of the sickness were confirmed by human eyes, and it was necessary to add them to the user dictionary as necessary. By automatically evaluating the encoding result and presenting it to the user, there is an effect that minimum visual confirmation and correction work can be performed according to the user's needs and applications.
In addition, according to the present invention, when coding a disease name that has not been detected at the time of disease name master matching as in the prior art, a code must be directly input by punch input or the like. By selecting the most appropriate approximate disease name, the user records the code, so that the work efficiency and the reliability of code output are greatly improved.
[Brief description of the drawings]
FIG. 1 is a basic configuration diagram showing an embodiment according to the present invention.
FIG. 2 is a process flow diagram illustrating one embodiment according to the present invention.
FIG. 3 is a diagram illustrating an example of a notation fluctuation table.
FIG. 4 is a flowchart of notation fluctuation standardization means.
FIG. 5 is a diagram showing an example of a wound name master table.
FIG. 6 is an explanatory diagram showing types of word attributes.
FIG. 7 is a flowchart of the disease name master collating means.
FIG. 8 is an explanatory diagram showing a selection rule when a plurality of master words / phrases are detected.
FIG. 9 is an explanatory diagram showing the behavior of a collation table.
FIG. 10 is a diagram showing an example of a structural formula table.
FIG. 11 is a flowchart of character string structure analysis means.
FIG. 12 is a diagram showing an example of a coding result storage table after automatic processing.
FIG. 13 is a diagram showing an example of a coding result display screen in the collation result correcting unit after automatic processing.
FIG. 14 is a diagram showing an example of an approximate wound name table.
FIG. 15 is a flowchart of approximate wound name search means.
FIG. 16 is a diagram showing an example of a coding result display screen in the matching result correcting means at the time of approximate disease name search.
FIG. 17 is a diagram showing an example of a coding result storage table after an approximate injury / illness name search.
FIG. 18 is a diagram showing an example of a maintenance screen example of a notation fluctuation table.
FIG. 19 is a diagram showing an example of a maintenance screen of a structural formula table.
FIG. 20 is a basic configuration diagram showing a second embodiment according to the present invention.
FIG. 21 is a diagram showing an example of a coding result display screen in the matching result correcting unit after the approximate injury / illness name search in the second embodiment according to the present invention.
FIG. 22 is a basic configuration diagram showing a third embodiment according to the present invention.
FIG. 23 is a diagram showing an example of a display screen in the third embodiment according to the present invention.
[Explanation of symbols]
1. Computer device
2 ... Injury name string data
3 ... Wound and disease name coding result data
4 ... Notation fluctuation standardization means
5 ... Notation fluctuation table
6 ... Injury and disease name master verification means
7… Injury and disease name master table
8 ... Character string structure analysis means
9 ... Structural formula table
10 .... Approximate wound name search means
11 ... Approximate wound name table
12 ... Verification result correction means
13 ... Encoded result storage table
14 ... Control means
15 ... Collation table
16: Display means
17 ... Input means
18. Conversion history recording means
19 ... Conversion history search means
20 ... Conversion history table
21 ... Medical information database.

Claims

A notation fluctuation standardization means for collating an input character string with a notation fluctuation table, detecting a fluctuation word / phrase in the input character string and converting it to a corresponding standard word / phrase;
The input character string converted into the standard phrase is collated with a wound name master table as a standardized character string, and the wound name and modifier included in the standardized string are detected as a master phrase, and the code of the master phrase is A disease name master collating means to be extracted;
Display means for presenting the input character string, the standardized character string, the wound name extracted as a master word and its code,
The phrase obtained by gram-decomposing the input character string or the standardized character string is compared with the approximate disease name table in which the phrase obtained by gram-decomposing the master word stored in the disease name master table is matched for each master word / phrase. It has an approximate disease name search means for counting the number of grams, using the ratio of the number of matched grams to the number of grams of the master phrase as a precision, and displaying the master phrase based on the precision. Coding system.

A program executed on a computer having a recording medium storing a notation fluctuation table and a sickness name master table and a display means ,
The computer collates the input character string with the notation fluctuation table, detects a fluctuation word / phrase in the input character string, converts it to a corresponding standard word / phrase, and uses the input character string converted to the standard word / phrase as a standardized character string. , Collate with the wound name master table, detect the wound name and modifier included in the standardized character string as a master phrase, and extract the code of the master phrase,
The input character string, standardized character string, wound name and disease name extracted as a master word and its code are presented to the display means, and the input character string or the phrase obtained by gram-decomposing the standardized character string is displayed in the disease name and disease name master table. The stored master word / phrase is collated with the approximate disease name table in which the words / phrases are stored, and the number of matched grams is counted for each master word / phrase, and the ratio of the number of matched grams to the number of grams of the master word / phrase is adapted. A program for executing a procedure of presenting the master word / phrase on the display unit based on the relevance ratio, which is used as a ratio.