JP7666966B2 - Learning method, learning model, prediction method, prediction device, and computer program

Info

Publication number: JP7666966B2
Application number: JP2021058145A
Authority: JP
Inventors: 嗣瑠笠
Original assignee: Ogis Ri Co Ltd
Current assignee: Ogis Ri Co Ltd
Priority date: 2021-03-30
Filing date: 2021-03-30
Publication date: 2025-04-22
Anticipated expiration: 2041-03-30
Also published as: JP2022154885A

Description

The present invention relates to demand forecasting, and in particular to a learning method, a learning model, a prediction method, a prediction device, and a computer program that achieve more accurate forecasts.

For businesses that sell goods or provide goods-related services, demand forecasting is important for procuring raw materials and supplies and for securing human resources. Various methods have conventionally been used to forecast the number of visitors, product sales, and the like (Patent Document 1).

Demand forecasting has relied on learning with a variety of models, such as exponential smoothing, moving averages, and multiple regression analysis. With advances in computing resources, prediction methods using deep learning have also been proposed.

Patent Document 1: Japanese Unexamined Patent Application Publication No. H04-000596

Conventional prediction methods have been studied in terms of which parameters contribute to improved accuracy. For example, in Patent Document 1, multiple regression analysis is performed on input data such as weather, day of the week, season, temperature, and whether a campaign is running.

Acquiring, analyzing, and learning from a large amount of data may improve prediction accuracy, but it increases the time required to output results or the cost of acquiring the data.

An object of the present invention is to provide a learning method, a learning model, a prediction method, a prediction device, and a computer program that achieve highly accurate demand forecasting without requiring the collection of many types of data.

A learning method according to an embodiment of the present disclosure trains a learning model, which takes data on the provision of goods or services at a specific location as input data and outputs data indicating demand for the goods or services, using sales records of the goods or services as training data. The method includes: a step of creating, from the sales records, feature data of at least one type among features related to the calendar of the day on which the goods or services are provided, statistics on the goods or services in the sales records, and features related to the sales amount of the goods or services; and a step of training the learning model using the created feature data as input data.

A learning model according to an embodiment of the present disclosure is trained so that, when a computer inputs data on the provision of goods or services at a specific location, data indicating demand for the goods or services is obtained. The learning model is trained using, as input data, feature data created from sales records of the goods or services, the feature data being of at least one type among features related to the calendar of the day on which the goods or services are provided, statistics on the goods or services in the sales records, and features related to the sales amount of the goods or services, and using, as the target variable, the data indicating the demand in the sales records.

A computer program according to an embodiment of the present disclosure causes a computer to train a learning model, which takes data on the provision of goods or services at a specific location as input data and outputs data indicating demand for the goods or services, using sales records of the goods or services as training data. The program causes the computer to execute: a step of creating feature data of at least one type among features related to the calendar of the day on which the goods or services are provided, statistics on the goods or services in the sales records, and features related to the sales amount of the goods or services; and a step of training the learning model using the created feature data as input data.

In the learning method of the present disclosure, feature engineering is applied to the sales records as preprocessing of the data to be input into the learning model (learning algorithm) used to predict demand. Calendar characteristics of the day on which the goods or services are provided, for example the day of the week or whether the day is a day off such as a Saturday, Sunday, or national holiday, affect the number of goods provided, the number of people served, and so on, and are therefore created as features. Using data such as trends in the number of goods or services sold, average spend per customer, and the presence or absence of discounts as features is also expected to improve accuracy.

In the learning method of one embodiment of the present disclosure, the feature data may include any one of: a binary value indicating whether the day on which the goods or services are provided falls on a specific calendar day; a continuous value indicating a calendar characteristic of the day on which the goods or services are provided; an average of actual values for the goods or services over the most recent specific period; an actual value on a day that falls on the same day of the week as the day on which the goods or services are provided; and an actual value related to the effect of discounts on the goods or services.

In the learning method of the present disclosure, the feature data includes whether the day falls on a specific day such as a day off, a long holiday period, or a payday; which day of the year it is; and a continuous value indicating where the day lies within a calendar cycle such as one year, one month, or one week (the value obtained when the cycle is expressed as a trigonometric function). The feature data may include trend data such as an average over the most recent specific period, and may include actual demand on days with the same calendar attribute. Using these data as features is expected to improve accuracy.

In the learning method of one embodiment of the present disclosure, the sales records are POS data on goods or services at the specific location.

In the learning method of the present disclosure, the feature data is derived from POS data. Because this data is normally available at the location where the goods or services are provided, training is possible without external data. Of course, training may also use data that may be related to demand for the goods or services, such as weather.

A prediction method according to an embodiment of the present disclosure predicts demand for goods or services at a specific location, and includes: a step in which a computer creates, from sales records at the specific location, feature data identical to the feature data used in training a learning model trained by any of the learning methods described above; a step in which the computer inputs the created feature data into the learning model; and a step of outputting data indicating demand obtained from the learning model.

A prediction device according to an embodiment of the present disclosure predicts demand for goods or services at a specific location, and includes: a creation unit that creates, from sales records at the specific location, the feature data used in training a learning model trained by any of the learning methods described above, the feature data being of at least one type among features related to the calendar of the day on which the goods or services are provided, statistics on the goods or services in the sales records, and features related to the sales amount of the goods or services; an input unit that inputs the created feature data into the learning model; and an output unit that outputs data indicating demand obtained from the learning model.

A computer program according to an embodiment of the present disclosure causes a computer to predict demand for goods or services at a specific location. The program causes the computer to execute: a step of creating, from sales records at the specific location, the feature data used in training a learning model that takes data on the provision of goods or services at the specific location as input data and outputs data indicating demand for the goods or services, the feature data being of at least one type among features related to the calendar of the day on which the goods or services are provided, statistics on the goods or services in the sales records, and features related to the sales amount of the goods or services; a step of inputting the created feature data into the learning model; and a step of outputting data indicating demand obtained from the learning model.

In the prediction method of the present disclosure, for a learning model trained with the created features, the same features used in training are created for the period to be predicted and input into the model, and the data indicating demand obtained from the model is used as the prediction result. The method does not use the raw sales records themselves as features; it extracts or computes the necessary features from the sales records to create feature data and uses those features. This improves prediction accuracy.

In the prediction method of one embodiment of the present disclosure, the learning model outputs an importance coefficient for each item of input feature data, and the computer extracts, based on the importance coefficients, the feature data whose importance is equal to or greater than a predetermined value, and outputs the data indicating demand obtained from the learning model together with the extracted feature data.

The prediction method of the present disclosure outputs information on the features that had a large influence in training. The provider of the goods or services can thus see which data, for example whether the day is a specific calendar day, strongly affects demand.
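The extraction of high-importance features described above can be sketched as follows. This is a minimal illustration only: the feature names, the importance values, the threshold of 0.10, and the helper names `select_important_features` and `build_report` are all hypothetical and do not come from the disclosure.

```python
from typing import Dict, List, Tuple


def select_important_features(
    importances: Dict[str, float], threshold: float
) -> List[Tuple[str, float]]:
    """Return (name, importance) pairs at or above the threshold, highest first."""
    selected = [(name, score) for name, score in importances.items() if score >= threshold]
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


def build_report(predicted_visitors: float, importances: Dict[str, float], threshold: float) -> dict:
    """Bundle the predicted demand with the features that mattered most."""
    return {
        "predicted_visitors": predicted_visitors,
        "important_features": select_important_features(importances, threshold),
    }


# Hypothetical importance coefficients, as a trained model might report them.
example_importances = {"is_holiday": 0.31, "visitors_4w_mean": 0.42, "cash_ratio_4w": 0.05}
report = build_report(128.0, example_importances, threshold=0.10)
```

In this sketch the low-importance feature `cash_ratio_4w` is dropped from the report, so the operator sees only the factors that drove the forecast.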

According to the learning method, learning model, prediction method, prediction device, and computer program of the present disclosure, training on appropriately created features achieves highly accurate demand forecasting without requiring the collection of many types of data.

FIG. 1 is a schematic diagram of the prediction device of the present disclosure.
FIG. 2 is a block diagram showing the configuration of the prediction device of the first embodiment.
FIG. 3 is a schematic diagram of an example of the learning algorithm.
FIG. 4 is a flowchart showing an example of the training procedure of the learning algorithm.
FIG. 5 is a flowchart showing an example of the prediction procedure performed by the prediction device.
FIG. 6 is a diagram showing a display example of prediction data.
FIG. 7 is a block diagram showing the configurations of the prediction device and a terminal device in the second embodiment.

The learning method, learning model, prediction method, prediction device, and computer program according to the present application are described below with reference to the drawings showing the embodiments.

(First Embodiment)
FIG. 1 is a schematic diagram of the prediction device 1 of the present disclosure. The prediction device 1 is connected to an aggregation device 2 installed in a store that sells goods or provides services. The prediction device 1 predicts demand based on the sales records aggregated by the aggregation device 2. Examples of demand include the number of visitors or the number of sales in a set period. In the following description, the aggregation device 2 is a POS (Point of Sales) system, and the sales records are POS data. In this embodiment, a prediction device 1 and an aggregation device 2 are installed in each store as shown in FIG. 1, and demand is predicted for each individual store. The arrangement is not limited to this: an aggregation device 2 may be installed in each store while a single prediction device 1 communicates with the aggregation device 2 of each store and centrally manages the POS data of all stores. A single demand forecast may also be made for a group of stores as a whole by pooling their POS data. However, because the features described later often differ from store to store, predicting demand for each store is preferable.

The prediction device 1 of the present disclosure predicts demand by machine learning, using POS data corresponding to the sales records as input data. In the following description, the prediction device 1 predicts the number of visitors per day to one store as the demand. Of course, the demand predicted by the prediction device 1 is not limited to the number of visitors; the prediction target (the type of quantity predicted) may differ, for example the number of items sold or the number of visiting groups. The period over which demand is measured may also differ, for example the number of visitors per week or the three-day average number of visitors. Furthermore, in this embodiment the prediction device 1 predicts the number of visitors one day ahead, but it may instead predict the number of visitors more than one day ahead, such as one week or four weeks (28 days) ahead.

FIG. 2 is a block diagram showing the configuration of the prediction device 1 of the first embodiment. The prediction device 1 is a personal computer or a server computer. In the following description the prediction device 1 is assumed to be a single computer, but multiple computers may be connected over a network to perform distributed processing. The prediction device 1 may also be implemented as part of the functions of a computer that serves other purposes.

The prediction device 1 includes a processing unit 10, a storage unit 11, a communication unit 12, a display unit 13, an operation unit 14, and the like.

The processing unit 10 is a processor using a CPU (Central Processing Unit) and/or a GPU (Graphics Processing Unit). The processing unit 10 executes prediction processing based on a prediction program 1P and a learning algorithm (learning model) 1L stored in the storage unit 11.

The storage unit 11 uses a non-volatile memory such as a hard disk, a flash memory, or an SSD (Solid State Drive). The storage unit 11 stores data referenced by the processing unit 10, the prediction program 1P, and the learning algorithm 1L. The prediction program 1P and/or the learning algorithm 1L may be a copy of a prediction program 9P and/or a learning algorithm 9L stored on a recording medium 9, read by the processing unit 10 and copied to the storage unit 11.

The learning algorithm 1L includes a group of libraries for performing machine learning and the learned parameters. The learning algorithm 1L is trained so that, when input data derived from POS data as described later is input, it outputs predicted demand for a specified period. In the first embodiment, the learning algorithm 1L uses, for example, LightGBM (Light Gradient Boosting Machine). The learning algorithm 1L may be another supervised learning algorithm that implements decision-tree-based machine learning, but the use of gradient boosting and a leaf-wise algorithm is more preferable.

The storage unit 11 stores the input POS data. POS data is retained for a set period (for example, 10 years, 4 years, 1 year, 6 months, or 3 months).

The communication unit 12 provides communication with the POS system over a local network. Specifically, the communication unit 12 is, for example, a network card. It may also be a serial communication interface or a wireless communication module. The processing unit 10 reads POS data from the POS system through the communication unit 12 and stores it in the storage unit 11.

The display unit 13 is a display such as a liquid crystal display or an organic EL (Electro Luminescence) display. The display unit 13 displays a screen containing the prediction results. This screen may include an interface for accepting prediction settings. The display unit 13 may be a display with a built-in touch panel.

The operation unit 14 is a user interface, such as a keyboard and a pointing device, that exchanges input and output with the processing unit 10. The operation unit 14 may be a voice input unit, the touch panel of the display unit 13, or a physical button.

The prediction device 1 configured in this way uses POS data acquired from the POS system to predict demand (the number of visitors per day) for a specified period (one day, one week, or one month ahead) and displays the prediction result on the display unit 13. As described below, the prediction device 1 does not input the POS data into the learning algorithm 1L as is. Instead, based on the prediction program 1P, it performs feature engineering, extracting appropriate features from the POS data to create feature data. The prediction device 1 inputs the data obtained by feature engineering into the learning algorithm 1L, then acquires and outputs the demand data derived by the learning algorithm 1L.

FIG. 3 is a schematic diagram of an example of the learning algorithm 1L, and FIG. 4 is a flowchart showing an example of the training procedure of the learning algorithm 1L. The learning algorithm 1L is a leaf-wise decision tree algorithm. Its explanatory variables and target variable are set so that, when feature data based on POS data (the explanatory variables) is input, it outputs the predicted number of visitors (the target variable) on the day to be predicted (that is, the day on which the goods or services are provided at the specific location).

The processing unit 10 executes the procedure described below every day. The execution frequency may be configurable. In that case, the execution frequency is determined based on, for example, the specified prediction horizon and the period over which the predicted demand is measured (per day in this embodiment). The procedure may be set to run daily when the specified horizon is one day ahead, at least once a week when it is one week ahead, and at least once a month when it is one month (four or five weeks) ahead. When the predicted demand is the number of visitors per week rather than per day, the shortest execution interval is one week.
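One way to read the scheduling rule above is as a pair of bounds on the interval between runs. The sketch below is an interpretation, not the disclosed implementation: the function name `run_interval_bounds` is hypothetical, and it assumes the demand period sets the shortest useful interval while the prediction horizon sets the longest allowable one.

```python
from typing import Tuple


def run_interval_bounds(horizon_days: int, demand_period_days: int = 1) -> Tuple[int, int]:
    """Return (shortest, longest) interval in days between training runs.

    horizon_days: how far ahead demand is predicted (1, 7, 28, ...).
    demand_period_days: period over which demand is measured (1 = per day).
    """
    if horizon_days < 1 or demand_period_days < 1:
        raise ValueError("both arguments must be at least 1 day")
    shortest = demand_period_days                    # no point running more often than this
    longest = max(horizon_days, demand_period_days)  # run at least once per horizon
    return shortest, longest
```

For example, a one-week horizon with daily demand gives bounds of 1 and 7 days, matching "at least once a week", while weekly demand raises the shortest interval to 7 days.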

The processing unit 10 reads POS data for a predetermined period (for example, four years) from the storage unit 11 (step S101). The processing unit 10 preferably acquires the POS data in units that match calendar cycles, such as years, months, or seasons.

POS data includes, for example, the sale date, time, receipt number, department code, department name, classification code, classification name, product code, product name, sales amount, and number of items sold. Not all of these need to be used as POS data; only some of them may be used, and other elements may be used as well.
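As a rough sketch of how such POS data might be represented and reduced to daily totals before feature engineering, consider the following. The class `PosLine`, its field names, and the function `daily_totals` are hypothetical, and only a subset of the fields listed above is modeled. Counting distinct receipt numbers per day follows the later description of the number of visiting groups as the number of receipts.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, time
from typing import Dict, Iterable


@dataclass(frozen=True)
class PosLine:
    """One line item of POS data (hypothetical subset of fields)."""
    sale_date: date
    sale_time: time
    receipt_no: str
    department_code: str
    product_code: str
    product_name: str
    amount: int    # sales amount for this line
    quantity: int  # number of items sold


def daily_totals(lines: Iterable[PosLine]) -> Dict[date, Dict[str, int]]:
    """Aggregate line items into per-day receipt counts and sales amounts."""
    receipts = defaultdict(set)
    amounts = defaultdict(int)
    for line in lines:
        receipts[line.sale_date].add(line.receipt_no)
        amounts[line.sale_date] += line.amount
    return {
        day: {"receipts": len(receipts[day]), "sales_amount": amounts[day]}
        for day in sorted(receipts)
    }


sample = [
    PosLine(date(2021, 3, 1), time(12, 5), "R001", "D1", "P10", "coffee", 300, 1),
    PosLine(date(2021, 3, 1), time(12, 5), "R001", "D1", "P11", "cake", 450, 1),
    PosLine(date(2021, 3, 1), time(13, 0), "R002", "D1", "P10", "coffee", 600, 2),
    PosLine(date(2021, 3, 2), time(9, 30), "R003", "D2", "P20", "sandwich", 500, 1),
]
totals = daily_totals(sample)
```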

From the POS data it has read, the processing unit 10 uses feature engineering to generate, for each day, multiple calendar-related features, multiple statistics on the demand to be predicted (here, the number of visitors), and multiple features related to the sales amount (step S102).

In the feature engineering of step S102, the processing unit 10 creates, as calendar-related features, binary (TRUE or FALSE) features indicating whether the day on which the goods or services are provided at the store (the day to be predicted) falls on a specific calendar day. Examples include "national holiday or not", "Saturday, Sunday, or national holiday or not", "year-end and New Year period or not", "within a long holiday period or not", "within a long holiday period of a foreign country or not", "day before a day off or not", "first day of the month or not", "last day of the month or not", "first day of a quarter or not", "last day of a quarter or not", and "payday or not". The processing unit 10 also creates, as calendar-related features, continuous (non-binary) features indicating calendar characteristics of the day on which the goods or services are provided at the store. Examples include "number of consecutive days off", "number of days since payday (for each payday)", "N, where the day is the Nth occurrence of its weekday within the month", "value of a cosine function with a 365-day period", "value of a cosine function with a 7-day period", "elapsed fraction of the year", and "logarithm of the elapsed fraction of the year". For "N, where the day is the Nth occurrence of its weekday within the month", the feature value for a third Wednesday, for example, is "3.0".
For "value of a cosine function with a 365-day period", the feature value is "1.0" for a day one quarter of the way through the year, "0.0" for a day halfway through the year, and "-1.0" for a day three quarters of the way through the year.
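A few of the calendar features above can be sketched with the Python standard library as follows. The function and key names are hypothetical, as are the holiday set and the payday on the 25th. One point needs care: the example values given in the text (1.0 at one quarter of the cycle, 0.0 at one half, -1.0 at three quarters) are those of a cosine shifted by a quarter period, equivalently sin(2πt), not of an unshifted cos(2πt). The sketch therefore assumes a quarter-period phase shift so that it reproduces the stated values.

```python
import calendar
import math
from datetime import date, timedelta


def nth_weekday_in_month(d: date) -> float:
    """Return N where d is the Nth occurrence of its weekday in the month."""
    return float((d.day - 1) // 7 + 1)


def cyclic_value(fraction: float) -> float:
    """Cosine with an assumed quarter-period phase shift.

    fraction is the elapsed share of the cycle (0.0 to 1.0).
    """
    return math.cos(2.0 * math.pi * fraction - math.pi / 2.0)


def calendar_features(d: date, holidays=frozenset(), paydays=(25,)) -> dict:
    """Build binary and continuous calendar features for one day."""
    days_in_year = 366 if calendar.isleap(d.year) else 365
    day_of_year = d.timetuple().tm_yday
    is_holiday = d in holidays
    return {
        # Binary features
        "is_holiday": is_holiday,
        "is_weekend_or_holiday": d.weekday() >= 5 or is_holiday,
        "is_month_start": d.day == 1,
        "is_month_end": (d + timedelta(days=1)).month != d.month,
        "is_payday": d.day in paydays,  # assumed payday rule
        # Continuous features
        "nth_weekday_in_month": nth_weekday_in_month(d),
        "year_cycle": cyclic_value((day_of_year - 1) / days_in_year),
        "week_cycle": cyclic_value(d.weekday() / 7.0),
        "year_progress": day_of_year / days_in_year,
    }
```

A cyclic encoding of this kind keeps the end of one cycle numerically close to the start of the next, which a plain day-of-year counter does not.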

Likewise, in the feature engineering of step S102, the processing unit 10 creates, as statistics on the number of visitors, features that average the actual number of visitors over the most recent specific period up to the day to be predicted, such as "one-week average number of visitors", "four-week average number of visitors", and "two-month average number of visitors". The processing unit 10 creates features of the actual number of visitors on days that share a calendar attribute (day of the week, month) with the day on which the goods or services are provided at the store, such as "number of visitors one week ago (four weeks ago)", "number of visitors 52 weeks ago", "median number of visitors on the same weekday over four weeks", "standard deviation on the same weekday over four weeks", "maximum number of visitors on the same weekday over four weeks", "minimum number of visitors on the same weekday over four weeks", and "difference between the four-week same-weekday average and the four-week average number of visitors".
The processing unit 10 also creates features related to the trend (direction of change) of the actual number of visitors, such as "difference between the one-week average and the four-week average number of visitors", "number of visitors two days ago", "number of visitors on the previous business day", "one-week standard deviation of the number of visitors", and "four-week standard deviation of the number of visitors".
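Some of the visitor statistics above can be sketched as follows, given a list of daily visitor counts that ends on the day before the target day. The names are hypothetical, and the sketch makes three assumptions not stated in the text: every day is a business day, at least 28 days of history exist, and the standard deviation is the population form (`statistics.pstdev`).

```python
from statistics import mean, median, pstdev
from typing import Dict, Sequence


def visitor_statistics(history: Sequence[float]) -> Dict[str, float]:
    """Compute recent-period, same-weekday, and trend features.

    history[-1] is the day before the target day, history[-k] is k days before.
    """
    if len(history) < 28:
        raise ValueError("at least 28 days of history are required")
    last_7 = history[-7:]
    last_28 = history[-28:]
    # Days 7, 14, 21, and 28 before the target share its day of the week.
    same_weekday = [history[-7 * k] for k in (1, 2, 3, 4)]
    return {
        "mean_1w": mean(last_7),
        "mean_4w": mean(last_28),
        "visitors_1w_ago": history[-7],
        "visitors_4w_ago": history[-28],
        "same_weekday_median_4w": median(same_weekday),
        "same_weekday_std_4w": pstdev(same_weekday),
        "same_weekday_max_4w": max(same_weekday),
        "same_weekday_min_4w": min(same_weekday),
        "same_weekday_mean_minus_mean_4w": mean(same_weekday) - mean(last_28),
        "mean_1w_minus_mean_4w": mean(last_7) - mean(last_28),
        "visitors_2d_ago": history[-2],
        "visitors_prev_day": history[-1],
        "std_1w": pstdev(last_7),
        "std_4w": pstdev(last_28),
    }
```

All of these values use only days before the target day, so the same function can be applied when training and when predicting.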

Likewise, in the feature engineering of step S102, the processing unit 10 creates, as features related to the sales amount, features of the recent share of cash payments (recent relative to the input data), such as "four-week average of the number of cash transactions (receipts paid in cash) / the number of visiting groups per day (number of receipts)". The processing unit 10 creates features of actual values reflecting the effect of recent discounts on the number of visitors, such as "three-day average of {(number of price-reduction groups (receipts with a price reduction) + number of discount groups (receipts with a discount)) / number of visiting groups per day}", "four-week average of {(number of price-reduction groups + number of discount groups + number of promotion groups (groups that visited because of a promotion)) / number of visiting groups per day}", and "four-week average of {(number of price-reduction groups + number of discount groups + number of promotion groups + number of takeout price reductions) / number of visiting groups per day}". The processing unit 10 also creates features of actual values related to recent spend per customer, such as "three-day average of spend per customer (sales amount / number of visitors)" and "three-day average of the number of visitors per group (number of visitors / number of visiting groups)".
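The ratio features above share one pattern: compute a per-day ratio, then average it over a recent window. The sketch below illustrates that pattern under stated assumptions. The class `DailySales` and all field and key names are hypothetical, price-reduction and discount receipts are merged into one count for brevity, and a day with a zero denominator contributes a ratio of 0.0, which is a choice made here and not one taken from the text.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Sequence


@dataclass(frozen=True)
class DailySales:
    """Daily totals derived from POS data (hypothetical structure)."""
    receipts: int             # number of visiting groups
    cash_receipts: int        # receipts paid in cash
    discounted_receipts: int  # receipts with a price reduction or discount
    visitors: int
    sales_amount: int


def _ratio(numerator: float, denominator: float) -> float:
    """Per-day ratio, treating a zero denominator as 0.0 (assumption)."""
    return numerator / denominator if denominator else 0.0


def sales_amount_features(days: Sequence[DailySales]) -> Dict[str, float]:
    """days is ordered oldest first and ends the day before the target day."""
    if len(days) < 28:
        raise ValueError("at least 28 days of history are required")
    last_3, last_28 = days[-3:], days[-28:]
    return {
        "cash_ratio_4w": mean(_ratio(d.cash_receipts, d.receipts) for d in last_28),
        "discount_ratio_3d": mean(_ratio(d.discounted_receipts, d.receipts) for d in last_3),
        "spend_per_visitor_3d": mean(_ratio(d.sales_amount, d.visitors) for d in last_3),
        "visitors_per_group_3d": mean(_ratio(d.visitors, d.receipts) for d in last_3),
    }
```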

ステップＳ１０２で生成した、各日についての特徴量のデータを入力データとし、その日の実際の来客者数を目的変数とした教師データを用いて、処理部１０は、学習アルゴリズム１Ｌの学習を実行し（ステップＳ１０３）、学習処理を終了する。 The processing unit 10 uses the feature data for each day generated in step S102 as input data and the training data with the actual number of visitors on that day as the objective variable to execute learning of the learning algorithm 1L (step S103), and ends the learning process.

ステップＳ１０３における学習で処理部１０は、ステップＳ１０１で読み出したデータに対応する所定期間の単位（年）よりも小さい単位（月）での直近のデータをイテレーション、ラウンド数、サイクル数等と呼ばれる学習回数の評価（validation）に用いるとよい。例えば処理部１０は、ステップＳ１０１で４年間分のＰＯＳデータを読み込んだ場合、ステップＳ１０３で３年１１か月分の各日の特徴量のデータと、その実際の来客者数とを教師データとし、１ヶ月分の各日の特徴量のデータと、その実際の来客者数とをテストデータとして指定する。処理部１０は、テストデータにより、学習回数ごとの評価関数を導出し、最適な学習回数を決定して学習を実行してもよい。ステップＳ１０３において処理部１０は、学習アルゴリズム１Ｌにおけるハイパーパラメータ、例えば決定木の深さの指定をしてもよい。 In the training of step S103, the processing unit 10 preferably uses the most recent data, in a unit (months) smaller than the unit (years) of the predetermined period covered by the data read in step S101, to validate the number of training rounds (also called iterations, rounds, or cycles). For example, if the processing unit 10 reads four years of POS data in step S101, then in step S103 it designates the daily feature data for three years and eleven months, together with the actual visitor counts, as training data, and the daily feature data for the remaining one month, together with the actual visitor counts, as test data. The processing unit 10 may compute the evaluation function on the test data for each number of training rounds, determine the optimal number of rounds, and train accordingly. In step S103, the processing unit 10 may also specify hyperparameters of the learning algorithm 1L, for example the depth of the decision trees.
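
The hold-out scheme described above (train on everything except the most recent month, then choose the number of rounds that scores best on that month) can be sketched as follows. The function names, the row layout, and the use of 28 days to stand in for "one month" are assumptions made for illustration.

```python
from datetime import date, timedelta


def split_recent(rows, holdout_days=28):
    """Split date-sorted rows into (training, hold-out) by recency.

    `rows` is a list of (date, features, target) tuples sorted by date.
    The last `holdout_days` calendar days form the hold-out set;
    28 days is an assumed stand-in for the "one month" in the text.
    """
    cutoff = rows[-1][0] - timedelta(days=holdout_days)
    train = [r for r in rows if r[0] <= cutoff]
    valid = [r for r in rows if r[0] > cutoff]
    return train, valid


def best_iteration(valid_errors):
    """Return the 1-based number of rounds with the lowest hold-out error.

    `valid_errors[k]` is the evaluation value on the hold-out set after
    k + 1 training rounds. Ties resolve to the smaller round count.
    """
    best = min(range(len(valid_errors)), key=lambda k: valid_errors[k])
    return best + 1
```

Splitting by date rather than at random keeps future days out of the training set, which matches the forecasting setting where only past data is available.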

ステップＳ１０３の学習において、学習アルゴリズム１Ｌは、入力される特徴量のデータ（入力データ）を説明変数とした場合に目的変数を予測するための分岐条件を作成し、決定木を作成する。 In the learning step S103, the learning algorithm 1L creates branching conditions for predicting the objective variable when the input feature data (input data) is used as the explanatory variable, and creates a decision tree.
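
To make "creating branching conditions" concrete, the sketch below searches a single feature for the threshold that best separates the target values. It is a textbook regression-stump search under a squared-error criterion, not the procedure used by LightGBM (which grows trees from feature histograms inside a gradient-boosting loop); the function name is an assumption.

```python
def best_split(xs, ys):
    """Find the threshold t on one feature that minimises squared error.

    Rows with x <= t go left, rows with x > t go right, and each side
    predicts the mean of its targets. Returns (threshold, total_error),
    or (None, inf) if the feature has a single distinct value.
    """
    def sse(vals):
        # sum of squared deviations from the mean
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    best_t, best_cost = None, float("inf")
    candidates = sorted(set(xs))
    # try the midpoint between each pair of adjacent distinct values
    for lo, hi in zip(candidates, candidates[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost
```

For a binary feature such as "weekend or holiday", the only candidate threshold is 0.5, so the branch simply separates weekdays from weekends and holidays.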

図４のフローチャートに示した処理により、処理部１０は例えば毎日、指定された期間における１日当たりの来客数を予測する学習アルゴリズム１Ｌを作成する。 Through the process shown in the flowchart of FIG. 4, the processing unit 10 creates, for example every day, the learning algorithm 1L that predicts the number of visitors per day in a specified period.

図４のフローチャートに示した処理手順の内、ステップＳ１０１及びステップＳ１０２において処理部１０は、新たに追加されたＰＯＳデータのみを読み出し、追加された期間中の各日における特徴量を生成し、生成した特徴量を、日にちを識別するデータに対応付けて記憶しておいてもよい。処理部１０は、ＰＯＳデータそのものを用いずに特徴量エンジニアリング処理によって生成したデータを記憶部１１に蓄積してから使用してもよい。 In the processing procedure shown in the flowchart of FIG. 4, in steps S101 and S102, the processing unit 10 may read only the newly added POS data, generate features for each day during the added period, and store the generated features in association with data identifying the date. The processing unit 10 may accumulate data generated by the feature engineering process in the memory unit 11 and then use the data, without using the POS data itself.

図５は、予測装置１による予測処理手順の一例を示すフローチャートである。処理部１０は、例えば毎日、以下の処理を実行する。 FIG. 5 is a flowchart showing an example of a prediction processing procedure by the prediction device 1. The processing unit 10 executes the following processing, for example, every day.

処理部１０は、前日までに取得したＰＯＳデータを記憶部１１から読み出す（ステップＳ２０１）。 The processing unit 10 reads the POS data acquired up until the previous day from the memory unit 11 (step S201).

処理部１０は、ステップＳ２０１で読み出したＰＯＳデータから、特徴量エンジニアリングにより、学習の際に用いた特徴量と同様に、予測対象の需要の期間（１日）に対する特徴量（暦に関する特徴量、需要に関する統計量、及び、売上金額に関する特徴量）をそれぞれ、作成する（ステップＳ２０２）。 The processing unit 10 uses feature engineering to create features (calendar features, demand statistics, and sales amount features) for the period (one day) of the demand to be predicted, similar to the features used during learning, from the POS data read in step S201 (step S202).

処理部１０は、前日までのＰＯＳデータを用いて学習された学習アルゴリズム１Ｌを読み出す（ステップＳ２０３）。 The processing unit 10 reads out the learning algorithm 1L that was learned using POS data up to the previous day (step S203).

処理部１０は、ステップＳ２０２で作成した特徴量のデータを、ステップＳ２０３で読み出した学習アルゴリズム１Ｌに入力し（ステップＳ２０４）、学習アルゴリズム１Ｌから得られる予測データを取得する（ステップＳ２０５）。 The processing unit 10 inputs the feature data created in step S202 into the learning algorithm 1L read out in step S203 (step S204), and obtains prediction data obtained from the learning algorithm 1L (step S205).

処理部１０は、ステップＳ２０３の学習アルゴリズム１Ｌにおける所定の重要度以上と判断された特徴量のデータを抽出する（ステップＳ２０６）。LightGBMを用いた学習アルゴリズム１Ｌでは、特徴量それぞれに対する重要度の係数が得られるので、処理部１０は、得られる係数の総和に対するその係数の割合を重要度として算出し、所定値以上の重要度の特徴量のデータを重要度と共に抽出する。 The processing unit 10 extracts data on features that are determined to have a predetermined importance or higher in the learning algorithm 1L in step S203 (step S206). In the learning algorithm 1L using LightGBM, an importance coefficient for each feature is obtained, so the processing unit 10 calculates the ratio of that coefficient to the sum of the obtained coefficients as the importance, and extracts data on features that have an importance equal to or higher than the predetermined value together with the importance.
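
The importance calculation in step S206 (each coefficient divided by the sum of all coefficients, then filtered by a threshold) can be sketched as below. The function name, the feature names in the usage note, and the default threshold are assumptions; the source does not state the threshold value or which kind of LightGBM importance is used.

```python
def important_features(coefficients, threshold=0.05):
    """Return features whose share of total importance is >= threshold.

    `coefficients` maps a feature name to its raw importance coefficient.
    The result maps each selected feature name to its share, ordered from
    most to least important. The default threshold is an assumed value.
    """
    total = sum(coefficients.values())
    if total == 0:
        return {}
    shares = {name: c / total for name, c in coefficients.items()}
    ranked = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)
    return {name: share for name, share in ranked if share >= threshold}
```

For example, with hypothetical coefficients `{"is_weekend_or_holiday": 50, "visitors_52w_ago": 30, "cos_365": 15, "cash_ratio_4w": 5}` and a threshold of 0.1, the first three features are kept with shares 0.5, 0.3, and 0.15.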

処理部１０は、取得した予測データ及び特徴量のデータを、予測対象の日にちを表すデータと対応付けて記憶部１１に記憶する（ステップＳ２０７）。 The processing unit 10 stores the acquired prediction data and feature data in the memory unit 11 in association with data representing the date to be predicted (step S207).

処理部１０は、必要分の予測データが記憶されたか否かを判断する（ステップＳ２０８）。ステップＳ２０８で処理部１０は、翌日分の全ての予測データが記憶されたか否かを判断してもよいし、予測データを計数してもよい。ステップＳ２０８で処理部１０は、翌月（２８日間）分の予測データが記憶されたか否かを判断してもよい。ステップＳ２０８で処理部１０は、集計する所定の日であるか否かを判断してもよい。例えば、予測対象の日にちとして１ヶ月（４週間）先が指定されている場合、各月の末日から１週間前の日を所定の日としてもよい。記憶されていないと判断された場合（Ｓ２０８：ＮＯ）、処理部１０は、処理を終了し、その日の処理を終了する。 The processing unit 10 determines whether the necessary amount of predicted data has been stored (step S208). In step S208, the processing unit 10 may determine whether all the predicted data for the next day has been stored, or may count the predicted data. In step S208, the processing unit 10 may determine whether the predicted data for the next month (28 days) has been stored. In step S208, the processing unit 10 may determine whether it is a specified day to be tallied. For example, if the date to be predicted is specified one month (4 weeks) in the future, the specified day may be the day one week before the last day of each month. If it is determined that the data has not been stored (S208: NO), the processing unit 10 ends the process and ends the processing for that day.
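
One possible form of the check in step S208 is to list the target dates that still lack a stored forecast and proceed to output only when that list is empty. The function name and the dictionary-based store are assumptions; the 28-day default mirrors the "next month (28 days)" case in the text.

```python
from datetime import date, timedelta


def missing_forecast_dates(stored, start, days=28):
    """Return target dates in [start, start + days) with no stored forecast.

    `stored` maps a target date to its forecast value. An empty result
    corresponds to the S208: YES branch.
    """
    wanted = [start + timedelta(days=k) for k in range(days)]
    return [d for d in wanted if d not in stored]
```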

ステップＳ２０８において記憶されたと判断された場合（Ｓ２０８：ＹＥＳ）、処理部１０は、必要分、例えば１ヶ月（４週間）先が指定されている場合には１ヶ月分の予測データを日にちと対応付けて読み出す（ステップＳ２０９）。処理部１０は、読み出した予測データを日にちに対してプロットして表示部１３に出力し（ステップＳ２１０）、抽出された所定の重要度以上の特徴量のデータを予測データのプロットと共に表示部１３に出力し（ステップＳ２１１）、処理を終了する。 If it is determined in step S208 that the data has been stored (S208: YES), the processing unit 10 reads out the necessary amount of forecast data, for example, one month's worth of forecast data in association with the date if one month (four weeks) ahead is specified (step S209). The processing unit 10 plots the read forecast data against the date and outputs it to the display unit 13 (step S210), outputs the extracted feature data with a predetermined importance or higher to the display unit 13 together with the plot of the forecast data (step S211), and ends the process.

図６は、予測データの表示例を示す図である。図６には、横軸を日にち、縦軸を目標変数である来客数としたグラフ１３１、及び、グラフの日にちの期間で抽出された特徴量のデータと重要度とが対応付けられた一覧１３２が示されている。グラフ１３１は、予測データの値及びその推移を実線で示し、実際のデータを黒丸で示す。グラフ１３１には、各日の来客数の絶対値と、１か月前、３か月前からの増減分とが示されてもよい。これにより、各店舗の管理者は、翌月の人員確保、材料確保等の目途を付けることができる。 FIG. 6 is a diagram showing an example of display of predicted data. FIG. 6 shows a graph 131 with the horizontal axis representing the date and the vertical axis representing the number of customers as a target variable, and a list 132 in which the feature quantity data extracted during the period of the graph's dates and the importance are associated with each other. The graph 131 shows the value of the predicted data and its transition with a solid line, and the actual data with a black circle. The graph 131 may show the absolute value of the number of customers for each day, and the increase or decrease from one month ago or three months ago. This allows the manager of each store to estimate the number of personnel and materials for the next month.

単にＰＯＳデータを学習アルゴリズム１Ｌに入力した場合と比較して、予測対象の日にち（物品又はサービスが提供される日）が特定日に当たるか否かの二値の特徴量、予測対象の日にちの暦上における特徴を示す連続値の特徴量等を特徴量エンジニアリングによって作成して用いることで精度が向上した。２０２０年の国・自治体からの休業要請等、予測が困難な事象があったにもかかわらず、ある業態の店舗では店舗毎に４２．７％→４９．６％、６１．１％→６６．１％、４９．４％→６２．１％、５０．３％→５５．８％、５６．５％→６０．７％等と精度が向上したことが確認された。また「土日祝日であるか否か」、「同曜日４週の来客数の中央値」、「年間における経過割合」、「３６５日周期のコサイン関数での値」、「連休数」、「５２週前の来客数」の特徴量の重要度が高く算出される傾向にあった。特徴量エンジニアリングにて、予測対象の日にちの暦上における周期的な特徴を表す説明変数の重要度が高かった。また別の業態の店舗では店舗毎に４０．３％→６０．６％、４４．９％→６４．５％、４５．４％→６３．６％、４７．０％→６４．５％、４２．３％→５８．５％等と精度が向上したことが確認された。また「土日祝日であるか否か」、「連休数」、「５２週前の来客数」、「前営業日の来客数」、「２日前の来客数」、「２日前からの１週間来客数の平均」の特徴量の重要度が高く算出される傾向にあった。 Compared with simply inputting POS data into the learning algorithm 1L, accuracy improved when feature engineering was used to create features such as a binary feature indicating whether the prediction target date (the date on which the goods or services are provided) falls on a specific day, and a continuous-valued feature indicating the calendar characteristics of the target date. Despite events that were difficult to predict, such as the business-closure requests issued by the national and local governments in 2020, accuracy was confirmed to improve store by store in one business format: 42.7% → 49.6%, 61.1% → 66.1%, 49.4% → 62.1%, 50.3% → 55.8%, 56.5% → 60.7%, and so on. In these stores, the features "whether the day is a weekend or public holiday", "median number of customers on the same day of the week over four weeks", "elapsed fraction of the year", "value of a cosine function with a 365-day period", "number of consecutive holidays", and "number of customers 52 weeks earlier" tended to receive high importance. That is, the explanatory variables created by feature engineering to represent the periodic calendar characteristics of the target date had high importance. In stores of another business format, accuracy was likewise confirmed to improve store by store: 40.3% → 60.6%, 44.9% → 64.5%, 45.4% → 63.6%, 47.0% → 64.5%, 42.3% → 58.5%, and so on. In these stores, the features "whether the day is a weekend or public holiday", "number of consecutive holidays", "number of customers 52 weeks earlier", "number of customers on the previous business day", "number of customers two days earlier", and "one-week average number of customers counted from two days earlier" tended to receive high importance.
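
Three of the calendar features named above (the weekend-or-holiday flag, the elapsed fraction of the year, and the 365-day cosine value) can be sketched as follows. The function name, the key names, the caller-supplied holiday set, and the phase of the cosine (day of year divided by 365) are assumptions; the source does not give the exact formulas.

```python
import math
from datetime import date


def calendar_features(day, holidays=frozenset()):
    """Calendar features for one target date.

    `holidays` is a caller-supplied set of public-holiday dates.
    The cosine phase (day of year over 365) is an assumed convention.
    """
    day_of_year = day.timetuple().tm_yday
    days_in_year = date(day.year, 12, 31).timetuple().tm_yday
    return {
        # binary: 1 if Saturday, Sunday, or a listed public holiday
        "is_weekend_or_holiday": int(day.weekday() >= 5 or day in holidays),
        # continuous: fraction of the year elapsed, in (0, 1]
        "year_progress": day_of_year / days_in_year,
        # continuous and periodic: late December and early January get
        # similar values, unlike a raw day-of-year number
        "cos_365": math.cos(2 * math.pi * day_of_year / 365),
    }
```

The cosine encoding is useful because a tree split on a raw day-of-year value treats 31 December and 1 January as far apart, whereas the periodic value places them next to each other.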

（第２の実施形態） (Second Embodiment)
図７は、第２の実施形態における予測装置３及び端末装置４の構成を示すブロック図である。第２の実施形態における予測装置３の構成を示すブロック図である。第２の実施形態における予測装置３の構成は、以下に示すように、端末装置４と共にサーバ及びクライアント形式としたこと以外、第１の実施形態と同様である。 FIG. 7 is a block diagram showing the configurations of the prediction device 3 and the terminal device 4 in the second embodiment. The configuration of the prediction device 3 in the second embodiment is the same as that in the first embodiment, except that, as described below, the prediction device 3 and the terminal device 4 operate as a server and a client.

第２の実施形態において予測装置３は、各店舗、又は、各グループに設置された端末装置４との間で、インターネットを介した通信によってデータの送受信が可能な所謂クラウドサーバである。予測装置３は、サーバコンピュータを用いる。予測装置３は、複数台のコンピュータをネットワークで通信接続して分散処理させる態様であってもよい。 In the second embodiment, the prediction device 3 is a so-called cloud server that can exchange data over the Internet with the terminal devices 4 installed in each store or each group. The prediction device 3 is implemented on a server computer. The prediction device 3 may instead consist of multiple computers connected over a network for distributed processing.

予測装置３は、処理部３０、記憶部３１、及び通信部３２等を備える。処理部３０は、ＣＰＵ及び／又はＧＰＵを用いたプロセッサである。処理部３０は、記憶部３１に記憶されている予測プログラム３Ｐ及び学習アルゴリズム３Ｌに基づき、予測処理を実行する。 The prediction device 3 includes a processing unit 30, a storage unit 31, and a communication unit 32. The processing unit 30 is a processor using a CPU and/or a GPU. The processing unit 30 executes prediction processing based on a prediction program 3P and a learning algorithm 3L stored in the storage unit 31.

記憶部３１は、例えばハードディスク、フラッシュメモリ、ＳＳＤ等の不揮発性メモリを用いる。記憶部３１は、処理部３０が参照するデータを記憶する。記憶部３１は、予測プログラム３Ｐを記憶する。記憶部３１は、学習アルゴリズム３Ｌを記憶する。予測プログラム３Ｐ及び／又は学習アルゴリズム３Ｌは、記録媒体８に記憶してある予測プログラム８Ｐ及び／又は学習アルゴリズム８Ｌを処理部３０が読み出して記憶部３１に複製したものであってもよい。 The storage unit 31 uses a non-volatile memory such as a hard disk, flash memory, SSD, etc. The storage unit 31 stores data referenced by the processing unit 30. The storage unit 31 stores a prediction program 3P. The storage unit 31 stores a learning algorithm 3L. The prediction program 3P and/or the learning algorithm 3L may be a prediction program 8P and/or a learning algorithm 8L stored in the recording medium 8 that is read by the processing unit 30 and copied to the storage unit 31.

学習アルゴリズム３Ｌは、第１の実施形態の学習アルゴリズム１Ｌと同様のLightGBMを用いた機械学習を実施するためのライブラリ群と、学習済みパラメータとを含むデータである。 Learning algorithm 3L is data that includes a library group for performing machine learning using LightGBM similar to learning algorithm 1L in the first embodiment, and learned parameters.

記憶部３１は、複数の端末装置４それぞれから収集したＰＯＳデータを、送信元の端末装置４別に、部門コード毎に記憶する。ＰＯＳデータの記憶期間は、第１の実施形態と同様に設定される。 The storage unit 31 stores the POS data collected from each of the multiple terminal devices 4 by department code and by sending terminal device 4. The storage period for the POS data is set in the same manner as in the first embodiment.
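
The storage layout described above, with POS data kept per sending terminal device and per department code, could be modelled as a two-level mapping. The class and method names are hypothetical, and an in-memory structure is used only for illustration; the source says only that the data is stored in non-volatile storage.

```python
from collections import defaultdict


class PosStore:
    """In-memory sketch of POS records keyed by terminal and department."""

    def __init__(self):
        # terminal id -> department code -> list of POS records
        self._records = defaultdict(lambda: defaultdict(list))

    def add(self, terminal_id, department_code, record):
        """Append one POS record received from a terminal device."""
        self._records[terminal_id][department_code].append(record)

    def get(self, terminal_id, department_code):
        """Return a copy of the records for one terminal and department."""
        return list(self._records.get(terminal_id, {}).get(department_code, []))
```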

通信部３２は、インターネットを含むネットワークを介した端末装置４との通信を実現する。通信部３２は、具体的にはネットワークカードである。処理部３０は、通信部３２によって端末装置４からＰＯＳデータを収集し、記憶部３１に記憶する。 The communication unit 32 realizes communication with the terminal device 4 via a network including the Internet. Specifically, the communication unit 32 is a network card. The processing unit 30 collects POS data from the terminal device 4 via the communication unit 32 and stores the data in the memory unit 31.

端末装置４は、パーソナルコンピュータである。デスクトップ型、ラップトップ型、タブレット端末、又はスマートフォンであってもよい。端末装置４は、処理部４０、記憶部４１、第１通信部４２、第２通信部４３、表示部４４及び操作部４５等を備える。端末装置４はローカルネットワーク（インターネットでもよい）を介して店舗のＰＯＳシステムと通信接続が可能である。 The terminal device 4 is a personal computer. It may be a desktop type, a laptop type, a tablet terminal, or a smartphone. The terminal device 4 includes a processing unit 40, a memory unit 41, a first communication unit 42, a second communication unit 43, a display unit 44, and an operation unit 45. The terminal device 4 can be connected to the store's POS system via a local network (which may be the Internet).

処理部４０は、ＣＰＵであり、記憶部４１に記憶されている端末プログラム４Ｐに基づき、処理を実行する。処理部４０は、第２通信部４３を介してＰＯＳシステムから取得したＰＯＳデータを予測装置３へ、端末装置４を識別する識別データと対応付けて第１通信部４２から送信する。処理部４０は、予測装置３における予測処理の結果（予測データ及び特徴量データ）を受信する。 The processing unit 40 is a CPU and executes processing based on the terminal program 4P stored in the memory unit 41. The processing unit 40 transmits POS data acquired from the POS system via the second communication unit 43 to the prediction device 3 from the first communication unit 42 in association with identification data for identifying the terminal device 4. The processing unit 40 receives the results of the prediction process in the prediction device 3 (prediction data and feature data).

記憶部４１は、ハードディスク、フラッシュメモリ、ＳＳＤ等の不揮発性メモリを用いる。記憶部４１は、端末プログラム４Ｐを記憶する。記憶部４１は、ＰＯＳデータを記憶する。記憶部４１は、予測装置３から受信した予測処理の結果を記憶する。 The memory unit 41 uses a non-volatile memory such as a hard disk, flash memory, or SSD. The memory unit 41 stores the terminal program 4P. The memory unit 41 stores POS data. The memory unit 41 stores the results of the prediction process received from the prediction device 3.

第１通信部４２は、予測装置３との通信接続を実現する。第１通信部４２は、ネットワークカード又は無線通信モジュールである。 The first communication unit 42 realizes a communication connection with the prediction device 3. The first communication unit 42 is a network card or a wireless communication module.

第２通信部４３は、ローカルネットワークを介したＰＯＳシステムとの通信を実現する。第２通信部４３は、ネットワークカードであってもよいし、無線通信モジュールであってもよいし、シリアル通信のインタフェースであってもよい。処理部４０は、第２通信部４３を介してＰＯＳシステムと通信が可能である。 The second communication unit 43 realizes communication with the POS system via the local network. The second communication unit 43 may be a network card, a wireless communication module, or a serial communication interface. The processing unit 40 is capable of communicating with the POS system via the second communication unit 43.

表示部４４は、液晶ディスプレイ、有機ＥＬディスプレイ等のディスプレイである。表示部４４は、予測の実行結果を含む画面を表示する。この画面は、予測のための設定を受け付けるインタフェースを含んでもよい。表示部４４は、タッチパネル内蔵型ディスプレイであってもよい。 The display unit 44 is a display such as a liquid crystal display or an organic EL display. The display unit 44 displays a screen including the results of the prediction. This screen may include an interface that accepts settings for the prediction. The display unit 44 may be a display with a built-in touch panel.

操作部４５は、処理部４０との間で入出力が可能なキーボード及びポインティングデバイス等のユーザインタフェースである。操作部４５は、音声入力部であってもよい。操作部４５は、表示部４４のタッチパネルであってもよい。操作部４５は、物理ボタンであってもよい。 The operation unit 45 is a user interface such as a keyboard and a pointing device that can input and output data to and from the processing unit 40. The operation unit 45 may be a voice input unit. The operation unit 45 may be a touch panel of the display unit 44. The operation unit 45 may be a physical button.

第２の実施形態における予測装置３は、ＰＯＳデータをＰＯＳシステムから直接取得するのではなく、端末装置４から受信すること以外については第１の実施形態における図４のフローチャートに示した学習処理と同様のことを実行するため、ここでは詳細な説明は省略する。また、第２の実施形態における予測装置３は、予測結果を端末装置４の表示部４４に表示すること以外については第１の実施形態における図５のフローチャートに示した予測処理と同様のことを実行するため、ここでは詳細な説明は省略する。 Apart from receiving the POS data from the terminal device 4 rather than acquiring it directly from the POS system, the prediction device 3 in the second embodiment executes the same learning process as that shown in the flowchart of FIG. 4 for the first embodiment, so a detailed description is omitted here. Likewise, apart from displaying the prediction results on the display unit 44 of the terminal device 4, the prediction device 3 in the second embodiment executes the same prediction process as that shown in the flowchart of FIG. 5 for the first embodiment, so a detailed description is omitted here.

第１の実施形態及び第２の実施形態に示したように、予測装置１（３）は、特徴量エンジニアリングによって、予測対象の日にちが暦上の特定日に当たるか否か等を含む暦に関する特徴量、直近特定期間における目標変数の平均、予測対象の日にちと暦上の同じ属性（曜日、月）の日における目標変数等の特徴量を作成する。これにより、決定木アルゴリズムである学習アルゴリズム１Ｌ（３Ｌ）による予測の精度を向上させることができる。 As shown in the first and second embodiments, the prediction device 1 (3) uses feature engineering to create features such as calendar-related features including whether the date to be predicted falls on a specific calendar day, the average of the target variable in the most recent specific period, and the target variable for days with the same attributes (day of the week, month) on the calendar as the date to be predicted. This can improve the accuracy of predictions made by the learning algorithm 1L (3L), which is a decision tree algorithm.

（変形例） (Modification)
第１の実施形態及び第２の実施形態では、ＰＯＳデータに基づく予測対象を店舗への来客数、若しくは販売数等の需要としたが、これに限られない。ＰＯＳデータではなく、例えば各人が携帯するスマートフォン若しくは携帯電話から得られる位置情報、交通機関における各駅の自動改札の通過回数の時間分布又は、街中での通行者数のカウント調査結果に基づいてもよい。これらのデータは、特定の場所（店舗、駅、街）における人の行動データである。予測対象は、来客数又は販売数ではなく、来場者数、通行者数等、来客数同様に暦上の特徴と相関がある情報であってもよい。これらの予測対象の情報は、何等かの目的をもって特定の場所に来た人の行動であって、特定の場所にて提供される物品又はサービスに対する需要であると言える。 In the first and second embodiments, the prediction target based on POS data is demand such as the number of visitors to a store or the number of units sold, but the target is not limited to this. Instead of POS data, the prediction may be based on, for example, location information obtained from smartphones or mobile phones carried by individuals, the time distribution of passages through the automatic ticket gates at each station of a transportation system, or the results of pedestrian-count surveys in a town. These data are behavior data of people at a specific location (store, station, town). The prediction target need not be the number of store visitors or the number of units sold; it may be other information that, like the number of store visitors, correlates with calendar characteristics, such as venue attendance or the number of passers-by. Such prediction targets reflect the behavior of people who come to a specific location for some purpose, and can therefore be regarded as demand for the goods or services provided at that location.

上述のように開示された実施の形態は全ての点で例示であって、制限的なものではない。本発明の範囲は、特許請求の範囲によって示され、特許請求の範囲と均等の意味及び範囲内での全ての変更が含まれる。 The embodiments disclosed above are illustrative in all respects and are not restrictive. The scope of the present invention is defined by the claims, and includes all modifications within the meaning and scope of the claims.

１ 予測装置 1 Prediction device
１０ 処理部 10 Processing unit
１１ 記憶部 11 Memory unit
１Ｐ 予測プログラム 1P Prediction program
１Ｌ 学習アルゴリズム（学習モデル） 1L Learning algorithm (learning model)
１３ 表示部 13 Display unit

Claims

A learning method for training a learning model that takes, as input data, data related to the provision of a product or service at a specific location and outputs data indicating demand for the product or service, the learning model being trained with training data consisting of data corresponding to the input data and data indicating the demand, both created from sales records of the product or service,
wherein a computer executes processing comprising:
creating feature data from the sales records, the feature data including a feature related to the periodic position in the calendar of a target date on which the product or service is provided, and a statistic of actual sales values of the product or service on dates prior to the target date that have the same periodic position; and
training the learning model, while varying the target date, using the feature data created for the target date as input data and data indicating demand on the target date as the target variable.

The learning method according to claim 1, wherein, among the feature data, the feature related to the periodic position of the target date in the calendar
includes a continuous-valued feature representing the periodic position in the calendar of the date on which the product or service is provided.

The learning method according to claim 1, wherein the statistics relating to the actual sales performance values of the feature data include statistics of the actual sales performance values for the same day of the week on the calendar as the day on which the product or service is provided.

The learning method according to claim 1, wherein the sales performance is POS data relating to goods or services at the specific location.

A prediction method for forecasting demand for a product or service at a specific location,
wherein a computer executes processing comprising:
creating, from sales records at the specific location, new feature data of the same type as the feature data used in training a learning model trained by the learning method according to any one of claims 1 to 4;
inputting the created new feature data into the learning model; and
outputting data indicating demand obtained from the learning model.

The prediction method according to claim 5, wherein the learning model further outputs an importance coefficient for each item of the input feature data,
and the computer further executes:
extracting feature data whose importance, based on the importance coefficients, is equal to or greater than a predetermined value; and
outputting the data indicating the demand obtained from the learning model together with the extracted feature data.

A forecasting device for forecasting demand for a product or service at a particular location, comprising:
a creation unit that creates new feature data from the sales performance at the specific location, the new feature data being the same type as the feature data used in training a learning model trained by the training method according to claim 1, the new feature data including a feature related to a periodic position in a calendar of a target day on which the product or service is provided, and statistics related to actual sales values of the product or service prior to the target day on a day that has the same periodic position;
an input unit that inputs the newly created feature data into the learning model;
An output unit that outputs data indicating demand obtained from the learning model.

A computer program for causing a computer to train a learning model that takes, as input data, data related to the provision of a product or service at a specific location and outputs data indicating demand for the product or service, the learning model being trained with training data consisting of data corresponding to the input data and data indicating the demand, both created from sales records of the product or service,
the computer program causing the computer to execute:
creating feature data from the sales records, the feature data including a feature related to the periodic position in the calendar of a target date on which the product or service is provided, and a statistic of actual sales values of the product or service on dates prior to the target date that have the same periodic position; and
training the learning model, while varying the target date, using the feature data created for the target date as input data and data indicating demand on the target date as the target variable.

A computer program for causing a computer to forecast demand for a product or service at a specific location,
the computer program causing the computer to execute:
creating new feature data from the sales records at the specific location, the new feature data being of the same type as the feature data used in training a learning model that takes data on the provision of a product or service at a specific location as input data and outputs data indicating demand for the product or service, the new feature data including a feature related to the periodic position in the calendar of a target date on which the product or service is provided, and statistics of actual sales values of the product or service on dates prior to the target date that have the same periodic position;
inputting the created new feature data into the learning model; and
outputting data indicating demand obtained from the learning model.