AU2021384028B2

AU2021384028B2 - Detecting impaired physiological function from exhaled gas concentrations and spectral envelope extracted from speech analysis

Info

Publication number: AU2021384028B2
Application number: AU2021384028A
Authority: AU
Inventors: Ilan D. Shallom
Original assignee: Cordio Medical Ltd
Current assignee: Cordio Medical Ltd
Priority date: 2020-11-23
Filing date: 2021-11-22
Publication date: 2024-11-28
Anticipated expiration: 2041-11-22
Also published as: WO2022107091A1; IL302768A; CA3200531A1; AU2021384028A1; JP2023550330A; CN116508101A; KR20230109645A

Abstract

A method includes computing one or more values of at least one parameter at respective times during an exhalation of a subject (22), based on one or more properties of sound passing through air exhaled by the subject during the exhalation, the parameter being related to a concentration of a gas in the air. The method further includes generating (84) an output in response to the values. Other embodiments are also described.

Description

DETECTING IMPAIRED PHYSIOLOGICAL FUNCTION FROM EXHALED GAS CONCENTRATIONS AND SPECTRAL ENVELOPE EXTRACTED FROM SPEECH ANALYSIS

FIELD OF THE INVENTION

The present invention is related to the diagnosis and treatment of physiological disorders.

BACKGROUND

During exhalation, carbon dioxide (CO2) diffuses from the pulmonary capillaries into the lungs, while oxygen (O2) diffuses from the lungs into the pulmonary capillaries. For each of these gases, equilibrium is reached (i.e., the exchange of the gas stops) when the partial pressure of the gas in the lungs equals the partial pressure of the gas in the pulmonary capillaries. In general, equilibrium for CO2 is reached prior to equilibrium for O2.

The main components in exhaled air are nitrogen (N2), O2, water (H2O), and CO2. Hence, the molecular mass $M_A(t)$ of exhaled air, as a function of a time variable $t$, is approximately equal to $C_{N_2}(t) M_{N_2} + C_{O_2}(t) M_{O_2} + C_{H_2O}(t) M_{H_2O} + C_{CO_2}(t) M_{CO_2}$, where $C_x(t)$ is the time-varying concentration of any component $x$ in the exhaled air and $M_x$ is the molecular mass of the component. In general, $M_I = M_A(0)$, the initial molecular mass of the air in the lungs before any gas exchange takes place, depends only on environmental conditions.
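
By way of illustration only, the concentration-weighted sum described above may be sketched in Python as follows. The molar masses are rounded standard values, the two compositions are assumed for illustration rather than measured, and the name `mixture_molar_mass` is hypothetical.

```python
# Approximate molar masses in g/mol (rounded standard values).
MOLAR_MASS = {"N2": 28.0, "O2": 32.0, "H2O": 18.0, "CO2": 44.0}


def mixture_molar_mass(fractions):
    """Concentration-weighted molar mass of a gas mixture.

    `fractions` maps a component name to its volume fraction; the
    fractions are expected to sum to approximately 1.
    """
    total = sum(fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"fractions sum to {total}, expected 1")
    return sum(c * MOLAR_MASS[name] for name, c in fractions.items())


# Assumed, illustrative compositions (not measured data): 5% of the
# volume changes from O2 to CO2 between the two mixtures.
before_exchange = {"N2": 0.75, "O2": 0.19, "H2O": 0.06, "CO2": 0.00}
after_exchange = {"N2": 0.75, "O2": 0.14, "H2O": 0.06, "CO2": 0.05}

m_before = mixture_molar_mass(before_exchange)
m_after = mixture_molar_mass(after_exchange)
```

In this assumed example, replacing 5% O2 with 5% CO2 raises the molar mass by 0.05 * (44 - 32) = 0.6 g/mol.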

At any time during an exhalation, in approximation:

$M_A(t) = (1 - C'_{O_2}(t) - C'_{CO_2}(t)) M_I + C'_{O_2}(t) M_{O_2} + C'_{CO_2}(t) M_{CO_2}$ (1),

where $-C'_{O_2}(t)$ is the concentration of O2 that diffused from the lungs (i.e., the volume of O2 that diffused from the lungs divided by the volume of air in the lungs) before time $t$, and $C'_{CO_2}(t)$ is the concentration of CO2 that diffused into the lungs before time $t$. When equilibrium for oxygen is reached at time $t_E$, $C'_{CO_2}(t_E) = -C'_{O_2}(t_E)$, such that:

$M_E = M_A(t_E) = M_I + C'_{CO_2,E} (M_{CO_2} - M_{O_2}) \approx M_I + 12 C'_{CO_2,E}$ (2).

(The approximation $M_{CO_2} - M_{O_2} \approx 12$, which is assumed for the remainder of the present description, follows from $M_{O_2} \approx 32$ g/mol and $M_{CO_2} \approx 44$ g/mol.)
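
As a hedged numerical check, equations (1) and (2) may be sketched in Python as follows. The initial molar mass of 28.8 g/mol and the 5% exchanged concentration are assumed for illustration, and the function names are hypothetical.

```python
M_O2 = 32.0   # g/mol, approximate
M_CO2 = 44.0  # g/mol, approximate


def molar_mass_eq1(m_initial, dc_o2, dc_co2):
    """Equation (1): molar mass of lung air after partial gas exchange.

    dc_o2  -- C'_O2(t), negative once O2 has diffused out of the lungs
    dc_co2 -- C'_CO2(t), positive once CO2 has diffused into the lungs
    """
    return (1.0 - dc_o2 - dc_co2) * m_initial + dc_o2 * M_O2 + dc_co2 * M_CO2


def molar_mass_eq2(m_initial, dc_co2_eq):
    """Equation (2): molar mass at O2 equilibrium, where C'_CO2 = -C'_O2."""
    return m_initial + (M_CO2 - M_O2) * dc_co2_eq


m_i = 28.8  # assumed initial molar mass in g/mol (illustrative only)
m_e_from_eq1 = molar_mass_eq1(m_i, dc_o2=-0.05, dc_co2=0.05)
m_e_from_eq2 = molar_mass_eq2(m_i, dc_co2_eq=0.05)
```

With equal and opposite concentration changes, the first term of equation (1) reduces to the initial molar mass, so both functions give the same equilibrium value.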

In general, two physical mechanisms govern the changes in the concentrations of CO2 and O2 during exhalation: diffusion and perfusion. Diffusion causes the gases to move across the alveolar walls, from the side of higher concentration to the side of lower concentration. The rate of diffusion is proportional to the concentration gradient across the alveolar walls and to a diffusion constant. Perfusion refers to the flow of blood through the alveolar capillaries.

In some subjects, the flow of blood through the alveolar capillaries is sufficiently fast so as not to inhibit the rate of diffusion. In other words, the rate of change of the concentration of each of the gases is diffusion-constrained. In general, for diffusion-constrained gas exchange, $C'_{O_2} \approx 0$ when CO2 reaches equilibrium at time $t_{E0} < t_E$. Hence, from equation (1):

$M_C = M_A(t_{E0}) \approx (1 - C'_{CO_2,E}) M_I + C'_{CO_2,E} M_{CO_2}$ (3).

Given two molecular masses $M_A(t_1)$ and $M_A(t_2)$ where $t_1 < t_2 < t_{E0}$, it follows, from equation (1), that:

$C'_{CO_2}(t_2) - C'_{CO_2}(t_1) = (M_A(t_2) - M_A(t_1)) / (M_{CO_2} - M_I)$ (4).

For $t_2 > t_1 > t_{E0}$, it follows, from equation (1), that:

$C'_{O_2}(t_2) - C'_{O_2}(t_1) = (M_A(t_2) - M_A(t_1)) / (M_{O_2} - M_I)$ (5).
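
The two diffusion-constrained relations, equations (4) and (5), may be sketched in Python as follows. This is a minimal sketch with assumed, illustrative numbers (an initial molar mass of 28.8 g/mol and a 5% exchanged concentration); the function names are hypothetical.

```python
M_O2 = 32.0   # g/mol, approximate
M_CO2 = 44.0  # g/mol, approximate


def co2_change_stage1(m_t1, m_t2, m_initial):
    """Equation (4): change in C'_CO2 between two times before CO2 equilibrium."""
    return (m_t2 - m_t1) / (M_CO2 - m_initial)


def o2_change_stage2(m_t1, m_t2, m_initial):
    """Equation (5): change in C'_O2 between two times after CO2 equilibrium."""
    return (m_t2 - m_t1) / (M_O2 - m_initial)


m_i = 28.8   # assumed initial molar mass, g/mol
m_c = 29.56  # equation (3) with C'_CO2,E = 0.05: 0.95 * 28.8 + 0.05 * 44
m_e = 29.4   # equation (2) with C'_CO2,E = 0.05: 28.8 + 12 * 0.05

co2_gain = co2_change_stage1(m_i, m_c, m_i)  # first stage, molar mass rises
o2_change = o2_change_stage2(m_c, m_e, m_i)  # second stage, molar mass falls
```

In this assumed example the first stage recovers a CO2 gain of 0.05 and the second stage recovers an O2 change of -0.05, consistent with the equilibrium condition in which the two changes are equal and opposite.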

From the above, it may be deduced that, for diffusion-constrained gas exchange, $M_A(t)$ changes in two stages. In the first stage, $M_A(t)$ increases until reaching a maximum of $M_C$ at $t = t_{E0}$. In the second stage, $M_A(t)$ drops exponentially to $M_E$, which is typically closer to $M_C$ than to $M_I$. The rate at which $M_A(t)$ changes in each stage depends on the rate of diffusion, which may be impaired in some subjects.

In other subjects, on the other hand, the blood flow rate through the alveolar capillaries limits the rate of diffusion; in other words, the rate of concentration change is perfusion-constrained. For such subjects, the rate of concentration change is capped by a maximum value, which is a function of the rate of blood flow.

In particular, for perfusion-constrained gas exchange, $C'_{CO_2}(t) \approx -C'_{O_2}(t)$ at all times, and so, from equation (1):

$C'_{CO_2}(t_2) - C'_{CO_2}(t_1) = C'_{O_2}(t_1) - C'_{O_2}(t_2) = (M_A(t_2) - M_A(t_1)) / 12$ (6).

Thus, perfusion-constrained gas exchange takes place in a single stage, in which $M_A(t)$ increases linearly until reaching $M_E$.
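
For the perfusion-constrained case, equation (6) may be sketched in Python as follows. The linear molar-mass trajectory is assumed for illustration only, and the function name is hypothetical.

```python
def co2_change_perfusion(m_t1, m_t2, delta_m=12.0):
    """Equation (6): change in C'_CO2 equals the molar-mass change divided by 12."""
    return (m_t2 - m_t1) / delta_m


# Assumed, illustrative trajectory: molar mass rising linearly
# from 28.8 g/mol to 29.4 g/mol over two seconds.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
masses = [28.8 + 0.3 * t for t in times]

co2_gain = [co2_change_perfusion(masses[0], m) for m in masses]
o2_change = [-g for g in co2_gain]  # C'_O2 mirrors C'_CO2 in this regime
```

Because the molar mass is assumed to rise linearly, the computed CO2 gain also rises linearly, here from 0 to 0.05.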

In yet other subjects, the exchange of O2 is diffusion-constrained, but the exchange of CO2 is perfusion-constrained. In this case, there are two stages, as in the case of complete diffusion constraint. However, the first stage is relatively long, and $M_C$ is relatively close to $M_E$.

The speed "v" of sound in a gas is a function of the molecular mass "M" of the gas. (In this context, the term "gas" includes, within its scope, a mixture of gases such as CO2, O2, N2, and water vapor.) For example, for an ideal gas, $v = \sqrt{\gamma R T / M}$ (7), where $\gamma$ is the adiabatic constant of the gas, R is the universal gas constant, and T is the temperature of the gas in Kelvins.
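
Equation (7) may be illustrated with the following Python sketch. The adiabatic constant of 1.4 (a common approximation for air), the temperature of 307.5 K, and the two molar masses are assumptions made for illustration; the function name is hypothetical.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol*K)


def speed_of_sound(molar_mass_g_per_mol, temperature_k, gamma=1.4):
    """Equation (7) for an ideal gas: v = sqrt(gamma * R * T / M).

    The molar mass is converted from g/mol to kg/mol so that the
    result is in metres per second.
    """
    molar_mass_kg = molar_mass_g_per_mol / 1000.0
    return math.sqrt(gamma * R * temperature_k / molar_mass_kg)


# Assumed values: a lighter and a heavier (CO2-enriched) mixture at 307.5 K.
v_light = speed_of_sound(28.8, 307.5)
v_heavy = speed_of_sound(29.4, 307.5)
```

At a fixed temperature and adiabatic constant, the heavier mixture yields the lower speed, and the ratio of the two speeds equals the square root of the inverse ratio of the molar masses.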

US Patent 8,689,606 describes a sensor chip for gas having cells for emitting and receiving ultrasound and being configured for a sufficiently large frequency range and for measuring concentration of at least one of the gas components based on at least two responses within the range. The frequency range can be achieved by varying the size of cell membranes, varying bias voltages, and/or varying air pressure for an array of cMUTs or MEMS microphones. The sensor chip can be applied in, for example, capnography. A measurement air chamber is implemented in the respiratory pathway, and it and/or the pathway may be designed to reduce turbulence in the exhaled breath subject to ultrasound interrogation. The chip can be implemented as self-contained in the monitoring of parameters, obviating the need for off-chip sensors.

US Patent Application Publication 2019/0080803 describes an apparatus including a network interface and a processor. The processor is configured to receive, via the network interface, speech of a subject who suffers from a pulmonary condition related to accumulation of excess fluid, to identify, by analyzing the speech, one or more speech-related parameters of the speech, to assess, in response to the speech-related parameters, a status of the pulmonary condition, and to generate, in response thereto, an output indicative of the status of the pulmonary condition.

West, John B., et al., "Measurements of pulmonary gas exchange efficiency using expired gas and oximetry: results in normal subjects," American Journal of Physiology-Lung Cellular and Molecular Physiology 314.4 (2018): L686-L689 describes a noninvasive method for measuring the efficiency of pulmonary gas exchange in patients with lung disease. The patient wears an oximeter, and the partial pressures of oxygen and carbon dioxide in inspired and expired gas are measured using miniature analyzers.

West, John B., and G. Kim Prisk. "A new method for noninvasive measurement of pulmonary gas exchange using expired gas," Respiratory physiology & neurobiology 247 (2018): 112-115 describes how the composition of expired gas can be used in conjunction with pulse oximetry to obtain useful measures of gas exchange efficiency.

By way of clarification and for avoidance of doubt, as used herein and except where the context requires otherwise, the term "comprise" and variations of the term, such as "comprising", "comprises" and "comprised", are not intended to exclude further additions, components, integers or steps.

Reference to any prior art in the specification is not an acknowledgement or suggestion that this prior art forms part of the common general knowledge in any jurisdiction or that this prior art could reasonably be expected to be combined with any other piece of prior art by a skilled person in the art.

SUMMARY OF THE INVENTION

According to a first aspect of the invention, there is provided a system comprising an output device; and one or more processors configured to cooperatively carry out a process that includes: computing one or more values of at least one parameter at respective times during an exhalation of a subject, based on one or more properties of sound emitted by the subject while producing the exhalation and passing through air exhaled by the subject during the exhalation, the parameter being related to a concentration of a gas in the air, wherein the sound belongs to speech of the subject, and wherein computing the values comprises computing the values based on a speech signal representing the speech, and generating an output, via the output device, in response to the values.

In some embodiments, the system further includes a sensor configured to measure a speed of the sound, and the process includes computing the values based on the speed.

In some embodiments, the system further includes a sensor configured to measure a baseline concentration of the gas in other exhaled air, and the process includes computing the values based on the baseline concentration.

According to a second aspect of the invention, there is provided a method, comprising: computing one or more values of at least one parameter at respective times during an exhalation of a subject, based on one or more properties of sound emitted by the subject while producing the exhalation and passing through air exhaled by the subject during the exhalation, the parameter being related to a concentration of a gas in the air, wherein the sound belongs to speech of the subject, and wherein computing the values comprises computing the values based on a speech signal representing the speech; and generating an output in response to the values.

In some embodiments, the output indicates a state of the subject with respect to a physiological condition selected from the group of conditions consisting of: heart failure, asthma, hypobaropathy, hypercapnia, Chronic Obstructive Pulmonary Disease (COPD), and Interstitial Lung Disease (ILD).

In some embodiments, the method further includes, based on the values, identifying an extent to which a rate of change in the concentration is perfusion-constrained, and the output indicates the extent to which the rate of change is perfusion-constrained.

In some embodiments, computing the one or more values of the at least one parameter includes computing multiple values of the concentration, the method further includes, based on the multiple values of the concentration, computing a rate of change of the concentration, and generating the output includes generating the output in response to the rate of change.

In some embodiments, generating the output includes generating the output in response to comparing the rate of change to a baseline rate of change.

In some embodiments, computing the one or more values of the at least one parameter further includes computing multiple molecular-mass values of a molecular mass of the air, the method further includes identifying an extent to which the rate of change is perfusion-constrained, and computing the rate of change includes computing the rate of change based on the molecular-mass values and in response to identifying the extent to which the rate of change is perfusion-constrained.

In some embodiments, computing the rate of change includes computing the rate of change as a function of:

another rate of change of the molecular mass, and

at least one constant that depends on the extent to which the rate of change is perfusion-constrained.

In some embodiments, the one or more values include an equilibrium value of the parameter.

In some embodiments, the equilibrium value includes a CO2-equilibrium value of a CO2 concentration of CO2 in the air.

In some embodiments, computing the CO2-equilibrium value includes computing the CO2-equilibrium value based on a baseline CO2-equilibrium value that was measured prior to the exhalation.

In some embodiments, the sound is emitted by the subject while producing the exhalation.

In some embodiments, the sound belongs to speech of the subject, and computing the

values includes computing the values based on a speech signal representing the speech.

In some embodiments, computing the values includes:

selecting portions of the speech signal recorded at the times, respectively;

computing respective spectral envelopes of the portions; and

computing the values based on respective expansions or contractions of the spectral

envelopes relative to respective corresponding baseline spectral envelopes.

In some embodiments, the values include respective expansion factors that quantify the

expansions or contractions.

In some embodiments, the baseline spectral envelopes belong to respective baseline signal-portions corresponding to the portions of the signal, respectively.

In some embodiments, the baseline signal-portions are other portions of the speech signal.

In some embodiments, the baseline signal-portions belong to a reference speech signal.

In some embodiments, the reference speech signal represents other speech uttered while in

a known physiological state, and the output indicates a physiological state of the subject relative

to the known physiological state.

In some embodiments, the reference speech signal represents other speech, and computing

the values includes computing the values based on one or more measured properties of other air

exhaled during the other speech.

In some embodiments, the reference speech signal represents other speech uttered by the

subject.

In some embodiments, computing the values includes computing the values while

identifying the correspondence between the baseline signal-portions and the portions of the speech

signal, by varying the correspondence and expanding or contracting the spectral envelopes or the

baseline spectral envelopes so as to minimize a sum of respective distance measures for the

portions, the distance measure for each of the portions being a distance between (i) spectral

coefficients of the portion and (ii) baseline spectral coefficients of the baseline signal-portion to

which the portion corresponds, following the expansion or contraction.

In some embodiments, computing the values includes computing the values under a

constraint that the values vary in accordance with a predefined function.

In some embodiments, the method further includes, prior to computing the values,

identifying the correspondence between the portions and the baseline signal-portions by

minimizing a sum of respective distance measures for the portions, the distance measure for each

of the portions being a distance between (i) spectral coefficients of the portion and (ii) baseline spectral coefficients of the baseline signal-portion to which the portion corresponds.
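
One common way to find a correspondence that minimizes a sum of frame-to-frame distances is dynamic time warping. The source does not name an algorithm here, so the following Python sketch is an assumption offered for illustration; the Euclidean frame distance, the function names, and the one-coefficient frames are likewise hypothetical.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two spectral-coefficient vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def align(test_frames, base_frames):
    """Return (total_distance, path) for the monotonic alignment that
    minimizes the summed frame distances (dynamic time warping).

    `path` lists (test_index, baseline_index) pairs in time order.
    """
    n, m = len(test_frames), len(base_frames)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(test_frames[i - 1], base_frames[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the final cell to recover the chosen correspondence.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


# Toy frames with a single coefficient each; the test utterance lingers
# on the second sound for one extra frame.
test_frames = [[0.0], [1.0], [1.0], [2.0]]
base_frames = [[0.0], [1.0], [2.0]]
total, path = align(test_frames, base_frames)
```

In this toy case both repeated test frames map to the same baseline frame, and the summed distance is zero.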

In some embodiments, computing the values includes computing the values based on, for

each of the portions, a statistic of respective ratios between (i) one or more formant frequencies of

the portion, and (ii) corresponding formant frequencies in the baseline spectrum for the portion.

In some embodiments, the values include respective expansion factors that quantify the expansions or contractions, and the expansion factor for each portion minimizes a distance between (i) spectral coefficients of the portion and (ii) baseline spectral coefficients of the baseline spectrum for the portion, following an application of the expansion factor to the spectral coefficients or to the baseline spectral coefficients.
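
A distance-minimizing expansion factor of this kind may be sketched in Python as a simple search over candidate factors. The uniform frequency grid, the linear interpolation, the squared-error distance, the candidate list, and the function names are all assumptions made for illustration; here the factor is applied to the baseline envelope.

```python
import math


def warp_envelope(envelope, alpha):
    """Resample `envelope` (magnitudes on a uniform frequency grid) so that
    a feature at bin k moves to bin alpha * k, using linear interpolation."""
    n = len(envelope)
    out = []
    for k in range(n):
        src = k / alpha  # position in the original envelope
        lo = int(src)
        if lo >= n - 1:
            out.append(envelope[-1])  # hold the last value past the end
        else:
            frac = src - lo
            out.append((1 - frac) * envelope[lo] + frac * envelope[lo + 1])
    return out


def best_expansion_factor(test_env, base_env, candidates):
    """Candidate factor whose warped baseline is closest to the test envelope."""
    def distance(alpha):
        warped = warp_envelope(base_env, alpha)
        return sum((a - b) ** 2 for a, b in zip(test_env, warped))
    return min(candidates, key=distance)


# Synthetic baseline envelope with a single peak at bin 50, and a test
# envelope produced by contracting it by a factor of 0.9.
base_env = [math.exp(-((k - 50) / 8.0) ** 2) for k in range(100)]
test_env = warp_envelope(base_env, 0.9)
candidates = [0.80, 0.85, 0.90, 0.95, 1.00, 1.05]
alpha = best_expansion_factor(test_env, base_env, candidates)
```

A coarse grid keeps the sketch short; a finer grid or a numerical optimizer could be substituted without changing the idea.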

In some embodiments, computing the values includes computing the values based on

respective measured speeds of the sound.

According to a third aspect of the invention, there is provided a computer software product comprising a tangible non-transitory computer-readable medium in which program instructions are stored, which instructions, when read by one or more processors, cause the processors to cooperatively carry out the method according to the second aspect.

The present invention will be more fully understood from the following detailed description of embodiments thereof, taken together with the drawings, in which:

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a schematic illustration of a system for evaluating the physiological state of a subject, in accordance with some embodiments of the present invention;

Fig. 2 illustrates an effect of CO2 concentration in exhaled air on spectral properties of speech, which may be identified in accordance with some embodiments of the present invention;

Fig. 3 is a schematic illustration of a technique for processing a test signal, in accordance with some embodiments of the present invention;

Fig. 4 is a schematic illustration of a technique for computing an expansion factor, in accordance with some embodiments of the present invention;

Fig. 5 schematically illustrates a technique for selecting a constant, in accordance with some embodiments of the present invention; and

Fig. 6 is a flow chart for an example algorithm for processing a test speech signal, in accordance with some embodiments of the present invention.

DETAILED DESCRIPTION OF EMBODIMENTS

OVERVIEW

Many conventional techniques for assessing impaired cardiovascular or pulmonary function in a subject require the subject to visit a hospital or clinic, and/or require the use of a specialized device. Testing for such physiological problems may thus be inconvenient and/or expensive.

Embodiments of the present invention therefore provide improved techniques for diagnosing such physiological problems. These techniques capitalize on the fact that the physiological state of a subject may affect the concentrations of gases in the air exhaled by the subject, and hence, the properties of sound passing through the air, such as sound produced by the subject during the exhalation. Advantageously, using these techniques, the subject may be tested at home using no more than a simple microphone, such as a microphone belonging to a smartphone or personal computer.

More specifically, in embodiments of the present invention, a processor computes one or more values of at least one parameter related to a concentration of a gas (e.g., O2 or CO2) in the exhaled air (which, in general, is the same as the concentration of the gas in the subject's lungs) at respective times during an exhalation of the subject, based on one or more properties of sound passing through the air at these times. The properties may include, for example, the speed of the sound (which may be measured directly) and/or spectral properties of the sound. Subsequently, the processor generates an output in response to the values.

For example, based on the values, the processor may compute a rate of change in the

parameter and/or an equilibrium value of the parameter. Subsequently, based on the rate of change

and/or the equilibrium value (e.g., based on comparing the rate of change and/or the equilibrium

value to a suitable baseline), the processor may ascertain the subject's physiological state and then

generate an output indicating the state. For example, in response to identifying an abnormality in

the rate of change and/or the equilibrium value, the processor may generate an alert indicating that

the subject's state with respect to a physiological condition may be unstable. Alternatively or

additionally, the output may indicate the rate of change, the equilibrium value, and/or the values

themselves, such that a physician may ascertain the subject's physiological state in response thereto.
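
The kind of post-processing described above may be sketched in Python as follows. The least-squares slope, the tail-averaged equilibrium estimate, the 20% tolerance, and all sample values are assumptions made for illustration and are not taken from the source.

```python
def rate_of_change(times, values):
    """Least-squares slope of `values` against `times`."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two matching samples")
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


def equilibrium_value(values, tail=3):
    """Mean of the last `tail` samples, taken as the plateau estimate."""
    tail_values = values[-tail:]
    return sum(tail_values) / len(tail_values)


def is_abnormal(measured, baseline, rel_tolerance=0.2):
    """True if `measured` deviates from `baseline` by more than the tolerance."""
    return abs(measured - baseline) > rel_tolerance * abs(baseline)


# Assumed concentration values at four times during the rising phase.
slope = rate_of_change([0.0, 1.0, 2.0, 3.0], [0.0, 0.01, 0.02, 0.03])
# Assumed values for a whole exhalation, ending on a plateau.
plateau = equilibrium_value([0.01, 0.04, 0.05, 0.05, 0.05])
alert = is_abnormal(slope, baseline=0.02)
```

Here the measured slope of 0.01 per second is half the assumed baseline of 0.02 per second, so the sketch flags it.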

In some embodiments, the processor computes the values of the parameter based on a speech signal representing speech of the subject. For example, the processor may select one or more portions of the speech signal, compute respective spectral envelopes of the portions, and compute the values of the parameter based on respective expansions or contractions of the spectral envelopes relative to respective corresponding baseline spectral envelopes. As described in detail below, these embodiments capitalize on the relationship between the concentration of CO2 in (and hence, the molecular mass of) the air exhaled while the speech is uttered, and the spectrum of the speech signal representing this speech. For example, per this relationship, a higher concentration of CO2 causes the spectral envelope of the signal to be more contracted (i.e., less expanded), relative to a lower concentration of CO2.

In some embodiments, the parameter includes an "expansion-factor parameter," whose

values, referred to herein as "expansion factors," quantify the relative expansions or contractions

of the spectral envelopes. The expansion factor for each spectral envelope may be computed from

the ratios between the formant frequencies in the spectral envelope and the corresponding formant

frequencies in the baseline envelope. Alternatively, the expansion factor may be computed directly

from spectral coefficients representing the spectral envelope.
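
The formant-ratio approach may be sketched in Python as follows. The median is used here as one possible way of combining the ratios, which is an assumption; the formant frequencies are illustrative values rather than measurements, and the function name is hypothetical.

```python
from statistics import median


def expansion_factor_from_formants(test_formants_hz, baseline_formants_hz):
    """Median ratio of test formant frequencies to baseline formant frequencies.

    A value below 1 means the test envelope is contracted relative to
    the baseline; a value above 1 means it is expanded.
    """
    if not test_formants_hz or len(test_formants_hz) != len(baseline_formants_hz):
        raise ValueError("need matching, non-empty formant lists")
    ratios = [t / b for t, b in zip(test_formants_hz, baseline_formants_hz)]
    return median(ratios)


# Assumed first three formants of one vowel, baseline versus test.
baseline = [700.0, 1200.0, 2600.0]
test = [693.0, 1188.0, 2574.0]
alpha = expansion_factor_from_formants(test, baseline)
```

In this assumed example every formant is 1% lower than its baseline counterpart, so the expansion factor is 0.99, indicating a slight contraction.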

Alternatively or additionally, the parameter may include the concentration of the gas.

Alternatively or additionally, the parameter may include the molecular mass of the exhaled air.

The values of the concentration, and/or the values of the molecular mass, may be derived from the

expansion factors.
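
This passage does not state the formula by which a molecular mass is derived from an expansion factor. One plausible sketch, assuming that formant frequencies scale in proportion to the speed of sound and that the temperature and adiabatic constant are the same for the test and baseline speech, follows from equation (7): the molar mass of the test air equals the baseline molar mass divided by the square of the expansion factor. The function name and numbers below are hypothetical.

```python
import math


def molar_mass_from_expansion(alpha, baseline_molar_mass):
    """Molar mass implied by an expansion factor, under the assumptions
    that alpha = v_test / v_base and that v is proportional to 1/sqrt(M)."""
    if alpha <= 0:
        raise ValueError("expansion factor must be positive")
    return baseline_molar_mass / alpha ** 2


# Assumed baseline molar mass of 28.8 g/mol and an expansion factor that
# corresponds to a test molar mass of 29.4 g/mol.
alpha = math.sqrt(28.8 / 29.4)
m_test = molar_mass_from_expansion(alpha, 28.8)
```

An expansion factor below 1 (a contracted envelope) therefore maps to a higher molar mass, in line with the stated effect of a higher CO2 concentration.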

In other embodiments, the processor computes the values of the parameter based on

measured speeds of sound passing through the exhaled air. The sound - which does not necessarily

include speech sound, and is not even necessarily within an audible range of frequencies - may be produced by the subject during the exhalation, or it may be produced by another source, such as a

speaker or an ultrasound transducer. The speeds may be measured by any suitable sensor, such as

the aforementioned ultrasound transducer.

SYSTEM DESCRIPTION

Reference is initially made to Fig. 1, which is a schematic illustration of a system 20 for

evaluating the physiological state of a subject 22, in accordance with some embodiments of the

present invention.

System 20 comprises an audio-receiving device 32, which is used by subject 22. Device 32 comprises circuitry, typically comprising an audio sensor 38 (e.g., a microphone), which converts sound waves to analog electrical signals, an analog-to-digital (A/D) converter 42, a processor 36, and a network interface, such as a network interface controller (NIC) 34. Typically, device 32 further comprises a storage device (e.g., a solid-state drive), a screen (e.g., a touchscreen), and/or other user interface components, such as a keyboard or a speaker. In some embodiments, device 32 comprises a mobile phone, a tablet computer, a laptop computer, a desktop computer, a voice-controlled personal assistant (such as an Amazon Echo™ or a Google Home™ device), a smart speaker device, or a dedicated medical device.

In some embodiments, audio sensor 38 (and, optionally, A/D converter 42) belong to a unit

that is external to device 32. For example, audio sensor 38 may belong to a headset that is

connected to device 32 by a wired or wireless connection, such as a Bluetooth connection.

In some embodiments, system 20 further comprises a temperature sensor configured to

measure the temperature of the exhaled air of subject 22. The measured temperatures are received

by processor 36, and are used by the processor to calculate the values of a relevant parameter

related to a concentration of a gas in air in lungs of the subject, as further described below with

reference to the subsequent figures.

For example, for embodiments in which audio sensor 38 belongs to a microphone in a

headset, the temperature sensor may be mounted onto the microphone. The output signal of the

temperature sensor may be encoded as an acoustic signal, e.g., by frequency modulation, such that

A/D converter 42 receives a bi-channel stereo audio signal including both the output from the

microphone and the acoustic signal from the temperature sensor.

Typically, system 20 further comprises a server 40, comprising circuitry comprising a

processor 28, a storage device 30, such as a hard drive or flash drive, and a network interface, such

as a network interface controller (NIC) 26. Server 40 may further comprise a screen, a keyboard, and/or any other user interface components. Typically, server 40 is located remotely from device

32, e.g., in a control center, and server 40 and device 32 communicate with one another, via their

respective network interfaces, over a network 24, which may include a cellular network and/or the

Internet.

System 20 is configured to facilitate evaluating the subject's physiological state with

respect to heart failure, asthma, hypobaropathy, hypercapnia (e.g., due to changes in altitude or air

quality), Chronic Obstructive Pulmonary Disease (COPD), Interstitial Lung Disease (ILD), or any

other physiological condition that affects the concentration of gases in the subject's lungs.

Typically, the system performs this function by processing one or more speech signals (also referred to herein as "speech samples") representing speech uttered by the subject during an exhalation. As further described below with reference to the subsequent figures, the system, based on the speech signals, computes one or more values of at least one parameter associated with the air in the lungs of the subject at respective times during the exhalation. The parameter may include, for example, the molecular mass of the air or the concentration of a gas, such as CO2 or O2, in the air.

In other embodiments, the system is configured to compute the values of the parameter

based on measured speeds of sound in the exhaled air at the respective times during the exhalation. The speeds may be measured by any suitable sensor.

In some such embodiments, the sound is produced by the subject during the exhalation.

Although the sound need not necessarily include speech, an advantage of speech is that, by virtue of the speech imposing a pattern of short inhalations and long exhalations, equilibrium concentrations are generally reached during the exhalations.

In other such embodiments, the sound is produced by another source. For example, the

system may comprise an ultrasonic transducer 33 configured to emit sound (in the form of

ultrasonic waves) into the exhaled air and to measure the speed of the sound, e.g., as described in

Huang, Y. S., et al., "An accurate air temperature measurement system based on an envelope

pulsed ultrasonic time-of-flight technique," Review of Scientific Instruments 78.11 (2007):

115102 or in Jedrusyna, A., "An Ultrasonic Air Temperature Meter," Recent Advances in

Mechatronics, Springer, Berlin, Heidelberg, 2010, 85-89, the respective disclosures of which are

incorporated herein by reference. While the speed of the sound is measured, the subject need not

speak or produce any sound at all; nonetheless, speech may be advantageous by virtue of imposing

a pattern of short inhalations and long exhalations, as described above.

The measured speeds of sound may be communicated, via any suitable communication interface, to processor 36 and/or processor 28. In response to the speeds, the processor may

compute values of the molecular mass per equation (7). (For embodiments in which a temperature

sensor is not used, an approximation for the temperature of the exhaled air, such as 307.5 K, may

be used.) Subsequently, the processor may derive values of a gas concentration from the values of

the molecular mass, e.g., per equation (2).
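
The two steps just described, inverting equation (7) and then rearranging equation (2), may be sketched in Python as follows. The adiabatic constant of 1.4, the initial molar mass of 28.8 g/mol, and the example speed are assumptions made for illustration; 307.5 K is the approximate exhaled-air temperature mentioned above. The function names are hypothetical.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol*K)


def molar_mass_from_speed(speed_m_per_s, temperature_k=307.5, gamma=1.4):
    """Invert equation (7): M = gamma * R * T / v**2, returned in g/mol."""
    return 1000.0 * gamma * R * temperature_k / speed_m_per_s ** 2


def co2_concentration_eq2(molar_mass, m_initial):
    """Rearranged equation (2): C'_CO2,E = (M_E - M_I) / 12."""
    return (molar_mass - m_initial) / 12.0


# Example speed chosen so that the implied molar mass is 29.4 g/mol.
measured_speed = math.sqrt(1.4 * R * 307.5 / 0.0294)
m_e = molar_mass_from_speed(measured_speed)
c_co2 = co2_concentration_eq2(m_e, m_initial=28.8)
```

Equation (2) applies at oxygen equilibrium, so this sketch treats the measured speed as having been taken at or near that point.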

In yet other embodiments, the system, during a registration procedure, measures one or more properties of air exhaled by the subject or another subject. For example, the system may measure the speed of sound in the air, e.g., using the ultrasonic techniques described above. Alternatively or additionally, a baseline concentration of a gas (e.g., CO2) in the air may be measured directly using any suitable sensor, e.g., as described in J. B. West et al., "Measurements of pulmonary gas exchange efficiency using expired gas and oximetry: results in normal subjects," Am. J. Physiol. Lung Cell. Mol. Physiol., vol. 314, no. 4, pp. L686-L689, Apr. 2018, whose disclosure is incorporated herein by reference. During this registration procedure, the subject need not speak or produce any sound at all; nonetheless, speech may be advantageous by virtue of imposing a pattern of short inhalations and long exhalations, as described above.

Subsequently to the registration, based on the measured properties, the processor computes

parameter values (e.g., gas concentrations) with respect to the subject's test speech signals, as

described below with reference to the subsequent figures. For example, for embodiments in which

the speed of sound is measured during the registration, the processor may compute baseline values

of the parameter based on the measured speeds of sound, and then compute the parameter values based on the baseline values.

In response to the values of the parameter, the system generates an output via any suitable

output device, such as a display or a speaker. For example, based on the values of the parameter,

the processor may estimate the state of the subject with respect to a physiological condition.

Subsequently, the processor may include the estimated state, optionally with a likelihood

associated with the estimation, in the output. Thus, for example, the output may indicate a

likelihood that the subject is in a stable state, and/or a likelihood that the subject is in an unstable

state, with respect to the condition. Alternatively or additionally, the output may include a score

indicating the degree to which the subject's state appears to be unstable.

Typically, processor 36 of device 32 and processor 28 of server 40 cooperatively perform

the receiving and processing of the speech samples. For example, as the subject speaks into device

32, the sound waves of the subject's speech may be converted to an analog signal by audio sensor

38, which may in turn be sampled and digitized by A/D converter 42. (In general, the subject's speech may be sampled at any suitable rate, such as a rate of between 8 and 45 kHz.) The resulting

digital speech signal may be received by processor 36. Subsequently, processor 36 may

communicate the speech signal, via NIC 34, to server 40, such that processor 28 receives the

speech signal via NIC 26 and then processes the speech signal. Alternatively, processor 36 may

process the speech signal, in which case the system need not necessarily comprise server 40.

(Notwithstanding the above, the remainder of the present description, for simplicity, generally

assumes that processor 28 - also referred to hereinbelow simply as "the processor" - performs the

processing.)

Subsequently to generating the aforementioned output, system 20 may communicate the

output to the subject, to another person (e.g., the subject's physician), and/or to an electronic

patient management system, which may integrate the output with other subject-specific

information and take appropriate action.

For example, processor 28 may communicate the output to processor 36, and processor 36

may then communicate the output to the subject, e.g., by displaying a visual message on the display

of device 32 and/or by playing an audio message using a speaker of device 32. Alternatively or

additionally, in response to the output indicating a relatively high likelihood that the subject's state

is unstable, the processor may communicate an alert indicating that the subject should take

medication or visit a physician. The alert may be communicated by placing a call or sending a

message (e.g., a text message) to the subject, to the subject's physician, and/or to a monitoring

center. Alternatively or additionally, in response to the output, the processor may control a medication-administering device so as to adjust an amount of medication administered to the subject.

In some embodiments, device 32 comprises an analog telephone that does not comprise an

A/D converter or a processor. In such embodiments, device 32 sends the analog audio signal from

audio sensor 38 to server 40 over a telephone network. Typically, in the telephone network, the

audio signal is digitized, communicated digitally, and then converted back to analog before

reaching server 40. Accordingly, server 40 may comprise an A/D converter, which converts the

incoming analog audio signal - received via a suitable telephone-network interface - to a digital

speech signal. Processor 28 receives the digital speech signal from the A/D converter, and then

processes the signal as described above. Alternatively, server 40 may receive the signal from the

telephone network before the signal is converted back to analog, such that the server need not

necessarily comprise an A/D converter.

As further described below with reference to the subsequent figures, processor 28, in processing the speech samples, may compare the spectral envelopes of the samples to baseline

spectral envelopes. The baseline spectral envelopes, and/or reference speech signals to which the

baseline spectral envelopes belong, may be received by processor 28 via NIC 26 and/or any other

suitable communication interface, such as a flash-drive interface.

Processor 28 may be embodied as a single processor, or as a cooperatively networked or

clustered set of processors. For example, a control center may include a plurality of interconnected

servers comprising respective processors, which cooperatively perform the techniques described

herein. In some embodiments, processor 28 belongs to a virtual machine.

In some embodiments, the functionality of processor 28 and/or of processor 36, as

described herein, is implemented solely in hardware, e.g., using one or more Application-Specific

Integrated Circuits (ASICs) or Field-Programmable Gate Arrays (FPGAs). Alternatively or

additionally, the functionality of processor 28 and/or of processor 36 is implemented at least partly

in software. For example, in some embodiments, processor 28 and/or processor 36 is embodied as

a programmed digital computing device comprising at least a central processing unit (CPU) and

random access memory (RAM). Program code, including software programs, and/or data are

loaded into the RAM for execution and processing by the CPU. The program code and/or data

may be downloaded to the processor in electronic form, over a network, for example. Alternatively

or additionally, the program code and/or data may be provided and/or stored on non-transitory

tangible media, such as magnetic, optical, or electronic memory. Such program code and/or data,

when provided to the processor, produce a machine or special-purpose computer, configured to perform the tasks described herein.

PROCESSING SPEECH SIGNALS

Reference is now made to Fig. 2, which illustrates an effect of CO2 concentration in

exhaled air on spectral properties of speech, which may be identified in accordance with some embodiments of the present invention.

The center of Fig. 2 shows a first plot 44 of CO2 concentration ([CO2]) in exhaled air as a function of time during an exhalation performed, during speech, by a subject in a stable physiological state. Further shown is an analogous second plot 46 for the subject in an unstable state with respect to congestive heart failure or another condition that similarly affects the CO2 concentration. As can be seen from the plots, the CO2 concentration may increase at a slower rate in the unstable state, e.g., due to slower blood flow through the lungs and/or accumulation of fluid in the lungs, which impedes the diffusion of CO2. Alternatively or additionally, the equilibrium CO2 concentration may be higher in the unstable state, e.g., due to a higher concentration of CO2 in the blood. Alternatively, for a subject suffering from hypocapnia (e.g., due to stroke, hyperthyroidism, hyperventilation, fear, stress, acute asthma, or COPD), the equilibrium CO2 concentration may be lower than for the same subject in the stable state.

CO2 has a significantly higher molecular mass than that of the other main gaseous components of exhaled air. Hence, in general, exhaled air having a higher concentration of CO2 has a higher molecular mass than exhaled air having a lower concentration, such that, per equation (7), the speed of sound in the former is less than the speed of sound in the latter. (Although the exhaled air of the subject is generally not an ideal gas, embodiments of the present invention generally assume equation (7) to hold. Other, more complex formulae for non-ideal gases that may be used are described, for example, in O. Cramer, "The variation of the specific heat ratio and the speed of sound in air with temperature, pressure, humidity, and CO2 concentration," J. Acoust. Soc. Am., vol. 93, no. 5, pp. 2510-2516, May 1993, whose disclosure is incorporated herein by reference.)

The relationship between the speed of sound "v" and the frequency "f" of the sound for a given
wavelength "λ" is given by the formula f = v/λ. Thus, for any given subject, speech uttered with a

greater speed of sound will generally have higher frequency content, relative to speech containing

the same verbal content but uttered with a lesser speed of sound.

It follows, therefore, that the spectral properties of speech are affected by the level of CO2 concentration in the air exhaled during the speech. This effect is illustrated in the left and right portions of Fig. 2. In particular, the left portion of Fig. 2 shows the spectral envelope 48 of the subject's speech in the stable state at a first time t1, assuming the spectral envelope is calculated for some arbitrary time window around time t1. Further shown is the spectral envelope 50 for the subject in the unstable state at the same time, assuming that the same verbal content is uttered in both states such that the shape of envelope 50 is generally the same as that of envelope 48. As can be seen, envelope 50 is expanded relative to envelope 48, due to the lower CO2 concentration at t1 in the unstable state. Conversely, the right portion of Fig. 2, which shows envelope 48 and envelope 50 for a second time t2, shows the latter envelope contracted relative to the former, due to the higher CO2 concentration at t2 in the unstable state.

More formally, letting $\omega$ denote frequency, letting $H_0(\omega)$ denote spectral envelope 48, and letting $H_1(\omega)$ denote spectral envelope 50, it follows from the above that $H_1(\omega) = H_0(\omega/\beta)$, where $\beta = v_1/v_0$, i.e., the ratio of the speed of sound in the unstable-state exhalation to the speed of sound in the stable-state exhalation. Per equation (7), this ratio is equivalent to $\sqrt{(M_0 T_1)/(M_1 T_0)}$, where $M_0$ and $M_1$ are the respective molecular masses of the air exhaled in the stable and unstable states, and $T_0$ and $T_1$ are the respective temperatures of the exhaled air. Hence:

$\beta = \sqrt{\dfrac{M_0 T_1}{M_1 T_0}}$   (8)

$\beta$ is referred to below as the "expansion factor" of $H_1(\omega)$ relative to $H_0(\omega)$; $\beta > 1$ indicates expansion, as at the left of Fig. 2, while $\beta < 1$ indicates contraction, as at the right of Fig. 2.
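As a rough illustration of the expansion factor, the following Python sketch evaluates the ratio of the two speeds of sound under the ideal-gas assumption, in which the speed of sound scales as the square root of temperature over molecular mass. The function name, the default temperature of 307.5 K, and the molar masses in the example are assumptions made for illustration only.

```python
import math


def expansion_factor(m_stable, m_unstable, t_stable=307.5, t_unstable=307.5):
    """Ratio of the unstable-state to the stable-state speed of sound.

    Assumes an ideal gas with the same specific-heat ratio in both states,
    so that the speed of sound is proportional to sqrt(T / M).
    """
    return math.sqrt((m_stable * t_unstable) / (m_unstable * t_stable))


# Hypothetical molar masses in g/mol, chosen only to show the direction of the effect.
beta_heavier = expansion_factor(28.5, 28.8)  # more CO2 -> heavier air -> contraction
beta_lighter = expansion_factor(28.5, 28.2)  # less CO2 -> lighter air -> expansion
```

A value below one corresponds to a contracted envelope, and a value above one to an expanded envelope.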

As described in detail below with reference to the subsequent figures, the processor is

configured to identify the degree to which the spectral envelopes of one or more portions of the

speech signal received from subject 22 (Fig. 1), referred to below as the "test signal," are expanded

or contracted. In particular, the processor may compute one or more expansion factors, each of

which quantifies the expansion or contraction of the spectral envelope of a portion of the test signal

relative to a corresponding baseline spectral envelope, such as spectral envelope 48.

Reference is now made to Fig. 3, which is a schematic illustration of a technique for

processing a test signal 52, which represents speech uttered by the subject, in accordance with

some embodiments of the present invention.

Typically, the processor begins to process test signal 52 by identifying any breathing

pauses 54 in signal 52, e.g., using voice activity detection (VAD). Subsequently, the processor

divides the signal into segments 56, which are separated from each other by breathing pauses 54.

The processor then selects one of segments 56 for further processing. Typically, subsequently to selecting the segment, the processor divides the segment into smaller portions referred to herein as "frames," the length of each frame typically being between 10 and 50 ms. By way of example,

Fig. 3 shows three such frames TF1, TF2, and TF3.
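The segmentation and framing described above can be sketched as follows. This is a minimal illustration, not the detector used in any embodiment: a simple frame-energy threshold stands in for voice activity detection, and the function names, the 25 ms frame length, the energy threshold, and the minimum pause length are all assumed values.

```python
import numpy as np


def split_into_frames(samples, sample_rate, frame_ms=25.0):
    """Cut a 1-D signal into equal-length, non-overlapping frames."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    count = len(samples) // frame_len
    return np.asarray(samples[:count * frame_len]).reshape(count, frame_len)


def split_at_pauses(samples, sample_rate, frame_ms=25.0,
                    energy_threshold=1e-4, min_pause_frames=8):
    """Return the speech segments separated by sufficiently long quiet runs."""
    frames = split_into_frames(samples, sample_rate, frame_ms)
    voiced = np.mean(frames ** 2, axis=1) > energy_threshold
    frame_len = frames.shape[1]
    segments, start, silent_run = [], None, 0
    for i, is_voiced in enumerate(voiced):
        if is_voiced:
            if start is None:
                start = i          # a new segment begins here
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause_frames:
                end = i - silent_run + 1   # index of the first quiet frame
                segments.append(samples[start * frame_len:end * frame_len])
                start, silent_run = None, 0
    if start is not None:              # signal ended inside a segment
        end = len(voiced) - silent_run
        segments.append(samples[start * frame_len:end * frame_len])
    return segments
```

Each returned segment can then be passed to `split_into_frames` to obtain frames such as TF1, TF2, and TF3.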

Subsequently, the processor computes respective values of a parameter, such as the

expansion-factor parameter or the molecular mass of the exhaled air, for the selected one or more

portions (e.g., frames) of the test signal, by comparing the spectral envelopes of the portions to

corresponding baseline spectral envelopes. Typically, the baseline spectral envelopes belong to

respective baseline signal-portions corresponding to the selected portions of the test signal,

respectively.

In some cases, the baseline signal-portions belong to a reference speech signal, i.e., a

reference signal representing other speech uttered by the subject or by a different subject. (The

reference signal may be stored, for example, in storage device 30 (Fig. 1).) This reference utterance

may include any suitable verbal content, such as a designated sentence. Typically, the test utterance of the subject includes the same verbal content. (In some embodiments, the verbal

content is chosen to be sufficiently long, and the subject is instructed not to inhale while uttering

the utterance, such that equilibrium is reached during each utterance.)

In some such cases, the reference utterance is uttered by a speaker in a known physiological

state, e.g., a stable state with respect to a physiological condition. Thus, the output generated by

the processor (as described above with reference to Fig. 1) may indicate the physiological state of

the subject relative to the known physiological state. For example, in response to values of β,
which indicate the expansion or contraction of the test utterance relative to the reference utterance,

the processor may generate an output indicating whether the subject's state is more or less stable

than the known physiological state.

In other cases, the baseline signal-portions belong to the test signal. For example, when

recording the test signal, the subject may repeat a word, or a short series of words, several times.

Any portion of the test signal, such as a portion assumed to be at equilibrium, may then be

designated as the baseline.

In some embodiments, the processor computes the values by executing a two-stage

technique. First, the processor identifies a correspondence between the portions of the test signal

and the baseline signal-portions. Next, the processor computes the expansion factors and, optionally, the values of another parameter responsively to the correspondence. In other

embodiments, the processor executes a one-stage technique, per which the processor computes the

values while finding the correspondence. Each of these techniques is described below.

By way of example, Fig. 3 shows five frames RF1, RF2, RF3, RF4, and RF5 belonging to a reference signal. As shown in Fig. 3, the processor may, using the one-stage or two-stage

technique, find a correspondence between {TF1, TF2, TF3} and {RF1, RF3, RF5}. (This correspondence indicates that the verbal content of TF1 is similar to that of RF1, the verbal content of TF2 is similar to that of RF3, and the verbal content of TF3 is similar to that of RF5.)

The two-stage technique

Per the two-stage technique, the processor first identifies the correspondence, using any

suitable algorithm such as the Dynamic Time Warping (DTW) algorithm described in Sakoe and

Chiba, "Dynamic Programming Algorithm Optimization for Spoken Word Recognition," IEEE

Transactions on Acoustics, Speech, and Signal Processing 26.2 (1978): 43-49, whose disclosure

is incorporated herein by reference.

Typically, the correspondence-finding algorithm minimizes a sum of respective distance

measures between corresponding pairs of frames, subject to one or more suitable constraints. The

distance measure between two frames may be defined, for example, as the distance (e.g., the

Euclidean distance) between the spectral coefficients of the first frame and the spectral coefficients

of the second frame. (The spectral coefficients, which represent the spectral envelope of the frame,

may be computed using linear prediction, cepstral analysis, or any other suitable technique for

short-time spectrum computation.) By way of example, Fig. 3 shows a distance d1 between TF1

and RF1, a distance d2 between TF2 and RF3, and a distance d3 between TF3 and RF5.

For example, the processor may first select a sequence of $N_t$ test-signal frames for which expansion factors are to be calculated. The processor may further calculate a sequence $\{v[n_t]\}$ for $n_t = 1 \ldots N_t$, each $v[n_t]$ being a vector of spectral coefficients for the $n_t$-th test-signal frame of the sequence. The processor may further identify $N_b$ baseline frames from which corresponding frames are to be found. The processor may also calculate a sequence $\{u[n_b]\}$ for $n_b = 1 \ldots N_b$, each $u[n_b]$ being a vector of spectral coefficients for the $n_b$-th baseline frame.

Subsequently, the processor may identify the correspondence between the test-signal frames and the baseline frames, by identifying the mapping $(n^t_1, n^b_1), (n^t_2, n^b_2), \ldots, (n^t_K, n^b_K)$ that pairs the $n^t_k$-th test-signal frame with the $n^b_k$-th baseline frame for $k = 1 \ldots K$ such that the sum of distances between the pairs is minimized subject to one or more constraints. In other words, the processor may identify the mapping that minimizes $\sum_{k=1}^{K} w[k] \cdot d(u[n^b_k], v[n^t_k])$, where $d(u[n^b_k], v[n^t_k])$ is the distance between $u[n^b_k]$ and $v[n^t_k]$ and $\{w[k]\}$ are respective weights for the pairs of frames. (Typically, at least some of the weights are different from each other.) Typically, $d(u[n^b_k], v[n^t_k]) = \|u[n^b_k] - v[n^t_k]\|_p$, $p$ being greater than or equal to one, where $\|x\|_p$ indicates the $L^p$ norm of $x$. (The Euclidean distance is obtained for $p = 2$.)

The constraints may include, for example, one or more of the following:

(i) The sequences $n^t_1, \ldots, n^t_K$ and $n^b_1, \ldots, n^b_K$ are non-decreasing.

(ii) The slopes $(n^b_{k+2} - n^b_k)/(n^t_{k+2} - n^t_k)$, $k = 1, \ldots, K-2$, are within a predefined range, such as [0.5, 2] or [1/3, 3].

(iii) Each of the first and last pairs of indices is constrained to a predefined pair of integers, or to a small set of such pairs. For example, $(n^t_1, n^b_1)$ may be constrained to (1,1), and $(n^t_K, n^b_K)$ may be constrained to $(N_t, N_b)$.
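The correspondence search can be illustrated with a textbook dynamic-time-warping sketch. The sketch below is not the specific algorithm of any embodiment: it assumes unit weights, the Euclidean distance, and the common step pattern in which each step advances one or both indices by one, which enforces the non-decreasing constraint (i) and the endpoint constraint (iii) but not the slope constraint (ii). With this step pattern every baseline frame appears in the path, whereas a mapping may in general skip frames. The function name `dtw_align` and the zero-based indices are illustrative choices.

```python
import numpy as np


def dtw_align(test, base):
    """Align two sequences of coefficient vectors with classic DTW.

    test: array of shape (Nt, D); base: array of shape (Nb, D).
    Returns (path, total_cost), where path is a list of zero-based
    (test_index, base_index) pairs from (0, 0) to (Nt - 1, Nb - 1).
    """
    nt, nb = len(test), len(base)
    dist = np.linalg.norm(test[:, None, :] - base[None, :, :], axis=2)
    cost = np.full((nt, nb), np.inf)
    cost[0, 0] = dist[0, 0]
    for i in range(nt):
        for j in range(nb):
            if i == 0 and j == 0:
                continue
            best_prev = min(
                cost[i - 1, j] if i > 0 else np.inf,
                cost[i, j - 1] if j > 0 else np.inf,
                cost[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
            )
            cost[i, j] = dist[i, j] + best_prev
    # Backtrack from the last pair to the first, always taking the cheapest predecessor.
    i, j = nt - 1, nb - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1, j], i - 1, j))
        if j > 0:
            candidates.append((cost[i, j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return path, float(cost[-1, -1])
```

Non-uniform weights or a slope limit would be added by changing the allowed steps and the per-step cost.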

Subsequently to identifying the correspondence, the processor calculates the expansion factors for the K test-signal frames relative to the K baseline frames, respectively. By way of example, Fig. 3 shows an expansion factor $\beta_1$ for TF1 relative to RF1, an expansion factor $\beta_2$ for TF2 relative to RF3, and an expansion factor $\beta_3$ for TF3 relative to RF5.

It is noted that, in some cases, the same test-signal frame may correspond to multiple baseline frames, such that multiple expansion factors may be computed for the test-signal frame. In such a case, the processor may average the multiple expansion factors, or simply select one of the expansion factors.

For further details regarding the computation of the expansion factors, reference is now made to Fig. 4, which is a schematic illustration of a technique for computing an expansion factor, in accordance with some embodiments of the present invention.

In some embodiments, the processor computes each expansion factor by computing a statistic (e.g., an average, such as a weighted average, or a median) of respective ratios between (i) one or more formant frequencies of the portion of the test signal for which the expansion factor is computed, and (ii) corresponding formant frequencies in the baseline spectrum for the portion.

By way of illustration, Fig. 4 shows a baseline spectral envelope 58 (belonging, for example, to a reference-signal frame such as RF1 of Fig. 3), along with the spectral envelope 60 of the corresponding test-signal frame (such as TF1 of Fig. 3). The formant frequencies of the baseline spectrum are F1, F2, F3, and F4. Spectral envelope 60 is contracted relative to baseline spectral envelope 58, such that the formant frequencies F1', F2', F3', and F4' of the spectrum of the test-signal frame are slightly smaller than F1, F2, F3, and F4, respectively. In other words, each formant frequency Fi' of the spectrum of the test-signal frame is slightly smaller than the corresponding formant frequency Fi of the baseline spectrum, for i = 1...4. In this scenario, the processor may compute the expansion factor by averaging Fi'/Fi over i = 1...4. (In general, the processor may use any suitable technique known in the art to identify the formant frequencies of each spectrum.)
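The formant-ratio computation can be sketched in a few lines of Python. The function name and the formant values in the example are hypothetical, and formant extraction itself is assumed to have been done elsewhere.

```python
import statistics


def expansion_from_formants(test_formants, base_formants, use_median=False):
    """Average (or take the median of) the ratios Fi'/Fi of matching formants."""
    if len(test_formants) != len(base_formants) or not test_formants:
        raise ValueError("need the same non-zero number of formants in both spectra")
    ratios = [ft / fb for ft, fb in zip(test_formants, base_formants)]
    return statistics.median(ratios) if use_median else statistics.fmean(ratios)


# Hypothetical formant frequencies in Hz; the test spectrum is contracted by 2%.
baseline = [500.0, 1500.0, 2500.0, 3500.0]
contracted = [0.98 * f for f in baseline]
beta = expansion_from_formants(contracted, baseline)
```

A weighted average could be substituted where some formants are estimated more reliably than others.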

In other embodiments, rather than considering ratios between formant frequencies, the

processor computes the expansion factor by utilizing a mathematical relationship between the

expansion factor and the spectral coefficients that represent the spectral envelopes. As an example

of such a relationship, expansion of a spectral envelope by a factor of $\beta$ causes the $n$th cepstral coefficient $c'_n$ of the expanded envelope to have the value $\sum_{k=-\infty}^{\infty} \frac{\sin(\pi(k - n/\beta))}{\pi(k - n/\beta)} c_k$, where $c_k$ is the $k$th cepstral coefficient of the original envelope. (In practice, the above summation may be performed over values of $k$ between $n-j$ and $n+j$, $j$ being an integer such as five, for example, without significant loss of precision.) In other words, applying the expansion factor $\beta$ to the cepstral coefficients $\{c_k\}$ yields new cepstral coefficients $c'_n = \sum_{k=n-j}^{n+j} c_k \frac{\sin(\pi(k - n/\beta))}{\pi(k - n/\beta)}$. Similar

relationships exist for other types of spectral coefficients, such as Discrete Fourier Transform

(DFT) coefficients.

In particular, in such embodiments, the processor computes the expansion factors such that

the expansion factor for each of the test-signal portions minimizes a distance between (i) spectral

coefficients v of the test-signal portion and (ii) baseline spectral coefficients u of the baseline

spectrum for the test-signal portion, following an application of the expansion factor to u or v. In

other words, the processor calculates each expansion factor such that the expansion factor

minimizes the distance $d(u', v)$, $u'$ being the vector $u$ following the application of $\beta$ thereto, or the distance $d(u, v')$, $v'$ being the vector $v$ following the application of $1/\beta$ thereto. The distance $d(x, y)$ between the two vectors of spectral coefficients may be calculated, for example, as $\|x - y\|_p$, $p$ being greater than or equal to one.

In some embodiments, the processor minimizes the distance using an iterative optimization

technique, such as the Newton-Raphson method. In other embodiments, the processor performs

an exhaustive search for the optimal expansion factor over a discrete set of values within an

expected range. For example, the processor may compute $d(u', v)$ (or $d(u, v')$) for $\beta = 1 \pm j\delta$, $j = 0, \ldots, J$, where $\delta$ is a small step size, and then select the $\beta$ yielding the minimum distance. In yet
other embodiments, the processor executes the above two techniques in combination: first an
exhaustive search is performed, and then the selected $\beta$ is used as a starting point for an iterative

optimization technique.
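The exhaustive search can be sketched as follows. As a simplifying assumption, the sketch applies the expansion factor directly to a sampled envelope by resampling its frequency axis with linear interpolation, rather than to cepstral or DFT coefficients. The function names, the step size of 0.005, and the number of steps are illustrative, and the iterative refinement stage is omitted.

```python
import numpy as np


def warp_envelope(envelope, beta):
    """Return samples of H(w / beta) on the original uniform frequency grid.

    Values requested beyond the last bin are clamped to the last sample.
    """
    w = np.arange(len(envelope), dtype=float)
    return np.interp(w / beta, w, envelope)


def search_expansion_factor(test_env, base_env, delta=0.005, steps=40):
    """Try beta = 1 +/- j*delta for j = 0..steps and keep the closest match."""
    betas = 1.0 + delta * np.arange(-steps, steps + 1)
    dists = [np.linalg.norm(warp_envelope(base_env, b) - test_env) for b in betas]
    return float(betas[int(np.argmin(dists))])
```

The value returned by `search_expansion_factor` could then seed an iterative optimizer if a finer estimate is needed.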

In some embodiments, the processor outputs the values of $\beta$ without explicitly deriving values of any other parameter from the values of $\beta$.

In other embodiments, the processor derives one or more values of another parameter from the values of $\beta$, based on equation (8). Examples are hereby provided.

(a) An equilibrium molecular mass or gas concentration

Provided the frames represent speech uttered after equilibrium was reached, the processor

may compute an equilibrium molecular mass or concentration of a gas from the expansion factors.

The processor may assume that equilibrium is reached after a threshold time from the start of the

segment, the threshold time being between one and four seconds, for example.

For example, the processor may first compute a single equilibrium value $\beta_E$ from the individual values of $\beta$, e.g., by averaging the individual values. Subsequently, based on $\beta_E$, the processor may compute $M_{test,E}$, the equilibrium molecular mass of the air exhaled by the subject while uttering the test signal, and/or $C_{CO2,test,E}$, the equilibrium concentration of CO2 in the air (and in the lungs of the subject).

In particular, for embodiments in which a temperature sensor is used as described above with reference to Fig. 1, $M_{test,E}$ may be computed as $(M_{ref,E} \cdot T_{test,E})/(\beta_E^2 \cdot T_{ref,E})$, where $M_{ref,E}$ is the equilibrium molecular mass of the air exhaled by the subject while uttering the reference signal (or the baseline portion of the test signal), and $T_{test,E}$ and $T_{ref,E}$ are the temperatures of the air exhaled while uttering the equilibrium portions of the test and reference signals, respectively. For embodiments in which a temperature sensor is not used, it may be assumed that $T_{test,E} = T_{ref,E}$, such that $M_{test,E} = M_{ref,E}/\beta_E^2$.

Mref,E may be computed, for example, from the speed of sound measured (e.g., using the

ultrasonic techniques described above with reference to Fig. 1) while the reference signal was

uttered, per equation (7). (For embodiments in which the temperature T is not measured, an

approximation for T, such as 307.5 K, may be used.) Alternatively, Mref,E may have any suitable

estimated value, such as a value between 25 and 29 g/mol.

For $C_{CO2,test,E}$, the processor may first compute $M_{test,E}$ as described above, and then compute $C_{CO2,test,E}$, per equation (2), as $C_{CO2,test,I} + (M_{test,E} - M_I)/12$, where $C_{CO2,test,I}$ is the concentration of CO2 at the start of the exhalation. Typically, $C_{CO2,test,I} \approx 0$, such that $C_{CO2,test,E} \approx (M_{test,E} - M_I)/12$.
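The two equilibrium computations above can be sketched as follows. The sketch assumes that molecular masses are expressed in g/mol, that concentrations are expressed as fractions, and that the divisor 12 is the mass difference between CO2 and the O2 it replaces. The function names and example values are illustrative assumptions.

```python
def equilibrium_molecular_mass(m_ref, beta_e, t_test=None, t_ref=None):
    """Equilibrium molecular mass of the test exhalation.

    Without temperatures, equal temperatures are assumed and the result is
    m_ref / beta_e**2; otherwise the temperature ratio is included.
    """
    if t_test is None or t_ref is None:
        return m_ref / beta_e ** 2
    return (m_ref * t_test) / (beta_e ** 2 * t_ref)


def equilibrium_co2_concentration(m_test_e, m_initial, c_initial=0.0):
    """Equilibrium CO2 concentration from the rise in molecular mass.

    m_initial is the molecular mass at the start of the exhalation;
    c_initial is the CO2 concentration at that time (typically near zero).
    """
    return c_initial + (m_test_e - m_initial) / 12.0
```

For instance, with the hypothetical values `m_test_e = 29.2` and `m_initial = 28.6`, the second function returns a concentration of about 0.05.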

Alternatively, for embodiments in which a baseline concentration of CO2 is measured during the utterance of the reference signal as described above with reference to Fig. 1, the processor may compute $C_{CO2,test,E}$ based on $C_{CO2,ref,E}$, the equilibrium CO2 concentration during the utterance of the reference signal. In particular, the processor may use the formula $C_{CO2,test,E} = M_I(\beta_E^{-2} - 1)/12 + (C_{CO2,ref,E} - C_{CO2,ref,I})/\beta_E^2 \approx M_I(\beta_E^{-2} - 1)/12 + C_{CO2,ref,E}/\beta_E^2$, which derives from substituting equation (2) for both $M_{test,E}$ and $M_{ref,E}$ into the equation $M_{test,E} = M_{ref,E}/\beta_E^2$.

(b) A rate of change in a gas concentration

Alternatively or additionally to computing an equilibrium value, the processor may

compute a rate of change in the concentration of a gas, such as CO2 or O2, in the expired air. In

particular, the processor may first compute multiple values of the concentration at different

respective times. Subsequently, the processor may compute the rate of change based on the

multiple values.

For example, the processor may first compute multiple values of the molecular mass, $\{M^k_{test}\}$, for at least some of the K test-signal frames for which the correspondence to the baseline was found. In particular, each $M^k_{test}$ may be computed as $(M^k_{ref} \cdot T^k_{test})/(\beta_k^2 \cdot T^k_{ref})$ or, for embodiments in which a temperature sensor is not used, as $M^k_{ref}/\beta_k^2$, where $\beta_k$ is the expansion factor for the $k$th pair of frames. ($M^k_{ref}$ may be computed as described above for $M_{ref,E}$.)

Subsequently, based on $\{M^k_{test}\}$, the processor may identify the extent to which the rate of change in the concentration is perfusion-constrained. Subsequently, the processor may compute the rate of change in the concentration in response to identifying the extent to which the rate of change is perfusion-constrained. For example, the processor may compute $\dot{M}_{test}$, the discrete-time derivative of $\{M^k_{test}\}$. Next, the processor may compute the rate of change in the concentration based on $\dot{M}_{test}$ and at least one constant that depends on the extent to which the rate of change is perfusion-constrained.

For example, the processor may compute the rate of change as $C \cdot \dot{M}_{test}$, where C is a

constant depending on the extent to which the rate of change is perfusion-constrained. In this

regard, reference is now made to Fig. 5, which schematically illustrates a technique for selecting

the constant C, in accordance with some embodiments of the present invention. (The technique

illustrated in Fig. 5 may be extended to selecting multiple constants for computing the rate of

change in the concentration.)
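The rate computation itself reduces to a discrete-time derivative scaled by the constant C, as in the following sketch. The function name and the frame times are assumptions; the value 1/12 used in the example is the magnitude given in the text for perfusion-constrained exchange.

```python
import numpy as np


def concentration_rate(masses, times, c):
    """Scale the discrete-time derivative of the molecular mass by C.

    masses: per-frame molecular masses; times: frame times in seconds.
    Returns one rate value per pair of consecutive frames.
    """
    masses = np.asarray(masses, dtype=float)
    times = np.asarray(times, dtype=float)
    m_dot = np.diff(masses) / np.diff(times)
    return c * m_dot


# Hypothetical example: molecular mass rising linearly by 0.12 g/mol per second.
times = np.arange(0.0, 1.25, 0.25)
masses = 28.6 + 0.12 * times
rates = concentration_rate(masses, times, 1.0 / 12.0)
```

With these example inputs, each entry of `rates` is approximately 0.01 per second.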

In some embodiments, respective predefined values of C are assigned to multiple functions

61, which represent the change over time in the molecular mass of exhaled air, or in the expansion

factor, in various cases. For example, Fig. 5 shows three functions 61: a first function 61_1,

corresponding to diffusion-constrained gas exchange, a second function 61_2, corresponding to perfusion-constrained gas exchange, and a third function 61_3, corresponding to partially perfusion-constrained gas exchange. (The CO2 concentration reaches equilibrium at a time tE0_1 for first function 61_1, a time tE0_2 for second function 61_2, and a time tE0_3 for third function

61_3.) Alternatively, functions 61 may include multiple functions corresponding to different

respective extents of partial perfusion constraint.

By comparing $\{M^k_{test}\}$ or $\{\beta_k\}$ to functions 61, the processor identifies the extent to which
the gas exchange is perfusion-constrained, and selects C responsively thereto. For example, the
processor may regress $\{M^k_{test}\}$ or $\{\beta_k\}$ to sets of function parameters that define functions 61,

respectively. (The sets of function parameters may be stored in storage device 30 (Fig. 1), for

example.) The processor may further identify the function for which the minimal regression error

is received, and then select the value of C assigned to the identified function. Alternatively, the

processor may identify a linear combination of the functions for which the regression error is

minimal, and compute the corresponding linear combination of predefined C values.

(As a purely illustrative example, the function $\beta(t) = J + K_1 e^{-t/\tau_1} - K_2 e^{-t/\tau_2}$ for the case of diffusion constraint may be defined by the set of parameters $\{K_1, K_2, \tau_1, \tau_2\}$.)

Thus, for example:

(i) For diffusion-constrained exchange prior to tEO, C may equal $1/(M_{CO2} - M_I)$ for the rate of change in the concentration of CO2, per equation (4).

(ii) For diffusion-constrained exchange subsequent to tEO, C may equal $1/(M_{O2} - M_I)$ for the rate of change in the concentration of O2, per equation (5).

(iii) For perfusion-constrained exchange, C may equal ±1/12, per equation (6).
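Selecting C by regression can be sketched as follows. This is a simplified illustration: each candidate function is reduced to a fixed shape whose offset and scale are fitted by linear least squares, whereas a full implementation would also fit the time constants. The candidate shapes, the function name, and the initial molecular mass of 28.6 g/mol are hypothetical placeholders and are not the curves of Fig. 5.

```python
import numpy as np


def select_constant(times, values, candidates):
    """Fit each candidate shape and return (name, C, squared_error) of the best.

    candidates: list of (name, shape_fn, c_value) tuples, where shape_fn maps
    an array of times to the candidate's unscaled curve.
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    best = None
    for name, shape_fn, c_value in candidates:
        # Fit values ~ offset + scale * shape(t) by linear least squares.
        design = np.column_stack([np.ones_like(times), shape_fn(times)])
        coeffs, *_ = np.linalg.lstsq(design, values, rcond=None)
        error = float(np.sum((design @ coeffs - values) ** 2))
        if best is None or error < best[2]:
            best = (name, c_value, error)
    return best


M_CO2, M_INITIAL = 44.01, 28.6  # g/mol; M_INITIAL is a made-up example value
CANDIDATES = [
    ("diffusion", lambda t: 1.0 - np.exp(-t / 0.5), 1.0 / (M_CO2 - M_INITIAL)),
    ("perfusion", lambda t: t, 1.0 / 12.0),
]
```

A linear combination of candidates could be handled by placing all shapes in one design matrix and combining the predefined C values with the fitted weights.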

Optionally, prior to computing $\dot{M}_{test}$, the processor may smooth $\{M^k_{test}\}$ or $\{\beta_k\}$ such that the regression error is reduced to zero.

In other embodiments (e.g., based on the regression described above), the processor

outputs an indication of the extent to which the rate of change in the gas concentration is perfusion

constrained, without necessarily computing the rate of change.

In some embodiments, values of the relevant parameter, such as the equilibrium CO2

concentration, are calculated for multiple segments 56, as further described below with reference

to Fig. 6. In such embodiments, the processor may generate an output in response to the maximum,

minimum, median, average, or any other statistic of the multiple parameter values.

The one-stage technique

Per the one-stage technique, the processor uses a modified form of the aforementioned

correspondence-finding algorithm (e.g., DTW), which computes the values of the parameter while
finding the correspondence for the test signal. In particular, the modified algorithm varies the correspondence and expands or contracts
the spectral envelopes or the baseline spectral envelopes (i.e., applies an expansion factor β or 1/β

to each of the spectral envelopes or baseline spectral envelopes) so as to minimize a sum (e.g., a

weighted sum) of respective distance measures for the portions. The distance measure for each of

the portions is the distance between the spectral coefficients of the portion and the spectral

coefficients of the baseline signal-portion to which the portion corresponds, following the

expansion or contraction.

Typically, the minimization is performed under the constraints described above for the

two-stage technique. Moreover, one or more additional constraints are imposed. For example, for cases in which the test frames were produced at equilibrium, the processor may require that the

same expansion factor is applied to each of the frames, such that the resulting molecular mass is

constant.

For example, after calculating $v[1], \ldots, v[N_t]$ and $u[1], \ldots, u[N_b]$, the processor may identify the mapping $(n^t_1, n^b_1, \beta_1, r_1), (n^t_2, n^b_2, \beta_2, r_2), \ldots, (n^t_K, n^b_K, \beta_K, r_K)$ that minimizes $\sum_{k=1}^{K} w[k] \cdot d(u[n^b_k], v[n^t_k], \beta_k)$ subject to the constraints described above for the two-stage technique along with the additional constraints described below, where $r_k$ is a binary "direction" variable indicating whether the molecular mass is increasing or decreasing, and $d(u[n^b_k], v[n^t_k], \beta_k) = d(u[n^b_k]', v[n^t_k])$ or $d(u[n^b_k], v[n^t_k]')$, where the "'" appendage indicates modification of the vector by $\beta_k$ or $1/\beta_k$ (respectively) as described above. (Each $\beta_k$ may be selected

from a discrete set of potential values.)

The additional constraints may be as follows. (The description below assumes a convention

in which r = 0 corresponds to an increase in the molecular mass, while r = 1 corresponds to a

decrease. In some embodiments, the opposite convention is used.)

(a) $r_1 = 0$, and $r_k = 1$ if $r_{k-1} = 1$. (This constraint requires that the molecular mass increase initially and change direction only once.)

(b) Each $\beta_k$ is such that $\beta_k \le \beta_{k+1} \le \beta_k + \epsilon_0$ for $r_k = 1$ and $\beta_k - \epsilon_1 \le \beta_{k+1} \le \beta_k$ for $r_k = 0$, where $\epsilon_0$ and $\epsilon_1$ are suitable constants. (This constraint ensures smoothness and monotonicity.)
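As a hedged illustration of the additional constraints, the following sketch checks whether a candidate sequence of expansion factors and direction variables is admissible. It assumes the convention stated above (r = 0 for increasing molecular mass, hence a non-increasing expansion factor, and r = 1 for the opposite) and treats the bounds as inclusive. The function name and tolerance values are illustrative.

```python
def satisfies_direction_constraints(betas, directions, eps0, eps1):
    """Check constraints (a) and (b) on expansion factors and direction flags.

    betas[k] and directions[k] belong to the k-th pair of frames; eps0 bounds
    each upward step of beta (r = 1), eps1 each downward step (r = 0).
    """
    # (a) Start with increasing mass; once the direction flips to 1 it stays 1.
    if directions[0] != 0:
        return False
    for k in range(1, len(directions)):
        if directions[k - 1] == 1 and directions[k] != 1:
            return False
    # (b) Each step of beta is bounded and has the sign implied by the direction.
    for k in range(len(betas) - 1):
        step = betas[k + 1] - betas[k]
        if directions[k] == 1 and not (0.0 <= step <= eps0):
            return False
        if directions[k] == 0 and not (-eps1 <= step <= 0.0):
            return False
    return True
```

In a one-stage search, such a check would prune candidate transitions between consecutive pairs rather than be applied after the fact.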

Alternatively or additionally to the latter constraint, for cases in which the test frames were

produced prior to equilibrium, the processor may require that the resulting molecular mass or

expansion factor vary in accordance with a predefined parametric function, such as any one of

functions 61 or a linear combination thereof.

Subsequently, the processor may generate an output responsively to the parameter values

that were computed while finding the correspondence. Alternatively or additionally, the processor

may compute the values of another parameter; for example, the processor may compute gas

concentrations from $\{\beta_k\}$ or $\{M^k_{test}\}$, as described above for the two-stage technique. Subsequently, the output may be generated responsively to these additional parameter values.

EXAMPLE ALGORITHM

Reference is now made to Fig. 6, which is a flow chart for an example algorithm 62 for

processing a test speech signal, in accordance with some embodiments of the present invention.

Algorithm 62 begins with a signal-receiving step 64, at which the processor receives a test

signal from the subject, e.g., over network 24 (Fig. 1). Subsequently to receiving the test signal,

the processor, at a signal-segmenting step 66, segments the test signal by identifying any breathing

pauses in the signal, as described above with reference to Fig. 3. Typically, the subject is

instructed, when generating the test signal, to repeat the same utterance several times, pausing to

inhale between the utterances; hence, the segments of the test signal typically represent the same

verbal content.
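As an illustration of signal-segmenting step 66, the following Python sketch splits a signal into segments at sufficiently long low-energy runs. This is an assumed stand-in, not the pause-identification technique described with reference to Fig. 3: the function name `split_on_pauses`, the per-frame energy input, the energy threshold, and the minimum pause length are all hypothetical.

```python
from typing import List, Sequence, Tuple


def split_on_pauses(frame_energies: Sequence[float], threshold: float,
                    min_pause_frames: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame-index pairs, end exclusive, of segments.

    A pause is assumed to be a run of at least min_pause_frames
    consecutive frames whose energy is below threshold.
    """
    segments: List[Tuple[int, int]] = []
    start = None      # index of the first frame of the current segment
    quiet_run = 0     # length of the current run of low-energy frames
    for i, energy in enumerate(frame_energies):
        if energy >= threshold:
            if start is None:
                start = i
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= min_pause_frames:
                # Close the segment at the first frame of the pause.
                segments.append((start, i - quiet_run + 1))
                start = None
                quiet_run = 0
    if start is not None:
        # Drop any trailing low-energy frames from the last segment.
        segments.append((start, len(frame_energies) - quiet_run))
    return segments
```

Short dips in energy (shorter than `min_pause_frames`) are kept inside a segment, so brief stops within an utterance are not mistaken for an inhalation.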

Next, at a segment-selecting step 68, the processor selects one of the segments (or a portion

thereof). Subsequently, at a dividing step 70, the processor divides the selected segment into

frames. The processor then computes the rate of change of CO2 concentration over the period of

time spanned by the selected segment.

For example, using the two-stage technique, the processor may, at an optimizing step 72,

find the optimal correspondence of the test-signal frames to baseline frames. (As described above

with reference to Fig. 3, the baseline frames may be extracted from a reference signal or from the

test signal itself.) Subsequently, at a computing step 73, the processor may compute respective

expansion factors for the test-signal frames. Next, at another computing step 75, the processor may

compute molecular masses (i.e., {Mtst}) from the expansion factors. Subsequently, at an

identifying step 77, the processor may identify the function that the molecular masses fit best (i.e.,

the function for which the regression error is smallest), along with the corresponding constant C.

(As described above with reference to Fig. 5, this constant may be computed as a linear combination of predefined constants.) Next, at a smoothing step 74, the processor may smooth the molecular masses to fit the best-fit function. Finally, at another computing step 76, the processor may compute the rate of change of the CO2 concentration based on the smoothed molecular masses and the identified constant C.
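Steps 75, 77, 74, and 76 can be sketched in Python as follows. The sketch rests on several assumptions that are labeled in the code: the expansion factor is taken as the ratio of test to baseline formant frequencies, so that (with the speed of sound proportional to the inverse square root of the molecular mass) the test molecular mass equals the baseline mass divided by the squared factor; each candidate function is fitted only up to an offset and a scale; and the CO2 rate is taken as the constant C times the rate of change of the smoothed molecular mass. The names `molecular_masses`, `fit_affine`, `co2_rates`, and `CANDIDATES`, and the candidate functions and constants themselves, are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def molecular_masses(betas: Sequence[float], m_base: float) -> List[float]:
    """Step 75 (assumed relation): M_test = m_base / beta**2."""
    return [m_base / (b * b) for b in betas]


def fit_affine(g_vals: Sequence[float],
               y: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit y ~ a + b * g; returns (a, b, squared error)."""
    n = len(y)
    g_mean = sum(g_vals) / n
    y_mean = sum(y) / n
    sgg = sum((g - g_mean) ** 2 for g in g_vals)
    sgy = sum((g - g_mean) * (v - y_mean) for g, v in zip(g_vals, y))
    b = sgy / sgg
    a = y_mean - b * g_mean
    sse = sum((a + b * g - v) ** 2 for g, v in zip(g_vals, y))
    return a, b, sse


# Hypothetical candidates: name -> (shape function g(t), constant C).
CANDIDATES: Dict[str, Tuple[Callable[[float], float], float]] = {
    "linear": (lambda t: t, 0.5),
    "saturating": (lambda t: 1.0 - math.exp(-t), 0.25),
}


def co2_rates(times: Sequence[float], betas: Sequence[float], m_base: float,
              candidates: Dict[str, Tuple[Callable[[float], float], float]]
              ) -> Tuple[str, List[float]]:
    """Return the best-fit function name and per-interval CO2 rates."""
    masses = molecular_masses(betas, m_base)
    best = None
    for name, (g, c) in candidates.items():
        g_vals = [g(t) for t in times]
        a, b, sse = fit_affine(g_vals, masses)
        if best is None or sse < best[0]:
            # Step 77: keep the function with the smallest regression error.
            # Step 74: the smoothed masses are the fitted function values.
            best = (sse, name, c, [a + b * gv for gv in g_vals])
    _, name, c, smoothed = best
    # Step 76 (assumed form): CO2 rate = C * d(smoothed mass)/dt,
    # approximated by finite differences between consecutive frames.
    rates = [c * (smoothed[i + 1] - smoothed[i]) / (times[i + 1] - times[i])
             for i in range(len(times) - 1)]
    return name, rates
```

Because each candidate is fitted with a closed-form regression, the comparison of regression errors is cheap; a fuller implementation would substitute the actual predefined functions and the constants derived with reference to Fig. 5.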

Subsequently, at another computing step 79, the processor computes a statistic, such as an

average or maximum, of the rate of change over the frames.

Subsequently, the processor checks, at a checking step 78, whether any unselected

segments remain. If yes, the processor returns to segment-selecting step 68.

Following the computation of the statistic for each segment, the processor, at an averaging

step 80, computes the average of the statistic over the segments. In doing so, the processor may

weight the segments differently, discard outliers, and/or use any other techniques known in the art

for increasing the reliability of the average. (Alternatively to an average, the processor may

compute a median or any other suitable statistic.)
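One possible form of averaging step 80 is sketched below in Python. It is an assumption-laden illustration, not the technique of the specification: the function name `robust_average` is hypothetical, outliers are discarded using the median absolute deviation (one of many known techniques), and the optional weights are supplied by the caller.

```python
import statistics
from typing import Optional, Sequence


def robust_average(values: Sequence[float],
                   weights: Optional[Sequence[float]] = None,
                   max_dev: float = 3.0) -> float:
    """Weighted mean of values after discarding outliers.

    A value is treated as an outlier if it lies more than max_dev
    median absolute deviations (MADs) from the median. If the MAD is
    zero, no value is discarded.
    """
    if weights is None:
        weights = [1.0] * len(values)
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    kept = [(v, w) for v, w in zip(values, weights)
            if mad == 0 or abs(v - med) <= max_dev * mad]
    total_weight = sum(w for _, w in kept)
    return sum(v * w for v, w in kept) / total_weight
```

For example, with per-segment statistics of 1.0, 1.2, 0.9, 1.1, and 5.0, the value 5.0 is discarded and the remaining four values are averaged.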

Next, the processor, at a comparing step 82, compares the average to a suitable predefined

threshold. If the average passes the threshold, the value of the parameter is deemed to be abnormal.

(Depending on the parameter and on the physiological condition with respect to which the subject

is being tested, an abnormal value may either be less than or greater than the threshold.) In response

to identifying the abnormality, the processor generates an alert at an alert-generating step 84.

Otherwise, algorithm 62 ends without the generation of an alert.
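The overall control flow of algorithm 62 can be summarized in a brief Python sketch. The function name `run_algorithm` and its parameters are hypothetical; the per-segment computation of steps 70 through 76 is passed in as a callable rather than implemented, and a plain mean stands in for averaging step 80.

```python
from statistics import mean
from typing import Callable, Sequence, Tuple


def run_algorithm(segments: Sequence[Sequence[float]],
                  rate_of_change: Callable[[Sequence[float]], Sequence[float]],
                  threshold: float,
                  abnormal_if_above: bool = True,
                  statistic: Callable[[Sequence[float]], float] = max
                  ) -> Tuple[float, bool]:
    """Return (average statistic, whether the average is abnormal)."""
    # Steps 68 to 79, repeated for every segment (checking step 78):
    # compute the per-frame rates, then a statistic such as the maximum.
    per_segment = [statistic(rate_of_change(seg)) for seg in segments]
    # Step 80: average the statistic over the segments.
    average = mean(per_segment)
    # Step 82: the abnormal side of the threshold depends on the
    # parameter and on the condition being tested.
    if abnormal_if_above:
        abnormal = average > threshold
    else:
        abnormal = average < threshold
    # The caller would generate the alert of step 84 if abnormal is True.
    return average, abnormal
```

Keeping the abnormal direction as an explicit argument reflects the statement above that an abnormal value may be either less than or greater than the threshold.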

It will be appreciated by persons skilled in the art that the present invention is not limited

to what has been particularly shown and described hereinabove. Rather, the scope of embodiments of the present invention includes both combinations and subcombinations of the various features

described hereinabove, as well as variations and modifications thereof that are not in the prior art,

which would occur to persons skilled in the art upon reading the foregoing description. Documents

incorporated by reference in the present patent application are to be considered an integral part of

the application except that to the extent any terms are defined in these incorporated documents in

a manner that conflicts with the definitions made explicitly or implicitly in the present

specification, only the definitions in the present specification should be considered.

Claims

1. A system, comprising: an output device; and one or more processors configured to cooperatively carry out a process that includes: computing one or more values of at least one parameter at respective times during an exhalation of a subject, based on one or more properties of sound emitted by the subject while producing the exhalation and passing through air exhaled by the subject during the exhalation, the parameter being related to a concentration of a gas in the air, wherein the sound belongs to speech of the subject, and wherein computing the values comprises computing the values based on a speech signal representing the speech, and generating an output, via the output device, in response to the values.

2. The system according to claim 1, further comprising a sensor configured to measure a speed of the sound, wherein the process includes computing the values based on the speed.

3. The system according to claim 1, further comprising a sensor configured to measure a baseline concentration of the gas in other exhaled air, wherein the process includes computing the values based on the baseline concentration.

4. A method, comprising: computing one or more values of at least one parameter at respective times during an exhalation of a subject, based on one or more properties of sound emitted by the subject while producing the exhalation and passing through air exhaled by the subject during the exhalation, the parameter being related to a concentration of a gas in the air, wherein the sound belongs to speech of the subject, and wherein computing the values comprises computing the values based on a speech signal representing the speech; and generating an output in response to the values.

5. The method according to claim 4, wherein the output indicates a state of the subject with respect to a physiological condition selected from the group of conditions consisting of: heart failure, asthma, hypobaropathy, hypercapnia, Chronic Obstructive Pulmonary Disease (COPD), and Interstitial Lung Disease (ILD).

6. The method according to claim 4, further comprising, based on the values, identifying an extent to which a rate of change in the concentration is perfusion-constrained, wherein the output indicates the extent to which the rate of change is perfusion-constrained.

7. The method according to claim 4, wherein computing the one or more values of the at least one parameter comprises computing multiple values of the concentration, wherein the method further comprises, based on the multiple values of the concentration, computing a rate of change of the concentration, and wherein generating the output comprises generating the output in response to the rate of change.

8. The method according to claim 7, wherein generating the output comprises generating the output in response to comparing the rate of change to a baseline rate of change.

9. The method according to claim 7, wherein computing the one or more values of the at least one parameter further comprises computing multiple molecular-mass values of a molecular mass of the air, wherein the method further comprises identifying an extent to which the rate of change is perfusion-constrained, and wherein computing the rate of change comprises computing the rate of change based on the molecular-mass values and in response to identifying the extent to which the rate of change is perfusion-constrained.

10. The method according to claim 9, wherein computing the rate of change comprises computing the rate of change as a function of: another rate of change of the molecular mass, and at least one constant that depends on the extent to which the rate of change is perfusion-constrained.

11. The method according to claim 4, wherein the one or more values include an equilibrium value of the parameter.

12. The method according to claim 11, wherein the equilibrium value includes a CO2-equilibrium value of a CO2 concentration of CO2 in the air.

13. The method according to claim 12, wherein computing the CO2-equilibrium value comprises computing the CO2-equilibrium value based on a baseline CO2-equilibrium value that was measured prior to the exhalation.

14. The method according to any one of claims 4-13, wherein computing the values comprises: selecting portions of the speech signal recorded at the times, respectively; computing respective spectral envelopes of the portions; and computing the values based on respective expansions or contractions of the spectral envelopes relative to respective corresponding baseline spectral envelopes.

15. The method according to claim 14, wherein the values include respective expansion factors that quantify the expansions or contractions.

16. The method according to claim 14, wherein the baseline spectral envelopes belong to respective baseline signal-portions corresponding to the portions of the signal, respectively.

17. The method according to claim 16, wherein the baseline signal-portions are other portions of the speech signal.

18. The method according to claim 16, wherein the baseline signal-portions belong to a reference speech signal.

19. The method according to claim 18, wherein the reference speech signal represents other speech uttered while in a known physiological state, and wherein the output indicates a physiological state of the subject relative to the known physiological state.

20. The method according to claim 18, wherein the reference speech signal represents other speech, and wherein computing the values comprises computing the values based on one or more measured properties of other air exhaled during the other speech.

21. The method according to claim 18, wherein the reference speech signal represents other speech uttered by the subject.

22. The method according to claim 16, wherein computing the values comprises computing the values while identifying the correspondence between the baseline signal-portions and the portions of the speech signal, by varying the correspondence and expanding or contracting the spectral envelopes or the baseline spectral envelopes so as to minimize a sum of respective distance measures for the portions, the distance measure for each of the portions being a distance between (i) spectral coefficients of the portion and (ii) baseline spectral coefficients of the baseline signal-portion to which the portion corresponds, following the expansion or contraction.

23. The method according to claim 22, wherein computing the values comprises computing the values under a constraint that the values vary in accordance with a predefined function.

24. The method according to claim 16, further comprising, prior to computing the values, identifying the correspondence between the portions and the baseline signal-portions by minimizing a sum of respective distance measures for the portions, the distance measure for each of the portions being a distance between (i) spectral coefficients of the portion and (ii) baseline spectral coefficients of the baseline signal-portion to which the portion corresponds.

25. The method according to claim 24, wherein computing the values comprises computing the values based on, for each of the portions, a statistic of respective ratios between (i) one or more formant frequencies of the portion, and (ii) corresponding formant frequencies in the baseline spectrum for the portion.

26. The method according to claim 24, wherein the values include respective expansion factors that quantify the expansions or contractions, and wherein the expansion factor for each portion minimizes a distance between (i) spectral coefficients of the portion and (ii) baseline spectral coefficients of the baseline spectrum for the portion, following an application of the expansion factor to the spectral coefficients or to the baseline spectral coefficients.

27. The method according to any one of claims 4-13, wherein computing the values comprises computing the values based on respective measured speeds of the sound.

28. A computer software product comprising a tangible non-transitory computer-readable medium in which program instructions are stored, which instructions, when read by one or more processors, cause the processors to cooperatively carry out the method of any one of claims 4 to 27.