NZ794700B2

NZ794700B2 - Backward-compatible integration of harmonic transposer for high frequency reconstruction of audio signals

Info

Publication number: NZ794700B2
Application number: NZ794700A
Authority: NZ
Inventors: Per Ekstrand; Heiko Purnhagen; Lars Villemoes
Original assignee: Dolby International Ab
Filing date: 2018-03-19
Publication date: 2025-01-28

Abstract

A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either linear translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag.
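The flag-controlled choice between the two regeneration modes can be sketched as follows. This is only an illustration under stated assumptions: the names `regenerate_highband`, `linear_translation`, and `harmonic_transposer` are hypothetical, the subband layout (a list of per-subband sample lists) is assumed, and the harmonic transposer itself is passed in as a callable rather than implemented. The use of the high frequency reconstruction metadata is not shown.

```python
from typing import Callable, List, Optional

# Assumed layout: subbands[k][t] is the sample of subband k at time slot t.
Subbands = List[List[complex]]


def linear_translation(lowband: Subbands, start: int, count: int) -> Subbands:
    """Copy `count` consecutive lowband subbands, beginning at index `start`."""
    if start < 0 or count < 0 or start + count > len(lowband):
        raise ValueError("patch range exceeds the available lowband subbands")
    return [list(band) for band in lowband[start:start + count]]


def regenerate_highband(
    lowband: Subbands,
    harmonic_flag: bool,
    start: int,
    count: int,
    harmonic_transposer: Optional[Callable[[Subbands, int], Subbands]] = None,
) -> Subbands:
    """Select the regeneration mode according to the extracted flag."""
    if harmonic_flag:
        # Hypothetical hook: the transposer is supplied by the caller.
        if harmonic_transposer is None:
            raise ValueError("harmonic transposition requested but no transposer given")
        return harmonic_transposer(lowband, count)
    return linear_translation(lowband, start, count)
```

For example, `regenerate_highband(lowband, False, 1, 2)` returns copies of lowband subbands 1 and 2, while a true flag routes the same input to whatever transposer the caller provides.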

Claims

1. A method for decoding an encoded audio bitstream, the method comprising:
receiving the encoded audio bitstream, the encoded audio bitstream including audio data representing a lowband portion of an audio signal;
decoding the audio data to generate a decoded lowband audio signal;
extracting from the encoded audio bitstream high frequency reconstruction metadata, the high frequency reconstruction metadata including linear translation operating parameters tuned for a high frequency reconstruction process that linearly translates a consecutive number of subbands from a lowband portion of the audio signal to a highband portion of the audio signal, the linear translation operating parameters including sinusoid addition information;
filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal;
extracting from the encoded audio bitstream a flag indicating whether either linear translation or harmonic transposition is to be performed on the audio data;
if the flag indicates that harmonic transposition is to be performed on the audio data, regenerating a highband portion of the audio signal by performing harmonic transposition using the filtered lowband audio signal and the high frequency reconstruction metadata, including the sinusoid addition information, wherein the sinusoid addition information is reused for harmonic transposition even though it was encoded for linear translation processing; and
combining the filtered lowband audio signal and the regenerated highband portion to form a wideband audio signal,
wherein the analysis filterbank includes analysis filters, hk(n), that are modulated versions of a prototype filter, p0(n), according to:

$$h_k(n) = p_0(n)\,\exp\left\{ i\,\frac{\pi}{M}\left(k + \frac{1}{2}\right)\left(n - \frac{N}{2}\right) \right\}, \quad 0 \le n \le N;\; 0 \le k < M$$

where p0(n) is a real-valued symmetric or asymmetric prototype filter, M is a number of channels in the analysis filterbank and N is the prototype filter order.
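The modulation in claim 1, read as hk(n) = p0(n) exp{i (π/M) (k + 1/2) (n − N/2)} for 0 ≤ n ≤ N and 0 ≤ k < M, can be evaluated directly. The sketch below is illustrative only: the function name `analysis_filters` is hypothetical, and the prototype coefficients in the usage note are placeholders, not the coefficients of Table 4.

```python
import cmath
import math
from typing import List, Sequence


def analysis_filters(p0: Sequence[float], M: int) -> List[List[complex]]:
    """Return h[k][n] = p0[n] * exp(i * pi/M * (k + 1/2) * (n - N/2)).

    p0 holds the N + 1 real prototype coefficients (N is the filter order),
    and M is the number of channels, so k runs over 0 .. M-1.
    """
    if M <= 0 or len(p0) == 0:
        raise ValueError("need at least one channel and one coefficient")
    N = len(p0) - 1  # prototype filter order
    return [
        [
            p0[n] * cmath.exp(1j * math.pi / M * (k + 0.5) * (n - N / 2))
            for n in range(N + 1)
        ]
        for k in range(M)
    ]
```

Calling `analysis_filters([0.5, 1.0, 1.0, 1.0, 0.5], 4)` yields four complex filters of five taps each. Because the exponential has unit magnitude, every tap keeps the magnitude of the corresponding prototype coefficient, and the centre tap (n = N/2) equals p0(N/2) for every channel.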

2. The method of claim 1 wherein the encoded audio bitstream further includes a fill element with an identifier indicating a start of the fill element and fill data after the identifier, wherein the fill data includes the flag.

3. The method of claim 2 wherein the identifier is a three bit unsigned integer transmitted most significant bit first and having a value of 0x6.

4. The method of claim 2, wherein the fill data includes an extension payload, the extension payload includes spectral band replication extension data, and the extension payload is identified with a four bit unsigned integer transmitted most significant bit first and having a value of ‘1101’ or ‘1110’.
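The two identifier checks in claims 3 and 4 amount to reading unsigned integers most significant bit first and comparing them with fixed values. A minimal sketch follows, assuming a byte-oriented buffer; the class `BitReader` and the helper names are hypothetical. Each field is read in isolation here, since the claims do not state what other fields lie between the fill element identifier and the extension payload identifier.

```python
class BitReader:
    """Reads unsigned integers from a byte buffer, most significant bit first."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0  # current position, counted in bits

    def read_uint(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            if self._pos >= 8 * len(self._data):
                raise EOFError("read past end of bitstream")
            byte = self._data[self._pos // 8]
            bit = (byte >> (7 - self._pos % 8)) & 1
            value = (value << 1) | bit
            self._pos += 1
        return value


FILL_ELEMENT_ID = 0x6                    # three-bit identifier (claim 3)
SBR_EXTENSION_TYPES = {0b1101, 0b1110}   # four-bit identifiers (claim 4)


def is_fill_element(reader: BitReader) -> bool:
    """Consume three bits and test for the fill element identifier."""
    return reader.read_uint(3) == FILL_ELEMENT_ID


def is_sbr_extension(reader: BitReader) -> bool:
    """Consume four bits and test for an SBR extension payload identifier."""
    return reader.read_uint(4) in SBR_EXTENSION_TYPES
```

For example, `is_fill_element(BitReader(bytes([0b11000000])))` is true because the first three bits are `110`, and `is_sbr_extension(BitReader(bytes([0b11100000])))` is true because the first four bits are `1110`.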

5. The method of claim 4, wherein the spectral band replication extension data includes: a spectral band replication header, spectral band replication data after the header, a spectral band replication extension element after the spectral band replication data, and wherein the flag is included in the spectral band replication extension element.

6. The method of any one of claims 1-4 wherein the prototype filter, p0(n), is defined by the coefficients of Table 4.

7. The method of any of claims 1-4 wherein the prototype filter, p0(n), is derived from coefficients of Table 4 by one or more mathematical operations selected from the group consisting of rounding, subsampling, interpolation, or decimation.
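Three of the operations named in claim 7 can be illustrated on a generic coefficient list. This is a sketch under assumptions: the helper names are hypothetical, linear interpolation is only one possible interpolation scheme, and the values in the usage note are placeholders rather than Table 4 coefficients. Decimation, which usually combines low-pass filtering with subsampling, is not shown.

```python
from typing import List, Sequence


def round_coefficients(coeffs: Sequence[float], digits: int) -> List[float]:
    """Round every coefficient to `digits` decimal places."""
    return [round(c, digits) for c in coeffs]


def subsample(coeffs: Sequence[float], factor: int) -> List[float]:
    """Keep every `factor`-th coefficient, starting with the first."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return list(coeffs[::factor])


def interpolate_linear(coeffs: Sequence[float], factor: int) -> List[float]:
    """Insert `factor - 1` linearly interpolated values between neighbours."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    if not coeffs:
        return []
    out: List[float] = []
    for a, b in zip(coeffs, coeffs[1:]):
        for j in range(factor):
            out.append(a + (b - a) * j / factor)
    out.append(coeffs[-1])
    return out
```

For instance, `subsample([0.0, 0.1, 0.2, 0.3, 0.4], 2)` keeps the first, third, and fifth values, and `interpolate_linear([0.0, 1.0, 0.0], 2)` inserts one midpoint between each pair of neighbours.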

8. A computer program having instructions that when executed by a processor cause said processor to perform the method of any of claims 1-7.

9. A decoder for decoding an encoded audio bitstream, the decoder comprising:
an input interface for receiving the encoded audio bitstream, the encoded audio bitstream including audio data representing a lowband portion of an audio signal;
a core decoder for decoding the audio data to generate a decoded lowband audio signal;
a deformatter for extracting from the encoded audio bitstream high frequency reconstruction metadata, the high frequency reconstruction metadata including linear translation operating parameters tuned for a high frequency reconstruction process that linearly translates a consecutive number of subbands from a lowband portion of the audio signal to a highband portion of the audio signal, the linear translation operating parameters including sinusoid addition information;
an analysis filterbank for filtering the decoded lowband audio signal to generate a filtered lowband audio signal;
a deformatter for extracting from the encoded audio bitstream a flag indicating whether either linear translation or harmonic transposition is to be performed on the audio data;
a high frequency regenerator for regenerating, if the flag indicates that harmonic transposition is to be performed on the audio data, a highband portion of the audio signal by performing harmonic transposition using the filtered lowband audio signal and the high frequency reconstruction metadata, including the sinusoid addition information, wherein the sinusoid addition information is reused for harmonic transposition even though it was encoded for linear translation processing; and
a synthesis filterbank for combining the filtered lowband audio signal and the regenerated highband portion to form a wideband audio signal,
wherein the analysis filterbank includes analysis filters, hk(n), that are modulated versions of a prototype filter, p0(n), according to:

$$h_k(n) = p_0(n)\,\exp\left\{ i\,\frac{\pi}{M}\left(k + \frac{1}{2}\right)\left(n - \frac{N}{2}\right) \right\}, \quad 0 \le n \le N;\; 0 \le k < M$$

where p0(n) is a real-valued symmetric or asymmetric prototype filter, M is a number of channels in the analysis filterbank and N is the prototype filter order.

10. The decoder of claim 9 wherein the encoded audio bitstream further includes a fill element with an identifier indicating a start of the fill element and fill data after the identifier, wherein the fill data includes the flag.

11. The decoder of claim 10 wherein the identifier is a three bit unsigned integer transmitted most significant bit first and having a value of 0x6.

12. The decoder of claim 10, wherein the fill data includes an extension payload, the extension payload includes spectral band replication extension data, and the extension payload is identified with a four bit unsigned integer transmitted most significant bit first and having a value of ‘1101’ or ‘1110’.

13. The decoder of claim 12, wherein the spectral band replication extension data includes: a spectral band replication header, spectral band replication data after the header, a spectral band replication extension element after the spectral band replication data, and wherein the flag is included in the spectral band replication extension element.

14. The decoder of any one of claims 9-12 wherein the prototype filter, p0(n), is defined by the coefficients of Table 4.

15. The decoder of any one of claims 9-12 wherein the prototype filter, p0(n), is derived from coefficients of Table 4 by one or more mathematical operations selected from the group consisting of rounding, subsampling, interpolation, or decimation.