Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /home/zhenxiangba/zhenxiangba.com/public_html/phproxy-improved-master/index.php on line 456
Kanagasundaram et al., 2019 - Google Patents
[go: Go Back, main page]

Kanagasundaram et al., 2019 - Google Patents

A study of x-vector based speaker recognition on short utterances

Kanagasundaram et al., 2019

View PDF
Document ID
1366304604841193667
Author
Kanagasundaram A
Sridharan S
Ganapathy S
Singh P
Fookes C
Publication year
Publication venue
Proceedings of the 20th Annual Conference of the International Speech Communication Association, INTERSPEECH 2019. Vol. 2019-September.

External Links

Snippet

The aim of this work is to gain insights into how the deep neural network (DNN) models should be trained for short utterance evaluation conditions in an x-vector based speaker verification system. The study suggests that the speaker embedding can be extracted with …
Continue reading at eprints.qut.edu.au (PDF) (other versions)

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065Adaptation
    • G10L15/07Adaptation to the speaker
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063Training
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/06Decision making techniques; Pattern matching strategies
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/02Feature extraction for speech recognition; Selection of recognition unit
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/04Training, enrolment or model building
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6217Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6267Classification techniques
    • G06K9/6268Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46Extraction of features or characteristics of the image
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis

Similar Documents

Publication Publication Date Title
Kanagasundaram et al. A study of x-vector based speaker recognition on short utterances
Kanagasundaram et al. Improving short utterance i-vector speaker verification using utterance variance modelling and compensation techniques
Matejka et al. Neural Network Bottleneck Features for Language Identification.
Zhou et al. Dynamic margin softmax loss for speaker verification
Sethu et al. Speaker variability in speech based emotion models-Analysis and normalisation
Silva et al. Spoken digit recognition in portuguese using line spectral frequencies
Madikeri et al. Implementation of the standard i-vector system for the kaldi speech recognition toolkit
Soboleva et al. Replacing human audio with synthetic audio for on-device unspoken punctuation prediction
Ghaemmaghami et al. Speaker attribution of australian broadcast news data
Walsh et al. Hilbert-huang-transform based features for accent classification of non-native english speakers
Kajarekar Four weightings and a fusion: A cepstral-SVM system for speaker recognition
Gupta et al. Segment-level pyramid match kernels for the classification of varying length patterns of speech using SVMs
Xiao-chun et al. A text-independent speaker recognition system based on probabilistic principle component analysis
Saleem et al. Voice conversion and spoofed voice detection from parallel English and Urdu corpus using cyclic GANs
Kekre et al. Speaker identification using spectrograms of varying frame sizes
Adam et al. Wavelet based Cepstral Coefficients for neural network speech recognition
Lee et al. Using discrete probabilities with Bhattacharyya measure for SVM-based speaker verification
Kanagasundaram et al. Study on pairwise LDA for x‐vector‐based speaker recognition
Portêlo et al. Privacy-preserving query-by-example speech search
Rahman et al. Domain mismatch modeling of out-domain i-vectors for PLDA speaker verification
Aronowitz Unsupervised compensation of intra-session intra-speaker variability for speaker diarization
Verma et al. Performance analysis of speaker identification using gaussian mixture model and support vector machine
Kekre et al. Speaker identification using row mean vector of spectrogram
Wang et al. DNN-based discriminative scoring for speaker recognition based on i-vector
Nandwana et al. Analysis of Complementary Information Sources in the Speaker Embeddings Framework.