Language Aided Speaker Diarization Using Speaker Role Information. (arXiv:1911.07994v1 [eess.AS])
Speaker diarization relies on the assumption that acoustic embeddings from speech segments corresponding to a particular speaker share common characteristics. Thus, they are concentrated in a specific region of the speaker space, a region that represents that speaker’s identity. Those identities, however, are not known a priori, so a clustering algorithm is employed, which is…
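
As a rough illustration of the clustering assumption described in the abstract, the sketch below groups per-segment embeddings without knowing the number of speakers in advance. It is a generic, minimal example and not the method proposed in the paper: the toy 3-dimensional embeddings, the cosine-distance stopping threshold of 0.5, and the helper names (`cosine_distance`, `cluster_embeddings`, `make_toy_embeddings`) are all assumptions made for illustration.

```python
import math
import random


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cluster_embeddings(embeddings, threshold=0.5):
    """Greedy agglomerative clustering on cluster centroids.

    Repeatedly merges the two closest clusters and stops once the
    closest pair is farther apart than `threshold` (an assumed value),
    so the number of speakers does not have to be known beforehand.
    Returns one integer cluster label per input segment.
    """
    clusters = [[i] for i in range(len(embeddings))]
    while len(clusters) > 1:
        centroids = [mean_vector([embeddings[k] for k in c]) for c in clusters]
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cosine_distance(centroids[i], centroids[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for k in members:
            labels[k] = label
    return labels


def make_toy_embeddings(seed=0):
    """Synthetic stand-in for real speaker embeddings (hypothetical data)."""
    rng = random.Random(seed)
    centers = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # two made-up speakers
    order = [0, 0, 1, 0, 1, 1]                    # who speaks in each segment
    points = [[c + rng.gauss(0, 0.05) for c in centers[s]] for s in order]
    return points, order


embeddings, true_speakers = make_toy_embeddings()
labels = cluster_embeddings(embeddings)
print(labels)
```

Segments that receive the same label are attributed to the same speaker. In practice the embeddings would come from a trained speaker-embedding extractor, and the stopping threshold would need tuning on held-out data.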