Open Source Contributions
- Project 1 - RawNet2: End-to-end anti-spoofing.
- Project 2 - RawGAT-ST: End-to-end Spectro-Temporal Graph Attention Networks.
- Project 3 - SSL-AASIST: Audio spoofing and deepfake detection using self-supervised learning and data augmentation.
- Project 4 - t-EER: Parameter-Free Tandem Evaluation Metric of Countermeasures and Biometric Comparators.
- Project 5 - RawBoost: Data boosting and augmentation technique for automatic speaker verification and audio deepfake detection.
- Project 6: Jointly-optimised and integrated solutions to spoofing attack detection and automatic speaker verification.
During my PhD, I actively contributed to various projects. Below are some of my notable open source projects:
- Project 1 - RawNet2: This repository contains our implementation of the paper accepted to ICASSP 2021, "End-to-end anti-spoofing with RawNet2". This work demonstrates the effectiveness of end-to-end approaches that utilise automatic feature learning to improve performance, both for seen spoofing attack types and for the worst-case (A17) unseen attack.
- Project 2 - RawGAT-ST: This repository contains our implementation of the paper published in the ASVspoof 2021 satellite workshop, "End-to-End Spectro-Temporal Graph Attention Networks for Speaker Verification Anti-Spoofing and Speech Deepfake Detection". This work demonstrates the effectiveness of an end-to-end spectro-temporal graph attention network (GAT) which learns the relationship between cues spanning different sub-bands and temporal intervals for anti-spoofing and speech deepfake detection.
- Project 3 - SSL-AASIST: This repository contains our implementation of the paper published in the Speaker Odyssey 2022 conference, "Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and Rawboost data augmentation". In this work, we replaced the sinc-layer front-end from RawNet2 with a self-supervised learning (SSL) front-end, wav2vec 2.0. Our experiments demonstrate that combining the efficient "RawBoost" data-augmentation technique with the SSL wav2vec 2.0 front-end and a graph neural network (GNN) based AASIST back-end leads to significant improvements in performance. These results highlight the importance of a well-trained and fine-tuned front-end, even when it is initially trained using only large quantities of bona fide utterances, as it improves generalisation. This work achieved state-of-the-art performance on the more challenging ASVspoof 2021 LA and DF tasks.
- Project 4 - t-EER: In this repository we propose an extension to the widely-used 'equal error rate' (EER) performance metric, which is still used across the field to assess the reliability of biometric systems. The new 'tandem' EER ("t-EER") metric can be used to assess the performance of presentation attack detectors (PADs) and biometric comparators jointly -- as opposed to evaluating the two systems in isolation. Informally, it's an "EER" for *two* detectors with their respective detection thresholds. As indicated in the title of our article, the t-EER metric is parameter-free.
- Project 5 - RawBoost: In this repository we propose RawBoost, a data-boosting and augmentation method to design more reliable spoofing detection solutions which operate directly upon raw waveforms. The RawBoost data-augmentation technique is based upon the combination of linear and non-linear filtering in addition to signal-dependent and signal-independent additive noise. The aim is to improve spoofing detection reliability in the face of nuisance variation stemming from the unknown encoding and transmission conditions which typify telephony scenarios.
- Project 6 - Jointly-optimised and integrated solutions for SASV: This project aims to determine whether spoofing countermeasures (CMs) and automatic speaker verification (ASV) systems can be jointly optimised, and to establish the potential of doing so. Jointly-optimised or integrated solutions have the potential to better exploit the synergy between spoofing CM and ASV subsystems so that they function cooperatively as a more reliable solution to spoofing-aware speaker verification (SASV).
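
As a point of reference for Project 4, the classic single-detector EER that the t-EER extends can be sketched in a few lines of NumPy. This is only an illustrative sketch of the conventional EER, not the t-EER itself and not code taken from the t-EER repository. The function name `compute_eer`, the convention that higher scores indicate the target class, and the toy scores are all assumptions made for this example.

```python
import numpy as np


def compute_eer(target_scores, nontarget_scores):
    """Return (eer, threshold) for two sets of detection scores.

    Assumption for this sketch: higher scores indicate the target class.
    """
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)

    # Every observed score is a candidate decision threshold.
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))

    # Miss rate: fraction of targets scored below the threshold (rejected).
    miss = np.array([(target_scores < t).mean() for t in thresholds])
    # False alarm rate: fraction of non-targets scored at or above it (accepted).
    false_alarm = np.array([(nontarget_scores >= t).mean() for t in thresholds])

    # The EER is taken where the two error rates are closest to equal.
    idx = np.argmin(np.abs(miss - false_alarm))
    eer = (miss[idx] + false_alarm[idx]) / 2
    return eer, thresholds[idx]


# Toy scores (hypothetical): one target and one non-target overlap.
eer, threshold = compute_eer([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6])
print(f"EER = {eer:.2f} at threshold {threshold:.1f}")  # EER = 0.25 at threshold 0.6
```

The sketch uses a single threshold for a single detector. The t-EER described above instead considers two detectors, each with its own threshold, so this code should be read only as background for the metric being extended.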