An Effective Technique for Deepfake Video Detection
Baneen Musa Mahdi
Technical College of Management - Baghdad, Middle Technical University, Baghdad, Iraq
Prof. Dr. Ali Mohammad Sahan
Technical College of Management - Baghdad, Middle Technical University, Baghdad, Iraq
DOI: http://doi.org/10.37648/ijps.v20i01.014
Abstract
Fake videos cause problems that affect people's lives in many fields, and they have therefore received increasing attention. In this paper, we present a method for detecting fake videos based on the Scattered Wavelet Transform (SWT) and a pre-trained deep learning model, EfficientNet-B0. Several experiments were conducted on the Deepfake Detection (DFD) database to evaluate the effectiveness of the proposed method. To assess robustness, the model was also tested on images corrupted with salt-and-pepper noise, white Gaussian noise, and horizontal misalignment, a distortion commonly used in fake video detection experiments. The method achieved 98% accuracy under these noise conditions.
Keywords:
EfficientNet-B0; Scattered Wavelet Transform; deepfake video detection; Deepfake Detection database.
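The three noise conditions named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the noise density, the standard deviation, and the shift size are all hypothetical, and treating "horizontal misalignment" as a zero-filled horizontal shift of the frame is an assumption, since the abstract does not define it.

```python
import numpy as np


def add_salt_and_pepper(image, density, rng):
    """Set roughly a fraction `density` of pixels to 0 (pepper) or 255 (salt)."""
    noisy = image.copy()
    mask = rng.random(image.shape[:2])  # one random draw per pixel location
    noisy[mask < density / 2] = 0
    noisy[(mask >= density / 2) & (mask < density)] = 255
    return noisy


def add_gaussian(image, sigma, rng):
    """Add zero-mean white Gaussian noise with standard deviation `sigma`."""
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, image.shape)
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)


def shift_horizontally(image, pixels):
    """Shift the frame by `pixels` columns (assumed reading of 'misalignment').

    Positive values shift right, negative values shift left; vacated
    columns are filled with zeros.
    """
    shifted = np.zeros_like(image)
    if pixels > 0:
        shifted[:, pixels:] = image[:, :-pixels]
    elif pixels < 0:
        shifted[:, :pixels] = image[:, -pixels:]
    else:
        shifted[:] = image
    return shifted


# Hypothetical parameters on a synthetic uniform grey frame.
rng = np.random.default_rng(0)
frame = np.full((64, 64, 3), 128, dtype=np.uint8)
sp = add_salt_and_pepper(frame, density=0.05, rng=rng)
gauss = add_gaussian(frame, sigma=10.0, rng=rng)
shifted = shift_horizontally(frame, pixels=4)
```

In a robustness experiment of the kind the abstract describes, each corrupted frame would be passed through the same feature extraction and classifier as the clean frames, and accuracy would be reported per noise type.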