Provas Públicas de Mestrado em Engenharia Eletrotécnica-Telecomunicações

Provas Públicas de Mestrado em Engenharia Eletrotécnica-Telecomunicações

por Alberto de Jesus Nascimento -
Número de respostas: 0

Autor: Daniel Moisés de Olival Parada
Local: Sala 2.109 e sessão Zoom https://videoconf-colibri.zoom.us/j/92410874604?pwd=M1RGNC9IaHpZUm10RC9GaThqNzV5Zz09
Dia/Hora: 2/2/2024 às 16:00

Abstract: This thesis investigates the usage of deep learning algorithms to perform sentiment analysis over restaurant reviews from the Zomato application, making use of natural language processing techniques to handle text data and taking advantage of the rating given by consumers to perform supervised training. This work presents two models developed from scratch to address the case the study problem using recurrent neural networks and self-attention: Recurrent Encoder Classifier and Attentive Recurrent Encoder Classifier. These models were subject to two heuristic-based optimization procedures: a discrete genetic algorithm to select an optimal set of hyperparameters and optimal architecture and a grid search algorithm to optimize the text preprocessing steps. The usage of deep learning models with Portuguese data is limited; hence, the gain in performance was evaluated against classical machine learning models trained on Zomato’s dataset, verifying an improvement of 3% in F1-score. The genetic algorithm yielded a relative obtainable improvement score of 4.4% and 8.3% on the recurrent and attentive recurrent encoders architectures, respectively, against their baseline configuration, with the possibility of further optimization by increasing the number of generations. The grid search algorithm slightly improved the performance of each architecture. Both had comparable results, where the Attentive Recurrent Encoder Classifier presented the best performance with 76% of F1-score, 92.5% of ROC-AUC, and 82.7% of accuracy.

Finally, a trained Recurrent Encoder Classifier model was deployed onto a single-board computer, using the Whisper base model for audio transcription. This implementation demonstrates the feasibility of the proposed approach for sentiment analysis in real-world, resource-constrained environments. The results of the study demonstrate that deep learning algorithms can effectively analyze sentiment and provide accurate predictions without relying on pre-trained models.

Keywords: natural language processing, sentiment analysis, Portuguese language, deep learning, genetic algorithm, edge computing.