A RAG-Based Sentiment Analysis Proposal: Classifying a Dataset About Citizen Security
This paper presents an approach to sentiment analysis focused on citizen security. The analyzer uses Retrieval-Augmented Generation (RAG) to improve the performance of an unsupervised approach and to leverage the ability of LLMs to understand domain-specific language. Through prompt engineering, the analyzer works at two levels, aspect level and sentence level, and returns a response that includes both the sentiment and the reason for that classification. The proposal is supported by a web application that facilitates the creation of knowledge bases for other contexts and the monitoring of related areas, enabling informed decision-making on security policies. In the evaluation, experiments on a dataset about Ecuadorian citizen security showed that both the knowledge base and the information retrieval component help the language models better interpret the text under analysis. The results show that LLMs perform particularly well at detecting negative sentiment; for example, the analyzer based on Llama3-8B-Instruct achieves 95% accuracy in this category.
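The retrieve-then-prompt flow summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the knowledge-base entries, the token-overlap retriever, the stop-word list, the prompt wording, and the names `retrieve` and `build_prompt` are all assumptions made for illustration, and the call to the language model itself is left out.

```python
import re
from collections import Counter

# Illustrative knowledge-base entries (assumed, not taken from the paper).
KNOWLEDGE_BASE = [
    "A 'sicariato' is a contract killing and signals a severe security threat.",
    "'Vacuna' is slang for an extortion payment demanded by criminal groups.",
    "A 'UPC' is a community police unit that serves a neighborhood.",
]

# Small assumed stop-word list so common words do not drive the ranking.
STOPWORDS = {"a", "an", "the", "is", "for", "and", "by", "that", "of", "to"}


def tokenize(text):
    """Lowercase the text and keep word tokens that are not stop words."""
    return [t for t in re.findall(r"\w+", text.lower()) if t not in STOPWORDS]


def retrieve(query, documents, top_k=2):
    """Rank documents by token overlap with the query.

    This is a simple stand-in for a real retriever (for example, one based
    on embeddings); it only returns documents sharing at least one token.
    """
    query_counts = Counter(tokenize(query))
    scored = []
    for doc in documents:
        overlap = sum((query_counts & Counter(tokenize(doc))).values())
        scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(text, context, level="sentence", aspect=None):
    """Assemble a prompt asking for a sentiment label and the reason for it."""
    if level not in ("sentence", "aspect"):
        raise ValueError("level must be 'sentence' or 'aspect'")
    if level == "aspect" and not aspect:
        raise ValueError("aspect-level analysis needs an aspect")
    target = f"the aspect '{aspect}'" if level == "aspect" else "the whole sentence"
    context_block = "\n".join(f"- {entry}" for entry in context) or "- (none)"
    return (
        "You are a sentiment analyzer for citizen security texts.\n"
        f"Domain knowledge:\n{context_block}\n\n"
        f"Text: {text}\n"
        f"Classify the sentiment toward {target} as positive, negative, or neutral.\n"
        'Answer as JSON: {"sentiment": "...", "reason": "..."}'
    )


if __name__ == "__main__":
    text = "They asked my shop for a vacuna again this month."
    context = retrieve(text, KNOWLEDGE_BASE)
    print(build_prompt(text, context, level="sentence"))
```

The resulting prompt string would then be sent to an instruction-tuned model such as Llama3-8B-Instruct, and the JSON answer parsed to obtain the sentiment and its justification. Passing `level="aspect"` together with an `aspect` value switches the same template to aspect-level analysis.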
