Challenges in Automatic Analysis of Privacy Policy Documents with LLMs
Large language models (LLMs) have shown potential in privacy policy analysis, making these long and complex documents more digestible. LLMs can summarize policy documents for the user and extract information such as third-party actors from the text. In practice, however, their usability is limited both by properties of the policies themselves, such as vague and ambiguous wording, and by well-known weaknesses of LLMs: hallucinations, nondeterministic outputs, and context length limitations, all of which undermine the precision and reliability of the resulting analysis. These issues emphasize the need for mitigations such as careful prompt design, hybrid approaches, retrieval-augmented generation, and thorough validation of model outputs. Applied correctly, these techniques can significantly improve the reliability and accuracy of LLM-based policy analysis.
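One of the validation steps mentioned above can be sketched concretely. The snippet below is a minimal, hypothetical example (the function name, sample policy text, and simulated model response are all illustrative, not from any real system): it parses an LLM's JSON answer listing third-party actors and keeps only names that literally appear in the policy text, a simple grounding check against hallucinated entries and malformed output.

```python
import json

def validate_extraction(policy_text: str, llm_response: str) -> list[str]:
    """Parse an LLM's JSON list of third-party names and keep only
    those that actually occur in the policy text (a basic grounding
    check against hallucinated entries)."""
    try:
        candidates = json.loads(llm_response)
    except json.JSONDecodeError:
        # Nondeterministic formatting: reject output that is not valid JSON.
        return []
    if not isinstance(candidates, list):
        return []
    lowered = policy_text.lower()
    # Keep only string entries grounded in the source document.
    return [c for c in candidates
            if isinstance(c, str) and c.lower() in lowered]

# Illustrative policy excerpt and simulated model response:
policy = ("We share usage data with Google Analytics and Facebook "
          "for advertising purposes.")
response = '["Google Analytics", "Facebook", "Amazon"]'  # "Amazon" is hallucinated
print(validate_extraction(policy, response))  # → ['Google Analytics', 'Facebook']
```

A substring check like this is deliberately crude; real pipelines would combine it with schema validation and fuzzy matching, but it illustrates why output validation is a necessary layer around LLM extraction.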
