Bachelor's theses

Permanent URI for this collection: https://pyxida.aueb.gr/handle/123456789/11719

Now showing 1 - 3 of 3

Retrieval augmented generation on regulatory documents
(2025-06-20) Chasandras, Ioannis; Χασάνδρας, Ιωάννης; Androutsopoulos, Ion; Chlapanis, Odysseas
This thesis investigates the application of Retrieval Augmented Generation (RAG) in regulatory procedures through the emerging field of Regulatory NLP. Based on real-world regulatory documents, the study evaluates the performance of commercial retrieval models and introduces advanced, hybrid retrieval techniques tailored for legal compliance tasks. Given the critical need for precision and completeness in the legal domain, new algorithms that utilize Large Language Models (LLMs) are developed to enhance regulatory question-answering. The work also includes an adversarial evaluation of RePASs, a metric focused on legal obligations. Through participation in the RIRAG-2025 shared task, the thesis demonstrates both the promise and current limitations of AI systems in regulatory settings, emphasizing the need for further exploration in this field.
Leveraging retrieval-augmented generation for student support: a document-centric QA system for the AUEB informatics studies guide
(2025-07) Mitsakis, Nikos; Μητσάκης, Νικόλαος; Androutsopoulos, Ion; Stafylakis, Themos
This thesis examines the design, development, and evaluation of a Retrieval-Augmented Generation (RAG) system specifically designed to support undergraduate students in the Department of Informatics at the Athens University of Economics and Business (AUEB). The central objective is to create a cost-effective yet high-quality AI assistant capable of answering studies guide-related questions, ensuring that all responses are explicitly grounded in the latest edition of the department's official Studies Guide. To achieve this, the system ingests the newest version of the Studies Guide. It represents its contents at three levels of granularity: chunks (bodies of text corresponding to paragraphs or groups of paragraphs on a specific topic, based on the document’s structure), sentences (extracted by sentence tokenizing each chunk), and propositions (decontextualized factual statements synthetically generated from the chunks). The retrieval architecture explores traditional lexical search (BM25), dense vector search, and a hybrid ensemble retriever to maximize retrieval coverage and relevance. Question-answering capabilities are assessed using both real-world and synthetic QA pairs, with the generation module leveraging self-hosted state-of-the-art large language models (LLMs). The thesis conducts a comprehensive evaluation across all document granularities and retrieval configurations, employing both classical information retrieval metrics and more modern LLM-based evaluation. Results demonstrate the feasibility of delivering a factual, responsive, and modular assistant using modest computational resources. The thesis further discusses the limitations and potential extensions of the approach, aiming to provide a blueprint for deploying similar RAG-based assistants in other academic settings.
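The abstract above mentions combining lexical search (BM25) with dense vector search in a hybrid ensemble retriever, but does not say how the two rankings are merged. The following is a minimal sketch under stated assumptions, not the thesis's implementation: it scores documents with a plain Okapi BM25 and fuses rankings with reciprocal rank fusion, a common fusion choice picked here purely for illustration. All function names and the parameter values `k1`, `b`, and `k` are hypothetical; the dense ranking is expected to come from an embedding model that is not shown.

```python
import math
from collections import Counter


def tokenize(text):
    """Very simple whitespace tokenizer (an assumption, for illustration only)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def rank(scores):
    """Return document indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse rankings: each document earns 1 / (k + position) per ranking."""
    fused = Counter()
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + position)
    return sorted(fused, key=lambda d: fused[d], reverse=True)
```

In use, `rank(bm25_scores(query, chunks))` would give the lexical ranking, a second list of indices would come from the dense retriever, and `reciprocal_rank_fusion([lexical, dense])` would return the merged order. The same functions apply unchanged whether the indexed units are chunks, sentences, or propositions.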
Tuples-DMM: a retrieval-enhanced concept-driven guided decoding algorithm
(2024-10-25) Plavos, Dimosthenis; Πλαβός, Δημοσθένης; Pavlopoulos, Ioannis
Automatic captioning of medical images is an evolving area of Artificial Intelligence that involves automatically generating descriptive captions for such images. It is driven by advances in imaging technologies and by the growing number of patients, which have led to a large number of radiological images being produced in healthcare facilities worldwide. Analysing these images takes a significant amount of clinicians' time, which makes automating the process a means of saving time. Automatically generated captions can also serve as tools for guiding the diagnostic process or confirming clinicians' findings. This thesis focuses on Diagnostic Captioning, which refers to generating textual descriptions that aim to identify and convey diagnostic information from medical images. For its implementation, it uses the ImageCLEFmedical 2023 dataset. The proposed Tuples-DMM method builds on DMM (Distance from Median Maximum), a guided decoding methodology based on "central concepts" that was introduced by Kaliosis et al. [Kal+24]. DMM generates captions that explicitly or implicitly incorporate the concepts associated with a medical image, in line with how those concepts are represented in the training examples. Tuples-DMM and its variants aim to retrieve the most relevant training data and to modify the DMM algorithm. The goal is to improve guided generation by avoiding the influence of training data that represent unrelated semantic topics and by focusing on semantically relevant training data, in order to achieve more accurate and semantically meaningful captions.

Browsing Bachelor's theses by Subject "Natural Language Processing (NLP)"