Recent Submissions
Leveraging retrieval-augmented generation for student support: a document-centric QA system for the AUEB informatics studies guide
(2025-07) Mitsakis, Nikos; Μητσάκης, Νικόλαος; Androutsopoulos, Ion; Stafylakis, Themos
This thesis examines the design, development, and evaluation of a Retrieval-Augmented Generation (RAG) system built to support undergraduate students in the Department of Informatics at the Athens University of Economics and Business (AUEB). The central objective is to create a cost-effective yet high-quality AI assistant capable of answering questions about the Studies Guide, ensuring that all responses are explicitly grounded in the latest edition of the department's official Studies Guide. To achieve this, the system ingests the newest version of the Studies Guide and represents its contents at three levels of granularity: chunks (bodies of text corresponding to paragraphs or groups of paragraphs on a specific topic, based on the document’s structure), sentences (extracted by sentence-tokenizing each chunk), and propositions (decontextualized factual statements synthetically generated from the chunks). The retrieval architecture explores traditional lexical search (BM25), dense vector search, and a hybrid ensemble retriever to maximize retrieval coverage and relevance. Question-answering capabilities are assessed using both real-world and synthetic QA pairs, with the generation module leveraging self-hosted state-of-the-art large language models (LLMs). The thesis conducts a comprehensive evaluation across all document granularities and retrieval configurations, employing both classical information retrieval metrics and more modern LLM-based evaluation. Results demonstrate the feasibility of delivering a factual, responsive, and modular assistant using modest computational resources. The thesis further discusses the limitations and potential extensions of the approach, aiming to provide a blueprint for deploying similar RAG-based assistants in other academic settings.
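As a rough illustration of how a lexical ranking and a second ranking can be combined into a hybrid result, the sketch below scores a few toy passages with BM25 and then merges two ranked lists. The abstract does not say how the thesis fuses its retrievers; reciprocal rank fusion (RRF) is an assumption chosen here for simplicity, and the function names, the parameter defaults (`k1`, `b`, `k`), and the sample passages are all hypothetical.

```python
import math
from collections import Counter


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return document indices ordered by Okapi BM25 score, best first."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Non-negative idf variant (adds 1 inside the logarithm).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)


def rrf_fuse(rankings, k=60):
    """Merge ranked lists of ids with reciprocal rank fusion (assumed method)."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical passages standing in for Studies Guide chunks.
passages = [
    "elective courses require 6 ECTS credits",
    "the thesis is submitted in the final semester",
    "credits for the thesis count toward the degree",
]
lexical = bm25_rank("thesis credits", passages)
# A dense retriever would supply its own ranking; this one is made up.
dense = [1, 2, 0]
hybrid = rrf_fuse([lexical, dense])
```

In a real system the `dense` list would come from an embedding model and a vector index; it is hard-coded here only to keep the sketch self-contained.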
Retrieval augmented generation on regulatory documents
(2025-06-20) Chasandras, Ioannis; Χασάνδρας, Ιωάννης; Androutsopoulos, Ion; Chlapanis, Odysseas
This thesis investigates the application of Retrieval Augmented Generation (RAG) in regulatory procedures through the emerging field of Regulatory NLP. Based on real-world regulatory documents, the study evaluates the performance of commercial retrieval models and introduces advanced, hybrid retrieval techniques tailored for legal compliance tasks. Given the critical need for precision and completeness in the legal domain, new algorithms that utilize Large Language Models (LLMs) are developed to enhance regulatory question-answering. The work also includes an adversarial evaluation of RePASs, a metric focused on legal obligations. Through participation in the RIRAG-2025 shared task, the thesis demonstrates both the promise and current limitations of AI systems in regulatory settings, emphasizing the need for further exploration in this field.
Shipping and sustainability: preventive measures and legislation for the prevention and reduction of pollution, and their effects on the economy
(2025-10-01) Σπυροπούλου, Αικατερίνη; Χατζησταμούλου, Νικόλαος; Landis, Conrad Felix Michel; Χάλκος, Γεώργιος
Maritime transport has always been one of the principal means of moving goods and people. Over the years, the need arose for legislation restricting the unchecked and reckless use of environmentally harmful transport practices. Action to this end has been taken jointly by international and national organizations. Legislation for preventing and addressing pollution, prompted by the intensifying greenhouse effect among other concerns, is continually evolving to adapt to the changing environment, with sustainability as its goal. It is evident that all international bodies strive to act in the best possible way to reduce the impact a ship can have on the environment, whether that impact concerns people or the environment itself. Upgrading their ships is also among the main approaches taken by companies, which are increasingly adopting renewable energy sources. There is even talk of partially or fully unmanned ships. Several questions nevertheless arise from time to time: How far can the development of ships go, and how much does it actually benefit the environment? Is it easy for companies to adapt to all the changes that legislation requires within such a short period? How is the economy affected, and how does it react to these changes?
Forecasting in panel data
(2025-11-13) Pavelis, Konstantinos; Παβέλης, Κωνσταντίνος; Topaloglou, Nikolaos; Dendramis, Yiannis; Alexopoulos, Angelos
This study examines the use of econometric forecasting methods within a panel data framework, combining traditional approaches with modern dimensionality reduction techniques such as Factor Models, Fixed Effects Models and Principal Component Analysis (PCA). The research aims to demonstrate how the integration of information from panel data improves both the accuracy and explanatory power of forecasts for economic and financial variables, compared to classical time series methods. The empirical analysis is based on daily data from fifteen major global stock indices, covering the United States, Europe and Asia, for the period 2014-2025. The dataset includes the variables Open, High, Low and Change, which are analyzed through econometric models such as Fixed Effects, Random Effects, and PCA Factor models. The analysis was conducted using the R programming language in the RStudio environment, taking advantage of its efficiency in processing and analyzing large and complex datasets. The results show that combining the panel data structure with dimensionality reduction techniques leads to significant improvements in predictive accuracy, offering deeper insights into the mechanisms driving market behavior. Indices such as S&P 500 and CAC40 displayed particularly high adjusted R² values, indicating that a small set of latent factors extracted through PCA can effectively capture most of the variation in returns. Conversely, indices such as Nikkei 225 and Hang Seng showed lower explanatory power, suggesting the presence of regional or idiosyncratic factors not fully captured by the models. The PCA Factor Models demonstrated that the first eight components were sufficient to explain approximately 80% of the variance in major indices, with PC3 and PC6 emerging as statistically significant predictors across most markets.
Despite issues of heteroskedasticity and non-normal residuals, the use of robust standard errors provided reliable inference, with no signs of multicollinearity or autocorrelation. Overall, the findings indicate that the combination of Fixed/Random Effects and Factor Models provides a comprehensive and effective framework for financial forecasting. This integrated approach yields statistically and theoretically sound results, enhances the understanding of global financial markets, and contributes to the development of evidence-based economic policymaking.
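To make the fixed-effects idea in this abstract concrete, the sketch below computes the within (entity-demeaned) slope for a single regressor. It is only an illustration under stated assumptions: the study itself was carried out in R, the function name `within_slope` is hypothetical, and the two-index data set is invented so that each index has its own intercept but a shared slope of 2.

```python
from collections import defaultdict


def within_slope(entities, x, y):
    """Fixed-effects (within) estimate of the slope for one regressor.

    Each observation is demeaned by its entity's own averages, which
    removes any entity-specific intercept before the slope is fitted.
    """
    groups = defaultdict(list)
    for entity, xi, yi in zip(entities, x, y):
        groups[entity].append((xi, yi))
    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        mean_x = sum(o[0] for o in obs) / len(obs)
        mean_y = sum(o[1] for o in obs) / len(obs)
        for xi, yi in obs:
            sxy += (xi - mean_x) * (yi - mean_y)
            sxx += (xi - mean_x) ** 2
    return sxy / sxx


# Invented panel: index "A" follows y = 10 + 2x, index "B" follows y = -5 + 2x.
entities = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [12.0, 14.0, 16.0, 3.0, 5.0, 7.0]
slope = within_slope(entities, x, y)
```

On this invented data a pooled regression that ignores the entity intercepts would fit a negative slope, while the within estimator recovers the common slope, which is the motivation for using entity effects in a panel setting.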
Introduction to hidden Markov models and their application to financial theory
(2025-11-04) Barkolias, Evangelos-Panagiotis; Μπαρκολιάς, Ευάγγελος-Παναγιώτης; Vrontos, Ioannis; Giannakopoulos, Thanasis; Besbeas, Panagiotis
Hidden Markov Models (HMMs) emerged in the late 1960s as a statistical framework designed to extract latent information from data characterized by uncertainty. Their ability to capture hidden structure beyond observable variables soon made them highly relevant for financial applications, where volatility clustering, regime shifts, and non-normality are pervasive. Before turning to empirical application, it is important to first review the theoretical background that underpins HMMs, ensuring a clear understanding of the statistical concepts on which they are built. Building on this foundation, the thesis investigates the modeling of stock returns, beginning with models without temporal dependence and gradually extending to fully Markovian structures, highlighting the crucial role of state dependence in improving both interpretability and predictive power. Methodologically, the research employs Direct Numerical Maximization for parameter estimation and evaluates state sequences. The results show that incorporating state dependence not only improves the statistical characterization of stock return distributions but also yields interpretable latent states corresponding to calm and turbulent regimes. Furthermore, the analysis emphasizes the importance of approaching financial time series from a purely statistical perspective while also ensuring robust optimization and reliable inference.
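Direct numerical maximization works by handing the HMM log-likelihood to a general-purpose optimizer, so the core ingredient is a function that evaluates that likelihood. The sketch below is one possible version for Gaussian state-dependent distributions, using the scaled forward recursion to avoid numerical underflow. The Gaussian emission choice, the function names, and the parameter values in the usage lines are assumptions for illustration and are not taken from the thesis.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def hmm_log_likelihood(obs, init, trans, means, sds):
    """Log-likelihood of obs under a Gaussian HMM (scaled forward algorithm).

    init[j]     : probability of starting in state j
    trans[i][j] : probability of moving from state i to state j
    means, sds  : per-state parameters of the normal emission densities
    """
    n_states = len(init)
    alpha = [init[j] * normal_pdf(obs[0], means[j], sds[j])
             for j in range(n_states)]
    total = sum(alpha)
    log_lik = math.log(total)
    alpha = [a / total for a in alpha]  # rescale so the vector sums to 1
    for x in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states))
            * normal_pdf(x, means[j], sds[j])
            for j in range(n_states)
        ]
        total = sum(alpha)
        log_lik += math.log(total)
        alpha = [a / total for a in alpha]
    return log_lik


# Hypothetical two-regime setup: a calm state and a turbulent state.
returns = [0.004, -0.002, 0.001, -0.035, 0.028, -0.041]
log_lik = hmm_log_likelihood(
    returns,
    init=[0.5, 0.5],
    trans=[[0.95, 0.05], [0.10, 0.90]],
    means=[0.001, -0.002],
    sds=[0.01, 0.03],
)
```

An optimizer would then search over the transition probabilities, means, and standard deviations (usually after reparameterizing them to be unconstrained) to maximize this value.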
