Enhancing Multilingual Information Extraction Towards Global Linguistic Inclusivity

dc.contributor.advisor: Nguyen, Thien Huu
dc.contributor.author: Nguyen, Van Minh
dc.date.accessioned: 2024-08-07T21:18:04Z
dc.date.available: 2024-08-07T21:18:04Z
dc.date.issued: 2024-08-07
dc.description.abstract: In our interconnected world, the diversity of around 7,000 languages presents challenges and opportunities for bridging language barriers. Multilingual information extraction (Multilingual IE) is crucial in natural language processing (NLP) for extracting information from texts across languages, facilitating global understanding and information equity. Despite advancements, the focus on high-resource languages has marginalized speakers of less-represented languages. Multilingual IE seeks to correct this by embracing linguistic diversity and inclusivity. This dissertation enhances Multilingual IE to address challenges of linguistic diversity, data scarcity, and model generalization, aiming to make IE technologies more accessible. It focuses on developing sophisticated algorithms for tasks like event trigger detection, event argument extraction, entity mention recognition, and relation extraction. The goal is to create a system capable of accurate information extraction across diverse languages, supporting global communication and cultural preservation. Furthermore, the importance of IE in the era of large language models (LLMs) remains significant. While LLMs have broadened NLP's capabilities, the precise, context-specific information provided by IE is essential, especially in retrieval-augmented generation (RAG) settings. This underscores IE's ongoing relevance, ensuring LLMs retrieve accurate, relevant information and highlighting IE's critical role in advancing NLP. [en_US]
dc.identifier.uri: https://hdl.handle.net/1794/29718
dc.language.iso: en_US
dc.publisher: University of Oregon
dc.rights: All Rights Reserved.
dc.subject: information extraction [en_US]
dc.subject: information retrieval [en_US]
dc.subject: large language models [en_US]
dc.subject: multilingual [en_US]
dc.subject: natural language processing [en_US]
dc.subject: question answering [en_US]
dc.title: Enhancing Multilingual Information Extraction Towards Global Linguistic Inclusivity
dc.type: Electronic Thesis or Dissertation
thesis.degree.discipline: Department of Computer Science
thesis.degree.grantor: University of Oregon
thesis.degree.level: doctoral
thesis.degree.name: Ph.D.

Files

Original bundle
Name: Nguyen_oregon_0171A_13809.pdf
Size: 3.02 MB
Format: Adobe Portable Document Format