Extracting Complex Named Entities in Legal Documents via Weakly Supervised Object Detection

2023 “Extracting Complex Named Entities in Legal Documents via Weakly Supervised Object Detection.” In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’23), July 23–27, 2023, Taipei, Taiwan.

“Accurate Named Entity Recognition (NER) is crucial for various information retrieval tasks in industry. However, despite significant progress in traditional NER methods, the extraction of Complex Named Entities remains a relatively unexplored area. In this paper, we propose a novel system that combines object detection for Document Layout Analysis (DLA) with weakly supervised learning to address the challenge of extracting discontinuous complex named entities in legal documents. Notably, to the best of our knowledge, this is the first work to apply weak supervision to DLA. Our experimental results show that the model trained solely on pseudo labels outperforms the supervised baseline when gold-standard data is limited, highlighting the effectiveness of our proposed approach in reducing the dependency on annotated data.”

Context-Aware Classification of Legal Document Pages

Fragkogiannis, Pavlos, Martina Forster, Grace E. “Context-Aware Classification of Legal Document Pages.” In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’23), July 23–27, 2023, Taipei, Taiwan.

“For many business applications that require the processing, indexing, and retrieval of professional documents such as legal briefs (in PDF format etc.), it is often essential to classify the pages of any given document into their corresponding types beforehand. Most existing studies in the field of document image classification either focus on single-page documents or treat multiple pages in a document independently. Although in recent years a few techniques have been proposed to exploit the context information from neighbouring pages to enhance document page classification, they typically cannot be utilized with large pre-trained language models due to the constraint on input length. In this paper, we present a simple but effective approach that overcomes the above limitation. Specifically, we enhance the input with extra tokens carrying sequential information about previous pages - introducing recurrence - which enables the usage of pre-trained Transformer models like BERT for context-aware page classification. Our experiments conducted on two legal datasets in English and Portuguese respectively show that the proposed approach can significantly improve the performance of document page classification compared to the non-recurrent setup as well as the other context-aware baselines.”

Uncertainty Quantification for Text Classification

Zhang, Dell, Murat Sensoy, Masoud Makrehchi, and Bilyana Taneva-Popova. “Uncertainty Quantification for Text Classification.” In Advances in Information Retrieval - 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, April 2-6, 2023, Proceedings, Part III, edited by Jaap Kamps, Lorraine Goeuriot, Fabio Crestani, Maria Maistro, Hideo Joho, Brian Davis, Cathal Gurrin, Udo Kruschwitz, and Annalina Caputo, 13982:362–69.

“This half-day tutorial introduces modern techniques for practical uncertainty quantification specifically in the context of multi-class and multi-label text classification.”