ICDAR 2025 Workshop on

Documents Analysis of Low-resource Languages

September 16-21, 2025 @ Wuhan, Hubei, China


Motivation

Document analysis for low-resource languages matters for several reasons, spanning cultural preservation, data scarcity, linguistic research, and technological applications.

Firstly, low-resource languages often embody unique cultural and historical contexts. Document analysis facilitates the digitization and preservation of these linguistic materials, providing crucial resources for understanding human history and cultural evolution. For instance, many endangered languages possess vast amounts of scanned documents, which can be analyzed to create valuable linguistic and cultural repositories.

Secondly, low-resource languages typically lack large-scale annotated datasets, which makes it difficult to train machine learning models. Document analysis techniques, such as Optical Character Recognition (OCR) and document layout analysis, enable the extraction and structuring of data from existing documents, thereby mitigating data scarcity.

Thirdly, document analysis plays a pivotal role in enhancing machine translation. Monolingual data extracted through OCR can be used to improve machine translation for low-resource languages, which is particularly critical for languages with limited parallel corpora.

Additionally, document analysis supports linguistic research by enabling the study of language variation and historical documentation, shedding light on the evolution and unique features of these languages.

Finally, document analysis enhances the accessibility and usability of low-resource language documents. For example, advances in OCR systems for non-Latin scripts allow researchers to extract text more efficiently from scanned documents, enabling applications such as content summarization and information retrieval.

In summary, low-resource document analysis is not only a vital tool for cultural preservation but also a key driver of language technology development and academic research.


Tentative Schedule

All times in Beijing Time (UTC+08:00)

Time Events
09:00 - 09:10 Opening Remarks
09:10 - 10:00 Invited Talk: Prof. Nyima Trashi
10:00 - 10:20 Coffee break
10:20 - 12:00 Presentation
12:00 - 13:00 Discussion & Conclusion
12:00 - 14:00 Poster


Call for Papers

Acceptable submission topics may include but are not limited to:

  • Document Image Processing
  • Optical Character Recognition (OCR)
  • Logical Layout Analysis
  • Handwriting Recognition
  • Natural Language Processing for Document Understanding
  • Medical Document Analysis
  • Gold-Standard Benchmarks and Datasets for Low-Resource Languages
  • Document Analysis Systems for Low-Resource Languages


Submission

This workshop invites original contributions in both theoretical and applied research domains. All submissions must adhere to the formatting guidelines specified on the ICDAR 2025 official website. Papers are limited to 15 pages (excluding references) and must comply with the following double-blind review requirements:

  • Remove all author identifiers (names, affiliations, etc.) from the manuscript
  • Cite your own previous work in the third person to avoid disclosing your identity
  • Omit the acknowledgments section from initial submissions

Submissions will be accepted through the workshop's EasyChair submission portal. At least one author of each accepted paper must complete workshop registration to present the work. Detailed submission procedures are available on the ICDAR 2025 guidelines portal.


Important Dates

  • Submission Deadline: May 30, 2025
  • Decisions Announced: June 30, 2025
  • Camera Ready Deadline: July 05, 2025
  • Workshop Date: September 21, 2025


Publication

Accepted papers will be published in the ICDAR 2025 workshop proceedings.


Workshop Chairs

  • Yong Tso, Tibet University, China
  • Brian Kenji Iwana, Kyushu University, Japan
  • Yu Yongbin, University of Electronic Science and Technology of China, China


Program Committee Members

  • Nyima Trashi, Tibet University, China
  • Brian Kenji Iwana, Kyushu University, Japan
  • Harold MOUCHERE, Nantes University, France
  • Cheng Jian, University of Electronic Science and Technology of China, China
  • Anna Zhu, Wuhan University of Technology, China
  • Yu Yongbin, University of Electronic Science and Technology of China, China
  • Yong Tso, Tibet University, China
  • Rinchen Dongrub, Tibet University, China


Short CV of the Workshop Chairs


Prof. Yong Tso. Tso Yong is a professor at Tibet University, China, and a senior member of both the Chinese Association of Artificial Intelligence and the Chinese Computer Society. Her main research area is the intelligent analysis of ancient Tibetan literature, including the digitization of ancient books, knowledge extraction, and language modeling. She has served as Principal Investigator (PI) for research projects including National Natural Science Foundation of China (NSFC) grants and a sub-project under the National Key R&D Program of China (NKP), and has won two first prizes and two second prizes in science and technology from the Xizang Autonomous Region. She also served as chief engineer for the National Key Research and Development project "Integration and Application Demonstration of Digital Technology for Tibetan Ancient Books and Documents". She has been a national visiting scholar at the University of Bergen in Norway, the University of Virginia in the United States, and the University of British Columbia in Canada.


Prof. Brian Kenji Iwana. Brian Kenji Iwana is an Associate Professor at the Department of Advanced Information Science in the Graduate School of Information Science and Electrical Engineering, Kyushu University. He received a B.S. in Computer Engineering from the University of California, Irvine, USA. After completing his Bachelor's degree, he worked as a software developer at the National Aeronautics and Space Administration (NASA) in Mountain View, California. He then returned to academia and received a Ph.D. from the Graduate School of Information Science and Electrical Engineering, Kyushu University. He is also a graduate of the Graduate Education and Research Training Program in Decision Science for a Sustainable Society, Kyushu University. Since then, he has worked as a postdoctoral researcher, Assistant Professor, and then Associate Professor at the Graduate School of Information Science and Electrical Engineering, Kyushu University. He is also affiliated with the International Undergraduate Program in English (IUPE) and the Graduate Program of Interdisciplinary Policy Analysis and Design (GIPAD), both at Kyushu University. He is an Associate Editor for the journal Springer Nature Computer Science and has served on many international conference program committees, such as ICDAR, ICFHR, AAAI, ICPR, and DAS. His research interests include time series recognition, dynamic programming, artificial neural networks, document recognition, and natural language processing (NLP).


Prof. Yu Yongbin. Yongbin Yu is an Associate Professor at the School of Information and Software Engineering, University of Electronic Science and Technology of China (UESTC). His research focuses on artificial intelligence (neural networks, swarm intelligence, large language models), memristors, VLSI physical design, impulsive control, and stability analysis. He has served as Publicity Co-Chair and Technical Program Committee (TPC) member for international conferences including ICCCAS.


Invited Speakers


Prof. Nyima Trashi. Details will be updated soon.

Title:

Abstract: