ICDAR 2026 Workshop on
Documents Analysis of Low-resource Languages
Motivation
Low-resource document analysis matters for several reasons, spanning cultural preservation, data scarcity, linguistic research, and technological applications. First, low-resource languages often embody unique cultural and historical contexts. Document analysis facilitates the digitization and preservation of these linguistic materials, providing crucial resources for understanding human history and cultural evolution. For instance, many endangered languages have large collections of scanned documents that can be analyzed to create valuable linguistic and cultural repositories. Second, low-resource languages typically lack large-scale annotated datasets, which makes training machine learning models difficult. Document analysis techniques, such as Optical Character Recognition (OCR) and document layout analysis, enable the extraction and structuring of data from existing documents, thereby mitigating data scarcity. Third, document analysis plays a pivotal role in improving machine translation. Monolingual data extracted through OCR can be used to improve machine translation for low-resource languages, which is particularly critical for languages with limited parallel corpora. Fourth, document analysis supports linguistic research by enabling the study of language variation and historical documentation, shedding light on the evolution and unique features of these languages. Finally, document analysis makes low-resource language documents more accessible and usable. For example, advances in OCR systems for non-Latin scripts allow researchers to extract text more efficiently from scanned documents, enabling applications such as content summarization and information retrieval. In summary, low-resource document analysis is not only a vital tool for cultural preservation but also a key driver of language technology development and academic research.
Tentative Schedule
All times in Beijing Time (UTC+08:00)
| Time | Events |
| 09:00 - 09:20 | Opening Remarks |
| 09:20 - 09:40 | Coffee break |
| 09:40 - 12:30 | Presentation |
| 12:30 - 12:40 | Conclusion |
| 12:40 - 13:10 | Poster |
Call for Papers
Acceptable submission topics may include but are not limited to:
- Document Image Processing
- Optical Character Recognition (OCR) for Printed Text in Low-Resource Languages
- Logical layout analysis
- Handwriting Text Recognition (HTR) for manuscripts and historical documents
- Natural Language Processing for Understanding Documents Written in Low-Resource Languages
- Scene Text Detection and Recognition for Low-Resource Languages
- Gold-Standard Benchmarks and Datasets for Low-Resource Languages
- Document analysis systems for Low-Resource Languages
- Multimodal Large Language Models (MLLMs) for Low-Resource Document Understanding
- Large Language Model (LLM)-Driven Document Analysis Technologies
Submission
This workshop invites original contributions in both theoretical and applied research domains. All submissions must adhere to the formatting guidelines specified on the ICDAR 2026 official website.
- Manuscripts not adhering to the page limit (maximum 17 pages all included), the formatting guidelines (Springer Lecture Notes format) or the anonymization requirements will be rejected without review.
- The authorship of submitted manuscripts is final. No changes to the list of authors will be allowed once a paper has been submitted!
- The review of conference papers will be double blind. Authors should not include their names, affiliations, or acknowledgements in submitted manuscripts, and should cite their own earlier work in the third person so that their identity is not revealed indirectly. Authors will be given the opportunity to respond to reviews in a rebuttal phase.
- Submitted papers must adhere to the Springer Lecture Notes in Computer Science (LNCS) format (Springer Guidelines/Templates).
Submissions will be accepted through the workshop's EasyChair submission portal. At least one author of each accepted paper must complete workshop registration to present the work. Detailed submission procedures are available on the ICDAR 2026 official website.
Contact
Important Dates
- Submission Deadline: May 1, 2026
- Decisions Announced: June 1, 2026
- Camera Ready Deadline: June 25, 2026
Publication
Accepted papers will be published in the ICDAR 2026 workshop proceedings.
Workshop Chairs
- Yong Tso, Xizang University, China
- Brian Kenji Iwana, Kyushu University, Japan
- Yongbin Yu, University of Electronic Science and Technology of China, China
Program Committee Members
- To Be Announced
Short CV of the Workshop Chairs
Prof. Yong Tso. Yong Tso is a professor at Xizang University, China, and a senior member of the Chinese Association of Artificial Intelligence. Her main research area is Artificial Intelligence (Few Shot Learning), focusing on the intelligent analysis of ancient Tibetan literature, including the digitization of ancient books, knowledge extraction, and language modeling. She has served as Principal Investigator (PI) for research projects including National Natural Science Foundation of China (NSFC) grants and a sub-project under the National Key R&D Program of China (NKPs), and has won two first prizes and two second prizes in science and technology of the Xizang Autonomous Region. She also served as the chief engineer for the National Key Research and Development project "Integration and Application Demonstration of Digital Technology for Tibetan Ancient Books and Documents". She has been a national visiting scholar at the University of Bergen in Norway, the University of Virginia in the United States, and the University of British Columbia in Canada.
Prof. Brian Kenji Iwana. Brian Kenji Iwana is an Associate Professor at the Department of Advanced Information Science in the Graduate School of Information Science and Electrical Engineering, Kyushu University. He received a B.S. in Computer Engineering from the University of California, Irvine, USA. After completing his bachelor's degree, he worked as a software developer at the National Aeronautics and Space Administration (NASA) in Mountain View, California. He then returned to academia and received a Ph.D. from the Graduate School of Information Science and Electrical Engineering, Kyushu University. He also graduated from the Graduate Education and Research Training Program in Decision Science for a Sustainable Society, Kyushu University. Since then, he has worked as a postdoctoral researcher, Assistant Professor, and then Associate Professor at the Graduate School of Information Science and Electrical Engineering, Kyushu University. He is also affiliated with the International Undergraduate Program In English (IUPE) and the Graduate Program of Interdisciplinary Policy Analysis and Design (GIPAD), both at Kyushu University. He is an Associate Editor for the journal Springer Nature Computer Science and has served on many international conference program committees, such as ICDAR, ICFHR, AAAI, ICPR, and DAS. His research interests include time series recognition, dynamic programming, artificial neural networks, document recognition, and natural language processing (NLP).
Prof. Yongbin Yu. Yongbin Yu is an Associate Professor at the School of Information and Software Engineering, University of Electronic Science and Technology of China (UESTC). He visited the University of Michigan at Ann Arbor, Ann Arbor, MI, USA, in 2013-2014, and the University of California at Santa Barbara, Santa Barbara, CA, USA, in 2016-2017. He worked as the Guest Deputy Director with the Department of Big Data Industry, Sichuan Provincial Economic and Information Commission, in 2018-2020. His research focuses on natural language processing, memristor-based neural networks, swarm intelligence, and big data. He carried out the project "Research and Application of Key Technologies in Tibetan Natural Language Processing" and won the First Prize of the Science and Technology Award of Tibet Autonomous Region in 2018. He has organized international academic events as Publicity Co-Chair and Technical Program Committee (TPC) member for conferences including ICCCAS.