Question 1 of 30
Dr. Anya Sharma, a computational linguist, is developing a multilingual document processing system for a global news organization. The system needs to accurately handle news articles containing text in Latin, Cyrillic, Arabic, and Chinese scripts. During testing, Anya discovers that while the system correctly identifies the scripts using ISO 15924 codes, it struggles to properly render mixed-script text, leading to display errors and incorrect text processing. For instance, sentences containing both English (Latin script) and Russian (Cyrillic script) are sometimes displayed with incorrect character ordering or missing glyphs. Similarly, articles containing Arabic script mixed with Latin script exhibit issues with right-to-left text directionality. Chinese characters, when interspersed with Latin text, occasionally cause font rendering problems.

Given this scenario, which of the following approaches is the MOST appropriate for Anya to ensure accurate and consistent representation and processing of multilingual text in her system, considering the interplay between ISO 15924, Unicode, and software application capabilities?
Leverage Unicode's built-in support for script mixing, including combining characters, bidirectional text handling, and language tags, ensuring the system uses appropriate fonts for each script and correctly interprets character properties based on language context.
Primarily rely on ISO 15924 codes to dictate the rendering of each script segment, implementing custom rendering engines for each script identified, and forcing a uniform left-to-right rendering direction for all text.
Convert all non-Latin scripts to their closest Latin script equivalents using transliteration algorithms before processing, thereby simplifying the rendering process and ensuring compatibility across different software applications.
Implement a custom encoding scheme that assigns unique numerical identifiers to all characters across all scripts, bypassing Unicode entirely, and ensuring all software components use this custom encoding for text representation.
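
For readers unfamiliar with the Unicode character properties mentioned in the options, here is a minimal Python sketch using only the standard `unicodedata` module. It is not part of the scenario's system. The helper names `has_rtl` and `rough_script` are hypothetical, and guessing a script from the character name is a simplification made for illustration. A real system would rely on the Unicode Script property and a full bidirectional algorithm implementation.

```python
import unicodedata


def has_rtl(text):
    """Return True if any character has a strong right-to-left bidi class."""
    # "R" covers scripts such as Hebrew; "AL" covers Arabic letters.
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


def rough_script(ch):
    """Guess a script from the Unicode character name (illustrative only)."""
    name = unicodedata.name(ch, "")
    for script in ("LATIN", "CYRILLIC", "ARABIC", "CJK"):
        if name.startswith(script):
            return script
    return "OTHER"


# Mixed-script sample: Latin, Cyrillic, Arabic, Chinese.
sample = "News новости أخبار 新闻"

for ch in sample:
    if not ch.isspace():
        # Print the character, its guessed script, and its bidi class.
        print(ch, rough_script(ch), unicodedata.bidirectional(ch))

# Combining characters: NFC normalization composes "e" + U+0301 into U+00E9.
composed = unicodedata.normalize("NFC", "e\u0301")
print(len(composed))
```

The per-character bidi class is the kind of property a renderer consults to order mixed left-to-right and right-to-left runs, which an ISO 15924 script code alone does not provide.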
