The
Arabic and Hebrew text is bidirectional because the letters run right-to-left while the numbers run left-to-right. Any string with embedded text that runs in the opposite direction from the main text (English with embedded Arabic, for example) is also bidirectional. Bidirectional text presents interesting problems for correctly positioning carets, accurately locating selections, and correctly displaying multiple lines. Bidirectional and right-to-left text also present similar problems for moving the caret in the correct direction in response to right and left arrow key presses.
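As a rough illustration of this definition, the JDK class java.text.Bidi can report whether a string mixes directions. The sketch below is not part of the lesson's samples; the class name BidiCheck and the sample strings are made up for illustration.

```java
import java.text.Bidi;

/** Hypothetical helper that classifies strings with java.text.Bidi. */
final class BidiCheck {

    /** Returns true when the text contains both left-to-right and right-to-left runs. */
    static boolean isBidirectional(String text) {
        // DIRECTION_DEFAULT_LEFT_TO_RIGHT takes the base direction from the first
        // strong character and falls back to left-to-right if there is none.
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return bidi.isMixed();
    }

    public static void main(String[] args) {
        String english = "plain English";
        String hebrewWithDigits = "\u05D0\u05E0\u05D9 123"; // Hebrew letters followed by a number
        String englishWithArabic = "is \u0645\u0646";        // English with embedded Arabic

        System.out.println("english: " + isBidirectional(english));
        System.out.println("hebrewWithDigits: " + isBidirectional(hebrewWithDigits));
        System.out.println("englishWithArabic: " + isBidirectional(englishWithArabic));
    }
}
```

The Hebrew string with digits is reported as mixed because the number keeps its left-to-right order inside the right-to-left text.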
This lesson describes these issues and demonstrates how a
About the Examples
The following examples support foreign language text with
the
To see the foreign language text, you need to install a Unicode font such as Bitstream Cyberbit on your system. For example, with a Unicode font set up on your system, you can start the DrawSample application as follows to see a text string in Hebrew:
java DrawSample -text hebrew
The
Inserting Text
In editable text, a caret displays where the end user clicks to indicate the insertion point where the end user will enter text. When the insertion point falls between right-to-left text like Arabic and left-to-right text like English, the same character location in the source text (shown on top) maps to two insertion points on the display (shown on the bottom). One location is the insertion point for English text and the other is the insertion point for Arabic text. In the figure, character location 8 in the source text maps either to the space after the word "is" or to the first character of the right-to-left Arabic text in the displayed text.
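The idea that one source offset has two candidate display positions can be sketched with java.awt.font.TextHitInfo, which records both a character index and the side of that character. This is only an illustration of the data model, not code from the lesson's samples; the class name DualInsertionPoints is hypothetical, and offset 8 is borrowed from the figure description.

```java
import java.awt.font.TextHitInfo;

/** Hypothetical demo: two different hits that share one insertion offset. */
final class DualInsertionPoints {

    /** Returns the two hits that both correspond to the given insertion offset. */
    static TextHitInfo[] hitsAt(int offset) {
        return new TextHitInfo[] {
            TextHitInfo.beforeOffset(offset), // trailing edge of character (offset - 1)
            TextHitInfo.afterOffset(offset)   // leading edge of character (offset)
        };
    }

    public static void main(String[] args) {
        for (TextHitInfo hit : hitsAt(8)) {
            System.out.println("character " + hit.getCharIndex()
                    + (hit.isLeadingEdge() ? " (leading edge)" : " (trailing edge)")
                    + " -> insertion index " + hit.getInsertionIndex());
        }
    }
}
```

Both hits report the same insertion index, but they are attached to different characters. When those two characters run in opposite directions, the two hits can land at different places on the display.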
The
The next two figures display the same text, which consists mainly of right-to-left text with two embedded left-to-right words (Hello and Arabic). When the end user clicks on the o in Hello or on the space after the o, dual carets display.
Dual carets consist of a strong caret and a weak caret; in the figures, the strong caret is red and the weak caret is black. The carets represent boundaries between glyphs for selection highlighting, hit testing, and moving the caret with arrow keys.
The
Character Hit, Side, and Language
A click on the side of the o towards the Hebrew records that the end user clicked after the o, which is part of the English. This positions the weak (black) caret next to the o and the strong (red) caret in front of the H. If the end user enters English, it appears after the o, and if the end user enters Hebrew, it appears before the H.
A click on the space to the right of the o records that the end user clicked the space, which is part of the Hebrew. This positions the strong (red) caret next to the o and the weak caret (black) in front of the H. If the end user types English, it appears before the H, and if the end user types Hebrew, it appears after the o.
Caret Positioning
You might be wondering why the caret positions do not include the spaces on either side. Spaces are either left-to-right or right-to-left characters depending on what is next to them. If the characters on both sides of a space are the same kind of character, the space is that kind of character too. Spaces between Arabic words are treated like Arabic characters, and spaces between English words are treated like English characters. When the characters on the two sides differ, the space takes the overall direction of the paragraph: if the paragraph as a whole is left-to-right, the space is left-to-right, and if the paragraph as a whole is right-to-left, the space is right-to-left.
In the Hit Test sample, the overall text is right-to-left. The spaces on each side of Hello each have one neighbor that is left-to-right (the English) and one that is right-to-left (the Hebrew). Because the text is right-to-left, the spaces are right-to-left too, and the split carets appear next to the o and the H because the spaces, being right-to-left, belong to the right-to-left text on either side.
Hit Testing
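Before turning to hit testing itself, the space rule described under Caret Positioning can be checked with java.text.Bidi, which exposes the resolved level of every character (odd levels are right-to-left, even levels are left-to-right). This is a sketch rather than code from the Hit Test sample; SpaceDirection is a hypothetical class name, and the string is only the first few words of the sample's source text.

```java
import java.text.Bidi;
import java.util.Arrays;

/** Hypothetical demo: how the paragraph direction decides the direction of spaces. */
final class SpaceDirection {

    /** Returns the resolved level of each character; odd means right-to-left. */
    static int[] levels(String text, int baseDirection) {
        Bidi bidi = new Bidi(text, baseDirection);
        int[] result = new int[text.length()];
        for (int i = 0; i < result.length; i++) {
            result[i] = bidi.getLevelAt(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three Hebrew letters, space, "Hello", space, two Hebrew letters.
        // The spaces sit at offsets 3 and 9, each between Hebrew and English.
        String text = "\u05D0\u05E0\u05D9 Hello \u05DC\u05D0";

        int[] rtl = levels(text, Bidi.DIRECTION_RIGHT_TO_LEFT);
        int[] ltr = levels(text, Bidi.DIRECTION_LEFT_TO_RIGHT);

        System.out.println("right-to-left paragraph: " + Arrays.toString(rtl));
        System.out.println("left-to-right paragraph: " + Arrays.toString(ltr));
    }
}
```

In the right-to-left paragraph the two spaces receive the same odd level as the Hebrew around them, so they group with the Hebrew and not with Hello. With a left-to-right base direction the same spaces receive an even level and group with the English.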
In code, a point returned by a mouse click is passed to the
However, in the source text, position 5 is before the H, so
the
The source text is initialized in the order the words are spoken, not the order in which they are printed. The source text looks like this:
"\u05D0\u05E0\u05D9 Hello \u05DC\u05D0 \u05DE\u05D1\u05D9\u05DF " +
"\u05E2\u05D1\u05E8\u05D9\u05EA Arabic \u0644\u0645\u062C\u0645\u0648\u0639\u0629", map);
The first three unicode characters and the space
Determining the location of the insertion point is taken care of
by the
Here is the code to initialize the colors for the strong and weak carets:
private static final Color STRONG_CARET_COLOR = Color.red;
private static final Color WEAK_CARET_COLOR = Color.black;
Here is the code that draws the
Here is the
And here is the complete HitTestSample.java source code.
Selection Highlighting
The next figure shows how a contiguous range of characters in the source text (on the top) might not map to a contiguous highlight region on screen (on the bottom) if the selection range includes both left-to-right and right-to-left characters. When the Arabic text is turned around to run right-to-left on the display, the selected portion of the Arabic text is not contiguous with the word "is" and the space that precede it, although these characters are contiguous in the source text.
Conversely, a contiguous highlight region on the display (on the bottom) might not map to a single contiguous range of characters in the source text (on the top), as the next figure shows.
A
With logical highlighting, the selected characters are always contiguous in the source text, and the highlight region is allowed to be discontiguous on the display. With visual highlighting, there might be more than one range of selected characters, but the highlight region is always contiguous.
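To make the logical case concrete, one possible approach is to cut a logical selection at the boundaries of the directional runs reported by java.text.Bidi. Each piece then lies inside a single run and can be highlighted as one block, although neighbouring pieces may or may not touch on screen. This is a sketch under those assumptions and is not how the lesson's selection sample is written; LogicalSelection, splitByRun, and the sample string are hypothetical.

```java
import java.text.Bidi;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo: split a logical selection into single-direction pieces. */
final class LogicalSelection {

    /**
     * Splits the logical range [start, limit) at directional run boundaries.
     * Each element is {pieceStart, pieceLimit, runLevel}; odd levels are right-to-left.
     */
    static List<int[]> splitByRun(String text, int start, int limit) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        List<int[]> pieces = new ArrayList<>();
        for (int run = 0; run < bidi.getRunCount(); run++) {
            // Intersect the selection with this run.
            int from = Math.max(start, bidi.getRunStart(run));
            int to = Math.min(limit, bidi.getRunLimit(run));
            if (from < to) {
                pieces.add(new int[] {from, to, bidi.getRunLevel(run)});
            }
        }
        return pieces;
    }

    public static void main(String[] args) {
        // "ab", a space, then three Hebrew letters at offsets 3 to 5.
        String text = "ab \u05D0\u05D1\u05D2";
        for (int[] piece : splitByRun(text, 1, 4)) {
            System.out.println("offsets [" + piece[0] + ", " + piece[1] + ") at level " + piece[2]);
        }
    }
}
```

For the selection from offset 1 to offset 4, the first piece covers "b" and the space, and the second piece covers only the first Hebrew letter. That letter is displayed at the far right of the Hebrew word, so the two highlighted blocks are separated on screen by the unselected Hebrew letters.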
Logical highlighting is simpler for programmers to use because the selected
characters are always contiguous in the source. The
Here is the code to get and draw the selection range:
Here is the complete SelectionSample.java source code.
Moving the Caret
In bidirectional text, the caret should move smoothly through the text on the display in the direction of the arrow key being pressed. The problem is that right-to-left text is stored in the source text in the order it is spoken, not in the order it is displayed. For the caret to travel smoothly across the display, the character offset cannot move smoothly through the source text. This point is illustrated by the figure.
Progressing through the three screen positions shown on the bottom of the figure from left to right corresponds to progressing through the character offsets in the source text in the order 7, 19, and 18.
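One way to see why the offsets jump is to list the source offsets in the order their characters appear on the display. The sketch below does this with java.text.Bidi and its static reorderVisually method. It works on a short made-up string, not the text in the figure, it treats every char as one display slot, and the class name VisualOrder is hypothetical; it is not the arrow-key sample's implementation.

```java
import java.text.Bidi;
import java.util.Arrays;

/** Hypothetical demo: source offsets listed in left-to-right display order. */
final class VisualOrder {

    /** Returns the logical (source) offsets in the order they are displayed. */
    static int[] visualOrder(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        byte[] levels = new byte[text.length()];
        Integer[] offsets = new Integer[text.length()];
        for (int i = 0; i < levels.length; i++) {
            levels[i] = (byte) bidi.getLevelAt(i);
            offsets[i] = i;
        }
        // Rearranges 'offsets' so that element i is the source offset shown in slot i.
        Bidi.reorderVisually(levels, 0, offsets, 0, offsets.length);

        int[] result = new int[offsets.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = offsets[i];
        }
        return result;
    }

    public static void main(String[] args) {
        // "ab", a space, then three Hebrew letters at offsets 3 to 5.
        String text = "ab \u05D0\u05D1\u05D2";
        System.out.println(Arrays.toString(visualOrder(text)));
    }
}
```

Walking the returned array from start to end corresponds to pressing the right arrow key repeatedly: the source offset climbs through the English, jumps to the end of the Hebrew word, and then counts backwards.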
The
Here is the complete ArrowKeySample.java source code.
Multiline Text
As you learned in the Draw Multiple Lines of Text section of Lesson 2,
a
Here is the complete LineBreakSample.java source code.