Binary SignWriting
From PuddleNet
Binary SignWriting is an encoding model for sign language data. It handles the specific requirements of a spatial script combined with the unique features of SignWriting.
Status of the work
This document is the pre-release version 1.0 of Binary SignWriting. It is undergoing active development.
Requirements
Binary Requirements
- encodes sign language data
- 16 bit character codes
- ABNF notation
- Regular Expression parsing
- Unicode compatible (theoretical)
Language Requirements
- Two-way conversion between character code and symbol ID
- Access to glyphs - symbol image
- Creation of glyphograms - visual unit of spatially written glyphs
- Center estimate for glyphs and glyphograms
- Horizontal stacking of glyphs and glyphograms based on center
- Proper spacing between writing and punctuation
- Horizontal off-center alignment for lanes
- Sorting based on sequence data, including special sorting symbols
- Searching for symbol, BaseSymbol, symbol combination, spatial arrangement, or exact sign match
- Drag and drop user interface
- Keyboarding user interface
- Special Commands for text entry
Use Cases
Writing, processing, and Unicode compatibility.
Repertoire
ISWA 2008
Valerie Sutton hand-crafted over 35 thousand symbols.
Each symbol has been assigned a unique symbol ID.
7-bit ASCII
Being 7-bit ASCII compatible offers many advantages and no disadvantages.
Control Characters
Special control characters are required for sign language data.
Number Characters
The spatial aspect of sign language data requires coordinate information. Each spatial symbol is associated with two number characters. The number characters have a range from -1919 through 1919. They give the X,Y position of the top left of the symbol when placed on a 2-dimensional grid. Number characters are used to avoid character collision when parsing.
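As a quick sanity check, the stated coordinate range can be compared with the size of the NUM_CHAR block (%xF100-FFFE) defined in the Encoding section. The sketch below only counts values. This document does not state which code corresponds to which coordinate value, so no mapping is assumed here; the constant names are illustrative.

```python
# Range of NUM_CHAR codes, taken from the ABNF definition in this document.
NUM_CHAR_FIRST = 0xF100
NUM_CHAR_LAST = 0xFFFE

# Range of coordinate values stated in the text.
COORD_MIN = -1919
COORD_MAX = 1919

# Number of available codes and number of distinct coordinate values.
code_count = NUM_CHAR_LAST - NUM_CHAR_FIRST + 1
value_count = COORD_MAX - COORD_MIN + 1

print(code_count, value_count)
```

Both counts come out equal, so the NUM_CHAR block is exactly large enough to give every coordinate value its own character.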
Character
The characters are encoded using a fixed width of 16 bits, which corresponds to the natural representation of integers of the chosen datatype of the computing platform. ABNF notation and regular expressions fully define the character usage. The character definition has no further limitation.
Encoding
There are three character encoding schemes available for Binary SignWriting: compact, XML, and binary. Two main types of data are represented: sign data, with spatial and sequential information, and sign text, for sentences and lanes.
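The relationship between the compact and XML schemes can be sketched as follows, using the sample values from the sign data table in this section. The function name `compact_to_xml` and its `kind` argument are hypothetical: the compact form alone does not say whether a list of numbers is a sign or a sequence, so the caller has to state it. This is a minimal illustration, not a reference converter.

```python
def compact_to_xml(compact: str, kind: str) -> str:
    """Convert a compact string to the XML form.

    kind is an assumed helper argument: "symbol", "sign", or "sequence".
    """
    values = [int(v) for v in compact.split(",")]

    if kind == "symbol":
        if len(values) != 1:
            raise ValueError("symbol data holds exactly one symbol code")
        return f"<sym>{values[0]}</sym>"

    if kind == "sequence":
        # A sequence is a plain list of symbol codes.
        body = "".join(f"<sym>{v}</sym>" for v in values)
        return f"<seq>{body}</seq>"

    if kind == "sign":
        # Sign data comes in triples: symbol code, x, y.
        if len(values) % 3 != 0:
            raise ValueError("sign data must be triples of symbol,x,y")
        parts = []
        for i in range(0, len(values), 3):
            sym, x, y = values[i:i + 3]
            parts.append(f'<sym x="{x}" y="{y}">{sym}</sym>')
        return "<sign>" + "".join(parts) + "</sign>"

    raise ValueError(f"unknown kind: {kind}")


print(compact_to_xml("256,50,50,356,70,70", "sign"))
```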
Sign data
| Example | Compact | XML | Hexadecimal | UTF-8 |
|---|---|---|---|---|
| Symbol data | 256 | <sym>256</sym> | 0100 | f1 80 84 80 |
| Single symbol glyphogram data | 256,50,50 | <sign><sym x="50" y="50">256</sym></sign> | 0080 0100 F880 F880 | f1 80 82 80 f1 80 84 80 f1 8f a2 80 f1 8f a2 80 |
| Multi-symbol glyphogram data | 256,50,50,356,70,70 | <sign><sym x="50" y="50">256</sym><sym x="70" y="70">356</sym></sign> | 0080 0100 F880 F880 0164 F894 F894 | f1 80 82 80 f1 80 84 80 f1 8f a2 80 f1 8f a2 80 f1 80 85 a4 f1 8f a2 94 f1 8f a2 94 |
| Sequence data | 256,356 | <seq><sym>256</sym><sym>356</sym></seq> | 0081 0100 0164 | f1 80 82 81 f1 80 84 80 f1 80 85 a4 |
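The UTF-8 column in the table is consistent with placing each 16-bit character code at code point 0x40000 plus the code and then applying standard UTF-8. That offset is inferred from the sample rows only; the document does not state it, so treat it as an assumption. The names `UTF8_PLANE_OFFSET` and `to_utf8_hex` are illustrative.

```python
# Assumption inferred from the table rows: code 0x0100 appears as the
# UTF-8 bytes of code point 0x40100, so every code is shifted by 0x40000.
UTF8_PLANE_OFFSET = 0x40000


def to_utf8_hex(codes):
    """Return the UTF-8 bytes for a list of 16-bit codes as spaced hex."""
    text = "".join(chr(UTF8_PLANE_OFFSET + code) for code in codes)
    return " ".join(f"{byte:02x}" for byte in text.encode("utf-8"))


# Hexadecimal column of the "Single symbol glyphogram data" row.
single_symbol_sign = [0x0080, 0x0100, 0xF880, 0xF880]
print(to_utf8_hex(single_symbol_sign))
```

Under that assumption the function reproduces the UTF-8 column for the sample rows.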
- 7-bit ASCII
  - ASCII = %x00-7F;
- ISWA
  - ISWA = %x100-F0FF;
  - WRITING_SYM = %x100-EBBF;
  - CENTERING_SYM = %xB980-E61F;
  - PUNCTUATION_SYM = %xEBC0-ED9F;
  - SORTING_SYM = %xEDA0-F09F;
- Control
  - CONTROL_CHARS = %x80-FF;
- Marker
  - MARKER_CHARS = %x80-8F;
  - SIGN_MARKER = %x80;
  - SEQUENCE_MARKER = %x81;
  - LEFT_LANE = %x82;
  - RIGHT_LANE = %x83;
- Number
  - NUM_CHAR = %xF100-FFFE;
- Constructed
  - COORD = NUM_CHAR NUM_CHAR;
  - SPATIAL_CHAR = WRITING_SYM COORD;
  - CLUSTER = *(SPATIAL_CHAR);
  - SIGN = (LEFT_LANE / SIGN_MARKER / RIGHT_LANE) CLUSTER;
  - SEQUENCE = SEQUENCE_MARKER *(WRITING_SYM / SORTING_SYM);
  - SIGN_TEXT = *(SIGN [SEQUENCE] / PUNCTUATION_SYM);
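The constructed rules above translate fairly directly into regular expressions. The following sketch assumes each 16-bit character code is held as one character of a Python string, so that ordinary character ranges can express the ABNF ranges. The helper names `to_text` and `matches` are illustrative and not part of the specification.

```python
import re

# Character ranges copied from the ABNF definitions.
WRITING_SYM = r"[\u0100-\uEBBF]"
PUNCTUATION_SYM = r"[\uEBC0-\uED9F]"
NUM_CHAR = r"[\uF100-\uFFFE]"

# SIGN = (LEFT_LANE / SIGN_MARKER / RIGHT_LANE) CLUSTER
# CLUSTER is zero or more WRITING_SYM followed by two NUM_CHAR.
SIGN = rf"[\x80\x82\x83](?:{WRITING_SYM}{NUM_CHAR}{{2}})*"

# SEQUENCE = SEQUENCE_MARKER *(WRITING_SYM / SORTING_SYM)
SEQUENCE = r"\x81[\u0100-\uEBBF\uEDA0-\uF09F]*"

# SIGN_TEXT = *(SIGN [SEQUENCE] / PUNCTUATION_SYM)
SIGN_TEXT = rf"(?:{SIGN}(?:{SEQUENCE})?|{PUNCTUATION_SYM})*"


def to_text(codes):
    """Map a list of 16-bit codes to a string, one character per code."""
    return "".join(chr(code) for code in codes)


def matches(rule, codes):
    """Return True when the whole code list satisfies the given rule."""
    return re.fullmatch(rule, to_text(codes)) is not None


# Hexadecimal column of the "Multi-symbol glyphogram data" row.
sign = [0x0080, 0x0100, 0xF880, 0xF880, 0x0164, 0xF894, 0xF894]
print(matches(SIGN, sign))
```

Using `re.fullmatch` rather than `re.match` means that trailing codes which do not fit the rule cause the check to fail instead of being silently ignored.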
Later implementation
- Control
  - CONTROL_CHARS = %x80-FF;
- Marker
  - MARKER_CHARS = %x80-8F;
  - OFFSET_MARKER = %x84;
  - GROUP_MARKER = %x85;
  - SECTION_MARKER = %x86;
  - CLASS_MARKER = %x87;
- Breaks
  - BREAK_CHARS = %x90-9F;
  - GROUP_BREAK = %x90;
  - LINE_BREAK = %x91;
  - PARAGRAPH_BREAK = %x92;
  - PAGE_BREAK = %x93;
  - SECTION_BREAK = %x94;
  - CLASS_BREAK = %x95;
- Constructed
  - CLASS = CLASS_MARKER NUM_CHAR;
  - OFFSET = OFFSET_MARKER COORD;
  - SIGN = (LEFT_LANE / SIGN_MARKER / RIGHT_LANE) CLUSTER [OFFSET];
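The extended SIGN rule with its optional OFFSET can be sketched the same way, again assuming one string character per 16-bit code. The function `split_sign` and the names of its result fields are hypothetical. It only separates the parts of a sign and does not interpret the coordinate values.

```python
import re

# Building blocks copied from the ABNF definitions.
NUM_CHAR = r"[\uF100-\uFFFE]"
COORD = rf"{NUM_CHAR}{{2}}"
CLUSTER = rf"(?:[\u0100-\uEBBF]{COORD})*"
OFFSET = rf"\x84{COORD}"  # OFFSET = OFFSET_MARKER COORD

# SIGN = (LEFT_LANE / SIGN_MARKER / RIGHT_LANE) CLUSTER [OFFSET]
SIGN_EXT = (
    rf"(?P<marker>[\x80\x82\x83])"
    rf"(?P<cluster>{CLUSTER})"
    rf"(?P<offset>{OFFSET})?"
)


def split_sign(codes):
    """Split a code list into marker, cluster, and offset parts.

    Returns None when the codes do not form a valid extended SIGN.
    """
    text = "".join(chr(code) for code in codes)
    match = re.fullmatch(SIGN_EXT, text)
    if match is None:
        return None

    def as_codes(part):
        # An unmatched optional group is None; treat it as empty.
        return [ord(ch) for ch in (part or "")]

    return {
        "marker": ord(match["marker"]),
        "cluster": as_codes(match["cluster"]),
        "offset": as_codes(match["offset"]),
    }


# A left-lane sign with one symbol followed by an offset.
parts = split_sign([0x0082, 0x0100, 0xF880, 0xF880, 0x0084, 0xF894, 0xF894])
print(parts)
```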

