Binary SignWriting

From PuddleNet

Binary SignWriting is an encoding model for sign language data. It handles the specific requirements of a spatial script combined with the unique features of SignWriting.


Status of the work

This document is the pre-release version 1.0 of Binary SignWriting and is undergoing active development.


Requirements

Binary Requirements

  • encodes sign language data
  • 16-bit character codes
  • ABNF notation
  • Regular Expression parsing
  • Unicode compatible (theoretical)

Language Requirements

  • Two-way conversion between character code and symbol ID
  • Access to glyphs (symbol images)
  • Creation of glyphograms (visual units of spatially written glyphs)
  • Center estimate for glyphs and glyphograms
  • Horizontal stacking of glyphs and glyphograms based on center
  • Proper spacing between writing and punctuation
  • Horizontal off center alignment for lanes
  • Sorting based on sequence data, including special sorting symbols
  • Searching for symbol, BaseSymbol, symbol combination, spatial arrangement, or exact sign match
  • Drag and drop user interface
  • Keyboarding user interface
  • Special Commands for text entry


Use Cases

Writing, processing, and Unicode compatibility.


Repertoire

ISWA 2008
Valerie Sutton handcrafted over 35 thousand symbols. Each symbol has been assigned a unique symbol ID.

7-bit ASCII
Being 7-bit ASCII compatible offers many advantages and no disadvantages.

Control Characters
Special control characters are required for sign language data.

Number Characters
The spatial aspect of sign language data requires coordinate information. Each spatial symbol is associated with two number characters, each with a range from -1919 through 1919. These number characters give the X,Y position of the top left of the symbol when it is placed on a two-dimensional grid. Number characters are used, rather than ordinary digits, to avoid character collisions when parsing.
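The stated range can be checked against the NUM_CHAR block (%xF100-FFFE) given in the ABNF of the Encoding section. The following is a minimal sketch: the helper name `is_valid_coord` is hypothetical, and the sketch deliberately assumes nothing about which code stands for which value, since this document does not define that mapping.

```python
# Bounds stated in the text for the value of one number character.
NUM_MIN, NUM_MAX = -1919, 1919

# Code range of NUM_CHAR, copied from the ABNF in the Encoding section.
NUM_CHAR_FIRST, NUM_CHAR_LAST = 0xF100, 0xFFFE


def is_valid_coord(x, y):
    """Return True if both values fit in a number character (hypothetical helper)."""
    return NUM_MIN <= x <= NUM_MAX and NUM_MIN <= y <= NUM_MAX


# Count the distinct values and the distinct codes available for them.
value_count = NUM_MAX - NUM_MIN + 1
code_count = NUM_CHAR_LAST - NUM_CHAR_FIRST + 1

print(value_count, code_count)
print(is_valid_coord(50, 50), is_valid_coord(2000, 0))
```

Both counts come out equal, so the NUM_CHAR block has exactly one code per value in the range.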


Character

The characters are encoded at a fixed width of 16 bits, which corresponds to the natural representation of integers of the chosen datatype of the computing platform. ABNF notation and regular expressions fully define the character usage. The character definition has no further limitations.
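As a rough illustration of fixed-width 16-bit storage, the sketch below packs a list of character codes into bytes and reads them back. It assumes native byte order, which is one reading of "natural representation" and is not stated explicitly here; the function names `pack_codes` and `unpack_codes` are hypothetical.

```python
import struct


def pack_codes(codes):
    """Pack 16-bit character codes into bytes, 2 bytes each, native byte order."""
    # "=" selects native byte order with standard sizes, so "H" is always 2 bytes.
    return struct.pack(f"={len(codes)}H", *codes)


def unpack_codes(data):
    """Inverse of pack_codes: read back a list of 16-bit character codes."""
    return list(struct.unpack(f"={len(data) // 2}H", data))


codes = [0x0080, 0x0100, 0xF880, 0xF880]
data = pack_codes(codes)
print(len(data))           # two bytes per character
print(unpack_codes(data) == codes)
```

Because the width is fixed, the n-th character always starts at byte offset 2n, which keeps indexing and slicing simple.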


Encoding

There are three character encoding schemes available for Binary SignWriting: compact, XML, and binary. Two main types of data are represented: sign data, which carries spatial and sequential information, and sign text, which covers sentences and lanes.

Sign data

Symbol data

  • Compact: 256
  • XML: <sym>256</sym>
  • Hexadecimal: 0100
  • UTF-8: f1 80 84 80

Single symbol glyphogram data

  • Compact: 256,50,50
  • XML: <sign><sym x="50" y="50">256</sym></sign>
  • Hexadecimal: 0080 0100 F880 F880
  • UTF-8: f1 80 82 80 f1 80 84 80 f1 8f a2 80 f1 8f a2 80

Multi-symbol glyphogram data

  • Compact: 256,50,50,356,70,70
  • XML: <sign><sym x="50" y="50">256</sym><sym x="70" y="70">356</sym></sign>
  • Hexadecimal: 0080 0100 F880 F880 0164 F894 F894
  • UTF-8: f1 80 82 80 f1 80 84 80 f1 8f a2 80 f1 8f a2 80 f1 80 85 a4 f1 8f a2 94 f1 8f a2 94

Sequence data

  • Compact: 256,356
  • XML: <seq><sym>256</sym><sym>356</sym></seq>
  • Hexadecimal: 0081 0100 0164
  • UTF-8: f1 80 82 81 f1 80 84 80 f1 80 85 a4
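Two relationships in these examples can be sketched in code. First, compact glyphogram data reads as repeated symbol,x,y triples that map directly onto the XML form. Second, every UTF-8 sequence shown is what results from adding 0x40000 to the 16-bit code and encoding that code point as UTF-8. The offset is inferred from the example values rather than stated in the text, and the function names `sign_to_xml` and `codes_to_utf8` are hypothetical.

```python
# Offset inferred from the examples: 0100 -> U+40100 -> f1 80 84 80.
# This is an assumption drawn from the data, not a documented constant.
PLANE_OFFSET = 0x40000


def sign_to_xml(compact):
    """Convert compact glyphogram data (symbol,x,y triples) to the XML form."""
    numbers = [int(n) for n in compact.split(",")]
    if len(numbers) % 3 != 0:
        raise ValueError("expected symbol,x,y triples")
    syms = "".join(
        f'<sym x="{x}" y="{y}">{sym}</sym>'
        for sym, x, y in zip(numbers[0::3], numbers[1::3], numbers[2::3])
    )
    return f"<sign>{syms}</sign>"


def codes_to_utf8(codes):
    """Render 16-bit codes as space-separated UTF-8 bytes in lowercase hex."""
    return " ".join(
        f"{byte:02x}"
        for code in codes
        for byte in chr(PLANE_OFFSET + code).encode("utf-8")
    )


print(sign_to_xml("256,50,50"))
print(codes_to_utf8([0x0080, 0x0100, 0xF880, 0xF880]))
```

The sketch does not convert compact coordinates to hexadecimal number characters, because the rule for that conversion is not given here.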

7-bit ASCII
ASCII = %x00-7F;
ISWA
ISWA = %x100-F0FF;
WRITING_SYM = %x100-EBBF;
CENTERING_SYM = %xB980-E61F;
PUNCTUATION_SYM = %xEBC0-ED9F;
SORTING_SYM = %xEDA0-F09F;
Control
CONTROL_CHARS = %x80-FF;
Marker
MARKER_CHARS = %x80-8F;
SIGN_MARKER = %x80;
SEQUENCE_MARKER = %x81;
LEFT_LANE = %x82;
RIGHT_LANE = %x83;
Number
NUM_CHAR = %xF100-FFFE;
Constructed
COORD = NUM_CHAR NUM_CHAR;
SPATIAL_CHAR = WRITING_SYM COORD;
CLUSTER = *(SPATIAL_CHAR);
SIGN = (LEFT_LANE / SIGN_MARKER / RIGHT_LANE) CLUSTER;
SEQUENCE = SEQUENCE_MARKER *(WRITING_SYM / SORTING_SYM);
SIGN_TEXT = *(SIGN [SEQUENCE] / PUNCTUATION_SYM);
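The constructed rules above translate fairly directly into regular expressions. The sketch below is one possible rendering in Python, treating each 16-bit code as a single character; the pattern names and the helper `is_sign_text` are hypothetical, and only the rules needed for SIGN_TEXT are covered.

```python
import re

# Character classes copied from the ABNF ranges (16-bit codes).
WRITING = r"[\u0100-\uebbf]"
PUNCTUATION = r"[\uebc0-\ued9f]"
SORTING = r"[\ueda0-\uf09f]"
NUM = r"[\uf100-\ufffe]"
SIGN_START = r"[\u0080\u0082\u0083]"  # SIGN_MARKER, LEFT_LANE, RIGHT_LANE
SEQ_START = r"\u0081"                 # SEQUENCE_MARKER

# SIGN = (LEFT_LANE / SIGN_MARKER / RIGHT_LANE) CLUSTER
SIGN = f"{SIGN_START}(?:{WRITING}{NUM}{NUM})*"
# SEQUENCE = SEQUENCE_MARKER *(WRITING_SYM / SORTING_SYM)
SEQUENCE = f"{SEQ_START}(?:{WRITING}|{SORTING})*"
# SIGN_TEXT = *(SIGN [SEQUENCE] / PUNCTUATION_SYM)
SIGN_TEXT = re.compile(f"(?:{SIGN}(?:{SEQUENCE})?|{PUNCTUATION})*")


def is_sign_text(codes):
    """Return True if the list of 16-bit codes matches SIGN_TEXT as a whole."""
    text = "".join(chr(code) for code in codes)
    return SIGN_TEXT.fullmatch(text) is not None


# The single symbol glyphogram from the examples, then the same data
# with one coordinate missing.
print(is_sign_text([0x0080, 0x0100, 0xF880, 0xF880]))
print(is_sign_text([0x0080, 0x0100, 0xF880]))
```

Each alternative begins with a character from a distinct range, so the pattern never has to backtrack across alternatives, which is the practical benefit of using dedicated number characters for coordinates.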


Later implementation

Control
CONTROL_CHARS = %x80-FF;
Marker
MARKER_CHARS = %x80-8F;
OFFSET_MARKER = %x84;
GROUP_MARKER = %x85;
SECTION_MARKER = %x86;
CLASS_MARKER = %x87;
Breaks
BREAK_CHARS = %x90-9F;
GROUP_BREAK = %x90;
LINE_BREAK = %x91;
PARAGRAPH_BREAK = %x92;
PAGE_BREAK = %x93;
SECTION_BREAK = %x94;
CLASS_BREAK = %x95;
Constructed
CLASS = CLASS_MARKER NUM_CHAR;
OFFSET = OFFSET_MARKER COORD;
SIGN = (LEFT_LANE / SIGN_MARKER / RIGHT_LANE) CLUSTER [OFFSET];



