SignWriting Text

An image is not text. The SignPuddle Standard for SignWriting Text establishes sign language as text. The specifications and infrastructure are openly available. We are approaching a stable version 1.0 release across the board.

Sign language is vastly different from spoken language. Instead of the sequential sounds of the voice, there is a three-dimensional space with simultaneous action. The SignWriting Script creates two-dimensional writing that is visually iconic and full of featural information. This is true on the symbol level and on the sign level. A symbol represents phonemic information and is full of featural information to better understand the phonemes of the symbols. A sign is a two-dimensional arrangement of symbols and is full of featural information to better understand the morphemes of the signs.

Bringing the SignWriting Script to the computer has a unique set of requirements for text and a unique set of possibilities. The plain text model defines several compatible character sets. Plain text is a sequential list of characters in a string. Strings represent signs using various character patterns to represent mathematically sized logograms. A robust plain text encoding model separates visual display, layout issues, and regular expression searching. This enables a distributed client-server model of SignWriting images, where the client knows the size and center of the logogram, but the server knows the visual image.

The rich text model defines styling using basic CSS rules for HTML or MediaWiki markup. The rich text model is quickly approaching a usable beta.

Mathematical Name
The mathematical name of a logographic sign is a plain text string of characters. This encoding model makes explicit those features which can be effectively and efficiently processed. Formal languages and regular expressions are used to solve fundamental problems.

Visual Image
A usable infrastructure on the internet for SignWriting Text becomes easier when signs are treated as logograms. A client-server model is used where the server generates the logographic images from the mathematical names, and the client can quickly calculate the size of a logogram from a string and predetermine these values for an image it has not yet received.

The SignWriting Image Server is an open server that creates SignWriting images. It frees the SignWriting Image Client to focus on text processing and presentation.

In this client side model, the structure of a page can be quickly created. If the image server lags, it is of no concern to the client. The images appear where and how they should without any jumping or bouncing. Simple math and basic CSS rules determine layout.

The client side is most often a browser. The client side uses the mathematical names of SignWriting Plain Text for general text processing. Searching, sorting, and layout are easy client side tasks.

By splitting the world of visual image from text processing, we create a robust and powerful model that is structured, productive, and scalable.

Individuals can run their own SignWriting Image Server, private networks can have a SignWriting Image Server, and we can have SignWriting Image Servers freely available on the internet.

Client side programming is simplified by using a SignWriting Image Server. SignWriting Text can be implemented on a website or in applications with less programming. Plugins and extensions can focus on the math and CSS style rules for rich text layout without having to reinvent the wheel of logographic images.

Infrastructure
Visit SignPuddle.com for the status of the infrastructure.


 * SignWriting Text Reference
 * International SignWriting Alphabet Fonts
 * SignWriting Icon Server
 * SignWriting Icon Client

Encoding Schemes
Encoding schemes define how a character is written as a sequence of bytes. SignWriting Text can use encoding schemes based on either ASCII or Unicode.

Given a sequence of bytes representing text and a stated character encoding scheme, a string of characters is unambiguously defined, and it is easy to recreate the sequence of characters as required for plain text.
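
As a minimal sketch of that idea, the Python below decodes the same characters from two different byte sequences, each paired with its stated encoding scheme. The helper name decode_text and the sample string are illustrative assumptions, not part of the standard.

```python
def decode_text(raw: bytes, encoding: str) -> str:
    """Recreate the sequence of characters from bytes and a stated encoding."""
    # 'strict' makes the decode fail loudly if the bytes do not match the scheme.
    return raw.decode(encoding, errors="strict")


sample = "M518x529"  # illustrative ASCII-only text

ascii_bytes = sample.encode("ascii")      # one byte per character
utf16_bytes = sample.encode("utf-16-le")  # two bytes per character here

# Different bytes, same characters, once the encoding scheme is stated.
same = decode_text(ascii_bytes, "ascii") == decode_text(utf16_bytes, "utf-16-le")
```

Without the stated scheme the bytes are ambiguous: decoding the UTF-16 bytes as ASCII would not raise an error here, but it would yield different characters.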

ASCII
Every logographic sign has a mathematical name in ASCII. ASCII is universally supported. The ASCII names are authoritative and easy to identify. Searching with regular expressions is 4 times faster in ASCII than in the equivalent Unicode.

Hexadecimal
Hexadecimal is base 16, but in this context is a subset of ASCII. In this reference, hexadecimal characters are the single digits 0-9 along with the letters a-f. Hexadecimal strings are 2 or more characters. Many of the character encoding forms incorporate fixed width hexadecimal strings.

Unicode
Every logographic sign has a temporary name of Unicode PUA characters for client side font handling. The use of the Unicode PUA demonstrates the necessity and the capability of the proposed character set.

Coded Character Sets
A character is a fundamental building block of digital data. A character's smallest representation is a binary representation of an integer found in a character set. A string is an ordered sequence of characters, which is nothing more than a list of integers.
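
The view of a string as a list of integers can be illustrated in a few lines of Python. This is only a sketch using Unicode code points as the character set; the function names are assumptions made for the example.

```python
from typing import List


def to_codepoints(text: str) -> List[int]:
    """Return the integer assigned to each character, in order."""
    return [ord(ch) for ch in text]


def from_codepoints(codepoints: List[int]) -> str:
    """Rebuild the string from its ordered list of integers."""
    return "".join(chr(cp) for cp in codepoints)


codes = to_codepoints("M518")  # [77, 53, 49, 56]
```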

x-ISWA-2010
The x-ISWA-2010 is a 16-bit character set that covers each symbol of the ISWA 2010. A 16-bit code is an integer between 0 and 65,535. This type of value is perfect for a primary key for database lookup or other integer index. Through a simple formula, any symbol identification can be transformed into a unique 16-bit codepoint. Font software using the SQLite fonts rely on the x-ISWA-2010 coded character set.

Read about the symbol set and the symbol encoding design in Modern SignWriting.

x-Binary-SignWriting
The x-Binary-SignWriting is a 12-bit character set that covers the characters of SignWriting Plain Text. It is possible to write the name of a logographic sign with binary data. This is more of a theoretical advantage because we don't write with 12-bit characters. This form is most useful for the translation to Private Use Area Unicode.

Read about the coded character set and the string patterns in Modern SignWriting.

x-Character-SignWriting
The x-Character-SignWriting is my proposal for SignWriting in Unicode. Take the characters of the x-Binary-SignWriting coded character set and add hexadecimal value 1D700. The same principle is used to create the temporary font characters with PUA Unicode.
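
The offset rule can be sketched as follows, assuming each x-Binary-SignWriting character is available as a 12-bit integer (0 to 0xFFF). The function name and the input values are hypothetical and chosen only to show the arithmetic.

```python
from typing import List

PROPOSED_OFFSET = 0x1D700  # hexadecimal value added to each 12-bit character
MAX_12_BIT = 0xFFF


def to_proposed_codepoints(values: List[int]) -> List[int]:
    """Shift 12-bit character values by the proposed offset."""
    for value in values:
        if not 0 <= value <= MAX_12_BIT:
            raise ValueError(f"not a 12-bit value: {value!r}")
    return [PROPOSED_OFFSET + value for value in values]


# Illustrative inputs: the lowest, an arbitrary, and the highest 12-bit value.
shifted = to_proposed_codepoints([0x000, 0x100, 0xFFF])
# shifted == [0x1D700, 0x1D800, 0x1E6FF]
```

Under this assumption the whole 12-bit range maps onto the contiguous range 1D700 to 1E6FF.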

Character Encoding Forms
The specifics of the character encoding forms are contained in Modern SignWriting, section 8: Text Encoding, section 9: Regular Storage Form, and section 10: Variant Display Form.

BSW - Binary SignWriting
Binary SignWriting uses fixed-width hexadecimal characters from the 12-bit coded character set x-Binary-SignWriting. Each character is written with 3 hexadecimal digits. Structures are identified with 1 character (3 digits), symbols are identified with 3 characters (9 digits), numbers are identified with 1 character (3 digits), and coordinates are identified with 2 characters (6 digits). The name of a sign is a patterned string. The character definitions are available in Modern SignWriting, section 8: Repertoire and Coded Character Set.
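
Because every character is exactly 3 hexadecimal digits, a BSW string can be split without any delimiters. The sketch below shows only that fixed-width split. It does not interpret which characters are structures, symbols, numbers, or coordinates, and the sample digits are arbitrary rather than a real sign.

```python
import re
from typing import List

HEX_ONLY = re.compile(r"[0-9a-f]*")  # lowercase hexadecimal, as in this reference


def split_bsw(bsw: str) -> List[int]:
    """Split a BSW string into 12-bit character values, 3 hex digits each."""
    if len(bsw) % 3 != 0:
        raise ValueError("length must be a multiple of 3 hexadecimal digits")
    if not HEX_ONLY.fullmatch(bsw):
        raise ValueError("only the characters 0-9 and a-f are allowed")
    return [int(bsw[i:i + 3], 16) for i in range(0, len(bsw), 3)]


# Arbitrary digits for illustration: two characters, 0x0fb and 0x14c.
values = split_bsw("0fb14c")  # [251, 332]
```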

CSW - Character SignWriting
Character SignWriting uses Private Use Area Unicode characters to create a logographic sign with a mathematical name. The character definitions are available in Modern SignWriting, section 8: Repertoire and Coded Character Set.

FSW - Formal SignWriting
Formal SignWriting uses a lite markup to create a string that represents a sized logogram with a regular structure. ASCII characters are used to identify structure, symbols, and coordinates. The lite markup of FSW is covered in Modern SignWriting, section 9: Lite Markup. A structured query language for FSW is covered in section 9: Query String.
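
As a rough sketch of how a client might compute the size of a logogram from an FSW string with regular expressions, the Python below makes assumptions that this section does not spell out and that should be checked against Modern SignWriting, section 9. It assumes a sign is an optional "A" prefix of symbol keys, then a box marker (B, L, M, or R) followed by the maximum coordinate, then a list of symbol keys, each followed by its top-left coordinate. The sample string and function name are illustrative only.

```python
import re
from typing import Tuple

# Assumed patterns (not defined in this section):
SYMBOL = r"S[123][0-9a-f]{2}[0-5][0-9a-f]"  # symbol key
COORD = r"[0-9]{3}x[0-9]{3}"                 # coordinate, e.g. 518x529

SIGN = re.compile(
    "(?:A(?:" + SYMBOL + ")+)?[BLMR](" + COORD + ")((?:" + SYMBOL + COORD + ")*)"
)
PLACED_SYMBOL = re.compile(SYMBOL + r"([0-9]{3})x([0-9]{3})")


def sign_size(fsw: str) -> Tuple[int, int]:
    """Return (width, height) as maximum coordinate minus minimum coordinate."""
    match = SIGN.fullmatch(fsw)
    if match is None:
        raise ValueError("string does not match the assumed FSW sign pattern")
    max_x, max_y = (int(n) for n in match.group(1).split("x"))
    placed = PLACED_SYMBOL.findall(match.group(2))
    if not placed:
        return (0, 0)  # no symbols, so nothing to measure
    min_x = min(int(x) for x, _ in placed)
    min_y = min(int(y) for _, y in placed)
    return (max_x - min_x, max_y - min_y)


size = sign_size("M518x529S14c20481x471S27106503x489")  # (37, 58)
```

With the size known from the string alone, a client can reserve the layout space before the image arrives from the server.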

KSW - Kartesian SignWriting
Kartesian SignWriting uses a lite markup to create a string that represents a variant display area. ASCII characters are used to identify structure, symbols, and coordinates. The lite markup in general is covered in Modern SignWriting, section 8: Lite Markup. The specific forms of KSW are covered in section 10: Variant Display Form.