MSW:Formal SignWriting

9. Formal SignWriting
The regular storage form consists of two parts: the lite markup (FSW) for the text and a query string for searching the text. The query string is a concise representation of a much larger and more detailed set of regular expressions.

9.A. Lite Markup
The text of the regular storage form uses the lite markup with the token values for structural markers (A, B, L, M, R), symbol keys, and regular coordinates. Spaces separate words for signs and punctuation.

9.A.1. Pattern Examples
Sign: A sign is a combination of a lane marker (B, L, M, or R), followed by the maximum coordinate, followed by zero or more symbol keys with placement coordinates.

Example: M518x529S14c20481x471S27106503x489
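The pattern can be checked mechanically. The following is a minimal sketch in Python, assuming the symbol key and coordinate patterns used by the regular expressions in section 9.B.1; the names SIGN_RE, SYMBOL_RE, and parse_sign are illustrative and not part of the specification.

```python
import re

# Assumed patterns: lane marker, maximum coordinate, then zero or more
# symbol keys, each with a placement coordinate.
SIGN_RE = re.compile(
    r"([BLMR])([0-9]{3})x([0-9]{3})"
    r"((?:S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*)"
)
SYMBOL_RE = re.compile(r"(S[123][0-9a-f]{2}[0-5][0-9a-f])([0-9]{3})x([0-9]{3})")

def parse_sign(text):
    """Split a sign into its lane, maximum coordinate, and placed symbols."""
    m = SIGN_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"not a sign: {text!r}")
    lane, max_x, max_y, body = m.groups()
    symbols = [(key, int(x), int(y)) for key, x, y in SYMBOL_RE.findall(body)]
    return {"lane": lane, "max": (int(max_x), int(max_y)), "symbols": symbols}

sign = parse_sign("M518x529S14c20481x471S27106503x489")
print(sign["lane"], sign["max"])
for key, x, y in sign["symbols"]:
    print(key, x, y)
```

For the example above, this yields lane M, maximum coordinate 518x529, and the two symbols S14c20 at 481x471 and S27106 at 503x489.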

Punctuation: A punctuation mark is a combination of a symbol key followed by a placement coordinate. The center is assumed to be the coordinate (0,0). The maximum coordinate is the additive inverse of the placement coordinate.

Example: S38800464x496
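The maximum coordinate of a punctuation mark can be derived from its placement coordinate. The sketch below rests on one assumption that this section does not state: that regular coordinates write the center (0,0) as 500x500, so the additive inverse of a placement value p is 1000 - p. The names CENTER and punctuation_max are illustrative.

```python
import re

# ASSUMPTION: regular coordinates write the center (0,0) as 500x500.
CENTER = 500

PUNCTUATION_RE = re.compile(r"(S[123][0-9a-f]{2}[0-5][0-9a-f])([0-9]{3})x([0-9]{3})")

def punctuation_max(text):
    """Return the maximum coordinate implied by a punctuation mark."""
    m = PUNCTUATION_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"not a punctuation mark: {text!r}")
    x, y = int(m.group(2)), int(m.group(3))
    # The offset (x - CENTER) is negated and measured from the center again.
    return (2 * CENTER - x, 2 * CENTER - y)

max_coord = punctuation_max("S38800464x496")
print(max_coord)  # (536, 504) under the assumption above
```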

9.B. Query String
The query string combines several types of searches: symbols, ranges, and positions. Each query string is equivalent to one or more regular expressions. The regular expressions can be used to search large files and databases quickly.

A filter-and-repeat pattern of searching applies a series of match criteria. A file, database, or text input is searched in a sequence of steps. Each step applies a single match criterion. The matching results are collected, and the next criterion is applied to them. Searching the previous results continues until all regular expressions have been used.

As an example, consider searching a deck of cards. Searching for the 7H is broken into two steps. The first step searches for all cards that have a 7. The second step searches the matching cards for all cards that have a Heart. We can add several decks to expand the search criteria. Imagine 100 decks of cards, mixed in a huge pile. If we searched for a red-backed 7H, we would have 3 search steps. The first search finds all red-backed cards and creates a new pile. The second search finds all 7s to create a smaller pile. The last search finds all of the Hearts. The number of matching cards depends on the decks that were initially mixed.

This is the basis of searching.
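The card example can be written as a short Python sketch. The card model (back color, rank, suit) and the function name filter_and_repeat are hypothetical and serve only to illustrate the pattern.

```python
def filter_and_repeat(items, criteria):
    """Apply each criterion to the survivors of the previous step."""
    results = list(items)
    for matches in criteria:
        results = [item for item in results if matches(item)]
    return results

# Hypothetical pile of cards: (back color, rank, suit).
pile = [
    ("red", "7", "H"),
    ("blue", "7", "H"),
    ("red", "7", "S"),
    ("red", "K", "H"),
    ("red", "7", "H"),
]

# Three search steps for a red-backed 7H.
steps = [
    lambda card: card[0] == "red",  # step 1: red-backed cards
    lambda card: card[1] == "7",    # step 2: 7s among those
    lambda card: card[2] == "H",    # step 3: Hearts among those
]

found = filter_and_repeat(pile, steps)
print(found)
```

Each step only sees the pile left by the step before it, so the pile can only shrink.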

The query string can contain many concepts:
 * Find an exact symbol
 * Find a general symbol (different fill and/or rotation)
 * Find a range of symbols
 * Find a symbol (exact, general, or range) in a position
 * Set the symbol placement variance to a custom value of zero or any positive number. The default is normally 20.

Each search concept is a segment. As with the card matching above, a search can have many match criteria. Each concept is extracted and transformed into a regular expression. Each regular expression is fairly complicated, but could be written by hand by a knowledgeable user of regular expressions.
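Extracting the segments can be sketched in Python as follows. The segment grammar here is an assumption inferred from the query strings shown in section 9.B.1 (Q, an optional T, then S, R, and V segments); it is not a normative definition, and the names SEGMENT_RE and split_query are illustrative.

```python
import re

# ASSUMED grammar, inferred from the examples in this section.
COORD = r"(?:[0-9]{3}x[0-9]{3})?"  # optional position
SEGMENT_RE = re.compile(
    r"S[123][0-9a-f]{2}[0-5u][0-9a-fu]" + COORD          # exact or general symbol
    + r"|R[123][0-9a-f]{2}t[123][0-9a-f]{2}" + COORD     # range of symbol bases
    + r"|V[0-9]+"                                        # custom variance
)

def split_query(query):
    """Return (is_term_search, segments) for a query string."""
    if not query.startswith("Q"):
        raise ValueError("a query string starts with Q")
    rest = query[1:]
    term = rest.startswith("T")
    if term:
        rest = rest[1:]
    segments = SEGMENT_RE.findall(rest)
    # Reject anything the assumed grammar could not account for.
    if "".join(segments) != rest:
        raise ValueError(f"unrecognized text in query: {query!r}")
    return term, segments

print(split_query("QTR2fft36cS10000"))
print(split_query("QS14cuu481x471V10"))
```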

The regular expression searching uses the filter and repeat pattern.

9.B.1. Pattern Examples
SignBox Search: The query string to search for all signs is simply the letter “Q”. This will return the regular expression for a sign box with an optional prefix. This example has a ratio of 1 letter of query string to 112 letters of regular expression.

Query String: Q
Regular Expression: (A(S[123][0-9a-f]{2}[0-5][0-9a-f])+)?[BLMR]([0-9]{3}x[0-9]{3})(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*
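As a quick check, the regular expression for “Q” can be applied to space-separated text with Python's re module. The sample text is constructed for illustration: it reuses the sign and punctuation examples from section 9.A.1 and adds a hypothetical sign with an A prefix.

```python
import re

Q_REGEX = (
    r"(A(S[123][0-9a-f]{2}[0-5][0-9a-f])+)?"
    r"[BLMR]([0-9]{3}x[0-9]{3})"
    r"(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*"
)

# Constructed sample: a sign, a punctuation mark, and a sign with a prefix.
text = (
    "M518x529S14c20481x471S27106503x489 "
    "S38800464x496 "
    "AS14c20S27106M518x529S14c20481x471S27106503x489"
)

words = text.split()
signs = [word for word in words if re.fullmatch(Q_REGEX, word)]

print(len(Q_REGEX))  # length of the regular expression in characters
print(signs)
```

The punctuation mark is filtered out because it has no lane marker; both signs match, with or without the prefix.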

Term Search: The query string to search for all terms is simply the letters “QT”. This will return the regular expression for a term. This example has a ratio of 1 letter of query string to 55.5 letters of regular expression.

Query String: QT
Regular Expression: (A(S[123][0-9a-f]{2}[0-5][0-9a-f])+)[BLMR]([0-9]{3}x[0-9]{3})(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*

Exact Symbol Search: The query string to search for a specific symbol uses the exact symbol key. This will only match signboxes that have this exact symbol key. This example has a ratio of 1 letter of query string to 26 letters of regular expression.

Query String: QS10000
Regular Expression: (A(S[123][0-9a-f]{2}[0-5][0-9a-f])+)?[BLMR]([0-9]{3}x[0-9]{3})(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*S10000[0-9]{3}x[0-9]{3}(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*

Generic Symbol Search: The query string to search for a generic symbol replaces the fill, the rotation, or both in the symbol key with 'u' for unspecified. With both replaced, this will match all symbol keys for the specific symbol base without regard to fill or rotation. This example has a ratio of 1 letter of query string to 28 letters of regular expression.

Query String: QS100uu
Regular Expression: (A(S[123][0-9a-f]{2}[0-5][0-9a-f])+)?[BLMR]([0-9]{3}x[0-9]{3})(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*S100[0-5][0-9a-f][0-9]{3}x[0-9]{3}(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*
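The transformation from a symbol segment to its regular expression can be sketched in Python. The function name symbol_regex and the three building-block constants are illustrative; the output is assembled from the same pieces that appear in the exact and generic examples above.

```python
import re

# Building blocks copied from the regular expressions above.
SIGN_PREFIX = r"(A(S[123][0-9a-f]{2}[0-5][0-9a-f])+)?[BLMR]([0-9]{3}x[0-9]{3})"
ANY_SYMBOLS = r"(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*"
ANY_COORD = r"[0-9]{3}x[0-9]{3}"

def symbol_regex(segment):
    """Build the regular expression for a segment such as S10000 or S100uu."""
    base, fill, rotation = segment[1:4], segment[4], segment[5]
    key = "S" + base
    key += "[0-5]" if fill == "u" else fill             # 'u' leaves the fill open
    key += "[0-9a-f]" if rotation == "u" else rotation  # 'u' leaves the rotation open
    return SIGN_PREFIX + ANY_SYMBOLS + key + ANY_COORD + ANY_SYMBOLS

exact = symbol_regex("S10000")
generic = symbol_regex("S100uu")

# Hypothetical sign using base 100 with fill 1 and rotation 2.
sample = "M518x529S10012481x471"
print(bool(re.fullmatch(exact, sample)))
print(bool(re.fullmatch(generic, sample)))
```

The exact search rejects the sample because fill and rotation differ, while the generic search accepts it.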

Symbol search with approximate location: It is possible to search for a symbol around an approximate location. The position is specified with a regular coordinate string. The default variance for coordinates is plus or minus 20. This example has a ratio of 1 letter of query string to 17 letters of regular expression.

Query String: QS14c20481x471
Regular Expression: (A(S[123][0-9a-f]{2}[0-5][0-9a-f])+)?[BLMR]([0-9]{3}x[0-9]{3})(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*S14c20((46[1-9])|(4[7-9][0-9])|(50[01]))x((45[1-9])|(4[6-8][0-9])|(49[01]))(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*

Symbol search with approximate location and custom variance: It is possible to search for a symbol around an approximate location with a custom variance. The custom variance in this example is 10, so the coordinate numbers can vary by plus or minus 10. This example has a ratio of 1 letter of query string to 14 letters of regular expression.

Query String: QS14cuu481x471V10
Regular Expression: (A(S[123][0-9a-f]{2}[0-5][0-9a-f])+)?[BLMR]([0-9]{3}x[0-9]{3})(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*S14c[0-5][0-9a-f]((47[1-9])|(48[0-9])|(49[01]))x((46[1-9])|(47[0-9])|(48[01]))(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*
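The coordinate part of these regular expressions is a numeric range written as an alternation. The Python sketch below shows one way to generate such an alternation from a coordinate and a variance. It groups numbers by their first two digits, so its output is equivalent to the alternations above without being character-for-character identical. The function names are illustrative, and the sketch assumes the range stays within three-digit, non-negative numbers.

```python
import re
from itertools import groupby

def number_range_regex(low, high):
    """Alternation matching every three-digit number from low to high inclusive."""
    numbers = [f"{n:03d}" for n in range(low, high + 1)]
    parts = []
    # Numbers sharing their first two digits collapse into one character class.
    for prefix, group in groupby(numbers, key=lambda s: s[:2]):
        digits = [s[2] for s in group]
        parts.append(f"({prefix}[{digits[0]}-{digits[-1]}])")
    return "(" + "|".join(parts) + ")"

def coord_regex(x, y, variance=20):
    """Regular expression for a coordinate within +/- variance of (x, y)."""
    return (
        number_range_regex(x - variance, x + variance)
        + "x"
        + number_range_regex(y - variance, y + variance)
    )

print(coord_regex(481, 471, 10))
print(bool(re.fullmatch(coord_regex(481, 471), "501x451")))
```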

Range Search: It is possible to search for a range of symbol bases. The query string does not specify a fill or rotation value. It has a start base and an end base. This will match all symbol keys that are inside the range. This example has a ratio of 1 letter of query string to 26 letters of regular expression.

Query String: QR2fft36c
Regular Expression: (A(S[123][0-9a-f]{2}[0-5][0-9a-f])+)?[BLMR]([0-9]{3}x[0-9]{3})(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*S((2ff)|(3[0-5][0-9a-f])|(36[0-9a-c]))[0-5][0-9a-f][0-9]{3}x[0-9]{3}(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*
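The range alternation can be checked against a plain numeric comparison. The Python sketch below treats each symbol base as a three-digit hexadecimal number and confirms that the alternation for 2ff to 36c agrees with the comparison for every base the symbol key pattern allows (100 to 3ff). The helper name base_in_range is illustrative.

```python
import re

# Alternation for the range 2ff to 36c, copied from the example above.
RANGE_REGEX = r"((2ff)|(3[0-5][0-9a-f])|(36[0-9a-c]))"

def base_in_range(base, start, end):
    """True when a three-character hex symbol base lies in [start, end]."""
    return int(start, 16) <= int(base, 16) <= int(end, 16)

# Compare the two tests for every base allowed by S[123][0-9a-f]{2}.
mismatches = [
    f"{n:03x}"
    for n in range(0x100, 0x400)
    if bool(re.fullmatch(RANGE_REGEX, f"{n:03x}")) != base_in_range(f"{n:03x}", "2ff", "36c")
]
print(mismatches)  # an empty list means the two tests agree
```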

Range Search with approximate location: It is possible to search for a range of symbol bases around an approximate location. The query string does not specify a fill or rotation value. It has a start base and an end base, followed by a regular coordinate string. This will match all symbol keys that are inside the range and placed near that location. This example has a ratio of 1 letter of query string to 16 letters of regular expression.

Query String: QR2fft36c480x480
Regular Expression: (A(S[123][0-9a-f]{2}[0-5][0-9a-f])+)?[BLMR]([0-9]{3}x[0-9]{3})(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*S((2ff)|(3[0-5][0-9a-f])|(36[0-9a-c]))[0-5][0-9a-f]((4[6-9][0-9])|(500))x((4[6-9][0-9])|(500))(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*

Term search with range and exact symbol: It is possible to use multiple criteria for the search query. This example searches for terms that include a symbol from a range and an exact symbol. This will require 2 regular expressions that are performed in sequence. This example has a ratio of 1 letter of query string to 23 letters of regular expression.

Query String: QTR2fft36cS10000
Regular Expression 1: (A(S[123][0-9a-f]{2}[0-5][0-9a-f])+)[BLMR]([0-9]{3}x[0-9]{3})(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*S10000[0-9]{3}x[0-9]{3}(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*
Regular Expression 2: (A(S[123][0-9a-f]{2}[0-5][0-9a-f])+)[BLMR]([0-9]{3}x[0-9]{3})(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*S((2ff)|(3[0-5][0-9a-f])|(36[0-9a-c]))[0-5][0-9a-f][0-9]{3}x[0-9]{3}(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*
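Running the two regular expressions in sequence can be sketched in Python. The three sample words are constructed for illustration and do not come from a real dictionary; the function name search is likewise illustrative.

```python
import re

# Pieces shared by both regular expressions above.
TERM = r"(A(S[123][0-9a-f]{2}[0-5][0-9a-f])+)[BLMR]([0-9]{3}x[0-9]{3})"
SYMBOLS = r"(S[123][0-9a-f]{2}[0-5][0-9a-f][0-9]{3}x[0-9]{3})*"
COORD = r"[0-9]{3}x[0-9]{3}"

REGEX_1 = TERM + SYMBOLS + "S10000" + COORD + SYMBOLS
REGEX_2 = (
    TERM + SYMBOLS
    + r"S((2ff)|(3[0-5][0-9a-f])|(36[0-9a-c]))[0-5][0-9a-f]"
    + COORD + SYMBOLS
)

def search(text, patterns):
    """Filter space-separated words through each pattern in turn."""
    results = text.split()
    for pattern in patterns:
        compiled = re.compile(pattern)
        results = [word for word in results if compiled.fullmatch(word)]
    return results

# Constructed sample words (hypothetical data):
both = "AS10000S30a00M520x520S10000480x480S30a00500x500"  # term, both criteria
exact_only = "AS10000M520x520S10000480x480"               # term, exact symbol only
no_prefix = "M520x520S10000480x480S30a00500x500"          # both symbols, not a term

hits = search(" ".join([both, exact_only, no_prefix]), [REGEX_1, REGEX_2])
print(hits)
```

Only the first word survives both steps: the second fails the range criterion, and the third has no prefix and so is not a term.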

9.C. Displacement
When searching with approximate location, the symbol's position is based on the center of the signbox rather than the relation between the symbols. If we want to find all examples of a sign, we will want to account for signs that are not in the same location with regard to the center of the signbox. This requires accounting for displacement.

Displacement can be caused by the use of head symbols (which changes the signbox center) or by multiple signs in the same signbox (which changes each sign's relation to the signbox center).

9.C.1. Displacement Grid
To search for displacement, we'll need to use 8 additional query strings. We can either add or subtract twice the variance to the X and/or Y values for each of the coordinates. The default variance is 20, so the displacement is adjusted by +/- 40.
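The grid can be sketched in Python. The sketch reads the rule as follows, which is an interpretation rather than a stated definition: each of the 8 additional query strings applies one offset, taken from the combinations of -40, 0, and +40 on X and Y (excluding 0,0), to every coordinate in the original query string. The function name displaced_queries is illustrative, and a custom variance must be passed in explicitly.

```python
import re

COORD_RE = re.compile(r"([0-9]{3})x([0-9]{3})")

def displaced_queries(query, variance=20):
    """Return the 8 displaced variants of a query string (assumed reading of the grid)."""
    step = 2 * variance
    offsets = [
        (dx, dy)
        for dx in (-step, 0, step)
        for dy in (-step, 0, step)
        if (dx, dy) != (0, 0)
    ]

    def shift(dx, dy):
        # Apply the same offset to every coordinate in the query string.
        return COORD_RE.sub(
            lambda m: f"{int(m.group(1)) + dx:03d}x{int(m.group(2)) + dy:03d}",
            query,
        )

    return [shift(dx, dy) for dx, dy in offsets]

variants = displaced_queries("QS14c20481x471S27106503x489")
for variant in variants:
    print(variant)
```

Together with the original query string, the variants cover a 3 by 3 grid of positions around the signbox center.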

9.C.2. Displacement Example
Consider this query string: QS14c20481x471S27106503x489