.. http://www.lassosoft.com/Language-Guide-String-Operations

.. _strings:

*******
Strings
*******

Text in Lasso is stored and manipulated using the :type:`string` type or the
``string_…`` methods. This chapter details the operators and methods that can be
used to manipulate string values.

.. tip::
   The :type:`string` type is often used in conjunction with the :type:`bytes`
   type to convert binary data between different character encodings (e.g.
   UTF-8, ISO-8859-1). See the :ref:`byte-streams` chapter for more information
   about the :type:`bytes` type.

String Objects
==============

Text processing is a central function of Lasso. Many Lasso methods are dedicated
to outputting and manipulating text. Lasso is used to format text-based HTML
pages or XML data for output. Lasso is also used to process and manipulate
text-based HTML form inputs and URLs.

As a result of this focus on text processing, the :type:`string` type is the
primary type of data in Lasso. The results of all expressions are converted to
strings before they are output into the HTML page or XML data being served.

The following operations can be performed directly on strings:

#. Operators can be used to perform string calculations::

      'The' + ' ' + 'String'
      // => The String

#. String member methods can be used to manipulate the string value::

      'the string'->titlecase&;
      // => The String

#. String member methods can be used to return new strings based on the value of
   the current string::

      'The String'->sub(5, 6)
      // => String

#. String member methods can be used to test the attributes of strings::

      'The String'->contains('the')
      // => true

Each of these operations is described in detail, with examples, in the sections
that follow.

Unicode Characters
------------------

Lasso supports the processing of Unicode characters in all :type:`string` methods.
The escape sequence ``\u…`` can be used with 4 hexadecimal digits (or ``\U…`` with 8 or ``\x…`` with 2) to embed a Unicode character in a string. For example ``\u002F`` represents a "/" character, ``\u0020`` represents a space, and ``\u0042`` represents a capital letter "B". The same type of escape sequence can be used to embed any Unicode character, e.g. ``\u4E26`` represents the Traditional Chinese character |4E26|. .. |4E26| unicode:: U+4E26 Lasso also supports common escape sequences including ``"\r"`` for a return character, ``"\n"`` for a newline character, ``"\r\n"`` for a Windows return/newline, ``"\f"`` for a form-feed character, ``"\t"`` for a tab, and ``"\v"`` for a vertical-tab. See the table :ref:`literals-string-escape` for the full list. Converting Values to Strings ============================ Expressions that produce a value will convert that value to the :type:`string` type automatically, or they can be explicitly converted using the `string` creator method as well as the ``asString`` member method every object has. .. method:: string(obj::any) .. method:: string(obj::bytes, enc::string= ?) Converts a value to type :type:`string`. Requires one value which is the data to be converted. An optional second parameter can be used when converting byte streams in order to specify which character set should be used to translate the byte stream to a string (defaults to "UTF-8"). Automatic String Conversion --------------------------- Integer and decimal values are converted to strings automatically if they are used as a parameter to a string operator. If either of the parameters to the operator is a string then the other parameter is converted to a string automatically. 
The following example shows how the integer ``123`` is automatically converted to a string because the other parameter of the ``+`` operator is the string ``'String'``:: 'String ' + 123 // => String 123 The following example shows how a variable that contains the integer ``123`` is automatically converted to a string for the expression:: local(number) = 123 'String ' + #number + '\n' + #number->type // => // String 123 // integer Array, map, and pair values are converted to strings automatically when they are output to a web page or included as part of an auto-collect block. The value they return is intended for the developer to be able to see the contents of the complex type and is not intended to be displayed to site visitors. :: array('One', 'Two', 'Three') // => array(One, Two, Three) map('Key1'="Value1", 'Key2'="Value2") // => map(Key1 = Value1, Key2 = Value2) pair('name'='value') // => (name = value) The parameters sent to the ``string_…`` methods are automatically converted to strings. The following example shows the result of calling `string_length` on an integer:: string_length(21) // => 2 Explicitly Convert a Value to a String Object --------------------------------------------- Integer and decimal values can be converted to string objects using the `string` creator method. The value of the new string is the same as the value of the integer or decimal value when it is output using the `~null->toString` method. The following example shows a math calculation and the integer result ``579``. The next line shows the same calculation with string parameters and the result of ``123456``. :: 123 + 456 // => 579 string(123) + string(456) // => 123456 Boolean values can also be converted to a string object using the `string` creator method. The value will always either be the string "true" or the string "false". 
The following example shows a conditional result converted to type :type:`string`:: string('dog' == 'cat') // => false String member methods can be used on any value by first converting that value to a string using either the `string` creator method or the ``asString`` member method every object has. The following example shows how to use the `string->size` member method on an integer by first converting it to a string object:: 21->asString->size // => 2 string(21)->size // => 2 Byte streams that are converted to strings can include the character set to be used to export the data in the byte stream. By default byte streams are assumed to contain UTF-8 character data. The following example code would translate a byte stream contained in a variable named "myByteStream" using the ISO-8859-1 encoding to interpret the character data. This is analogous to using the `bytes->exportString` method which is described in more detail in the :ref:`byte-streams` chapter:: string(#myByteStream, 'ISO-8859-1') String Inspection Methods ========================= The :type:`string` type has many member methods that return information about the value of the string object. Many of these methods are documented below. (Information about regular expressions and the :type:`regexp` type is found in the :ref:`regular-expressions` chapter.) .. type:: string .. member:: string->length() .. deprecated:: 9.0 Use `string->size` instead. .. member:: string->size() Returns the number of characters in the string. .. member:: string->charName(position::integer) Takes a parameter that specifies the position of the character to inspect. It returns the Unicode name for the specified character. .. member:: string->charType(position::integer) Takes a parameter that specifies the position of the character to inspect. It returns the Unicode type for the specified character. .. 
member:: string->digit(position::integer, base::integer) Takes a parameter that specifies the position of the character to inspect and a parameter that specifies the base or radix. If the specified character is a digit for the specified radix, then it returns the integer value for that digit. (Remember that when integers are converted to strings, they default to displaying in base 10.) The radix or base can be any from "2" to "36". .. member: string->sub(pos::integer) .. member:: string->sub(position::integer, size::integer= ?) .. member: string->substring(start::integer) .. member:: string->substring(start::integer, size::integer= ?) Returns a portion of the string. The starting point is specified by the first parameter and the number of characters to return is specified by the second. If the second parameter is not specified, then all characters from the specified starting position to the end of the string are returned. .. member: string->integer() .. member:: string->integer(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character if no position is specified. It returns the Unicode integer value of that character. .. member:: string->charDigitValue(position::integer) Takes a parameter that specifies the position of the character to inspect. If the specified character is a digit, then it will return an integer of the value of the digit. Otherwise it returns "-1". .. member:: string->getNumericValue(position::integer) Takes a parameter that specifies the position of the character to inspect. If the specified character is a digit, then it will return a decimal of the value of the digit. Otherwise it returns the decimal "-123456789.0". .. member: string->isAlnum() .. member:: string->isAlnum(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. 
If the specified character is alphanumeric the method will return "true" otherwise it will return "false". .. member: string->isAlpha() .. member:: string->isAlpha(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character is alphabetic the method will return "true" otherwise it will return "false". .. member: string->isUAlphabetic() .. member:: string->isUAlphabetic(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character has the Unicode alphabetic property then the method will return "true" otherwise it will return "false". .. member: string->isBase() .. member:: string->isBase(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character is a base Unicode character the method will return "true" otherwise it will return "false". .. member: string->isBlank() .. member:: string->isBlank(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character is a space or tab the method will return "true" otherwise it will return "false". .. member: string->isCntrl() .. member:: string->isCntrl(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character is a control character then the method will return "true" otherwise it will return "false". .. member: string->isDigit() .. member:: string->isDigit(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character is a base 10 digit then the method will return "true" otherwise it will return "false". .. member: string->isXDigit() .. 
member:: string->isXDigit(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character is a hexadecimal digit then the method will return "true" otherwise it will return "false". .. member: string->isGraph() .. member:: string->isGraph(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character is printable and not whitespace then the method will return "true" otherwise it will return "false". .. member: string->isLower() .. member:: string->isLower(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character is lowercase the method will return "true" otherwise it will return "false". .. member: string->isULowercase() .. member:: string->isULowercase(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character has the Unicode lowercase property then the method will return "true" otherwise it will return "false". .. member: string->isPrint() .. member:: string->isPrint(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character is printable the method will return "true" otherwise it will return "false". .. member: string->isPunct() .. member:: string->isPunct(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character is punctuation the method will return "true" otherwise it will return "false". .. member: string->isSpace() .. member:: string->isSpace(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. 
If the specified character is whitespace the method will return "true" otherwise it will return "false". .. member: string->isTitle() .. member:: string->isTitle(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character is in the Unicode category "Letter, Titlecase" then the method will return "true" otherwise it will return "false". .. member: string->isUpper() .. member:: string->isUpper(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character is uppercase the method will return "true" otherwise it will return "false". .. member: string->isUUppercase() .. member:: string->isUUppercase(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character has the Unicode uppercase property then the method will return "true" otherwise it will return "false". .. member: string->isWhitespace() .. member:: string->isWhitespace(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character is whitespace the method will return "true" otherwise it will return "false". .. member: string->isUWhitespace() .. member:: string->isUWhitespace(position::integer= ?) Takes a parameter that specifies the position of the character to inspect, defaulting to the first character. If the specified character has the Unicode whitespace property then the method will return "true" otherwise it will return "false". .. member:: string->find(find::string, offset::integer, -case::boolean= ?) .. member:: string->find(find::string, offset::integer, length::integer) .. member:: string->find(find::string, offset::integer, length::integer, \ patOffset::integer, patLength::integer, case::boolean) .. 
member:: string->find(find::string, \ -offset::integer= ?, \ -length::integer= ?, \ -patOffset::integer= ?, \ -patLength::integer= ?, \ -case::boolean= ?) Searches the value of the string object for the specified string pattern, returning the position of where the pattern first begins in the string object value or zero if the pattern cannot be found. An optional ``-case`` parameter can be used to specify case-sensitive pattern matching. The ``-offset`` and ``-length`` parameters can be used to specify a portion of the string within which to look for the match, with the former specifying the position to begin the search and the latter specifying the number of characters to search. (If ``-length`` is not specified, the method will search to the end of the string.) The ``-patOffset`` and ``-patLength`` parameters can be used to specify that only a portion of the pattern should be used for matching; they behave similarly for the pattern string as the ``-offset`` and ``-length`` parameters do for the base string. .. member:: string->findLast(find::string, \ offset::integer= ?, \ -length::integer= ?, \ -patOffset::integer= ?, \ -patLength::integer= ?, \ -case::boolean= ?) This method is similar to `string->find` except that it returns the starting position of the *last* match found in the string object. .. member:: string->contains(find::string, -case::boolean= ?) .. member:: string->contains(find::regexp, -ignoreCase::boolean= ?) Takes a parameter that specifies a string or regular expression to match within the string object. It returns "true" if it finds a match, otherwise it will return "false". By default, string matching is not case-sensitive unless the optional ``-case`` parameter is passed to the method, but regular expression matching is case-sensitive unless the optional ``-ignoreCase`` parameter is passed to the method. .. member:: string->get(position::integer) Takes a parameter that specifies the position of the character to return. .. 
member:: string->equals(find::string, case::boolean) .. member:: string->equals(find::string, -case::boolean= ?) This method is similar to the ``==`` equality operator. It returns "true" if the specified string is equivalent to the base string. This matching will not be case-sensitive unless passed the ``-case`` parameter. .. member:: string->compare(find::string, -case::boolean= ?) .. member:: string->compare(find::string, offset::integer, \ length::integer= ?, \ patOffset::integer= ?, \ patLength::integer= ?, \ -case::boolean= ?) Takes a string pattern to compare with the string object and returns "0" if they are equal, "1" if the characters in the string are bitwise greater than the parameter, and "-1" if the characters in the string are bitwise less than the parameter. Comparisons are not case-sensitive unless passed the optional ``-case`` parameter. Optionally, the comparison can be made on smaller portions of the string object by passing the ``offset`` and ``length`` parameters, and smaller portions of the pattern by passing the ``patOffset`` and ``patLength`` parameters. .. member:: string->beginsWith(find::string, case::boolean) .. member:: string->beginsWith(find::string, -case::boolean= ?) Takes a parameter that specifies a string to compare with the beginning of the string object value. It returns "true" if it matches the beginning, otherwise it will return "false". By default, string matching is not case-sensitive unless the optional ``-case`` parameter is passed to the method. .. member:: string->endsWith(find::string, case::boolean) .. member:: string->endsWith(find::string, -case::boolean= ?) Takes a parameter that specifies a string to compare with the end of the string object value. It returns "true" if it matches the end, otherwise it will return "false". By default, string matching is not case-sensitive unless the optional ``-case`` parameter is passed to the method. .. 
member:: string->getPropertyValue(position::integer, property::integer) Takes a parameter that specifies the position of the character to inspect and a second parameter that specifies a Unicode property. It returns the Unicode property value for the indicated character and property. Unicode properties are defined in the `Unicode Character Database`_ (UCD) and `Unicode Technical Reports`_ (UTR). Lasso defines many methods that return values for these Unicode property names. All of these values have the ``UCHAR_`` prefix. .. member:: string->hasBinaryProperty(position::integer, property::integer) Takes a parameter that specifies the position of the character to inspect and a second parameter that specifies a Unicode property. It returns "true" if the specified character has the specified property, otherwise it returns "false". Find the Size of a String ------------------------- The following example returns the number of characters of the string:: 'Ralph is a red rhinoceros'->size // => 25 Check for Lowercase Characters ------------------------------ The following example inspects each character in a string and counts the number of lowercase letters it contains:: local(num_lcase) = 0 local(my_string) = 'Ralph is a red rhinoceros' loop(#my_string->size) => { #my_string->isLower(loop_count) ? #num_lcase++ } #num_lcase // => 20 Check the Beginning of a String ------------------------------- The following example checks to see if a string begins with "https:". If so, it displays "secure", otherwise it displays "insecure":: local(url) = 'https://secure.example.com' #url->beginsWith('https:') ? 'secure' | 'insecure' // => secure Find a Substring ---------------- This example uses the `string->find` method to find and output each position in a string where there is an apostrophe:: local(my_string) = "Don't, it's not worth it!" 
local(position) = 0 while(#position < #my_string->size) => {^ #position = #my_string->find(`'`, #position + 1) if(0 == #position) => { loop_abort } #position + '\n' ^} // => // 4 // 10 Extract a Substring ------------------- The following example will pull the substring "red" out of the base string:: local(my_string) = 'Ralph is a red rhinoceros' #my_string->sub(12, 3) // => red Extract a Specified Character Position -------------------------------------- The following example uses `string->get` to return the last character in a string:: local(my_string) = 'Ralph is a red rhinoceros' #my_string->get(#my_string->size) // => s String Manipulation Methods =========================== The :type:`string` type includes many member methods that can be used to modify or manipulate a string object in-place. These methods do not return a value, and instead modify the value of the string object. Many of these member methods are documented below. .. member:: string->append(s::string) .. member:: string->append(obj::any) Takes a single parameter that will be converted to a string and then concatenated to the end of the string object. It modifies the string object in-place, not returning any value. .. member:: string->appendChar(i::integer) Takes an integer that is the Unicode integer value in base 10 of a character. This character is then concatenated with the end of the string object. It modifies the string object in-place, not returning any value. .. member: string->remove() .. member:: string->remove(position::integer= ?) .. member:: string->remove(position::integer, num::integer) Takes a parameter that specifies the position of the first character to remove, defaulting to the first character. A second parameter can specify the number of characters to remove and defaults to removing all the characters from the starting position. It modifies the string object in-place, not returning any value. .. member:: string->normalize() Transforms a string object into its normalized form. 
It modifies the string object in-place, not returning any value. For more information on normalizing Unicode strings, see the `Unicode Normalization FAQ`_ and `Unicode Standard Annex #15`_. .. member:: string->foldCase() Converts the characters in the string object to allow for case-insensitive comparisons. It modifies the string object in-place, not returning any value. .. member:: string->trim() Removes any whitespace from the beginning and end of a string. It modifies the string object in-place, not returning any value. .. member:: string->reverse() Changes the string object to the value of the base string in reverse order. It modifies the string object in-place, not returning any value. .. member:: string->toLower(position::integer) Takes a parameter that specifies the position of the character to modify. That character is converted to lowercase if possible. It modifies the string object in-place, not returning any value. .. member:: string->toUpper(position::integer) Takes a parameter that specifies the position of the character to modify. That character is converted to uppercase if possible. It modifies the string object in-place, not returning any value. .. member:: string->toTitle(position::integer) Takes a parameter that specifies the position of the character to modify. That character is converted to title case if possible. It modifies the string object in-place, not returning any value. .. member:: string->lowercase() Changes every possible character in a string to lowercase. It modifies the string object in-place, not returning any value. .. member:: string->uppercase() Changes every possible character in a string to uppercase. It modifies the string object in-place, not returning any value. .. member:: string->titlecase() .. member:: string->titlecase(language::string, country::string) Changes every possible word in a string to title case. 
It can optionally take a language code for the first parameter and a country code for the second to specify a locale to be used when performing this operation. It modifies the string object in-place, not returning any value. .. member:: string->padLeading(tosize::integer, with::string= ?) Takes a parameter that specifies the target size of the string. If the base string object is smaller in size, then it changes the string by prepending a character to the start of the string until the string is the specified size. The character used for prepending defaults to a space, but can be set with the optional second parameter. It modifies the string object in-place, not returning any value. .. member:: string->padTrailing(tosize::integer, with::string= ?) Takes a parameter that specifies the target size of the string. If the base string object is smaller in size, then it changes the string by appending a character to the end of the string until the string is the specified size. The character used for appending defaults to a space, but can be set with the optional second parameter. It modifies the string object in-place, not returning any value. .. member:: string->removeLeading(find::string) .. member:: string->removeLeading(find::regexp) Takes either a string or a regular expression and removes all specified matches from the beginning of the string. It keeps removing until the beginning of the string no longer matches the specified pattern. It modifies the string object in-place, not returning any value. .. member:: string->removeTrailing(find::string) Takes a string and removes all matches specified from the end of the string. It keeps removing until the end of the string no longer matches the specified parameter. It modifies the string object in-place, not returning any value. .. member:: string->merge(where::integer, what::string, offset::integer= ?, length::integer= ?) Merges a specified string into the base string. 
It requires the first parameter to specify the position in the base string where the merge takes place and a second parameter that specifies the string to merge into the base string. It modifies the string object in-place, not returning any value. Optionally, a third parameter can specify the starting position within the passed string to be used in the merge, and a fourth can specify the number of characters after that offset to be merged from the passed string.

.. member:: string->replace(find::string, replace::string, -case::boolean= ?)
.. member:: string->replace(find::regexp, replace= ?, ignoreCase= ?)

   Takes either a string or a regular expression and replaces all matches found
   in the string object value with the specified replacement. For regular
   expression matches, the replacement string can be specified for this method,
   or it will use the replacement string of the :type:`regexp` object. It
   modifies the string object in-place, not returning any value.

   When using a regular expression, the method defaults to case-sensitive
   matching unless otherwise specified by the third parameter. When using a
   string for matching, the default is the reverse: it uses case-insensitive
   matching unless otherwise specified by the third parameter.
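
The case-sensitivity defaults of `string->replace` can be illustrated with a
short sketch. The variable names are only illustrative, and the commented
results follow from the defaults described in this chapter rather than from a
recorded run: a plain-string search is expected to replace a match regardless of
case, while passing ``-case`` should restrict it to exact-case matches::

   // Default string matching is documented as case-insensitive,
   // so 'RED' should match 'red'
   local(animal) = 'Ralph is a red rhinoceros'
   #animal->replace('RED', 'blue')
   #animal
   // => Ralph is a blue rhinoceros

   // With -case the match must agree in case, so nothing should be replaced
   local(strict) = 'Ralph is a red rhinoceros'
   #strict->replace('RED', 'blue', -case=true)
   #strict
   // => Ralph is a red rhinoceros

Because the method modifies the string in-place and returns no value, the
variable is output on its own line after the call.
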
Append Data to a String ----------------------- This example uses the `string->append` method to add a trailing slash to a directory path if one does not already exist:: local(dir_path) = '/var/lasso/home' if(not #dir_path->endsWith('/')) => { #dir_path->append('/') } #dir_path // => /var/lasso/home/ Remove Whitespace Around a String --------------------------------- This example uses the `string->trim` method to remove whitespace from the beginning and end of the string and then outputs the string:: local(my_string) = '\n Ralph the Ringed Rhino \n\n' #my_string->trim #my_string // => Ralph the Ringed Rhino Ensure All Characters are Lowercase ----------------------------------- This example takes a string and converts all the characters to lowercase and then outputs the changed string:: local(my_string) = 'Ralph the Ringed Rhino races red radishes in THE RINK.' #my_string->lowercase #my_string // => ralph the ringed rhino races red radishes in the rink. Remove a Pattern from the End of a String ----------------------------------------- This example removes all the trailing commas from the string:: local(my_string) = 'First, Second, Fifth,,,' #my_string->removeTrailing(',') #my_string // => First, Second, Fifth String Encoding Methods ======================= .. member:: string->hash() Returns a simple hash of the string object. .. member:: string->unescape() Returns a string with any escape sequences (a sequence beginning with a backslash) in the base string object replaced with their literal Unicode equivalents. This is the same escape process Lasso does for non-ticked string literals. .. member:: string->encodeHtml() .. member:: string->encodeHtml(linebreaks::boolean, ignorechars::boolean) Returns a string with any reserved, illegal, or extended ASCII characters in the base string object converted to their equivalent HTML entity. This replacement can be modified by passing two boolean parameters. 
If the first parameter is set to "true", then line breaks are encoded. If the second parameter is set to "true", then the following characters are not encoded: ``" & ' < >`` (double quotation mark, ampersand, single quotation mark, less than or left angle bracket, and greater than or right angle bracket, respectively). .. member:: string->decodeHtml() Returns a string with any HTML entities in the base string object converted to their Unicode equivalent. This is the opposite of the `string->encodeHtml` method. .. member:: string->encodeXml() Returns a new string of the base string object with any reserved or illegal XML characters encoded into their equivalent XML entity. .. member:: string->decodeXml() Returns a string from the base string object with any XML entities converted to their Unicode equivalent. This is the opposite of the `string->encodeXml` method. .. member:: string->encodeHtmlToXml() Returns a string from the base string object with any HTML encoded entities converted to XML encoding. .. member: string->asBytes() .. member:: string->asBytes(encoding::string= ?) Returns the value of the base string as a bytes object. By default, UTF-8 encoding is used for this conversion, but any encoding can be specified as a string parameter to this method. .. member:: string->encodeSql() Returns the value of the base string with any illegal characters for MySQL data sources properly escaped. .. member:: string->encodeSql92() Returns the value of the base string with any illegal characters for SQL-92--compliant databases properly escaped. Not for use with MySQL. Convert Escape Sequences ------------------------ The following example creates a string with escape sequences using a ticked string literal so that Lasso won't automatically unescape them. 
It then outputs the string before calling `string->unescape` and then shows the result of calling `string->unescape`::

   local(my_string) = `Chinese Character: \u4E26`
   #my_string + '\n'
   #my_string->unescape

   // =>
   // Chinese Character: \u4E26
   // Chinese Character: 並

Encode HTML Entities
--------------------

The following example uses `string->encodeHtml` to return a string with the
special HTML entities encoded::

   local(my_string) = '<>&'
   #my_string->encodeHtml

   // => &lt;&gt;&amp;

Encode for Use in MySQL
-----------------------

The following example returns a string whose quotes have been encoded for use in
a MySQL SQL statement::

   local(my_string) = "Don't forget to encode"
   #my_string->encodeSql

   // => Don\'t forget to encode

String Iteration Methods
========================

.. member:: string->forEachCharacter()

   Takes a capture block and executes that block once for every character in the
   base string. The character can be accessed in the capture block through the
   special local variable ``#1``.

.. member:: string->forEachWordBreak()

   Takes a capture block and executes that block once for every word in the base
   string. The word can be accessed in the capture block through the special
   local variable ``#1``.

.. member:: string->forEachLineBreak()

   Takes a capture block and executes that block once for every substring that
   would be generated by splitting the base string object on a line break. Every
   line break character is recognized: ``"\r"``, ``"\n"``, and ``"\r\n"``. Each
   of the substrings can be accessed in the capture block through the special
   local variable ``#1``.

.. member:: string->forEachMatch(exp::string)
.. member:: string->forEachMatch(exp::regexp)

   Takes a capture block and executes that block once for every specified match
   in the base string object. Matches can be specified as either :type:`string`
   or :type:`regexp` objects. The match can be accessed in the capture block
   through the special local variable ``#1``.

.. 
.. member:: string->eachCharacter()

   Returns an ``eacher`` that can be used in conjunction with query expressions
   to inspect and perform complex operations on every character in the base
   string object.

.. member:: string->eachWordBreak()

   Returns an ``eacher`` that can be used in conjunction with query expressions
   to inspect and perform complex operations on every word in the base string
   object.

.. member:: string->eachLineBreak()

   Returns an ``eacher`` that can be used in conjunction with query expressions
   to inspect and perform complex operations on every line in the base string
   object.

.. member:: string->eachMatch(exp::string)
.. member:: string->eachMatch(exp::regexp)

   Returns an ``eacher`` that can be used in conjunction with query expressions
   to inspect and perform complex operations on every specified match in the
   base string object. Matches can be specified as either :type:`string` or
   :type:`regexp` objects.


Iterate Over Lines
------------------

The following example takes a string with multiple lines and runs the lines of
the string together with slashes, storing the result in the variable
"quoted_poem". It removes the trailing slash at the end and then displays the
variable "quoted_poem" in quotes. ::

   local(poem) = '\
   An old silent pond...
   A frog jumps into the pond,
   Splash! Silence again.'
   local(quoted_poem) = ''

   #poem->forEachLineBreak => {
      #quoted_poem->append(#1 + '/')
   }
   #quoted_poem->removeTrailing('/')
   '"' + #quoted_poem + '"'

   // => "An old silent pond.../A frog jumps into the pond,/Splash! Silence again."


Iterate Over Words
------------------

The following example takes a string and inspects each word using a query
expression. If the word starts with the letter "r" then it will transform it to
uppercase. The query expression selects each word, allowing us to create a
staticarray of words. ::

   local(my_string) = 'Ralph is a red rhinoceros.'
   (
      with word in #my_string->eachWordBreak
      select (#word->beginsWith('r') ?
         #word->uppercase& | #word)
   )->asStaticArray

   // => staticarray(RALPH, is, a, RED, RHINOCEROS.)


Iterate Over a Specified Regular Expression Match
-------------------------------------------------

The following example uses `string->eachMatch` with a :type:`regexp` object to
find every vowel in a string, where the local variable "vowels" is used to count
the number of each vowel in the string. ::

   local(my_string) = 'ralph is a red rhinoceros.'
   local(vowels) = map('a'=0, 'e'=0, 'i'=0, 'o'=0, 'u'=0)
   with letter in #my_string->eachMatch(regexp(`[aeiouAEIOU]`))
   do #vowels->find(#letter)++
   #vowels

   // => map(a = 2, e = 2, i = 2, o = 2, u = 0)


String Export Methods
=====================

.. member:: string->split(find::string)

   Returns an array with elements created by breaking up the string on the
   specified string. If an empty string is specified, each element of the array
   is a single character of the string.

.. member:: string->values()

   Returns an array, each element of which is one character of the string.

.. member:: string->keys()

   Returns a :type:`generateSeries` from 1 to the number of characters in the
   string, or an empty :type:`generateSeries` if the string is empty.


Split a String Into an Array
----------------------------

The following example creates an array by splitting a string on a comma::

   local(my_string) = '1,3,9,f,g'
   #my_string->split(',')

   // => array(1, 3, 9, f, g)

.. _Unicode Character Database: http://www.unicode.org/ucd/
.. _Unicode Technical Reports: http://www.unicode.org/reports/
.. _Unicode Normalization FAQ: http://www.unicode.org/faq/normalization.html
.. _Unicode Standard Annex #15: http://www.unicode.org/reports/tr15/
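
Split a String Into Characters
------------------------------

As a brief sketch of the empty-string case described for `string->split`, the
following example splits on an empty string so that each character becomes its
own array element. The variable name and sample value are illustrative only, and
the output shown assumes the same array display format as the comma example
above::

   local(my_string) = 'abc'
   // An empty delimiter yields one element per character
   #my_string->split('')

   // => array(a, b, c)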