Strings

"String" is a native TADS 3 datatype, but it is also an intrinsic class.  The String class provides a number of useful methods for manipulating string values.

Value Semantics

Strings have "value semantics," which means that a given string object's text is constant; once you've created a string, the text within that string never changes.  All of the methods and operators that appear to change the value of a string actually create a new string with the modified value, leaving the original value intact.  For example, consider this code:

 

   local x = 'foo';
   local y = x;
   x += 'bar';

 

Superficially, it appears that the last line changes the string in x.  In fact, the original string is not changed – if we displayed the value of y, we'd see that it still contains "foo".  When the interpreter executes the last line above, it creates a new string to hold the concatenated value, then assigns the result to x.

 

Value semantics make it very easy to work with strings, because you don't have to worry about whether a function might modify a string you pass to it: this can never happen, because a given string's text is constant.
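
As a minimal sketch of this guarantee, the program below passes a string to a function that appears to modify it.  The function name addSuffix is hypothetical and is used only for illustration; the program assumes a bare TADS 3 build that includes tads.h and uses main(args) as its entry point.

   #include <tads.h>

   /* hypothetical helper, not part of the String class */
   addSuffix(s)
   {
       /* += creates a new string and rebinds only the local parameter s */
       s += 'bar';
       return s;
   }

   main(args)
   {
       local x = 'foo';
       local y = addSuffix(x);

       /* displays: x = foo, y = foobar */
       "x = <<x>>, y = <<y>>\n";
   }

The caller's string in x is untouched, because the concatenation inside addSuffix produces a new string rather than altering the one that was passed in.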

String Methods

endsWith(str) – returns true if this string ends with str, nil if not.  This string ends with str if this string is at least as long as str, and the last str.length() characters of this string are the same as the characters of str.
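
A few illustrative cases, worked out from the definition above rather than taken from the original text:

   'abcdef'.endsWith('def') yields true
   'abcdef'.endsWith('abc') yields nil
   'ef'.endsWith('def') yields nil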

 

find(str, index?) – finds the substring str within this string.  If the substring is contained within this string, the method returns the character index where the substring starts; the first character is at index 1.  If the substring isn't contained within this string, the method returns nil.

 

If index is given, it specifies the character index in this string at which the search starts; a value of 1 indicates that the search starts at the first character.  If index is omitted, the default value is 1.  The starting index is useful for finding the next occurrence of the same substring after a previous search, for example.

 

Examples:

 

   'abcdef'.find('cd') yields 3
   'abcdef'.find('g') yields nil
   'abcdef'.find('c', 3) yields 3
   'abcdef'.find('c', 4) yields nil
   'abcabcabc'.find('c', 4) yields 6
   'abcabcabc'.find('c', 7) yields 9

 

findReplace(origStr, newStr, flags, index?) – finds instances of the substring origStr within this string, and replaces them with the new substring newStr.  If the flags value is ReplaceAll, then all occurrences of origStr are replaced; if the value is ReplaceOnce, then only the first occurrence is replaced.  As with all String methods, the original string is left unchanged, and the result is a new string containing the replacements.

 

If index is specified, it gives the character index in this string at which the search starts.  If index is 1, the search starts at the first character; this is the default if index is not given.  Instances of origStr that occur before index are not replaced.
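
The following cases illustrate the flags and the starting index.  They are worked out from the description above, and assume that tads.h has been included so that the ReplaceAll and ReplaceOnce constants are defined:

   'abcabc'.findReplace('b', 'X', ReplaceAll) yields 'aXcaXc'
   'abcabc'.findReplace('b', 'X', ReplaceOnce) yields 'aXcabc'
   'abcabc'.findReplace('b', 'X', ReplaceAll, 3) yields 'abcaXc'
   'abcabc'.findReplace('z', 'X', ReplaceAll) yields 'abcabc'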

 

htmlify(flags?) – converts HTML markup-significant characters in the string to appropriate HTML sequences, and returns the resulting string.  If the flags argument is not included, the method acts as though flags has the value 0 (zero).  By default, this method scans the string for the characters "&" (ampersand) and "<" (less than), and converts these characters to the sequences "&amp;" and "&lt;" respectively.  This conversion ensures that, when the string is rendered in HTML mode, the display shows ampersands and less-than signs where they appeared in the original string's text.  In addition, you can specify a combination (using the bitwise OR operator, "|") of the following flags to perform additional conversions:

 

 

This method is useful if you obtain a string from an external source, such as from the user (via the inputLine() function, for example) or from a text file, and you then want to display the string in HTML mode.  Without conversions, any markup-significant characters in the string might not be displayed properly, since the HTML parser would attempt to interpret the characters as HTML formatting codes.  You can use this method to ensure that a string obtained externally is displayed verbatim in HTML mode.
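
With no flags, the default conversion described above gives, for example:

   'a < b & c'.htmlify() yields 'a &lt; b &amp; c'

A minimal sketch of the typical use follows.  It assumes a bare program that includes tads.h, and that the display is interpreting HTML markup:

   #include <tads.h>

   main(args)
   {
       /* read a line of raw text from the user */
       local s = inputLine();

       /* inputLine() can return nil (at end of input, for example) */
       if (s != nil)
       {
           /* escape "&" and "<" so that the text is shown verbatim */
           "You typed: <<s.htmlify()>>\n";
       }
   }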

 

length() – returns the number of characters in the string.

 

mapToByteArray(charset) – maps the string from its internal Unicode representation to the corresponding representation in the character set specified by charset, and returns a new ByteArray containing the bytes of the result.  The charset parameter must be an object of class CharacterSet.  If charset refers to an unknown character set, the method throws an UnknownCharSetException; to avoid the exception, you can check in advance whether the character set is known by calling its isMappingKnown() method.
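
The sketch below shows one way to check the character set before mapping.  It is an illustration under stated assumptions, not a definitive recipe: it assumes the charset.h and bytearr.h headers are available alongside tads.h, and that 'us-ascii' is a character set name the local system recognizes.

   #include <tads.h>
   #include <charset.h>
   #include <bytearr.h>

   main(args)
   {
       /* 'us-ascii' is only an example name */
       local cs = new CharacterSet('us-ascii');

       if (cs.isMappingKnown())
       {
           /* safe to map: the character set is known */
           local bytes = 'abc'.mapToByteArray(cs);
           "Mapped to <<bytes.length()>> bytes\n";
       }
       else
       {
           /* mapping would throw UnknownCharSetException */
           "Character set not available\n";
       }
   }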

 

substr(start, length?) – returns a new string consisting of a substring of this string.  The substring starts at character index start (the first character in the string is at index 1).  If length is specified, the result string is at most length characters long; if length is not specified, the result runs to the end of the source string.

 

If the start parameter is negative, it indicates an offset from the end of the string: -1 indicates that the substring is to start at the last character, -2 at the second-to-last, and so on.

 

Examples:

 

   'abcdef'.substr(3) yields 'cdef'
   'abcdef'.substr(3, 2) yields 'cd'
   'abcdefghi'.substr(-3) yields 'ghi'
   'abcdefghi'.substr(-3, 2) yields 'gh'

 

toLower() – returns a new string consisting of the characters of the original string converted to lower-case.  Only alphabetic characters are affected; other characters are copied to the new string unchanged.  The conversion uses the case conversions specified in the Unicode character database, so accented and non-Roman alphabetic characters are properly converted.

 

startsWith(str) – returns true if this string starts with str, nil if not.  This string starts with str if this string is at least as long as str, and the first str.length() characters match the characters of str.
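
A few illustrative cases, worked out from the definition above rather than taken from the original text:

   'abcdef'.startsWith('abc') yields true
   'abcdef'.startsWith('def') yields nil
   'ab'.startsWith('abc') yields nil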

 

toUnicode(idx?) – converts one or all of the characters of this string to Unicode character codes.  If idx is given, it specifies the character index within the string of the single character to convert (the first character is at index 1), and the method returns an integer containing the Unicode code point for the character at that index.  If idx is not specified, the method returns a list; each element in the list is an integer giving the Unicode code point value for the corresponding character in the source string.  The list will have one element per character in the source string.

 

This method can be used to decompose a string into its individual characters, which is sometimes an easier or more efficient way of manipulating the string.  You can convert a list of Unicode code point values back into a string using the makeString() function in the "tads-gen" function set.
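
For example:

   'abc'.toUnicode() yields [97, 98, 99]
   'abc'.toUnicode(2) yields 98
   makeString([97, 98, 99]) yields 'abc'

As a sketch of the decompose-and-rebuild approach, the function below reverses a string by working on its list of code points.  The name reverseString is hypothetical and is used only for illustration; the code assumes tads.h has been included so that makeString() is available.

   #include <tads.h>

   /* hypothetical helper: returns a new string with the characters reversed */
   reverseString(s)
   {
       /* decompose the string into a list of code points */
       local codes = s.toUnicode();
       local rev = [];

       /* walk the list from the last character to the first */
       for (local i = codes.length() ; i >= 1 ; --i)
           rev += codes[i];

       /* rebuild a string from the reversed list */
       return makeString(rev);
   }

With this definition, reverseString('abc') yields 'cba'.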

 

toUpper() – returns a new string consisting of the characters of the original string converted to upper-case.  Only alphabetic characters are affected; other characters are copied to the new string unchanged.  The conversion uses the case conversions specified in the Unicode character database, so accented and non-Roman alphabetic characters are properly converted.
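
For example, only the alphabetic characters change in the following conversions:

   'Hello, World 123'.toUpper() yields 'HELLO, WORLD 123'
   'Hello, World 123'.toLower() yields 'hello, world 123'

One common use of these methods is comparing strings without regard to case.  The function below is a minimal sketch of that idea; the name sameIgnoringCase is hypothetical and is not part of the String class.

   /* hypothetical helper: true if a and b match when case is ignored */
   sameIgnoringCase(a, b)
   {
       /* fold both strings to lower-case, then compare by value */
       return a.toLower() == b.toLower();
   }

With this definition, sameIgnoringCase('North', 'NORTH') yields true.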