Byte Streams
Binary data in Lasso is stored and manipulated using the bytes type.
This chapter details the operators and methods that can manipulate binary data.
Tip
The bytes type is often used in conjunction with the string type to convert binary data between different character encodings, such as UTF-8 and ISO-8859-1. See the Strings chapter for more information about the string type.
Creating Bytes Objects
While string data in Lasso is processed as one- to four-byte Unicode characters, the bytes type can represent raw strings of single bytes, which are often referred to as a byte stream or binary data.
Lasso’s methods return a bytes object in the following situations:
- The bytes creator method allocates a new bytes object.
- The web_request->param methods return a bytes object.
- The field method returns a bytes object from MySQL “BLOB” fields.
- Other methods that return or require binary data as outlined in their documentation.
- type bytes
- bytes()
- bytes(initial::integer)
- bytes(copy::bytes)
- bytes(import::string, encoding::string=?)
- bytes(doc::pdf_doc)
  Allocates a bytes object. Can convert a string or pdf_doc type to a bytes type, or instantiate a new bytes object. Accepts one optional parameter that specifies either the initial size in bytes for the stream, or the string, pdf_doc, or bytes object to convert to a new bytes object. When converting a string object, an optional second parameter can specify the encoding of the string.
- bytes->reserve(size::integer)
  Attempts to preallocate enough memory for the specified number of bytes. Useful for optimization by avoiding memory reallocation if the expected byte stream size is known in advance.
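As a rough illustration of the creator forms listed above, the following sketch builds bytes objects from nothing, from an initial size, and from a string with and without an explicit encoding. The variable names and sample text are made up for the example, and the sizes shown assume that a string is converted as UTF-8 when no encoding is given.

```lasso
// An empty bytes object
local(empty) = bytes

// A bytes object created with an initial size of 1024 bytes requested
local(buffer) = bytes(1024)

// Ask for more room up front if the final size is known in advance
#buffer->reserve(4096)

// Convert a string, first with the default encoding, then with an explicit one
local(utf8) = bytes('café')
local(latin1) = bytes('café', 'ISO-8859-1')

#utf8->size
// => 5 ('é' takes two bytes in UTF-8)
#latin1->size
// => 4 ('é' takes one byte in ISO-8859-1)
```

Choosing the encoding at creation time matters because the same string can yield byte streams of different sizes and contents.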
Bytes Inspection Methods
Byte streams are similar to strings and support many of the same member methods. Additionally, byte streams support a number of member methods that make it easier to deal with binary data. The most common methods are outlined below.
- bytes->size()
  Returns the number of bytes contained in the bytes object.
- bytes->length()
  Deprecated since version 9.0: Use bytes->size instead.
- bytes->get(position::integer) → integer
  Returns a single byte from the stream as an integer. Requires a parameter specifying the position of the byte to fetch; as the examples in this chapter show, positions are counted from 1.
- bytes->getRange(position::integer, num::integer) → bytes
  Returns a range of bytes from the byte stream. Requires two parameters: the first specifies the byte position to start from, and the second specifies how many bytes to return.
- bytes->find(find::bytes, position::integer=?, length::integer=?, patPosition::integer=?, patLength::integer=?)
- bytes->find(find::string, position::integer=?, length::integer=?, patPosition::integer=?, patLength::integer=?)
  Searches the bytes object for the byte sequence or string pattern specified in the first parameter, returning the position where the sequence first begins in the bytes object or “0” if the pattern cannot be found.
  The second and third parameters can specify a portion of the bytes object within which to look for the match, with the former specifying the position to begin the search and the latter specifying the number of bytes to search. Similarly, the fourth and fifth parameters can specify a portion of the sequence that should be used for matching.
- bytes->contains(find::string)
- bytes->contains(find::bytes)
  Returns “true” if the byte stream contains the specified sequence.
- bytes->beginsWith(find::string)
- bytes->beginsWith(find::bytes)
  Returns “true” if the byte stream begins with the specified sequence.
- bytes->endsWith(find::string)
- bytes->endsWith(find::bytes)
  Returns “true” if the byte stream ends with the specified sequence.
- bytes->bestCharset(charset::string)
  Checks if the byte stream can be encoded using the specified character set. Returns the specified character set name if it can, or the name of a more appropriate character set if not.
- bytes->detectCharset()
  Checks which character sets could be used to decode the byte stream and returns a staticarray of guesses, where each guess is a staticarray of the character set name, the language covered by the character set (if any), and a confidence value.
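The inspection methods above can be combined as in the following sketch. The sample text and variable name are illustrative only, and the results in the comments follow from the method descriptions and the 1-based positions used elsewhere in this chapter; they were not checked against a particular Lasso release.

```lasso
local(msg) = bytes('hello world')

#msg->size
// => 11
#msg->get(1)
// => 104 (the byte value of 'h')
#msg->getRange(1, 5)
// => a bytes object holding the first five bytes, 'hello'
#msg->beginsWith('hello')
// => true
#msg->endsWith('world')
// => true
#msg->contains('World')
// => false (matching is byte-for-byte, so case matters)
```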
Find a Character Set for a Byte Stream
Use the bytes->bestCharset method. The examples below show the result for a byte stream that fits the suggested character set, followed by one containing a character that can’t be encoded with it:
bytes('This is a plain ASCII string')->bestCharset('ISO-8859-1')
// => ISO-8859-1
bytes('This isn’t a plain ASCII string')->bestCharset('ISO-8859-1')
// => UTF-8
Bytes Export Methods
Bytes objects keep track of a “marker” indicating where in the stream export operations will begin. A newly created bytes object has its marker set to “0”, and the marker is incremented by the number of exported bytes when any of the export member methods that return bytes objects are called. The marker can also be set manually.
- bytes->asString(encoding::string=?)
  Returns the entire byte stream as a string using the specified encoding, defaulting to “UTF-8”.
- bytes->marker()
  Returns the current position at which exports will occur in the byte stream.
- bytes->marker=(value::integer)
  Sets the byte stream’s marker to the passed value.
- bytes->position()
- bytes->position=(value::integer)
- bytes->setPosition(i::integer)
  Deprecated since version 9.0: Use bytes->marker and bytes->marker= instead.
- bytes->exportString(encoding::string)
  Returns a string representing the byte stream. Requires a single parameter specifying the character encoding (e.g. “ISO-8859-1” or “UTF-8”) for the export. If the byte stream has a marker set, only the bytes following the marker will be returned. The marker is not modified.
- bytes->exportBytes(num::integer=?)
  Returns the byte stream as a bytes object. Accepts one optional parameter that can specify the number of bytes to return. If the byte stream has a marker set, only the bytes following the marker will be returned. Sets the marker to the end of the stream.
- bytes->export8bits()
- bytes->export16bits()
- bytes->export32bits()
- bytes->export64bits()
  Returns 1, 2, 4, or 8 bytes of the byte stream starting from the marker as an integer and increments the marker by the same amount.
- bytes->exportSigned8bits()
- bytes->exportSigned16bits()
- bytes->exportSigned32bits()
- bytes->exportSigned64bits()
  Returns 1, 2, 4, or 8 bytes of the byte stream starting from the marker as a signed (two’s-complement) integer and increments the marker by the same amount.
- bytes->split(find::string)
- bytes->split(find::bytes)
  Returns an array of bytes objects using the specified sequence as the delimiter to split the byte stream. If the delimiter provided is an empty byte stream or string, the byte stream is split on each byte, so the returned array will have each byte as one of its elements.
- bytes->sub(position::integer, num::integer=?)
  Returns a specified slice of the byte stream. Requires an integer parameter specifying the index into the byte stream to start taking the slice from. An optional second integer parameter can specify the number of bytes to slice out of the bytes object. If the second parameter is not specified, all of the bytes following the index are returned.
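To make the marker behavior described in this section more concrete, here is a minimal sketch that reads a two-byte stream one byte at a time and then rewinds it. The variable name and sample data are illustrative, and the values in the comments are what the method descriptions above imply rather than verified output.

```lasso
local(data) = bytes('AB')

#data->marker
// => 0 (a newly created bytes object starts with its marker at 0)
#data->export8bits
// => 65 (the byte value of 'A'; the marker advances by one)
#data->marker
// => 1
#data->export8bits
// => 66 (the byte value of 'B')

// Set the marker manually to read the stream again from the start
#data->marker = 0
#data->export8bits
// => 65
```

Only single-byte exports are shown because the byte order used by the wider export methods is not stated in this chapter.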
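Return the Size of a Byte Stream
Use the bytes->size method. The example below returns the size of a bytes object. The result is 6 rather than 4 because the ellipsis character occupies three bytes when stored as UTF-8: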
local(obj) = bytes('abc…')
#obj->size
// => 6
Return a Single Byte from a Byte Stream
Use the bytes->get method. An integer parameter specifies the index of the byte to return. Note that this method returns an integer, not a fragment of the original data (such as a string character). In the example below, 101 is the byte value of the second byte, 'e':
local(obj) = bytes('hello world')
#obj->get(2)
// => 101
Find a Value Within a Byte Stream
Use the bytes->find method. The example below returns the starting byte number of the value 'rhino', which is contained within the byte stream:
bytes('running rhinos risk rampage')->find('rhino')
// => 9
Determine If a Byte Stream Contains a Value
Use the bytes->contains method. The example below checks whether the value 'Rhino' is contained within the byte stream. It returns “false” because the bytes of 'rhino' are a different sequence than the bytes of 'Rhino'.
bytes('running rhinos risk rampage')->contains('Rhino')
// => false
Export a String from a Byte Stream
Use the bytes->exportString method. The following example exports a string using UTF-8 encoding:
local(obj) = bytes('This is a string')
#obj->exportString('UTF-8')
// => This is a string
Bytes Decoding/Encoding Methods
- bytes->crc()
  Returns the cyclic redundancy check integer value for the byte stream.
- bytes->encodeBase64()
  Returns a base64-encoded representation of the byte stream as a bytes object.
- bytes->decodeBase64()
  Returns the binary data of a base64-encoded byte stream as a bytes object. This is the opposite of the bytes->encodeBase64 method.
- bytes->encodeHex()
  Returns the byte stream in hexadecimal format.
- bytes->decodeHex()
  Returns the binary data of a byte stream containing hexadecimal ASCII characters by converting each pair of characters to a single byte. This is the opposite of the bytes->encodeHex method.
- bytes->encodeMd5()
  Returns the MD5 hash value for the byte stream as a bytes object.
- bytes->encodeQP()
  Returns the byte stream in quoted-printable format.
- bytes->decodeQP()
  Returns the binary data of a quoted-printable–encoded byte stream as a bytes object. This is the opposite of the bytes->encodeQP method.
- bytes->encodeSql()
  Returns the byte stream with any illegal characters for MySQL data sources properly escaped.
- bytes->encodeSql92()
  Returns the byte stream with any illegal characters for SQL-92–compliant data sources properly escaped. Not for use with MySQL.
- bytes->encodeUrl()
  Returns the byte stream with any illegal characters for URLs properly escaped.
- bytes->decodeUrl()
  Returns the binary data of a URL-encoded byte stream as a bytes object, with any escaped characters replaced with their ASCII equivalents. This is the opposite of the bytes->encodeUrl method.
Encode a File as Base64
Use the bytes->encodeBase64 method. The example below reads a file into a byte stream and prints its Base64-encoded value:
file('red-dot.png')->readBytes->encodeBase64
// => iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==
Bytes Iteration Methods
- bytes->forEachByte()
  Executes a given capture block once for every byte in the byte stream. The byte can be accessed in the capture block through the special local variable #1.
- bytes->eachByte()
  Returns an eacher that can be used in conjunction with query expressions to inspect and perform complex operations on every byte in the byte stream.
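A minimal sketch of bytes->forEachByte follows, assuming the usual Lasso 9 form of attaching a capture block with `=>`. It only counts how often the block runs, so it does not depend on how each byte is represented in #1. The variable names are illustrative.

```lasso
local(data) = bytes('abc')
local(count) = 0

// The capture block runs once per byte; #1 would hold the current byte
#data->forEachByte => {
    #count += 1
}

#count
// => 3
```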
Bytes Manipulation Methods
Calling the following methods will modify the bytes object without returning a value.
- bytes->setSize(num::integer)
  Sets the byte stream size to the specified number of bytes.
- bytes->setRange(what::bytes, where::integer=?, whatStart::integer=?, whatLen::integer=?)
  Sets a range of bytes within a byte stream. Requires one parameter for the binary data to be inserted. The optional second, third, and fourth parameters specify the integer offset into the byte stream to insert the new data, and the offset and length of the new data to be inserted, respectively.
- bytes->padLeading(tosize::integer, with::bytes=?)
- bytes->padLeading(tosize::integer, with::string=?)
  If the byte stream is smaller in size than the first parameter specifying the target number of bytes, it changes the byte stream by prepending a character to its beginning until it reaches the specified size. The character used for prepending defaults to a space, but can be set with an optional second parameter.
- bytes->padTrailing(tosize::integer, with::bytes=?)
- bytes->padTrailing(tosize::integer, with::string=?)
  If the byte stream is smaller in size than the first parameter specifying the target number of bytes, it changes the byte stream by appending a character to its end until it reaches the specified size. The character used for appending defaults to a space, but can be set with an optional second parameter.
- bytes->replace(find::bytes, replace::bytes)
  Replaces all instances of a value within a byte stream with a new value. Requires two parameters: the first parameter is the value to find, and the second parameter is the value with which to replace the first parameter.
- bytes->remove()
- bytes->remove(position::integer, num::integer)
  Removes bytes from a byte stream. When called without a parameter, it removes all bytes, setting the object to an empty bytes object. In its second form, it requires an offset into the byte stream and the number of bytes to remove starting from there.
- bytes->removeLeading(find::bytes)
  Removes all occurrences of the specified sequence from the beginning of the byte stream. Requires one parameter specifying the data to be removed.
- bytes->removeTrailing(find::bytes)
  Removes all occurrences of the specified sequence from the end of the byte stream. Requires one parameter specifying the data to be removed.
- bytes->append(rhs::bytes)
- bytes->append(rhs::string)
  Appends the specified data to the end of the byte stream. Requires one parameter specifying the data to append.
- bytes->trim()
  Removes all whitespace ASCII characters from the beginning and the end of the byte stream.
- bytes->importString(s::string, enc::string=?)
  Imports a string parameter into the byte stream. A second parameter can specify the character encoding (e.g. “ISO-8859-1” or “UTF-8”) to use for the import.
- bytes->importBytes(b::bytes)
  Imports a bytes object parameter into the byte stream.
- bytes->import8bits(i::integer)
- bytes->import16bits(i::integer)
- bytes->import32bits(i::integer)
- bytes->import64bits(i::integer)
  Imports the first 1, 2, 4, or 8 bytes of an integer parameter.
- bytes->swapBytes()
  Swaps the position of every pair of bytes, e.g. a byte stream of 'father' becomes 'afhtre'.
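Since the manipulation methods change the object in place and return nothing, each call is written as its own statement and the result is inspected afterwards. The sketch below pads, strips, and trims some sample values; the variable names and data are illustrative, and the results in the comments follow from the method descriptions above rather than from captured output.

```lasso
local(id) = bytes('42')

// Pad on the left with zeros until the stream is six bytes long
#id->padLeading(6, '0')
#id->asString
// => 000042

// Remove the padding again; removeLeading takes the sequence as a bytes object
#id->removeLeading(bytes('0'))
#id->asString
// => 42

// Trim surrounding ASCII whitespace in place
local(raw) = bytes('  padded  ')
#raw->trim
#raw->asString
// => padded
#raw->size
// => 6
```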
Add a String to a Byte Stream
Use the bytes->append method. The following example adds the string 'I am' to the end of a byte stream:
local(obj) = bytes
#obj->append('I am')
Find and Replace Values in a Byte Stream
Use the bytes->replace method. The following example finds the string 'Blue' and replaces it with the string 'Green' within the byte stream:
local(colors) = bytes('Blue Red Yellow')
#colors->replace('Blue', 'Green')
Import a String Into a Byte Stream
Use the bytes->importString method. The following example imports a string using ISO-8859-1 encoding:
local(obj) = bytes('This is a string')
#obj->importString('This is another string', 'ISO-8859-1')