FreshLib: Data
FreshLib directory: data
This directory contains data-handling routines and useful data structures. Currently it contains two abstract data structures: Arrays and Strings.
The current implementation is fully portable: it does not contain any OS-dependent code.
FreshLib Arrays
The FreshLib Arrays data structure handles dynamic arrays that contain elements of arbitrary size.
Structure
The following structure represents the header of the array. The actual array will have arbitrary size, depending on the element count and size.
  struct TArray
    .count     dd ?
    .capacity  dd ?
    .itemsize  dd ?
    .lparam    dd ?
    label .array dword
  ends
The first element of the array begins at offset TArray.array from the beginning of the memory block.
The field TArray.count contains the current element count of the array.
The field TArray.capacity contains the current capacity of the array. The capacity can exceed the element count because the library usually allocates more memory than the current elements need. This approach reduces the number of memory allocations and reallocations and thus speeds up inserting and deleting elements. How much memory is allocated depends on the user setting of the variable ResizeIt (defined in memory.asm). This variable contains a pointer to a procedure that simply increases the value of ECX to the next suitable value.
The field TArray.itemsize contains the size in bytes of one array element. Changing this value is not recommended.
The field TArray.lparam holds a user-defined parameter associated with the array.
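As a sketch of how the header fields relate to the element storage: the address of element i is the array pointer plus TArray.array plus i times TArray.itemsize. The register assignments below are illustrative assumptions, not a FreshLib convention.

```asm
; Assumed on entry: esi = pointer to the TArray block,
;                   ecx = element index, with ecx < [esi+TArray.count]
        mov     eax, [esi+TArray.itemsize]    ; size of one element in bytes
        imul    eax, ecx                      ; byte offset of element ecx
        lea     eax, [esi+eax+TArray.array]   ; eax = address of the element
```

The bounds check against TArray.count is left to the caller in this fragment.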
Procedures
CreateArray
proc CreateArray, .itemSize
This procedure creates a new array with item size [.itemSize].
The procedure returns CF=0 if the array is created successfully; a pointer to the array is placed in EAX.
If the memory cannot be allocated, the procedure returns CF=1.
The array must be freed after use. There is no dedicated procedure for freeing arrays; use the FreeMem procedure to free the array memory.
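A minimal usage sketch of the create/free cycle. It assumes the FreshLib macros (stdcall) are included and that FreeMem takes the block pointer as its single argument; ptrMyArray and the label .out_of_memory are hypothetical names.

```asm
; ptrMyArray is a hypothetical dword variable in the data section.
        stdcall CreateArray, 8           ; every item will be 8 bytes long
        jc      .out_of_memory           ; CF=1: the allocation failed
        mov     [ptrMyArray], eax        ; CF=0: eax points to the new TArray

        ; ... work with the array ...

        stdcall FreeMem, [ptrMyArray]    ; arrays are released with FreeMem
```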
AddArrayItems
proc AddArrayItems, .ptrArray, .count
This procedure adds [.count] new items at the end of the array pointed to by [.ptrArray].
The procedure returns two values:
EAX contains a pointer to the first of the newly appended elements.
EDX contains a pointer to the array. While appending the new elements, the whole array may be moved to a new address in memory, so the programmer should store the value of EDX for future references to the array.
If the new memory cannot be allocated, the procedure returns CF=1 and EDX contains a valid pointer to the original array.
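The following fragment sketches the pattern of storing EDX after every append. It assumes an array of dword items; ptrMyArray and .error are hypothetical names.

```asm
; ptrMyArray is a hypothetical dword variable holding the array pointer.
        stdcall AddArrayItems, [ptrMyArray], 1
        jc      .error                   ; CF=1: nothing was added
        mov     [ptrMyArray], edx        ; the array may have moved in memory
        mov     dword [eax], 1234        ; eax -> the new element; fill it in
```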
InsertArrayItems
proc InsertArrayItems, .ptrArray, .iElement, .count
This procedure inserts [.count] new elements at position [.iElement] of the array pointed to by [.ptrArray].
If [.iElement] is greater than or equal to [TArray.count], the elements are appended at the end of the array (just like AddArrayItems). Otherwise, the elements from position [.iElement] onward are moved to make room for the new elements.
The procedure returns exactly the same results as AddArrayItems: EDX points to the array and EAX points to the first of the newly inserted elements.
CF is the error flag.
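A short sketch of inserting at the front of an array, assuming dword items; ptrMyArray and .error are hypothetical names.

```asm
; ptrMyArray is a hypothetical dword variable holding the array pointer.
        stdcall InsertArrayItems, [ptrMyArray], 0, 2   ; two new items at position 0
        jc      .error
        mov     [ptrMyArray], edx        ; always keep the returned pointer
        mov     dword [eax], 10          ; first inserted element
        mov     dword [eax+4], 20        ; second inserted element
```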
DeleteArrayItems
proc DeleteArrayItems, .ptrArray, .iElement, .count
This procedure deletes [.count] items, starting at index [.iElement], from the dynamic array [.ptrArray]. If the capacity of the array is larger than the recommended capacity for the new count, the array is resized. The recommended size is calculated using the ResizeIt procedure from the memory library.
Returns a pointer to the TArray in EDX. In most cases this pointer will not change, but this depends on the memory allocation API of the current OS, so it is safer to store the returned pointer for future use instead of the one passed to the procedure.
This procedure cannot fail, because deleting elements is always possible. In some rare cases it can fail to reallocate a smaller memory block, but this does not affect the consistency of the array.
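As a sketch, deleting follows the same "store EDX" pattern as the other array procedures; ptrMyArray is a hypothetical variable.

```asm
; ptrMyArray is a hypothetical dword variable holding the array pointer.
        stdcall DeleteArrayItems, [ptrMyArray], 2, 3   ; delete 3 items starting at index 2
        mov     [ptrMyArray], edx                      ; the block may have been reallocated
```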
VacuumArray
proc VacuumArray, .ptrArray
This procedure releases the reserved memory of the array in order to make it as small as possible. It should be called only if there will be no more inserts into the array. The memory saved this way depends on the reallocation strategy and can range from 25% to 100% in some cases.
ListIndexOf
proc ListIndexOf, .ptrList, .Item
A list is a special case of an array whose item size is 4 bytes (dword). This procedure searches the list [.ptrList] for an item equal to [.Item] and returns the index of the element in EAX. On error, CF=1.
ListFree
proc ListFree, .ptrList, .FreeProc
Frees all elements of the list [.ptrList], calling [.FreeProc] for every element of the list.
The FreeProc callback procedure has one argument of type dword, which is the value of the current list element. The definition of the callback procedure is similar to the following:
  proc FreeMyListItem, .ptrItem
  begin
          ; do something with the item being freed
          return
  endp
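As an illustration, a list whose elements are pointers to blocks allocated through the memory library could be released as sketched below. FreeMyBlock and ptrMyList are hypothetical names, and FreeMem is assumed to take the block pointer as its only argument.

```asm
; Callback: each list element is assumed to be a pointer to an allocated block.
proc FreeMyBlock, .ptrItem
begin
        stdcall FreeMem, [.ptrItem]
        return
endp

; Somewhere in the program; ptrMyList is a hypothetical variable with the list pointer.
        stdcall ListFree, [ptrMyList], FreeMyBlock
```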
FreshLib Strings
Using strings in assembler has often been problematic for programmers: static strings cannot be resized, so programmers had to reserve as many bytes as they thought the user could need, and that could still be insufficient. StrLib was created to solve this: it is a library that operates on dynamic strings, which are automatically resized when needed. StrLib also contains many functions that perform string comparison, insertion of one string into another, and more. Most of these functions can operate on static strings too. StrLib uses the "memory.asm" library for memory allocations and does not contain any OS-dependent code.
FreshLib Strings uses FreshLib Arrays to manage the list of pointers to the allocated string memory.
StrLib string format
The strings used in StrLib are implemented using a specific structure that is compatible with, but not identical to, the AsciiZ format used by the Windows API. The string structure is defined in the following way:
  struc string {
    .capacity dd ?
    .len      dd ?
    label .data byte
  }

  virtual at -(sizeof.string)
    string string
    sizeof.string = $-string
  end virtual
The string data is placed at offset 0 from the string pointer. Its label is "string.data". The string data is always terminated by at least one zero byte, and the length of the memory buffer is always dword aligned, so it is safe to process the string data with dword instructions.
At offset [string-4] there is a dword field that contains the length of the string data, not including the terminating zero bytes.
At offset [string-8] there is a dword field that contains the capacity of the memory allocated for the string.
These fields are accessible by their symbolic names: string.len and string.capacity. They are for internal use only; it is not safe for the user to change their values.
Keeping the length of the string in a dedicated field makes some string operations extremely fast, because searching for the terminating zero is a very slow operation.
All procedures in StrLib maintain the proper value of the [.len] field and never search for the terminating zero, except for AsciiZ strings that are external to StrLib: those returned from OS API functions or defined as string constants in memory.
StrLib string handles
A string in StrLib is identified not by its pointer, but by a handle. While the pointer can change when the string memory is reallocated, the handle stays constant for the whole life cycle of the string.
The procedure StrPtr returns the current pointer of a string given its handle.
Because handle values never collide with memory addresses, almost all StrLib procedures can work with both handles and memory pointers.
For strings passed to the procedures by memory pointer, StrLib assumes they are static strings in memory or strings returned from the OS.
Because of this assumption, StrLib processes these strings safely but slowly: it scans the string to determine its length, assumes it is only byte aligned, and so on. In other words, it processes the string as a standard AsciiZ string.
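The difference between the two kinds of argument can be sketched with StrLen, which accepts both. The names hMyString and cHello are hypothetical.

```asm
; Data (hypothetical names):
;   hMyString dd ?              ; holds a handle returned by StrNew
;   cHello    db 'Hello', 0     ; static AsciiZ string

        stdcall StrLen, [hMyString]   ; handle: the length comes from string.len
        stdcall StrLen, cHello        ; pointer: the string is scanned up to the zero byte
```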
StrLib procedures
StrNew
proc StrNew
Creates a new string and returns its handle in EAX.
StrPtr
proc StrPtr, .hString
Returns the current pointer to the string with handle [.hString].
If [.hString] looks like a valid handle, but it is not found in the strings table, the procedure returns CF=1.
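A sketch of reading the internal fields through the pointer obtained from a handle. It assumes StrPtr returns the pointer in EAX (the text above does not name the register); hMyString and .bad_handle are hypothetical names. The fields are read only, since changing them is not safe.

```asm
; hMyString is a hypothetical dword variable holding a string handle.
        stdcall StrPtr, [hMyString]
        jc      .bad_handle                  ; CF=1: handle not found in the strings table
        mov     ecx, [eax+string.len]        ; length, stored at [eax-4]
        mov     edx, [eax+string.capacity]   ; capacity, stored at [eax-8]
        movzx   ebx, byte [eax]              ; first byte of the string data
```

The pointer is valid only until the next operation that may reallocate the string, so it should not be cached.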
StrDel
proc StrDel, .hString
Deletes the string with handle [.hString] and frees all allocated memory. If [.hString] is a pointer, the procedure searches the strings table for the corresponding handle and deletes that string.
If the string is not found, the procedure does not return an error.
StrDup
proc StrDup, .hSource
Creates a duplicate of the string [.hSource]. Returns a handle to the newly created string in EAX.
StrLen
proc StrLen, .hString
Returns the length of the string [.hString]. If the handle is valid, it returns the value of the field [string.len]. If [.hString] is a pointer, it computes the length by scanning the string up to the zero terminator.
StrFixLen
proc StrFixLen, .hString
This procedure scans a zero-terminated string for its length and "fixes" the [string.len] field. StrFixLen should be called when the content of the string has been written by an external call, for example a Win32 API function.
StrSetCapacity
proc StrSetCapacity, .hString, .capacity
If the capacity of [.hString] is larger than the requested capacity, the procedure does nothing.
If the capacity of [.hString] is smaller than the requested capacity, the procedure sets the capacity of the string to the requested value.
Returns in EAX a pointer to the string after the reallocation.
.hString is a handle to the string that has to be resized. Pointers are not accepted here.
.capacity contains the requested capacity for the string.
If the procedure returns CF=1, the reallocation failed. EAX still contains the pointer to the string, but the string was not resized.
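StrSetCapacity and StrFixLen are typically combined when an external routine writes directly into the string buffer. In the sketch below FillBuffer is a hypothetical external stdcall routine that writes a zero-terminated string of fewer than 260 bytes into the given buffer; hBuffer and .error are hypothetical names as well.

```asm
; hBuffer is a hypothetical dword variable.
        stdcall StrNew
        mov     [hBuffer], eax               ; keep the handle
        stdcall StrSetCapacity, eax, 260     ; eax = pointer to the string data
        jc      .error                       ; CF=1: the string was not resized
        stdcall FillBuffer, eax, 260         ; hypothetical: writes AsciiZ text into the buffer
        stdcall StrFixLen, [hBuffer]         ; recompute string.len from the zero terminator
```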
StrCopy
proc StrCopy, .dest, .source
Copies the content of the [.source] string to the [.dest] string.
StrCompCase
proc StrCompCase, .str1, .str2
Compares the content of the two strings [.str1] and [.str2] case-sensitively.
Returns:
CF=1 if the strings are EQUAL.
CF=0 if the strings are NOT equal.
StrCompNoCase
proc StrCompNoCase, .str1, .str2
Compares the content of the two strings [.str1] and [.str2] case-insensitively.
Returns:
CF=1 if the strings are EQUAL.
CF=0 if the strings are NOT equal.
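Note that both comparison procedures signal equality with CF=1, which is the opposite of the "CF=1 means error" convention used elsewhere. A sketch of the branch, with hypothetical names hName1, hName2 and .equal:

```asm
; hName1 and hName2 are hypothetical variables holding string handles.
        stdcall StrCompNoCase, [hName1], [hName2]
        jc      .equal            ; CF=1: the strings are equal
        ; execution continues here when CF=0: the strings differ
```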
SetString
proc SetString, .ptrHString, .ptrSource
Creates a string and assigns it to a variable. If the variable already contains a string handle, the old string is reused. Copies the content of [.ptrSource] to the string variable.
Arguments:
.ptrHString - pointer to the variable that contains the string handle.
.ptrSource - pointer to the source string.
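A usage sketch. The first argument is the address of the variable, not its value. The names hTitle and cTitle are hypothetical, and treating an initial value of 0 as "no string assigned yet" is an assumption.

```asm
; Data (hypothetical names):
;   hTitle dd 0                  ; string variable; 0 is assumed to mean "no string yet"
;   cTitle db 'Untitled', 0      ; static AsciiZ source

        stdcall SetString, hTitle, cTitle   ; pass the ADDRESS of hTitle, not [hTitle]
```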
StrCat
proc StrCat, .dest, .source
Concatenates two strings. The operation is: destination = destination + source.
The destination string [.dest] can only be a handle. [.source] can be a handle or a pointer.
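A sketch that builds a string from a static part and a dynamic part. The names hPath, hName and cDir are hypothetical.

```asm
; Data (hypothetical names):
;   hPath dd ?                   ; receives the handle of the result
;   hName dd ?                   ; holds a handle of an existing string
;   cDir  db 'data/', 0          ; static AsciiZ string

        stdcall StrNew
        mov     [hPath], eax
        stdcall StrCat, [hPath], cDir       ; source given as a pointer
        stdcall StrCat, [hPath], [hName]    ; source given as a handle
```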
StrCharPos
proc StrCharPos, .hString, .char
StrCharPos returns a pointer to the first occurrence of a given character in the specified string.
Arguments:
.hString - string to search
.char - character to look for
Returns:
A pointer to the character in the string, or NULL if the character does not occur in the given string.
StrPos
proc StrPos, .hString, .hPattern
StrPos returns a pointer to the first occurrence of the [.hPattern] string in [.hString].
Arguments:
.hString - string to search
.hPattern - pattern string to look for
Returns:
A pointer to the first occurrence of the pattern in the string, or NULL if the pattern does not occur in the string to search.
StrCopyPart
proc StrCopyPart, .dest, .source, .pos, .len
Copies part of the source string to the destination string.
Arguments:
.dest - handle to the destination string.
.source - handle or pointer to the source string.
.pos - start position of the part in the source string.
.len - length of the copied part.
Returns nothing.
StrExtract
proc StrExtract, .string, .pos, .len
The same as StrCopyPart, but creates and returns a new string containing the extracted part of the source [.string].
Returns the newly created string in EAX.
StrSplit
proc StrSplit, .hString, .pos
Splits the string into two strings at position [.pos].
Arguments:
.hString - handle of the string to be split.
.pos - position at which to split the string.
Returns:
EAX - handle to the newly created string containing the second part of the string. The original string is not reallocated; its capacity and pointer remain the same.
StrInsert
proc StrInsert, .dest, .source, .pos
Inserts the string [.source] into the string [.dest] at position [.pos]
StrLCase
proc StrLCase, .hString
Converts [.hString] to lower case.
StrUCase
proc StrUCase, .hString
Converts [.hString] to upper case.
NumToStr
proc NumToStr, .num, .flags
Converts the given number to its string representation.
[.flags] controls how the number is converted.
The procedure returns a handle to the string in EAX and direct pointer to the string in EAX.
[.flags] is a dword with the following format:
The least significant byte contains the number of digits the result must have, if the ntsFixedWidth flag is specified.
The second byte of [.flags] contains the radix to be used for the conversion.
The third and fourth bytes are reserved for bit flags.
The following constants are predefined in StrLib for composing the value of [.flags]:
Constant | Description | Value |
ntsSigned | converts the number in signed format | $00000 |
ntsUnsigned | converts the number in unsigned format | $10000 |
ntsFixedWidth | the number of digits is fixed | $20000 |
ntsBin | binary number (radix 2) | $0200 |
ntsQuad | quaternary number (radix 4) | $0400 |
ntsOct | octal number (radix 8) | $0800 |
ntsDec | decimal number (radix 10) | $0a00 |
ntsHex | hexadecimal number (radix 16) | $1000 |
Example:
stdcall NumToStr,EAX, ntsDec or ntsFixedWidth or 8
This example converts the number in EAX to a signed decimal string with exactly 8 digits (ntsSigned is zero, so signed conversion is the default). If EAX contains $00000080, the resulting string is '00000128'.
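The byte layout of [.flags] can be checked at assembly time. The fragment below is a self-contained sketch that repeats the constant values from the table (StrLib already defines these names) and uses the FASM assert directive to verify the composed values; flagsDec8 and flagsHex4 are hypothetical names.

```asm
; Values copied from the table of predefined constants.
ntsUnsigned   = $10000
ntsFixedWidth = $20000
ntsDec        = $0a00
ntsHex        = $1000

; Signed decimal, exactly 8 digits: radix 10 in the second byte, width 8 in the LSB.
flagsDec8 = ntsDec or ntsFixedWidth or 8
assert flagsDec8 = $00020a08

; Unsigned hexadecimal, exactly 4 digits.
flagsHex4 = ntsHex or ntsUnsigned or ntsFixedWidth or 4
assert flagsHex4 = $00031004
```

A value such as flagsHex4 would then be passed as the second argument, for example stdcall NumToStr, eax, flagsHex4.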
StrCharCat
proc StrCharCat, .hString, .char
This procedure appends up to 4 characters at the end of the string [.hString]. The characters are contained in the dword argument [.char]
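A sketch of appending a single character. The dword $00000021 carries one character, and the zero upper bytes are assumed not to be appended; hMyString is a hypothetical variable holding a string handle.

```asm
; hMyString is a hypothetical dword variable holding a string handle.
        stdcall StrCharCat, [hMyString], '!'   ; the argument is the dword $00000021
```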
StrCharInsert
proc StrCharInsert, .hString, .char, .pos
Inserts up to 4 characters [.char] at position [.pos] in the string [.hString]
StrHash
proc StrHash, .hString
Computes a 32-bit hash value from the string [.hString]. This procedure implements the FNV-1b hash algorithm. Returns the result in EAX.