Friday, September 17, 2010

Hexadecimal string to Decimal integer conversion

Question:
Given a string that contains a hexadecimal number, write a function htoi() that returns the equivalent integer.


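// Here is one function
// Headers and constants the function relies on (assumed; not shown in the original listing)
#include <ctype.h>   // toupper()
#include <string.h>  // strlen()

#define ASCII_ZERO  '0'
#define ASCII_A     'A'

int getValue(char const* hexString, int const radix)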
{
    int         loopIndex;      // Loop iterator
    int         signFlag = 1;   // Keeps track of sign
    signed int  retValue = 0;   // Final value
    int         currentDigit;   // Current digit under process
    int         len = strlen(hexString);    // Length of string

    // Check for sign
    if(hexString[0] == '-')
    {
        // Advance the pointer past the sign
        hexString++;
        // Remove '-' from count
        len--;
        // Sign is -ve
        signFlag = -1;
    }

    // Iterate from left to right
    // Accumulate the weight of each digit upon occurrence of next digit
    for(loopIndex = 0; loopIndex < len; loopIndex++)
    {
        currentDigit = toupper((unsigned char)hexString[loopIndex]) - ASCII_ZERO;

        // Is the digit between A through F?
        if(currentDigit > 9)
        {
            // Adjust numerical value so that 'A' maps to 10
            // ('A' - '0' is 17 in ASCII, so subtract 7)
            currentDigit -= (ASCII_A - '9' - 1);
        }

        // Based on Horner's principle
        retValue = retValue * radix + currentDigit;
    }

    // Flip the sign, if required
    retValue *= signFlag;

    // This is the final value
    return retValue;
}
Examples:


getValue("1011", 2);    // Binary string to integer
getValue("12345", 10);  // Decimal string to integer
getValue("ABCDE", 16);  // Hexadecimal string to integer
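
As a cross-check for results like the ones above, the C standard library already provides strtol(), which accepts a radix from 2 to 36. The sketch below wraps it in a function named htoi() to match the question; the wrapper name and its long return type are assumptions made for illustration, not part of the function above.

```c
#include <stdlib.h>  /* strtol */

/* Hypothetical wrapper named after the question's htoi().
   strtol() skips leading whitespace, accepts an optional sign and,
   for base 16, an optional "0x" prefix. It stops at the first
   character that is not a valid digit in the given base. */
long htoi(char const *hexString)
{
    return strtol(hexString, NULL, 16);
}
```

Unlike getValue(), strtol() reports overflow by returning LONG_MAX or LONG_MIN and setting errno to ERANGE, so it can serve as a reference when testing a hand-written version.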


Error handling is omitted. Points to consider:


1. Overflow and underflow. On a machine with a 32-bit int, a decimal string with more than 10 significant digits cannot fit, since floor(32 * log10(2)) + 1 = 10 (the characteristic of the logarithm plus one); a hexadecimal string is limited to 8 digits. Shorter strings can still overflow a signed int (for example, "FFFFFFFF").
2. Invalid string (containing unexpected characters).
3. String not terminated with a NUL character.
4. String is NULL.
5. What if a kernel-space buffer is passed as the hex string?
6. Is it reentrant?
7. Who will be the consumers of the function, and what are the possible ways of misusing it?
8. What changes are required if a wide-character (Unicode) buffer is passed (a portability issue)?
9. Are there any optimizations that do not compromise functionality?
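
A few of the points above (NULL input, invalid characters, overflow) can be handled with a small amount of extra code. The following is only a sketch under stated assumptions: the names getValueChecked() and digitValue() are hypothetical, the status-code return convention is one choice among several, and the character set is assumed to be ASCII-like, with contiguous letters.

```c
#include <limits.h>  /* INT_MAX, INT_MIN */
#include <stddef.h>  /* NULL */

/* Hypothetical helper: digit value of one character, or -1 if it is
   not a digit or letter. Assumes contiguous letters, as in ASCII. */
static int digitValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return -1;
}

/* Hypothetical checked variant of getValue().
   Returns 0 and stores the result in *out on success.
   Returns -1 on NULL arguments, an unsupported radix, an empty
   string, an invalid digit, or overflow. */
int getValueChecked(char const *str, int radix, int *out)
{
    int negative = 0;
    int value = 0;  /* accumulated as a non-positive number */

    if (str == NULL || out == NULL || radix < 2 || radix > 36)
        return -1;

    if (*str == '-') {
        negative = 1;
        str++;
    }
    if (*str == '\0')
        return -1;  /* no digits at all */

    for (; *str != '\0'; str++) {
        int digit = digitValue(*str);
        if (digit < 0 || digit >= radix)
            return -1;  /* unexpected character */

        /* Accumulate negatively so that INT_MIN is representable;
           reject the step if value * radix - digit would go below INT_MIN. */
        if (value < (INT_MIN + digit) / radix)
            return -1;
        value = value * radix - digit;
    }

    if (!negative) {
        if (value < -INT_MAX)
            return -1;  /* magnitude does not fit in a positive int */
        value = -value;
    }

    *out = value;
    return 0;
}
```

The sketch keeps no static state, so it is reentrant. It still cannot detect a string that lacks a terminating NUL; that would require passing the buffer length as an extra parameter.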
