Get Even More Visitors To Your Blog, Upgrade To A Business Listing >>

Working With Number – Infinity Multiplication – Optimised The Code – Python

Tags: string digits

In the previous two chapters of infinity multiplication for Python, I had demonstrated the methods and the programming code for multiplying two String of digits together at the rate of one digit at a time. If you need a reference on the method of multiplication, please refer to chapter one and two of infinity multiplication with the name “Infinity Multiplication – Beyond integer” and “Infinity Multiplication – Decimal, Precise Float Calculation”.

The methods of the previous two chapters are error proof, due to we are only processing one digit at a time. We was also storing the answer for each cycle of multiplication for the digit in the second value string on a separate temporary string before we join the temporary answer string to the final answer string. Nevertheless the methods in the previous chapter could be inefficient for processing a large amount of data.

In this chapter, I will demonstrate what in my opinion consider the most optimised way for optimising the previous chapters’ methods for the production environment. Same as the previous chapters, the source code was tested against hundred of thousand plus test cases. With this article there are also the code for testing all the previous chapter source code and including this one. I also did a benchmark test for this production version of the source code of Infinity Multiplication versus the previous version of this program.

Optimised The Code

Before discussing in regard to optimising the code, I will discuss about the benchmark score results from the benchmark test for this version of infinity multiplication. When testing for how fast the program executes, I produced two strings of number, each string contained forty digits in length. The strings are then passed into the function to execute 100,000 times, and the time it took to execute was record as the benchmark score. This test ran twice and the average result between the two tests is uses as the final benchmark score.

This version of Infinity Multiplication for Python took 22.331 seconds to compute the results for 100,000 cases. For each case the same two strings of number were use, each string of number contained forty digits and did not contained the decimal. With the decimal the program took 22.541 seconds to compute the results. To compare this to the previous version, the previous version took 322.680 seconds to compute the results for 100,000 cases without the decimal. For each case, the same two strings that tested this version of infinity multiplication were use. With the decimal the previous version took 303.114 seconds to execute.

I know that the benchmark score can look funny from the single perspective of why when infiX dealing with the decimal, the program was computing faster. This have something to do with how python find() function operate. On one hand the infiX program will always inspect the string for a decimal doesn’t matter if the string contain a decimal or not. On the other hand the find() function will search through the input string for the decimal. When the find() function found the decimal, the find() function will exit and return the position for the decimal. When the string does not contain a decimal, the find() function actually have to search through the entire string to look for the decimal. When I was testing this program, it seem the find() function operate faster when the decimal was in the middle or at the beginning or at the end of the string. This type of behaviour was found with the Python string search builtin find() function. When working with other programming languages, the string search builtin function may operate differently.

How did I made the program run more than ten times faster? Instead of processing one digit at a time, I converted the program into processing five digits at a time. For each multiplication cycle, the previous version would require to convert a string type digit to an integer type twice, convert the result to a string format, stores the result, calculate the leftover and the remainder. The previous version of infinity multiplication also produces a temporary answer string to computes the result for the final answer string.

By computing five digits at a time, we would still use the same procedures for each multiplication cycle, however the procedures are now apply to five digits at a time for each multiplication cycle. Beside that I removed the temporary answer string and directly produce the final result onto the final answer string. By removing the temporary answer string, I would also saves memory usage and requires less memory to operate.

When choosing between the while and the for loop, I chose the for loop. The for loop took less than 100 milliseconds to execute ten millions empty cycles, on the other hand the while loop took a bit more than one second to execute the same ten millions empty cycles. For string search and replacement function, I chose builtin functions that doesn’t involve regex where I can. Regex library is indeed a very powerful library, however when it’s come to simple operation that only involve minor changes to the string, regex tense to operate a bit slower. This is due to the fact that regex have to evaluate the input pattern before initialising the method to operate on the string. To proof this theory, I actually tested how how fast the string function versus the re library function. For Python, the re.sub() function took 490 milliseconds to execute a trimming operation of leading zeroes from the left-side of the string one million times. The string was 12 digits long and there were four zeroes in the beginning of the string. The lstrip() function took 59 milliseconds to execute the same operation on the same string for one million cycle. Nevertheless in my opinion, for complex string operation tasks, regex is always the best way to go.

With the above in mind, I chose the lstrip() and rstrip() function for trimming the left and right of the string. To search for the decimal, I used Python’s string prototype find() function. To be able to get the digits from a location to another location within the strings, I used Python’s string “[]” bracket method.

The reason why I only calculate five digits at a time is due to the fact that the maximum value an 32 bits block integer can hold is 2,147,483,647. Which in term is 12 digits. For a multiplication equation, five digits multiply by another five digits have the potential to produce the maximum value that is ten digits. This is well below the maximum value that a 32 bits block integer can hold. If I was to compute six digits at a time I have the potential to produce a result with the maximum value of 9,999,980,00,001. This can overflow the memory of what a 32 bits block integer can hold and produce an incorrect result.

The formula for multiplying five digits at a time is similar to multiplying one digit at a time. Nevertheless writing the result to the output string can be a very complex task without a temporary answer string. Let’s first look at the example.

Example: 92345 67890 x 12545 89085
                                                         92345  67890
                                                       x 12545  89085
                                                        --------------
Multiply 67890 to 89085                                       (80650)  | leftover = 60479 and remainder = 0
  
Multiply 92345 to 89085                                 54325          | leftover = 82265
Add remainder and leftover                                 +0
                                                       +60479          
Value of this position                                 (14804)         | remainder = 1
Add remainder and leftover                           1
                                                +82265
Value of this position                          (82266)                | 0 leftover and 0 remainder

Current answer                                   82266  14804  80650                 
                                                          
Multiply 67890 to 12545                                 80050          | leftover = 8516
Add current position value in answer string            +14804               
Value of this position                                 (94854)         | remainder = 0

Multiply 92345 to 12545                          68025                 | leftover = 15584
Add current position value in answer string     +82266
Add remainder and leftover                       +8516 
                                                     0
Value of this position                          (58807)                | remainder = 1
Add remainder and leftover                    1
                                         +15584
Value of this position                   (15585)
                                         ----------------------------
Add the previous results together         15585  58807  94854  80650                                                              

With the example above in context, it’s can be define that we are multiplying all the digits in A string to B string in a block of five digits at a time. Each five block of digits in A string is multiply by the first block of five digits in B. Each time we multiplied a block of five digits from A to the first block of five digits in B, we would only store a five digits block at a time to the answer string from right to left. Any values that are above five digits in length would be kept as the leftover for the next equation. When keeping the digits for storing into the answer string, the first five most digits from right to left would be kept. The sixth digits and beyond would be kept for leftover.

When we are adding the leftover to the result for the equation, if there is a six digits value we would also keep the five first digits from the right and keep the left most digit as the remainder for our next equation.

After multiplying all the digits in A string to the first five block of digits in B string, we would then multiply all the digits in A string to the next block of five digits in B string. This time however, our starting position for storing the value into the answer string is move over by five digits, starting from the right. This starting position would be move over by another five when we are moving to the next block of five digits in B string, and so on.

When we are storing the result into the answer string, we would also have to add the result value to any value that the current position in the answer string hold. The same rule of remainder still apply to the equation.

With the above procedures in context, the first variable we would need is how many digit we are calculating at a time. This value can extend the ability of the program, if the bit block for storing digits increase we can change this value to increase the amount of digits we are handling per equation. If we have to work in a restricted environment and the integer bits block is smaller, we can also change this value to adapt the program.

The second variable we would need is an index count of how many times we had read from B string. We know that each time we read would translate to per five digits block. The amount of how many digits we read can vary at the very end of the string. Nevertheless this program was built to considers the variation in the final block of digits for both string. This index variable was name posr.

Beside the index for reading, we would also need a variable for calculating which position we are working on in our answer string. The third new variable is the digitneg variable. When we are using Python’s “[]” method, negative value translates to the position in the string counting from right to left. Since we are working from right to left of the strings, it is easier to use negative values. Therefore the amount of digits we want to use per calculation cycle is converted to negative value for easy reference to in our equation. This below is the variables we would need to declare outside of our calculation loops.

temp = 0
alen = len(a)
blen = len(b)
output = ""
posr = 0
posw = 0
digit = 5
digitneg = digit * -1
pad0 = ""

For every time we are reading a position in B string, we would have to read all the position in A string. Thus the outer loop will read the digits in B string and jump by five position every execution, and for every outer loop execution an inner loop will read all the position in A string and also jump by five position at a time.

To get five digits from B string, I used the “[]” bracket method. The syntax of this method are, string_variable_name[argument_1] or string_variable_name[argument_1:argument_2] or string_variable_name[argument_1:argument_2:argument_3]. For this method, argument one, two and three accepts integer value as input. Argument two and three are optional values. If only one argument was input without any “:”, Python will use the value as a position in the string and return the character at that position in the string. A positive value and zero, are the positions counting from left to right, with zero being the first position. A negative value for the first argument is the position in the string counting from right to left, with negative one being the first position. For python version three, if only argument one was the only input without the “:” and the position is out of range, Python will produce an out of range error.

If the second argument for the “[]” method is inputted, Python will treat both arguments as two positions. Python will grab all the characters between these two positions. Python will always grab the character in the position of the first argument. Python will not grab the character in the position of the second argument, only up to the character that is next to the second argument position. These two positions that defines by the first and the second argument are call position “from” and “to”. The “from” position is defines by the first argument and the “to” position is defines by the second argument.

As the time of this writing, Python counts the “from” and the “to” position from left to right. For example, if the “to” position is before the “from” position from left to right, Python would return an empty value. Another behaviour to the square bracket method is that, if the “to” position is larger than the string length or is after the last position on the string counting from the left, Python will return all the characters from the “from” position to the end of the string. If the “from” position is before the zero position of the string, Python will return all the characters from to beginning of the string to the position that is just before the “to” position. For example, if the string is 25 characters in length, inputting -28 as the first argument would make the “from” position being the position before the zero position in the string.

As for third argument of the “[]” bracket method, the argument take an integer value. When omitted, this argument default value is one. This value define what can be call as a step. A step can be describe as an incremental or subtractive value to the current position where the “[]” method is grabbing the current character to calculate the next position where the “[]” method should grab the next character within the available positions that were define by the first and second argument.

For example, a string of “1234567890” with the “[]” method input as [0:10:4] will produce “159” as the result. In the example, the first position the “[]” method should get a character from is position zero in the string, the last character if to be grab is the nine position in the string. Python will grab the zero position character and use the step value to calculate the next position to get the character from, which is position four. At position four Python will use the step again to calculate the next to get the character from which would be position eight. If the step is a negative value then position “from” can be larger than position “to” from left to right of the string. Python will get the first character from the largest position and use the step for all the next positions. Negative values as steps are extremely useful for reversing a string.

When using a “:” separator, only one position need to be input. The omitted position will be replace with the ending of the string. When using “::” option and argument three was input, only one position need to be input for the second and first argument. If the positions between the first and second argument are out of range, Python simply return an empty value.

To get five digits from B string, we can use the “[]” method. The first position we inputted into the “[]” method is the position from, which is the current position we are at minus the amount of digits we want to work with. The value for the to position would be the current position where we are at in B string. Notice that we do not minus one of our string length to get the current position, this is due to Python “[]” method only get the digit that is just before the to position. Since in Python string’s position start at zero, the length of the string is actually one value higher than the last position of the string from left to right.

We also do not want the position from to turn into a negative. If the from position turned into a negative it would be before the to position and produce an empty string. Thus if the position that we are in minus the digits we wanted to get is a negative value, we would turn the value to zero as being the first position of the string on the left. This is because the last block of digit on the left of the string will most likely to contain a vary amount of digits and not exactly the amount of digits we intended to work with. The method for grabbing a digit from A string is similar to the method of grabbing a digit from B string.

When we are executing the inner loop for calculation and when the answer string does contain an output value, we simply storing the answer into the output string bases on the procedures to this method. To get the leftover value, we use the “[]” method and get the character from the zero position to the amount of digit we wanted to work with in negative from the temp variable which hold the multiplication value for the current cycle. This value is then convert to an integer type with the int() function. If the temp variable does not contain more than the amount of digits we intended to work with, we have to substitute a string that contain a zero in lieu of a value for the int() function to work with. Otherwise the int() function will produce an error, the int() function does not convert empty string or value at the time of this writing.

When appending the result to the answer string, we want to always append the digits in the format of five digits. Therefore we would have to append to the result zeroes in the event where the block of digits is less than five digits. The reason why is when 1 00125 x 1 00001, The first five block of digits from A multiply by the first five block of digits in B would equal to 1 x 125 and the result would be 125. If we do not append zero to the beginning of the result to make up for the length, the next calculation will only add just a one to the beginning of this result in the answer string. Which in term is just 1125 instead of the correct answer of 1 00125.

To be able to always consistently produce a block of five digits for all the result, we adds four zeroes to the beginning of the result value. We then use the “[]” method to grab the amount of digits that we wanted to worked with which was five for this article from the right side of the string. For this procedure, we can input the digitneg value as the second argument for the “[]” method and omit the input for the second argument. After multiplying all the digits block in A string to the first block of digits in B string, we would append any leftover value to the leftmost of the answer string. The principle of a consistent amount of digits in a block of digits would also apply to this procedure.

This code block below demonstrate the mentioned procedures in Python.

for bi in range(blen,0,digitneg):
            z = int(b[(bi - digit if bi - digit > -1 else 0):bi])
            outlen = len(output)
            leftover = 0
            if ( outlen  -1 else 0):ai]) * z + leftover)
                    leftover = int(temp[0:digitneg] or "0")
                    output = (pad0 + temp)[digitneg:] + output
                if ( leftover > 0 ):
                    output = (pad0 + str(leftover))[digitneg:] + output

The previous procedures is for when the answer string does not yet contain a value. However it’s can get very complicated for when the answer string does contain a value. Each time we read a digit block in A string, we would have to calculate the position where we are working with in our answer string. The starting position for the answer string is move over from the right to the left for each digits block we read in B string. Beside that the position that we are working with in the answer string will also move over for each digits block we are currently reading from in A string.

To calculate this position, first we get the starting position by multiply how many times we had read from B string to the amount of digits we wanted to work with. Second we multiply the amount of times we had read from A string to the amount of digits we wanted to work with and add this value to the first value. In a short procedure we can describe it as how many times we read from B string plus how many times we read from A string and multiply the value to the amount of digits we intended to work with. Since we are using negative value for slicing the string, this value is in negative.

For this time of multiplying a digits block from A string to the digits block in B string, we have to add the result value to the value of the digits block of the current position we are working with in the answer string. This value is store in a temporary variable name tempadd. To get the digits in the answer string, we use the current position that we are working with in negative value plus the amount of digits we wanted to work with, which also in negative value as our from position for our “[]” method. The to position for the “[]” method would be from the current position we are working with in the answer string. In the event where the position we are working in currently beyond the length of the answer string, we would assign a string that only contain a zero digit for the int() function to work with.

We would also produce a remainder value from our addition operation if there is any. The procedure for producing a remainder is same as producing the leftover value for the multiplication procedure.

After producing the tempadd variable we would then store this variable into our answer string by getting all the digits from the beginning of the answer string to where we grabbed the first digit from the answer string. If this position is out of range because we are working on a part where the answer string does not yet have a value, the “[]” will return an empty string because the position is out of range. Thus we do not have to worry about the position have no value in this scenario. We then add our tempadd value to the right side of this string and then add the rest the digits from after the last digit we just grabbed from the answer string for the addition procedure to the end of the answer string.

After multiplying all the digits in A string to the set of digits in B string, we would add any remainder and leftover to the leftmost side of our answer string. The below code block demonstrate the above procedures into programming statements for Python.

else:
    remainder = 0
    tempadd = 0
    loopidx = 0
    for ai in range(alen,0,digitneg):
        posw = (posr + loopidx) * digitneg
        temp = str( int(a[(ai - digit if ai - digit > -1 else 0):ai]) * z + leftover + remainder)
        leftover = int(temp[0:digitneg] or "0")
        tempadd = str(int( temp[digitneg:]) + int(output[posw + digitneg:posw] or "0"))
        remainder = int(tempadd[0:digitneg] or "0")
        output = output[0:posw + digitneg] + (pad0 + tempadd[digitneg:])[digitneg:] + output[posw:]
        loopidx = loopidx + 1

        if ( remainder + leftover > 0 ):
            output = (pad0 + str(leftover + remainder))[digitneg:] + output

As for the testing to see if this function work. This code block below is what I used to test the previous version of InfiX function against the native multiplication function of the Python’s platform. For the previous version, since we are calculating one digits at a time, the correct result for calculating values that within the boundary of what an integer can hold will persist through into the longer string of digits. I also tested the library against Ruby’s native multiplication function. As it seem, Ruby’s native multiplication function can also calculate a very large amount of digits.

When testing for a decimal value, take consideration that the native calculate function will not be able to provide the actual precise decimal calculation. This tester will tell you, how many cases was not match and will also tell you the answer for the native function on the left and infiX answer on the right. If you use a calculator or manual calculation, you will be able to distinguish the pattern of result between infiX and the native multiplication function.

To use this tester, simply look for version two of infiX which is call “Decimal, Precise Float Calculation” and pair this tester code into the program. This code below is the tester cases. I preset the amount of run to 5000 cases, this can be increase. For Python, it is safe to set the amount of cases to run at less than or equal to 100,000 cases at a time. Otherwise it might take sometimes to get back the result. This code block below is the tester for the version two of infiX.

import random
um = 0
m = 0
for idx in range(0,5000):
    e = libZ()
    a1 = random.randint(0,99999)
    a2 = random.randint(0,99999)
    ac1 = random.randint(0,99999)
    ac2 = random.randint(0,1)

    b1 = random.randint(0,99999)
    b2 = random.randint(0,99999)
    bc1 = random.randint(0,99999)
    bc2 = random.randint(0,99999)
    padA1 = ""
    padB1 = ""
    for idn1 in range(0, random.randint(0, 0), 1):
        padA1 = "0" + padA1
    for idn1 in range(0, random.randint(0, 0), 1):
        padB1 = "0" + padB1

    ta = "-" + str(a1) + str(a2) + "." + padA1 + str(ac1) + str(ac2)
    tb = "+" + str(b1) + str(b2) + "." + padB1 + str(bc1) + str(bc2)

    test1 = float(ta) * float(tb)
    test2 = e.infiX(ta, tb )
    test1 = str(test1)

    print("Case #" + str(idx) + " : " + ta + " * " + tb )
    if (test1 != test2):
        um = um + 1
        print("Question: " + ta + " * " + tb )
        print("Native Multiplication: " + test1)
        print("InfiX: " + test2)
    if (test1 == test2):
        m = m + 1	
    print("")
print("Matched " + str(m) + " cases." )
print("Unmatched " + str(um) + " cases." )

For testing this version of infiX, I tested this version of infiX versus the previous version of infiX. Since the previous version already thoroughly tested, this one will produce the correct answer if the answer matches with the previous one. The previous version does take a long time to produce mass results. However as the prototype, I built the previous version as a demonstrator to the articles I wrote and also a tester to test the future release versions of the infiX program.

To use the tester below, simply have this version of infiX and the version two of infiX in the same script file with the tester. Rename one of the infiX function if necessary. This tester will produce five block of random numbers in front of the decimal for each string. The random blocks of numbers can be anywhere between one or five digits. The tester will then pad 0 to 15 zeroes behind the decimal and then three blocks of random numbers before padding another 0 to 15 zeroes and then the last two blocks of random digits. As recommended, this tester should not be set to execute above 50000 cases at a time. It can take a long time to produce the results since the previous version of InfiX does take a long time to produce an answer.

The below code block is the tester code for Python.

This is not the full article. The the full article with the complete source code can be found @ http://kevinhng86.iblog.website/2017/02/22/working-with-number-infinity-multiplication-optimised-the-code-python/
This article was written by Kevin Ng @ http://kevinhng86.iblog.website.



This post first appeared on Programming With Kevin, please read the originial post: here

Share the post

Working With Number – Infinity Multiplication – Optimised The Code – Python

×

Subscribe to Programming With Kevin

Get updates delivered right to your inbox!

Thank you for your subscription

×