1.3 import, loops revisited, and some syntactic sugar

To warm up a bit, let’s briefly revisit a few Python features that you are already familiar with but that have some forms or details you may not yet know, starting with the Python “import” command. We are also going to expand on or introduce a few Python constructs that can be used to simplify code logic and clarify complex application process flow.

It is highly recommended that you try out these examples yourself and experiment with them to get a better understanding.

1.3.1 import

The form of the “import” command that you definitely should already know is

import <module name>

e.g.,

import arcpy

What happens here is that the module (either a module from the standard library, a module that is part of another package you installed, or simply another .py file in your project directory) is loaded, unless it has already been loaded before, and the name of the module becomes part of the namespace of the script that contains the import command. As a result, you can now access all variables, functions, or classes defined in the imported module, by writing

<module name>.<variable or function name>

e.g.,

arcpy.Describe()

You can also use the import command like this instead:

import arcpy as ap

This form introduces an alias for the module name, typically to save some typing when the module name is rather long. Instead of writing

arcpy.Describe()

you would now use the alias ap to reference arcpy and write

ap.Describe()

in your code.

Another way of using “import” is to directly add content of a module (again either variables, functions, or classes) to the namespace of the importing Python script. This is done by using the form "from … import …" as in the following example:

from arcpy import Describe, Point

...

Describe()

The difference is that you can now use the imported names directly in your code without having to use the module name (or an alias) as a prefix, as in the call of Describe() at the end of the example code. However, be aware that if you are importing multiple modules, this can easily lead to name conflicts if, for instance, two modules contain functions with the same name. It can also make your code a little more difficult to read since

  arcpy.Describe(...)

helps you or another programmer recognize that you’re using something defined in arcpy and not in another library or the main code of your script.

You can also use

from arcpy import *

to import all variable, function and class names from a module into the namespace of your script if you don’t want to list all those you actually need. However, this can increase the likelihood of a name conflict.
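The risk of a name conflict can be illustrated with a small sketch. Since arcpy may not be installed everywhere, the sketch uses the standard library modules math and cmath as stand-ins (an assumption made purely for illustration); both define a function called sqrt.

```python
import math
import cmath

# Qualified access: the prefix makes the origin of each function explicit.
real_root = math.sqrt(16)          # real-valued square root
complex_root = cmath.sqrt(-16)     # complex-valued square root

# Unqualified access: the second import silently replaces the first name.
from math import sqrt
from cmath import sqrt

shadowed = sqrt(16)                # now resolved by cmath.sqrt, not math.sqrt
print(type(real_root).__name__, type(shadowed).__name__)
```

The last call returns a complex number even though the reader may have expected the math version, which is the kind of surprise that module prefixes help to avoid.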

Lastly, a script can import itself, which makes its contents available under the script's own module name in addition to __main__. This is especially useful in multiprocessing contexts where function references need to be fully qualified and picklable. By self-importing, the script ensures that the multiprocessing subprocesses reference functions via the module namespace (myscript.function_name) rather than __main__.function_name, which can lead to issues on Windows or when using the multiprocessing 'spawn' start method. For a script that is assumed to be saved as myscript.py, this looks as follows:

import myscript
    
myscript.worker(...)

1.3.2 loops, continue, and break

Next, let’s quickly revisit loops in Python. There are two kinds of loops in Python, the for-loop and the while-loop. You should know that the for-loop is typically used when the goal is to go through a given set or list of items, or to do something a certain number of times. In the first case, the for-loop typically looks like this

for item in list: 
    # do something with item 

while in the second case, the for-loop is often used together with the range(…), len(...), or enumerate(...) functions to determine how many times the loop body should be executed:

for i in range(50):  
    # do something 50 times
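As a small sketch of how range(…), len(…), and enumerate(…) relate to each other, the following two for-loops build the same list of labels. The layer names are made up for illustration.

```python
layers = ['roads', 'rivers', 'parcels']  # hypothetical layer names

# Index-based loop: range(len(...)) produces the indices 0, 1, 2
labels_by_index = []
for i in range(len(layers)):
    labels_by_index.append(f'{i}: {layers[i]}')

# enumerate(...) yields (index, item) pairs, so no manual indexing is needed
labels_by_enumerate = []
for i, name in enumerate(layers):
    labels_by_enumerate.append(f'{i}: {name}')

print(labels_by_enumerate)
```

Both variants of the for-loop run exactly once per item in the list.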

In contrast, the while-loop has a condition that is checked before each iteration and if the condition becomes False, the loop is terminated and the code execution continues to the next line after the loop body. With this knowledge, it should be pretty clear what the following code example does:

import random 

r = random.randrange(100) # produce random number between 0 and 99 
attempt_count = 1 

while r != 11: 
    attempt_count += 1 
    r = random.randrange(100) 
print(f'This took {attempt_count} attempts')

There are two additional commands, break and continue, that can be used in combination with either a for-loop or a while-loop. The break command will immediately terminate the execution of the current loop and continue with the next line of code outside of the loop. If the loop is part of a nested loop, only the inner (nested) loop will be terminated and the outer loop will progress to its next iteration. This means we can rewrite the program from above using a for-loop rather than a while-loop like this (note that, unlike the while-loop, this version gives up after at most 1000 attempts):

import random 

attempt_count = 0 

for i in range(1000):  
    r = random.randrange(100) 
    attempt_count += 1 
    
    if r == 11: 
        break  # terminate loop and continue after it 

print(f'This took {attempt_count} attempts')

When the random number produced in the loop body is 11, the condition of the if-statement will evaluate to True. The break command will be executed and the program execution immediately leaves the loop and continues with the print statement after it.
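The behavior of break inside nested loops, as described above, can be checked with a short sketch. The loop bounds are arbitrary values chosen for illustration.

```python
pairs = []

for row in range(3):
    for col in range(5):
        if col == 2:
            break  # leaves only the inner loop; the outer loop moves on to the next row
        pairs.append((row, col))

print(pairs)
```

For every value of row, only the columns 0 and 1 are collected, which shows that the outer loop keeps running after each break.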

If you have experience with programming languages other than Python, you may know that some languages have a "do-while" loop construct where the condition is only tested after each time the loop body has been executed, so that the loop body is always executed at least once. Since we first need to create a random number before the condition can be tested, this example would actually be a little bit shorter and clearer using a do-while loop. Python does not have a built-in do-while loop, but it can be simulated using a combination of while and break:

import random

attempt_count = 0  

while True: 
    r = random.randrange(100) 
    attempt_count += 1 

    if r == 11: 
        break 

print(f'This took {attempt_count} attempts') 

A while loop with the condition True will in principle run forever. However, since we have the if-statement with the break, the execution will be terminated as soon as the random number generator rolls an 11.

When a continue command is encountered within the body of a loop, the current execution of the loop body is also immediately stopped. In contrast to the break command, the execution then continues with the next iteration of the loop body. Of course, the next iteration is only started if the while condition is still True in the case of a while-loop. In the case of a for-loop, it will continue if there are still remaining items in the iterable that we are looping through. To demonstrate this, the following code goes through a list of numbers and prints only those numbers that are divisible by 3 (without remainder).

l = [3, 7, 99, 54, 3, 11, 123, 444] 

for n in l: 
    if n % 3 != 0:   # test whether n is not divisible by 3 without remainder 
        continue 

    print(n)

This code uses the built-in modulo operator % in the if-statement to get the remainder of the division of n by 3. If this remainder is not 0, the continue command is executed and the next item in the list is tested. If the condition is False (meaning the number is divisible by 3 without a remainder), the execution continues as normal after the if-statement and prints the number.

As you saw in these examples, there are often multiple ways in which the loop constructs for and while and the control commands break, continue, and if-else can be combined to achieve the same result. While break and continue can be useful commands, they can also make code more difficult to read and understand. Therefore, they should only be used sparingly and when their usage leads to a simpler and more comprehensible code structure than a combination of for/while and if-else would.
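As a sketch of this point, the divisible-by-3 example can be written with or without continue. Collecting the numbers in lists instead of printing them is a choice made here only so that the two results can be compared.

```python
numbers = [3, 7, 99, 54, 3, 11, 123, 444]

# Variant 1: skip unwanted items with continue
with_continue = []
for n in numbers:
    if n % 3 != 0:
        continue
    with_continue.append(n)

# Variant 2: invert the condition and drop the continue
without_continue = []
for n in numbers:
    if n % 3 == 0:
        without_continue.append(n)

print(with_continue == without_continue)
```

Which variant reads better is a judgment call; for a loop body this short, the second variant is arguably easier to follow.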

1.3.3 Expressions, Decision Structures, and the ternary operator

Expressions

You are already familiar with Python binary operators that can be used to define arbitrarily complex expressions. For instance, you can use arithmetic expressions that evaluate to a number, or boolean expressions that evaluate to either True or False. Here is an example of an arithmetic expression using the arithmetic operators - and *:

x = 25 - 2 * 3

Each binary operator takes two operand values of a particular type (all numbers in this example) and replaces them by a new value calculated from the operands. All Python operators are organized into different precedence classes, determining in which order the operators are applied when the expression is evaluated unless parentheses are used to explicitly change the order of evaluation. This operator precedence table shows the classes from lowest to highest precedence. The operator * for multiplication has a higher precedence than the operator - for subtraction, so the multiplication will be performed first and the result of the overall expression assigned to variable x is 19.

Here is an example of a boolean expression:

x = y > 12 and z == 3

The boolean expression on the right side of the assignment operator contains three binary operators: two comparison operators, > and ==, that take two numbers and return a boolean value, and the logical ‘and’ operator that takes two boolean values and returns a new boolean (True only if both input values are True, False otherwise). The precedence of ‘and’ is lower than that of the two comparison operators, so the ‘and’ will be evaluated last. So if y has the value 6 and z the value 3, the value assigned to variable x by this expression will be False because the comparison on the left side of the ‘and’ evaluates to False. 
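The effect of operator precedence and parentheses can be verified with a few lines. The values for y and z are the ones used in the text; the variable names grouped and result are introduced here for illustration.

```python
# Multiplication binds more strongly than subtraction
x = 25 - 2 * 3            # evaluated as 25 - (2 * 3)
grouped = (25 - 2) * 3    # parentheses force the subtraction first

# Comparisons bind more strongly than 'and'
y, z = 6, 3
result = y > 12 and z == 3    # evaluated as (y > 12) and (z == 3)

print(x, grouped, result)  # 19 69 False
```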

if/else & ternary operator  

In addition to all these binary operators, Python has a ternary operator, that is, an operator that takes three operands as input. This operator has the format

 x if c else y

x, y, and c here are the three operands, while if and else are the keywords making up the operator and demarcating the operands. While x and y can be values or expressions of arbitrary type, the condition c should be a boolean value or expression (any other value is interpreted according to Python's truthiness rules). The operator looks at the condition c: if c is True, it evaluates to x, else it evaluates to y. So for example, in the following line of code

 p = 1 if x > 12 else 0

variable p will be assigned the value 1 if x is larger than 12, else p will be assigned the value 0. Obviously what the ternary if-else operator does is very similar to what we can do with an if or if-else statement. For instance, we could have written the previous code as 

p = 0
if x > 12:
    p = 1

The “x if c else y” operator is an example of a language construct that does not add anything fundamentally new to the language, but enables writing things more compactly or more elegantly. That’s why such constructs are often called syntactic sugar. The nice thing about “x if c else y” is that, in contrast to the if-else statement, it is an operator that evaluates to a value and, hence, can be embedded directly within more complex expressions, as in the following example that uses the operator twice:

newValue = 25 + (10 if oldValue < 20 else 44) / 100 + (5 if useOffset else 0)

Using an if-else statement for this expression would have required at least five lines of code (which is ok!). The ternary construct works well if the result can be one of two values, but if you have more than two possibilities, you will need to use the if/elif/else structure or implement what is called an ‘object literal’ or ‘switch case’.
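To make the comparison concrete, here is a sketch that wraps the ternary expression and an equivalent if-else version in two functions. The function and variable names are hypothetical and only serve the comparison.

```python
def new_value_ternary(old_value, use_offset):
    # Compact version using the ternary operator twice
    return 25 + (10 if old_value < 20 else 44) / 100 + (5 if use_offset else 0)


def new_value_if_else(old_value, use_offset):
    # Equivalent version using if-else statements
    if old_value < 20:
        fraction = 10
    else:
        fraction = 44

    if use_offset:
        offset = 5
    else:
        offset = 0

    return 25 + fraction / 100 + offset


print(new_value_ternary(15, True) == new_value_if_else(15, True))
```

Both functions perform the same arithmetic in the same order, so they return identical results for any combination of inputs.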

Match

This section covers some more advanced constructs and is provided as additional information that may be useful as we get more familiar and comfortable with what we can do with Python. Other programming languages include a switch/case construct that executes code or assigns values based on a condition. Python introduced a comparable construct, ‘match’, in version 3.10, but something similar can also be done with a dictionary and the built-in dict.get() method. This construct replaces multiple elifs in the if/elif/else structure and provides an explicit means of setting values.

For example, what if we wanted to set a variable that could have three or more possible values? The long way would be to create an if/elif/else structure like so:

p = 0

for x in [1, 13, 12, 6]:
    if x == 1:
        p = 'One'
    elif x == 13:
        p = 'Two'
    elif x == 12:
        p = 'Three'

    # 6 matches none of the branches, so p keeps its previous value
    print(p)
Output
One
Two
Three
Three

The elifs can get long depending on the number of possibilities and can become difficult to read or keep track of the conditionals. Using match, you can control the flow of the program by explicitly setting cases and the desired code that should be executed if that case matches the condition.

An example is provided below:

command = 'Hello, Geog 489!'

match command:
    case 'Hello, Geog 489!':
        print('Hello to you too!')
    case 'Goodbye, World!':
        print('See you later')
    case other:
        print('No match found')
Output
Hello to you too!

‘Hello, Geog 489!’ is a string assigned to the variable command. The interpreter compares the value of command against the cases in order. When the value matches one of the cases, the code within that case block is executed and the remaining cases are skipped. In the example, the first case equals command, so Hello to you too! is printed. The final case, a bare name such as other, matches any value and therefore acts as the default. Applied to the previous example:

for x in [1, 13, 12, 6]:
    match x:
        case 1:
            p = 'One'
        case 13:
            p = 'Two'
        case 12:
            p = 'Three'
        case other:
            p = 'No match found'

    print(p)
Output
One
Two
Three
No match found

A variation of the Match construct can be created with a dictionary. With the dict.get(…) dictionary lookup method, you can also include a default value if one of the values does not match any of the keys in a much more concise way:

possible_values_dict = {1: 'One', 13: 'Two', 12: 'Three'}
for x in [1, 13, 12, 6]:
    print(possible_values_dict.get(x, 'No match found'))
Output
One
Two
Three
No match found 

In the example above, 1, 13, and 12 are keys in the dictionary and their values were returned for the print statement. Since 6 is not present in the dictionary, the result is the default value of ‘No match found’. Returning a default value is an advantage over the dict[key] retrieval method, which raises a KeyError exception for a missing key and either stops the script or requires additional code to handle the KeyError. The failing lookup is shown below.

possible_values_dict = {1: 'One', 13: 'Two', 12: 'Three'}
for x in [1, 13, 12, 6]:
    print(possible_values_dict[x])
Output
One
Two
Three
Traceback (most recent call last):
  File "C:\...\CourseCode.py", line 20, in <module>
    print(possible_values_dict[x])
    ~~~~~~~~~~~~~~~~~~~~^^^
KeyError: 6
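For comparison, this is roughly the additional code that the dict[key] form needs in order to get the same behavior as dict.get(…) with a default value. It is a minimal sketch; the list results is introduced here only to collect the values.

```python
possible_values_dict = {1: 'One', 13: 'Two', 12: 'Three'}

results = []
for x in [1, 13, 12, 6]:
    try:
        results.append(possible_values_dict[x])
    except KeyError:
        # reached for 6, which is not a key in the dictionary
        results.append('No match found')

print(results)
```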

Dictionaries are a very powerful data structure in Python and can even be used to execute functions as values using the .get(…) construct above. For example, let’s say we have different tasks that we want to run depending on a string value. This construct will look like the code below:

# get_daily_tasks, get_monthly_tasks, and get_all_tasks are assumed to be defined elsewhere
task = 'monthly'
getTask = {'daily': lambda: get_daily_tasks(),
           'monthly': lambda: get_monthly_tasks(),
           'taskSet': lambda: get_all_tasks()}

getTask.get(task)()

The .get() method will return the lambda for the matching key passed in. The empty () after the .get(task) then executes the function that was returned in the .get(task) call. .get() takes a second, default parameter that is returned if there is no key match.  You can set the second parameter to be a function, or a value.

getTask.get(task, get_all_tasks)()

If the first parameter (key) is not found, .get() returns the function given as the default parameter, which is then executed. Be careful to make the default callable as well: if .get() returns a plain value (or None, when no default is given), the trailing () will raise a TypeError.
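The following self-contained sketch puts the pieces together. The three task functions are hypothetical placeholders that simply return lists of strings, and they are stored in the dictionary directly rather than being wrapped in lambdas, which works the same way when no arguments need to be passed.

```python
def get_daily_tasks():
    return ['check sensors']      # placeholder content


def get_monthly_tasks():
    return ['archive logs']       # placeholder content


def get_all_tasks():
    return get_daily_tasks() + get_monthly_tasks()


# Map each task name to the function object (no parentheses, so nothing runs yet)
get_task = {
    'daily': get_daily_tasks,
    'monthly': get_monthly_tasks,
    'taskSet': get_all_tasks,
}

monthly = get_task.get('monthly', get_all_tasks)()   # key found
fallback = get_task.get('weekly', get_all_tasks)()   # key missing, default is called

print(monthly, fallback)
```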

1.3.4 String concatenation vs. format

In GEOG 485, we used the + operator for string concatenation to produce strings from multiple components to then print them out or use them in some other way, as in the following two examples:

print('The feature class contains ' + str(n) + ' point features.')

queryString = '"' + fieldName + '" = ' + "'" + countryName + "'"

There are two alternatives to this approach that can be used to improve code readability. The first is to use the string method format(…). When this method is invoked for a particular string, the string content is interpreted as a template in which parts surrounded by curly brackets {…} should be replaced by the variables given as parameters to the method. Here is how the two examples from above would look in this approach:

print('The feature class contains {0} point features.'.format(n))

queryString = '"{0}" = \'{1}\''.format(fieldName, countryName)

In both examples, we have a string literal '….' and then directly call the format(…) method for this string literal to give us a new string in which the occurrences of {…} have been replaced. In the simple form {i} used here, each occurrence of this pattern will be replaced by the i-th parameter given to format(…). In the second example, {0} will be replaced by the value of variable fieldName and {1} will be replaced by variable countryName. Please note that the second example will also use \' to produce the single quotes so that the entire template could be written as a single string. The numbers within the curly brackets can also be omitted if the parameters should be inserted into the string in the order in which they appear.

The second option is to use an f-string, that is, a string literal with the prefix f. This works much like a shortcut for the .format(...) method: the variables (or expressions) are written directly within the {...} in the string. This approach is typically faster and makes the string read more naturally. Here is how the two examples from above would look in this approach:

print(f'The feature class contains {n} point features.')

queryString = f'"{fieldName}" = \'{countryName}\''

In both examples, we have a string literal '….' and the f prefix tells Python to evaluate the expressions within {…} and insert their values, giving us a new string in which the occurrences of {…} have been replaced.
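As a quick check that the three approaches are interchangeable, the sketch below builds the same query string with concatenation, format(…) (here with the numbers in the curly brackets omitted), and an f-string. The field name and country name are made-up sample values.

```python
field_name = 'NAME'        # hypothetical field name
country_name = 'Chile'     # hypothetical attribute value

concatenated = '"' + field_name + '" = ' + "'" + country_name + "'"
with_format = '"{}" = \'{}\''.format(field_name, country_name)
with_f_string = f'"{field_name}" = \'{country_name}\''

print(with_f_string)  # "NAME" = 'Chile'
```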

The main advantages of using the f prefix or .format() are that the string can be a bit easier to produce and read, in particular in the second example containing SQL. We also don’t have to explicitly convert non-string values, such as numbers, to strings with str(…). In addition, both allow us to include information about how the values of the variables should be formatted. By using {i:n}, we say that the value of the i-th variable should be padded to n characters if it is shorter than that. For strings, this will by default be done by adding spaces after the actual string content, while for numbers, spaces will be added before the actual string representation of the number. In addition, for numbers, we can also specify the number d of decimal digits that should be displayed by using the pattern {i:n.df}. The following example shows how this can be used to produce some well-formatted list output:

items = [('Maple trees', 45.232), ('Pine trees', 30.213), ('Oak trees', 24.331)]

for i in items:
    print(f'{i[0]:20} {i[1]:3.2f}%')

Output:

Maple trees          45.23%
Pine trees           30.21%
Oak trees            24.33%

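The pattern {i[0]:20} is used here to pad the names of the tree species with spaces to a width of 20 characters. The pattern {i[1]:3.2f} displays the percentage numbers with two digits after the decimal point; the 3 is the minimum total width of the formatted number (including the decimal point and the decimals), so it has no visible effect here because each value already takes up five characters. The numbers still line up in a column because all names are padded to the same width and all values have the same number of digits.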
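The role of the width and the number of decimals can be seen more clearly with a single value. This is a small sketch; the sample value and variable names are chosen for illustration only.

```python
value = 3.14159

narrow = f'{value:3.2f}'    # width 3 is smaller than the result, so no padding is added
wide = f'{value:8.2f}'      # numbers are right-aligned within 8 characters
padded_name = f'{"Oak":8}|' # strings are left-aligned within 8 characters

print(repr(narrow), repr(wide), repr(padded_name))
```

Choosing a width that is at least as large as the longest expected number keeps columns aligned even when the values have different numbers of digits.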

The format method can do a few more things, but we are not going to go into further details here. Check out this page about formatted output if you would like to learn more about formatting.