The following introduction to Python serves as supplementary material to the lecture of the same name. Since it is a comprehensive, stand-alone introduction, it can also be used to learn the basics of Python on your own. The concept of this introduction is inspired by the JavaScript reference in the MDN Web Docs, which I really like because it provides good explanations along with code sandboxes that allow you to play directly with the code while learning. I would encourage you to do exactly that.
Please be aware that certain code examples have been adjusted due to limitations in Trinket.io. Some functionalities, such as f-strings, may not be available or fully supported. Additionally, there may be missing examples for certain standard library modules that cannot be utilized within the Trinket environment at this time.
The exercises and lecture slides are also available on GitHub.
What Is Python and Why You Should Learn It
Python is a high-level programming language created by Guido van Rossum. It was released in 1991. Although images of snakes are often associated with Python, the name is actually derived from Guido van Rossum's favorite TV show, "Monty Python's Flying Circus". Python is an object-oriented programming language which is available on a wide variety of systems and can be used for a lot of different applications. The current version is Python 3, which will be the focus of the following explanations. Python is designed to be a beginner-friendly yet powerful programming language. In recent years, Python has gained increasing popularity, especially due to well-known and powerful frameworks designed for scientific computing, artificial intelligence, and data science. Examples of such frameworks include NumPy, pandas, PyTorch, or TensorFlow. Python is also commonly used for automation, prototyping, and even web development.
Unlike many other programming languages, Python statements do not require termination with a special character. The Python interpreter identifies the end of a statement through the presence of a newline, generated by pressing the "Return" key on the keyboard. We will delve into multi-line statements in subsequent sections. Another crucial point to note is that Python relies on indentation instead of delimiters like curly braces. While the specific amount of indentation doesn't hold significance, it must remain consistent within a given block, such as the body of a loop or conditional statement. Additionally, statements that are not intended to be indented must start from the first column. Following the convention outlined in PEP 8, it is recommended to use 4 spaces per indentation level, a practice we'll explore further in upcoming discussions.
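As a minimal sketch of these two rules, the following snippet uses a made-up `temperature` value. It shows statements ending at the newline and a conditional whose body is marked only by indentation.

```python
# Each statement ends at the newline; no terminating character is needed.
temperature = 25  # hypothetical value, chosen only for illustration

# The indented lines form the bodies of the if and else branches
# (4 spaces per level, as recommended by PEP 8).
if temperature > 20:
    message = "warm"
else:
    message = "cold"

# Back at the first column: this statement is outside the conditional.
print(message)  # warm
```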
Now, you might be wondering, what are PEPs? PEP stands for Python Enhancement Proposal. These are documents that outline design decisions, standards, and guidelines for the Python programming language. Style-related PEPs are not enforced by the interpreter, so compliance with them is voluntary, but they are considered best practice in the Python community. For example, PEP 257 provides guidelines for docstring conventions, which are essential for documenting Python code effectively. The best-known example is PEP 8, the style guide for Python code. It provides recommendations on how to format your code for clarity and consistency, and many projects expect contributors to follow it.
How To Invoke Python Code
To get started with running Python code on your computer, you'll need to install a Python interpreter, which is available for Windows, Linux and macOS. Alternatively, you can also use Conda (or one of its variants like Miniconda or Mamba). We'll talk more about these options when we discuss virtual environments in a moment.
Once Python is installed on your system, there are two main ways to start using it. The first method is by opening your shell or command prompt and typing "python". This will launch the Python interpreter, and you'll see something like this:
Python 3.11.3 (main, Jun 5 2023, 09:32:32) [GCC 13.1.1 20230429] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
The three greater-than signs (>>>) form the prompt, which indicates that you can type your commands after it. When you press "Enter", Python will execute your command. If you type a command such as a call to Python's `print()` function, the result will show up on the screen directly below the command.
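A short example of such a command is sketched below. The variable name `greeting` is only illustrative; typing these two lines after the prompt prints the text on the following line.

```python
# Store a text in a variable and print it to the screen.
greeting = "Hello, World!"
print(greeting)  # Hello, World!
```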
For longer programs, you can use your preferred text editor or Integrated Development Environment (IDE) to write your Python code. The common file extension for Python files is ".py". To run your program, type "python" followed by the name or path of your file. Most modern IDEs also offer a "run" button that you can click.
One of the great advantages of Python is its easy-to-use libraries and external packages. Since many people use Python, there's a wide variety of pre-built packages available for many different tasks. The Python Package Index (PyPI) is a collection of software for Python that offers numerous packages. You can install Python packages from PyPI using a package management tool called "pip".
However, Python is not great at dependency management. This is why nearly every Python user recommends using virtual environments. Virtual environments are a Python tool for dependency management and project isolation. They provide a simple solution for a lot of potential problems by helping you to:
- Resolve dependency issues
- Create self-contained and reproducible projects
- Avoid system pollution
- Install packages without admin rights
There are various approaches to creating a virtual environment. One option is to use venv, a built-in tool in Python. External tools like virtualenv or Conda are also commonly used. Conda provides an alternative package and environment management approach. It not only enables easy creation of virtual environments with different Python versions but also introduces an additional feature set. It's worth noting that Conda is a separate project and is unrelated to pip. It uses an alternative package index maintained by the Anaconda project instead of PyPI. Conda packages can be installed using the command "conda install". As an alternative, Docker can also be used to create a containerized Python development environment. Docker allows you to package your Python application along with its dependencies and system configurations, ensuring consistency across different environments.
Alright, now that we've laid the groundwork with our introduction to Python, it's time to dive headfirst into the fundamental building blocks of this powerful programming language. Our next destination: the world of variables and the core concepts that form the foundation of Python's functionality.
Variables
In Python, variables are used for storing and managing data. Think of variables as containers that can hold values of various data types, such as numbers, strings, or complex objects. You can assign values to variables using the assignment operator =. In Python, there are specific rules and conventions for naming variables. Variable names can consist of letters, numbers, and underscores, but they must begin with a letter or an underscore. It's crucial to note that variable names are case-sensitive. Reserved keywords such as if, for or class cannot be used as variable names at all; attempting to do so raises a syntax error. Additionally, while Python doesn't enforce it, it's highly advisable to avoid using the names of built-in functions and types (such as list or print) as variable names, because doing so hides the built-in. To maintain consistency and readability in your code, it's recommended to adhere to the naming conventions outlined in PEP 8.
Python uses dynamic typing, which means that you can change the type of data stored in a variable during the program's execution. You can reassign a variable to hold a different data type without explicitly declaring its type. Dynamic typing provides flexibility but also requires careful handling of variable types to avoid unexpected behavior.
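A small sketch of dynamic typing: the same (arbitrarily named) variable first holds an integer and then a string, and `type()` reports the change.

```python
value = 42
first_type = type(value).__name__   # the variable currently holds an int

value = "forty-two"                 # reassigning with a different type is allowed
second_type = type(value).__name__  # now it holds a str

print(first_type, second_type)  # int str
```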
Variables have different scopes, which define where in the code they can be accessed. In Python, names are looked up in four levels of scope: local scope, enclosing scope, global scope and, finally, the built-in scope that contains names such as len. We will discuss the details when talking about functions.
Data Types
In Python, understanding data types is fundamental to writing effective code. Let's explore the most common data types and get familiar with them. They will be discussed in more detail in the following sections.
Numeric Data Types
Python 3 supports three built-in numeric types, namely integers, floating point numbers and complex numbers. Floating point numbers can be specified either by using a decimal point or using exponential notation, e.g. 1e-3 or 1E-3. Integers in Python 3 are what are sometimes called "arbitrary precision" integers, which means that they can have as many digits as memory allows. The separate long integer type of Python 2 and its "L" suffix no longer exist. Integer arithmetic is exact and does not suffer from the limited precision of floating point numbers. However, once values no longer fit into a machine word, Python has to perform the arithmetic in software, so calculations with (many) very large integers will slow your programs down. Complex numbers can be entered into Python using either the `complex()` function or by denoting the complex number as the real portion followed by a plus sign and the imaginary part with a trailing uppercase or lowercase "J". There must be no spaces between the imaginary part and the "J". Both components are stored as floating point numbers.
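The numeric types described above can be explored with a few literals. The variable names here are illustrative only.

```python
big = 2 ** 100      # integers grow as needed
small = 1e-3        # exponential notation for a float
z1 = complex(3, 4)  # built with the complex() function
z2 = 3+4j           # literal form with a trailing "j"

print(big)                        # 1267650600228229401496703205376
print(small == 0.001)             # True
print(z1 == z2, z1.real, z1.imag) # True 3.0 4.0
print(0.1 + 0.2 == 0.3)           # False: floats have limited precision
```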
Strings
Strings are sequences of characters that represent arbitrary text in Python code. They can be created by enclosing text in single quotes, double quotes, or triple quotes (which allow spanning multiple lines). Inside strings, special character sequences starting with a backslash are interpreted specially. A single backslash at the end of a line acts as a continuation character, which also lets a string literal span several source lines; unlike with triple quotes, no newline is inserted into the string. To include an actual backslash in a string, you can either use two backslashes or employ raw strings by prefixing the opening quotation mark with "r".
In Python 3, all strings are Unicode strings, so no special prefix is needed. The "u" prefix, which in Python 2 distinguished Unicode strings from 8-bit strings, is still accepted but has no effect. Arbitrary Unicode characters can be specified using "\u" followed by four hexadecimal digits inside a string. Text strings cannot be combined directly with byte strings (`bytes`); the bytes have to be decoded into a string first.
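The different ways of writing string literals can be compared side by side. The sample texts below are arbitrary.

```python
single = 'single quotes'
double = "double quotes"
multi = """first line
second line"""          # triple quotes keep the line break

tabbed = "a\tb"         # \t is interpreted as a tab character
escaped = "C:\\temp"    # two backslashes give one literal backslash
raw = r"C:\temp"        # raw string: backslashes are kept as typed
omega = "\u03a9"        # Unicode code point U+03A9 (Greek capital omega)

print(escaped == raw)   # True
print(len(tabbed))      # 3
```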
String Operations
The following sections describe some of the most important operations available for working with strings.
Concatenation
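Strings can be concatenated using the + operator, creating a new string that combines the originals. String literals can also be concatenated with variables that hold strings; values of other types, such as numbers, must first be converted with `str()`.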
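For instance, with some made-up values:

```python
first = "Monty"
second = "Python"
full = first + " " + second    # a new string; the originals are unchanged

age = 30                       # hypothetical number
label = "Age: " + str(age)     # numbers must be converted before concatenation

print(full)   # Monty Python
print(label)  # Age: 30
```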
String Formatting
In addition to creating and manipulating strings, it's essential to know how to format them effectively. String formatting allows you to combine variables, constants, and text in a way that makes your code more readable and user-friendly. There are various string formatting techniques, including concatenation, the "%" formatting method which is considered outdated and less versatile, and the modern approach using f-strings (for Python version 3.6 and above). You can also use the `str.format()` method, which provides another flexible way to format strings. In all these formatting methods, you can use format specifiers to control how individual values are presented. For example, you can format floating-point numbers to a certain number of decimal places.
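The three formatting techniques can be compared on the same (invented) values. Each variant uses a format specifier to round the number to two decimal places. Note that the f-string line requires Python 3.6 or newer.

```python
name = "Guido"
pi = 3.14159

old_style = "Hello, %s! Pi is about %.2f" % (name, pi)
with_format = "Hello, {}! Pi is about {:.2f}".format(name, pi)
f_string = f"Hello, {name}! Pi is about {pi:.2f}"

print(f_string)  # Hello, Guido! Pi is about 3.14
```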
Repetition
When using an asterisk (*) between a string and an integer, a new string is created which contains the old string repeated as many times as the integer specifies. The order of the operands does not matter.
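A quick illustration with arbitrary strings:

```python
line = "-" * 10  # string first, integer second
echo = 3 * "ab"  # integer first works just as well

print(line)  # ----------
print(echo)  # ababab
```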
Indexing and Slicing
Strings in Python support indexing and slicing. A single character can be extracted from a string by appending the index of the desired character surrounded by square brackets. Keep in mind that the index starts at zero in Python. If the value inside the brackets is less than zero, Python counts from the end of the string. The last character in a string can, for example, be accessed using a subscript of -1.
A contiguous part of a string (called a slice) can be extracted by using a subscript consisting of a starting index followed by a colon and an ending index. Notice that the stop position is exclusive, i.e. the slice ends one position before the second value. If a slice starts at the beginning of a string or continues until the end, the first or second index can be omitted, respectively. It is also possible to use variables instead of integer constants for indexing and slicing.
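The indexing and slicing rules can be checked on a sample word:

```python
word = "Python"

first = word[0]    # 'P'    (indices start at zero)
last = word[-1]    # 'n'    (negative indices count from the end)
middle = word[1:4] # 'yth'  (the stop index 4 is exclusive)
start = word[:2]   # 'Py'   (omitted start means "from the beginning")
rest = word[2:]    # 'thon' (omitted stop means "to the end")

i = 3
by_variable = word[i]  # 'h' (a variable works as an index, too)
```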
Functions and Methods
Python provides some useful functions and methods for working with strings. The `len()` function returns the number of characters which a string contains. Strings in Python are immutable objects, i.e. the value of a string can't be changed in place. String methods therefore never modify the original string; they return a new string, and the result needs to be assigned to a variable (possibly the same one) to keep it. Python offers a wide variety of string methods. The `split()` and `join()` methods are among the most useful. The `split()` method returns a list of substrings obtained by splitting the original string at the optionally specified separator. If no argument is specified, runs of one or more whitespace characters are used as the separator. The `join()` method performs the reverse operation. The following table shows an overview of some useful string methods:
Method | Description |
---|---|
str.upper() | Converts all characters in the string to uppercase. |
str.lower() | Converts all characters in the string to lowercase. |
str.strip() | Removes leading and trailing whitespace from the string. |
str.split() | Splits the string into a list of substrings based on a delimiter. |
str.join() | Joins a list of strings into a single string using the provided delimiter. |
str.replace() | Replaces occurrences of a substring with another substring. |
str.find() | Returns the index of the first occurrence of a substring (or -1 if not found). |
str.startswith() | Checks if the string starts with a specified substring. |
str.endswith() | Checks if the string ends with a specified substring. |
str.isalpha() | Checks if all characters in the string are alphabetic. |
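A few of the methods from the table are sketched below on arbitrary sample strings. Note that each method returns a new value and leaves the original string untouched.

```python
text = "  Hello, World  "

stripped = text.strip()                        # 'Hello, World'
shouted = stripped.upper()                     # 'HELLO, WORLD'
replaced = stripped.replace("World", "Python") # 'Hello, Python'
position = stripped.find("World")              # 7

words = "a b  c".split()         # ['a', 'b', 'c'] (split on whitespace)
parts = "2023-06-05".split("-")  # ['2023', '06', '05']
joined = "/".join(parts)         # '2023/06/05'

print(len(stripped))  # 12
print(text)           # the original still has its surrounding spaces
```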
Booleans
Booleans represent either `True` or `False` values and are often used in conditional statements and logic.
None
In Python, None is a special value used to indicate the absence of a value or to initialize variables when you don't have a specific value to assign. It's often employed as a placeholder in situations where a variable or result is expected but hasn't been determined yet.
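Booleans and `None` often appear together in simple checks. The names `is_ready`, `result` and `status` in this sketch are illustrative only.

```python
is_ready = 3 > 2  # a comparison produces a Boolean

result = None     # placeholder: nothing has been computed yet
if result is None:
    status = "not computed yet"
else:
    status = "done"

print(is_ready, status)  # True not computed yet
```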
Data Structures
Python provides various data structures to efficiently organize and manipulate data. These structures help manage collections of data items, making it easier to work with different types of information. A Python list is an ordered collection which allows you to store objects of different data types. A Python dictionary is a collection of key/value pairs; since Python 3.7, it preserves the order in which the keys were inserted. A tuple is an ordered collection of different data types like a list, but tuples are immutable, i.e. they cannot be modified once they are created. A set is similar to a list or tuple, but it is not an ordered collection of items and can only store unique items. These data structures serve various purposes, and their choice depends on the specific requirements of your program. We will dive deeper into Python data structures later on.
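As a first impression, the four structures can be created with literal syntax. The contents are arbitrary sample data.

```python
a_list = [1, "two", 3.0]                # ordered, mutable, mixed types allowed
a_dict = {"name": "Ada", "year": 1815}  # key/value pairs
a_tuple = (1, 2, 3)                     # ordered, immutable
a_set = {1, 2, 2, 3}                    # duplicates collapse to unique items

print(a_dict["name"])  # Ada
print(len(a_set))      # 3
```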
Comments
Comments are a crucial aspect of Python programming. They serve the purpose of documenting your code and providing context to both others who may read your code and your future self. Comments are lines of text that are not executed by the Python interpreter and are preceded by the `#` symbol.
Single-line comments are used for brief explanations within your code. While Python doesn't have a specific syntax for multi-line comments, you can use triple quotes (`'''` or `"""`) to create multi-line comment blocks, although they are typically used for so-called docstrings, which we will discuss in the section about functions and modules.
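The different kinds of comments might look as follows. The function `circle_area` and its rough value for pi are assumptions made only for this example.

```python
# A single-line comment explaining the next statement.
radius = 2.0  # an inline comment; the value is hypothetical


def circle_area(r):
    """Return the area of a circle with radius r (this is a docstring)."""
    return 3.14159 * r * r


area = circle_area(radius)
print(circle_area.__doc__)  # the docstring is available at runtime
```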
There are different opinions on how and when to comment code among programmers, but using meaningful variable and function names is considered best practice. Clear, concise single-line comments can be used if needed, but are often a sign of code smell. However, when documenting functions or modules for wider use or sharing, adhering to the docstring conventions and practices is recommended to ensure that others can understand and use your code effectively.
Operators
In Python, operators are fundamental elements used to perform various operations on values and variables. In this section, we will explore different types of operators in Python.
Assignment Operators
One of the fundamental operations in any programming language is the assignment statement. It allows us to associate a variable name with a value, enabling us to manipulate our data effectively. In Python, like many other programming languages, the equal sign (=) is used for assignment. Assignment statements can be chained together to set multiple variables to the same value. Multiple assignments are also possible by using a comma-separated list of variables and expressions.
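Chained and multiple assignment can be sketched with single-letter names chosen purely for brevity:

```python
a = b = c = 0  # chained assignment: all three names get the value 0

x, y = 1, 2    # multiple assignment: x is 1, y is 2
x, y = y, x    # the right side is evaluated first, so this swaps the values

print(a, b, c)  # 0 0 0
print(x, y)     # 2 1
```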
Without diving too deep into the internal details of Python, there is one crucial aspect of the assignment statement which needs to be understood to program effectively in Python. A variable in Python is a name that refers to an object stored in memory. When assigning one variable to another, Python does not create a new copy of the contents of the original variable; instead, both names refer to the same object. For immutable objects (e.g., numbers and strings), this behavior generally doesn't produce unexpected results: such objects cannot be changed in place, so giving one variable a new value simply makes that name refer to a different object and leaves the other variable untouched. However, with mutable objects such as lists, a change made within the list through one name is visible through every other name that refers to the same list, which can lead to surprising results. If you genuinely want to make a copy of a list instead of merely storing a reference, you can use Python's `copy` module. We will discuss this in more detail when talking about data structures.
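The difference between sharing a reference and making a copy can be observed directly. The variable names in this sketch are illustrative.

```python
import copy

# Immutable values: rebinding one name does not affect the other.
a = 1
b = a
a = 2
print(b)  # 1

# Mutable values: both names refer to the same list object.
original = [1, 2, 3]
alias = original
alias.append(4)
print(original)  # [1, 2, 3, 4]

# A (shallow) copy is an independent list.
duplicate = copy.copy(original)
duplicate.append(5)
print(original)   # [1, 2, 3, 4]
print(duplicate)  # [1, 2, 3, 4, 5]
```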
In addition to the basic assignment operator (=), Python also provides several augmented assignment operators, which combine an operation with an assignment. The following table shows the assignment operators available in Python:
Operator | Description |
---|---|
= | Assigns the value on the right to the variable on the left. |
+= | Adds the value on the right to the variable on the left and assigns the result to the variable. |
-= | Subtracts the value on the right from the variable on the left and assigns the result to the variable. |
*= | Multiplies the variable on the left by the value on the right and assigns the result to the variable. |
/= | Divides the variable on the left by the value on the right and assigns the result to the variable. |
%= | Calculates the modulus of the variable on the left and the value on the right, then assigns the result to the variable. |
//= | Performs floor division on the variable on the left and the value on the right, then assigns the result to the variable. |
**= | Raises the variable on the left to the power of the value on the right and assigns the result to the variable. |
&= | Performs a bitwise AND operation between the variable on the left and the value on the right, then assigns the result to the variable. |
\|= | Performs a bitwise OR operation between the variable on the left and the value on the right, then assigns the result to the variable. |
^= | Performs a bitwise XOR operation between the variable on the left and the value on the right, then assigns the result to the variable. |
<<= | Shifts the bits of the variable on the left to the left by the number of positions specified by the value on the right and assigns the result to the variable. |
>>= | Shifts the bits of the variable on the left to the right by the number of positions specified by the value on the right and assigns the result to the variable. |
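The operators from the table can be applied one after another to a single variable. The starting value is arbitrary, and the comment after each line gives the value of `n` at that point.

```python
n = 10
n += 5   # 15
n -= 3   # 12
n *= 2   # 24
n //= 5  # 4
n **= 2  # 16
n %= 7   # 2
n <<= 3  # 16
n |= 1   # 17
n ^= 3   # 18
n &= 6   # 2
n >>= 1  # 1
n /= 2   # 0.5 (true division always yields a float)

print(n)  # 0.5
```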
Arithmetic Operators
Arithmetic operators are used to perform basic mathematical operations in Python, including addition, subtraction, multiplication, and division. Python supports all the binary arithmetic operators shown in the following table.
Operator | Description |
---|---|
+ | Adds two operands. |
- | Subtracts the right operand from the left operand. |
* | Multiplies two operands. |
/ | Divides the left operand by the right operand (float division). |
// | Divides the left operand by the right operand and returns the floor value (integer division). |
% | Returns the remainder when the left operand is divided by the right operand. |
** | Raises the left operand to the power of the right operand. |
A binary operator operates on exactly two elements, one on each side of the operator's symbol. When both operands are integers, the result is generally an integer as well; the main exception is the / operator, which always returns a floating-point number in Python 3, even if the division leaves no remainder. If one of the operands is a floating-point number, the result is a floating-point number, too. Python also provides unary operators for plus and minus. Any expression returning a single numeric value can be preceded either by a minus or plus sign.
The precedence of Arithmetic Operators in Python is as follows:
- P - Parentheses
- E - Exponentiation
- M - Multiplication (Multiplication and division have the same precedence)
- D - Division
- A - Addition (Addition and subtraction have the same precedence)
- S - Subtraction
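The arithmetic operators and their precedence can be verified with a few expressions. The variable names are chosen only to label the results.

```python
true_div = 7 / 2       # 3.5
exact_div = 6 / 2      # 3.0 (still a float)
floor_div = 7 // 2     # 3
neg_floor = -7 // 2    # -4 (rounds towards negative infinity)
remainder = 7 % 2      # 1
power = 2 ** 10        # 1024
mixed = 1 + 2.0        # 3.0 (one float operand gives a float result)

no_parens = 2 + 3 * 4      # 14: multiplication before addition
with_parens = (2 + 3) * 4  # 20: parentheses first
exp_first = 2 * 3 ** 2     # 18: exponentiation before multiplication
unary = -2 ** 2            # -4: exponentiation binds tighter than unary minus
```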
Comparison Operators
Comparison operators, also known as relational operators, are used in Python to compare values. They evaluate to either `True` or `False` based on the specified condition. The comparison operators in Python are shown in the following table.
Operator | Description |
---|---|
== | Equal to: Checks if the operands are equal. |
!= | Not equal to: Checks if the operands are not equal. |
< | Less than: Checks if the left operand is less than the right operand. |
> | Greater than: Checks if the left operand is greater than the right operand. |
<= | Less than or equal to: Checks if the left operand is less than or equal to the right operand. |
>= | Greater than or equal to: Checks if the left operand is greater than or equal to the right operand. |
In Python, comparison operators have lower precedence than arithmetic operators. All comparison operators have the same precedence order.
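A brief sketch with an arbitrary value shows the comparison operators and the fact that arithmetic is evaluated before the comparison:

```python
x = 5

is_equal = x == 5       # True
is_different = x != 5   # False
at_most_four = x <= 4   # False
after_sum = x + 1 > 5   # True: evaluated as (x + 1) > 5

print(is_equal, is_different, at_most_four, after_sum)
```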
Logical Operators
Logical operators are used to perform the logical operations "AND", "OR", and "NOT", which Python writes as the lowercase keywords `and`, `or`, and `not`. They are typically used to combine conditional statements and create more complex conditions. The precedence of logical operators in Python, from highest to lowest, is as follows:
- Logical NOT
- Logical AND
- Logical OR
The following example illustrates how to use logical operators in Python.
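A minimal sketch with invented values (`age`, `has_ticket`) could look like this. The last line shows that `and` is evaluated before `or`.

```python
age = 25
has_ticket = False

can_enter = age >= 18 and has_ticket     # False: both sides must be true
needs_help = not has_ticket or age < 18  # True: "not" is applied first

# "and" binds tighter than "or", so this reads True or (False and False).
precedence = True or False and False     # True

print(can_enter, needs_help, precedence)  # False True True
```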
Bitwise Operators
Bitwise operators are used to operate on binary numbers and perform bit-by-bit operations. An overview of the bitwise operators in Python is shown in the following table.
Operator | Description |
---|---|
& | Bitwise AND: Performs a bitwise AND operation on the corresponding bits of the operands. |
\| | Bitwise OR: Performs a bitwise OR operation on the corresponding bits of the operands. |
^ | Bitwise XOR: Performs a bitwise XOR (exclusive OR) operation on the corresponding bits of the operands. |
~ | Bitwise NOT: Inverts the bits of the operand, changing 1s to 0s and vice versa. |
<< | Left Shift: Shifts the bits of the left operand to the left by the number of positions specified by the right operand. |
>> | Right Shift: Shifts the bits of the left operand to the right by the number of positions specified by the right operand. |
The precedence of Bitwise Operators in Python is as follows:
- Bitwise NOT
- Bitwise Shift
- Bitwise AND
- Bitwise XOR
- Bitwise OR
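The bitwise operators can be tried on two small binary literals. The operands are arbitrary, and `bin()` is used to display a result in binary form.

```python
a = 0b1100  # 12
b = 0b1010  # 10

and_result = a & b   # 0b1000 -> 8
or_result = a | b    # 0b1110 -> 14
xor_result = a ^ b   # 0b0110 -> 6
not_result = ~a      # -13 (for integers, ~x equals -x - 1)
left = a << 2        # 48
right = a >> 2       # 3

# & is evaluated before |, so this reads 4 | (1 & 2).
precedence = 4 | 1 & 2  # 4

print(bin(and_result))  # 0b1000
```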
Identity and Membership Operators
In Python, identity and membership operators are essential for comparing and checking the relationships between values, variables, and sequences. The `is` operator is used to check if two values or variables refer to the same object in memory. It returns `True` if they do and `False` otherwise. It is not used to compare the values themselves, but rather their identity. The `is not` operator is the negation of `is`. It returns `True` if two values or variables do not refer to the same object and `False` if they do.
The `in` operator is used to check if a value or variable is present in a sequence or other collection (e.g., a list, tuple, string, or set). It returns `True` if the value is found and `False` otherwise. The `not in` operator is the negation of `in`.
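The difference between identity and equality, as well as the membership tests, can be sketched as follows with sample lists:

```python
a = [1, 2, 3]
b = a          # b refers to the same list object as a
c = [1, 2, 3]  # c is a separate list with equal contents

same_object = a is b      # True
equal_but_not_same = (a == c) and (a is not c)  # True

has_two = 2 in a                 # True
has_substring = "th" in "Python" # True
missing = 5 not in a             # True

print(same_object, equal_but_not_same, has_two, has_substring, missing)
```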
Python provides a concise way to evaluate an expression based on a condition using the ternary operator, also called a conditional expression. The basic syntax is `[on_true] if [expression] else [on_false]`. This allows you to write compact code instead of needing a multi-line if-else statement. The ternary operator is especially useful when you need to assign a value to a variable based on a condition in a single line.
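For example, with a made-up temperature value:

```python
temperature = 15  # hypothetical value

# One line instead of a four-line if/else statement.
label = "warm" if temperature > 20 else "cold"

print(label)  # cold
```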
Control Structures
Conditions
In Python, control structures like the `if` statement, along with optional `elif` and `else` statements, are the foundation for executing conditional logic in your programs. The `if` statement evaluates the expression following it. If the expression is true, Python executes the statement(s) that follow. If the expression is false, Python continues with the `elif` statement (if there is one), and tests that expression. It proceeds to test the expressions associated with any `elif` statements in order, executing the first set of statements for which the expression is true. If none of the `elif` expressions are true, and there is an `else` statement, Python executes the statement(s) associated with the `else`. If there's no `else` statement, Python simply moves on to the statements following the `if` block. It's important to note that once Python encounters a true expression associated with an `if` or `elif` statement, it executes the corresponding statement(s) and doesn't evaluate any other expressions that follow.
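The evaluation order can be seen in a small sketch. The function name `grade` and its thresholds are invented for illustration; only the first matching branch is executed.

```python
def grade(score):
    """Return a letter for a numeric score (thresholds are hypothetical)."""
    if score >= 90:
        return "A"
    elif score >= 75:
        return "B"  # reached only if the first test was false
    else:
        return "C"  # reached only if no test above was true


print(grade(95))  # A (the elif test is never evaluated)
print(grade(80))  # B
print(grade(50))  # C
```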
Loops
Python provides two primary types of loops: the `for` loop and the `while` loop.
Python, like other programming languages, provides a `for` loop which allows you to iterate. Although counting through index positions, as is common in other languages, is possible, there is a more "pythonic" and powerful way to iterate in Python. The `for ... in` loop allows you to iterate directly over all the values in a sequence (e.g. string, list or tuple), performing the same task on each element. If the elements of the sequence being iterated over are themselves tuples or lists, a comma-separated list of variables can be used to unpack each individual element of the sequence. If you need an index, you can use `enumerate()`.
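The three variants (direct iteration, unpacking, and `enumerate()`) are sketched below on arbitrary sample data.

```python
fruits = ["apple", "banana", "cherry"]

# Iterate directly over the values.
lengths = []
for fruit in fruits:
    lengths.append(len(fruit))

# Unpack each tuple into two loop variables.
pairs = [("a", 1), ("b", 2)]
combined = []
for letter, number in pairs:
    combined.append(letter * number)

# enumerate() yields the index together with the value.
numbered = []
for index, fruit in enumerate(fruits):
    numbered.append(str(index) + ": " + fruit)

print(lengths)   # [5, 6, 6]
print(combined)  # ['a', 'bb']
print(numbered)  # ['0: apple', '1: banana', '2: cherry']
```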
The `for` loop provides a great way to process the elements of a sequence. However, sometimes it is necessary to do some repetitive computation which is not based on a sequence. In those cases, the `while` loop can be used. When Python encounters a `while` loop, it first tests the expression provided. If the expression is false and there's an `else` clause, Python executes the statements following the `else`. Without an `else` clause, if the expression is false, control moves to the first statement after the `while` loop. If the expression is true, Python executes the statements following the `while` statement. Once those statements are completed, the expression is tested again, and the process repeats. As long as the expression remains true, Python continues executing the statements after the `while`. When the expression becomes false, the statements after the `else` (if present) are executed. The `else` clause is skipped if the loop is left with a `break` statement.
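A countdown makes the flow of a `while` loop with an `else` clause visible. The names `countdown` and `steps` are illustrative.

```python
countdown = 3
steps = []

while countdown > 0:         # tested before every iteration
    steps.append(countdown)
    countdown -= 1
else:
    steps.append("liftoff")  # runs once the expression becomes false

print(steps)  # [3, 2, 1, 'liftoff']
```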
It's worth mentioning that Python programmers often use `while` loops in a somewhat unconventional way. They may create "infinite" loops and control when to exit the loop with statements inside the loop. To exit a loop (`for` or `while`) prematurely, you can use the `break` statement. The `continue` statement, on the other hand, is used to skip the remaining part of the current iteration and move on to the next one. In a `for` loop, the loop variable is automatically set to the next item of the sequence after a `continue` statement, so the next iteration proceeds as usual.
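Both statements are sketched below: first in an "infinite" `while` loop that sums the odd numbers up to 7, then in a `for` loop that skips negative values. The numbers are arbitrary.

```python
total = 0
n = 0
while True:         # "infinite" loop, left via break
    n += 1
    if n % 2 == 0:
        continue    # skip even numbers
    if n > 7:
        break       # leave the loop entirely
    total += n      # adds 1, 3, 5 and 7

kept = []
for value in [1, -2, 3, -4]:
    if value < 0:
        continue    # the loop moves on to the next item
    kept.append(value)

print(total)  # 16
print(kept)   # [1, 3]
```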
Data Structures
Data structures are a way of organizing data so that it can be accessed more efficiently depending upon the situation. They are the fundamentals of any programming language around which a program is built. Python provides several built-in data structures, each with its unique characteristics and use cases. Understanding these data structures is essential for effective programming, as they play a fundamental role in organizing and processing data in Python. Let's dive into these data structures and their capabilities.
Lists
Lists serve as a fundamental tool in Python for holding collections of objects that are indexed using numerical positions. These objects within a list can span various types, including numbers, strings, functions, user-defined objects, and even other lists. This versatility enables the creation of complex data structures with ease. To define a list in Python, enclose the desired elements in square brackets, separated by commas. To initialize an empty list, simply use square brackets without any elements inside. Accessing individual elements within a list is achieved by employing square brackets after the list's name, specifying the index of the desired element. It's important to remember that Python employs zero-based indexing, meaning the first element corresponds to index 0. In the case of nested lists (lists within lists), you can utilize additional sets of square brackets to access individual elements. To get the number of elements within a list, Python offers the built-in `len()` function.
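Creating lists and accessing their elements might look like this, with arbitrary sample contents:

```python
empty = []
mixed = [1, "two", 3.0, [4, 5]]  # different types, including a nested list

first = mixed[0]     # 1 (zero-based indexing)
nested = mixed[3][1] # 5 (second element of the inner list)

print(len(empty), len(mixed))  # 0 4
```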
List Indexing and Slicing
The slicing operations introduced in the section about strings are equally applicable to lists, with a particularly useful extension. In addition to utilizing slicing to extract a section of a list, you can employ slicing to assign values to elements within a list, indicated by a slice positioned on the left side of an equal sign. This distinction arises from the fact that lists are mutable objects, while strings are immutable. When assigning values using this approach, if the count of elements in the list on the right side of the assignment doesn't match the number of elements suggested by the slice's subscript, the list will automatically adjust its size to accommodate the assignment. Conversely, assignments performed via a single subscript will consistently maintain the list's length. Slices further offer the capability to remove elements from a list. Alternatively, the `del` statement can be employed to remove items from a list. To use the `del` statement, specify the element or slice of the list that requires deletion. Another important use of slices is to make a separate modifiable copy of a list. In this case, a slice is created without a starting and ending index. Recall what we learned earlier when discussing variable assignment.
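The following sketch walks through slice assignment, deletion and copying on a sample list. The comments show the list after each step.

```python
letters = ["a", "b", "c", "d", "e"]

letters[1:3] = ["X"]  # two elements replaced by one -> ['a', 'X', 'd', 'e']
letters[0] = "A"      # single subscript keeps the length -> ['A', 'X', 'd', 'e']
letters[1:2] = []     # assigning an empty list removes -> ['A', 'd', 'e']
del letters[0]        # del removes by position -> ['d', 'e']

clone = letters[:]    # a slice without indices creates a separate copy
clone.append("f")

print(letters)  # ['d', 'e']
print(clone)    # ['d', 'e', 'f']
```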
List Operators
Concatenation
To merge the contents of two lists, the plus sign is used to concatenate them. The outcome is a new list whose length equals the sum of both original lists' lengths. This new list contains all the elements from the first list followed by all the elements from the second list. List concatenation only works when combining two lists. To add a single value to the end of a list, you either need to surround the value with square brackets or use the `append()` method.
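A short sketch with two small sample lists:

```python
a = [1, 2]
b = [3, 4]

combined = a + b       # [1, 2, 3, 4]; a and b stay unchanged
with_value = a + [5]   # wrap the single value in brackets -> [1, 2, 5]
# a + 5 would raise a TypeError, because 5 is not a list.

a.append(6)            # append() changes the list in place -> [1, 2, 6]

print(combined)    # [1, 2, 3, 4]
print(with_value)  # [1, 2, 5]
print(a)           # [1, 2, 6]
```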
Repetition
As for strings, the asterisk is overloaded for lists to serve as a repetition operator. Applying repetition to a list results in a single list with the elements of the original list repeated as many times as specified. A list consisting of other lists can be created by surrounding a list to be repeated with an extra set of square brackets.
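For illustration, with arbitrary values. Note that the repeated inner lists in the last example are references to one and the same list object.

```python
zeros = [0] * 3        # [0, 0, 0]
pattern = [1, 2] * 2   # [1, 2, 1, 2]
nested = [[1, 2]] * 2  # [[1, 2], [1, 2]]

# Both entries of nested refer to the same inner list.
shared = nested[0] is nested[1]

print(zeros, pattern, nested, shared)
```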
The in Operator
The in operator provides a convenient way to determine whether a specific value is contained in a list. Place the value on the left side of the operator and the list on the right side. The result is either True or False, which makes it well suited for conditional statements. Take care when constructing expressions with the in operator: the value must compare equal to an element of the list for the result to be True, so the string "1" does not match the integer 1. The left side may also be an expression, which is evaluated before the membership test.
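A minimal sketch of membership tests on an example list (the values are arbitrary):

```python
values = [1, 2, "3"]

print(2 in values)        # True
print("3" in values)      # True
print(3 in values)        # False: the list holds the string "3", not the integer 3
print(1 + 1 in values)    # True: the expression 1 + 1 is evaluated first
print(5 not in values)    # True

# Typical use inside a conditional statement.
if "3" in values:
    message = "found"
else:
    message = "missing"
print(message)            # found
```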
List Comprehension
List comprehensions are a concise and elegant way to create lists in Python. They allow you to generate a new list by applying an expression to each item in an existing iterable (like a list, tuple, or range) and optionally applying a condition to filter the items. List comprehensions not only make your code more compact, but they also enhance its readability by encapsulating complex operations in a comprehensible format.
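The sketch below shows a comprehension with and without a filter condition, next to the equivalent loop. The numbers used are arbitrary examples.

```python
numbers = range(1, 11)

# Apply an expression to every item.
squares = [n * n for n in numbers]
print(squares)            # [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

# Add a condition to filter the items.
even_squares = [n * n for n in numbers if n % 2 == 0]
print(even_squares)       # [4, 16, 36, 64, 100]

# The same result written as a conventional loop.
even_squares_loop = []
for n in numbers:
    if n % 2 == 0:
        even_squares_loop.append(n * n)
print(even_squares_loop == even_squares)   # True
```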
Functions and Methods for Lists
We already learnt that the len() function returns the number of elements in a list. But there are more built-in functions and methods for lists in Python. For a single list input, the min() function returns the smallest element of the list, while the max() function returns the largest. To append a single element at the end of a list, use the append() method. If you need to add several elements to the end of a list, use either the concatenation operator or the extend() method. To insert an element at a position other than the end of the list, use the insert() method. It takes two arguments: the index of the insertion location and the item itself. Note that insert() inserts a single element; for multiple insertions, slicing can be used. To remove an item from a list based on its value, use the remove() method. It removes only the first occurrence of the value within the list. Both the reverse() and sort() methods operate in place, meaning they change the original order of elements within the list. If the initial order must be preserved, make a copy of the list first. By default, the sort() method sorts numbers in numerical order and strings in alphabetical order. Lists of lists are sorted by first comparing their initial elements and continuing through the remaining elements until one list differs from the other. In Python 3, however, values that cannot be compared with each other, such as numbers and strings, cannot be sorted together; sort() raises a TypeError in that case. The count() method accepts a single argument and returns how often that value occurs in the list. The index() method returns the index (subscript) of the first occurrence of a specified value.
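The functions and methods described above can be tried out on a small example list. The list contents are made up for this sketch, and the comments show the state of the list after each step.

```python
scores = [40, 10, 30]
print(len(scores), min(scores), max(scores))   # 3 10 40

scores.append(20)          # [40, 10, 30, 20]
scores.extend([10, 50])    # [40, 10, 30, 20, 10, 50]
scores.insert(1, 60)       # [40, 60, 10, 30, 20, 10, 50]
scores.remove(10)          # first 10 removed: [40, 60, 30, 20, 10, 50]

print(scores.count(10))    # 1
print(scores.index(30))    # 2

# reverse() and sort() work in place, so keep a copy if the order matters.
original_order = scores[:]
scores.reverse()
print(scores)              # [50, 10, 20, 30, 60, 40]
scores.sort()
print(scores)              # [10, 20, 30, 40, 50, 60]
print(original_order)      # [40, 60, 30, 20, 10, 50]
```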
Tuples
Tuples, like lists, are a fundamental data structure in Python, used to store collections of items. They can contain elements of various data types, including numbers, strings, and even other tuples. Tuples are similar to lists but there is one important difference, namely that tuples are not mutable. This means that once a tuple is created, its elements can't be modified in place. The immutability of tuples makes them valuable in scenarios where you want to ensure that the data remains constant throughout the program's execution. Tuples are constructed by enclosing their values within parentheses. When a tuple is not embedded within an expression, the parentheses may be omitted. To create an empty tuple, simply utilize a pair of empty parentheses. However, due to the use of parentheses for grouping in arithmetic expressions, it is essential to include a comma after the sole element to explicitly denote a tuple with only one element in an assignment statement.
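A brief sketch of the different ways to write a tuple, using arbitrary example values:

```python
point = (3, 4)
also_point = 3, 4          # parentheses omitted
empty = ()
single = (5,)              # the trailing comma makes this a tuple
not_a_tuple = (5)          # without the comma this is just the integer 5

print(point == also_point)           # True
print(type(single).__name__)         # tuple
print(type(not_a_tuple).__name__)    # int
print(len(empty))                    # 0

# Tuples are immutable: item assignment raises a TypeError.
try:
    point[0] = 10
except TypeError as error:
    print("TypeError:", error)
```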
Operators and Indexing for Tuples
The same operators as mentioned in the previous section for lists apply to tuples. Keep in mind that tuples are immutable. Thus, slicing operations for tuples are more similar to strings than lists. Slicing can be used to extract parts of a tuple, but not to change it.
Functions and Methods for Tuples
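Tuples provide only two methods, count() and index(), which work as they do for lists. Since tuples are immutable, they have no methods that modify them. However, tuples and lists can be easily converted to each other using the built-in functions list and tuple.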
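As a small sketch with made-up values, the following lines show slicing and operators on a tuple and the conversion between tuples and lists:

```python
weekdays = ("Mon", "Tue", "Wed", "Thu", "Fri")

print(weekdays[1:3])            # ('Tue', 'Wed')
print("Wed" in weekdays)        # True

# Concatenation creates a new tuple; the original is unchanged.
with_saturday = weekdays + ("Sat",)
print(len(weekdays), len(with_saturday))   # 5 6

# Convert to a list to modify the contents, then back to a tuple.
as_list = list(weekdays)
as_list.append("Sun")
back_to_tuple = tuple(as_list)
print(back_to_tuple[-1])        # Sun
```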
Dictionaries
Dictionaries in Python share similarities with lists, as they can store arbitrary objects and be nested to any desired depth. However, unlike lists, dictionaries are indexed by keys, which can be any immutable object, such as strings or tuples. Let's delve into this concept with a straightforward example: imagine a scenario where we want to store student matriculation numbers as tuples inside a list, where the first tuple element represents the student's name, and the second element is their matriculation number. However, if we wish to retrieve the matriculation number of a specific student, we'd need to search through each element in the list to find the tuple with the student's name as the first element before we can access the desired number. With a dictionary, we can use the student's name as the index, often referred to as a key. This significantly simplifies the process of retrieving the information we need:
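A minimal sketch of this comparison might look as follows. The student names and matriculation numbers are invented for illustration.

```python
# Variant 1: a list of (name, matriculation number) tuples.
students_list = [("Alice", 123456), ("Bob", 234567), ("Carol", 345678)]

found = None
for name, number in students_list:
    if name == "Bob":          # every tuple has to be inspected
        found = number
        break
print(found)                   # 234567

# Variant 2: a dictionary with the name as key.
students = {"Alice": 123456, "Bob": 234567, "Carol": 345678}
print(students["Bob"])         # 234567
```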
As the example above illustrates, a dictionary can be initialized using a comma-separated list of key/value pairs enclosed in curly braces. An empty dictionary can be created with a pair of empty curly braces. It's important to note that dictionary keys are not limited to strings, and they don't need to be of the same data type. However, it's crucial to remember that mutable objects, such as lists, cannot be used as dictionary keys.
Operators and Indexing for Dictionaries
You can add key/value pairs to a dictionary using assignment statements. To remove a specific key/value pair from a dictionary, use the del statement. You can use the in operator to check if a specific key exists in a dictionary. It returns True if the key is present and False otherwise.
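These operations can be sketched on a small dictionary with invented entries:

```python
students = {"Alice": 123456, "Bob": 234567}

students["Carol"] = 345678     # add a new key/value pair
students["Bob"] = 999999       # assigning to an existing key overwrites its value
del students["Alice"]          # remove a key/value pair

print("Alice" in students)     # False
print("Carol" in students)     # True
print(students)                # {'Bob': 999999, 'Carol': 345678}
```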
Functions and Methods for Dictionaries
Dictionaries offer several built-in methods that provide convenient ways to perform common operations. Let's take a look at some examples:
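The selection of methods in the following sketch (keys(), values(), items(), get(), update(), and pop()) is a choice made here for illustration; the data is invented.

```python
students = {"Alice": 123456, "Bob": 234567}

print(list(students.keys()))      # ['Alice', 'Bob']
print(list(students.values()))    # [123456, 234567]

# items() yields (key, value) pairs, which is handy in loops.
for name, number in students.items():
    print(name, number)

# get() returns None or a given default instead of raising a KeyError.
print(students.get("Dave"))       # None
print(students.get("Dave", 0))    # 0

# update() merges another dictionary; pop() removes a key and returns its value.
students.update({"Carol": 345678})
removed = students.pop("Alice")
print(removed)                    # 123456
print(students)                   # {'Bob': 234567, 'Carol': 345678}
```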
Sets
Python's built-in set type is similar to dictionaries. Sets are unordered and their elements are unique. A set itself can be modified, but the elements in the set must be immutable. There are two ways to create a set in Python: you can use the built-in set() function or, alternatively, define a set using curly braces. To define an empty set, the set() function must be used, because Python interprets empty curly braces as an empty dictionary. The elements in a set can be objects of different types as long as they are immutable. Thus, a tuple may be included in a set, but lists and dictionaries are mutable, so they can't be set elements.
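A short sketch of creating sets, with example values chosen here:

```python
from_list = set([1, 2, 2, 3, 3, 3])   # duplicates are dropped
print(from_list == {1, 2, 3})         # True

mixed = {"a", 1, (2, 3)}              # elements of different types
print(len(mixed))                     # 3

empty_set = set()
empty_dict = {}
print(type(empty_set).__name__)       # set
print(type(empty_dict).__name__)      # dict

# A list is mutable and therefore cannot be a set element.
try:
    invalid = {[1, 2]}
except TypeError as error:
    print("TypeError:", error)
```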
Functions and Methods for Sets
As for lists, the len() function can be used to get the number of elements in a set, and the in and not in operators can be used to test for membership. However, many of the operations that can be used for Python's other data types don't make sense for sets. For example, sets can't be indexed or sliced. Nevertheless, Python provides a variety of operations on set objects which are similar to the operations defined for mathematical sets. The union of two or more sets can be computed with the "|" operator or with the union() method. The resulting set contains all elements that are present in any of the specified sets. To compute the intersection of two or more sets, i.e. a set containing only elements that are present in all of the specified sets, the "&" operator or the intersection() method can be used. The difference between two or more sets can be computed using the "-" operator or the difference() method. For two sets, this returns a set containing all elements that are in the first set but not in the second set. When multiple sets are specified, the operation is performed from left to right. There are many more operations available, such as update() for adding the elements of another set, add() to add an element to a set, remove() to remove an element from a set, and clear() to remove all elements from a set. For more details, refer to the Python documentation. There's also another built-in type in Python called frozenset, which behaves like a set but is immutable.
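The set operations above can be sketched with three small example sets. The results are passed through sorted() only to make the printed order predictable.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}
c = {1, 4, 6}

print(sorted(a | b))                  # [1, 2, 3, 4, 5]
print(sorted(a.union(b, c)))          # [1, 2, 3, 4, 5, 6]
print(sorted(a & b))                  # [3, 4]
print(sorted(a.intersection(b, c)))   # [4]
print(sorted(a - b))                  # [1, 2]
print(sorted(a - b - c))              # [2]  (evaluated from left to right)

# Modifying a set in place.
s = {1, 2}
s.add(3)
s.update({4, 5})
s.remove(1)
print(sorted(s))                      # [2, 3, 4, 5]
s.clear()
print(len(s))                         # 0

# A frozenset supports the same queries but cannot be modified.
frozen = frozenset(b)
print(3 in frozen)                    # True
try:
    frozen.add(6)
except AttributeError as error:
    print("AttributeError:", error)
```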
Functions and Modules
Functions
Functions are a fundamental component of any programming language for two primary reasons. First, they enable code reusability, eliminating the need to duplicate and modify code each time it is used. Second, functions allow you to logically isolate various sub-tasks that inevitably arise when working on a program. This programming approach is known as modular programming and is generally considered a best practice for writing readable and maintainable code.
In Python, functions are treated as objects. As a result, they can be assigned to variables, stored in lists or tuples, passed as arguments to other functions, and more. However, functions possess a unique property that distinguishes them from other Python objects: they can accept a list of arguments enclosed in parentheses and optionally return a value. You may already be familiar with some of Python's built-in functions, such as len(), and you've encountered methods, which are similar to functions and defined in a similar manner. In this section, we will delve into various aspects of functions, including how to create and import them into your programs.
Variable Scoping
In Python, you can typically reference variables anywhere in your program. However, when you define a function, Python creates a distinct namespace within that function. A namespace serves as a mapping between object names in your program and their corresponding memory locations where Python stores their values. Consequently, when a variable is created inside a function, Python recognizes it as a local variable within that function's namespace, distinct from variables with the same name defined elsewhere in the program. You can also reference a variable inside a function that already exists in your program at the time the function is called. Additionally, you can declare global variables using the global statement, though this should only be used in situations where no other reasonable solution is available. Consequently, we won't delve into global variables in detail here.
When Python resolves a name in your program, names within a function's local namespace take precedence, followed by names in the namespaces of any enclosing functions, then names of global objects or objects imported into the global namespace from a module. Finally, built-in object names are searched as a last resort. Due to this search order, Python's scoping is commonly referred to as the LEGB rule (Local, Enclosing, Global, Built-in); older texts that leave out the enclosing scope call it the LGB rule.
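The lookup order can be observed with a few small functions. This is a minimal sketch; the names counter, read_global, shadow_global, and increment_global are made up for the example.

```python
counter = 10                  # a global name

def read_global():
    # No local name 'counter' exists, so the global one is found.
    return counter

def shadow_global():
    counter = 99              # creates a local variable; the global is untouched
    return counter

def increment_global():
    global counter            # explicitly refer to the global name
    counter += 1

print(read_global())          # 10
print(shadow_global())        # 99
print(counter)                # 10

increment_global()
print(counter)                # 11

# len is found neither locally nor globally, so the built-in is used.
print(len("abc"))             # 3
```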
Function Basics
In Python, functions are defined using the def statement. The function's name follows the def keyword, followed by a parenthesized list of arguments to be passed to the function. If a function does not require any arguments, an empty set of parentheses is used. A colon (:) comes after the parenthesized list. On the next line, an indented triple-quoted string, known as a docstring, provides documentation about the function. While not strictly necessary, docstrings are good practice. Python docstrings can be written in several formats. The Google and NumPy formats are probably the most common ones. The function body follows the def line and the optional docstring. The function returns control to the calling environment when it encounters a return statement or when it reaches the end of the function body. If the function ends without executing a return statement, calling it returns the value None. To call a function, you refer to its name followed by a parenthesized list of arguments. For functions with no arguments, you simply use empty parentheses.
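A minimal sketch of two function definitions follows. The function names are hypothetical, and the first docstring loosely follows the Google format mentioned above.

```python
def rectangle_area(width, height):
    """Return the area of a rectangle.

    Args:
        width: Width of the rectangle.
        height: Height of the rectangle.

    Returns:
        The product of width and height.
    """
    return width * height


def greet():
    """Print a greeting; this function has no return statement."""
    print("Hello!")


area = rectangle_area(3, 4)
print(area)               # 12

result = greet()          # prints: Hello!
print(result)             # None
```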
Named Arguments and Default Values
When passing arguments to a function without specifying their names, Python assumes that you've arranged the arguments in the order they were defined in the function. To mitigate the need to remember argument order, Python allows you to use named arguments. When calling a function, you can precede some or all of the arguments with a name and an equal sign, making it clear which argument corresponds to which parameter in the function. For added flexibility, when defining a function, Python allows you to specify default values for some or all of the arguments, making it optional for users to provide these arguments. To set a default value when defining a function, use a syntax similar to naming arguments when calling a function, appending an equal sign followed by the desired default value. When combining named and unnamed arguments in a function call, unnamed arguments must precede named ones in the argument list. Thus, place required arguments before optional ones in the function's argument list.
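As a small illustration, the hypothetical function power below has one required argument and one argument with a default value:

```python
def power(base, exponent=2):
    """Raise base to exponent; exponent defaults to 2."""
    return base ** exponent

print(power(3))                      # 9   (default exponent used)
print(power(3, 3))                   # 27  (positional arguments)
print(power(base=2, exponent=5))     # 32  (named arguments)
print(power(exponent=3, base=2))     # 8   (order is irrelevant for named arguments)
print(power(2, exponent=4))          # 16  (unnamed before named)
```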
Variable Number of Arguments
In some cases, it's impossible to predict in advance how many arguments a function will receive. By designating a function argument with a name beginning with an asterisk, Python will collect all unnamed arguments passed to the function into a tuple, which can be accessed using that argument's name. A similar technique can be employed to create functions that can handle an unlimited number of keyword-argument pairs. If an argument in a function is prefixed with two asterisks, Python will collect all keyword-argument pairs that were not explicitly declared as arguments into a dictionary. This argument must be the last one in the function definition. This approach enables you to write a function that accepts any named parameter, even if you don't know the parameter name when writing the function.
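The following sketch shows both mechanisms. The parameter names args and kwargs are a widespread convention rather than a requirement, and the functions describe and total are invented for this example.

```python
def describe(title, *args, **kwargs):
    """Return the arguments exactly as Python collected them."""
    return title, args, kwargs

result = describe("demo", 1, 2, 3, color="red", size=10)
print(result)     # ('demo', (1, 2, 3), {'color': 'red', 'size': 10})

def total(*numbers):
    """Sum up any number of unnamed arguments."""
    return sum(numbers)

print(total())            # 0
print(total(1, 2, 3))     # 6
```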
Functional Programming and Anonymous Functions
When you need to perform the same operation on a list of objects, an alternative to using a for loop is the map function. This function accepts another function as its first argument and one or more lists as additional arguments. The number of provided lists must match the number of arguments expected by the mapped function. Python offers the lambda operator for creating anonymous functions when you only need a function once. Anonymous functions are limited to a single expression, whose value becomes the return value of the function. Instead of enclosing the arguments in parentheses, you list them after the lambda keyword, separated by commas.
Another functional programming tool in Python is the filter function. Like map, it takes a function as its first argument and a list as its second argument. However, filter returns only those elements for which the function returns True. You might also recall list comprehensions as an alternative way to apply an expression to all elements of a list. In fact, a list comprehension with a single for clause is similar to a call to map. By adding an if clause to the comprehension, it becomes similar to embedding a call to filter within a map.
map and filter both return so-called iterators. An iterator is an object that can be iterated upon, allowing you to traverse through all its values. In Python, an iterator is an object that implements the iterator protocol, which consists of the methods __iter__ and __next__. We will discuss these so-called magic methods in more detail in the next section. The key takeaway here is that map and filter return iterators, so the results need to be converted to lists if you want to index them, print them, or use them more than once.
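A compact sketch of map, filter, and lambda, together with the equivalent list comprehension, is given below. The helper function square and the input values are chosen here for illustration.

```python
numbers = [1, 2, 3, 4, 5, 6]

def square(n):
    return n * n

# map applies a function to every element; list() materializes the iterator.
squares = list(map(square, numbers))
print(squares)                        # [1, 4, 9, 16, 25, 36]

# A two-argument lambda needs two input lists.
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20, 30]))
print(sums)                           # [11, 22, 33]

# filter keeps the elements for which the function returns True.
evens = list(filter(lambda n: n % 2 == 0, numbers))
print(evens)                          # [2, 4, 6]

# filter embedded in map, and the equivalent list comprehension.
even_squares_map = list(map(square, filter(lambda n: n % 2 == 0, numbers)))
even_squares_comp = [n * n for n in numbers if n % 2 == 0]
print(even_squares_map == even_squares_comp)   # True

# An iterator hands out its values one at a time and is then exhausted.
iterator = map(square, [1, 2])
first = next(iterator)
second = next(iterator)
remaining = list(iterator)
print(first, second, remaining)       # 1 4 []
```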
Modules
Python's core design philosophy emphasizes simplicity and efficiency, keeping the language small and easy to learn. However, many programming tasks require additional capabilities, which are provided through the use of modules. In Python, a module is a collection of code that groups together functions, classes, and variables according to functionality. Modules serve as a way to organize your code into reusable and logically grouped components. They are essential for managing complexity in larger programs and promoting code reuse across projects. Python comes with an extensive standard library of modules that provide pre-built functionality for various tasks. We will explore some of the most commonly used standard modules distributed with Python later on. In addition to the standard modules, there are many additional packages available for installation, such as those from PyPI or Conda. However, exercise caution when installing third-party packages to ensure their trustworthiness and compatibility with your project.
To create a module, you simply save your Python code in a separate .py file with a descriptive name, such as my_module.py. This file should contain functions, classes, or variables that you want to reuse in other parts of your program or in other programs. As discussed earlier in the context of variable scoping, Python identifies which variable or function you are referring to by resolving its name in a namespace. To use functions, classes, or variables from a module in your Python program, you need to import the module. Python provides three ways to do this. The simplest way to access a module is to provide the module's name to the import statement. For example, to import a module named my_module, you'd use import my_module. This makes all the functions, classes, and variables in the module available under the my_module namespace. If you only need specific items from a module, you can import them directly, for instance with from my_module import my_function. This way, you can use my_function without needing to prefix it with my_module. The module is still loaded and executed as a whole, so this form saves typing rather than time, and it removes the explicit connection between the imported function and the module it was imported from. You also need to list every function you want to use in the import statement. Finally, you can import all of the objects from a module into the current namespace by using an asterisk in place of the names, e.g. from my_module import *. Since this may override existing objects and even built-in objects, such wildcard imports should be used carefully. To give a module a shorter name, you can use the as keyword to create an alias. This is useful when working with modules with long names or to avoid naming conflicts. For example, import my_long_module_name as mlm.
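Since my_module is only a placeholder name, the following sketch uses the math module from the standard library to show the different import forms in a runnable way.

```python
# Import the whole module; names are accessed through the module namespace.
import math
root = math.sqrt(16)
print(root)                   # 4.0

# Import selected names directly into the current namespace.
from math import floor, pi
print(floor(pi))              # 3

# Import the module under an alias.
import math as m
print(m.ceil(2.1))            # 3

# All three forms refer to the same module object.
print(m is math)              # True
```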
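When you import a module, Python searches for it in a specific order. First, it checks whether the module has already been imported or is one of the built-in modules. Otherwise, it searches the directories listed in the sys.path variable in order. When you run a script, the first entry of sys.path is normally the directory containing that script, so modules located next to your script are found before installed packages. Understanding this order can help you manage your module imports effectively.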
When the Python interpreter reads a source file, it executes all the code found in that file. Consequently, when you import a module into a Python program, the contents of the module are executed. This behavior can lead to unintended consequences when users accidentally invoke scripts they didn't intend to. To address this issue, Python provides a built-in variable called __name__. This variable has a special role in module execution. When a module is imported, the __name__ variable is set to the name of that module. However, when a program is executed directly (as opposed to being imported as a module), the __name__ variable is set to the value "__main__". This concept forms the basis for the commonly seen if __name__ == "__main__": statement in Python code. It allows you to differentiate between code that should run only when the script is executed directly and code that should be available for import as a module. By using the if __name__ == "__main__": construct, you can conditionally execute specific code blocks only when the script is run directly. This is particularly useful for creating reusable modules that can be both imported and executed independently.
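A minimal sketch of this pattern, imagined as the contents of the file my_module.py, could look like this. The functions greet and main are hypothetical.

```python
def greet(name):
    """Reusable function that other programs can import."""
    return "Hello, " + name + "!"


def main():
    """Code that should only run when the file is executed directly."""
    print(greet("world"))


if __name__ == "__main__":
    # True when run as a script, False when the file is imported.
    main()
```

Importing this file with import my_module makes greet available without printing anything, whereas running the file directly prints the greeting.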
Object-Oriented Programming
Object-Oriented Programming (OOP) is a programming paradigm that uses objects and classes to structure code. In Python, everything is an object, and classes are the blueprints for creating objects. A class defines the attributes (data) and methods (functions) that objects of that class will have. OOP promotes the organization of code into reusable, self-contained units, making it easier to manage and maintain complex systems.
In Python, you define a class using the class keyword followed by the class name. The class definition typically contains attributes and methods. Attributes are variables that store data, while methods are functions that define the behaviors of the objects created from the class. To use a class, you create objects or instances of that class. This process is known as instantiation. Each object is a unique instance of the class, with its own set of attributes and the ability to call methods defined in the class.
In Python, the self parameter is a reference to the instance of the class. It is the first parameter in all instance methods and allows you to access and modify attributes and call other methods within the class. While you can name this parameter differently, it's a convention to use self.
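The following sketch defines a small, hypothetical Rectangle class and creates two independent instances. Attributes are set through an ordinary method here, which is enough to show how self works.

```python
class Rectangle:
    """A simple class with attributes and methods."""

    def set_size(self, width, height):
        # self refers to the instance the method was called on.
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

    def describe(self):
        # Other methods of the same object are called through self.
        return "Rectangle with area " + str(self.area())


# Instantiation: each object has its own attributes.
small = Rectangle()
small.set_size(2, 3)
large = Rectangle()
large.set_size(10, 20)

print(small.area())        # 6
print(large.area())        # 200
print(small.describe())    # Rectangle with area 6
```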
Operator Overloading
Besides creating methods on our own, we can change the way many familiar operators work by a technique known as operator overloading. Special methods, whose names begin and end with double underscores, can be defined to “intercept” many common operations, allowing you to redefine what operators such as + and *, or functions like print and len, will do when they're applied to the objects you create. These methods are commonly referred to as dunder (short for “Double Underscore”) or magic methods in Python. One of the most important of these methods is the __init__ method.
The __init__ method is a special method in Python classes, also known as the constructor. It gets called when you create a new instance of the class and allows you to initialize attributes.
The __str__ method is called by the print function and by str(). The __repr__ method is called when an object's name is typed in the interactive interpreter. The following table lists some of the more commonly used methods for overloading.
Method | Description |
---|---|
__str__ | Converts an object to a string for printing. |
__repr__ | Returns a string representation of the object for debugging. |
__len__ | Returns the length of an object. |
__add__ | Defines behavior for the addition operation. |
__sub__ | Defines behavior for the subtraction operation. |
__mul__ | Defines behavior for the multiplication operation. |
__truediv__ | Defines behavior for the true division operation. |
__floordiv__ | Defines behavior for the floor division operation. |
__mod__ | Defines behavior for the modulus operation. |
__eq__ | Defines behavior for the equality comparison. |
__ne__ | Defines behavior for the inequality comparison. |
__lt__ | Defines behavior for the less-than comparison. |
__le__ | Defines behavior for the less-than-or-equal comparison. |
__gt__ | Defines behavior for the greater-than comparison. |
__ge__ | Defines behavior for the greater-than-or-equal comparison. |
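As an illustration of several methods from the table, the sketch below defines a hypothetical two-dimensional Vector class. The choice of which operators to overload is made here for the example only.

```python
class Vector:
    def __init__(self, x, y):
        # Called when a new instance is created.
        self.x = x
        self.y = y

    def __repr__(self):
        return "Vector({!r}, {!r})".format(self.x, self.y)

    def __str__(self):
        return "({}, {})".format(self.x, self.y)

    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y)

    def __mul__(self, factor):
        return Vector(self.x * factor, self.y * factor)

    def __eq__(self, other):
        return isinstance(other, Vector) and (self.x, self.y) == (other.x, other.y)

    def __len__(self):
        return 2              # a 2D vector always has two components


v1 = Vector(1, 2)
v2 = Vector(3, 4)

print(v1 + v2)                # (4, 6)        uses __add__ and __str__
print(repr(v1 * 3))           # Vector(3, 6)  uses __mul__ and __repr__
print(v1 == Vector(1, 2))     # True          uses __eq__
print(len(v1))                # 2             uses __len__
```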
In addition, you can define what happens when your object is iterated over by means of the for statement by defining an __iter__ method that simply returns the object itself and providing a __next__ method which will be called for each iteration. Inside the __next__ method, you need to raise a StopIteration exception when no more items are available.
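A minimal sketch of an iterable object, using a made-up Countdown class:

```python
class Countdown:
    """Counts down from a start value to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # The object acts as its own iterator.
        return self

    def __next__(self):
        # Called once per iteration (spelled __next__ in Python 3).
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


for number in Countdown(3):
    print(number)             # prints 3, then 2, then 1

collected = list(Countdown(5))
print(collected)              # [5, 4, 3, 2, 1]
```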
Private Attributes
In many object-oriented languages, certain attributes can be declared as private, making it impossible for users of a class to directly view or modify their values. The designer of the class then provides methods to control the ways in which these attributes can be manipulated. While Python classes don't have true private attributes, if an attribute name begins with two underscores (and does not end with two underscores), the Python interpreter internally prefixes the name with an underscore and the class name, a mechanism known as name mangling. References to the original name from outside the class will therefore not be resolved. The attribute is still accessible under its mangled name, but this convention indicates that an attribute should not be manipulated directly and gives you more control over the way users of your class will manipulate those attributes.
In Python, there is also the concept of "protected" attributes, which is not enforced by the language itself but is a strong convention and best practice within the Python community. While an attribute with a single leading underscore, like _protected_var, doesn't have any special meaning to the Python interpreter, it carries a significant message to developers. It signals that this attribute is intended for internal use within a class and should not be accessed or modified directly from outside the class. Think of "protected" attributes as a way to communicate within the Python code itself. It indicates that, while technically accessible, this attribute is part of the class's internal workings and not meant to be part of the class's public interface. This convention helps improve code organization, maintainability, and readability. By encouraging developers to use getter and setter methods to interact with "protected" attributes, it promotes the principles of encapsulation and information hiding. It separates the internal details of a class from the outside world, making it easier to understand and maintain the code.
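Both conventions can be sketched with a hypothetical Account class. The attribute and method names are invented for this example.

```python
class Account:
    def __init__(self, owner, balance):
        self.owner = owner            # public attribute
        self._currency = "EUR"        # "protected" by convention only
        self.__balance = balance      # stored internally as _Account__balance

    def get_balance(self):
        return self.__balance

    def deposit(self, amount):
        # The class controls how the balance may change.
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.__balance += amount


account = Account("Alice", 100)
account.deposit(50)
print(account.get_balance())          # 150

# The double-underscore name is not resolved from outside the class ...
try:
    print(account.__balance)
except AttributeError as error:
    print("AttributeError:", error)

# ... but the attribute is still reachable under its modified name.
print(account._Account__balance)      # 150

# A single underscore is a request to developers, not a restriction.
print(account._currency)              # EUR
```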
Class and Instance Variables
Class variables are shared by all instances of a class and are defined within the class but outside any method. Instance variables are specific to each instance of a class and are typically defined within the __init__ method.
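A short sketch with a hypothetical Student class shows the difference:

```python
class Student:
    university = "Example University"    # class variable, shared by all instances

    def __init__(self, name):
        self.name = name                 # instance variable, one per object


alice = Student("Alice")
bob = Student("Bob")

print(alice.name, bob.name)              # Alice Bob
print(alice.university == bob.university)   # True

# Changing the class variable is visible through every instance.
Student.university = "Another University"
print(alice.university)                  # Another University
print(bob.university)                    # Another University
```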
Class Methods and Static Methods
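Class methods are methods that are bound to the class and not the instance. They are created with the @classmethod decorator, receive the class as their first argument (conventionally named cls), and can access and modify class-level attributes. Static methods, created with the @staticmethod decorator, receive neither the instance nor the class automatically, so they behave like plain functions that are grouped with the class.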
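The sketch below uses the built-in decorators classmethod and staticmethod on a hypothetical Circle class. The method names are chosen for illustration.

```python
class Circle:
    instances = 0                     # class-level attribute

    def __init__(self, radius):
        self.radius = radius
        Circle.instances += 1

    @classmethod
    def unit(cls):
        # Receives the class as first argument and can create instances.
        return cls(1)

    @classmethod
    def count(cls):
        return cls.instances

    @staticmethod
    def is_valid_radius(value):
        # Receives neither the instance nor the class.
        return value > 0


circle = Circle(2)
unit_circle = Circle.unit()

print(unit_circle.radius)             # 1
print(Circle.count())                 # 2
print(Circle.is_valid_radius(-1))     # False
```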
Inheritance
Inheritance is a fundamental concept in OOP that allows you to create new classes based on existing ones. The new class (subclass or derived class) inherits attributes and methods from the parent class (base class or superclass). You can also override or extend inherited methods and attributes in the subclass.
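A minimal sketch of inheritance follows. The class names Animal, Cat, and Dog and the speak method are taken from the text of this introduction; the method bodies, the describe method, and the breed attribute are assumptions made for illustration.

```python
class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return self.name + " makes a sound"

    def describe(self):
        return "This is " + self.name


class Cat(Animal):
    def speak(self):                       # overrides Animal.speak
        return self.name + " says Meow"


class Dog(Animal):
    def __init__(self, name, breed):
        super().__init__(name)             # reuse the parent initializer
        self.breed = breed                 # extend it with a new attribute

    def speak(self):                       # overrides Animal.speak
        return self.name + " says Woof"


animals = [Animal("Generic"), Cat("Tom"), Dog("Rex", "Beagle")]
sounds = [animal.speak() for animal in animals]
print(sounds)
# ['Generic makes a sound', 'Tom says Meow', 'Rex says Woof']

# describe() is inherited unchanged by both subclasses.
print(animals[2].describe())               # This is Rex
```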
Encapsulation
Encapsulation is the practice of bundling data (attributes) and the methods (functions) that operate on that data into a single unit (a class). It helps hide the internal details of how an object works, providing an interface for interacting with the object while protecting its integrity.
In Python, we have an elegant and convenient mechanism for accessing object attributes known as properties. Properties serve as a means to control how we access these attributes, offering a more refined alternative to traditional getter methods. They allow us to define getter methods that look and feel like regular attribute access. When using properties, you can enforce rules, execute calculations, or add custom logic when getting attribute values. We can create properties using the @property decorator. The "@" symbol indicates that a method has been decorated. For now, let's not concern ourselves too much with what exactly a decorator is. We will learn more about decorators in the last part. However, if you're curious, feel free to read ahead in the section on decorators. By decorating a method with @property, we mark it as a getter method for a specific attribute. This means that when we access the attribute, the decorated method is automatically invoked behind the scenes. Properties provide a powerful way for controlling attribute access in Python. They enhance code clarity, encapsulation, and maintainability.
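As a small sketch, the hypothetical Temperature class below exposes a stored value and a computed value through properties:

```python
class Temperature:
    def __init__(self, celsius):
        self._celsius = celsius        # internal attribute

    @property
    def celsius(self):
        # Getter: invoked on attribute access, without parentheses.
        return self._celsius

    @property
    def fahrenheit(self):
        # A computed value that looks like a regular attribute.
        return self._celsius * 9 / 5 + 32


temperature = Temperature(25)
print(temperature.celsius)             # 25
print(temperature.fahrenheit)          # 77.0

# Without a setter, the property cannot be assigned to.
try:
    temperature.celsius = 30
except AttributeError as error:
    print("AttributeError:", error)
```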
Polymorphism
Polymorphism is the ability of different objects to respond to the same method or attribute in a way that is appropriate for their individual types. Python supports polymorphism through method overriding. The example provided for inheritance also demonstrates polymorphism since the subclasses Cat and Dog override the speak method of the Animal base class.
Debugging, Error Handling and Unit Testing
Debugging
Debugging is the process of identifying and fixing errors or bugs in your code. Python provides several tools and techniques to help you diagnose and resolve issues in your programs. Effective debugging is an essential skill for every programmer, as it can save time and frustration during the development process.
Errors and Exceptions
A Python program terminates as soon as it encounters an error that is not handled. In Python, an error can be a syntax error or an exception. While for simple programs it might be sufficient to just fail, we often need proper error handling to prevent a single error from crashing an entire program or application.
A syntax error occurs when the parser detects an incorrect statement. Let's take a look at the following example:
In this example, there was one bracket too many. Remove it and run the code again. This time, there is an exception. This type of error occurs when syntactically correct Python code results in an error. Python provides details about the type of exception encountered. In this case, it was a ZeroDivisionError. Python provides various built-in exceptions and also offers the possibility to create self-defined exceptions. To throw an exception, you can use the raise statement.
While exceptions are great, we still need to handle them because just throwing exceptions still results in programs terminating. To catch and handle exceptions, try and except are used in Python. The statements inside the try block are executed, and if an exception occurs, the code inside the matching except clause is executed. Catching Exception hides all errors, even those which are completely unexpected. For the same reason, you should avoid bare except clauses in your Python programs. Instead, refer to the specific exception classes you want to catch and handle. Using the else clause, you can instruct a program to execute a certain block of code only in the absence of exceptions. Everything in the finally clause will be executed regardless of whether an exception occurred somewhere in the try or else clauses.
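The interplay of raise, try, except, else, and finally can be sketched as follows. The exception class NegativeValueError and the function divide are invented for this example; the list steps merely records which clauses were executed.

```python
class NegativeValueError(Exception):
    """Hypothetical self-defined exception for this example."""


def divide(a, b):
    """Divide a by b and record which clauses were executed."""
    steps = []
    try:
        if a < 0 or b < 0:
            raise NegativeValueError("negative input")
        result = a / b
    except ZeroDivisionError:
        steps.append("except: division by zero")
        result = None
    except NegativeValueError as error:
        steps.append("except: " + str(error))
        result = None
    else:
        steps.append("else: no exception")
    finally:
        steps.append("finally: always runs")
    return result, steps


print(divide(6, 3))
# (2.0, ['else: no exception', 'finally: always runs'])
print(divide(1, 0))
# (None, ['except: division by zero', 'finally: always runs'])
print(divide(-1, 2))
# (None, ['except: negative input', 'finally: always runs'])
```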
Instead of waiting until a program crashes, you can also use assertions to prevent errors early on. Assertions are a technique of defensive programming. By asserting that a certain condition is met, the program only continues if the condition turns out to be True. Otherwise, the program throws an AssertionError exception.
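A minimal sketch of an assertion guarding a hypothetical average function:

```python
def average(values):
    # Stop early with a clear message instead of failing later.
    assert len(values) > 0, "values must not be empty"
    return sum(values) / len(values)


print(average([2, 4, 6]))     # 4.0

try:
    average([])
except AssertionError as error:
    print("AssertionError:", error)   # AssertionError: values must not be empty
```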
There is a third type of error, namely logical errors, also known as bugs. Logical errors occur when your code does not produce the expected output or behaves incorrectly. These errors can be challenging to identify because Python does not raise an error. Instead, the program executes, but the results are incorrect.
Python offers several debugging tools and techniques to help you identify and fix errors. The simplest debugging technique is to add print statements in your code to display the values of variables and the flow of execution. This can help you understand what your code is doing and identify where issues may arise. Alternatively, you can utilize logging as an effective method to gather program information, not only in the event of an error. Logging will be discussed when talking about the Python standard library. Python also includes an interactive debugger called pdb (Python Debugger). You can insert a breakpoint in your code by calling pdb.set_trace() after importing pdb, or you can run your whole script under the debugger with the -m pdb option. Either way, you get an interactive debugging session where you can inspect variables, step through code, and evaluate expressions. Many Python IDEs, such as PyCharm or VS Code, provide built-in debugging features. These tools offer a user-friendly interface for setting breakpoints, inspecting variables, and navigating through your code during debugging.
Common Debugging Practices
When debugging in Python, consider the following best practices:
- Pay attention to error messages and traceback information. Python's error messages often provide clues about the cause of the problem and the location in your code where it occurred.
- Begin by isolating the portion of code where the error occurs. Comment out unrelated code to narrow down the issue.
- Ensure you can reproduce the error consistently. Understanding the conditions that trigger the error is crucial for debugging.
- Insert
print
or logging statements strategically to display the values of variables and intermediate results. This helps you track the flow of execution and identify unexpected values.
- Verify that the data types of variables match your expectations. Python is dynamically typed, so data type errors can occur if you assume the wrong type.
- Check the official Python documentation and the documentation of any libraries you are using to ensure you are using functions and methods correctly.
Fixing Bugs
Once you identify a bug, you may follow these steps to fix it:
- Understand the root cause of the problem by reviewing the code and considering the inputs and logic.
- Modify the code to correct the issue. Be cautious not to introduce new errors while fixing the current one.
- Test your code to ensure that the bug is resolved and that the fix does not cause side effects in other parts of the program. It is considered a best practice when you are debugging your code to first write a new test pinpointing the bug.
- If you're working in a team or on a larger project, document your changes to help others understand the modifications and the reasons behind them.
Unit Testing
Testing your code is very important. It not only allows you to ensure that your program still works correctly after you have made changes to your code but also helps to isolate errors. Unit testing is a method for testing software that looks at the smallest testable pieces of code, called units, which are tested for correct operation. By doing unit testing, we can verify that each part of the code, including helper functions that may not be exposed to the user, works correctly and as intended.
In Python, there are two popular unit testing frameworks: the built-in unittest
module (often referred to as the PyUnit framework) and the pytest framework.
The unittest
module is included in the Python standard library. Creating test cases is accomplished by subclassing unittest.TestCase
. Here's a simple example:
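The following is a minimal sketch of such a test class. The function add is a hypothetical stand-in for the code you want to test.

```python
import unittest


def add(a, b):
    """Hypothetical function under test."""
    return a + b


class TestAdd(unittest.TestCase):
    def test_add_integers(self):
        self.assertEqual(add(2, 3), 5)

    def test_add_strings(self):
        self.assertEqual(add("py", "thon"), "python")

    def test_add_mixed_types_raises(self):
        # Adding an int and a str is not supported.
        with self.assertRaises(TypeError):
            add(1, "a")
```

If you save this in a file such as test_add.py (the file name is an assumption), you can run the tests with python -m unittest test_add.py. Many test files instead end with a call to unittest.main() guarded by if __name__ == "__main__".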
In larger projects, you may have multiple test classes, each containing several test methods. To organize and run these tests efficiently, you can create test cases and test suites. A test case is a collection of related test methods within a test class. It represents a specific aspect of functionality or behavior to be tested. A test suite is a collection of test cases. It allows you to group related tests together. Python's unittest
framework provides the TestLoader
and TestSuite
classes to help create and organize test suites.
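As a rough sketch, two small test classes can be combined into one suite as shown below. The class names and the tested behavior are chosen purely for illustration.

```python
import unittest


class TestStringMethods(unittest.TestCase):
    def test_upper(self):
        self.assertEqual("python".upper(), "PYTHON")

    def test_split(self):
        self.assertEqual("a,b".split(","), ["a", "b"])


class TestListMethods(unittest.TestCase):
    def test_append(self):
        numbers = [1, 2]
        numbers.append(3)
        self.assertEqual(numbers, [1, 2, 3])


# Collect the tests of both classes in one suite.
loader = unittest.TestLoader()
suite = unittest.TestSuite()
suite.addTests(loader.loadTestsFromTestCase(TestStringMethods))
suite.addTests(loader.loadTestsFromTestCase(TestListMethods))

# Run the suite; verbosity=2 lists each test method by name.
runner = unittest.TextTestRunner(verbosity=2)
result = runner.run(suite)
print("All tests passed:", result.wasSuccessful())
```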
Test fixtures are preconditions and postconditions that are established before and after running a test. In Python, you can use setUp
and tearDown
methods within your test class to set up and clean up resources required for testing. These methods run before and after each test method, ensuring a consistent test environment. In some cases, you may want to isolate the code under test by replacing external dependencies, such as databases or APIs, with mock objects or test doubles. Python's unittest.mock
module provides tools for creating and using mock objects in your tests. The unittest framework also supports test discovery, which allows you to automatically discover and run tests within your project.
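Fixtures and mocks are often used together. In the sketch below, WeatherReport and its client are hypothetical: the client stands for an external API that we do not want to call during a test, so setUp replaces it with a Mock.

```python
import unittest
from unittest.mock import Mock


class WeatherReport:
    """Hypothetical class that depends on an external API client."""

    def __init__(self, client):
        self.client = client

    def summary(self, city):
        temperature = self.client.get_temperature(city)
        return f"{city}: {temperature} degrees"


class TestWeatherReport(unittest.TestCase):
    def setUp(self):
        # Runs before each test method: use a mock instead of a real client.
        self.client = Mock()
        self.client.get_temperature.return_value = 21
        self.report = WeatherReport(self.client)

    def tearDown(self):
        # Runs after each test method: clean up resources here if needed.
        self.client.reset_mock()

    def test_summary_uses_client(self):
        self.assertEqual(self.report.summary("Berlin"), "Berlin: 21 degrees")
        self.client.get_temperature.assert_called_once_with("Berlin")
```

For test discovery, running python -m unittest discover in the project directory collects test files automatically (by default those matching test*.py).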
Pytest is an alternative to Python's standard unittest module, which can be installed using pip
. Despite being a fully-featured and extensible test tool, it boasts a simple syntax. Creating a test suite is as easy as writing a module with a couple of functions. Pytest offers a wide range of features for test discovery, fixtures, parameterized testing, and plugins.
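A pytest test module can be as simple as the following sketch. The function is_palindrome is a hypothetical example of code under test; note that the tests are plain functions using the assert statement, and pytest itself does not even need to be imported.

```python
def is_palindrome(text):
    """Hypothetical function under test."""
    normalized = text.lower().replace(" ", "")
    return normalized == normalized[::-1]


# pytest collects functions whose names start with "test_".
def test_simple_palindrome():
    assert is_palindrome("level")


def test_palindrome_ignores_case_and_spaces():
    assert is_palindrome("Never odd or even")


def test_non_palindrome():
    assert not is_palindrome("python")
```

Assuming the code is saved as test_palindrome.py (an illustrative file name) and pytest has been installed with pip install pytest, running pytest in the same directory discovers and executes the three tests.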
In real-world software development, unit tests are often integrated into a Continuous Integration (CI) pipeline. CI systems automatically run unit tests whenever code changes are pushed to version control repositories. This practice helps identify and address issues early in the development process. Tools like coverage.py
can help you determine which parts of your code are covered by your unit tests. Achieving high test coverage is a common goal in software testing to ensure that all critical paths are tested.
In summary, debugging, error handling, and unit testing are crucial aspects of writing robust and reliable Python code. By mastering these practices, you can create software that is more stable, maintainable, and resistant to bugs.
Standard Library
Python's standard library is exceptionally comprehensive, offering a wide array of tools and functionalities. It consists of built-in modules, some of which are implemented in C, granting Python programmers access to system-level operations like file input/output, which would otherwise be beyond their reach. Additionally, Python-written modules in the standard library provide standardized solutions to a multitude of common programming challenges. Many of these modules are intentionally designed to foster and improve the portability of Python programs, abstracting away platform-specific details and offering platform-neutral APIs.
In addition to the standard library, there exists an active collection of hundreds of thousands of software components. These range from individual programs and modules to fully-fledged packages and comprehensive application development frameworks. This extensive repository of third-party packages can be found, for example, on the Python Package Index (PyPI).
In the following sections, we will take a look at some of the most commonly used modules from Python's standard library, equipping you with essential tools and knowledge for your further journey in Python programming.
Argparse
argparse
is a Python module for parsing command-line arguments and options in a structured and user-friendly manner. It simplifies the process of taking input from the command line and is particularly useful when creating Python scripts or programs that require configuration or customization via the command line.
Consider a script main.py that expects a positional input_file
argument and an optional --output
argument. When running this script from the command line, you can specify both as follows:
python main.py input.txt --output output.txt
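A script that accepts the command line shown above might look like the following sketch. The description, help texts, and the default value result.txt are assumptions made for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Process an input file.")
    # Positional argument: must always be given.
    parser.add_argument("input_file", help="path to the input file")
    # Optional argument: falls back to the default if omitted.
    parser.add_argument("--output", default="result.txt", help="path to the output file")
    return parser


parser = build_parser()

# In a real script you would call parser.parse_args() without arguments,
# which reads sys.argv. Here the list is passed explicitly for illustration.
args = parser.parse_args(["input.txt", "--output", "output.txt"])
print(args.input_file)  # input.txt
print(args.output)      # output.txt
```

As a bonus, argparse generates a usage message automatically, which is shown when the script is called with -h or --help.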
Copy
The copy
module in Python's standard library provides tools for creating copies of objects, especially when dealing with mutable objects like lists and dictionaries. Copying allows us to maintain the integrity of data and prevent unintended side effects. It creates independent copies of data or objects, so that changes made to one copy do not affect the original data or other copies. This plays a vital role in use cases such as parallel processing, undo and redo operations, or state restoration. The copy
module offers two main functions for copying objects: copy()
and deepcopy()
.
The copy()
function creates a shallow copy of an object. Shallow copying means that it duplicates the top-level structure of the object but does not create copies of the nested objects within it. As a result, if the original object contains other objects (e.g., lists within a list), the copied object will still share references to those nested objects. It is used when you want to duplicate an object's structure without recursively copying all nested objects.
The deepcopy()
function creates a deep copy of an object. Deep copying means that it recursively duplicates the entire structure of the object, including all nested objects. As a result, the copied object is completely independent of the original object. It is used when you want to create a new object that is a true, independent copy of the original object.
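The difference between the two becomes visible as soon as a nested object is modified. The following sketch uses a small nested list for illustration.

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)
deep = copy.deepcopy(original)

# Modify a nested list through the original.
original[0].append(99)

print(shallow[0])  # [1, 2, 99]  (nested list is shared with the original)
print(deep[0])     # [1, 2]      (nested list was copied recursively)

# The top-level list itself is independent in both cases.
original.append([5, 6])
print(len(shallow), len(deep))  # 2 2
```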
File I/O
Python provides built-in support for file input and output operations. The primary tool for this purpose is the built-in function open()
, which allows you to open files for reading, writing, or both. You can also use the with
statement to ensure that files are properly closed after usage.
Opening a File
The open()
function is typically called with two arguments: the file name (including the path) and the mode in which you want to open the file. Common modes include "r" for read mode (the default), "w" for write mode, and "a" for append mode.
Closing a File
It's important to close a file after you're done with it to release system resources and ensure that changes are saved. You can use the close()
method or work with files using the with
statement, which automatically closes the file when you exit the block.
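Both ways of opening and closing a file are sketched below. The file name example.txt is hypothetical, and the file is created in a temporary directory so that the example does not touch your own files.

```python
import os
import tempfile

# Hypothetical file location inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Variant 1: open and close the file explicitly.
file = open(path, "w")
try:
    file.write("Hello, file!\n")
finally:
    file.close()  # runs even if write() fails

# Variant 2: the with statement closes the file automatically.
with open(path, "r") as file:
    content = file.read()

print(content)
print(file.closed)  # True
```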
Reading from a File
Once you've opened a file for reading, you can use various methods to read its content.
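The most common reading methods are compared in the following sketch, which first creates a small temporary file (with a hypothetical name) to read from.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with open(path, "w") as file:
    file.write("first\nsecond\nthird\n")

with open(path) as file:
    whole_text = file.read()        # entire content as one string

with open(path) as file:
    first_line = file.readline()    # a single line, including "\n"
    remaining = file.readlines()    # list of the remaining lines

with open(path) as file:
    # Iterating over the file object yields one line at a time.
    stripped = [line.strip() for line in file]

print(repr(first_line))  # 'first\n'
print(remaining)         # ['second\n', 'third\n']
print(stripped)          # ['first', 'second', 'third']
```

Iterating over the file object is usually the preferred option for large files because it does not load the whole content into memory at once.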
Writing to a File
When a file is opened in write or append mode, you can use methods like write()
to add content to the file.
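The difference between write mode and append mode can be seen in this short sketch, again using a temporary file with a hypothetical name.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")

# "w" creates the file or truncates an existing one.
with open(path, "w") as file:
    file.write("line 1\n")
    file.writelines(["line 2\n", "line 3\n"])

# "a" keeps the existing content and appends at the end.
with open(path, "a") as file:
    file.write("line 4\n")

with open(path) as file:
    lines = file.read().splitlines()

print(lines)  # ['line 1', 'line 2', 'line 3', 'line 4']
```

Note that write() does not add a newline on its own, so line endings have to be included explicitly.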
Exception Handling
When working with files, it's a good practice to handle exceptions, especially when opening, reading, or writing files, as various errors can occur (e.g., the file might not exist, you may not have permission to read or write it).
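One possible way to handle the errors mentioned above is sketched here. The helper read_text and the decision to return None on failure are assumptions for illustration; depending on your program, re-raising or logging the error may be more appropriate.

```python
import os
import tempfile


def read_text(path):
    """Return the file content, or None if the file cannot be read."""
    try:
        with open(path) as file:
            return file.read()
    except FileNotFoundError:
        print(f"File not found: {path}")
    except PermissionError:
        print(f"No permission to read: {path}")
    return None


# A path that is guaranteed not to exist (fresh temporary directory).
missing_path = os.path.join(tempfile.mkdtemp(), "missing.txt")
print(read_text(missing_path))  # None
```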
Logging
Logging is an essential part of software development that allows you to record messages, warnings, errors, and other information about the execution of your Python programs. The Python standard library provides a built-in module called logging
for handling logging operations.
To use the logging
module, you need to import it at the beginning of your script. Before you start logging messages, you can configure the logging system according to your needs. This includes specifying the logging level, setting a format for log messages, and defining where the log messages should be directed. Once you've configured logging, you can start logging messages using various log levels. By default, log messages of "WARNING" level and above are displayed on the console. You can control the behavior of log messages, such as where they are displayed and at what level, by configuring loggers, handlers, and formatters. For example, you can send log messages to multiple destinations, filter them, or format them differently.
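The following sketch shows how a logger, a handler, and a formatter fit together. The logger name shop and the log messages are hypothetical.

```python
import logging

# A named logger; "shop" is a hypothetical name.
logger = logging.getLogger("shop")
logger.setLevel(logging.DEBUG)  # the logger itself passes everything on

# Handler: decides where messages go (here: the console via stderr).
handler = logging.StreamHandler()
handler.setLevel(logging.INFO)  # the handler drops DEBUG messages
# Formatter: decides what each message looks like.
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
logger.addHandler(handler)

logger.debug("Cart contents loaded")   # filtered out by the handler
logger.info("Order received")          # INFO:shop:Order received
logger.warning("Stock is running low")
logger.error("Payment failed")
```

For small scripts, a single call to logging.basicConfig(level=logging.INFO) is often enough instead of configuring handlers by hand.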
Math
Python's math
library is a built-in module that provides a wide range of mathematical functions and constants for performing various mathematical operations. You can use the math
module to work with mathematical calculations and functions in your Python programs. Let's take a look at some examples.
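A few commonly used functions and constants are collected in this short sketch.

```python
import math

print(math.sqrt(16))       # 4.0
print(math.floor(3.7))     # 3
print(math.ceil(3.2))      # 4
print(math.factorial(5))   # 120
print(math.gcd(12, 18))    # 6
print(math.hypot(3, 4))    # 5.0
print(round(math.pi, 5))   # 3.14159

# Floating-point numbers should be compared with a tolerance.
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True
```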
OS
The os
module in Python is a built-in library that provides functions for interacting with the operating system, allowing you to perform various tasks related to file and directory manipulation, working with file paths in a platform-independent way, environment variables, process management, and more.
When working with the os
module, it's essential to handle exceptions, especially when performing file and directory operations. Various errors can occur, such as files not existing, insufficient permissions, or incorrect paths.
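Some typical operations are sketched below. All directory, file, and environment variable names are hypothetical, and everything happens inside a temporary directory.

```python
import os
import tempfile

base_dir = tempfile.mkdtemp()  # temporary directory for this demo

# Build paths in a platform-independent way.
data_dir = os.path.join(base_dir, "data")
os.makedirs(data_dir, exist_ok=True)

file_path = os.path.join(data_dir, "report.txt")
with open(file_path, "w") as file:
    file.write("demo")

print(os.listdir(data_dir))         # ['report.txt']
print(os.path.exists(file_path))    # True
print(os.path.basename(file_path))  # report.txt

# Environment variables; the variable name and fallback are hypothetical.
print(os.environ.get("MY_APP_MODE", "development"))

# Handle errors from file operations explicitly.
os.remove(file_path)
try:
    os.remove(file_path)  # the file is already gone
except FileNotFoundError:
    print("Nothing to remove")
```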
Pathlib
The pathlib
module in Python is a powerful and object-oriented library for working with file system paths and files. Introduced in Python 3.4, pathlib
offers a more intuitive and platform-independent way to manipulate paths and perform file and directory operations. The central concept of pathlib
is the Path
object. You create a Path
object by passing a path string to its constructor.
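The following sketch shows a few operations on Path objects, using a temporary directory and hypothetical file names.

```python
import tempfile
from pathlib import Path

base_dir = Path(tempfile.mkdtemp())

# The / operator joins path components.
file_path = base_dir / "data" / "report.txt"
file_path.parent.mkdir(parents=True, exist_ok=True)

file_path.write_text("demo")
print(file_path.name)         # report.txt
print(file_path.suffix)       # .txt
print(file_path.stem)         # report
print(file_path.exists())     # True
print(file_path.read_text())  # demo

# List all .txt files in the directory.
print([path.name for path in file_path.parent.glob("*.txt")])  # ['report.txt']
```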
Regex
The re
module in Python allows you to work with regular expressions, often abbreviated as "regex". Regular expressions are powerful patterns that help you search for and manipulate text data.
Let's look at an example to understand why this is useful. Assume we want to validate an email address.
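A first attempt could look like the sketch below. The pattern is deliberately simplified and is an assumption for illustration; it does not cover everything the email standards allow.

```python
import re

# Simplified pattern, not a full implementation of the email standard.
EMAIL_PATTERN = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+$"


def is_valid_email(address):
    return re.search(EMAIL_PATTERN, address) is not None


print(is_valid_email("jane.doe@example.com"))  # True
print(is_valid_email("jane.doe@example"))      # False
print(is_valid_email("not an email"))          # False
```

Expressing the same check with plain string methods would require considerably more code.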
Regular expressions consist of patterns that describe specific sequences of characters. Some common elements used in regular expressions are:
- Literal Characters: Characters like letters and digits match themselves. For example, the pattern "abc" matches the string "abc" exactly.
- Dot: The dot "." matches any single character except a newline. For example, the pattern "a.c" matches "abc", "adc", and so on.
- Character Classes: Square brackets
[...]
define a character class, and the pattern matches any single character that is in the class. For example,
[aeiou]
matches any lowercase vowel, and
[0-9]
matches any digit.
- Caret and Dollar: The caret
^
matches the start of a line or string, and the dollar
$
matches the end of a line or string. For example,
^abc
matches if "abc" is at the start of a line.
- Quantifiers: Quantifiers modify the number of times a pattern is matched.
  - *: Matches zero or more occurrences.
  - +: Matches one or more occurrences.
  - ?: Matches zero or one occurrence.
  - {n}: Matches exactly n occurrences.
  - {n,m}: Matches between n and m occurrences (written without a space after the comma).
The re.search(pattern, string)
function is used to search for the first occurrence of a pattern in a string. It returns a match object if a match is found or None
if no match is found. You can use the .group()
method of the match object to extract the matched text.
The re.findall(pattern, string)
function returns a list of all non-overlapping matches of a pattern in a string. It returns an empty list if no matches are found.
The re.sub(pattern, replacement, string)
function replaces all occurrences of a pattern in a string with the specified replacement.
You can also compile regular expressions using re.compile(pattern)
to improve performance if you plan to use the same pattern multiple times.
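The four functions can be tried out with the following sketch. The sample text and the patterns are made up for illustration.

```python
import re

text = "Order 66 shipped on 2024-05-17, order 7 on 2024-06-01."
date_pattern = r"\d{4}-\d{2}-\d{2}"

# search: first occurrence only.
match = re.search(date_pattern, text)
if match:
    print(match.group())  # 2024-05-17

# findall: all non-overlapping matches.
print(re.findall(date_pattern, text))  # ['2024-05-17', '2024-06-01']

# sub: replace every match.
masked = re.sub(date_pattern, "<date>", text)
print(masked)  # Order 66 shipped on <date>, order 7 on <date>.

# compile: build the pattern once, reuse it many times.
order_pattern = re.compile(r"[Oo]rder (\d+)")
print(order_pattern.findall(text))  # ['66', '7']
```

If a pattern contains a group in parentheses, findall returns the content of the group rather than the whole match, which is why only the order numbers appear in the last output.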
Let's go back to our previous email example. Thankfully, regular expressions come with built-in shorthands for common patterns. Notice that for ASCII text, \w
is the same as [a-zA-Z0-9_]
(by default, Python 3 also lets it match letters and digits from other scripts; the re.ASCII flag restricts it to this set). You may want to test different emails for validity and figure out how and why it works. There are also great (visual) regex testers like Debuggex which can help you.
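A shorthand-based version of the email check might look like this sketch. As before, the pattern is a simplified assumption and not a complete validator; the re.ASCII flag keeps \w limited to ASCII letters, digits, and the underscore.

```python
import re

# \w stands in for an explicit class like [a-zA-Z0-9_].
EMAIL_PATTERN = re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$", re.ASCII)

addresses = ["jane.doe@example.com", "jane@example", "@example.com"]
results = {address: bool(EMAIL_PATTERN.search(address)) for address in addresses}

for address, is_valid in results.items():
    print(address, is_valid)
# jane.doe@example.com True
# jane@example False
# @example.com False
```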
Sys
The sys
module in Python is a built-in module that provides access to system-specific parameters and functions. It is often used to interact with the Python runtime environment and system-related functionality.
The sys.exit()
function is used to exit a Python script with an optional exit status code. It allows you to terminate the script programmatically, and you can specify an exit status to indicate the success or failure of the script (0 for success, non-zero for failure):
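A common pattern is to let a main function return the status code and pass it to sys.exit(). In the sketch below, main and its behavior are hypothetical; the SystemExit exception raised by sys.exit() is caught only to show the status without actually terminating the interpreter.

```python
import sys


def main(arguments):
    """Return an exit status: 0 for success, 1 for failure."""
    if not arguments:
        print("error: no arguments given", file=sys.stderr)
        return 1
    print("processing", arguments)
    return 0


# In a script you would typically end with:
#     sys.exit(main(sys.argv[1:]))
# sys.exit() raises SystemExit, which can be observed like any exception:
try:
    sys.exit(main([]))
except SystemExit as error:
    print("exit status:", error.code)  # exit status: 1
```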
The sys
module also provides access to the standard input, standard output, and standard error streams:
- sys.stdin: Represents the standard input stream. You can use it to read input from the user or a file.
- sys.stdout: Represents the standard output stream. You can use it to print output to the console or redirect it to a file.
- sys.stderr: Represents the standard error stream. You can use it to print error messages to the console or redirect them to a file.
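The streams can be used through print() or directly, as in the following sketch. Redirecting the output into an in-memory buffer with contextlib.redirect_stdout is just one possible way to demonstrate redirection.

```python
import contextlib
import io
import sys

print("regular output")                         # goes to sys.stdout
print("something went wrong", file=sys.stderr)  # goes to sys.stderr
sys.stdout.write("written without print\n")

# Temporarily redirect standard output into an in-memory buffer.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    print("captured")

print(repr(buffer.getvalue()))  # 'captured\n'

# Reading from sys.stdin (commented out so the example does not block):
# for line in sys.stdin:
#     print(line.strip())
```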
The sys
module provides information about the Python runtime environment and the underlying operating system through various attributes:
- sys.version: The Python version as a string.
- sys.version_info: A tuple containing the major, minor, micro, release level, and serial version components.
- sys.platform: A string indicating the platform where Python is running (e.g., "win32" for Windows, "linux" for Linux).
The sys.path
attribute is a list of directories that Python searches when importing modules. You can modify this list to add custom directories to the module search path: sys.path.append("/path/to/my_module_directory")
.
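The runtime attributes and the module search path can be inspected as in this sketch. The version check against (3, 8) is an arbitrary example threshold.

```python
import sys

print(sys.version)   # full version string, differs per installation
print(sys.platform)  # e.g. "linux", "win32" or "darwin"

# version_info supports comparisons and named fields.
if sys.version_info >= (3, 8):
    print("Running Python", sys.version_info.major, sys.version_info.minor)

# Add a custom directory to the module search path.
custom_directory = "/path/to/my_module_directory"  # placeholder path
if custom_directory not in sys.path:
    sys.path.append(custom_directory)

print(custom_directory in sys.path)  # True
```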
Beyond The Basics
Congratulations! You've journeyed through the Python basics, and now it's time to take your skills to the next level. In this section, we'll dive deeper into Python's best practices and explore advanced topics that will empower you to write cleaner, more efficient code. While we've already touched on best practices throughout this course, it's valuable to address some common pitfalls often encountered by Python beginners. These practices not only make your code cleaner but also set the stage for becoming a proficient Python programmer. Additionally, we'll take a look at some fundamental programming concepts, which are applicable across various programming languages, enriching your overall programming knowledge.
Type Annotations
Type annotations allow you to specify the data types of variables, function parameters, and return values. Type hints for function parameters and return values were standardized in Python 3.5 (PEP 484), and the syntax for annotating variables followed in Python 3.6 (PEP 526). These annotations provide clarity about the expected types of values in your code and help catch type-related errors early. Recall that Python is a dynamically typed language, which means that the types of variables are determined at runtime. Unlike in statically typed languages, type annotations are not enforced at runtime by the Python interpreter. However, you can use static type checkers like mypy to analyze your code and identify type-related errors before it is executed. This practice not only helps catch potential issues early but also improves readability, because the annotations document the intended types explicitly. Let's take a look at some examples.
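The sketch below annotates a hypothetical function and two variables. It assumes Python 3.9 or newer because of the built-in generic list[float].

```python
def greet(name: str, times: int = 1) -> str:
    """Return a greeting repeated the given number of times."""
    return ", ".join([f"Hello {name}"] * times)


# Variable annotations.
counter: int = 0
prices: list[float] = [9.99, 19.5]  # built-in generics need Python 3.9+

print(greet("Ada", 2))  # Hello Ada, Hello Ada

# Annotations are not enforced at runtime: this call still runs,
# but a static checker such as mypy would report it as an error.
print(greet(123))  # Hello 123
```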
Python's typing
module provides a rich set of type annotations for common data types and allows you to define your own custom types. It serves as a powerful tool for enhancing code clarity and robustness. For more details, examples, and in-depth information on Python's typing system, including relevant PEPs, you can refer to the typing
module's official documentation.
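A few building blocks from the typing module are sketched here. The alias Number and both functions are hypothetical, and the example again assumes Python 3.9 or newer.

```python
from typing import Callable, Optional, Union

Number = Union[int, float]  # a custom type alias


def find_index(items: list[str], target: str) -> Optional[int]:
    """Return the position of target, or None if it is missing."""
    return items.index(target) if target in items else None


def apply(operation: Callable[[Number, Number], Number], a: Number, b: Number) -> Number:
    """Call the given two-argument function."""
    return operation(a, b)


print(find_index(["a", "b"], "b"))        # 1
print(find_index(["a", "b"], "z"))        # None
print(apply(lambda x, y: x * y, 3, 1.5))  # 4.5
```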
Decorators
Decorators are a powerful and advanced feature in Python that allow you to modify or enhance the behavior of functions or methods without changing their source code. They are applied to functions or methods using the "@" symbol followed by the decorator function's name. Decorators are a form of metaprogramming and are widely used in Python for various tasks like logging, authentication, timing, validation, and caching. We've already seen some of Python's built-in decorators before, such as @staticmethod
, @classmethod
, and @property
, which are commonly used when defining classes and methods. Decorators are implemented as regular Python functions. They take a function as an argument and return a new function that usually extends or modifies the behavior of the original function. When you apply a decorator to a function or method, it effectively wraps the original function inside the decorator function. This means that when you call the decorated function, the decorator's code is executed before and/or after the original function's code. Feeling overwhelmed? Don't worry! Let's simplify the concept with a straightforward example to grasp how decorators work.
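Here is a minimal sketch of a decorator that prints a message before and after the wrapped function runs. The names log_calls and multiply are hypothetical.

```python
import functools


def log_calls(function):
    """Decorator that prints a message before and after each call."""

    @functools.wraps(function)  # keep the original name and docstring
    def wrapper(*args, **kwargs):
        print(f"Calling {function.__name__} with {args} {kwargs}")
        result = function(*args, **kwargs)
        print(f"{function.__name__} returned {result}")
        return result

    return wrapper


@log_calls
def multiply(a, b):
    """Multiply two numbers."""
    return a * b


print(multiply(3, 4))
# The @ syntax is shorthand for: multiply = log_calls(multiply)
```

Using functools.wraps is optional but recommended; without it, the decorated function would report the name and docstring of wrapper instead of its own.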
Dataclasses
Dataclasses are a decorator-based feature introduced in Python 3.7 to simplify the creation of classes that primarily exist to store and manage data. They provide a concise way to define classes with attributes, automatically generate special methods like __init__
, __repr__
, and __eq__
, and improve code readability. This is particularly useful for classes where the primary purpose is to hold attributes, such as configurations, data objects, or simple structs. To create a dataclass, you need to import the dataclass
decorator from the dataclasses
module and apply it to your class. Let's take a look at some basic examples.
Enums
Enums, short for "enumerations", are a valuable feature in Python that allow you to define a set of named, constant values representing discrete choices or options. Enums make your code more readable and maintainable by providing meaningful names to these values. Since they are type-safe, they help to prevent errors caused by incorrect or invalid values. The enum
module was introduced in Python 3.4. Let's take a look at an example to understand how enums work.
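A basic enum might look like the following sketch; the Color class is a hypothetical example.

```python
from enum import Enum


class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3


favorite = Color.GREEN
print(favorite)        # Color.GREEN
print(favorite.name)   # GREEN
print(favorite.value)  # 2
print(Color(3))        # Color.BLUE (lookup by value)

# Members are singletons, so identity comparison works.
print(favorite is Color.GREEN)  # True

# Invalid values are rejected instead of silently accepted.
try:
    Color(42)
except ValueError:
    print("42 is not a valid color value")
```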
How To Name Things In Code
Choosing appropriate names for elements in your code is a fundamental aspect of writing clean and maintainable code independent of the programming language. Phil Karlton's famous quote, "There are only two hard things in Computer Science: cache invalidation and naming things," highlights the challenges associated with naming in programming. In this section, we will explore some best practices and conventions for naming variables, functions, classes, and modules to make your code more readable and understandable.
A common thing you'll see are variables with a single letter. Variable names should be descriptive and reflect the purpose of the variable. While it may be tempting to use single-letter variable names like x
, y
, or i
, it is generally best to avoid them. These names lack context and can make your code less intuitive and harder to understand. Instead, opt for names that clearly convey the information the variable holds. Taking this even further, you should never abbreviate names. Although there might be a common understanding of some single-letter variables and abbreviations in the field you are working in, they rely on context that a reader may or may not have. Long variable and function names are generally not a problem, especially in times of IDEs with autocompletion and large screens, as long as they are concise. Concise naming improves the readability and understandability of code. There may be one exception, namely loop variables. For variables that only live for a short time in a particular scope, single-letter names may be sufficient.
In the past, programmers often used Hungarian notation, where variable names were prefixed with their data types (e.g., str_name
for a string or int_count
for an integer). You should avoid putting types in your names. Nowadays, this is no longer necessary and just makes things unnecessarily complex, especially in dynamically typed languages like Python.
When working with measurements or quantities, it's a good practice to include units in variable names. This adds clarity to your code and ensures that others can understand the units your variables represent. Consider for example a function which takes something called delay
as an argument. Is this supposed to be a value in milliseconds, seconds or even hours? Using a proper variable name like delay_seconds
can prevent confusion and potential errors. Even better is to use a type that removes the ambiguity completely, because the type relieves the caller of having to know the exact underlying unit. In dynamically typed languages like Python, however, type declarations are not enforced at runtime, so we cannot rely on them alone.
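Both naming strategies are contrasted in the sketch below. Using datetime.timedelta as the unit-carrying type is a suggestion made here for illustration, not something prescribed by the text, and both function names are hypothetical.

```python
import time
from datetime import timedelta


def wait_seconds(delay_seconds):
    """The unit is part of the parameter name."""
    time.sleep(delay_seconds)


def wait(delay: timedelta):
    """The unit is carried by the value itself."""
    time.sleep(delay.total_seconds())


wait_seconds(0.01)
wait(timedelta(milliseconds=10))  # the caller chooses the unit explicitly

print(timedelta(minutes=2).total_seconds())  # 120.0
```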
Sometimes, difficulty in naming elements in your code may indicate underlying structural issues. A common anti-pattern is collecting unrelated functions into a single "utils" module or class. If you find yourself tempted to name something as generic as "utils" or "helper," it might be a sign that your code needs refactoring. Consider moving your code to already existing classes and modules or breaking it down into smaller, logically organized modules or classes with descriptive names. This not only makes your code easier to navigate but also improves its maintainability.
To write code that is clean, understandable, and maintainable, it's crucial to adhere to conventions, best practices, and coding standards specific to the programming language you are using. These guidelines provide a common set of rules and recommendations that help programmers write code that is not only functional but also follows a standardized structure for readability and collaboration. I recommend that you invest time in familiarizing yourself with the conventions and best practices established for the programming language you are working with. Each language has its own set of guidelines that have evolved over time to improve code quality and consistency. In the case of Python, we have PEP 8, which serves as the official style guide for Python code.
If you're working in a team, consistency becomes even more critical. Teams often establish their own set of coding conventions and style guidelines to ensure that everyone on the team is on the same page. These guidelines help maintain a unified coding style across the project and make it easier for team members to understand and collaborate on each other's code. Even if you have personal preferences that differ from your team's chosen conventions, it's essential to prioritize team consistency.
Large organizations, such as Google, often have extensive style guides for various programming languages. The reason behind this is not just for the sake of bureaucracy but rather for the sake of efficiency, maintainability, and collaboration. When you're working on a large codebase with multiple developers, adhering to a common set of coding standards becomes essential. Style guides help prevent common pitfalls, reduce debugging time, and facilitate the integration of code from different team members. They also make it easier for new team members to onboard quickly.
Python Code Best Practices
Exploring Pythonic code and best practices is crucial for writing clean, readable, maintainable, and efficient Python programs. Pythonic code is code that adheres to the style, idioms, and principles of the Python programming language. Let's take a look at some best practices:
- Follow conventions, such as PEP 8: consistent formatting, naming conventions, and code structure make your code more readable and maintainable. Tools like linters can help you enforce PEP 8 standards.
- Avoid magic numbers: define them as named constants to improve code readability and make changes easier. For example, define MAX_SIZE = 100 and write if len(data) > MAX_SIZE instead of if len(data) > 100.
- Comment and document effectively: write meaningful docstrings for modules, classes and functions to explain their purpose. Use comments sparingly.
- Use f-strings instead of string concatenation to increase readability.
- Avoid global variables: they can lead to unintended side effects, make code harder to reason about, and are rarely needed.
- Use comprehensions and lambda functions to a reasonable extent: they can make your code more concise. However, there's a trade-off between conciseness and readability. Complex comprehensions can become less readable.
- Use context managers (with statement) instead of try and finally. They ensure proper resource cleanup.
- Avoid bare except clauses and use specific exceptions instead.
- Use isinstance to check for a type: this allows more flexibility and robustness in handling object types.
- Use if <value> instead of if <value> == True or if len(<value>) > 0.
- Instead of iterating using the range length idiom, use for ... in or, if you need the index, use enumerate (you may also want to take a look at zip).
- Use packing and unpacking to work with multiple values, such as tuples and lists.
- Avoid using wildcard imports (import *): they can clutter your namespace and make it unclear in which module a variable or function is defined. Use explicit imports instead.
- Use logging instead of print: logging provides more flexibility, making it more suitable for debugging and production environments.
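Several of the practices above are combined in the following sketch. The data and names are made up for illustration.

```python
MAX_ITEMS = 3  # named constant instead of a magic number

names = ["Ada", "Grace", "Linus"]
ages = [36, 45, 28]

# Truthiness instead of len(names) > 0, f-string instead of concatenation.
if names:
    print(f"{len(names)} names found")

# enumerate instead of range(len(names)).
for index, name in enumerate(names):
    print(index, name)

# zip to iterate over two sequences in parallel.
people = dict(zip(names, ages))

# Unpacking into individual names.
first, *others = names

# isinstance instead of comparing type objects.
print(isinstance(True, int))  # True, because bool is a subclass of int

print(len(names) <= MAX_ITEMS)  # True
```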
Working Code Is Not Enough
To conclude this introduction, let's explore some high-level concepts of software design. While our primary focus in this lecture is on Python programming, it's essential to understand that writing code in Python, or any language for that matter, is just one part of software development. Understanding the fundamentals of software design will help you as you continue to learn. The following ideas are inspired by John Ousterhout's book, "A Philosophy of Software Design", which I highly recommend to anyone who wants to dive deeper. You can also gain a preliminary insight into these ideas by watching the related talk at Google.
In the field of software development, we have the unique ability to create virtual worlds, automate complex tasks, and bring digital ideas to life. However, this creative freedom comes with its own set of challenges. Software starts simple but tends to grow in complexity over time: new features are added, requirements change, and the codebase accumulates intricacies. This natural progression presents a significant challenge: how do we continue to work on, modify, and extend a system as it becomes increasingly intricate? This is where software design comes to our rescue. By embracing the principles of good software design, we can fight complexity and build larger and more powerful systems before complexity becomes overwhelming.
One of the key elements of good software design is our mindset when approaching programming tasks. Many programmers adopt a mindset called tactical programming, where the primary goal is to get a specific feature working or fix a particular bug. This approach is entirely reasonable, especially when dealing with immediate needs. However, it is short-sighted and does not lead to good software design. Since complexity is incremental, the small things that make a system complicated accumulate. These complexities will start causing problems, so you will either need to refactor existing code or work around the problems, which creates even more complexity. At some point, refactoring becomes practically infeasible, and the code gets worse and worse.
Thus, it is important to understand that working code isn't enough. Strategic programming offers an alternative perspective. Instead of merely focusing on functionality, it encourages us to craft a design that facilitates long-term code maintenance, extension, and modification. This strategic approach demands an investment mindset. Yes, these investments may temporarily slow down development in the short term, but they pave the way for greater efficiency and effectiveness in the long term.
Now, the question arises: How much should you invest in software design? Attempting to design an entire system upfront, following the traditional waterfall method, often proves impractical and inflexible. Instead, the best approach is to make continuous, incremental improvements. As a rule of thumb, dedicating 10-20% of your total development time to these strategic investments strikes a good balance between immediate functionality and long-term maintainability.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.