Python Variables
Overview
Python variables are very similar to how variables are used in R. The primary difference is that instead of using ←
or →
to assign variables, Python uses a single =
.
Python has a few key differences from R in regards to variable behavior. Information on variable assignment in Python can be found in the Variable Assigment section below.
Similar to other programming languages Python has several core variable types. Overview of each variable type are included below:
Variable Assigment
my_var = 4
This declares a variable with a value of 4.
Actually this is technically not true. Numbers between -5 and 256 (inclusive) are already pre-declared and exist within Python’s memory before you assigned the value to my_var. The = operator simply forces my_var to point to that value that already exists! That is right, my_var is technically a pointer. |
One of the most important differences between variables in R and Python is what is happening in the background. Take the code example below:
my_var = 4
new_variable = my_var
my_var = my_var + 1
print(f"my_var: {my_var}\nnew_variable: {new_variable}")
my_var: 5 my_var: 4
my_var = [4,]
new_variable = my_var
my_var[0] = my_var[0] + 1
print(f"my_var: {my_var}\nnew_variable: {new_variable}")
my_var: [5] new_variable: [5]
The first chunk of code behaves as you’d expect because int
values are immutable, meaning the values cannot be changed. As a result, when we assign my_var = my_var + 1
, my_vars
value isn’t changing. Instead my_var
is just being pointed to a different value (in this case 5). In comparison new_variable
still points to the value of 4.
The second chunk of code is different because it is dealing with a mutable list
. We first assign the first value (0 index) of the list to a value of 4. We then assign my_var
to new_variable
. Unlike the first example, this does not copy the values. Instead both my_var
and new_variable
point to the same mutable list object. When we then change the value of the list by 1 the change is reflected in each variable since they are pointing to the same object.
An excellent article goes into more detail and can be found here.
None
None
is a keyword used to define a null value. This would be the Python equivalent to R’s NULL
. If used in an if statement, None
represents False
. This does not mean None
== False
, in fact:
print(None == False)
False
Even though None
can represent False
in an if statement Python does not evaluate the two as equivalent.
NaN
The difference between NaN
and None
in Python can be somewhat confusing. The NaN
value stand for not a number
and is commonly used to reference missing data. Python will often convert None
values to NaN
dynamically, especially when working with numbers. There are lots of methods to identify and remove or fill NaN
values, but it is worth noting that Python will evaluate them with no issues for many operations.
For example, if I wanted to sum the rows below and then remove any NaN
values we could try the initial code snippet.
col_1 = [np.nan, 50, 100]
ccol_2 = [np.nan, 100, 50]
example_dataframe = pd.DataFrame(list(zip(col_1, col_2)), columns=['Value 1', 'Value 2'])
example_dataframe['example_sum'] = np.sum(example_dataframe, axis=1)
However, Python would evaluate this as the table below:
Value 1 |
Value 2 |
example_sum |
NaN |
NaN |
0 |
50 |
100 |
150 |
100 |
50 |
150 |
If we were to try to remove NaN
values based on the example_sum
column no rows would be removed. In this case we’d want to remove or fill the Nan
values prior to the aggregation (sum).
bool
A bool
has two possible values: True
and False
. It is important to understand that technically they also correspond to integers:
print(True == 1)
True
print(False == 0)
True
The True
and False
values only correspond to 1 and 0 respectively. They will not evaulate in the same way for other numbers:
print(True == 2)
False
However, if used in an if statement numbers that do not equal 1 or 0 can evaulate to True
. Think of the if statement below as asking the question Does this value equal 3?
and returning True
or False
as a result.
if 3:
print("3 evaluates to True")
3 evaluates to True
str
str
indicate string in Python. String are "immutable sequences of Unicode code points". Strings can be surrounded in single quotes, double quotes, or triple quoted (with either single or double quotes):
print(f"Single quoted text is type: {type('test')}")
Single quoted text is type: <class 'str'>
print(f"Double quoted text is type: {type("test")}")
Double quoted text is type: <class 'str'>
print(f"Triple quoted with single quotes is type: {type('''test''')}")
Triple quoted with single quotes is type: <class 'str'>
print(f"Triple quoted with double quotes is type: {type("""test""")}")
Triple quoted with double quotes is type: <class 'str'>
The benefit of triple quoting a string is that it can span multiple lines in the code. These lines will include the whitespace between the text:
my_string = """This text
spans multiple
lines."""
print(my_string)
This text spans multiple lines.
However, if we tried the same thing without triple quotes:
my_string = "This text,
will throw an error"
print(my_string)
In Python you do have the ability for other code to span multiple lines using \
, but newlines won’t be maintained:
my_string = "This text, \
will throw an error"
print(my_string)
This text, will throw an error
int
int
values are whole numbers. For instance:
my_var = 5
print(type(my_var))
<class 'int'>
int
values can be added, subtracted, or multiplied without changing the variable type. However, divison of int
values will change the variable type to float whether or not the result of the division is a whole number:
print(type(6+2-2*2))
<class 'int'>
print(type(6/2))
<class 'float'>
Similarly, any calculation between an int
and a float
results in a float
:
print(type(6+2.0)) ## 2.0 is a float
<class 'float'>
float
float
values are floating point numbers. Also known as numbers with decimals.
my_var = 5.0
print(type(my_var))
<class 'float'>
float
values can be converted back to int
using the int
function. This coercion causes the float
value to be truncated, regardless of how close to the "next" number the float is. Note: This will not round a number in the way that you would expect. There are round
functions in Python that have the more expected functionality.
print(int(5.5))
5
print(int(5.9999))
5
complex
complex
values represent complex numbers. For example, j
can be used to represent an imaginary number. In order for Python to understand this j
must be preceded by a number. For example 1j
.
my_var = 1j
print(my_var)
1j
print(type(my_var))
<class 'complex'>
Arithmetic with a complex
value always results in a complex
:
print(type(1j * 2))
<class 'complex'>
Unlike the other types mentioned above, you cannot convert a complex
value to an int
or float
:
print(int(1j*1j))
print(float(1j*1j))
Python error :(