Programming and Data Analysis¶
Getting started with Python
Yao-Jen Kuo yaojenkuo@ntu.edu.tw from DATAINPOINT
Getting Started¶
Tools we use to write/run Python programs¶
- Terminal (Anaconda Prompt/Terminal of macOS).
- Text editor (Visual Studio Code is recommended).
- Python interpreter (Miniconda is recommended).
Learn a bit more about what terminal/bash/command line is¶
Using Stack Overflow and Pythontutor.com to help us learn programming¶
How to use Stack Overflow efficiently?¶
- The first post is the question itself.
- The second post, if marked with a green check, is the answer accepted by the asker.
- The third post is the answer most up-voted by others.
Functions¶
What is print() in our previous example?¶
print("Hello world!")
print() is one of the so-called built-in functions in Python.
What is a function¶
A function is a named sequence of statements that performs a computation, either mathematical, symbolic, or graphical. When we define a function, we specify the name and the sequence of statements. Later, we can call the function by name.
How do we analyze a function?¶
- function name.
- inputs and parameters, if any.
- the sequence of statements in the code block that belongs to the function.
- outputs, if any.
Take a bubble tea shop, for instance¶
Source: Google Search
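As a rough illustration of the four parts of a function (name, inputs and parameters, statements, outputs), here is a sketch of a bubble tea order written as a function. The name make_bubble_tea, its parameters, and the returned sentence are all hypothetical and chosen only for this example.

```python
# Hypothetical example: none of these names come from the course material.
def make_bubble_tea(tea: str, topping: str, sugar_level: str = "regular", ice_level: str = "regular") -> str:
    """Describe the drink made from the given inputs and parameters."""
    # sequence of statements
    drink = f"{tea} with {topping} ({sugar_level} sugar, {ice_level} ice)"
    # output
    return drink

order = make_bubble_tea("black milk tea", "tapioca pearls", sugar_level="half")
print(order)  # black milk tea with tapioca pearls (half sugar, regular ice)
```

Here make_bubble_tea is the function name, tea and topping act as inputs, sugar_level and ice_level act as parameters with default values, and the returned str is the output.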
What is a built-in function?¶
A pre-defined function: we can call it by name without defining it ourselves.
How many built-in functions are available?¶
- print()
- help()
- type()
- ...etc.
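One way to get a feel for how many there are is to inspect the standard builtins module. The sketch below counts every public callable name in it, which is an assumption about what to count: the result also includes classes such as int and the built-in exception classes, and the exact number depends on the Python version.

```python
import builtins

# Collect public names from the builtins module that can be called.
# This includes functions such as print() and classes such as int or type.
callable_names = [
    name
    for name in dir(builtins)
    if not name.startswith("_") and callable(getattr(builtins, name))
]

print(len(callable_names))
print("print" in callable_names, "type" in callable_names, "sorted" in callable_names)
```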
Get help with help()¶
help(print)
Help on built-in function print in module builtins:

print(*args, sep=' ', end='\n', file=None, flush=False)
    Prints the values to a stream, or to sys.stdout by default.

    sep
      string inserted between values, default a space.
    end
      string appended after the last value, default a newline.
    file
      a file-like object (stream); defaults to the current sys.stdout.
    flush
      whether to forcibly flush the stream.
help(type)
Help on class type in module builtins: class type(object) | type(object) -> the object's type | type(name, bases, dict, **kwds) -> a new type | | Methods defined here: | | __call__(self, /, *args, **kwargs) | Call self as a function. | | __delattr__(self, name, /) | Implement delattr(self, name). | | __dir__(self, /) | Specialized __dir__ implementation for types. | | __getattribute__(self, name, /) | Return getattr(self, name). | | __init__(self, /, *args, **kwargs) | Initialize self. See help(type(self)) for accurate signature. | | __instancecheck__(self, instance, /) | Check if an object is an instance. | | __or__(self, value, /) | Return self|value. | | __repr__(self, /) | Return repr(self). | | __ror__(self, value, /) | Return value|self. | | __setattr__(self, name, value, /) | Implement setattr(self, name, value). | | __sizeof__(self, /) | Return memory consumption of the type object. | | __subclasscheck__(self, subclass, /) | Check if a class is a subclass. | | __subclasses__(self, /) | Return a list of immediate subclasses. | | mro(self, /) | Return a type's method resolution order. | | ---------------------------------------------------------------------- | Class methods defined here: | | __prepare__(...) | __prepare__() -> dict | used to create the namespace for the class statement | | ---------------------------------------------------------------------- | Static methods defined here: | | __new__(*args, **kwargs) | Create and return a new object. See help(type) for accurate signature. | | ---------------------------------------------------------------------- | Data descriptors defined here: | | __abstractmethods__ | | __annotations__ | | __dict__ | | __text_signature__ | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | __base__ = <class 'object'> | The base class of the class hierarchy. 
| | When called, it accepts no arguments and returns a new featureless | instance that has no instance attributes and cannot be given any. | | | __bases__ = (<class 'object'>,) | | __basicsize__ = 920 | | __dictoffset__ = 264 | | __flags__ = 2156420354 | | __itemsize__ = 40 | | __mro__ = (<class 'type'>, <class 'object'>) | | __type_params__ = () | | __weakrefoffset__ = 368
We can also call help() on help()¶
help(help)
Help on _Helper in module _sitebuiltins object:

class _Helper(builtins.object)
 |  Define the builtin 'help'.
 |
 |  This is a wrapper around pydoc.help that provides a helpful message
 |  when 'help' is typed at the Python interactive prompt.
 |
 |  Calling help() at the Python prompt starts an interactive help session.
 |  Calling help(thing) prints help for the python object 'thing'.
 |
 |  Methods defined here:
 |
 |  __call__(self, *args, **kwds)
 |      Call self as a function.
 |
 |  __repr__(self)
 |      Return repr(self).
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  __dict__
 |      dictionary for instance variables
 |
 |  __weakref__
 |      list of weak references to the object
Besides built-in and library-provided functions, we sometimes need to define our own functions¶
- def the name of our function
- return the output of our function
def function_name(INPUTS: type, PARAMETERS: type, ...) -> type:
    """
    docstring: print documentation when help() is called
    """
    # sequence of statements
    return OUTPUTS
# Definition
def add(x: int, y: int) -> int:
    """
    Equivalent to x + y
    >>> add(5, 6)
    11
    >>> add(55, 66)
    121
    >>> add(8, 7)
    15
    """
    return x + y
help(add)
Help on function add in module __main__:

add(x: int, y: int) -> int
    Equivalent to x + y
    >>> add(5, 6)
    11
    >>> add(55, 66)
    121
    >>> add(8, 7)
    15
Call the function by name after defining it¶
print(add(5, 6))
11
Programming based on testing is called TDD, Test-Driven Development¶
- Test-driven development (TDD) is a software development process relying on software requirements being converted to test cases before software is fully developed.
- Our assignments and exams are the minimal version of TDD.
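The `>>>` examples in the docstring of add can themselves serve as test cases. The sketch below shows one possible way to run them with the standard-library doctest module; using doctest here is an assumption for illustration and not necessarily how the assignments are graded.

```python
import doctest

def add(x: int, y: int) -> int:
    """
    Equivalent to x + y
    >>> add(5, 6)
    11
    >>> add(55, 66)
    121
    >>> add(8, 7)
    15
    """
    return x + y

# Find the examples embedded in the docstring of add and run each of them.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(add, name="add", module=False, globs={"add": add}):
    runner.run(test)

# summarize() reports how many examples failed and how many were attempted.
results = runner.summarize(verbose=False)
print(results.failed, results.attempted)  # 0 3
```

Writing the expected results first and the function body afterwards is the test-first habit that TDD describes.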
Arithmetic Operators in Python¶
Symbols that represent computations¶
- +, -, *, / are quite straightforward.
- ** for exponentiation.
- % for remainder.
- // for floor division.
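A short sketch of each operator on small integers; the variable names are chosen only for this example.

```python
total = 7 + 2             # addition
difference = 7 - 2        # subtraction
product = 7 * 2           # multiplication
quotient = 7 / 2          # true division always returns a float
power = 7 ** 2            # exponentiation
remainder = 7 % 2         # remainder of 7 divided by 2
floor_quotient = 7 // 2   # floor division drops the fractional part
negative_floor = -7 // 2  # floors toward negative infinity, not toward zero

print(total, difference, product, quotient)              # 9 5 14 3.5
print(power, remainder, floor_quotient, negative_floor)  # 49 1 3 -4
```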
When an expression contains more than one operator, the order of evaluation depends on the operator precedence¶
- Parentheses have the highest precedence.
- Exponentiation has the next highest precedence.
- Multiplication and division have higher precedence than addition and subtraction.
- Operators with the same precedence are evaluated from left to right, except exponentiation, which is evaluated from right to left.
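The precedence rules can be checked on a few small expressions. This is only a sketch; the last line relies on the fact that Python groups chained ** from right to left.

```python
no_parentheses = 2 + 3 * 4 ** 2      # 4 ** 2 = 16, then 3 * 16 = 48, then 2 + 48
with_parentheses = (2 + 3) * 4 ** 2  # parentheses first: 5 * 16
left_to_right = 100 / 10 * 2         # same precedence: (100 / 10) * 2
right_to_left = 2 ** 3 ** 2          # chained exponentiation: 2 ** (3 ** 2)

print(no_parentheses)    # 50
print(with_parentheses)  # 80
print(left_to_right)     # 20.0
print(right_to_left)     # 512
```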
Converting Fahrenheit to Celsius¶
\begin{equation} \text{Celsius}(^{\circ}C) = \left( \text{Fahrenheit}(^{\circ}F) - 32 \right) \times \frac{5}{9} \end{equation}
def convert_fahrenheit_to_celsius(x: int) -> float:
    """
    Converting from fahrenheit scale to celsius scale.
    >>> convert_fahrenheit_to_celsius(32)
    0.0
    >>> convert_fahrenheit_to_celsius(212)
    100.0
    """
    out = (x - 32) * 5/9
    return out
print(convert_fahrenheit_to_celsius(32))
print(convert_fahrenheit_to_celsius(212))
0.0
100.0
How to properly use functions?¶
- Using arguments to adjust the output of a defined function.
- Differentiate between functions and methods.
- Be aware of the update mechanism.
The sorted() function takes a bool argument for its reverse parameter¶
list_to_be_sorted = [11, 5, 7, 2, 3]
print(sorted(list_to_be_sorted, reverse=True))
print(sorted(list_to_be_sorted))
[11, 7, 5, 3, 2]
[2, 3, 5, 7, 11]
Different syntax¶
function_name(OBJECT, ARGUMENTS) # function
OBJECT.method_name(ARGUMENTS) # method
list has a sort() method that works like the sorted() function¶
list_to_be_sorted = [11, 5, 7, 2, 3]
print(sorted(list_to_be_sorted))
list_to_be_sorted.sort()
print(list_to_be_sorted)
[2, 3, 5, 7, 11]
[2, 3, 5, 7, 11]
How is list_to_be_sorted being updated?¶
# update through return
list_to_be_sorted = [11, 5, 7, 2, 3]
sorted_list = sorted(list_to_be_sorted)
print(sorted_list)
[2, 3, 5, 7, 11]
# update through change of state
list_to_be_sorted = [11, 5, 7, 2, 3]
list_to_be_sorted.sort()
print(list_to_be_sorted)
[2, 3, 5, 7, 11]
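The difference between the two update mechanisms becomes visible when we look at what each call returns and what happens to the original list. A minimal sketch, with variable names chosen only for this example:

```python
numbers = [11, 5, 7, 2, 3]

# sorted() returns a new list and leaves the original untouched.
returned_by_sorted = sorted(numbers)
print(returned_by_sorted)  # [2, 3, 5, 7, 11]
print(numbers)             # [11, 5, 7, 2, 3]

# list.sort() changes the list in place and returns None.
returned_by_sort = numbers.sort()
print(returned_by_sort)    # None
print(numbers)             # [2, 3, 5, 7, 11]
```

A common mistake is writing numbers = numbers.sort(), which replaces the list with None.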
Variables¶
We usually don't just print out literal values¶
print("Hello world!")
Hello world!
It is more useful to refer to a literal value by an object name¶
hello_world = "Hello world!"
print(hello_world.lower())
print(hello_world.upper())
print(hello_world.swapcase())
print(hello_world.title())
hello world!
HELLO WORLD!
hELLO WORLD!
Hello World!
A variable is a name that refers to a value¶
variable_name = literal_value
Choose names for our variables: dos¶
- Use a lowercase single letter, word, or words.
- Separate words with underscores to improve readability (so-called snake case).
- Be meaningful.
Using # to write comments in our program¶
A comment can appear on a line by itself or at the end of a line.
# Turn fahrenheit to celsius
def from_fahrenheit_to_celsius(x: int) -> float:
    out = (x - 32) * 5/9
    return out
print(from_fahrenheit_to_celsius(32)) # turn 32 fahrenheit to celsius
print(from_fahrenheit_to_celsius(212)) # turn 212 fahrenheit to celsius
0.0
100.0
Everything from # to the end of the line is ignored during execution¶
Data Types¶
Values belong to different types; we commonly use¶
- int and float for numeric computing.
- str for symbolic (text) values.
- bool for conditionals.
- NoneType for undefined values.
Use the type() function to check the type of a certain value/variable¶
print(type(5566))
print(type(42.195))
print(type("Hello world!"))
print(type(False))
print(type(True))
print(type(None))
<class 'int'>
<class 'float'>
<class 'str'>
<class 'bool'>
<class 'bool'>
<class 'NoneType'>
str¶
How to form a str?¶
Use paired ', ", or """ to enclose a sequence of characters.
str_with_single_quotes = 'Hello world!'
str_with_double_quotes = "Hello world!"
str_with_triple_double_quotes = """Hello world!"""
print(type(str_with_single_quotes))
print(type(str_with_double_quotes))
print(type(str_with_triple_double_quotes))
<class 'str'>
<class 'str'>
<class 'str'>
If we have single/double quotes in str values we might get a SyntaxError¶
mcd = 'I'm lovin' it!'
Use \ to escape, or use paired " or paired """¶
mcd = 'I\'m lovin\' it!'
mcd = "I'm lovin' it!"
mcd = """I'm lovin' it!"""
Great features of strings formed with paired """¶
- A paragraph.
- Docstring.
Use paired """ for a paragraph¶
storyline = """
Chronicles the experiences of a formerly successful banker \
as a prisoner in the gloomy jailhouse of Shawshank after \
being found guilty of a crime he did not commit. The film \
portrays the man's unique way of dealing with his new, torturous \
life; along the way he befriends a number of fellow prisoners, \
most notably a wise long-term inmate named Red.
"""
sql_query = """
SELECT *
FROM world
WHERE country = 'Taiwan';
"""
Use paired """ for docstring¶
def from_fahrenheit_to_celsius(x: int) -> float:
    """
    Turns fahrenheit to celsius.
    """
    return (x - 32) * 5/9
help(from_fahrenheit_to_celsius)
Help on function from_fahrenheit_to_celsius in module __main__:

from_fahrenheit_to_celsius(x: int) -> float
    Turns fahrenheit to celsius.
We've seen arithmetic operators for numeric values¶
How about those for str?
The str type takes + and *¶
- + for concatenation.
- * for repetition.
mcd = "I'm lovin' it!"
print(mcd)
print(mcd + mcd)
print(mcd * 3)
I'm lovin' it!
I'm lovin' it!I'm lovin' it!
I'm lovin' it!I'm lovin' it!I'm lovin' it!
Format our str¶
- The .format() way.
- The f-string way.
The f-string way: use {} to embed values in a formatted string¶
def hello_anyone(anyone: str) -> str:
    out = f"Hello {anyone}!"
    return out
print(hello_anyone("Anakin Skywalker"))
print(hello_anyone("Luke Skywalker"))
Hello Anakin Skywalker!
Hello Luke Skywalker!
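For comparison, the .format() way mentioned earlier produces the same greeting. This is a minimal sketch; the function name hello_anyone_with_format is made up for this example.

```python
def hello_anyone_with_format(anyone: str) -> str:
    # str.format() fills each {} placeholder with the matching argument.
    return "Hello {}!".format(anyone)

greeting = hello_anyone_with_format("Luke Skywalker")
print(greeting)  # Hello Luke Skywalker!
```

Both ways give the same result; the f-string version keeps the value next to the place where it appears in the text.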
Commonly used format¶
- {:.nf} for float format.
- {:,} for comma format.
def format_pi(pi: float) -> str:
    return f"{pi:.2f}"
print(format_pi(3.1415))
print(format_pi(3.141592))
3.14
3.14
def format_thousands(ntd: int) -> str:
    return f"${ntd:,}"
print(format_thousands(1000))
print(format_thousands(1000000))
print(format_thousands(1000000000))
$1,000
$1,000,000
$1,000,000,000
More formats with f-string¶
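A few further format specifications that are often handy, shown as a sketch; the variable names and values are hypothetical.

```python
share = 0.256
order_number = 42
item = "tea"
revenue = 1234567.891

as_percent = f"{share:.1%}"             # multiply by 100 and append %
zero_padded = f"{order_number:05d}"     # pad with zeros to width 5
right_aligned = f"{item:>6}"            # right-align within width 6
comma_and_decimals = f"{revenue:,.2f}"  # combine comma and float formats

print(as_percent)          # 25.6%
print(zero_padded)         # 00042
print(right_aligned)       #    tea
print(comma_and_decimals)  # 1,234,567.89
```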
bool¶
How to form a bool?¶
- Use keywords True and False directly.
- Use relational operators.
- Use logical operators.
Use keywords True and False directly¶
print(True)
print(type(True))
print(False)
print(type(False))
True
<class 'bool'>
False
<class 'bool'>
Use relational operators¶
We have ==, !=, >, <, >=, <=, in, not in as common relational operators to compare values.
print(5566 == 5566.0)
print(5566 != 5566.0)
print('56' in '5566')
True
False
True
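The remaining relational operators work the same way and always produce a bool. A short sketch with values chosen only for this example:

```python
is_greater = 42.195 > 21.0975      # True
is_at_most = 5 <= 5                # True
comes_first = "apple" < "banana"   # True: str values compare character by character
is_member = 3 in [11, 5, 7, 2, 3]  # True
is_absent = "7" not in "5566"      # True
str_vs_int = "5566" == 5566        # False: a str never equals an int

print(is_greater, is_at_most, comes_first)
print(is_member, is_absent, str_vs_int)
```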
Use logical operators¶
- We have and, or, not as common logical operators to manipulate bool type values.
- We get a True only if both sides of and are True.
- We get a False only if both sides of or are False.
print(True and True) # get True only when both sides are True
print(True and False)
print(False and False)
print(True or True)
print(True or False)
print(False or False) # get a False only when both sides are False
# use of not is quite straight-forward
print(not True)
print(not False)
True
False
False
True
True
False
False
True
An example of using logical operators¶
Good marathon weather is often described as dry and cold. Say the probabilities of dry and of cold on race day are each 50% and the two are independent; then there is a 25% chance of good marathon weather.
def is_good_marathon_weather(is_dry: bool, is_cold: bool) -> bool:
    return is_dry and is_cold
print(is_good_marathon_weather(True, True))
print(is_good_marathon_weather(True, False))
print(is_good_marathon_weather(False, True))
print(is_good_marathon_weather(False, False))
True
False
False
False
An example of using logical operators(cont'd)¶
Suppose instead that good marathon weather is described as dry or cold. With the same independent 50% probabilities of dry and of cold on race day, there is a 75% chance of good marathon weather.
def is_good_marathon_weather(is_dry: bool, is_cold: bool) -> bool:
    return is_dry or is_cold
print(is_good_marathon_weather(True, True))
print(is_good_marathon_weather(True, False))
print(is_good_marathon_weather(False, True))
print(is_good_marathon_weather(False, False))
True
True
True
False
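The 25% and 75% figures can be reproduced by counting outcomes. The sketch below assumes that dry and cold are independent and each has a 50% probability, so the four combinations of True and False are equally likely.

```python
from itertools import product

# All four equally likely (is_dry, is_cold) combinations.
outcomes = list(product([True, False], repeat=2))

# A bool counts as 1 (True) or 0 (False) when summed.
chance_dry_and_cold = sum(is_dry and is_cold for is_dry, is_cold in outcomes) / len(outcomes)
chance_dry_or_cold = sum(is_dry or is_cold for is_dry, is_cold in outcomes) / len(outcomes)

print(chance_dry_and_cold)  # 0.25
print(chance_dry_or_cold)   # 0.75
```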
bool is quite useful in control flow and filtering data.¶
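As a small preview of both uses, the sketch below filters a list with a condition and then branches on a bool. The temperatures, the 15-degree threshold, and the variable names are hypothetical, and the if/else statement and list comprehension are used here ahead of any formal introduction.

```python
race_day_temperatures = [31.5, 12.0, 8.5, 25.0]  # hypothetical values in Celsius

# Each comparison produces a bool.
is_cold_flags = [temperature < 15 for temperature in race_day_temperatures]
print(is_cold_flags)  # [False, True, True, False]

# Filtering: keep only the values whose condition is True.
cold_temperatures = [temperature for temperature in race_day_temperatures if temperature < 15]
print(cold_temperatures)  # [12.0, 8.5]

# Control flow: a bool decides which branch runs.
if len(cold_temperatures) >= 2:
    verdict = "enough cold days"
else:
    verdict = "too warm"
print(verdict)  # enough cold days
```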
NoneType¶
Python has a special type, NoneType, with a single value, None¶
- This is used to represent undefined values.
- It is not the same as False, an empty string '', or 0.
a_none_type = None
print(type(a_none_type))
print(a_none_type == False)
print(a_none_type == '')
print(a_none_type == 0)
print(a_none_type == None)
<class 'NoneType'>
False
False
False
True
A function without a return statement actually returns None, whose type is NoneType¶
def hello_anyone(anyone: str) -> None:
    print(f"Hello {anyone}!")
hello_anyone("Anakin Skywalker")
hello_anyone("Luke Skywalker")
Hello Anakin Skywalker!
Hello Luke Skywalker!
func_out = hello_anyone("Anakin Skywalker")
type(func_out)
Hello Anakin Skywalker!
NoneType
Besides the type() function, data types can also be validated via the isinstance() function¶
an_integer = 5566
a_float = 42.195
a_str = "5566"
a_bool = False
a_none_type = None
print(isinstance(an_integer, int))
print(isinstance(a_float, float))
print(isinstance(a_str, str))
print(isinstance(a_bool, bool))
print(isinstance(a_none_type, type(None))) # print(a_none_type == None)
True
True
True
True
True
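isinstance() and type() do not always agree, because isinstance() also accepts subclasses. The sketch below illustrates this with bool, which is a subclass of int in Python; the variable names are chosen only for this example.

```python
bool_is_int_instance = isinstance(True, int)            # True: bool is a subclass of int
bool_is_exactly_int = type(True) == int                 # False: the exact type is bool
float_is_numeric = isinstance(42.195, (int, float))     # True: a tuple checks several types at once
str_is_numeric = isinstance("5566", (int, float))       # False: a str is neither

print(bool_is_int_instance, bool_is_exactly_int)
print(float_is_numeric, str_is_numeric)
```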
Data types can be dynamically converted using functions¶
- int() for converting to int.
- float() for converting to float.
- str() for converting to str.
- bool() for converting to bool.
Upcasting (to a supertype) is always allowed¶
NoneType -> bool -> int -> float -> str.
print(bool(None))
print(int(True))
print(float(1))
print(str(1.0))
False
1
1.0
1.0
While downcasting (to a subtype) needs a second look¶
print(float('1.0'))
print(int('1'))
print(bool('False'))
print(bool('NoneType'))
1.0
1
True
True
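The last two results are True because bool() of a str only checks whether the str is empty; it does not read the text. The sketch below shows this and two other conversions that deserve a second look; the variable names are chosen only for this example.

```python
non_empty_is_true = bool("False")  # True: any non-empty str converts to True
empty_is_false = bool("")          # False: only the empty str converts to False

# int() cannot parse a str that looks like a float.
try:
    int("1.0")
    int_parse_failed = False
except ValueError:
    int_parse_failed = True

via_float = int(float("1.0"))      # 1: convert to float first, then to int
truncated = int(1.9)               # 1: int() drops the fractional part, it does not round

print(non_empty_is_true, empty_is_false)
print(int_parse_failed, via_float, truncated)
```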