Subroutines - Kasetsart University


More Lists, File Input, and Text Processing
01204111 Computers and Programming
Chaiporn Jaikaeo
Department of Computer Engineering, Kasetsart University
Cliparts are taken from http://openclipart.org
Revised 2017-10-18

Outline
Reading text files
Creating lists from other sequences using list comprehensions
Tabular data and nested lists

Task: Text File Reader
Read lines from a specified text file and display them along with their line numbers.
Suppose there is a file named data.txt that contains two lines:

    Hello
    Good morning

Then an example output of the program will be:

    Enter file name: data.txt
    Line 1: Hello
    Line 2: Good morning

Creating a Text File
A text file can be created using any text editor, such as Notepad.
IDLE is also a text editor: choose menu File > Open and start writing the contents.
Save the file with a .txt extension, not .py.

Reading a File with Python

Reading a file's contents as a single string:
Combine the open() function with the file.read() method.
Note that open() returns a file object, and file.read() returns a string.

    open(filename).read()

    >>> s = open("data.txt").read()
    >>> s
    'Hello\nGood morning\n'

Reading a file's contents as a list of strings, one per line:
The method str.splitlines() returns a list.

    open(filename).read().splitlines()

    >>> lines = open("data.txt").read().splitlines()
    >>> lines
    ['Hello', 'Good morning']

Trivia: Functions vs. Methods
A method is a function bound to an object.
Functions are called by just their names (e.g., len(), sum()).

    >>> len
    <built-in function len>
    >>> len("abc")
    3

Methods are called with their names and the objects they are bound to (e.g., str.split(), where str is replaced by a string).

    >>> s = "Hello, World"
    >>> s.split
    <built-in method split of str object at 0x...>
    >>> s.split(",")
    ['Hello', ' World']

Text File Reader Program
Our program reads a file as a list of strings, then traverses the list to print out each line.

    filename = input("Enter file name: ")
    lines = open(filename).read().splitlines()
    for i in range(len(lines)):
        print(f"Line {i+1}: {lines[i]}")

Example run:

    Enter file name: data.txt
    Line 1: Hello
    Line 2: Good morning

data.txt:

    Hello
    Good morning

Trivia: File Location Matters
If the text file is located in the same folder as the program, just type the file name, i.e., data.txt.
If not, the entire path name of the file must be used, e.g., C:\Users\user\Desktop\data.txt.
Windows: click a file icon in Explorer, press Ctrl-C, then go back to IDLE and press Ctrl-V.
macOS: click a file icon in Finder, press Alt-Command-C, then go back to IDLE and press Command-V.

Trivia: Files Should Be Closed
Opened files should be properly closed.
Files in the examples are closed automatically in most Python environments.
In real applications, you should explicitly close a file.
Two common methods: using the with statement or the close() method.
We won't use them in this course.

With the with statement, the file is closed automatically when exiting the with block:

    with open("file.txt") as f:
        for line in f.readlines():
            # process the lines
            pass

With the close() method, the file is closed manually:

    f = open("file.txt")
    for line in f.readlines():
        # process the lines
        pass
    f.close()

Task: Score Ranking
Read a file containing a list of scores.
Then sort the scores from highest to lowest and print out the ranking.

    Enter score file: scores.txt
    Rank #1: 97.5

    Rank #2: 87.3
    Rank #3: 75.6
    Rank #4: 63.0
    Rank #5: 37.6

scores.txt:

    87.3
    75.6
    63.0
    97.5
    37.6

Score Ranking: Ideas
Scores must be read as a list of numbers, not strings.
Each string member must be converted into a number with float():

    lines:   "87.3"  "75.6"  "63.0"  "97.5"  "37.6"
    scores:   87.3    75.6    63.0    97.5    37.6

Straightforward code with a for loop:

    :
    lines = open(filename).read().splitlines()
    scores = []
    for x in lines:
        scores.append(float(x))
    :

List Comprehensions
List comprehensions are a powerful and concise way to create new lists from other sequences.

    list2 = [expression for item in list1]

It behaves exactly like

    list2 = []
    for item in list1:
        list2.append(expression)

It is similar to set notation in mathematics.
[Figure: each item x_i of list1 is passed through expression to produce the item y_i of list2.]
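The equivalence between the loop form and the comprehension form can be checked directly. The following is a minimal sketch; the function names and sample values are illustrative assumptions, not part of the original slides.

```python
def squares_with_loop(values):
    # Explicit loop: start with an empty list and append one result per item.
    result = []
    for item in values:
        result.append(item * item)
    return result


def squares_with_comprehension(values):
    # Same computation written as a list comprehension.
    return [item * item for item in values]


numbers = [3, 1, 4]
print(squares_with_loop(numbers))            # [9, 1, 16]
print(squares_with_comprehension(numbers))   # [9, 1, 16]

# The source does not have to be a list: strings, tuples, and ranges work too.
upper_letters = [ch.upper() for ch in "abc"]
word_lengths = [len(word) for word in ("hi", "bye")]
print(upper_letters)   # ['A', 'B', 'C']
print(word_lengths)    # [2, 3]
```

Both functions return a new list and leave the input untouched, which is why the comprehension can replace the loop without changing the rest of a program.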

Examples: List Comprehensions
Create a new list with all values doubled from another list:

    >>> L1 = [5,1,2,8,9,12,16]
    >>> L2 = [2*x for x in L1]
    >>> L2
    [10, 2, 4, 16, 18, 24, 32]

Create a list of squares of n, where n = 1, 2, ..., 10.
A range object can be used directly inside a list comprehension:

    >>> [i**2 for i in range(1,11)]
    [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

Compute the sum of squares of n, where n = 1, 2, ..., 10:

    >>> sum([i**2 for i in range(1,11)])
    385
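As a quick sanity check on the last example, the result can be compared against the well-known closed form n(n+1)(2n+1)/6. This sketch is an addition for illustration; the function name sum_of_squares and the use of the closed form are assumptions, not something the slides present.

```python
def sum_of_squares(n):
    # Sum of i**2 for i = 1, 2, ..., n, built with a list comprehension.
    return sum([i**2 for i in range(1, n + 1)])


n = 10
closed_form = n * (n + 1) * (2 * n + 1) // 6   # 10 * 11 * 21 // 6
print(sum_of_squares(n))   # 385
print(closed_form)         # 385
```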

Score Ranking: Ideas
With a list comprehension, the code

    scores = []
    for x in lines:
        scores.append(float(x))

can be replaced by a much more concise statement:

    scores = [float(x) for x in lines]

Score Ranking Program

    filename = input("Enter score file: ")
    lines = open(filename).read().splitlines()
    scores = [float(x) for x in lines]
    # sort the scores from highest to lowest
    scores.sort(reverse=True)
    for i in range(len(scores)):
        print(f"Rank #{i+1}: {scores[i]}")

Example run:

    Enter score file: scores.txt
    Rank #1: 97.5
    Rank #2: 87.3
    Rank #3: 75.6
    Rank #4: 63.0
    Rank #5: 37.6

scores.txt:

    87.3
    75.6
    63.0
    97.5
    37.6

Caveats: Empty Lines in File
Empty lines in the input file will break the program.
Suppose scores.txt contains two empty lines:

    87.3
    75.6
    (empty line)
    63.0
    97.5
    37.6
    (empty line)

Running the program then fails:

    Enter score file: scores.txt
    Traceback (most recent call last):
      File "score-rank.py", line 3, in <module>
        scores = [float(x) for x in lines]
      File "score-rank.py", line 3, in <listcomp>
        scores = [float(x) for x in lines]
    ValueError: could not convert string to float:

We must filter out those empty lines before converting them to floats.
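The failure can be reproduced without a file. In this minimal sketch the variable text stands in for the file's contents (an assumption for illustration), and try/except is used only to show the error without stopping the script; the slides themselves do not use exception handling.

```python
# Hypothetical file contents with one empty line in the middle.
text = "87.3\n75.6\n\n63.0\n"
lines = text.splitlines()
print(lines)   # ['87.3', '75.6', '', '63.0']

try:
    # float("") raises ValueError, so the whole comprehension fails.
    scores = [float(x) for x in lines]
    failed = False
except ValueError as err:
    failed = True
    print("Conversion failed:", err)
```

Note that splitlines() turns the empty line into an empty string, and it is that empty string that float() cannot convert.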

Conditional List Comprehensions
Only certain members of the original list are selected for inclusion in the new list, using the if keyword.

    list2 = [expression for item in list1 if condition]

The above is similar to

    list2 = []
    for item in list1:
        if condition:
            list2.append(expression)

Examples: Conditional List Comprehensions
Split numbers into odd and even sets of numbers:

    >>> L = [5,1,2,8,9,12,16]
    >>> odd = [x for x in L if x%2 == 1]
    >>> even = [x for x in L if x%2 == 0]
    >>> odd
    [5, 1, 9]
    >>> even
    [2, 8, 12, 16]

Create a list of positive integers less than 100 that are divisible by 8 but not divisible by 6:

    >>> [x for x in range(1,100) if x%8 == 0 and x%6 != 0]
    [8, 16, 32, 40, 56, 64, 80, 88]

Score Ranking: Revised Program
This version skips empty lines in the input file.

    filename = input("Enter score file: ")
    lines = open(filename).read().splitlines()
    # the condition x != "" helps skip empty lines
    scores = [float(x) for x in lines if x != ""]
    scores.sort(reverse=True)
    for i in range(len(scores)):
        print(f"Rank #{i+1}: {scores[i]}")

Example run:

    Enter score file: scores.txt
    Rank #1: 97.5
    Rank #2: 87.3
    Rank #3: 75.6
    Rank #4: 63.0
    Rank #5: 37.6

scores.txt:

    87.3
    75.6
    63.0
    97.5
    37.6
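The condition x != "" only skips lines that are completely empty; a line containing just spaces would still reach float() and fail. One possible variation, sketched below, tests x.strip() instead. The function name parse_scores and the sample data are assumptions for illustration, not part of the original program.

```python
def parse_scores(lines):
    # Keep only lines that contain something other than whitespace.
    # float() itself tolerates leading/trailing spaces around a number.
    return [float(x) for x in lines if x.strip() != ""]


# Hypothetical input: one empty line, one whitespace-only line,
# and one score followed by a trailing space.
lines = ["87.3", "", "   ", "75.6 ", "97.5"]
scores = parse_scores(lines)
scores.sort(reverse=True)
print(scores)   # [97.5, 87.3, 75.6]
```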

Challenge: Top-Three Ranking
Modify the program so that it always outputs only the top three ranks.

    Enter score file: scores.txt
    Rank #1: 97.5
    Rank #2: 87.3
    Rank #3: 75.6

scores.txt:

    87.3
    75.6
    63.0
    97.5
    37.6

Tabular Data
Real-world data are often available in tabular form.
[Figure: snapshot of household income statistics by year, available at http://data.go.th]

CSV Files
CSV stands for Comma-Separated Values.
It is commonly used to store tabular data as a text file.
Each line is a row.
Columns in each line (row) are separated by commas.

    Subject   Credits  Grade
    01175112  1        B+
    01204111  3        A
    01417167  3        B

grades.txt:

    01175112,1,B+
    01204111,3,A
    01417167,3,B

CSV files can be opened directly in Microsoft Excel.
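To make the row/column structure concrete, the sketch below parses the three sample rows with Python's standard csv module. This is an alternative shown for illustration only: the slides split lines with str.split(","), which is sufficient for simple files like this one, while csv.reader additionally handles quoted fields. The io.StringIO object stands in for an opened file so the example needs no file on disk.

```python
import csv
import io

# Contents of the sample grades.txt from the slide.
text = "01175112,1,B+\n01204111,3,A\n01417167,3,B\n"

# csv.reader yields one list of column strings per row.
rows = list(csv.reader(io.StringIO(text)))
for row in rows:
    print(row)

# For a simple file without quoted fields, splitting on commas gives the same result.
split_rows = [line.split(",") for line in text.splitlines()]
print(rows == split_rows)   # True
```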

Task: GPA Calculator
Read a CSV file containing a list of subject codes, their credits, and the grades received.
Then display the grade summary, total credits, and GPA.

    Enter grade data file: grades.txt
    -----------------------------------
     Subject Credits Grade Point
    -----------------------------------
     01175112     1 B+    3.5
     01204111     3 A     4.0
     01355112     3 C+    2.5
     01417167     3 B     3.0
    -----------------------------------
    Total credits = 10
    GPA = 3.20

grades.txt:

    01175112,1,B+
    01204111,3,A
    01355112,3,C+
    01417167,3,B

GPA Calculator: Ideas
How to store tabular data in Python?
A table is a list of rows; each row is a list of columns.
We need a list of lists, also known as a nested list.

    >>> table = [[1,2,3],[4,5,6]]
    >>> len(table)
    2
    >>> table[1]        # access row#1 (2nd row)
    [4, 5, 6]
    >>> table[1][2]     # access column#2 (3rd column) in row#1 (2nd row)
    6

    table:  1 2 3
            4 5 6

GPA Calculator: Steps
Divide the whole task into three major steps.
Step 1: read grade table data from file as a nested list
Step 2: display the grade table
Step 3: calculate total credits and GPA

Breaking Lines into Columns
Python provides the str.split() method.

    >>> line = "01204111,3,A"
    >>> line.split(",")
    ['01204111', '3', 'A']

Let us try using it inside a list comprehension.

    >>> lines = open("grades.txt").read().splitlines()
    >>> lines

    ['01175112,1,B+', '01204111,3,A', '01355112,3,C+', '01417167,3,B']
    >>> table = [x.split(",") for x in lines]
    >>> table
    [['01175112', '1', 'B+'], ['01204111', '3', 'A'],
     ['01355112', '3', 'C+'], ['01417167', '3', 'B']]

We now have a nested list!

GPA Calculator: Steps
Step 1 - read grade table from file as a nested list
We will define the read_table() function as follows.

    def read_table(filename):
        lines = open(filename).read().splitlines()
        table = [x.split(",") for x in lines if x != ""]
        return table

Let's test it with grades.txt containing

    01175112,1,B+
    01204111,3,A
    01355112,3,C+
    01417167,3,B

    >>> read_table("grades.txt")
    [['01175112', '1', 'B+'], ['01204111', '3', 'A'],
     ['01355112', '3', 'C+'], ['01417167', '3', 'B']]

The resulting table is not complete.
The credits column should store integers, not strings, for later calculation.
The point column is still missing.
This is the output we expect to get in the end:

    Enter grade data file: grades.txt
    -----------------------------------
     Subject Credits Grade Point
    -----------------------------------
     01175112     1 B+    3.5
     01204111     3 A     4.0
     01355112     3 C+    2.5
     01417167     3 B     3.0
    -----------------------------------
    Total credits = 10
    GPA = 3.20

We will traverse the table list to adjust each row.
We also define the grade_point() function to map a grade to a point.

    def read_table(filename):
        lines = open(filename).read().splitlines()
        table = [x.split(",") for x in lines if x != ""]
        for row in table:
            # convert credits to integers
            row[1] = int(row[1])
            # add a new column for grade point
            row.append(grade_point(row[2]))
        return table

    >>> table = read_table("grades.txt")
    >>> table
    [['01175112', 1, 'B+', 3.5], ['01204111', 3, 'A', 4.0],
     ['01355112', 3, 'C+', 2.5], ['01417167', 3, 'B', 3.0]]

    def grade_point(grade):
        if grade == "A":
            return 4.0
        elif grade == "B+":
            return 3.5
        elif grade == "B":
            return 3.0
        elif grade == "C+":
            return 2.5
        elif grade == "C":
            return 2.0
        elif grade == "D+":
            return 1.5
        elif grade == "D":
            return 1.0
        elif grade == "F":
            return 0.0
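The grade-to-point mapping is a fixed lookup, so it could also be written with a dictionary. The sketch below is one possible alternative, not the version used in the slides; the constant name GRADE_POINTS is an assumption. dict.get() returns None for an unknown grade, which matches what the if/elif chain returns when no branch applies.

```python
# Mapping from letter grade to grade point, same values as the if/elif chain.
GRADE_POINTS = {
    "A": 4.0, "B+": 3.5, "B": 3.0, "C+": 2.5,
    "C": 2.0, "D+": 1.5, "D": 1.0, "F": 0.0,
}


def grade_point(grade):
    # Look the grade up; unknown grades yield None, as in the original function.
    return GRADE_POINTS.get(grade)


print(grade_point("B+"))   # 3.5
print(grade_point("F"))    # 0.0
print(grade_point("X"))    # None
```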

GPA Calculator: Steps
Step 2 - display the grade table
Traverse the table list and print out each row.

    def print_table(table):
        print("-----------------------------------")
        print(" Subject Credits Grade Point")
        print("-----------------------------------")
        for row in table:
            print(f" {row[0]:8} {row[1]:5} {row[2]:<5} {row[3]:.1f}")
        print("-----------------------------------")

    >>> print_table(table)    # table from previous step
    -----------------------------------
     Subject Credits Grade Point
    -----------------------------------
     01175112     1 B+    3.5
     01204111     3 A     4.0
     01355112     3 C+    2.5
     01417167     3 B     3.0
    -----------------------------------

Not so difficult, but it takes a lot of tweaking to get a nice-looking table.

GPA Calculator: Steps
Step 3 - calculate total credits and GPA
The total of credits is computed from the summation of column#1 in all rows.

    total_credits = sum([row[1] for row in table])

GPA is computed from the summation of credits*point over all subjects (credits: column#1, point: column#3), divided by the total credits.

    sum_credits_point = sum([row[1]*row[3] for row in table])
    gpa = sum_credits_point/total_credits

GPA Calculator: Main Program
read_table() and print_table() are not shown.

    filename = input("Enter grade data file: ")
    table = read_table(filename)
    print_table(table)
    total_credits = sum([row[1] for row in table])
    sum_credits_point = sum([row[1]*row[3] for row in table])
    gpa = sum_credits_point/total_credits
    print(f"Total credits = {total_credits}")
    print(f"GPA = {gpa:.2f}")

grades.txt:

    01175112,1,B+
    01204111,3,A
    01355112,3,C+
    01417167,3,B

Example run:

    Enter grade data file: grades.txt
    -----------------------------------
     Subject Credits Grade Point
    -----------------------------------
     01175112     1 B+    3.5
     01204111     3 A     4.0
     01355112     3 C+    2.5
     01417167     3 B     3.0
    -----------------------------------
    Total credits = 10
    GPA = 3.20

Notes: Why Subroutines?
Most examples in this course could be written without using subroutines at all.
That would also result in slightly shorter programs.
However, breaking a task into subroutines
helps focus on smaller, more manageable problems (i.e., separation of concerns),
makes programs easier to read, test, and debug, and
makes it easier to divide tasks among team members.
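As an illustration of the testing benefit, Step 3 can be wrapped in its own subroutine and checked against the sample data by hand: 1*3.5 + 3*4.0 + 3*2.5 + 3*3.0 = 32.0, and 32.0 / 10 = 3.2. This is a sketch; the function name compute_gpa is an assumption, and the table is written out literally instead of being read from grades.txt.

```python
def compute_gpa(table):
    # Column#1 holds credits and column#3 holds grade points.
    total_credits = sum([row[1] for row in table])
    sum_credits_point = sum([row[1] * row[3] for row in table])
    return total_credits, sum_credits_point / total_credits


# The table that read_table("grades.txt") produces for the sample file.
table = [
    ["01175112", 1, "B+", 3.5],
    ["01204111", 3, "A", 4.0],
    ["01355112", 3, "C+", 2.5],
    ["01417167", 3, "B", 3.0],
]

total_credits, gpa = compute_gpa(table)
print(f"Total credits = {total_credits}")   # Total credits = 10
print(f"GPA = {gpa:.2f}")                   # GPA = 3.20
```

Because compute_gpa() takes the table as a parameter, it can be tested with a hand-written list like this one, independently of file reading and printing.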

Conclusion
Data can be read into a program from a text file instead of being entered by hand, which saves time and reduces user error.
List comprehensions help create new lists in an expressive and concise way.
Tabular data can be represented in Python as a nested list.

Syntax Summary (1)
Open a file and read its contents as a single string:

    open(filename).read()

Open a file and read its contents as a list of strings, one string per line:

    open(filename).read().splitlines()

Split a string s into a list of strings using the specified delimiter:

    s.split(delimiter)

Syntax Summary (2)
Create a list using a list comprehension:

    [expression for item in list]

Create a list using a conditional list comprehension:

    [expression for item in list if condition]
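The summarized syntax can be exercised together in one short script. The sketch below makes a few assumptions that are not in the slides: it writes a temporary sample file so it is self-contained, it uses the with statement (mentioned earlier as the way to close files in real applications) instead of a bare open(), and it uses enumerate() in place of range(len(...)). The helper name read_lines is hypothetical.

```python
import os
import tempfile


def read_lines(filename):
    # Read the file as a list of strings; the with block closes the file.
    with open(filename) as f:
        return f.read().splitlines()


# Create a small sample file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(path, "w") as f:
    f.write("Hello\n\nGood morning\n")

lines = read_lines(path)

# Conditional list comprehension: drop empty lines, then split into words.
words_per_line = [line.split(" ") for line in lines if line != ""]
print(words_per_line)   # [['Hello'], ['Good', 'morning']]

# Print each line with its line number.
for number, line in enumerate(lines, start=1):
    print(f"Line {number}: {line}")
```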

Revision History
September 2016, Intiraporn Mulasatra: prepared slides for files and sorting in C#
October 2017, Chaiporn Jaikaeo: revised for Python
