The purpose of this assignment is to help you learn or review the Python programming language and database programming in Python. If done properly, the assignment also will give you practice at composing modular code.
Make sure you study the COS 333 Policies web page before doing this assignment or any of the course's assignments.
Any university registrar's office must maintain data about courses and classes offered during the upcoming semester. Those data typically are kept in a database. The registrar's office must provide an interface that allows students and other interested parties to query the database.
Pretend that you're working for Princeton's Registrar's Office. You're given a database containing data about courses and classes offered during an upcoming Princeton semester. (Actually the data are for the Spring 2010 semester.) Your task is to compose Python programs that allow Princeton students and other interested parties to query the database.
Subsequent assignments will be similar to this one. So it will be to your advantage to modularize your code so you easily can reuse code from this assignment in subsequent ones.
Browse to the TigerFile page for this assignment. Download the reg1.zip
file. Then unzip that file to create files named reg.sqlite
, ref_regoverviews.pyc
, ref_regdetails.pyc
, testregoverviewsgiven.py
, testregdetailsgiven.py
, and replace.py
. Subsequent sections of this document describe those files.
The database is a SQLite database that is stored in a given file named reg.sqlite
. The database consists of these five tables:
classes
Fields:classid
,courseid
,days
,starttime
,endtime
,bldg
,roomnum
courses
Fields:courseid
,area
,title
,descrip
,prereqs
crosslistings
Fields:courseid
,dept
,coursenum
coursesprofs
Fields:courseid
,profid
profs
Fields:profid
,profname
These are the relationships among the tables:
courses
and classes
:For each row of thecourses
table withcourseid
x, there are 0 or more rows of theclasses
table withcourseid
x. (Some courses don't have a corresponding class; that is, a particular course might not be offered this semester. Some courses have 1 corresponding class. Some courses have more than 1 corresponding class.)
For each row of theclasses
table withcourseid
x, there is exactly 1 row of thecourses
table withcourseid
x. (Each class has exactly one corresponding course.)
courses
and crosslistings
:
For each row of thecourses
table withcourseid
x, there are 1 or more rows of thecrosslistings
table withcourseid
x. (Some courses have 1 department and course number. Some courses have more than 1 department and course number.)
For each row of thecrosslistings
table withcourseid
x, there is exactly one row of thecourses
table withcourseid
x.
courses
and coursesprofs
:
For each row of thecourses
table withcourseid
x, there are 0 or more rows of thecoursesprofs
table withcourseid
x. (Some courses don't have an assigned professor. Some courses have 1 professor. Some courses have more than one professor. Incidentally, I find it strange that Princeton maps professors to courses instead of mapping professors to classes.)
For each row of thecoursesprofs
table withcourseid
x, there is exactly one row of thecourses
table withcourseid
x.
coursesprofs
and profs
:
For each row of thecoursesprofs
table withprofid
x, there is exactly 1 row of theprofs
table withprofid
x. (Each professor has exactly one name!)
For each row of theprofs
table withprofid
x, there are 0 or more rows in thecoursesprofs
table withprofid
x. (Some professors are teaching 0 courses. Some are teaching 1 course. Some are teaching more than one course.)
Your first step must be to familiarize yourself with the database.
regoverviews.py
ProgramYou must compose two programs. The first program must be named regoverviews.py
.
The given file named ref_regoverviews.pyc
contains a reference program in bytecode form. Your regoverviews.py
must have the same behavior as ref_regoverviews.pyc
. That is, when given the same command-line arguments and database, your regoverviews.py
and ref_regoverviews.pyc
must write exactly the same (character-for-character) output to stdout
and stderr
, and must have the same exit status.
Run ref_regoverviews.pyc
to determine the desired behavior of your regoverviews.py
. Here are some highlights of that behavior...
When executed via a command that contains -h as a command-line argument, your regoverviews.py
must display a help message:
$ python regoverviews.py -h usage: regoverviews.py [-h] [-d dept] [-n num] [-a area] [-t title] Registrar application: show overviews of classes options: -h, --help show this help message and exit -d dept show only those classes whose department contains dept -n num show only those classes whose course number contains num -a area show only those classes whose distrib area contains area -t title show only those classes whose course title contains title
argparse
module.
These example commands describe how your regoverviews.py
must behave when given proper command-line arguments. Give them a try using ref_regoverviews.pyc
:
$ python regoverviews.py
Display data for all classes.
$ python regoverviews.py -d COS
Display data for all classes whose department, when converted to lowercase letters, contains "cos"
.
$ python regoverviews.py -n 333
Display data for all classes whose course number, when converted to lowercase letters, contains "333"
.
$ python regoverviews.py -n b
Display data for all classes whose course number, when converted to lowercase letters, contains "b"
.
$ python regoverviews.py -a Qr
Display data for all classes whose distribution area, when converted to lowercase letters, contains "qr"
.
$ python regoverviews.py -t intro
Display data for all classes whose title, when converted to lowercase letters, contains "intro"
.
$ python regoverviews.py -t science
Display data for all classes whose title, when converted to lowercase letters, contains "science"
.
$ python regoverviews.py -t C_S
Display data for all classes whose title, when converted to lowercase letters, contains "c_s"
. The point is that your program, unlike SQL, must not interpret the underscore character as a wildcard character.
$ python regoverviews.py -t c%S
Display data for all classes whose title, when converted to lowercase letters, contains "c%s"
. The point is that your program, unlike SQL, must not interpret the percent character as a wildcard character.
$ python regoverviews.py -d cos -n 3
Display data for all classes whose department when converted to lowercase letters contains"cos"
and whose course number when converted to lowercase letters contains"3"
.
$ python regoverviews.py -d cos -a qr -n 2 -t intro
Display data for all classes whose department when converted to lowercase letters contains"cos"
and whose area when converted to lowercase letters contains"qr"
and whose course number when converted to lowercase letters contains"2"
and whose title when converted to lowercase letters contains"intro"
.
$ python regoverviews.py -t "Independent Study"
Display data for all classes whose title, when converted to lowercase letters, contains "independent study"
.
$ python regoverviews.py -t "Independent Study "
Display data for all classes whose title, when converted to lowercase letters, contains "independent study "
.
$ python regoverviews.py -t "Independent Study "
Display data for all classes whose title, when converted to lowercase letters, contains "independent study "
.
$ python regoverviews.py -t " Independent Study"
Display data for all classes whose title, when converted to lowercase letters, contains " independent study"
.
$ python regoverviews.py -t " Independent Study"
Display data for all classes whose title, when converted to lowercase letters, contains " independent study"
.
$ python regoverviews.py -t=-c
Display data for all classes whose title, when converted to lowercase letters, contains "-c"
.
In short, your regoverviews.py
must display the classid
, dept
, coursenum
, area
, and title
of each class that matches the criteria specified by the user via command-line arguments.
Your regoverviews.py
must use SQL prepared statements to protect the database against SQL injection attacks.
Your regoverviews.py
must write its output in a human-readable format. These are some salient characteristics of the output:
dept
in ascending order, the secondary sort must be by coursenum
in ascending order, and tertiary sort must be by classid
in ascending order.Normally your regoverviews.py
must terminate with exit status 0. If it detects a database-related error, then it must terminate with exit status 1. If it detects erroneous command-line arguments, then it must terminate with exit status 2 — as is the default behavior of argparse
.
%
) operator and C-style conversion specifications to format each row of output. An expression of this form works:
(There are other reasonable approaches; but given your knowledge from COS 217 of the C language and itsrow = '%5s %4s %6s %4s %s' % (classid, dept, coursenum, area, title)
printf()
conversion specifications, that approach is likely to be the most straightforward for you.)textwrap
module to wrap it to lines. Within each row, lines after the first must be indented using 5+1+4+1+6+1+4+1=23 spaces.
SELECT somefields FROM sometable WHERE somefield LIKE '%'
has the same semantics as a SQL statement of this form:
SELECT somefields FROM sometable
That fact could help you to express the logic in your regoverviews.py
more succinctly.
regdetails.py
ProgramYour second program must be named regdetails.py
.
The given file named ref_regdetails.pyc
contains a reference program in bytecode form. Your regdetails.py
must have the same behavior as ref_regdetails.pyc
. That is, when given the same command-line arguments and database, your regdetails.py
and ref_regdetails.pyc
must write exactly the same (character-for-character) output to stdout
and stderr
, and must have the same exit status.
Run ref_regdetails.pyc
to determine the desired behavior of your regdetails.py
. Here are some highlights of that behavior...
When executed via a command that contains -h as a command-line argument, your regdetails.py
must display a help message:
$ python regdetails.py -h usage: regdetails.py [-h] classid Registrar application: show details about a class positional arguments: classid the id of the class whose details should be shown options: -h, --help show this help message and exit
argparse
module.
When executed via a command that contains a valid classid
, your regdetails.py
must write to stdout
the classid
, courseid
, days
, starttime
, endtime
, bldg
, roomnum
, dept(s)
, coursenum(s)
, area
, title
, descrip
, prereqs
, and profname(s)
for the class with the given classid
. These commands illustrate. Give them a try using ref_regdetails.pyc
:
$ python regdetails.py 8321 $ python regdetails.py 9032 $ python regdetails.py 8293 $ python regdetails.py 9977 $ python regdetails.py 9012 $ python regdetails.py 10188
Your regdetails.py
must use SQL prepared statements to protect the database against SQL injection attacks.
Your regdetails.py
must write its data in a human-readable format. These are some salient characteristics of that format:
dept
in ascending order and secondarily by coursenum
in ascending order.profname
in ascending order.Normally your regdetails.py
must terminate with exit status 0. If it detects a database-related error, or if no class has the specified classid
, then it must terminate with exit status 1. If it detects erroneous command-line arguments, then it must terminate with exit status 2 — as is the default behavior of argparse
.
textwrap
module to wrap each row to lines.
Your regoverviews.py
and regdetails.py
must be robust.
Your regoverviews.py
must handle erroneous command-line arguments gracefully. These commands illustrate. Give them a try using ref_regoverviews.pyc
:
$ python regoverviews.py a qr $ python regoverviews.py -A qr $ python regoverviews.py "-a " qr $ python regoverviews.py -a qr st $ python regoverviews.py -a $ python regoverviews.py -a qr -d $ python regoverviews.py -a -d cos $ python regoverviews.py -x
Your regdetails.py
must handle erroneous command-line arguments gracefully. These commands illustrate. Give them a try using ref_regdetails.pyc
:
$ python regdetails.py $ python regdetails.py 8321 9032 $ python regdetails.py abc123 $ python regdetails.py 9034
Your regoverviews.py
and regdetails.py
must handle "database cannot be opened" errors. If the database cannot be opened, then your programs must write a descriptive error message — the one contained within the thrown Exception object — to their stderr
.
Your regoverviews.py
and regdetails.py
also must handle "corrupted database" errors. In the context of this course, a corrupted database is one that causes the database driver, upon interaction with the database, to throw an exception. In the case of SQLite, a corrupted database might consist of an invalid SQLite file — for example, a file that's empty or contains simple text. More generally, a corrupted database might be a database that is missing a table that the interaction requires, or is missing a column/field that the interaction requires. If the database is corrupted such that the SQLite driver's execution of a SELECT
statement throws an exception, then your programs must write a descriptive error message — the one contained within the thrown Exception object — to their stderr
.
Suppose the user runs regdetails.py
for a particular classid
, the program queries the classes
table to find the corresponding courseid
, and the program then queries the database to fetch the row in the courses
table with that courseid
. Furthermore, suppose the database is corrupted such that no row with that courseid
exists in the courses
table. Must our regdetails.py
handle that particular database corruption error?
No. Your grader won't test your regdetails.py
program on that error. Generalizing...
Database management systems can enforce foreign key integrity constraints on the databases that they manage. So any DBMS would not allow its databases to contain that kind of violation. And so client programs would not need to check for such violations.
That's a very good thing. With a richer database, it would be very difficult for a client program systematically to check for foreign key integrity constraint violations. And it would be absurdly redundant for every client program to check for such violations.
So it's more realistic to compose your regoverviews.py
and regdetails.py
such that they don't check for foreign key integrity constraint violations. And so it's fine to compose your programs such that they check for only the kinds of database-related errors that are described in this specification: file-level errors and schema-level errors. The first easily could happen if the reg.sqlite
file is missing from the working directory. The second could happen if, for example, the reg.sqlite
file is present but is missing some table.
Each error message written by your regoverviews.py
must be prefixed with sys.argv[0]
(which is normally but not necessarily "regoverviews.py") followed by a colon and a space. Similarly, each error message written by your regdetails.py
must be prefixed with sys.argv[0]
(which is normally but not necessarily "regdetails.py") followed by a colon and a space.
We'll cover software testing techniques in lectures later in the semester. In the meantime, to test your programs it will be sufficient to rely upon (1) your knowledge of testing from the COS 217 course, and (2) this A Software Testing Taxonomy document.
Test your regoverviews.py
and regdetails.py
by (1) reviewing this assignment specification thoroughly, making sure that your programs conform to every aspect of it, and (2) comparing the behavior of your programs with the behavior of the reference programs.
Perform boundary (alias corner case) testing. Of course, make sure that your programs handle normal data. But also make sure that your programs handle unusual data: courses that have multiple cross-referenced departments/numbers, long titles, long descriptions, multiple professors, no professors, and so forth. Also make sure that your programs handle errors: erroneous command-line arguments, database cannot be opened, corrupted database, and so forth.
Perform statement (alias coverage) testing. Your tests should cause every statement of your regoverviews.py
and regdetails.py
to be executed.
Make a copy of the testregoverviewsgiven.py
file; name the copy testregoverviews.py
. Also make a copy of the testregdetailsgiven.py
file; name the copy testregdetails.py
. Automate your testing using the testregoverviews.py
, testregdetails.py
, and replace.py
programs. First study those programs. Then issue these commands to automate your testing of your regoverviews.py
:
python testregoverviews.py regoverviews.py > out1 2>&1 python testregoverviews.py ref_regoverviews.pyc > out2 2>&1 python replace.py out2 ref_regoverviews.pyc regoverviews.py
The first command runs testregoverviews.py
on your regoverviews.py
, and stores the output in a file named out1
. The second command runs testregoverviews.py
on the given ref_regoverviews.pyc
, and stores the output in a file named out2
. The third command replaces, within out2
, each occurrence of "ref_regoverviews.pyc" (such as might appear at the beginning of error messages) with "regoverviews.py".
Then compare the contents of the out1
and out2
files: on a Mac or Linux computer issue the command diff out1 out2
; on a MS Windows computer issue the command fc out1 out2
. The command should indicate that the contents of out1
and out2
are identical.
Thereafter, edit your testregoverviews.py
to add additional tests, and repeat that sequence
Similarly, issue these commands to automate testing of your regdetails.py
:
python testregdetails.py regdetails.py > out3 2>&1 python testregdetails.py ref_regdetails.pyc > out4 2>&1 python replace.py out4 ref_regdetails.pyc regdetails.py
Then compare the contents of the out3
and out4
files: on a Mac or Linux computer issue the command diff out3 out4
; on a MS Windows computer issue the command fc out3 out4
. The command must indicate that the contents of out3
and out4
are identical.
Thereafter, edit your testregdetails.py
to add additional tests, and repeat that sequence.
diff out1 out2
command, then widen your terminal window and issue the command diff -y out1 out2
. The output will be a side-by-side display of the two files, with markers denoting lines that differ. Same for diff out3 out4
.
diff
or fc
to compare files, you could use a file comparing website such as Diffchecker.
Your programs must be well styled. Generally, by consensus of the Python community, good Python style is defined by the PEP 8 -- Style Guide for Python Code website.
The Python community has developed a static code analysis tool -- a linter tool -- named pylint
. The pylint
tool is similar to the splint
tool that you used in COS 217: just as splint
(configured with a .splintrc
file) generates a report critiquing the style of given C code, so pylint
(configured with a .pylintrc
file) generates a report critiquing the style of given Python code. The pylint
tool enforces many of the PEP 8 guidelines, and some additional guidelines too. In COS 333 "good style" is defined by the pylint
tool, as configured with the given .pylintrc
file. The lecture example programs (with a few intentional or unavoidable exceptions) cause pylint
to generate perfect reports, and your assignment programs should do the same.
Using pylint
is easy. You can critique one file comprising your programs at a time:
python -m pylint regoverviews.py python -m pylint regdetails.py ...
However, a better approach is to critique all files comprising your programs at the same time:
python -m pylint regoverviews.py regdetails.py ...
When given multiple files, pylint
performs some cross-file critiquing. So the "all files at the same time" approach may generate warnings in addition to those generated by the "one file at a time" approach. Make sure you use the "all files at the same time" approach, as your grader will.
The command-line format of regoverviews.py
is incomplete in the logical sense. Most notably the format doesn't allow the user to express OR relationships in queries. For example, the format doesn't allow the user to express queries of the form "Display data for all classes whose area is 'qr' OR 'st'." or "Display data for all classes whose dept is 'cos' OR whose area is 'qr'." Don't be concerned about that. Remember that the purpose of the assignment is not to develop a great registrar's office application per se. Instead its purpose is to give you experience with database programming — experience that, we hope, will help you when developing your project, and beyond.
Modularize your code, at both the file level and the function level, as much as you can. Good modularity will help you to reuse your code from this assignment in the next assignment.
Make sure that the output of your regoverviews.py
has the same format as does the output of ref_regoverviews.pyc
. If it doesn't, then (no doubt) your testing programs will detect many differences between the output of the two programs, and it will be difficult for you to determine how well your regoverviews.py
works. Moreover, if it doesn't, then your regoverviews.py
will fail many of our automated checks, it will be difficult for your grader to determine if how well your regoverviews.py
works, and your grade will suffer. Same for your regdetails.py
.
Compose a readme
file. Your readme
file must contain:
Your readme
file must be a plain text file. Don't create your readme
file using Microsoft Word or any other word processor.
Submit your assignment files using the TigerFile page for this assignment. Make sure you submit regoverviews.py
, regdetails.py
, testregoverviews.py
, testregdetails.py
, any .py
files used by those programs, and readme
.
Assume that your grader already has activated the cos333 virtual environment before he/she runs your programs. The document from the first lecture entitled A COS 333 Computing Environment describes the cos333 virtual environment.
Your grade will be based upon:
pylint
tool, when using the given .pylintrc
file, and when executed via the command python -m pylint *.py
. Ten points (that is, ten percent) of your grade will be based upon the quality of your program style as reported by pylint
. Your grader will start with the 10-point score reported by pylint
. Your grader then will "round down" that score to the 0.5 level to compute your program style grade. For example, if your pylint
score is 9.8, then your program style grade will be 9.5; if your pylint
score is 7.4, then your program style grade will be 7.0. Your grader will not run pylint
on your testregoverviews.py
or testregdetails.py
programs.testregoverviews.py
and testregdetails.py
programs. Two points (that is, two percent) of your grade will be based upon your testregoverviews.py
program. You'll receive two points if your testregoverviews.py
program works and is reasonably thorough, one point if your testregoverviews.py
program works and is minimal, and zero points if your testregoverviews.py
program doesn't work or you didn't submit it. Similarly, two points of your grade will be based upon your testregdetails.py
program.To support your statement testing, you're encouraged (but not required) to use the Python coverage
tool to generate a coverage report showing which lines of your application have and have not been executed by your tests. These are the steps:
python -m coverage run -p regoverviews.py arguments
. Each of those commands runs your regoverviews.py
with the specified arguments, and generates a coverage report in a file named .coverageX
(for some X
). The report indicates which lines of your regoverviews.py
were executed and which were not.python -m coverage run -p regdetails.py argument
. Each of those commands runs your regdetails.py
with the specified argument, and generates a coverage report in a file named .coverageX
(for some X
). The report indicates which lines of your regdetails.py
were executed and which were not.python -m coverage combine
to combine the coverage reports generated by steps 1 and 2 into one large coverage report in a file named .coverage
.python -m coverage html
to use the .coverage
file to generate a human-readable report as a set of HTML documents in a directory named htmlcov
.htmlcov/index.html
to check the report.htmlcov
directory should show that 100% of your programs' lines were executed. If the report shows less than 100% coverage, then revise your testing plan accordingly, delete the .coverage*
files and the htmlcov
directory, and repeat steps 1 through 5.Copyright © 2024 by Robert M. Dondero, Jr.