Due midnight Sunday November 17
This is the last lab. Have fun.
Privacy is much in the news of late, with concerns ranging from identity theft through government surveillance to commercial exploitation of information about our purchases, our interests, our activities, our friends, and everything else. This lab will explore some issues of privacy and access to information.
This is a comparatively open-ended lab, so you may well find ambiguities and fuzzy bits. Don't worry about them, since this is meant to be for exploration rather than precise answers. But please make suggestions to help us improve the lab for next time.
This lab is intended to be more than a Google and Wikipedia exercise; you must cast your net more widely, by using other search engines and other information sources. You will be graded partly on how well you do this, so tell us, for each item, which tools you used and how well they worked. Among the alternative search engines you might try are (in alphabetical order): Bing, Brave, Dogpile, DuckDuckGo, Yahoo, and Yandex. You can also try Baidu; it's in Chinese but there are sites that let you use it in English. Anything else is fine too. In particular, you can use chat systems like ChatGPT, some of which have recently added search capabilities. The same technology is being integrated into regular search engines, of course.
There are also sites that do telephone number and address lookup or that maintain public records; financial sites like Yahoo Finance and Google Finance provide access to holdings and trading by insiders (which is legal under some circumstances); and of course social networks like Facebook, Instagram and LinkedIn reveal a lot about their users. Explore; that's part of the exercise. You might find it more productive to spread the lab over a couple of days so you have time to think about possibilities.
As you go along, we want you to collect your observations and comments in a Google Doc. When you're done, save it as a PDF file and upload it to Gradescope.
Use this Google Doc template so we have some uniformity among the submissions.
Make a copy of this doc now and begin to edit it. In the following, when we ask you to "report," we're looking for a reasonably organized but not too long description. The questions in the text are meant to start you thinking, but need not be answered literally.
We're not going to grade your writing, but you'll leave a better impression if there aren't too many spelling mistakes, flagrant grammar errors, random formatting, and so on. It's ok to summarize with lists rather than complete sentences, but try to distill the essence of what you've seen rather than just copying and pasting.
For this section, you should use at least three sources,
not just Google.
How much can you learn about someone just by searching online
information? For yourself or a member of your family or someone else
close to you, see how much you can discover about that person online.
Examples of the kind of information you might look for include
home address,
telephone number, age, birthday, education, employment,
political contributions,
voter registration,
sports and hobbies, organizations and memberships, price of their home,
names of other family members (like mother's maiden name, for example),
activities and interests. Can you find a picture? Was it one that you
knew about?
It is sometimes possible to get information by searching for a phone
number or street address or social security number. (It's a bad
idea to search for your own SSN!) Do phone numbers or addresses
reveal family names? Is information always consistent?
Can you find a good picture of your home (or a friend's) with maps
from Google, Microsoft or Apple? Which one of these gives the best
image? Can you make out your car or some other possession? How much
might the house be worth? See, for example,
Zillow.
If you visit Zillow, what kinds of addresses does it show you without
being asked? How does Zillow compare to
Trulia? Which one appears
to reveal more information, or are they about the same?
What does your Facebook or Instagram page reveal about you that you find
surprising or worth thinking about?
There's no need to go overboard on this; the goal is definitely not
to invade anyone's privacy, but to get a sense of the accessibility of
information that would have been comparatively private when your parents
were your age.
As we saw in class, the mere act of visiting a web site reveals
information about you. There are a variety of sites that report back to
you about what information your visit reveals, or about what
vulnerabilities your system appears to have. Visit some of these and
see what they tell you.
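To make this concrete, here is a minimal sketch of what a server can read from a single request. The function name `visible_to_server`, the sample IP address, and the sample header values are all made up for illustration; the header names themselves (User-Agent, Accept-Language, Referer, Cookie) are standard HTTP headers that browsers commonly send.

```python
def visible_to_server(remote_ip, headers):
    """Return the identifying details a server can read from one request."""
    interesting = ["User-Agent", "Accept-Language", "Referer", "Cookie"]
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    seen = {"ip_address": remote_ip}
    for name in interesting:
        if name.lower() in lowered:
            seen[name] = lowered[name.lower()]
    return seen


# Hypothetical request, for illustration only.
example = visible_to_server(
    "203.0.113.7",  # an address from a range reserved for documentation
    {
        "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:132.0) "
                      "Gecko/20100101 Firefox/132.0",
        "accept-language": "en-US,en;q=0.9",
        "referer": "https://www.example.com/search?q=privacy",
    },
)
for key, value in example.items():
    print(f"{key}: {value}")
```

Even this short list gives away your approximate location (from the IP address), your browser and operating system, your preferred language, and the page you came from.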
Search for some service or store with several search engines and see
how accurately they geo-locate you. Look for
significant differences in apparent accuracy among Google, Bing and
other search engines.
The specific combination of which browser you use, what fonts you
have available, and a dozen other bits of information can identify you
uniquely, or almost so, to a surprising degree. Visit
Cover Your Tracks and
Am I Unique?
How unique are you? Try it with two different browsers.
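The idea behind fingerprinting can be sketched in a few lines. This is only an illustration under stated assumptions, not how Cover Your Tracks or Am I Unique? are implemented: the attribute names and values are invented, and real fingerprinting scripts collect many more signals. The second function computes "bits of identifying information" in the information-theoretic sense: a trait shared by 1 browser in 1024 carries 10 bits.

```python
import hashlib
import json
import math


def fingerprint(attributes):
    """Hash a dict of browser attributes into a short, stable identifier."""
    # Sorting the keys makes the result independent of insertion order.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def bits_of_identifying_info(browsers_sharing, browsers_total):
    """Surprisal in bits of a trait shared by k out of n browsers."""
    return -math.log2(browsers_sharing / browsers_total)


# Hypothetical attribute sets for two browsers on the same machine.
browser_a = {
    "user_agent": "Firefox/132.0",
    "screen": "1440x900",
    "timezone": "America/New_York",
    "fonts": ["Helvetica", "Menlo", "Times"],
}
browser_b = dict(browser_a, user_agent="Chrome/130.0")

print(fingerprint(browser_a))
print(fingerprint(browser_b))
print(bits_of_identifying_info(1, 1024))  # 10.0
```

Changing a single attribute changes the whole fingerprint, which is one reason the two browsers you try may get quite different results.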
We've talked about how cookies can be used to track what web sites
you visit, especially "third-party" cookies (that is, cookies that come
from someone other than the web site you accessed directly) that
aggregate and correlate information about your visits to apparently
unrelated sites.
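The first-party versus third-party distinction can be sketched as a comparison between the site you visited and the domain that set the cookie. This is a rough approximation: it treats the last two labels of a host name as "the site," which is wrong for suffixes like co.uk. Real browsers consult the Public Suffix List instead. The function names and the tracker domain are made up for the example.

```python
from urllib.parse import urlsplit


def site_of(host):
    """Crude stand-in for a registrable domain: the last two labels."""
    labels = host.lower().lstrip(".").split(".")
    return ".".join(labels[-2:])


def is_third_party(page_url, cookie_domain):
    """True if the cookie's domain belongs to a different site than the page."""
    page_host = urlsplit(page_url).hostname or ""
    return site_of(page_host) != site_of(cookie_domain)


print(is_third_party("https://www.espn.com/nba/", ".espn.com"))            # False
print(is_third_party("https://www.espn.com/nba/", "ads.tracker.example"))  # True
```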
First, how many cookies do you currently have? Record the
rough count, and whether this is before or after you removed cookies
after the lecture about them. The easiest way to find cookies is to use
the browser. In Firefox on a Mac, use Preferences and the Privacy &
Security tab.
In Safari,
Preferences / Privacy.
In Chrome, Preferences / Privacy and Security / Cookies and other site data
/ See all cookies and site data.
Now remove all cookies. Set your browser preferences to allow all
cookies, then visit half a dozen major sites (media, sports and
e-commerce sites are good for this, as are search engines and even
universities). Check how many cookies a typical visit deposits.
For sites that you visit regularly, see whether they deposit
third-party cookies. (In the unlikely
event that your regular sites don't have third-party cookies, you
can try foxnews.com, cnn.com, espn.com, priceline.com and so on.)
Experiment to see whether the third-party blocking mechanism
of your browser works the way you expect it to, by first allowing
such cookies, then removing all, setting up blocking, and revisiting
sites.
The site Blacklight
is a vivid demonstration of how much tracking goes on at any given
website. For example, it reports that on epicurious.com, a popular
cooking site, it found 73 trackers and 170 third-party cookies from 53
different companies [sic], including some that attempt to monitor your
keystrokes and mouse clicks.
Fou Analytics is analogous, but gives a rather
different view of the trackers on a given web page, and sometimes includes
actual dollar values for how much an advertiser will pay when you click
on one of their links or images. Unfortunately, it seems to work only
erratically for me; maybe you will have better luck, but don't waste
much time on it.
Do some exploring with Blacklight, Fou Analytics, and any other tools that you
like, and see what kinds of tracking you are potentially vulnerable
to. Explore some plausible sites that you do or might visit.
(If you turn on defenses like ad blockers, these horror shows won't affect you
nearly as much.)
Private Browsing or Incognito Mode in browsers is a partial solution
to some tracking problems. An incognito window will delete cookies,
history, and most other data that was created while you were browsing
with that window, but only from your own computer. If you did
anything that could identify you at the various servers you visited,
that is still recorded somewhere. And your ISP knows what sites you
visited as well. All that incognito mode really does is remove
the local record of what you did. It doesn't make you invisible or
unidentifiable; it just leaves little trace of your activities on
your own computer (which explains its informal name, "porn mode").
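A toy model may help make the local-versus-remote distinction concrete. The `Server` and `PrivateWindow` classes below are invented for illustration and do not correspond to any real browser code; the point is only that closing the window clears what the browser stored, while the server's own log is untouched.

```python
class Server:
    """Stands in for any web site; it keeps its own log of visits."""

    def __init__(self):
        self.log = []

    def handle(self, client_ip, path):
        self.log.append((client_ip, path))
        # The cookie the site sends back to the browser.
        return {"session": f"id-{len(self.log)}"}


class PrivateWindow:
    """Holds cookies and history in memory only while the window is open."""

    def __init__(self, client_ip):
        self.client_ip = client_ip
        self.cookies = {}
        self.history = []

    def visit(self, server, path):
        self.cookies.update(server.handle(self.client_ip, path))
        self.history.append(path)

    def close(self):
        # Only the local record disappears.
        self.cookies.clear()
        self.history.clear()


site = Server()
window = PrivateWindow("203.0.113.7")  # hypothetical address
window.visit(site, "/news")
window.close()

print(window.cookies, window.history)  # {} []
print(site.log)                        # [('203.0.113.7', '/news')]
```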
In an incognito window, visit some sites that will deposit cookies;
verify that there are cookies. (News, sports and shopping sites are
good.) Close the window, then open a new incognito window and check to
see whether there are any cookies preserved from the last time.
The Tor browser is one of the best tools for maintaining some
anonymity and privacy on the web. It is a modified version of Firefox that uses
encryption and a network of relay computers to ensure that the sites
that you browse to cannot determine your IP address and thus (if you
use it properly) are unable to identify you.
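The relay idea can be sketched as a set of nested envelopes. This toy model deliberately leaves out the encryption: in real Tor each layer is encrypted so that only the intended relay can open it, whereas here the layers are plain dictionaries. The relay names (`guard`, `middle`, `exit`) and the functions are illustrative only. What the sketch does show is that each hop learns only who handed it the envelope, so the destination sees the last relay rather than you.

```python
def wrap(message, destination, relays):
    """Build nested envelopes, innermost first, one per relay.

    The message must be a plain string in this toy model.
    """
    envelope = {"deliver_to": destination, "payload": message}
    for relay in reversed(relays):
        envelope = {"deliver_to": relay, "payload": envelope}
    return envelope


def route(envelope, sender):
    """Pass the envelope along, recording who each hop sees as the sender."""
    seen_by = {}
    while isinstance(envelope, dict):
        hop = envelope["deliver_to"]
        seen_by[hop] = sender
        # This hop peels one layer and becomes the sender for the next hop.
        sender, envelope = hop, envelope["payload"]
    return seen_by


onion = wrap("GET /", "site", ["guard", "middle", "exit"])
print(route(onion, "you"))
# {'guard': 'you', 'middle': 'guard', 'exit': 'middle', 'site': 'exit'}
```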
Download and install the Tor browser if you have not already done
so; you can find it
here.
As we discussed in class, there are ways to limit your risks and the
amount of information that you reveal. Virus checkers are
important, but for ordinary browsing there are plenty of other defenses as well.
Many web sites insist that you provide a working email address
before they will let you register or access some service.
10MinuteMail provides a useful
service: it gives you an email address that's valid for 10 minutes
and shows you whatever mail arrives during that time; that lets you
retrieve the registration key or whatever, without giving away a real
address. Two alternatives are
Mailinator and
Yopmail, which lets you invent your own
email address, and retains mail for that address for a week.
Try a couple of these services. Determine how long it takes for mail to
arrive and how long it persists. (I've had the best luck with Mailinator
but your mileage may vary.)
Check your own environment. For your regular browser record your
default settings for cookies, filename extensions, JavaScript,
popups, automatic updates, downloading, software installation, programs
that start automatically, etc. If your mail reader provides a previewer
that interprets HTML and thus is subject to web bugs, try sending
yourself mail with a reference to an image in your public_html
directory, i.e., http://your_netid.mycpanel.princeton.edu,
to see whether the image is retrieved and displayed.
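If you'd rather build the test message with a program than by hand, the sketch below uses Python's standard `email` package. It only constructs the message and prints it; sending is left to you. The sender and recipient addresses and the file name `pixel.png` are placeholders, so substitute your own address and an image that actually exists in your public_html directory.

```python
from email.message import EmailMessage


def build_test_message(sender, recipient, image_url):
    """Build an HTML email whose body references a remote image."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Web bug test"
    # Plain-text fallback for mail readers that do not render HTML.
    msg.set_content("Plain-text fallback: no image is referenced here.")
    # A mail reader that fetches this image reveals that the mail was opened.
    msg.add_alternative(
        "<html><body><p>Web bug test.</p>"
        f'<img src="{image_url}" alt="test image">'
        "</body></html>",
        subtype="html",
    )
    return msg


message = build_test_message(
    "your_netid@princeton.edu",  # placeholder address
    "your_netid@princeton.edu",  # placeholder address
    "http://your_netid.mycpanel.princeton.edu/pixel.png",  # hypothetical file
)
print(message.as_string())
```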
Check what plug-ins and add-ons are already installed in your
browser. Among those you might consider adding are Adblock Plus, uBlock
Origin, NoScript, Privacy Badger, and Ghostery; each reduces your exposure to
various kinds of tracking and potentially harmful content. As a bare minimum,
you should run Ghostery and either Adblock Plus or uBlock Origin.
Install
Ghostery,
which works in most browsers. This extension detects and disables
JavaScript trackers, which would otherwise report your page visits and
activities to advertising aggregators. Determine how many trackers
Ghostery reports that it is blocking. Visit some sites to see how many
trackers are in use. Try to find the highest number possible; there
might even be a small and worthless prize for the person who finds the
worst offender.
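For a rough sense of what a tracker detector looks at, the sketch below lists the third-party hosts that a page loads scripts, images, and iframes from. This is only a heuristic and not how Ghostery works: real blockers rely on curated lists of known trackers, and many third-party hosts (such as content delivery networks) are not trackers at all. The class and function names and the sample HTML are invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class ResourceCollector(HTMLParser):
    """Collect the host names of scripts, images, and iframes in a page."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "img", "iframe"):
            src = dict(attrs).get("src")
            # Relative URLs have no host name and count as first-party.
            host = urlsplit(src).hostname if src else None
            if host:
                self.hosts.add(host.lower())


def third_party_hosts(page_host, html):
    """Return the sorted hosts that are not part of the page's own site."""
    collector = ResourceCollector()
    collector.feed(html)
    # Crude notion of "site": the last two labels of the page's host name.
    own_site = ".".join(page_host.lower().split(".")[-2:])
    return sorted(
        h for h in collector.hosts
        if h != own_site and not h.endswith("." + own_site)
    )


sample = """
<html><body>
<script src="/static/app.js"></script>
<script src="https://cdn.tracker-one.example/t.js"></script>
<img src="https://pixel.tracker-two.example/p.gif" width="1" height="1">
<img src="https://images.news.example/logo.png">
</body></html>
"""
print(third_party_hosts("www.news.example", sample))
# ['cdn.tracker-one.example', 'pixel.tracker-two.example']
```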
Reconsider your privacy settings on sites like Facebook, Instagram,
TikTok, and so on. Bear in mind that most of your information is readily
available on social networks like Instagram and WhatsApp (owned by
Facebook), Snapchat, Twitter, and LinkedIn (owned by Microsoft).
Finally, if you saw anything interesting or suspicious that we didn't
ask about specifically, or if you have any thoughts on how to improve this
lab, we'd like to hear them. There are a couple of wrapup questions in
the template that address this.
Thanks.
When you're all done, convert your Google Doc to lab8.pdf
and upload it to Gradescope. No need to put anything on cPanel.
Part 1: Personal Information
Part 2: What Else Do They Know About You?
Part 3: Cookie Crumbs
Part 4: Tracking the Trackers
Part 5: Defenses and Countermeasures (1)
Part 6: Defenses and Countermeasures (2)
Submitting your Work