You must complete the first assignment to do a project in this track.
You may complete assignment 1 and then later change your mind about
which project you will in fact provide as your term project, provided
your final decision occurs prior to the second project assignment
and is approved by the instructor.
See the course schedule.
If you perform a
Google search using robots.txt get, you will see lots of web pages,
many about how to write robots.txt files, but among them will be actual
web logs as well. An example of a weblog appears at
https://www.ursaoutdoors.com/logs/access.log.
Below are the first few lines of the log.
216.39.48.161 - - [18/Oct/2003:11:54:26 -0700] "GET /robots.txt HTTP/1.1" 404 347 "-" "Scooter/3.2"
216.39.48.161 - - [18/Oct/2003:11:54:29 -0700] "GET /logs/ HTTP/1.1" 200 1367 "-" "Scooter/3.2"
152.163.253.69 - - [18/Oct/2003:18:49:33 -0700] "GET /rainforestgentle2.html HTTP/1.0" 404 343 "https://www.google.com/search?q=free+brids+or+quaker+hl=en&lr=&ie=UTF-8&start=10&sa=N" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 9.0; Windows NT 5.1; Hotbar 4.3.5.0)"
The first line informs us that the machine whose IP address was 216.39.48.161 visited
this site on October 18, 2003.
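To make the fields of such a line concrete, here is a minimal Python sketch that splits one entry into named parts. It assumes the log uses the common "combined" layout seen in the sample lines (host, identity, user, timestamp, request, status, size, referer, user agent). The names `LOG_PATTERN` and `parse_log_line` are illustrative, not part of any assignment requirement, and other logs may use a different layout.

```python
import re
from datetime import datetime

# Assumed layout: host ident user [time] "request" status size "referer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match the layout."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # %z parses numeric UTC offsets such as -0700
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    return fields

line = ('216.39.48.161 - - [18/Oct/2003:11:54:26 -0700] '
        '"GET /robots.txt HTTP/1.1" 404 347 "-" "Scooter/3.2"')
entry = parse_log_line(line)
print(entry["host"], entry["time"].date(), entry["request"], entry["status"])
```

The first field (`host`) is the visiting machine's IP address, and the bracketed timestamp gives the date of the visit.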
In this assignment, you will locate on-line weblogs and report information about IP addresses.
Below are the steps you must perform.
- Perform a manual Google search on robots.txt get and examine the
retrieved pages. Pay particular attention to the webpages that are actual weblogs.
Select 3 of the weblogs for your use. Each weblog should contain at least 30
distinct IP addresses.
These weblogs will be your sample. Make a local copy of the information
contained in your sample and
record the URL of each of the 3 weblogs.
- Research available network tools, such as reverse nslookup and
network connection and allocation information. See what tools are available
to tell you information about an IP address based on routing and assignment information.
Summarize your findings by constructing a chart that shows, for each tool,
what kind of information it accepts as input and which fields it provides as output.
- For 30 distinct IP addresses found in your sample, report the information found in step 2. Note the order in which you may have to perform look-ups and searches.
Record the completeness of the information.
Write a 2-3 page report of your findings. Provide a summary discussion about
the usefulness of the results you have found and identify any privacy concerns you may have. Include as an appendix the information for each of the 30 IP addresses
you tested.
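As a starting point for the steps above, the following Python sketch shows one way to check how many distinct IP addresses a saved weblog contains and to try a reverse name lookup on an address. It is only a sketch under stated assumptions: the function names are illustrative, the IP address is assumed to be the first whitespace-separated field of each log line, and `reverse_lookup` depends on network access and on the address owner having registered a name, so it may well return nothing.

```python
import ipaddress
import socket

def distinct_ips(log_lines):
    """Collect the distinct, syntactically valid IP addresses in the first field."""
    seen = set()
    for line in log_lines:
        parts = line.split(maxsplit=1)
        if not parts:
            continue  # blank line
        try:
            seen.add(str(ipaddress.ip_address(parts[0])))
        except ValueError:
            pass  # first field is a host name or other text, not an IP address
    return sorted(seen)

def reverse_query_name(ip):
    """Name that a reverse DNS (PTR) query asks about for this address."""
    return ipaddress.ip_address(ip).reverse_pointer

def reverse_lookup(ip):
    """Return the host name registered for ip, or None if none is found."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:  # covers socket.herror and socket.gaierror
        return None

sample = [
    '216.39.48.161 - - [18/Oct/2003:11:54:26 -0700] "GET /robots.txt HTTP/1.1" 404 347 "-" "Scooter/3.2"',
    '216.39.48.161 - - [18/Oct/2003:11:54:29 -0700] "GET /logs/ HTTP/1.1" 200 1367 "-" "Scooter/3.2"',
]
for ip in distinct_ips(sample):
    print(ip, reverse_query_name(ip))
```

Calling `reverse_lookup(ip)` inside the loop would add the registered host name when one exists. A result of `None` is itself worth recording when you note the completeness of the information.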
Below is an overview of steps for this project.
- Sometimes an email address can be associated with an IP address if the user
has posted an email message. To uncover such information, write a program
to search the web on an IP address, and identify whether the IP address (or its name)
appears in an email header of an email message that is on-line. If so,
return the email message content.
- Write a program that given an IP address automatically
queries the tools you found in Assignment 1 to provide information on the
given IP address. In this step, you are automating the task you performed
manually in Assignment 1.
- Enhance your program by combining the program from step 2 (immediately above)
with the one from step 1.
Given an IP address,
your enhanced program should report various network information about the IP address
as well as any associated email postings.
- Run your enhanced program on the 30 IP addresses
you used in Assignment 1. Record your results and compare them to your manual findings. Did you learn any new information?
Run your enhanced program on other IP addresses in your sample.
Report your results. Note how much and
what kind of information was found on each IP address.
- Graduate credit. If you are enrolled for graduate credit, enhance your
program further by allowing it to operate on a weblog rather than a single IP address.
This will allow you to check many IP addresses quickly because you can fetch
the URLs of numerous weblogs and see how many IP addresses give you interesting information. Be sure to use distinct IP addresses. People tend to visit the same
sites regularly, so you don't want to count the same IP address multiple times.
Record how much information is typically gained. See if you can make more
generalized statements as supported by your findings.
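For the email step in the list above, one piece that can be sketched independently of any search engine is the check itself: given the raw text of a message that has already been retrieved, does the IP address appear in one of its Received headers? The Python sketch below illustrates that check with the standard `email` module. The function name and the sample message are made up for illustration (only the IP address comes from the sample log), and how candidate pages are found and fetched is deliberately left out.

```python
import email
import re
from email import policy

def content_if_ip_in_headers(raw_message, ip):
    """Return the message text if ip occurs in a Received header, else None."""
    message = email.message_from_string(raw_message, policy=policy.default)
    # Lookarounds keep 1.2.3.4 from matching inside a longer address like 11.2.3.45.
    pattern = re.compile(
        r'(?<!\d)(?<!\d\.)' + re.escape(ip) + r'(?!\d)(?!\.\d)'
    )
    received = message.get_all("Received", [])
    if not any(pattern.search(str(value)) for value in received):
        return None
    body = message.get_body(preferencelist=("plain", "html"))
    return body.get_content() if body is not None else None

# Hypothetical message, invented for this example.
SAMPLE_MESSAGE = (
    "Received: from mail.example.org (mail.example.org [152.163.253.69])\n"
    "\tby mx.example.net; Sat, 18 Oct 2003 18:50:00 -0700\n"
    "From: someone@example.org\n"
    "Subject: hypothetical test message\n"
    "\n"
    "Hello from a made-up message.\n"
)

print(content_if_ip_in_headers(SAMPLE_MESSAGE, "152.163.253.69"))
print(content_if_ip_in_headers(SAMPLE_MESSAGE, "52.163.253.6"))
```

Matching on whole addresses matters here: a plain substring test would report a hit for 52.163.253.6 in the sample, even though that address never appears in the headers.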
Assignment 2.
Report on your results from steps 1, immediately above.
Describe your algorithmic design and report what kind of information
is available by email and what can be learned about people because
their IP address is included.
Final report. Complete the steps above and gather findings.
Be selective in what information you present. You may elect to report on other
aspects, not necessarily those listed above.
Write a final report for the project and prepare and conduct
an in-person poster presentation of your work.
Graduate credit: If you are taking this course for graduate credit, you must
write a conference-style paper on your work, rather than a report.