CS2911
Network Protocols

Resources

You can work the entire lab (except the demo) as a prelab. Since your demo is due during the lab period, I recommend completing the code and working out nearly all the bugs before the period starts. Please work in teams of two unless approved by the instructor. Please submit only one report per team.

Another advantage of getting the lab done before the lab period is that you can get an early start on Lab 7 as well.

Introduction

The goal of this lab is to write a Python program to request and save a web resource, acting as an HTTP client. You will write code from scratch, sending and receiving bytes over a TCP connection rather than using a prebuilt HTTP library.

You will start from the template httpclient.py.

The program has at least the following functions; you will add others.

main()

This method is provided in its entirety in the template.

The provided main function will perform basic tests. You may add others. No user input is needed.

  • This method has no arguments
  • This method is invoked with a main() function call at the end of the program.

get_http_resource(url,file_name)

This method is provided in its entirety in the template.

Using HTTP, request a web resource and store the returned data in the specified file.

  • Arguments:
    • url: string containing URL (including the "http://" protocol declaration and the domain name) for desired resource
    • file_name: string containing name of file in which to store response data
  • Return value:
    • None

make_http_request(host, port, resource, file_name)

Using HTTP, request a web resource and store the returned payload data in the specified file.

  • Arguments:
    • host: bytes object with the ASCII domain name or IP address of the server machine (i.e., host) to connect to
    • port: port number to connect to on host
    • resource: bytes object with the ASCII path/name of resource to get. This is everything in the URL after the domain name, including the first /.
    • file_name: string (str) name of file in which to store the retrieved resource
  • Return value:
    • Integer status code given by server
  • Operation:
    • Send an HTTP request to the given host and port to retrieve the specified resource.
      • The client should recognize and interpret both Content-Length and chunked responses
      • While you need to implement chunking, you do not need to implement the chunk extensions described in the RFC (and the videos)
    • If successful, store the response data in a file with the specified file_name.
    • Return the status code given by the server
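
As a rough illustration of the message formats involved, the sketch below builds a minimal GET request and picks apart a status line and a header line. The helper names (build_get_request, parse_status_code, parse_header_line, describe_body) are hypothetical and are not part of the template. The sketch also assumes the response carries either a Content-Length header or "Transfer-Encoding: chunked". Your team's design may divide the work differently.

```python
def build_get_request(host, resource):
    """Return the bytes of a minimal HTTP/1.1 GET request (hypothetical helper)."""
    request_line = b'GET ' + resource + b' HTTP/1.1\r\n'
    host_header = b'Host: ' + host + b'\r\n'
    return request_line + host_header + b'\r\n'  # blank line ends the headers


def parse_status_code(status_line):
    """Extract the integer status code from a line such as b'HTTP/1.1 200 OK'."""
    return int(status_line.split(b' ')[1])


def parse_header_line(line):
    """Split b'Name: value' into a (lower-case name, value) pair of bytes objects."""
    name, _, value = line.partition(b':')
    return name.strip().lower(), value.strip()


def describe_body(headers):
    """Decide how the body is delimited, given a dict of parsed headers.

    Assumes one of the two headers is present, as the lab requires.
    """
    if b'chunked' in headers.get(b'transfer-encoding', b'').lower():
        return ('chunked', None)
    return ('length', int(headers[b'content-length']))
```

For example, build_get_request(b'example.com', b'/index.html') produces b'GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n'. Header names are lower-cased here because HTTP header names are case-insensitive.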

It is not necessary to handle or report errors that the server hypothetically could make by not following the protocol. (Dr. Sebern's website follows the protocol correctly.)

Procedure

  1. Work through the design steps from Lab 4
  2. Download the skeleton Python template: httpclient.py
  3. Edit the header of the file to include your team members' names.
  4. Complete the make_http_request method to request, receive, and store the designated resource. You should add other helper methods, but do not change the code provided in the template. As a team, design the methods and the data passed between them. Then divide your efforts, with each team member writing at least one method. Document who wrote each method with a Sphinx tag starting with :author:

If this base functionality turns out to be too easy, you may experiment with adding additional functions, but be sure the basic requirements are still met.

Divide up the primary responsibility for parts of the program in an equitable way.

Document each method following the coding standard.

You may use the next_byte() method from Lab 5. Whether or not you use next_byte(), your program should use recv() correctly: it should handle the situation in which recv() returns fewer bytes than requested during normal transmission. It is usually best to just use next_byte().
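
If you choose not to use next_byte(), one way to cope with short reads is to loop until the requested number of bytes has arrived. The following is only a sketch. The name recv_exactly is hypothetical, and sock is assumed to be an already connected TCP socket.

```python
import socket


def recv_exactly(sock, count):
    """Call recv() repeatedly until exactly `count` bytes have been received."""
    data = b''
    while len(data) < count:
        # Ask only for what is still missing, so we never read past `count`.
        piece = sock.recv(count - len(data))
        if piece == b'':
            # recv() returns an empty bytes object when the peer has closed.
            raise ConnectionError('connection closed before all bytes arrived')
        data += piece
    return data
```

Because each recv() call asks only for the bytes still missing, this loop also never consumes data that belongs to whatever follows in the stream.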

You do not need to implement a persistent connection. The server will send only one response unless you explicitly request a persistent connection. Nevertheless, your code should be extensible to the persistent case. In other words, when you are reading the message, the program should not attempt to read any bytes past the end of the message, to avoid reading bytes from the second response.
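
To make the "do not read past the end" requirement concrete, here is a minimal sketch of reading a chunked body one byte at a time. The helper read_one_byte is a hypothetical stand-in for next_byte() from Lab 5 (whose exact signature may differ), and read_line and read_chunked_body are likewise illustrative names. Chunk extensions are not handled, as permitted above.

```python
import socket


def read_one_byte(sock):
    """Hypothetical stand-in for next_byte(): return exactly one byte from sock."""
    byte = sock.recv(1)
    if byte == b'':
        raise ConnectionError('connection closed unexpectedly')
    return byte


def read_line(sock):
    """Read up to and including CRLF; return the line without the CRLF."""
    line = b''
    while not line.endswith(b'\r\n'):
        line += read_one_byte(sock)
    return line[:-2]


def read_chunked_body(sock):
    """Read a chunked body and stop exactly at the end of the message."""
    body = b''
    while True:
        size = int(read_line(sock), 16)  # chunk size is written in hexadecimal
        if size == 0:
            break
        body += b''.join(read_one_byte(sock) for _ in range(size))
        read_line(sock)  # consume the CRLF that follows the chunk data
    # Skip any trailer lines; an empty line marks the end of the message.
    while read_line(sock) != b'':
        pass
    return body
```

Reading a byte at a time is slow but makes it easy to stop exactly at the final CRLF, so any bytes from a second response would remain unread in the socket.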

Because a student has asked me to state this explicitly: You should not use HTTP libraries when implementing this lab. The purpose of this lab is for your team to write the library from scratch! (International students: This means, like in cooking, to make something from raw ingredients without any pre-made components.)

To test your files, compare the contents of index.html with what you get by right-clicking on the page seprof.sebern.com and selecting "View page source." Your image is probably correct if it displays correctly in IntelliJ.

Have fun! Ask your instructor if you have any questions!