CSE 265: System and Network Administration

Lab #7

Today we continue with networking. In this lab we will configure Apache to handle multiple domains, and install a simple Perl script to support Web chat.

  1. WWW basics.

    • A URL consists of several parts, some of them optional. For example, http://www.apache.org:80/foundation/faq.html contains the protocol (http), the host (www.apache.org), the port (80), and a path (/foundation/faq.html). If the port is not specified, the protocol's default is assumed (80 for http). If a path is not specified, the root path (/) is assumed; in practice the browser fills in the root path when it builds the request.

      Other protocols are permitted in a URL, such as https (secure HTTP access) and ftp (access to a file on an FTP server). Not every browser recognizes every protocol, but such URLs are still valid under the URL specification.
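
      As a rough illustration of the URL anatomy described above, the following Python sketch splits a URL with the standard urllib.parse module and fills in the defaults. The DEFAULT_PORTS table and the url_parts helper are names invented for this example, and the table is not exhaustive.

```python
from urllib.parse import urlsplit

# Default ports for a few common protocols (illustrative, not exhaustive).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def url_parts(url):
    """Split a URL into (protocol, host, port, path), filling in defaults."""
    parts = urlsplit(url)
    # Use the explicit port if one was given, otherwise the protocol default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    # An empty path means the root path.
    path = parts.path or "/"
    return parts.scheme, parts.hostname, port, path


if __name__ == "__main__":
    print(url_parts("http://www.apache.org:80/foundation/faq.html"))
    print(url_parts("http://www.apache.org"))
```

      Both calls report port 80, and the second reports the path / even though the URL did not include one.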

    • A web server is a program that waits for requests that instruct it to serve or generate files. It typically listens on port 80, but can run on other ports as well (which is why a URL can include the port number). A web server accepts requests and sends responses using HTTP (HyperText Transfer Protocol). Here is a sample request:
        GET /somedir/page.html HTTP/1.1
        Host: www.someschool.edu
        User-agent: Mozilla/4.0
        Connection: close
      And a sample response:
        HTTP/1.1 200 OK
        Date: Thu, 06 Aug 1998 12:00:15 GMT
        Server: Apache/1.3.0 (Unix)
        Last-Modified: Mon, 22 Jun 1998 ...
        Content-Length: 6821
        Content-Type: text/html
      Strictly speaking, the lines above are only the headers of the request and the response. Either message may also carry a body, which is the payload. In this response, the body would be the 6821 bytes of the file being served. In a request, the body might hold the contents of a form submission or a file upload.

      A request of this kind is made for every separate object on a page: the HTML of the page itself, every image and advertisement, and often JavaScript files, CSS files, and AJAX activity.

      You can see all the activity by running wireshark while visiting a web page like Google Maps.
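
      To make the header/body split concrete, here is a small Python sketch that takes a raw response apart at the blank line that ends the headers. The parse_response helper is a name invented for this example, and the sample message is shortened (a 5-byte body rather than the 6821 bytes above) so that it fits on the page.

```python
def parse_response(raw):
    """Split raw HTTP response bytes into (version, status, reason, headers, body)."""
    # A blank line (CR LF CR LF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # The first line is the status line, e.g. "HTTP/1.1 200 OK".
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers, body


# Shortened sample response (assumption: a 5-byte body for illustration).
SAMPLE = (b"HTTP/1.1 200 OK\r\n"
          b"Server: Apache/1.3.0 (Unix)\r\n"
          b"Content-Length: 5\r\n"
          b"Content-Type: text/html\r\n"
          b"\r\n"
          b"hello")

if __name__ == "__main__":
    version, status, reason, headers, body = parse_response(SAMPLE)
    print(status, reason, headers["content-type"], len(body))
```

      Note that the Content-Length header matches the number of bytes in the body.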

    • Typically, a web server is only able to serve existing files; dynamically generated content usually requires additional programs or services. The Common Gateway Interface (CGI) provides one way to generate content dynamically: the server executes a user program and returns that program's output as the response. Other methods are also possible. Note that CGI programs are often security risks, because they usually process input from a web interface and are often not designed to handle that input securely (e.g., student code :-).
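
      As a sketch of what a CGI program does, the Python example below reads the query string that the server passes in the QUERY_STRING environment variable and writes a header line, a blank line, and an HTML body. The "name" parameter and the cgi_response helper are assumptions made for this example. The user input is escaped before it is echoed back, which addresses one common way that CGI programs mishandle input.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def cgi_response(query_string):
    """Build the complete CGI output (header, blank line, body) for a query string."""
    params = parse_qs(query_string)
    # "name" is a hypothetical form field; fall back to a default if absent.
    name = params.get("name", ["world"])[0]
    # Escape user input so it cannot inject markup into the page.
    body = "<html><body><p>Hello, {}!</p></body></html>\n".format(html.escape(name))
    # A CGI program must emit at least a Content-Type header, then a blank line.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # The web server supplies the query string through the environment.
    sys.stdout.write(cgi_response(os.environ.get("QUERY_STRING", "")))
```

      Run by hand with QUERY_STRING unset, the script greets "world"; run by the web server, it greets whatever name appears in the URL's query string.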

    • Apache is the most popular web server technology (Netcraft, February 2012). Apache 2 (included in modern OS releases):
      • Supports multi-process and multi-threaded operation
      • Supports SSL/TLS encryption
      • Supports proxy operation
      • Supports virtual hosting
      • Supports syslog logging, but typically doesn't use it
      See http://httpd.apache.org/ for documentation. Primary configuration files are in /etc/httpd/conf/. Skim through the extensive httpd.conf file as we will be editing it later. Note the use of modules to turn on or off various behaviors. The administrator can also configure the number of processes to use, number of requests to be served by each child process, where the DocumentRoot is, what format to use for generating logs, and much more.

  2. Configuring Apache.

    In our installation, Apache has been installed as /usr/sbin/httpd with configuration and logging in /etc/httpd. First, start Apache by modifying the services on your machine as we did in the last lab (e.g., in System->Administration->Services) to start httpd and make it start at boot. Open a web browser, and visit http://localhost/. You should see a test page for Apache. Now you know your httpd server is running.

    Now let's create some content. The default directory for Apache's content is /var/www/html/. Note that it is empty! The test page you saw was generated on demand by a module (see the info on the test page for details). If you create an index.html file (the default file that is served when a URL names only a directory), it will show up instead of the test page.

    Create two subdirectories (choose your own names, such as alpha and beta). Within each subdirectory, create new index.html files (we want to have two different files here). You might visit some website you like and use the browser's Save As menu to make a copy of the HTML.

    Create two hostnames that map to your localhost. You can do this by modifying (if necessary) the zone files you created in the last lab, or by just adding two new hosts to your /etc/hosts file and having them both map to 127.0.0.1. To test this, you should be able to use these hostnames in your web browser and still get the Apache/Fedora test page.
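
    If you take the /etc/hosts route, each entry is one line: an address followed by one or more names. The Python sketch below shows two such lines and a simplified lookup over text in that format. The hostnames alpha.example.test and beta.example.test and the lookup helper are hypothetical, and the real resolver does more than this.

```python
def lookup(hosts_text, name):
    """Return the address that hosts-file text maps name to, or None."""
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split on whitespace.
        fields = line.split("#", 1)[0].split()
        # The first field is the address; the rest are names for it.
        if len(fields) >= 2 and name in fields[1:]:
            return fields[0]
    return None


# Sample hosts-file content (the two site names are hypothetical).
HOSTS = """\
127.0.0.1   localhost localhost.localdomain
127.0.0.1   alpha.example.test   # first virtual site
127.0.0.1   beta.example.test    # second virtual site
"""

if __name__ == "__main__":
    print(lookup(HOSTS, "alpha.example.test"))
    print(lookup(HOSTS, "beta.example.test"))
```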

    Now we can create virtual Web sites for each of these hostnames. Near the end of Apache's long configuration file is a section called VirtualHost. Enable virtual hosting by uncommenting the NameVirtualHost directive (just above the sample) and specifying the IP address to which it applies (use the one you used in your zone files or /etc/hosts file above). Copy the sample VirtualHost entry twice and modify the copies to match the two hostnames and two directories that you created above. Restart Apache, and your Web browser should now show the host-specific index.html home pages rather than the test page. If a request carries no Host: header, or its Host: value does not match any VirtualHost entry, the Web server serves it from the first VirtualHost listed for that IP address. So if you want a default host other than your first virtual host, you'll need to create a third entry and place it first.
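
    The Python sketch below may help in two ways. It prints roughly what the finished entries could look like, using only the NameVirtualHost, VirtualHost, ServerName, and DocumentRoot directives, and it mimics how a name-based server picks a site from the Host: header. The hostnames, the directories, and the helper names are assumptions for illustration; the sample entry in your httpd.conf contains more directives than shown here, and Apache's real matching has more cases (such as ServerAlias).

```python
# Hypothetical (hostname, directory) pairs; substitute your own choices.
VHOSTS = [
    ("alpha.example.test", "/var/www/html/alpha"),
    ("beta.example.test", "/var/www/html/beta"),
]


def render_vhost(ip, server_name, doc_root):
    """Return the text of a minimal VirtualHost entry."""
    return ("<VirtualHost {ip}:80>\n"
            "    ServerName {name}\n"
            "    DocumentRoot {root}\n"
            "</VirtualHost>\n").format(ip=ip, name=server_name, root=doc_root)


def select_docroot(host_header, vhosts=VHOSTS):
    """Pick a directory by Host: header, falling back to the first entry."""
    if host_header:
        # Ignore any :port suffix and compare case-insensitively.
        name = host_header.split(":", 1)[0].lower()
        for server_name, doc_root in vhosts:
            if server_name == name:
                return doc_root
    # No header, or no match: the first entry acts as the default.
    return vhosts[0][1]


if __name__ == "__main__":
    print("NameVirtualHost 127.0.0.1:80")
    for name, root in VHOSTS:
        print(render_vhost("127.0.0.1", name, root))
    print(select_docroot("beta.example.test"))
    print(select_docroot("unknown.example.test"))
```

    The last line shows the fallback: an unknown hostname is served from the first entry, which is why a separate default entry has to go first.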

  3. Add a Perl CGI application.

    Fortunately for this exercise, Apache is already configured to support CGI scripts. The default directory for them is /var/www/cgi-bin/. As a demonstration of using a Perl script for CGI, we will use a simple chat application called EveryChat. Check out the local installation at http://wume.cse.lehigh.edu/~brian/chat/chatframes.html so that you know what to expect.

    Download this free (but obsolete) application from http://www.everysoft.com/ and use unzip to extract the contents of the zip archive. Take a look at the readme.txt file: it has instructions on how to install EveryChat, and those instructions include editing multiple files. This package was written on a Windows machine, so all of the files use DOS-style end-of-line markers (a carriage return followed by a newline, ASCII 13 then ASCII 10) rather than the UNIX end-of-line marker (a single newline, ASCII 10). For text and HTML this might not matter, but it sometimes matters for scripts. If you have problems running the Perl script, you may need to translate the format. I used the dos2unix utility; there are likely other methods. The package documentation does not mention this (potential) problem.
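
    One of those other methods is a few lines of Python, sketched below. A common symptom of the problem is a script whose first line ends in a carriage return, so the system looks for an interpreter named "perl" followed by that stray character and fails. The crlf_to_lf and convert_file helpers are names invented for this example; convert_file rewrites the file in place, so try it on a copy first.

```python
def crlf_to_lf(data):
    """Replace DOS line endings (CR LF) with UNIX line endings (LF) in bytes."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path):
    """Rewrite the file at path in place with UNIX line endings."""
    # Binary mode keeps Python from translating line endings itself.
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(crlf_to_lf(data))


if __name__ == "__main__":
    sample = b"#!/usr/bin/perl\r\nprint \"hi\";\r\n"
    print(crlf_to_lf(sample))
```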

  4. Wrapping Up

    In order to sign the lab completion sheet, you will need to:
    1. show me the two web pages you created using virtual hosting.
    2. show me your Perl chat site in action.


This page can be reached from http://www.cse.lehigh.edu/~brian/course/2014/sysadmin/labs/
Last revised: 30 September 2014.