Decommissioning AFS-based websites

Tags Web afs

Summary

Some units (centers, departments, institutes, and programs) still have old AFS-based websites. Although these sites are technically not the responsibility of LSA Technology Services (either the Infrastructure or Web teams), we've put together instructions to help units decommission them, for example once they've transitioned to one of the LSA-supported web services. (Instructions on decommissioning webapps-based websites are also available.)

 

Environment

AFS-hosted websites.

 

Directions

Identifying who can make changes

We usually don't have administrative access rights to the unit's AFS space. To identify who does, perform the following steps:

  1. Log into the login.itd.umich.edu login servers, or another AFS-capable Unix/Linux box.
     
  2. Change to the unit's HTML directory: cd ~unit/Public/html

    where unit is their URI component. For example, for IPCAA, the unit is ipcaa and the command would be cd ~ipcaa/Public/html.
     
  3. Determine what users or groups have permissions: fs la .
     
  4. In the resulting output, identify any users or groups with full permissions, rlidwka. See the OpenAFS documentation for details on what each permission letter stands for if you're so inclined.
     
  5. For each group identified in Step 4, determine its membership: pts mem groupname

One of the users — either with direct permissions from Step 4 or via a group from Step 5 — needs to be the one to make the changes. Alternatively, someone with administrative privileges (the "a" bit) can add you to a group with the pts adduser -user uniqname -group groupname command and then you can do the work.
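
Step 4 can be partly automated by filtering the fs la output. The following is a minimal sketch, not a supported tool: the function name full_acl_entries and the sample ACL entries are hypothetical, and it assumes the usual fs la layout of indented "name rights" pairs.

```shell
#!/bin/sh
# Print every user or group that holds the full "rlidwka" rights,
# reading "fs la" output on standard input.
full_acl_entries() {
    # ACL entries are "name rights" pairs; the header lines have a
    # different second field, so they never match.
    awk '$2 == "rlidwka" { print $1 }'
}

# Illustrative sample only; these names are hypothetical.
sample_acl='Access list for . is
Normal rights:
  system:administrators rlidwka
  ipcaa:webadmins rlidwka
  system:anyuser rl
  ipcaa rlidwka'

printf '%s\n' "$sample_acl" | full_acl_entries
```

On an AFS-capable host you would run fs la . | full_acl_entries instead of feeding it the sample. Entries that contain a colon are usually groups, which you can then pass to pts mem as in Step 5.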

 

Backing up the existing site

Before you make any changes, take a backup of the existing site by performing the following steps:

  1. Log into the login.itd.umich.edu login servers, or another AFS-capable Unix/Linux box.
     
  2. Change to the unit's Public directory: cd ~unit/Public

    where unit is their URI component.
     
  3. Back up the html directory: tar cf - html | gzip > ~unit/public-html.yyyymmdd.tgz
     
  4. Change to the unit's Private directory: cd ../Private
     
  5. Back up the html directory: tar cf - html | gzip > ~unit/private-html.yyyymmdd.tgz

You have backed up the html directories to gzipped tar archives named {public,private}-html.yyyymmdd.tgz.
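
If you prefer to script the backup, the steps above can be sketched as a small function that fills in the date and reads the archive back to confirm it is usable. This is only a sketch under stated assumptions: the function name backup_html and its three arguments are hypothetical, and it assumes tar, gzip, and date behave as they do on a typical Unix/Linux login host.

```shell
#!/bin/sh
# backup_html PARENT_DIR LABEL DEST_DIR
# Archive PARENT_DIR/html into DEST_DIR/LABEL-html.YYYYMMDD.tgz,
# then list the archive to confirm it can be read back.
backup_html() {
    parent=$1 label=$2 dest=$3
    archive="$dest/$label-html.$(date +%Y%m%d).tgz"
    if [ ! -d "$parent/html" ]; then
        echo "$parent/html: no such directory" >&2
        return 1
    fi
    # Do not silently overwrite an earlier backup from the same day.
    if [ -e "$archive" ]; then
        echo "refusing to overwrite $archive" >&2
        return 1
    fi
    (cd "$parent" && tar cf - html) | gzip > "$archive" || return 1
    # Reading the archive back catches an unreadable file.
    gzip -dc "$archive" | tar tf - > /dev/null || return 1
    echo "$archive"
}
```

For example, backup_html ~unit/Public public ~unit followed by backup_html ~unit/Private private ~unit would produce the two archives described above, with unit replaced as in Step 2.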

 

Determining what changes to make

What to do is ultimately a business decision, which can make it a political question: the time and effort required should be weighed against the benefits, costs, and risks of taking (or not taking) action. Table 1 lists the advantages and disadvantages of each option.

Table 1
Advantages and disadvantages

| Action | Advantages | Disadvantages |
|---|---|---|
| Move the html directory aside. | Trivial to do. Takes no time. | Breaks all old search engine results and users' bookmarks. |
| Redirect only the home page. | Easy to do. Takes little time. Home page continues to work. | Breaks all old search engine results and users' bookmarks that aren't to the home page. |
| Redirect every page to the home page. | Not too hard to do. Takes some time. Home page continues to work. Other pages get redirected to home page. | Old search engine results and users' bookmarks get the home page instead of the expected page. |
| Redirect every page. | All pages continue to work. Old search engine results and users' bookmarks still work; users need to take no action. | Hardest to do. Takes a lot of time. |

Each option is discussed in the following sections.

 

Move the html directory aside

The most-trivial path for administrators, this provides the most inconvenience to the end users: All old search engine results and users' bookmarks will cease to function; users will receive 404/Not found errors. To move the html directory aside, perform the following steps:

  1. Log into the login.itd.umich.edu login servers, or another AFS-capable Unix/Linux box.
     
  2. Change to the unit's Public directory: cd ~unit/Public

    where unit is their URI component.
     
  3. Move the html directory aside: mv html html.old
     
  4. Change to the unit's Private directory: cd ../Private
     
  5. Move the html directory aside: mv html html.old

You have moved the html directories aside.
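
One thing to watch for: if a directory named html.old already exists, mv html html.old moves html inside it rather than renaming it. A guarded version of Steps 3 and 5 might look like the following sketch; the function name move_aside is hypothetical.

```shell
#!/bin/sh
# move_aside DIR
# Rename DIR/html to DIR/html.old, refusing to act if html.old exists.
move_aside() {
    dir=$1
    if [ ! -d "$dir/html" ]; then
        echo "$dir/html: no such directory" >&2
        return 1
    fi
    if [ -e "$dir/html.old" ]; then
        echo "$dir/html.old already exists; choose another name" >&2
        return 1
    fi
    mv "$dir/html" "$dir/html.old"
}
```

You would call it once per directory, for example move_aside ~unit/Public and then move_aside ~unit/Private.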

 

Redirect only the home page

The next most-trivial path for administrators, redirecting only the home page, is quick and easy and gives most users access without having to deal with search engine results or users' bookmarks. The drawback is that the old content for every other page on the site remains available. To redirect only the home page, perform the following steps:

  1. Log into the login.itd.umich.edu login servers, or another AFS-capable Unix/Linux box.
     
  2. Change to the unit's public html directory: cd ~unit/Public/html

    where unit is their URI component.
     
  3. Back up the old index.html file: mv index.html index.html.old

    It may be named index.htm or default.htm; if so, adjust the command in Step 4 similarly. If the index file is named something else (such as index.php), contact your programming staff.
     
  4. Replace the contents of index.html:
    cat > index.html <<EOF
    <HTML>
    <HEAD>
      <TITLE>unit Has Moved!</TITLE>
      <META HTTP-EQUIV="Refresh" CONTENT="time, URL=url">
    </HEAD>
    <BODY>
    <H1>unit Has Moved!</H1>
    <P>unit has moved. We are now at <a href="url">url</a>; please update your bookmarks.
    You will be automatically redirected there in the next time seconds.</P>
    </BODY>
    </HTML>
    EOF
    

    where:

    time
    The number of seconds to wait before implementing the redirect. Use 0 for immediate. It appears twice: Once in the META tag and once in the P tag.
    unit
    The unit name. It appears three times: Once each in the TITLE tag, the H1 tag, and the P tag. Note that the version in the P tag might be spelled out, such as "The Interdepartmental Program in Classical Art and Archaeology" instead of "IPCAA."
    url
    The new URL. It appears three times: Once in the META tag and twice in the P tag (one inside the anchor href and one as the anchor link text).

    In the first and last lines, the EOF are the literal three characters E, O, and F.

  5. Change to the unit's private html directory: cd ../../Private/html
     
  6. Repeat Step 3 and Step 4.

You have redirected the home page and only the home page. Unless you take other actions, all other pages remain accessible via the old URLs.
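
To avoid hand-editing the three placeholders, the page from Step 4 can be generated by a small function. This is a sketch only: the function name make_redirect_page and the example.edu address are hypothetical, and the values are substituted as-is, so a URL containing characters that are special in HTML would need escaping first.

```shell
#!/bin/sh
# make_redirect_page UNIT URL SECONDS
# Print the redirect page to standard output.
make_redirect_page() {
    unit=$1 url=$2 secs=$3
    cat <<EOF
<HTML>
<HEAD>
  <TITLE>$unit Has Moved!</TITLE>
  <META HTTP-EQUIV="Refresh" CONTENT="$secs, URL=$url">
</HEAD>
<BODY>
<H1>$unit Has Moved!</H1>
<P>$unit has moved. We are now at <a href="$url">$url</a>; please update your bookmarks.
You will be automatically redirected there in the next $secs seconds.</P>
</BODY>
</HTML>
EOF
}

# Hypothetical values for illustration.
make_redirect_page "IPCAA" "https://example.edu/ipcaa/" 5
```

Redirecting the output, as in make_redirect_page "IPCAA" "https://example.edu/ipcaa/" 5 > index.html, takes the place of the cat command in Step 4.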

 

Redirect every page to the home page

Probably the most-balanced case, where every old page is redirected to the new home page, provides the benefit of all search engine results and users' bookmarks still getting to the new site and effectively blocking access to the old site, without a lot of administrative effort. Absent other information about the unit-specific IT or web staff, this is probably the best course of action. To redirect every page to the home page, perform the following steps:

  1. Log into the login.itd.umich.edu login servers, or another AFS-capable Unix/Linux box.
     
  2. Change to the unit's html directory: cd ~unit/Public/html

    where unit is their URI component.
     
  3. Create a new access control file with the following contents:
    cat > .htaccess << EOF
    RewriteEngine on 
    RewriteRule ^(.*)$ https://new_url/ [R=301,L]
    EOF
    
    where new_url is the host name and path of the new site's home page (the https:// prefix is already part of the rule).
     
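  4. Copy that file to every directory under the one you're in: find . -type d ! -name . -exec cp .htaccess {} \;

    The ! -name . test skips the current directory, which already has the file, and using find with -exec copes with directory names that contain spaces.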
     
  5. Change to the unit's private html directory: cd ../../Private/html
     
  6. Repeat Step 3 and Step 4.

You have redirected every file to the new site's home page.
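
After Step 4 it can be worth confirming that no directory was missed. The following sketch lists any directory that still lacks a .htaccess file; the function name dirs_missing_htaccess is hypothetical.

```shell
#!/bin/sh
# dirs_missing_htaccess TOP
# Print each directory at or below TOP that has no .htaccess file.
dirs_missing_htaccess() {
    # The inner sh succeeds only when the file is absent, so -print
    # runs only for directories that were missed.
    find "$1" -type d -exec sh -c '[ ! -f "$1/.htaccess" ]' sh {} \; -print
}
```

Running dirs_missing_htaccess . from each html directory should print nothing once the copy has reached every subdirectory.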

 

Redirect every page to its unique page

The best case from the user's standpoint, where every old page is redirected to the corresponding new page, provides the benefit of all search engine results and users' bookmarks still getting to the new site, at the expense of a lot of administrative overhead. To redirect every old page to its corresponding new page, perform the following steps:

  1. Log into the login.itd.umich.edu login servers, or another AFS-capable Unix/Linux box.
     
  2. Change to the unit's html directory: cd ~unit/Public/html

    where unit is their URI component.
     
  3. Back up the old index.html file: cp -p index.html index.html.old

    Copy the file rather than moving it, so that Step 5 still finds an index.html to replace. It may be named index.htm or default.htm; if so, adjust the command accordingly. If the index file is named something else (such as index.php), contact your programming staff.
     
  4. Create a new temporary file with the desired HTML contents:
    cat > tempfile <<EOF
    <HTML>
    <HEAD>
      <TITLE>unit Has Moved!</TITLE>
      <META HTTP-EQUIV="Refresh" CONTENT="time, URL=url">
    </HEAD>
    <BODY>
    <H1>unit Has Moved!</H1>
    <P>unit has moved. We are now at <a href="url">url</a>; please update your bookmarks.
    You will be automatically redirected there in the next time seconds.</P>
    </BODY>
    </HTML>
    EOF
    

    where:

    time
    The number of seconds to wait before implementing the redirect. Use 0 for immediate. It appears twice: Once in the META tag and once in the P tag.
    unit
    The unit name. It appears three times: Once each in the TITLE tag, the H1 tag, and the P tag. Note that the version in the P tag might be spelled out, such as "The Interdepartmental Program in Classical Art and Archaeology" instead of "IPCAA."
    url
    The new URL. It appears three times: Once in the META tag and twice in the P tag (one inside the anchor href and one as the anchor link text).

    In the first and last lines, the EOF are the literal three characters E, O, and F.

    This is the same HTML content as in the "Redirect only the home page" section.

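  5. Find all files with an .html or .htm suffix and replace them: find . \( -name \*.html -o -name \*.htm \) -exec cp -p tempfile {} \;

    The parentheses are required. Without them, -exec applies only to the .htm files and the .html files are left untouched.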
     
  6. Change to the unit's private html directory: cd ../../Private/html
     
  7. Repeat Step 3 and Step 5, using ../../Public/html/tempfile in place of tempfile in Step 5.
     
  8. Remove the temporary file: rm ../../Public/html/tempfile
     
  9. Edit all of the HTML files: find . \( -name \*.html -o -name \*.htm \) -exec vi {} \;

    You can use your editor of choice instead of vi.
     
  10. In the editor, replace the URL, url, with the corresponding new page's URL, in all three places.
     
  11. Save your changes and exit the editor for the current file.
     
  12. Repeat Step 10 and Step 11 for all remaining HTML files from Step 9.
     
  13. Change back to the public html directory: cd ../../Public/html
     
  14. Repeat Step 9 through Step 12.
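
If you already have a list of old pages and their new URLs, the manual editing in Steps 9 through 14 can be replaced by a loop over that list. The following is a sketch under stated assumptions: the function name write_redirects and the mapping file format (one "old-path new-url" pair per line, with no whitespace in paths) are hypothetical conventions, not something the site already provides.

```shell
#!/bin/sh
# write_redirects MAPFILE UNIT SECONDS
# Run from the html directory. MAPFILE holds one "old-path new-url"
# pair per line; this sketch assumes paths contain no whitespace.
write_redirects() {
    map=$1 unit=$2 secs=$3
    while read -r old new; do
        # Skip blank lines and comment lines.
        case $old in ''|'#'*) continue ;; esac
        # Only replace pages that already exist.
        if [ ! -f "$old" ]; then
            echo "skipping $old: no such file" >&2
            continue
        fi
        cat > "$old" <<EOF
<HTML>
<HEAD>
  <TITLE>$unit Has Moved!</TITLE>
  <META HTTP-EQUIV="Refresh" CONTENT="$secs, URL=$new">
</HEAD>
<BODY>
<H1>$unit Has Moved!</H1>
<P>$unit has moved. We are now at <a href="$new">$new</a>; please update your bookmarks.
You will be automatically redirected there in the next $secs seconds.</P>
</BODY>
</HTML>
EOF
    done < "$map"
}
```

Run it once in each html directory with that directory's own mapping file, after taking the backup described earlier, since it overwrites the listed pages in place.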

Alternatively, you can put a .htaccess file in each directory (as discussed in the previous section) and edit its contents to use multiple RewriteRule directives to send each individual file in that directory to its corresponding new URL. The details of doing so are sufficiently site-specific that we cannot document them here.

You have updated all of the old web pages to redirect them to their corresponding new pages. Note that files with other suffixes, such as .cgi or .php, are not affected by this process.
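
For the .htaccess alternative mentioned above, the per-file rules can be generated from a mapping file instead of being typed by hand. This is a sketch, not a drop-in solution: the function name rewrite_rules and the mapping file format (one "old-file-name new-url" pair per line) are hypothetical, and it assumes the rules go in the .htaccess file of the directory that holds the old files, where the pattern is matched against the file name without a leading slash.

```shell
#!/bin/sh
# rewrite_rules MAPFILE
# Print one RewriteRule per "old-file-name new-url" line in MAPFILE.
rewrite_rules() {
    echo "RewriteEngine on"
    while read -r old new; do
        # Skip blank lines and comment lines.
        case $old in ''|'#'*) continue ;; esac
        # Escape regular-expression metacharacters in the file name
        # (in practice this is mostly the dot before the suffix).
        pattern=$(printf '%s\n' "$old" | sed 's/[][\.*^$+?(){}|]/\\&/g')
        printf 'RewriteRule ^%s$ %s [R=301,L]\n' "$pattern" "$new"
    done < "$1"
}
```

For example, a mapping line of about.html https://example.edu/ipcaa/about (a made-up address) yields RewriteRule ^about\.html$ https://example.edu/ipcaa/about [R=301,L]. Redirect the output to .htaccess in the matching directory, and review it before relying on it.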