SurfShopCart Documentation

Setting Up Shop Has Never Been Easier!

Web Essentials

What is CGI?

CGI, or Common Gateway Interface, is a gateway between your website visitors and the server on which your website resides. It allows your website to become interactive. By placing certain tags in your web pages, you can call on the host server to do a great many things behind the scenes, bringing your otherwise static website to life.

  • Dynamically generated pages - you can have the host server return HTML pages built from a database, so each page differs depending on what the visitor requests.
  • Calculations and storage - you can have the host server collect information from a website visitor and store it for later use, whether to place an order, participate in a survey, or play a game.
  • Behind the scenes operations - you can create CGI programs that run whenever a web page is requested to do a number of things, including displaying banner ads, counting visitors or restricting access.

What is SSL?

Secure Sockets Layer, or “SSL” (succeeded in current software by Transport Layer Security, or “TLS”), is the standard method for encrypting and decrypting information passed between a web browser and a remote web server.

Typically, a web server will be configured so that a website can be accessed using either SSL encryption (https://securedomain) or “clear text” (http://regulardomain). Everything about the site is the same, except that when the secure protocol is used, all information is encrypted before being sent over the Internet, thus making it virtually unreadable by anyone except the intended recipient.

It is important to note that using SSL does not mean your site is “secure”. It only means that the information cannot be intercepted by a third party between the browser and the remote host.

What happens to the information after it has been received by the server is another story. A common misconception is that you can take an order using SSL and then email the information to the merchant, when, in fact, email completely negates any security benefits gained by encrypting the order in the first place. E-mail is much less secure than a regular website transaction because the information is recorded in its entirety on every mail server between the sender and the recipient, making it accessible to countless eyes along the way.

In addition, the data that is recorded on the server can be viewed by other users on the server via Telnet or FTP by navigating around the web server’s user directories.

A truly “secure” web server only allows information to be viewed by privileged users who only have access to specific files after passing authentication through a firewall. In some cases, remote access is prohibited entirely.

Using FTP

FTP stands for File Transfer Protocol. It is a method for moving files between one computer and another. FTP is the method of choice for installing SurfShop because it does not require any special authorization besides your hosting account log in and password. You can do everything you need to completely install the system using a simple FTP client program.

There are a number of FTP programs out there, each with its own set of unique features. We recommend Transmit for the Mac and CuteFTP for Windows.

Logging In

To use FTP with a shared hosting account, you need the domain name or Internet Protocol address of the account, the log in (sometimes called username) for the account, and the account’s password. Connect to the Internet as you usually do and launch the FTP program.

A dialog box will appear asking for the above information. Enter your domain, log in and password and you’re in! At that point you should see a set of file folders much like you do on your hard drive.

An FTP program works similarly to the “Finder” (on Mac) or “Windows Explorer” (on Windows). When you log in to your remote host, a window will appear with a list of files and directories. You can navigate through these files by double-clicking them, or by using “Open” under the File Menu on your Menu Bar. The different FTP programs have different ways to transfer files, but usually it means highlighting the file you want to transfer, or “upload”, and pressing a button to initiate the transfer. Often this button is labeled “Put”.

Alternatively, you can transfer, or “download”, files FROM the remote server to your computer by highlighting the file on the server and pressing a different transfer button. This button is often labeled “Get”. With some FTP programs, you can copy files to the current directory by simply dragging the files from your desktop onto the FTP window.

Data Types

File transfers can be done in two formats: text (or ASCII) and binary. Text files include HTML documents, CGI scripts and flat file databases. Binary files include images, audio/video files, PDF documents, and archives such as ZIP or TAR files.

NOTE: ALL of the files (with exception of the graphic images) used in SurfShop should be transferred to the server in the text (ASCII) format!

In any case, using FTP is fairly simple and you can be up and running in just a few minutes.
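The choice between text and binary transfers can be sketched in code. The example below is a minimal illustration in Python (SurfShop itself does not ship it), using the standard ftplib module. The extension list and the function names transfer_mode, connect and upload are assumptions made for this sketch, so adjust them to your own files and FTP client.

```python
from ftplib import FTP
from pathlib import PurePosixPath

# Assumed list of extensions to treat as binary; everything else is sent as text.
BINARY_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png", ".zip", ".pdf", ".tar", ".gz"}


def transfer_mode(filename):
    """Return "binary" for known binary extensions, otherwise "text"."""
    suffix = PurePosixPath(filename).suffix.lower()
    return "binary" if suffix in BINARY_EXTENSIONS else "text"


def connect(host, username, password):
    """Open an FTP connection and log in with the hosting account details."""
    ftp = FTP(host)
    ftp.login(username, password)
    return ftp


def upload(ftp, local_path, remote_name):
    """Upload one file over an already logged-in FTP connection."""
    with open(local_path, "rb") as handle:
        if transfer_mode(local_path) == "binary":
            ftp.storbinary(f"STOR {remote_name}", handle)
        else:
            # storlines sends the file line by line in text (ASCII) mode.
            ftp.storlines(f"STOR {remote_name}", handle)
```

With these helpers, a script such as autoconfig.cgi would go up in text mode, while a graphic such as logo.gif would go up in binary mode.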

The Shared Hosting Account

Web sites are stored on computers that reside in air-conditioned rooms in some remote location and typically operate in the Unix environment. Although Unix can seem somewhat intimidating, these accounts usually have several things in common.

If you are familiar with Windows or Macintosh computers, you can compare your hosting account to a file folder on your main hard drive. The folder is named after your user name. Unlike your home computer, however, you only have access to this one folder. Everything on your web site is located in this folder, otherwise known as your “root directory”.

In this folder there are several additional folders. Often there is a public folder, an FTP folder, a Log folder and a cgi-bin folder. There are sometimes additional folders which you will not ever need to use, or worry yourself about.

The Document Root

This folder is often called “www” or “public_html” or “public-html”. In this folder you will find the files that anyone who accesses your web site from the web will be able to see. The files in this folder are, for the most part, either text files (HTML, TXT) or image files (JPG, GIF). You might also have some PDF (Adobe Portable Document Format) or other files. These files can be viewed using a standard web browser (Safari, Firefox, Internet Explorer, Chrome, etc.) which makes them appear as we know and love them. Without the browser, they might otherwise look like random gibberish or cryptic computer code.

The FTP Folder

This folder (sometimes called “Anonymous FTP”) is used to upload and download files from your web site that may be too cumbersome to transmit via e-mail. This folder might reside inside your public folder, or it may reside outside your public folder in your root directory.

The Log Folder

Some hosting accounts give you a Log folder which stores all of the access logs that are automatically generated by the host computer. Some hosting accounts allow you to view this folder and others do not. You can often find out what went wrong at any time by viewing the error log in this folder. This folder might reside inside your public folder, but more often it will reside outside your public folder in your root directory.

The cgi-bin Folder

This folder (sometimes called “cgi-local”, or just “cgi”) is reserved for scripts and programs that are meant to be executed — rather than viewed — when they are called. By restricting executable files to a single directory, Unix administrators can contain the effects of malicious activity. The CGI scripts used by SurfShop™ reside in this folder. This folder might reside inside your public folder, or it may reside outside your public folder in your root directory.

Text Editor

A text editor is a program which allows you to open, modify and save ASCII text files without adding any extra characters to the file.

A text editor is not the same thing as a word processor. Programs such as Word, WordPerfect, Excel, iWork, etc., should NOT be used to edit CGI scripts. They can insert hidden formatting codes into the file which will cause the script to malfunction or prevent it from compiling at all.

Acceptable text editors include TextWrangler, Notepad, BBEdit, SimpleText, Komodo, Pico and vi, among others.

What is meant by “line endings”?

When you press the [return] key on your keyboard, the program usually inserts a “line ending” on the text you are typing. Different operating systems use different codes for this “line ending”. This may be a “carriage return”, a “linefeed” or a combination of both.

When moving files from one operating system to another, such as from Windows to Unix, or from Unix to Macintosh, these “line endings” are not always translated correctly. As a result, the file may appear “double-spaced”, or in some cases, as one single line of text.

Sometimes, the line ending is translated into a character that is not recognized by the operating system. This causes an “NPC” or “Non-Printable Character” to appear in the text, often an empty square box character. If this occurs, you may need to manually convert the incorrect line endings into ones that are native to your operating system.
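Converting line endings by hand can be sketched as follows. This is a minimal Python illustration, not part of SurfShop. It assumes the usual conventions: Unix uses a linefeed (LF), Windows uses a carriage return followed by a linefeed (CR+LF), and older Macintosh systems use a carriage return (CR) alone. The function names are made up for this example.

```python
def detect_line_ending(data: bytes) -> str:
    """Report which line ending convention appears in the data."""
    if b"\r\n" in data:
        return "CRLF"
    if b"\r" in data:
        return "CR"
    if b"\n" in data:
        return "LF"
    return "none"


def to_unix_line_endings(data: bytes) -> bytes:
    """Convert CR+LF and bare CR line endings to the Unix LF ending."""
    # Replace CR+LF pairs first so they do not turn into two linefeeds.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

Working on raw bytes rather than decoded text keeps the conversion from being altered by the operating system the script happens to run on.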

Files, Directories and File Paths

Unix Files are the individual units that live on the computer’s hard drive. Most Unix files are either text (HTML, scripts or text) or binary (programs, images or data).

A directory can be thought of as a folder containing files. All of the files and directories in your shared hosting account reside somewhere on the host server’s hard drive and are located by the computer using a “file path”. When the term “file path” is used, it refers to the path on the server’s hard drive to your files. A “file path” always begins with a forward slash and is followed by the names of the directories in which the file resides, separated by forward slashes.

Examples:

/usr/web/username/www/index.html

/home/username/public_html/default.html

/u/web/username/cgi-bin/surfshop/autoconfig.cgi

These are all examples of file paths on typical Unix shared hosting systems. Each Unix system uses a different layout, so the “file paths” to the shared hosting files are different on different servers. Remember that the “file path” is used whenever the server (or the program) itself is accessing your files.

If someone wishes to access a file from the Internet, they will use a “Uniform Resource Locator”, or “URL”. Instead of beginning with a forward slash and a couple of directories, it begins with a protocol identifier, a colon, two forward slashes and a domain, followed by the path to the file in question.

The URLs to the same files described with “file paths” above would be:

http://www.surfshopcart.com/index.html

http://www.another-domain.com/default.html

http://www.third-domain.com/cgi-bin/surfshop/autoconfig.cgi

The Unix system’s web server automatically translates the URL into a “file path” to locate the appropriate file, determine if the visitor is allowed to access the file, and if so, send it to the visitor’s browser.
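The translation from URL to “file path” can be illustrated with a short sketch. It is written in Python for illustration only and is not how any particular web server is implemented. It assumes every URL maps to a file below a single document root, which is a simplification (a cgi-bin folder, for example, may live outside the public folder). The function name url_to_file_path is hypothetical.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def url_to_file_path(url, document_root):
    """Map the path part of a URL onto a file path below document_root."""
    url_path = unquote(urlsplit(url).path)
    # normpath collapses "." and ".." so the result cannot climb above the root.
    relative = posixpath.normpath("/" + url_path).lstrip("/")
    return posixpath.join(document_root, relative)
```

For example, calling it with http://www.another-domain.com/default.html and the document root /home/username/public_html gives the second file path listed earlier.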

File Permissions

The permission of a file or directory is a three-digit number which tells the server who can read, write or execute the file or directory.

Each digit represents the privilege for one of the three types of users: the owner of the file, members of the group to which the file belongs, and everyone else (world).

Read has a value of 4, write has a value of 2 and execute has a value of 1.

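A permission of 755 lets the owner read, write and execute the file, while everyone else can only read and execute it. It looks like this: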

User Type   Read   Write   Execute   Total
Owner       4      2       1         7
Group       4      0       1         5
World       4      0       1         5

A permission of 644 looks like this:

User Type   Read   Write   Execute   Total
Owner       4      2       0         6
Group       4      0       0         4
World       4      0       0         4

This means that the owner of the file can read and write the file, while everyone else can only read it. New files are typically set to 644 by default.
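The arithmetic behind each digit can be checked with a few lines of code. The sketch below uses Python for illustration. The helper names permission_digit and describe are made up for this example, while stat.filemode comes from Python's standard library.

```python
import stat

READ, WRITE, EXECUTE = 4, 2, 1


def permission_digit(read, write, execute):
    """Add up the values of the granted privileges for one type of user."""
    return (READ if read else 0) + (WRITE if write else 0) + (EXECUTE if execute else 0)


def describe(mode):
    """Return the rwx string for a three-digit octal permission such as 0o755."""
    # stat.filemode returns ten characters; the first one is the file type.
    return stat.filemode(mode)[1:]
```

Here describe(0o755) gives the nine-character form shown by Unix directory listings, three characters each for owner, group and world. On a system where you can run scripts, os.chmod("shop.cgi", 0o755) would apply that permission to a file, with the file name here being only an example.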

Requests and Responses

Computers speak in a language of “requests” and “responses”, called a protocol. When a computer asks for a web page or CGI program, it sends a “request”. This is computer-ese for “hello, how are you. I am about to give you a bunch of information. It is going to look like this…”. The requesting computer then sends a stream of information which the receiving computer then must process.

Based on the information in the “request”, the receiving computer finds the appropriate file or runs the appropriate program, processes the information and returns a “response”. In computer-ese this is, “Thank you. This is the information you requested. It is going to look like this…”. Then it spits out a bunch of information which the requesting computer must process.

All you need to know about requests and responses is that the terms “Request Header”, “Response Header” and “HTTP header” refer to the descriptive information sent at the start of a request or response. If you are interested, watch the status window at the bottom of your browser when you are surfing the web. You can see some of the request and response activity between your computer and the remote host.

When a CGI program is called, instead of simply spitting out the file, the server starts a “gateway” which acts as a messenger between the file and the server. The CGI program must be written in such a way it can understand the information in the “request” and also format the output so that the requesting computer can understand it in the “response”. If the CGI program does not format the output using the correct “response”, the server generates an error.
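A correctly formatted “response” can be sketched as follows. The example is in Python for illustration (SurfShop's own scripts are Perl), and the function name build_cgi_response is hypothetical. It shows the basic rule that a CGI program prints a header, then a blank line, then the page content.

```python
def build_cgi_response(body, content_type="text/html"):
    """Return the text a CGI program writes to standard output."""
    # The header block ends with a blank line; the page content follows it.
    return f"Content-Type: {content_type}\r\n\r\n{body}"
```

If the Content-Type line or the blank line after it is missing, the server cannot tell where the header ends, which is a common cause of the error described above.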

Perl

Perl is the scripting language of choice for CGI programs. All of the scripts in the SurfShop suite are Perl scripts. The only thing you need to know about Perl is that every Perl script MUST have a header line (called the “shebang” line) on the first line of the script. This line contains the file path to the Perl program on the server. It tells the server “this is a script so take me to the script interpreter!” If you are having problems, try modifying this header line. Your ISP will be able to tell you the correct file path to Perl on your server. A typical header line looks like this:

#!/usr/local/bin/perl

OR

#!/usr/bin/perl
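When troubleshooting, it can help to check what the header line of a script actually contains. The following is a minimal sketch in Python, with a made-up function name, that extracts the interpreter path from the first line of a script's text. It also strips a trailing carriage return, since a script saved with Windows line endings can carry one at the end of this line.

```python
def read_shebang(script_text):
    """Return the interpreter path from the first line, or None if there is no shebang."""
    first_line = script_text.split("\n", 1)[0].rstrip("\r")
    if not first_line.startswith("#!"):
        return None
    # Anything after the path (such as "-w") is an option passed to the interpreter.
    parts = first_line[2:].split()
    return parts[0] if parts else None
```

The returned path can then be compared with the path to Perl that your hosting provider gives you.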

Forms, Form Data, and Query-Strings

An HTML form is a means for a web page to capture information from the user and send it to the host server for processing. The user fills in all the appropriate blanks and presses “submit”. The user’s browser converts all the fields and their matching values into a long string, called “form data”. It then calls the CGI program specified in the form tag and sends the form data off to the server. The CGI program on the other end then decodes the data string and uses it to process the order, mailing list, or whatever else it is designed to do.

A form uses “input” tags to capture information. If you are familiar with HTML, you know the various types of input: text, radio buttons, check boxes, pull-down menus, lists, hidden fields, etc. and how to use them. Consult an HTML manual for more information about constructing forms. There are several thousand available on the Internet.

HTML Forms

Here is a typical form tag:

<form action="/cgi-bin/surfshop/shop.cgi" method="post">

… input tags …

</form>

The first parameter is the “action” parameter. This contains the URL to the CGI program being called. The second parameter is the “method” parameter.

There are two methods that the browser uses to send the form data to the CGI program. The first is the “POST” method. In this method, the form data is encoded and sent in the body of the request, after the HTTP header, so it does not appear in the URL. The CGI program reads it from standard input (“STDIN”).

The second method is the “GET” method. In this method, the form data is appended to the URL in the action parameter. A question mark (?) character is used to separate the URL from the form data. Anything after the question mark is called the “Query-String”.
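How a CGI program picks up the form data under each method can be sketched like this. The example is in Python for illustration and does not show how shop.cgi itself is written. REQUEST_METHOD, CONTENT_LENGTH and QUERY_STRING are the standard CGI environment variables, while the function name read_form_data is hypothetical.

```python
from urllib.parse import parse_qs


def read_form_data(environ, stdin):
    """Return the decoded form fields for a GET or POST request."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        # POST: the encoded form data arrives on standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: the encoded form data is the Query-String from the URL.
        raw = environ.get("QUERY_STRING", "")
    # Keep the first value of each field for simplicity.
    return {name: values[0] for name, values in parse_qs(raw).items()}
```

In a real CGI program the two arguments would be os.environ and sys.stdin. Passing them in explicitly makes the function easy to try out with sample data.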

Query-Strings

Alternately, you can call and send form data to a CGI without setting up a form at all. To do this, you “URL encode” the Query-String yourself and embed it into an HTML link tag (<a></a>). There are a few rules to follow when encoding your own Query-String.

A question mark (?) is used to mark the end of the URL and the beginning of the Query-String. The field name comes first, followed by an equals (=) sign, followed by the value of the field. Each field name/value pair is separated by an ampersand (&) character.

Spaces and most other non-alphanumeric characters must be converted to their hexadecimal ASCII values and preceded by a percent (%) sign. For example, a space character (ASCII #32, which is 20 in hexadecimal) is converted to %20 for URL encoding.

All together, a GET method CGI call might look like this:

<a href="/cgi-bin/login.cgi?name=John%20Smith&email=jsmith@domain.com">Submit</a>

This would send the following field names and values to a CGI program named “login.cgi”:

name: John Smith

email: jsmith@domain.com
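The encoding rules can also be applied by a program instead of by hand. The sketch below uses Python's standard urllib.parse module for illustration. The function name build_query_url is made up, and leaving the @ sign unencoded is a choice made here so that the output matches the login.cgi link shown earlier.

```python
from urllib.parse import quote, urlencode


def build_query_url(script_url, fields):
    """Append URL-encoded field name/value pairs to script_url."""
    # quote_via=quote encodes spaces as %20 rather than "+"; safe="@" leaves "@" as written.
    query = urlencode(fields, quote_via=quote, safe="@")
    return f"{script_url}?{query}"
```

Given the script /cgi-bin/login.cgi and the two fields above, this should produce the same URL as the hand-written link.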

An example of a GET method CGI call that would tell SurfShop™ to add an item to the user’s cart might be:

<a href="/cgi-bin/surfshop/shop.cgi?c=viewcart.htm&i_123dogbis=1">Buy Me!</a>

This would tell shop.cgi to add the item “123_dogbis” to the current user’s basket and then return the template file, “viewcart.htm”.

.htaccess

Many web servers, Apache in particular, use a system that controls how files are delivered to a web browser. This system uses special files, called “.htaccess” files, which reside in the web site’s directories to tell the web server what to do when a request comes in for a file.

A common function of .htaccess is to require a username and password before serving the file. SurfShop automatically installs .htaccess files to require a username and password in all of the data directories used by the program. The username and password are the same ones that you enter when you configure the program with autoconfig.cgi.

web_essentials.txt · Last modified: 2018/07/03 04:55 (external edit)