46.5. Apache Modules

The Apache software is built in a modular fashion: all functionality except some core tasks is handled by modules. This has progressed so far that even HTTP is processed by a module (http_core).

Apache modules can be compiled into the Apache binary at build time or dynamically loaded at runtime. For runtime loading, refer to Section 46.3.2.2.1, “LoadModule module_identifier /path/to/module” for loading modules manually and to Modules for using YaST.

Apache in SUSE Linux comes with the following modules readily available in the apache2 RPM (prefix "mod_" omitted here): access, actions, alias, asis, auth, auth_anon, auth_dbm, auth_digest, auth_ldap, autoindex, cache, case_filter, case_filter_in, cern_meta, cgi, charset_lite, dav, dav_fs, deflate, dir, disk_cache, dumpio, echo, env, expires, ext_filter, file_cache, headers, imap, include, info, ldap, log_config, log_forensic, logio, mem_cache, mime, mime_magic, negotiation, proxy, proxy_connect, proxy_ftp, proxy_http, rewrite, setenvif, speling, ssl, status, suexec, unique_id, userdir, usertrack, and vhost_alias. Additionally, SUSE Linux provides the following Apache modules as RPM packages that need to be installed separately: apache2-mod_auth_mysql, apache2-mod_fastcgi, apache2-mod_macro, apache2-mod_murka, apache2-mod_perl, apache2-mod_php4, apache2-mod_php5, apache2-mod_python, and apache2-mod_ruby.

Some of these modules are documented in more detail in this section. For a description of other modules in the base distribution, see the Apache Modules Web site at http://httpd.apache.org/docs-2.0/mod/. For third-party modules, refer to http://modules.apache.org/.

Apache modules can be divided into three different categories: base modules, extension modules, and external modules.

46.5.1. Base Modules

Base modules are compiled into Apache by default. They are available unless explicitly left out at buildtime. Apache in SUSE Linux has only the minimum base modules compiled in, but all of them are available as shared objects: rather than being included in the /usr/sbin/httpd2 binary itself, they can be included at runtime by configuring APACHE_MODULES in /etc/sysconfig/apache2.

46.5.1.1. Server-Side Includes with mod_include

mod_include provides a means of file processing before data is sent to the client. Typically, mod_include is used to include files in a document that are in turn parsed as HTML before they reach the client. This is why it is called server-side includes (SSI).

With SSIs, special commands are executed on the server side, triggered by specially formatted SGML comments. These comments have the syntax:

<!--#element attribute=value -->
        

For a list of element and attribute values, see the mod_include documentation at http://httpd.apache.org/docs-2.0/mod/mod_include.html.

To use mod_include in SUSE Linux, add include to APACHE_MODULES in /etc/sysconfig/apache2 or use YaST as described in Modules.
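Once the module is loaded, SSI processing still needs to be enabled for the files in question. The following fragment is a minimal sketch of one common way to do this, not the configuration shipped with SUSE Linux. The directory /srv/www/htdocs/ssi is a hypothetical path chosen for illustration.

```apache
# Main server, virtual host, or directory context
# NOTE: /srv/www/htdocs/ssi is a hypothetical directory
<Directory /srv/www/htdocs/ssi>
    # Permit server-side includes in this directory
    Options +Includes
    # Deliver .shtml files as HTML and pass them through the SSI filter
    AddType text/html .shtml
    AddOutputFilter INCLUDES .shtml
</Directory>
```

With this in place, only files ending in .shtml are parsed for SSI elements. Other files in the directory are delivered unchanged.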

Tip

Use the XBitHack directive (http://httpd.apache.org/docs-2.0/mod/mod_include.html#xbithack) to instruct Apache to parse files with the execute bit set for SSI directives.

This means that, rather than having to change the extension of a file to mark it as holding SSI elements (typically to .shtml), you can use a regular .html file and run chmod +x myfile.html.
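As a sketch of how XBitHack might be enabled, assuming the same hypothetical directory /srv/www/htdocs/ssi as a location for SSI documents:

```apache
# Main server, virtual host, directory, or .htaccess context
# NOTE: /srv/www/htdocs/ssi is a hypothetical directory
<Directory /srv/www/htdocs/ssi>
    # SSI must be permitted for XBitHack to have any effect
    Options +Includes
    # Parse text/html files that have the user-execute bit set
    XBitHack on
</Directory>
```

XBitHack only affects files delivered with the MIME type text/html, so other file types in the directory are not parsed even if their execute bit is set.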

46.5.1.2. Common Gateway Interface: mod_cgi

mod_cgi enables Apache to deliver content created by external CGI ("Common Gateway Interface") programs or scripts. It acts as an interface between a programming language available on the machine and the Apache Web server. Theoretically, CGI scripts can be written in any programming language. Usually, languages such as Perl or C are used. mod_cgi is the most common way to include dynamic content on a Web site.

CGI programming differs from "regular" programming in that CGI programs and scripts must print a header, such as Content-type: text/html, followed by a blank line before the actual output. This header tells the client the MIME type of the data that follows, in this case HTML.

Example 46.11. A Simple CGI Script in Perl

#!/path/to/perl
print "Content-type: text/html\n\n";
print "Hello, World.";
            

The difference between modules specifically bound to a programming language (such as mod_php5) and mod_cgi lies in the possibility of combining mod_cgi with mod_suexec (see Section 46.5.2.1, “Running CGIs as a Different User with mod_suexec”). This combination allows CGI scripts to be executed with a specified user ID. Usually, scripts using mod_cgi alone or mod_php5 are executed with the user ID of the Apache user (default in SUSE Linux: wwwrun). Modules designed for a programming language (such as mod_php5 or mod_ruby) embed a persistent interpreter in Apache to execute scripts under the Apache user ID.

As a consequence, CGIs with mod_suexec aid in administrative clarity as the CGI processes can be assigned to individual users instead of the Web server itself. Also, better file system security is accounted for with this combination: the script inherits only the user's file system rights. In the contrary case of modules, the script is granted the Web server user's file permissions, which can lead to unintended visibility of data in the file system.

CGIs are terminated when the request of a client to the Web server ends. This means that CGIs are not persistent and release all occupied resources after termination. This is an advantage, especially in case of erroneous programming. With modules, the effects of programming errors can accumulate, because the interpreter is persistent. This may result in a failure to release resources, such as database connections, and can require an Apache restart.

To use mod_cgi in SUSE Linux, either add cgi to APACHE_MODULES in /etc/sysconfig/apache2 or use YaST as described in Modules. The default directory for CGIs in SUSE Linux is /srv/www/cgi-bin/.

If manually editing the Apache configuration file, use this example as a guideline for configuring mod_cgi.

Example 46.12. Manual Activation of mod_cgi

# Global Environment
LoadModule cgi_module /path/to/mod_cgi.so

# Main Server and/or Virtual Host and/or 
# Directory and/or .htaccess context
AddHandler cgi-script .cgi .pl

# Main Server and/or Virtual Host context
ScriptAlias /cgi-bin/ /srv/www/cgi-bin/

# Alternatively, explicitly allow CGI scripts in a directory
# Main Server and/or Virtual Host context
<Directory /srv/www/some/dir>
    Options +ExecCGI
</Directory>
            

46.5.2. Extension Modules

In general, modules labeled as extensions are included in the Apache software package, but are usually not compiled into the server statically. In SUSE Linux they are available as shared objects that can be loaded into Apache at runtime.

46.5.2.1. Running CGIs as a Different User with mod_suexec

In combination with mod_cgi (Section 46.5.1.2, “Common Gateway Interface: mod_cgi”), mod_suexec allows CGI scripts to run as a specified user and group. The suEXEC program at /usr/sbin/suexec2 is used for that purpose. It is a wrapper called by Apache every time a CGI script or program is executed. The wrapper switches to the configured user and group ID before starting the program, so the program runs as the configured user and group.

While this approach considerably reduces the security risk involved with user-generated CGI scripts, it also has some important considerations:

Considerations for suEXEC Usage

  • suEXEC docroot—All execution of scripts is limited to this base directory. This means that running scripts with suexec outside of the docroot is not possible and results in an error. docroot is set at suEXEC compile time and cannot be changed at runtime. The default in SUSE Linux is /srv/www.

  • uidmin—This represents the minimum ID a user must have to be used to execute scripts with suEXEC. This prevents scripts from being executed as system users such as root. Do not create users with an ID lower than uidmin if they should be used with mod_suexec. The default uidmin in SUSE Linux is 96.

  • gidmin—This is the same concept as uidmin, but for the group ID. The default gidmin in SUSE Linux is 96.

  • Directory and File Permissions—The script in question must be owned by the user and belong to the group specified as the suEXEC user and group. Additionally, the file must not be writable by anyone except the owner. The directory in which the script resides must also be writable only by the owner.

  • suEXEC safepath—All programs used in a script (such as Perl) must reside in the paths labeled as safe for suexec. safepath is set at suEXEC compile time and cannot be changed at runtime. The default safepath in SUSE Linux is /usr/local/bin:/usr/bin:/bin.

In case of errors caused by mod_suexec, consult the suexec log file at /var/log/apache2/suexec.log.

To use mod_suexec in SUSE Linux, either add suexec to APACHE_MODULES in /etc/sysconfig/apache2 or use YaST as described in Modules. Keep in mind that mod_cgi is needed to run suexec.

mod_suexec is most useful when applied in a virtual host environment, described in Section 46.4, “Virtual Hosts”. To specify the user and group under which CGI scripts should run, use the following syntax in the file holding the virtual host declarations (default in SUSE Linux is /etc/apache2/vhosts.d/*):

Example 46.13. mod_suexec Configuration

<VirtualHost 192.168.0>
# ...
ScriptAlias /cgi-bin/ /srv/www/vhosts/www.example.com/cgi-bin/
SuexecUserGroup tux users
# ...
</VirtualHost>
            

The SuexecUserGroup username group syntax in this example assigns all scripts residing in /srv/www/vhosts/www.example.com/cgi-bin/ the user ID of tux and group ID of users.

46.5.2.2. Secure Sockets Layer and Apache: mod_ssl

mod_ssl provides strong encryption using the secure sockets layer (SSL) and transport layer security (TLS) protocols for HTTP communication between a client and the Web server. For this purpose, the server sends an SSL certificate that holds information proving the server's valid identity before any request to a URL is answered. In turn, this guarantees that the server is the uniquely correct end point for the communication. Additionally, the certificate is used to set up an encrypted connection between client and server that can transport information without the risk of exposing sensitive, plain-text content. The most visible effect of using mod_ssl with Apache is that URLs are prefixed with https:// instead of http://.

The default port for SSL and TLS requests on the Web server side is 443. There is no conflict between a “regular” Apache listening on port 80 and an SSL/TLS-enabled Apache listening on port 443. In fact, HTTP and HTTPS can be run with the same Apache instance. Usually, separate virtual hosts (see Section 46.4, “Virtual Hosts”) are used to dispatch requests on port 80 and port 443 to separate virtual servers.
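The following fragment sketches how one Apache instance might serve the same site over both HTTP and HTTPS. It is an illustration under stated assumptions, not the configuration shipped with SUSE Linux: the host name, the DocumentRoot, and the certificate and key file names are hypothetical placeholders, and the server is assumed to listen on both ports already.

```apache
# Assumes "Listen 80" and "Listen 443" are set elsewhere,
# for example in /etc/apache2/listen.conf

# Plain HTTP
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot /srv/www/htdocs
</VirtualHost>

# HTTPS; certificate and key paths are hypothetical examples
<VirtualHost *:443>
    ServerName www.example.com
    DocumentRoot /srv/www/htdocs
    SSLEngine on
    SSLCertificateFile /etc/apache2/ssl.crt/server.crt
    SSLCertificateKeyFile /etc/apache2/ssl.key/server.key
</VirtualHost>
```

SSLEngine on is what turns the second virtual host into an SSL/TLS end point. Without it, Apache would answer plain HTTP on port 443.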

Important: Name-Based Virtual Hosts and SSL

It is not possible to run multiple SSL-enabled virtual hosts on a server with only one IP address. Users connecting to such a setup will receive a warning message stating that the certificate does not match the server name every time they visit the URL. A separate IP address or port is necessary for every SSL-enabled domain to achieve communication based on a valid SSL certificate.

Despite the warning message, you still get the same level of encryption that you would have on any valid SSL site. This means that as long as the warning message is acceptable, communication between Web server and client is still secure. The concept of uniquely knowing the server's identity, which is guaranteed by a valid SSL certificate, is forfeited.
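A setup that gives each SSL-enabled domain its own IP address could look like the following sketch. The addresses (taken from the documentation range 192.0.2.0/24), the host names, and the certificate file names are all hypothetical.

```apache
# One IP address per SSL-enabled domain (all values are examples)
<VirtualHost 192.0.2.10:443>
    ServerName www.example.com
    SSLEngine on
    SSLCertificateFile /etc/apache2/ssl.crt/www.example.com.crt
    SSLCertificateKeyFile /etc/apache2/ssl.key/www.example.com.key
</VirtualHost>

<VirtualHost 192.0.2.11:443>
    ServerName shop.example.com
    SSLEngine on
    SSLCertificateFile /etc/apache2/ssl.crt/shop.example.com.crt
    SSLCertificateKeyFile /etc/apache2/ssl.key/shop.example.com.key
</VirtualHost>
```

Because Apache selects the virtual host by IP address here, it can present the certificate that matches the requested domain.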

To activate mod_ssl in SUSE Linux, either add ssl to APACHE_MODULES in /etc/sysconfig/apache2 or use YaST as described in Modules. Additionally, the Web server must be configured to listen on the standard HTTPS port 443. This can be done manually in /etc/apache2/listen.conf or in YaST via the Listen menu entry (see Network Device Selection).

A test SSL certificate can be created by entering cd /usr/share/doc/packages/apache2; ./certificate.sh as root. Follow the on-screen instructions to build the SSL certificate. The resulting certificate files reside in the directories /etc/apache2/ssl*.

A “real” certificate with global validity can be obtained from vendors such as Thawte (http://www.thawte.com/) or Verisign (www.verisign.com).

If manually editing the Apache configuration file, use this example as a guideline for configuring mod_ssl.

Example 46.14. Manual Configuration of mod_ssl

# Global Environment
# listen on the standard SSL port
Listen 443
# load module only if rcapache2 start-ssl was issued
<IfDefine SSL>
LoadModule ssl_module /path/to/mod_ssl.so
</IfDefine>

# Main Server context
# include global (server-wide) SSL configuration 
# that is not specific to any virtual host
# only if ssl_module was loaded
<IfModule mod_ssl.c>
Include /etc/apache2/ssl-global.conf
</IfModule>
            
Tip

Do not forget to open the firewall for SSL-enabled Apache on port 443. This can be done via YaST by going to Security and Users+Firewall+Allowed Services. Then add HTTPS Server to the list of Allowed Services.

46.5.3. External Modules

Officially, modules labeled external are not included in the Apache distribution. However, SUSE Linux provides several of them ready for use. This section briefly explains some external modules and their functionality.

46.5.3.1. Using Perl to Manage Apache: mod_perl

mod_perl embeds a persistent Perl interpreter in Apache. This avoids the overhead caused by mod_cgi, which calls an external executable on every request to a CGI. mod_perl additionally allows controlling many aspects of Apache functionality with the help of the Perl programming language.

To use mod_perl in SUSE Linux, install the apache2-mod_perl RPM and activate the module either via YaST (Modules) or manually in /etc/sysconfig/apache2. After installation and activation, a separate configuration file, mod_perl.conf, is placed in /etc/apache2/conf.d/. Additionally, the mod_perl start-up script is installed as mod_perl-startup.pl. For more information about how to use the module, consult the documentation available on the mod_perl Web site (http://perl.apache.org/).
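As a rough sketch of what a mod_perl setup can look like, the following fragment runs existing Perl CGI scripts through the persistent interpreter. It assumes mod_perl 2 directive names and a hypothetical script directory /srv/www/perl. The mod_perl.conf shipped with the package may differ.

```apache
# NOTE: /srv/www/perl is a hypothetical script directory
Alias /perl/ /srv/www/perl/

<Location /perl/>
    # Hand requests to the embedded Perl interpreter
    SetHandler perl-script
    # ModPerl::Registry runs unmodified CGI-style scripts persistently
    PerlResponseHandler ModPerl::Registry
    # Let mod_perl parse the headers printed by the script
    PerlOptions +ParseHeaders
    Options +ExecCGI
</Location>
```

Scripts placed in that directory are compiled once per server process and reused for later requests, which is where the speed advantage over mod_cgi comes from.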

46.5.3.2. Serving PHP: mod_php4, mod_php5

PHP is a popular programming language originally geared towards usage on the Web. It exists in two versions, PHP4 and PHP5. While PHP4 represents the classic concept of and approach to PHP, PHP5 has introduced new object-oriented programming possibilities along with many other advanced features. Both mod_php4 and mod_php5 are available in SUSE Linux. They embed the PHP interpreter into Apache as a persistent module.

To use mod_php4 or mod_php5 in SUSE Linux, install the respective RPM (apache2-mod_php4, apache2-mod_php5) and activate the module either via YaST (Modules) or manually in /etc/sysconfig/apache2.

After installation and activation, a separate configuration file for the respective module (either php4.conf or php5.conf) is placed in /etc/apache2/conf.d/. The PHP Web site (http://www.php.net) is an excellent resource for using Apache together with PHP.
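The essential part of such a configuration is usually a mapping from file extensions to the PHP handler. The following fragment is a minimal sketch for mod_php5, not a copy of the packaged php5.conf. The choice of extensions and index files is an assumption.

```apache
# Only applies if the PHP 5 module is loaded
<IfModule mod_php5.c>
    # Let the embedded PHP interpreter process .php files
    AddHandler application/x-httpd-php .php
    # Use index.php as directory index where present
    DirectoryIndex index.php index.html
</IfModule>
```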

46.5.3.3. Python and Apache: mod_python

mod_python embeds the Python interpreter into Apache. Python is an object-oriented programming language with a very clear and legible syntax. An unusual but convenient feature is that the program structure depends on the source code indentation rather than regular demarcation elements such as begin and end.

To use mod_python in SUSE Linux, install the apache2-mod_python RPM and activate the module either via YaST (Modules) or manually in /etc/sysconfig/apache2. For more information about how to use the module, consult the documentation available on the mod_python Web site (http://www.modpython.org/).
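A typical way to try out the module is the publisher handler that comes with mod_python. The fragment below is a sketch assuming mod_python 3.x directive names and a hypothetical directory /srv/www/htdocs/python.

```apache
# NOTE: /srv/www/htdocs/python is a hypothetical directory
<Directory /srv/www/htdocs/python>
    # Let mod_python handle files ending in .py
    AddHandler mod_python .py
    # The publisher handler maps URLs to functions in Python modules
    PythonHandler mod_python.publisher
    # Show Python tracebacks in the browser (for testing only)
    PythonDebug On
</Directory>
```

PythonDebug should be switched off on production servers, because tracebacks can reveal details about the application.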

46.5.3.4. Ruby Interpreter in Apache: mod_ruby

mod_ruby embeds the Ruby interpreter in the Apache Web server, allowing Ruby CGI scripts to be executed natively. Ruby is a relatively new, object-oriented high-level programming language that resembles certain aspects of Perl and Python. Like Python, it has a clean, transparent syntax. On the other hand, Ruby has adopted abbreviations (such as $. for the number of the last line read in the input file) that are appreciated by some programmers and disliked by others. The basic concept of Ruby closely resembles that of Smalltalk.

To use mod_ruby in SUSE Linux, install the apache2-mod_ruby RPM and activate the module either via YaST (Modules) or manually in /etc/sysconfig/apache2. For more information about how to use the module, consult the documentation available on the mod_ruby Web site (http://www.modruby.net/en/index.rbx).
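The following fragment sketches a configuration along the lines of the example in the mod_ruby documentation. The location /ruby is a hypothetical choice, and the directive names should be checked against the installed mod_ruby version.

```apache
<IfModule mod_ruby.c>
    # Load the library that executes Ruby CGI scripts
    RubyRequire apache/ruby-run

    # NOTE: /ruby is a hypothetical location
    <Location /ruby>
        # Run everything below /ruby as a Ruby script
        SetHandler ruby-object
        RubyHandler Apache::RubyRun.instance
    </Location>
</IfModule>
```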

46.5.3.5. Native File System Access: mod_dav

mod_dav provides WebDAV (Web-Based Distributed Authoring and Versioning) functionality for Apache. WebDAV is an extension of the HTTP protocol that allows users to collaboratively edit and manage files on remote servers. WebDAV's capabilities are similar to those of FTP with the major difference that HTTP is used as the underlying protocol for server access. In effect, mod_dav makes an Apache Web server an advanced remote file system.

It is good practice, if not required, to limit access to the directories available via WebDAV. The minimum precautions to take are to set up HTTP basic authentication for the WebDAV resource, along with Limit clauses inside a Location directive.
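The precautions described above could be sketched as follows. All paths, the location /dav, and the realm name are hypothetical. The lock database directory must be writable by the Apache user, and the password file has to be created separately.

```apache
# Main server context
# Lock database used by mod_dav_fs (hypothetical path)
DavLockDB /var/lib/apache2/dav/lockdb

# Publish a hypothetical directory under /dav
Alias /dav /srv/www/dav

<Location /dav>
    Dav On
    # HTTP basic authentication for the WebDAV resource
    AuthType Basic
    AuthName "WebDAV"
    AuthUserFile /etc/apache2/dav.passwd
    # Require a login for every method that modifies the resource
    <Limit PUT POST DELETE PROPFIND PROPPATCH MKCOL COPY MOVE LOCK UNLOCK>
        Require valid-user
    </Limit>
</Location>
```

With this sketch, anonymous clients can still read files with GET, while uploading, deleting, and locking require a valid user. Because basic authentication transmits passwords without encryption, combining it with mod_ssl is advisable.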

To access a WebDAV resource, WebDAV-capable software needs to be present on the client side. SUSE Linux already comes with WebDAV capabilities: Konqueror with the prefix webdav:// or webdavs:// (for WebDAV over SSL connections) can be used to connect to an Apache WebDAV file system.

mod_dav requires the module mod_dav_fs, which provides the actual file system access for WebDAV. To use mod_dav in SUSE Linux, activate the module either via YaST (Modules) or manually in /etc/sysconfig/apache2. Do the same for mod_dav_fs. For more information about how to use the module, consult the documentation available on the mod_dav Web site (http://httpd.apache.org/docs-2.0/mod/mod_dav.html).

46.5.3.6. Offering User Home Pages: mod_userdir

mod_userdir in SUSE Linux defaults to offering the contents of each user's ~/public_html folder as public Web pages. The URL to access those pages then is http://www.example.com/~username/.
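Expressed as configuration, this behavior corresponds roughly to the following sketch. The Directory block and its options are assumptions modeled on common Apache defaults and are not necessarily what SUSE Linux ships.

```apache
# Main server context
# Map http://www.example.com/~username/ to ~username/public_html
UserDir public_html

# Restrict what users may do in their public directories
# (the option set below is an assumption)
<Directory /home/*/public_html>
    AllowOverride FileInfo AuthConfig Limit Indexes
    Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec
</Directory>
```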

Tip

mod_userdir in SUSE Linux forbids access to the root user's home directory for security reasons. Additionally, you can allow only certain users to have public home pages by using:

# Main server context
UserDir disabled
UserDir enabled tux wilber
            

To use mod_userdir in SUSE Linux, activate the module either via YaST (Modules) or manually in /etc/sysconfig/apache2. For more information about how to use the module, consult the documentation available on the mod_userdir Web site (http://httpd.apache.org/docs-2.0/mod/mod_userdir.html).

46.5.3.7. Changing URL Layout: mod_rewrite

mod_rewrite is often referred to as “the Swiss army knife of URL manipulation.” It rewrites requested URLs on the fly based on a specified rule set. The result typically looks similar to http://www.example.com/2/1/de for http://www.example.com/display.php?cat=2&article=1&lang=de.

The URL Rewriting Guide explains the advantages and disadvantages of the powerful but complex module:

With mod_rewrite you either shoot yourself in the foot the first time and never use it again or love it for the rest of your life because of its power.

Rewrite rules can be defined in all configuration contexts: for the main server, for virtual hosts, for directories, and for .htaccess files. A good starting point for URL rewriting with mod_rewrite is the URL Rewriting Guide at http://httpd.apache.org/docs-2.0/misc/rewriteguide.html.
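The URL example from the beginning of this section could be implemented with a rule like the one in the following sketch. It assumes main server or virtual host context and that the category and article parts are always numeric. The pattern is an illustration only.

```apache
# Main server or virtual host context
RewriteEngine On

# Internally map /2/1/de to /display.php?cat=2&article=1&lang=de
# $1 = category, $2 = article, $3 = language code
RewriteRule ^/([0-9]+)/([0-9]+)/([a-z]+)$ /display.php?cat=$1&article=$2&lang=$3 [L]
```

The client keeps seeing the short URL, because the rule rewrites the request internally instead of sending a redirect. In directory or .htaccess context, the pattern is matched against the path relative to that directory, so the leading slash in the pattern would have to be dropped.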

To use mod_rewrite in SUSE Linux, activate the module either via YaST (Modules) or manually in /etc/sysconfig/apache2.