This page last modified: Aug 15 2008
title:Apache httpd suExec, Rewrite and VirtualHost
keywords:apache,suexec,su,exec,httpd,rewrite,permissions,privs,privileges,forbidden,uid,suid,script,document,root
description:How to get suExec working for Virtual Hosts and RewriteEngine examples for scripts.
Jan 03 2007
Table of contents
-----------------
Synopsis
What is DocumentRoot?
The workaround with embedded comments
Old workaround
Explanation of old workaround
Test script
Debugging mod_rewrite and pattern matching
Synopsis
--------
When running CGI scripts on a web server, you will often need to read
and write data to files. On a multiuser server, you do not want other
users reading or writing your data. However, on most servers, CGI
scripts for all users run as the user "apache" or "www". This means
that every user can read and write your files. Even if the files are
created as RW only for apache, and other users are not in the apache
group, a trivial CGI script will enable access to other files because
the script runs as apache. The solution is that your scripts run as
you, and not as some other user. You store data in directories outside
the web accessible area. Apache will su to you via suexec. However,
suexec will not work unless the URL has ~userid, and thus we need to
rewrite requests. Suexec only works in virtual hosting or with user
directories. (For files in document root, you can only have one
SuexecUserGroup directive, and thus only one user.)
http://httpd.apache.org/docs/2.2/suexec.html#usage
Suexec works great if:
1) you have a virtual host and your files are in document root
2) no virtual host and your files are in the userdir (assuming you have userdir enabled)
However, to get suexec to work with virtual hosted domains that
aren't in the standard document root or for URLs which don't contain
~userid, you need the workaround below. As far as I know, using
the Apache RewriteEngine as outlined below is secure.
I think this example should be part of the Apache docs, but those docs
are some of the best I've ever seen, so it's hard to be critical.
I'm running Apache 2, and as far as I know this (mostly) also works
with Apache 1.3.
The Apache 2 suexec docs are:
http://httpd.apache.org/docs-2.0/suexec.html
For the following examples, we'll assume that our machine has a
virtual host "example.com", and you are the user "mst3k". You either
have root access, or the sysadmin is willing to help you and is
willing to let you use some powerful Apache features.
What is DocumentRoot?
---------------------
"document root" in this context is what is returned by suexec -V (you must be root to run
this command).
[root@example ~]# suexec -V
-D AP_DOC_ROOT="/var/www"
-D AP_GID_MIN=100
-D AP_HTTPD_USER="apache"
-D AP_LOG_EXEC="/var/log/httpd/suexec.log"
-D AP_SAFE_PATH="/usr/local/bin:/usr/bin:/bin"
-D AP_UID_MIN=500
-D AP_USERDIR_SUFFIX="public_html"
[root@example ~]#
There are several things we can conclude from the output above, and
some of them are not obvious:
- The VirtualHost directive SuexecUserGroup will only work for scripts
somewhere in AP_DOC_ROOT which is /var/www for this example. Changing
document root in a VirtualHost with the DocumentRoot directive will
*not* affect this setting. Directories symlinked as subdirectories of
/var/www are supported, and will suexec.
- The userdir is public_html (as usual).
- Users with a uid under 500 can't suexec. Groups with gid under 100
can't suexec.
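The conclusions above can also be checked mechanically. What follows
is a minimal Python sketch, not part of Apache or suexec, that parses
text in the format printed by suexec -V and applies the uid/gid
minimums. The names parse_suexec_settings and meets_id_minimums are
made up for this illustration, and the sample text is copied from the
output shown earlier.

```python
# Sketch only: parse `suexec -V` style output and apply the uid/gid minimums.
# The helper names below are hypothetical, not part of Apache or suexec.

SAMPLE_OUTPUT = """\
 -D AP_DOC_ROOT="/var/www"
 -D AP_GID_MIN=100
 -D AP_HTTPD_USER="apache"
 -D AP_LOG_EXEC="/var/log/httpd/suexec.log"
 -D AP_SAFE_PATH="/usr/local/bin:/usr/bin:/bin"
 -D AP_UID_MIN=500
 -D AP_USERDIR_SUFFIX="public_html"
"""


def parse_suexec_settings(text):
    """Return a dict mapping each -D NAME to its value, with quotes removed."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("-D "):
            continue  # skip shell prompts and blank lines
        name, _, value = line[3:].partition("=")
        settings[name.strip()] = value.strip().strip('"')
    return settings


def meets_id_minimums(settings, uid, gid):
    """True if uid and gid are at or above the compiled-in minimums."""
    return uid >= int(settings["AP_UID_MIN"]) and gid >= int(settings["AP_GID_MIN"])


if __name__ == "__main__":
    settings = parse_suexec_settings(SAMPLE_OUTPUT)
    print(settings["AP_DOC_ROOT"])               # /var/www
    print(meets_id_minimums(settings, 501, 501)) # True  (mst3k in the examples)
    print(meets_id_minimums(settings, 48, 48))   # False (apache, uid 48)
```

On a real machine you would feed it the text you get from running
suexec -V as root instead of SAMPLE_OUTPUT.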
There are some good practical reasons to locate every user's document
root in /home/user/public_html even when virtually hosting. It keeps
all the files owned by non-admin users in /home. The user's
public_html is a real directory in the user space, and not a symlink
to a subdirectory in /var/www. The /var file system doesn't need to be
large enough to accommodate web space (not a problem on most modern
systems, but a headache in the old days). If everything a user needs
is in /home/user, there is no need for symlinks to other parts of the
disk. Backups are easier when all the user data is in /home (I also
keep users' mailboxes in /home, i.e. /home/mst3k/Maildir, and
the user's httpd logs are in /home/mst3k/logs).
The problem stems from how paranoid suexec is. If the URL contains
~userid, then suexec will happily su from apache to userid. However,
if the URL is a virtually hosted URL (in the identical directory) and
does not contain ~userid, then suexec will *not* su (switch user). For
example /home/mst3k/public_html is document root for the virtually
hosted example.com. Normally suexec will su for
http://example.com/~mst3k/test_id.pl, but will not su for
http://example.com/test_id.pl even though it is the same script in the
same directory.
The workaround with embedded comments
-------------------------------------
In order to have your scripts suexec to you instead of running as
apache or www, use a .htaccess file with the following RewriteEngine
rules between -- and --
--
DirectoryIndex index.html index.pl index.cgi
Options +ExecCGI +SymLinksIfOwnerMatch
AddType application/xml .xml
AddHandler cgi-script .pl .cgi
XbitHack on
RewriteEngine on
RewriteBase /
# If the request URI is /~mst3k/index.html
# then the RewriteRule matches on index.html
# If the request URI is /~mst3k/foo/index.html
# then the RewriteRule matches on foo/index.html
# Rewrite non ~ requests as ~userid requests so that suexec works. If
# you rewrite all requests (including those with a ~) then you'll have
# a redirect loop. These rules do not redirect, they only *rewrite*
# the request. In other words, these rules change how Apache treats
# the request, but the browser still sees the original URL. The effect
# of these rules is almost impossible to detect from the browser.
# The main trick is that the RewriteCond immediately before the
# RewriteRule must have a capturing regular expression that captures
# the userid in %1 (the first captured expression).
# When using virtual hosting to a user's public_html directory, the
# document root will be the user's public_html, for example
# /home/mst3k/public_html.
# However, when using an Alias directive document root will be the
# normal document root which is usually /var/www/html. When using
# Alias we have to get the userid from the SCRIPT_FILENAME instead of
# document root. The rules below work for when the alias is to
# ~/public_html. For example:
# Alias /foo/ "/home/mst3k/public_html/"
# The Alias rules below only support .pl and .cgi file extensions.
# The rules below are for Alias.
RewriteCond %{REQUEST_URI} !^/~.*$
RewriteCond %{SCRIPT_FILENAME} \/home\/(.*)\/public_html\/(.*\.(pl|cgi))
RewriteRule ^.*$ /~%1/%2 [L]
# The rules below are for virtual hosting.
RewriteCond %{REQUEST_URI} !^/~.*$
RewriteCond %{DOCUMENT_ROOT} \/home\/(.*)\/public_html
RewriteRule ^(.*)$ /~%1/$1 [L]
# Notes about mod_rewrite
# Use something like the rewrite below to debug pattern matching.
# Clear your browser cache, or force a non-cached reload (shift-reload
# in Firefox).
# The whole string that RewriteRule matches against becomes the query
# string. This is useful because what RewriteRule matches against is
# not the URI. Every request is redirected to foo.html. Enable this only
# for debugging.
# RewriteCond %{REQUEST_URI} !foo
# RewriteRule (.*) /~twl8n/foo.html?$1 [R,L]
--
As explained in the comments, there are two variants:
1) a new version for use with the "Alias" directive where document
root is still the Apache default /var/www/html.
2) the original for use with virtual hosting where your document root
is the public_html directory in your home directory
Most of the notes from below still apply.
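To see how the two rule sets interact, here is a rough Python
imitation of them. It is only a sketch of the pattern matching, not of
mod_rewrite itself. The function name rewrite and its three arguments
are made up for this illustration, and it assumes the .htaccess file
sits at the top of the site, so RewriteRule sees the URI without its
leading slash.

```python
import re

# Sketch only: imitate the Alias and virtual hosting rule sets.
# The function name and arguments are hypothetical.


def rewrite(request_uri, document_root, script_filename):
    """Return the URI Apache would use internally after the rules run."""
    # RewriteCond %{REQUEST_URI} !^/~.*$   (shared by both rule sets)
    if re.search(r"^/~.*$", request_uri):
        return request_uri

    # Alias rules: userid and script path both come from SCRIPT_FILENAME.
    alias = re.search(r"/home/(.*)/public_html/(.*\.(pl|cgi))", script_filename)
    if alias:
        return "/~{}/{}".format(alias.group(1), alias.group(2))

    # Virtual hosting rules: userid comes from DOCUMENT_ROOT.
    vhost = re.search(r"/home/(.*)/public_html", document_root)
    if vhost:
        # Assumption: .htaccess is at the top of the site, so $1 is the
        # request URI without its leading slash.
        return "/~{}/{}".format(vhost.group(1), request_uri[1:])

    return request_uri  # no rule matched, request is left alone


if __name__ == "__main__":
    home = "/home/mst3k/public_html"
    # Virtual host, static page: only the virtual hosting rule matches.
    print(rewrite("/index.html", home, home + "/index.html"))
    # Alias /foo/ with the default document root: script is rewritten.
    print(rewrite("/foo/test_id.pl", "/var/www/html", home + "/test_id.pl"))
    # Alias /foo/ with a static page: neither rule set matches.
    print(rewrite("/foo/index.html", "/var/www/html", home + "/index.html"))
```

The three calls print /~mst3k/index.html, /~mst3k/test_id.pl and
/foo/index.html. The last case shows the limitation noted in the
comments: under Alias only .pl and .cgi requests are rewritten.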
In order to debug this process, you'll want my envquery.pl script (see
notes below about downloading) and you may want to uncomment the two
final lines in the .htaccess example above.
These are the lines:
RewriteCond %{REQUEST_URI} !foo
RewriteRule (.*) /~twl8n/foo.html?$1 [R,L]
These lines help you answer the question "What string is RewriteRule
matching the regex against?"
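As a rough model of what the debugging rules report, here is a small
Python sketch. It follows the description in the .htaccess comments:
the URL prefix of the directory holding .htaccess is stripped before
RewriteRule sees the string. The function names rule_input and
debug_redirect and the htaccess_url_prefix argument are made up for
this illustration.

```python
# Sketch only: model what string RewriteRule sees in a .htaccess file and
# where the debugging rule pair would redirect. Names are hypothetical.


def rule_input(request_path, htaccess_url_prefix):
    """Strip the per-directory URL prefix, as the example's comments describe."""
    if request_path.startswith(htaccess_url_prefix):
        return request_path[len(htaccess_url_prefix):]
    return request_path


def debug_redirect(request_path, htaccess_url_prefix, target="/~twl8n/foo.html"):
    """Imitate the debugging rules; None means no redirect is issued."""
    # RewriteCond %{REQUEST_URI} !foo
    # Any URI containing "foo" is skipped, which keeps the debug page
    # from redirecting to itself.
    if "foo" in request_path:
        return None
    # RewriteRule (.*) /~twl8n/foo.html?$1 [R,L]
    return "{}?{}".format(target, rule_input(request_path, htaccess_url_prefix))


if __name__ == "__main__":
    print(debug_redirect("/~mst3k/index.html", "/~mst3k/"))
    print(debug_redirect("/~mst3k/bar/index.html", "/~mst3k/"))
    print(debug_redirect("/~twl8n/foo.html", "/~twl8n/"))
```

This prints /~twl8n/foo.html?index.html, then
/~twl8n/foo.html?bar/index.html, then None. Note that any request with
"foo" in its URI is skipped by the RewriteCond, so test with other
file names while the debugging rules are enabled.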
It is often useful to make small changes in working rules when
debugging the regular expressions. Also, when changes to .htaccess do
not seem to have any effect, be sure you are doing a non-cached forced
page reload. In Firefox this is "shift-reload". I think it is
"control-refresh" in IE.
Old workaround
--------------
The following workaround applies to httpd.conf or .htaccess. However,
for it to work in .htaccess you'll need privileges. I've used
AllowOverride all, but some lesser privileges may work.
Copy the following lines into the highest level .htaccess file,
e.g. into /home/mst3k/public_html (as opposed to a sub directory of
public_html).
This version will only rewrite Perl scripts (.pl). As far as I know,
it will work for scripts in subdirectories without the need for an
additional copy in each subdirectory's .htaccess file.
# Workaround to get non-tilde URLs to become
# tilde URLs so that Apache will correctly su to mst3k
# instead of running the scripts as user apache.
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_URI} !~
RewriteCond %{REQUEST_URI} ^\/.*\.pl$
RewriteCond %{DOCUMENT_ROOT} \/home\/mst3k\/public_html
RewriteRule ^(.*)$ /~mst3k/$1 [L]
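The three conditions in this hard coded version can be imitated in a
few lines of Python. This is only a sketch of the matching logic. The
function name rewrite_pl_only is made up, and it assumes the .htaccess
file is in the top level public_html, so $1 is the request URI without
its leading slash.

```python
import re

# Sketch only: the hard coded, .pl-only variant. The function name is
# hypothetical.


def rewrite_pl_only(request_uri, document_root, userid="mst3k"):
    """Return the internally rewritten URI, or the original if a cond fails."""
    # RewriteCond %{REQUEST_URI} !~   (no tilde anywhere in the URI)
    if "~" in request_uri:
        return request_uri
    # RewriteCond %{REQUEST_URI} ^\/.*\.pl$   (Perl scripts only)
    if not re.search(r"^/.*\.pl$", request_uri):
        return request_uri
    # RewriteCond %{DOCUMENT_ROOT} \/home\/mst3k\/public_html
    wanted_root = "/home/{}/public_html".format(re.escape(userid))
    if not re.search(wanted_root, document_root):
        return request_uri
    # RewriteRule ^(.*)$ /~mst3k/$1 [L]
    # Assumption: .htaccess is at the top of the site, so $1 has no
    # leading slash.
    return "/~{}/{}".format(userid, request_uri[1:])


if __name__ == "__main__":
    root = "/home/mst3k/public_html"
    print(rewrite_pl_only("/test_id.pl", root))
    print(rewrite_pl_only("/sub/dir/test_id.pl", root))
    print(rewrite_pl_only("/index.html", root))
```

This prints /~mst3k/test_id.pl, /~mst3k/sub/dir/test_id.pl and
/index.html. Scripts in subdirectories are rewritten, and anything
that is not a .pl file is left alone.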
Below is the original version that I used. This had some "features":
1) It doesn't [L] and therefore the rule evaluation continues and
evaluation of following rules can lead to unexpected results.
2) Although this looks safe from infinite loops (redirection loops),
when working on an instance of this code, I got an infinite loop (it
may have involved a subdirectory, or could have been an unrelated
bug). The version above does not loop.
3) The use of the capturing regex to get the userid is clever and
flexible. It is also slightly less efficient than the hard coded
version.
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_URI} !^/~.*$
RewriteCond %{DOCUMENT_ROOT} \/home\/(.*)\/public_html
RewriteRule ^(.*)$ /~%1/$1
Explanation of old workaround
-----------------------------
This is the explanation of the original version.
# Enable the rewrite engine
RewriteEngine on
# Set the base URL path used for relative substitutions in per-directory
# (.htaccess) rules. I think / is equivalent to the current setting of
# DocumentRoot.
RewriteBase /
# All RewriteCond statements must be true for the RewriteRule that
# immediately follows them to run. Rule processing then continues with
# any later rules unless the rule that ran has the [L] (last) flag.
# REQUEST_URI must not begin with /~ i.e. must not be like http://example.com/~mst3k/
RewriteCond %{REQUEST_URI} !^/~.*$
# DOCUMENT_ROOT is matched against the regular expression
# /home/(.*)/public_html, and (.*) is captured in variable %1.
# This captures the userid, in our examples mst3k. This is generalized
# to work without changes for the different users. You should not need
# to edit this code for different users. You could leave out this line
# and hard code the user id in the next line.
RewriteCond %{DOCUMENT_ROOT} \/home\/(.*)\/public_html
# The URL path (without the domain name) is matched by ^(.*)$, and the
# expression is captured in $1. Apache internally rewrites the request
# using %1 from above and $1. For example, if example.com is
# virtually hosted from directory /home/mst3k/public_html then
# http://example.com/foo/test.pl
# becomes /~mst3k/foo/test.pl
# This is the crux of the workaround: the rewritten URI contains a
# ~userid and therefore Apache will suexec.
RewriteRule ^(.*)$ /~%1/$1
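The captures can be traced by hand, or with a short Python sketch like
the one below. It is an illustration of the regular expressions only,
not of how Apache is implemented. The function name trace_rewrite and
the layout of the returned dict are made up, and it assumes the
.htaccess file is at the top of the site.

```python
import re

# Sketch only: trace the captures used by the original rule set.
# The function name and the returned dict layout are hypothetical.


def trace_rewrite(request_uri, document_root):
    """Return the captures and result, or None when a RewriteCond fails."""
    # RewriteCond %{REQUEST_URI} !^/~.*$
    if re.search(r"^/~.*$", request_uri):
        return None  # already a ~userid URI, so it is left alone
    # RewriteCond %{DOCUMENT_ROOT} \/home\/(.*)\/public_html
    cond = re.search(r"/home/(.*)/public_html", document_root)
    if cond is None:
        return None
    # RewriteRule ^(.*)$ /~%1/$1
    # Assumption: the rule sees the URI without its leading slash.
    rule = re.search(r"^(.*)$", request_uri[1:])
    percent_1, dollar_1 = cond.group(1), rule.group(1)
    return {
        "%1": percent_1,
        "$1": dollar_1,
        "result": "/~{}/{}".format(percent_1, dollar_1),
    }


if __name__ == "__main__":
    first = trace_rewrite("/foo/test.pl", "/home/mst3k/public_html")
    print(first)
    # Feeding the result back in shows why the first RewriteCond matters:
    # the rewritten URI starts with /~ and is not rewritten again.
    print(trace_rewrite(first["result"], "/home/mst3k/public_html"))
```

The first call gives %1 = mst3k, $1 = foo/test.pl and the result
/~mst3k/foo/test.pl. The second call returns None, which is the guard
against rewriting a ~userid request a second time.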
Test script
-----------
You can test this with the following 4 line script. Save this as a
file such as /home/mst3k/public_html/test_id.pl.
#!/usr/bin/perl
use strict;
my $id_info = `/usr/bin/id`;
print "Content-type: text/html\n\n<html><body>$id_info</body></html>\n";
Run a command to get the file permissions right:
chmod +x,go-rw test_id.pl
I assume you are running CGI scripts, or there's little reason to need
suexec, but nonetheless httpd.conf or .htaccess needs:
AddHandler cgi-script .cgi .pl
Scripts won't run if any of the directories containing the script are
group writable.
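If a script refuses to run, it can help to list the directories above
it that have loose write permissions. The Python sketch below does
that. It is not a copy of suexec's checks, which are more extensive.
The function names, the stop_at argument, and the choice to flag
world-writable directories as well as group-writable ones are
assumptions made for this illustration.

```python
import os
import stat

# Sketch only: report directories above a script that are group or world
# writable. suexec performs its own checks; the names here are hypothetical.


def is_loosely_writable(mode):
    """True if the permission bits allow group or other to write."""
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))


def loosely_writable_parents(script_path, stop_at="/home"):
    """List parent directories of script_path, below stop_at, that fail the check."""
    problems = []
    directory = os.path.dirname(os.path.abspath(script_path))
    # Stop at stop_at, or at the filesystem root (which is its own parent).
    while directory not in (stop_at, os.path.dirname(directory)):
        if is_loosely_writable(os.stat(directory).st_mode):
            problems.append(directory)
        directory = os.path.dirname(directory)
    return problems


if __name__ == "__main__":
    print(is_loosely_writable(0o755))  # False
    print(is_loosely_writable(0o775))  # True
    # Example call, using the path from this page:
    # print(loosely_writable_parents("/home/mst3k/public_html/test_id.pl"))
```

An empty list from loosely_writable_parents means no directory between
the script and stop_at is group or world writable.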
When the rewrite works, these two URLs give identical results:
http://example.com/~mst3k/test_id.pl
http://example.com/test_id.pl
Both show a simple web page like this:
uid=501(mst3k) gid=501(mst3k) groups=48(apache),501(mst3k)
Without the rewrite (or if it isn't working), only the ~mst3k script
will be userid mst3k. Without suexec, all the userids/group ids will
be apache.
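When checking several URLs it is handy to pull the user name out of
the page text rather than read it by eye. Here is a minimal Python
sketch that parses output in the format shown above. The function
names parse_id_output and ran_as are made up for this illustration,
and fetching the page is left out.

```python
import re

# Sketch only: extract uid, gid and group names from `id` output so a test
# can assert which user the CGI ran as. Function names are hypothetical.


def parse_id_output(text):
    """Parse text like 'uid=501(mst3k) gid=501(mst3k) groups=48(apache),501(mst3k)'."""
    uid = re.search(r"uid=(\d+)\(([^)]*)\)", text)
    gid = re.search(r"gid=(\d+)\(([^)]*)\)", text)
    groups = re.search(r"groups=(\S+)", text)
    return {
        "uid": int(uid.group(1)),
        "user": uid.group(2),
        "gid": int(gid.group(1)),
        "group": gid.group(2),
        "groups": re.findall(r"\(([^)]*)\)", groups.group(1)) if groups else [],
    }


def ran_as(text, userid):
    """True if the `id` output in text reports userid as the user."""
    return parse_id_output(text)["user"] == userid


if __name__ == "__main__":
    page = "uid=501(mst3k) gid=501(mst3k) groups=48(apache),501(mst3k)"
    print(parse_id_output(page)["groups"])  # ['apache', 'mst3k']
    print(ran_as(page, "mst3k"))            # True
    print(ran_as("uid=48(apache) gid=48(apache) groups=48(apache)", "mst3k"))  # False
```

If ran_as is true for both the ~mst3k URL and the plain URL, the
rewrite and suexec are both working.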
A more extensive diagnostic is my envquery.pl script. You can download
it here:
http://defindit.com/readme_files/envquery.tar
(packed in a tar file so virus scanners don't get upset).
Debugging mod_rewrite and pattern matching
------------------------------------------
# Use something like this to debug pattern matching
#RewriteRule ^(.*)$ /~mst3k/index.html?$1 [R]