IP-based virtual hosts and name-based virtual hosts in Apache

The authoritative documents on IP-based virtual hosts and name-based virtual hosts are in the official Apache documentation. But like other official Apache documents, the information there is obscure. You must think hard, guess hard, and run plenty of experiments to understand what they are talking about. I looked at those documents because I ran into a problem configuring Apache and thought I needed to understand the NameVirtualHost directive to resolve it. Searching Google, I found that the NameVirtualHost directive has been deprecated in the latest Apache version. However, since my Apache version is 2.2, which still uses the NameVirtualHost directive, I must understand how it works.

The directive relates to the two modes in which Apache serves websites: IP-based hosting and name-based hosting. After reading the documents mentioned above, I think the best way to understand the concepts is to compose an httpd.conf that contains both IP-based and name-based virtual hosts. But let's begin with the simplest case, then consider more complicated ones.

Suppose there is only one IP address on your server (192.168.1.1), and you set up two IP-based virtual hosts: the first specifies only a port and the second specifies only the IP address.
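A minimal sketch of such a configuration, reconstructed from the discussion that follows. The DocumentRoot paths are hypothetical, and port 80 is an assumption:

```apache
# Assumed reconstruction: paths are placeholders, not taken from the original setup.
Listen 80

# First virtual host: wildcard IP, specific port
<VirtualHost *:80>
    DocumentRoot /var/www/site1
</VirtualHost>

# Second virtual host: specific IP, no port
<VirtualHost 192.168.1.1>
    DocumentRoot /var/www/site2
</VirtualHost>
```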

Now which virtual host will serve a request arriving at 192.168.1.1? According to the official document:

Specific IP addresses or ports have precedence over their wildcard equivalents, and any virtual host that matches has precedence over the servers base configuration.

The first virtual host has a specific port number while the second virtual host has a specific IP address. Does a more specific port take precedence over a more specific IP, or the other way around? It turns out the specific IP takes precedence, and the second virtual host serves the request. Of course, if we added the specific port 80 to the second virtual host, things would be clearer.

Now modify the httpd.conf so that both virtual hosts name the IP address 192.168.1.1.
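As a sketch, the modified virtual host declarations might look like this. The DocumentRoot paths are hypothetical; the addresses are chosen to be consistent with the warning quoted below:

```apache
# Assumed reconstruction: paths are placeholders.

# First virtual host: specific IP, no port (httpd reports this as port 0)
<VirtualHost 192.168.1.1>
    DocumentRoot /var/www/site1
</VirtualHost>

# Second virtual host: specific IP and specific port
<VirtualHost 192.168.1.1:80>
    DocumentRoot /var/www/site2
</VirtualHost>
```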

Note that the first virtual host has no port specified and the second virtual host explicitly lists both the IP address and the port. It seems the second virtual host should win again this time. In fact, however, the first virtual host wins, and you will see the following warning when httpd starts:

[warn] VirtualHost 192.168.1.1:0 overlaps with VirtualHost 192.168.1.1:80, the first has precedence, perhaps you need a NameVirtualHost directive

It seems that if both matching virtual hosts specify the IP, the position order (not the specificity of the port) determines which one wins.

And the same thing happens if none of the virtual hosts specifies an ip.
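A configuration of this kind might look like the following sketch, where neither virtual host names an IP address. The DocumentRoot paths are hypothetical:

```apache
# Assumed reconstruction: paths are placeholders.

# First virtual host: wildcard IP, no port
<VirtualHost *>
    DocumentRoot /var/www/site1
</VirtualHost>

# Second virtual host: wildcard IP, specific port
<VirtualHost *:80>
    DocumentRoot /var/www/site2
</VirtualHost>
```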

The second virtual host in this configuration seems more specific, but when both virtual hosts specify * as the IP, the port is not taken into account in the comparison and the position order dominates. So the first virtual host will serve the request.

If there is no matching virtual host in httpd.conf, the default server (also called the main server), with the DocumentRoot /var/www/html in our case, will serve the request.
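One way to picture this is the sketch below, where the only virtual host sits on a port that a port-80 request never matches. The virtual host's address, port, and path are assumptions made for illustration:

```apache
Listen 80

# Main (default) server configuration
DocumentRoot /var/www/html

# Hypothetical virtual host: it does not match requests arriving on port 80,
# so those requests fall through to the main server above.
<VirtualHost 192.168.1.1:8080>
    DocumentRoot /var/www/site1
</VirtualHost>
```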

Let's make things a little more complex. Suppose there are three IP addresses configured on your server (192.168.1.1, 192.168.1.2, and 192.168.1.3), and you define one virtual host per address, in that order.
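A sketch of such a setup, with one virtual host per address. The port and the DocumentRoot paths are assumptions:

```apache
# Assumed reconstruction: port 80 and all paths are placeholders.
Listen 80

<VirtualHost 192.168.1.1:80>
    DocumentRoot /var/www/site1
</VirtualHost>

<VirtualHost 192.168.1.2:80>
    DocumentRoot /var/www/site2
</VirtualHost>

<VirtualHost 192.168.1.3:80>
    DocumentRoot /var/www/site3
</VirtualHost>
```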

Now a request arriving at 192.168.1.1 will be served by the first virtual host, a request arriving at 192.168.1.2 by the second, and a request arriving at 192.168.1.3 by the third.

Name-based virtual hosting is built on top of IP-based virtual hosting: the IP-based matching and selection algorithm (except for the position order factor) is applied first, before any matching on the name. In other words, ServerName matching is done only among the virtual hosts that matched on IP. If no ServerName matches, the first IP-matched virtual host is chosen. Consider a configuration in which the first virtual host is declared with a wildcard IP and the ServerName namedomain1.com, and the second is declared with the specific IP 192.168.1.1 and a different ServerName.
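A sketch of a configuration with this shape follows. The NameVirtualHost line, the second ServerName (other.example), and the DocumentRoot paths are all assumptions made for illustration:

```apache
# Assumed reconstruction: NameVirtualHost argument, second ServerName,
# and paths are placeholders.
NameVirtualHost *:80

# First virtual host: wildcard IP, ServerName matches the request
<VirtualHost *:80>
    ServerName namedomain1.com
    DocumentRoot /var/www/site1
</VirtualHost>

# Second virtual host: specific IP, ServerName does not match the request
<VirtualHost 192.168.1.1:80>
    ServerName other.example
    DocumentRoot /var/www/site2
</VirtualHost>
```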

Now which virtual host is chosen for the request http://namedomain1.com arriving at 192.168.1.1? You may think it is the first virtual host, because both its IP/port and its ServerName match the request, while the ServerName of the second virtual host does not. Unfortunately, the correct answer is the second virtual host. This is because IP/port matching occurs before ServerName matching. After IP/port matching, the only virtual host chosen is the second one, because it has the most specific IP/port (as discussed earlier). This virtual host alone forms the candidate set for ServerName matching. ServerName matching then fails, because the ServerName of the second virtual host does not match the request. According to the official Apache document, if no virtual host in the candidate set matches its ServerName against the Host header of the HTTP request, the first virtual host in the candidate set is selected, which here is the second virtual host.

In a typical setup, there are multiple virtual hosts in the candidate set after IP/port matching, each corresponding to one domain name, for example one for namedomain1.com and one for namedomain2.com on the same IP/port.
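A sketch of that typical setup, with two name-based virtual hosts sharing one IP/port. The DocumentRoot paths are hypothetical:

```apache
# Assumed reconstruction: paths are placeholders.
NameVirtualHost *:80

# Chosen when the Host header is namedomain1.com; also the fallback,
# because it is first in the candidate set.
<VirtualHost *:80>
    ServerName namedomain1.com
    DocumentRoot /var/www/namedomain1
</VirtualHost>

# Chosen when the Host header is namedomain2.com
<VirtualHost *:80>
    ServerName namedomain2.com
    DocumentRoot /var/www/namedomain2
</VirtualHost>
```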

The request http://namedomain1.com matches the first virtual host while the request http://namedomain2.com matches the second virtual host.

If no virtual host is set up for the IP/port range specified by the NameVirtualHost parameter, you will get the following warning when starting httpd:

[warn] NameVirtualHost *:8080 has no VirtualHosts
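A configuration that would be expected to trigger this warning might look like the sketch below: NameVirtualHost names *:8080, but the only virtual host is declared on *:80. The ServerName and path are assumptions:

```apache
# Assumed reconstruction: ServerName and path are placeholders.
NameVirtualHost *:8080

# No virtual host is declared on *:8080, so the NameVirtualHost line
# above has nothing to manage.
<VirtualHost *:80>
    ServerName namedomain1.com
    DocumentRoot /var/www/namedomain1
</VirtualHost>
```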

But the request still goes through IP/port matching to select a virtual host (an IP-based virtual host). The NameVirtualHost parameter has nothing to do with the request itself, but it affects the virtual host selection behavior: virtual hosts in the candidate set after IP/port matching whose IP/port is declared in NameVirtualHost go on to the ServerName matching algorithm. Apache determines which virtual hosts are subject to NameVirtualHost management (i.e., which are name-based virtual hosts) at start time, and in a silly way. For NameVirtualHost *:* and <VirtualHost *:*> (or NameVirtualHost 192.168.1.1:* and <VirtualHost 192.168.1.1:*>, or NameVirtualHost 192.168.1.1:80 and <VirtualHost 192.168.1.1:80>, or NameVirtualHost *:80 and <VirtualHost *:80>), the virtual host is subject to NameVirtualHost. If you write NameVirtualHost *:* and <VirtualHost *:80>, Apache will report the following error at start time, although a human can understand the intention without any problem.

[error] VirtualHost *:80 -- mixing * ports and non-* ports with a NameVirtualHost address is not supported, proceeding with undefined results
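The mismatched pair behind this error can be sketched as follows. The ServerName and path are hypothetical:

```apache
# Wildcard port in NameVirtualHost ...
NameVirtualHost *:*

# ... but a specific port in the virtual host: Apache 2.2 does not
# treat these two as the same address.
<VirtualHost *:80>
    ServerName namedomain1.com
    DocumentRoot /var/www/namedomain1
</VirtualHost>
```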

This undefined result in our case is: if the VirtualHost uses a port other than 443, the virtual host does not go into the candidate set. If the VirtualHost uses port 443, it goes into the candidate set for ServerName matching but does not go into the candidate set for choosing the SSL key and certificate file. Other successful combinations (I mean those with no error at httpd start time), such as NameVirtualHost *:* and <VirtualHost *:*>, NameVirtualHost 192.168.1.1:* and <VirtualHost 192.168.1.1:*>, or NameVirtualHost 192.168.1.1:80 and <VirtualHost 192.168.1.1:80>, have a similar problem: although the virtual host goes into the candidate set for ServerName matching, it does not go into the candidate set for selecting the SSL key and certificate. But if you write NameVirtualHost *:80 and <VirtualHost *:*>, it will complain:

[warn] NameVirtualHost *:80 has no VirtualHosts

and the virtual host will never take part in ServerName matching. Similarly, NameVirtualHost *:* with <VirtualHost 192.168.1.1:*>, and NameVirtualHost 192.168.1.1:* with <VirtualHost *:*>, produce the same warning.

Now the only correct combination left is: NameVirtualHost *:80 and <VirtualHost *:80>.

From the lengthy discussion so far, you can see that configuring Apache is not easy. Although I have been learning how to set up websites with Apache for years, I still run into problems of one kind or another. The difficulties are mainly due to the lack of clean, easy-to-understand documentation for Apache. I criticized this in another post when I learned how to use RewriteBase. In fact, many configuration parameters have the same problem. Nobody tells you the subtle and exact meaning of those parameters. A misconfiguration may cost you a lot. All you can do is test, think, and learn from failure. This is a big problem of the Apache project.
