Server Hardware
Many factors affect server hardware requirements, including:
- Number of active users
- Number of client hosts
- Size of the project's source data
- Size of the project's derived data
- Amount of build work being performed
- Rate at which new source and derived data get generated
It's difficult to make general statements about what you will need, but here are a few data points to inform your decision:
- A modest server can scale up to 100 or more users and client machines.
- A moderately powerful server should be able to scale up to several hundred users and client machines.
- The repository server uses more network, CPU time, and disk I/O than the cache server.
- The cache server can use a lot of memory. Its memory usage depends on the amount and complexity of your build work.
- The repository server's memory usage is proportional to the amount of version history it stores. (It keeps the entire directory structure for /vesta and /vesta-work in memory at all times.)
- Source and derived files are stored uncompressed in the same pool of storage. (This might change in the future.)
- The cache server can be run on a separate machine from the repository server if desired. (Usually this only serves to reduce memory usage on the repository server machine, as the CPU and network resource needs of the repository tend to dwarf those of the cache.)
Estimating Disk Space Needs
To come up with an idea of how much disk you need, you'll want to know:
- How big your source set is.
- The rate of change of your source set (as each version of each file is stored).
- The rate at which you expect to produce derived data (the results of all build steps).
- How often you want to run the weeder to purge build results. (With less disk space, you will need to run the weeder more frequently.)
It may be difficult to estimate these values without some empirical data from experiments, as Vesta's caching affects how much new derived data you will store on each build.
Let's go through a quick example of how you might estimate your disk space requirements. We'll start with the sources:
- Suppose your source set is 40M in size.
- Suppose you expect it to change at a rate of 1% per week. (Remember that the Vesta definition of "source" includes compiler binaries and system libraries, which you wouldn't expect to change much.)
- Suppose you expect your project to last 2 years.
From this we can extrapolate the amount of disk space needed to store the sources over the life of the project:
- 40M * (1 + (1% * 104 weeks) ) = 40M * (1 + 1.04) = 81.6M
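The source calculation above can be written as a small function, which makes it easy to try other sizes and change rates. This is only a sketch of the arithmetic in the example: the function and parameter names are illustrative and not part of Vesta, and it assumes the weekly change is a fixed fraction of the initial source size (linear growth), as the example does.

```python
def source_storage_mb(initial_mb, weekly_change_rate, weeks):
    """Estimate source storage over the life of a project.

    Assumes every version of every file is kept and that each week adds
    a fixed fraction of the initial source size (linear growth).
    """
    return initial_mb * (1 + weekly_change_rate * weeks)


# Values from the example: 40M of sources, 1% change per week, 2 years.
estimate = source_storage_mb(initial_mb=40, weekly_change_rate=0.01, weeks=104)
print(f"{estimate:.1f}M")  # prints 81.6M
```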
Now let's consider derived data:
- Suppose a complete build produces 200M of result files. (This includes both intermediate files and final results.)
- Suppose an average incremental build produces 20M of new result files.
- Suppose you have 10 developers each doing 5 builds per day.
- Suppose you would like to run the weeder no more than once a week.
From this we can extrapolate the amount of disk space needed to store derived files during the project:
- 200M + 20M per build * 5 builds per developer per day * 10 developers * 7 days = 200M + 7000M = 7.2G
Combining the two figures (since source files and derived files are stored on the same disk), you could conclude that you would need at least 7.3G for your Vesta server storage.
There are a few kinds of meta-data not accounted for in the above, so it might be best to round up to 8G just to have a little margin.
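The derived-data estimate and the combined total can be sketched in code as well. The names below are hypothetical helpers, not Vesta tools. The sketch assumes 1G = 1000M, as the figures above do, and treats "a little margin" as rounding up to the next whole gigabyte.

```python
import math


def derived_storage_mb(full_build_mb, incremental_mb, builds_per_dev_per_day,
                       developers, days_between_weeds):
    """Estimate derived-data storage accumulated between weeder runs."""
    new_per_day = incremental_mb * builds_per_dev_per_day * developers
    return full_build_mb + new_per_day * days_between_weeds


def total_storage_gb(source_mb, derived_mb):
    """Sources and derived files share one storage pool, so add them."""
    return (source_mb + derived_mb) / 1000  # assumes 1G = 1000M


# Values from the example above.
derived = derived_storage_mb(full_build_mb=200, incremental_mb=20,
                             builds_per_dev_per_day=5, developers=10,
                             days_between_weeds=7)
total = total_storage_gb(source_mb=81.6, derived_mb=derived)
provisioned = math.ceil(total)  # round up to leave room for meta-data

print(f"derived: {derived}M, total: {total:.1f}G, provision: {provisioned}G")
```

Lengthening `days_between_weeds` or adding developers shows quickly how the derived data, not the sources, dominates the storage requirement.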
Experience at Intel
At the Intel site that makes the most extensive use of Vesta (Massachusetts), we have about 400 active users and about the same number of client hosts. (Each user usually has their own workstation, which acts as a Vesta client.)
From 1998-2001, this was our Vesta server:
| CPU | 4 x 667MHz Alpha |
| RAM | 16G |
| Disk for Vesta | 407G |
| OS | Tru64 |
From 2002-2006 we've been using this Vesta server:
| CPU | 4 x 2.8GHz Intel Xeon MP (Due to HT Linux sees it as an 8-CPU box) |
| RAM | 16G |
| Disk for Vesta | 410G |
| OS | Linux (RHEL3 for a time, now SUSE Enterprise Server 9) |
In these large servers, much of the RAM wound up being used by the kernel for disk cache.
At two satellite installations we used the following server hardware:
- California with ~100 users:
| CPU | 2 x 2.0GHz Intel Xeon |
| RAM | 3G |
| Disk for Vesta | 100G |
| OS | Linux |
- India with ~150 users:
| CPU | 2 x 2.4GHz Intel Xeon |
| RAM | 3.6G |
| Disk for Vesta | 269G |
| OS | Linux |
To give some context to these server hardware choices, here's some information on the size of the sources in our past three projects:
| Project | Total files | Total file size |
| Project #1 (1998-2001) | 15k | 451M |
| Project #2 (2002-2004) | 44k | 2.5G |
| Project #3 (2005-present) | 109k | 7.6G |