Speed up multiple Debian / Ubuntu updates with apt-cacher
Today I upgraded from Ubuntu 7.04 to 7.10, and the download was 1.4 GB. This is a lot, even with fast DSL, given that you will most probably need to do it more than once (desktop, laptop, office etc.). This time I chose to do it more cleverly: using a local cache. Apt-cacher to the rescue. But first, what are the necessary steps?
1. Have an Apache web server ready
2. apt-get install apt-cacher on that machine (your server)
3. Configure apt-cacher (see below)
4. Configure client machines (see below)
About 1 and 2: Apt-cacher can actually also run in stand-alone mode without the need for an apache server, but I'm not going to cover this here. In fact, I believe that the apache-mode is much nicer (see later).
About 3: Tweak /etc/apt-cacher/apt-cacher.conf to your needs. The default settings are usable out of the box and everything is very well documented in there. Very nice: There is an option for rate limiting and external http proxy configuration (even more redundancy!).
About 4: Now for the easy part: Modify your clients' /etc/apt/sources.list files. Here is how the result should look:
Original line:
deb http://ftp.de.debian.org/debian stable main contrib non-free
Apt-cacher enabled replacement:
deb http://cacheserver/apt-cacher/ftp.de.debian.org/debian stable main contrib non-free
Easy!
But now for the brain part:
Q: How to do this automagically for all entries with one line of sed?
I have two possible answers for you:
- A: Backup, then replace the host part
- B: Backup, then duplicate every entry and replace the host part only in the duplicate to have sort-of a fall-back mechanism if either apt-cacher or the original mirror is unavailable or slow.
About A:
sudo su
cd /etc/apt
cp sources.list sources.list.pre-caching
export APTCACHER="your.server.name"
sed -i "s/\(http:\/\/\)\([^\/]*\)\//\1$APTCACHER\/apt-cacher\/\2\//" sources.list
About B:
Same as A, but
sed -i "/^[^#]/h;s/\(http:\/\/\)\([^\/]*\)\//\1$APTCACHER\/apt-cacher\/\2\//;/^[^#]/{x;p;x}" sources.list
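Note that it is not a good idea to execute the sed commands more than once! A second run matches the already rewritten URL and inserts the apt-cacher prefix again, this time in front of the cache server's own name.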
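One way to avoid running the substitution twice by accident is a small guard, sketched here. The function name rewrite_once is hypothetical, the check simply assumes that the string "/apt-cacher/" only appears in entries that were already rewritten, and the demonstration works on a scratch file rather than on /etc/apt/sources.list.

```shell
#!/bin/sh
# Placeholder server name, as in the article.
APTCACHER="your.server.name"

# rewrite_once FILE: apply substitution A unless FILE already contains
# an apt-cacher URL (hypothetical helper, not part of apt-cacher).
rewrite_once() {
    file="$1"
    if grep -q "/apt-cacher/" "$file"; then
        echo "already rewritten: $file"
        return 0
    fi
    # Write to a temporary file and move it into place instead of using -i.
    sed "s/\(http:\/\/\)\([^\/]*\)\//\1$APTCACHER\/apt-cacher\/\2\//" "$file" > "$file.new" &&
        mv "$file.new" "$file"
    echo "rewritten: $file"
}

# Demonstration on a scratch file.
demo=$(mktemp)
echo "deb http://ftp.de.debian.org/debian stable main contrib non-free" > "$demo"
first=$(rewrite_once "$demo")    # performs the substitution
second=$(rewrite_once "$demo")   # detects the earlier run and does nothing
result=$(cat "$demo")
echo "$first"
echo "$second"
echo "$result"
rm -f "$demo"
```

The second call leaves the file untouched, so the entry contains the apt-cacher prefix exactly once.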
For the curious:
Explanation of the regular expression in A:
- Only one command, 's/pattern/replacement/' (substitute), is used.
- The pattern matches "http://" followed by any run of characters that are not a slash, up to the next slash; that run is the host name.
- Both parts are grouped with \(...\) so that the replacement can reference them as \1 and \2.
- The replacement inserts the apt-cacher URL between \1 (http://) and \2 (e.g. ftp.de.debian.org).
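To see the substitution from A at work without touching any real file, it can be tried on a few sample lines piped through sed. This is only a sketch: the host name cacheserver is the placeholder from the example entry, and the sample lines are made up for illustration.

```shell
#!/bin/sh
# Placeholder cache server name (an assumption for this demo).
APTCACHER="cacheserver"

# Made-up sample entries: one comment, one deb line, one deb-src line.
sample='# main mirror
deb http://ftp.de.debian.org/debian stable main contrib non-free
deb-src http://ftp.de.debian.org/debian stable main'

# Same substitution as in A, but reading stdin and writing stdout (no -i).
out=$(printf '%s\n' "$sample" |
    sed "s/\(http:\/\/\)\([^\/]*\)\//\1$APTCACHER\/apt-cacher\/\2\//")

printf '%s\n' "$out"
# Prints:
# # main mirror
# deb http://cacheserver/apt-cacher/ftp.de.debian.org/debian stable main contrib non-free
# deb-src http://cacheserver/apt-cacher/ftp.de.debian.org/debian stable main
```

The comment line passes through unchanged only because it contains no URL; a commented-out entry with an http:// URL would be rewritten as well.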
Explanation of the sed commands in B:
- In the middle we have the same 's' command.
- This time we want to duplicate each entry (but not the comments!) and replace only in the duplicate.
- '/^[^#]/h' is the command to 'hold' (remember) a line if it does not start with '#' (i.e. is not a comment line)
- Next is the substitution from above (it is tried on every line including comments)
- '/^[^#]/{x;p;x}' is the last group of commands. It checks again whether the line is a comment, and if not, it exchanges ('x') the pattern space (the current, now substituted line) with the hold space filled by the first command. The pattern space then contains the original, non-substituted line, which is printed with 'p'. The second 'x' swaps the two back, so that sed prints the substituted line automatically.
- The result is that we have, for every line that is not a comment, created a duplicate of it where the duplicate has the required substitution.
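The duplication in B can be checked the same way on sample input. This sketch assumes GNU sed (which accepts the '{x;p;x}' group as written), again uses the placeholder host name cacheserver, and runs without -i so that nothing is modified.

```shell
#!/bin/sh
# Placeholder cache server name (an assumption for this demo).
APTCACHER="cacheserver"

# Made-up sample: one comment line and one entry.
sample='# main mirror
deb http://ftp.de.debian.org/debian stable main contrib non-free'

# Same commands as in B: hold, substitute, print the held original,
# then let sed print the substituted line.
out=$(printf '%s\n' "$sample" |
    sed "/^[^#]/h;s/\(http:\/\/\)\([^\/]*\)\//\1$APTCACHER\/apt-cacher\/\2\//;/^[^#]/{x;p;x}")

printf '%s\n' "$out"
# Prints:
# # main mirror
# deb http://ftp.de.debian.org/debian stable main contrib non-free
# deb http://cacheserver/apt-cacher/ftp.de.debian.org/debian stable main contrib non-free
```

The original entry comes first and the apt-cacher variant directly after it, while the comment line appears only once.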