"I'm not proud of being a congenital pain in the ass. But I will take money for it."

Selective HTTP proxy through a firewall, take two

Thu 01 September 2022

In a previous episode, we wrote about a method of setting up a selective HTTP proxy through a firewall via a tunnel. We've been using this technique for years now, and it's held up quite well. But it doesn't cover all cases, because it relies on using a "real" browser like Firefox. What if there's no browser, just tools that issue HTTP requests, or a browser that doesn't support proxy autoconfiguration?

So today's write-up is of a similar technique using tinyproxy and autossh. For simplicity, we assume that in this case both ends of the tunnel are Linux servers, but the technique should generalize.

First, install tinyproxy on both servers. As before, we'll call the server inside the lab proxy.lab.example.com and will assume you're using port 8888 for the proxy service on both servers. If you already set up a tinyproxy instance on the lab server for the proxy autoconfiguration hack, you should be able to reuse it for this.

sudo apt-get install tinyproxy

Second, create a dedicated non-privileged user on both servers and create the user's .ssh directory. On Debian, this should do:

sudo adduser --disabled-password --gecos 'AutoSSH HTTP Proxy Tunnel' autoproxy
sudo install -d -o autoproxy -g autoproxy /home/autoproxy/.ssh

Next, create an ssh key pair for the dedicated user on the client machine:

sudo sudo -u autoproxy ssh-keygen -t ed25519 -N '' -f /home/autoproxy/.ssh/id_ed25519

ssh running as the dedicated user will default to using this key, but for a dedicated service it's probably better to lock things down a bit, so put the following in the dedicated user's ~/.ssh/config:

Host proxy.lab.example.com
  User autoproxy
  IdentityFile /home/autoproxy/.ssh/id_ed25519
  IdentitiesOnly yes

That Host stanza is also a good place to put other ssh configuration parameters you might need to get through the firewall, such as Hostname, Port, or ProxyJump.
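
For example, if the lab only exposes ssh on a nonstandard port behind a bastion host, the stanza might grow into something like this (the address, port, and jump host here are all hypothetical):

Host proxy.lab.example.com
  User autoproxy
  IdentityFile /home/autoproxy/.ssh/id_ed25519
  IdentitiesOnly yes
  Hostname 192.0.2.42
  Port 2222
  ProxyJump bastion.example.com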

Next, copy the dedicated user's ssh public key over to the server inside the lab and install it as the dedicated user's ~/.ssh/authorized_keys file. The tunnel itself should work with that key installed verbatim as the authorized_keys file, but for safety you'll probably want to lock down that key so that it can only be used for this tunnel, by prefixing various restrictions to that key entry. Something like this should do:

command="/bin/true",restrict,port-forwarding ssh-ed25519 ...

where the text from ssh-ed25519 onward is the public key you copied over.
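
If you want to lock things down further, sshd also supports pinning the forwarding destination with permitopen, so that the key can only reach the proxy port (127.0.0.1:8888 here, matching the tunnel we're about to set up):

command="/bin/true",restrict,port-forwarding,permitopen="127.0.0.1:8888" ssh-ed25519 ...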

Next, you'll need to pick a TCP port you can use internally on the client machine, since we're already using 8888 for tinyproxy. This can be pretty much any available TCP port; we'll use 8080 here.

At this point you're ready to test the tunnel configuration. Even if you're sure you got it right, you'll need to ssh manually at least once to set up the client's known_hosts file. So, to test:

sudo sudo -u autoproxy ssh -v -N -L127.0.0.1:8080:127.0.0.1:8888 proxy.lab.example.com

If everything goes well, ssh should mumble a bit, then set up the tunnel and wait for something to terminate the ssh session. If ssh exits on its own, something went wrong and you'll need to debug that before going any further. The usual tricks for doing this involve running ssh -vvv on the client and sshd -ddd on the server, then reading the resulting verbose logging.
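
If you do end up debugging, one way to avoid disturbing the production sshd is to run a second, disposable sshd on a spare port (2222 here is arbitrary). On the server:

sudo /usr/sbin/sshd -ddd -p 2222

and on the client:

sudo sudo -u autoproxy ssh -vvv -p 2222 proxy.lab.example.com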

At this point you should be able to test that the tunnel is up and that it allows you to connect to the remote proxy. On the client:

curl -Lvx http://127.0.0.1:8080/ http://somehost.lab.example.com/some/path

Assuming you've gotten past the manual test, you're ready to make the tunnel "permanent". Exit the test session (^C or kill or whatever), then install autossh:

sudo apt-get install autossh

We want the tunnel to come up automatically whenever the client machine boots, so we create a systemd unit file. Put this in /etc/systemd/system/autoproxy.service:

[Unit]
Description=AutoSSH HTTP Proxy Tunnel
After=tinyproxy.service

[Service]
ExecStart=/usr/bin/autossh -N -L127.0.0.1:8080:127.0.0.1:8888 proxy.lab.example.com
Restart=always
User=autoproxy

[Install]
WantedBy=default.target
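
Not strictly necessary, but if you find the tunnel occasionally wedging without autossh noticing, a common variation is to disable autossh's monitor channel in favor of ssh's own keepalives, and to tell autossh to keep retrying even when a connection attempt dies quickly:

[Service]
Environment=AUTOSSH_GATETIME=0
ExecStart=/usr/bin/autossh -M 0 -N -o ServerAliveInterval=30 -o ServerAliveCountMax=3 -L127.0.0.1:8080:127.0.0.1:8888 proxy.lab.example.com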

Enable and start the service, after which systemd and autossh will do their best to keep it running whenever the client is up:

sudo systemctl enable autoproxy.service
sudo systemctl start  autoproxy.service
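
A quick sanity check that the service came up and the forwarded port is listening, followed by the same curl test as before:

sudo systemctl status autoproxy.service
ss -tln | grep 8080
curl -Lvx http://127.0.0.1:8080/ http://somehost.lab.example.com/some/path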

Last, we promised that this was going to be a selective proxy, which is why you installed tinyproxy on the client machine. It turns out that this is really easy: you just need to add an upstream line to tinyproxy.conf on the client:

upstream 127.0.0.1:8080 ".lab.example.com"
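
One caveat: the upstream syntax changed in tinyproxy 1.10, which wants an explicit proxy type as the first argument. If your tinyproxy rejects the line above, try:

upstream http 127.0.0.1:8080 ".lab.example.com"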

Restart tinyproxy on the client and you're done.

If you've gotten this far, you should be able to use tinyproxy on the client machine as an HTTP(S) proxy for local clients (for example, apt-get) and it should Just Work. The local tinyproxy configuration will send traffic destined for *.lab.example.com through the tunnel, and will handle all other traffic itself.

Configuring specific clients to use a local proxy is beyond the scope of this writeup, but a lot of programs honor a convention based on environment variables HTTP_PROXY and HTTPS_PROXY (sometimes written in lowercase, who knows why, so try both if necessary). Many other programs and APIs allow explicit configuration (APT, curl, the Python requests library, many others), and some support all of the above. Read The Fine Manual.
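
For tools that use the environment variable convention, something like this (pointing at the local tinyproxy) covers both spellings:

export HTTP_PROXY=http://127.0.0.1:8888/
export HTTPS_PROXY=http://127.0.0.1:8888/
export http_proxy=$HTTP_PROXY https_proxy=$HTTPS_PROXY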

You can of course reuse this kind of proxy setup for a client that needs to tunnel into multiple locations (set up a separate tunnel for each and configure the client's tinyproxy to know about all of them), or to tunnel into the lab from multiple clients (add a new line to authorized_keys on the server for each client key, or just reuse the same key pair for all clients if you like living dangerously).
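
The multiple-location case is just more upstream lines in the client's tinyproxy.conf, each pointing at its own tunnel (the second lab name and port here are hypothetical):

upstream 127.0.0.1:8080 ".lab.example.com"
upstream 127.0.0.1:8081 ".otherlab.example.net"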

Finally, do pay attention to the other settings in tinyproxy.conf on both ends of the tunnel. The defaults are usually pretty reasonable, but at minimum you should check ConnectPort, Listen, and Allow to make sure that you're allowing everything you want to allow and disallowing everything else (you're punching a hole in a firewall that presumably was there for a reason, so you need to be careful).
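
As a rough sketch, assuming only loopback clients need the proxy and you only want CONNECT for ordinary HTTPS, the interesting parts of the client's tinyproxy.conf might look like:

Port 8888
Listen 127.0.0.1
Allow 127.0.0.1
ConnectPort 443
upstream 127.0.0.1:8080 ".lab.example.com"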


Footnote, a month later: OK, here are hints on how to point APT and Docker at the proxy. All of these rely on redirecting all traffic to the local tinyproxy and letting it sort things out.

APT is easy: just add a file /etc/apt/apt.conf.d/60lab-proxy with content:

Acquire::http::Proxy "http://127.0.0.1:8888";

The local tinyproxy will sort out which traffic needs to go through the tunnel.

Docker clients are also pretty easy: add ~/.docker/config.json:

{
    "proxies":
    {
        "default":
        {
            "httpProxy":  "http://172.17.0.1:8080",
            "httpsProxy": "http://172.17.0.1:8080",
            "noProxy": "127.0.0.0/8,172.17.0.0/16"
        }
    }
}

We use the default docker0 interface address here rather than 127.0.0.1, because Docker plays games with the latter. If you do fun things with Docker's virtual networking, you may have to find a different address.
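
Note that for the container case to work at all, the client's tinyproxy has to be reachable on the docker0 address as well, so the loopback-only sketch earlier needs loosening. Something like this, assuming the default 172.17.0.0/16 Docker subnet:

# Listen omitted so that tinyproxy binds all interfaces
Allow 127.0.0.1
Allow 172.17.0.0/16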

Last, you might want to be able to pull Docker images via the proxy. This is a little more complicated, but not much. Create a directory /etc/systemd/system/docker.service.d, then, within that directory, create a file http-proxy.conf with content:

[Service]
Environment="HTTP_PROXY=http://127.0.0.1:8080"
Environment="HTTPS_PROXY=http://127.0.0.1:8080"
Environment="NO_PROXY=127.0.0.0/8,172.17.0.0/16"

Tell systemd to reload its .service files and restart the docker service, and you're done:

sudo systemctl daemon-reload
sudo systemctl restart docker

The noProxy/NO_PROXY settings are to avoid problems within Docker environments where Docker tries to use the proxy to talk to itself and gets confused. This might be due to tinyproxy being a fairly minimal proxy implementation while the Docker HTTP API was clearly designed by people who drank deep of the REST Kool-Aid, but since it's easy to sidestep the issue with an extra variable setting I haven't bothered digging for the root cause.


Another footnote, six months further down the road: when the local clients of this mess are doing something sufficiently kinky, one starts running into the limitations of this setup. I eventually hit this point: the client applications are a multi-level nested mess of Docker-ized and chrooted build environments, each of which would like to think that it's talking directly to the server in the lab, and the illusion eventually starts breaking down. Some of this is the complexity of setting up the client proxy environment correctly in all these layers, some of it appears to be limitations of tinyproxy (or, more precisely, tinyproxy's inability to handle every kinky thing that every kinky proxy client wants to do). Eventually this all gets to the point where one needs another approach.

So we resort to an even sillier kludge: we run a local reverse proxy, visible only to the client machine, and feed that reverse proxy via a forward proxy through the tunnel. There may be a better way to do this, but I already had Apache 2.4 and unbound running on this server for other reasons, so what worked for me was:

  1. Add a local-zone setup with one local-data entry to the unbound configuration, assigning the server's DNS name to the client's IP address:

    server:
        local-zone: "lab.example.com." transparent
        local-data: "somehost.lab.example.com. 300 IN A 192.0.2.1"
    
  2. Create a virtual host entry in the client's Apache configuration holding the new proxy. Technically, this is a reverse proxy using a forward proxy to reach the back end:

    <VirtualHost *:80>
        ServerName somehost.lab.example.com
        <Location />
            Require ip 192.0.2.1 127.0.0 172.17.0.0/16
        </Location>
        ProxyRequests       off
        ProxyPreserveHost   on
        ProxyPass       / http://somehost.lab.example.com/
        ProxyPassReverse    / http://somehost.lab.example.com/
        ProxyRemote     "http://somehost.lab.example.com/" "http://127.0.0.1:8080"
    </VirtualHost>
    

Tweak all the addresses (especially the Require line in the Apache configuration) as needed, and add IPv6 addresses if appropriate.
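
On a stock Debian setup you'll also need the Apache proxy modules enabled before the Proxy* directives will parse, and it's worth confirming that unbound is really serving the override (assuming unbound is listening on loopback):

sudo a2enmod proxy proxy_http
sudo systemctl restart apache2
dig @127.0.0.1 somehost.lab.example.com

The dig query should come back with the client's own address (192.0.2.1 in the example above).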

Adding HTTPS support is not particularly difficult: just add a second VirtualHost entry for port 443 with SSLEngine enabled. But to make it work you'd need a local copy of the key and certificate for somehost.lab.example.com so that your reverse proxy can use that certificate, which might be a problem if you don't control the back-end server. Given that the traffic is going through an ssh tunnel anyway, it may not be worth the trouble for your application.
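
A skeletal version of that second entry might look like this (the certificate and key paths are hypothetical, and you'd need mod_ssl enabled):

<VirtualHost *:443>
    ServerName somehost.lab.example.com
    SSLEngine on
    SSLCertificateFile    /etc/ssl/certs/somehost.lab.example.com.pem
    SSLCertificateKeyFile /etc/ssl/private/somehost.lab.example.com.key
    # ...plus the same Require and Proxy* directives as the port 80 entry
</VirtualHost>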