Thu 01 September 2022
In a previous episode, we wrote about a method of setting up a selective HTTP proxy through a firewall via a tunnel. We've been using this technique for years now, and it's held up quite well. But it doesn't cover all cases, because it relies on using a "real" browser like Firefox. What if there's no browser at all, just tools that issue HTTP requests, or a browser that doesn't support proxy autoconfiguration?
So today's write-up is of a similar technique using tinyproxy and autossh. For simplicity, we assume that in this case both ends of the tunnel are Linux servers, but the technique should generalize.
First, install tinyproxy on both servers. As before, we'll call the server inside the lab proxy.lab.example.com and will assume you're using port 8888 for the proxy service on both servers. If you already set up a tinyproxy instance on the lab server for the proxy autoconfiguration hack, you should be able to reuse it for this.
sudo apt-get install tinyproxy
Second, create a dedicated non-privileged user on both servers and create the user's .ssh directory. On Debian, this should do:
sudo adduser --disabled-password --gecos 'AutoSSH HTTP Proxy Tunnel' autoproxy
sudo install -d -o autoproxy -g autoproxy /home/autoproxy/.ssh
Next, create an ssh key pair for the dedicated user on the client machine:
sudo sudo -u autoproxy ssh-keygen -t ed25519 -N ''
ssh running as the dedicated user will default to using this key, but for a dedicated service it's probably better to lock things down a bit, so put the following in the dedicated user's ~/.ssh/config:
Host proxy.lab.example.com
    User autoproxy
    IdentityFile /home/autoproxy/.ssh/id_ed25519
    IdentitiesOnly yes
That Host stanza is also a good place to put other ssh configuration parameters you might need to get through the firewall, such as Hostname, Port, or ProxyJump.
Next, copy the dedicated user's ssh public key over to the server inside the lab and install it as the dedicated user's ~/.ssh/authorized_keys file. The tunnel itself should work with that key installed verbatim as the authorized_keys file, but for safety you'll probably want to lock down that key so that it can only be used for this tunnel, by prefixing various restrictions to that key entry. Something like this should do:
command="/bin/true",restrict,port-forwarding ssh-ed25519 ...
where the text from ssh-ed25519 onward is the public key you copied over.
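If you want to pin the key down even further, the authorized_keys options in OpenSSH also let you restrict which destination a forward using this key may connect to, via permitopen. A sketch, using the proxy setup above (only the tinyproxy port on the server's loopback interface would be reachable through this key):

```
command="/bin/true",restrict,port-forwarding,permitopen="127.0.0.1:8888" ssh-ed25519 ...
```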
Next, you'll need to pick a TCP port you can use internally on the client machine, since we're already using 8888 for tinyproxy. This can be pretty much any available TCP port; we'll use 8080 here.
At this point you're ready to test the tunnel configuration. Even if you're sure you got it right, you'll need to ssh manually at least once to set up the client's known_hosts file. So, to test:
sudo sudo -u autoproxy ssh -v -N -L127.0.0.1:8080:127.0.0.1:8888 proxy.lab.example.com
If everything goes well, ssh should mumble a bit, then set up the tunnel and wait for something to terminate the ssh session. If ssh exits on its own, something went wrong and you'll need to debug that before going any further. Usual tricks for doing this involve running ssh -vvv on the client and sshd -ddd on the server, then reading the resulting verbose logging.
At this point you should be able to test that the tunnel is up and that it allows you to connect to the remote proxy. On the client:
curl -Lvx http://127.0.0.1:8080/ http://somehost.lab.example.com/some/path
Assuming you've gotten past the manual test, you're ready to make the tunnel "permanent". Exit the test session (^C or kill or whatever), then install autossh:
sudo apt-get install autossh
We want the tunnel to come up automatically whenever the client machine boots, so we create a systemd unit file. Put this in /etc/systemd/system/autoproxy.service:
[Unit]
Description=AutoSSH HTTP Proxy Tunnel
After=tinyproxy.service

[Service]
ExecStart=/usr/bin/autossh -N -L127.0.0.1:8080:127.0.0.1:8888 proxy.lab.example.com
Restart=always
User=autoproxy

[Install]
WantedBy=default.target
Enable and start the service, after which systemd and autossh will do their best to keep it running whenever the client is up:
sudo systemctl enable autoproxy.service
sudo systemctl start autoproxy.service
Last, we promised that this was going to be a selective proxy, which is why you installed tinyproxy on the client machine. It turns out that this is really easy: you just need to add an upstream line to tinyproxy.conf on the client:
upstream 127.0.0.1:8080 ".lab.example.com"
Restart tinyproxy on the client and you're done.
If you've gotten this far, you should be able to use tinyproxy on the client machine as an HTTP(S) proxy for local clients (for example, apt-get) and it should Just Work. The local tinyproxy configuration will send traffic destined for *.lab.example.com through the tunnel, and will handle all other traffic itself.
Configuring specific clients to use a local proxy is beyond the scope of this writeup, but a lot of programs honor a convention based on the environment variables HTTP_PROXY and HTTPS_PROXY (sometimes written in lowercase, who knows why, so try both if necessary). Many other programs and APIs allow explicit configuration (APT, curl, the Python requests library, many others), and some support all of the above. Read The Fine Manual.
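For shell sessions and scripts, the environment-variable convention amounts to something like this (a sketch; port 8888 is the local tinyproxy from above, and which spellings a given tool reads varies):

```shell
# Point proxy-aware tools at the local tinyproxy instance.
export HTTP_PROXY="http://127.0.0.1:8888"
export HTTPS_PROXY="http://127.0.0.1:8888"
# Some tools only read the lowercase spellings, so set those too.
export http_proxy="$HTTP_PROXY"
export https_proxy="$HTTPS_PROXY"
# Keep purely local traffic off the proxy.
export NO_PROXY="localhost,127.0.0.1"
export no_proxy="$NO_PROXY"
```

After that, something like curl http://somehost.lab.example.com/some/path should go through the local tinyproxy without any extra flags.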
You can of course reuse this kind of proxy setup for a client that needs to tunnel into multiple locations (set up a separate tunnel for each and configure the client's tinyproxy to know about all of them), or to tunnel into the lab from multiple clients (add a new line to authorized_keys on the server for each client key, or just reuse the same key pair for all clients if you like living dangerously).
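For the multiple-locations case, the client's tinyproxy.conf just grows one upstream line per tunnel. A sketch, assuming a hypothetical second lab reached through a second tunnel on local port 8081 (both the extra port and the second domain are made up; note also that newer tinyproxy releases spell this "upstream http host:port ...", so check your version's manual):

```
upstream 127.0.0.1:8080 ".lab.example.com"
upstream 127.0.0.1:8081 ".lab2.example.com"
```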
Finally, do pay attention to the other settings in tinyproxy.conf on both ends of the tunnel. The defaults are usually pretty reasonable, but at minimum you should pay attention to the settings of ConnectPort, Listen, and Allow to make sure that you're allowing everything you want to allow and disallowing everything else (you're punching a hole in a firewall that presumably was there for a reason, so you need to be careful).
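As a sketch, a locked-down client-side tinyproxy.conf for this setup might contain lines like these (addresses and ports from the example above; widen Allow only for whatever else should be able to use the proxy, and note that ConnectPort limits CONNECT, i.e. HTTPS, to the listed ports):

```
Port 8888
Listen 127.0.0.1
Allow 127.0.0.1
ConnectPort 443
upstream 127.0.0.1:8080 ".lab.example.com"
```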
Footnote, a month later: OK, here are hints on how to set up use of the proxy for APT and Docker. All of these rely on being able to redirect all traffic to the proxy and letting tinyproxy sort it out.
APT is easy: just add a file /etc/apt/apt.conf.d/60lab-proxy with content:
Acquire::http::Proxy "http://127.0.0.1:8080";
tinyproxy will sort out which traffic needs to go through the tunnel.
Docker clients are also pretty easy: add ~/.docker/config.json:
{
  "proxies": {
    "default": {
      "httpProxy": "http://172.17.0.1:8080",
      "httpsProxy": "http://172.17.0.1:8080",
      "noProxy": "127.0.0.0/8,172.17.0.0/16"
    }
  }
}
We use the default docker0 interface address here rather than 127.0.0.1, because Docker plays games with the latter. If you do fun things with Docker's virtual networking, you may have to find a different address.
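If you're not sure which address the docker0 bridge has on your machine, a sketch like this will dig it out of the interface configuration (assumes the iproute2 tools; falls back to Docker's conventional default when the interface doesn't exist yet):

```shell
# Read the IPv4 address assigned to the docker0 bridge, if any.
docker0_addr=$(ip -4 -o addr show docker0 2>/dev/null | awk '{print $4}' | cut -d/ -f1)
# Fall back to Docker's usual default bridge address.
docker0_addr=${docker0_addr:-172.17.0.1}
echo "proxy URL for containers: http://${docker0_addr}:8080"
```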
Last, you might want to be able to pull Docker images via the proxy. This is a little more complicated, but not much. Create a directory /etc/systemd/system/docker.service.d, then, within that directory, create a file http-proxy.conf with content:
[Service]
Environment="HTTP_PROXY=http://127.0.0.1:8080"
Environment="HTTPS_PROXY=http://127.0.0.1:8080"
Environment="NO_PROXY=127.0.0.0/8,172.17.0.0/16"
Tell systemd to reload its .service files and restart the docker service, and you're done:
sudo systemctl daemon-reload
sudo systemctl restart docker
The noProxy/NO_PROXY settings are to avoid problems within Docker environments where Docker tries to use the proxy to talk to itself and gets confused. This might be due to tinyproxy being a fairly minimal proxy implementation while the Docker HTTP API was clearly designed by people who drank deep of the REST Kool-Aid, but since it's easy to sidestep the issue with an extra variable setting I haven't bothered digging for the root cause.
Another footnote, six months further down the road: when the local clients of this mess are doing something sufficiently kinky, one starts running into the limitations of this setup. I eventually hit this point: the client applications are a multi-level nested mess of Docker-ized and chrooted build environments, each of which would like to think that it's talking directly to the server in the lab, and the illusion eventually starts breaking down. Some of this is the complexity of setting up the client proxy environment correctly in all these layers; some of it appears to be limitations of tinyproxy (or, more precisely, tinyproxy's inability to handle every kinky thing that every kinky proxy client wants to do). Eventually this all gets to the point where one needs another approach.
So we resort to an even sillier kludge: we run a local reverse proxy, visible only to the client machine, and feed that reverse proxy via a forward proxy through the tunnel. There may be a better way to do this, but I already had Apache 2.4 and unbound running on this server for other reasons, so what worked for me was:
- Add a local-zone setup with one local-data entry to the unbound configuration, assigning the server's DNS name to the client's IP address:

  server:
      local-zone: "lab.example.com." transparent
      local-data: "somehost.lab.example.com. 300 IN A 192.0.2.1"
- Create a virtual host entry in the client's Apache configuration holding the new proxy. Technically, this is a reverse proxy using a forward proxy to reach the back end:

  <VirtualHost *:80>
      ServerName somehost.lab.example.com
      <Location />
          Require ip 192.0.2.1 127.0.0 172.17.0.0/16
      </Location>
      ProxyRequests off
      ProxyPreserveHost on
      ProxyPass / http://somehost.lab.example.com/
      ProxyPassReverse / http://somehost.lab.example.com/
      ProxyRemote "http://somehost.lab.example.com/" "http://127.0.0.1:8080"
  </VirtualHost>
Tweak all the addresses (especially the Require line in the Apache configuration) as needed, and add IPv6 addresses if appropriate.
Adding HTTPS support is not particularly difficult: just add a second VirtualHost entry for port 443 with SSLEngine enabled. But to make it work you'd need a local copy of the key and certificate for somehost.lab.example.com so that your reverse proxy can use that certificate, which might be a problem if you don't control the back-end server. Given that the traffic is going through an ssh tunnel anyway, it may not be worth the trouble for your application.