Apr 21, 2023 - Setting up a reverse proxy using nginx and docker

A reverse proxy is a web server that sits between client devices and backend servers, receiving requests from clients and directing them to the appropriate server. The backend servers are shielded from direct internet access and their IP addresses are hidden, providing an additional layer of security.

[Diagram: a reverse proxy sitting between client devices and backend servers]

Overall, reverse proxies are useful for improving the security, performance, and scalability of web applications and services. They’re commonly used for load balancing traffic across any number of backend servers and for SSL offloading.

In this brief post I provide a template for setting up an nginx reverse proxy using docker.

Docker Compose

This is a docker-compose file with two services: the Nginx web server that will act as the reverse proxy, and a Certbot agent that obtains and renews the SSL certificates it serves:

version: '3.3'

services:

  nginx:
    image: nginx:1.19-alpine
    restart: always
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./volumes/nginx:/etc/nginx/conf.d
      - ./volumes/certbot/conf:/etc/letsencrypt
      - ./volumes/certbot/www:/var/www/certbot
    command: "/bin/sh -c 'while :; do sleep 6h & wait $${!}; nginx -s reload; done & nginx -g \"daemon off;\"'"

  certbot:
    image: certbot/certbot
    restart: always
    volumes:
      - ./volumes/certbot/conf:/etc/letsencrypt
      - ./volumes/certbot/www:/var/www/certbot
    entrypoint: "/bin/sh -c 'trap exit TERM; while :; do certbot renew; sleep 12h & wait $${!}; done;'"

The nginx service uses the official Nginx Docker image, version 1.19-alpine. It maps ports 80 and 443 to the host machine, which allows incoming HTTP and HTTPS requests to be forwarded to the Nginx container. The command keeps Nginx running in the foreground while a background loop reloads its configuration every six hours, so renewed certificates are picked up without restarting the container. The volumes section maps three directories on the host machine to directories in the container:

./volumes/nginx is mapped to /etc/nginx/conf.d, which allows custom Nginx configuration files to be added to the container.

./volumes/certbot/conf is mapped to /etc/letsencrypt, which stores the SSL/TLS certificates generated by Certbot.

./volumes/certbot/www is mapped to /var/www/certbot, which is where Certbot writes the ACME challenge files it serves during certificate issuance and renewal.

The certbot service uses the official Certbot Docker image and maps the same certificate volumes as the nginx service. The entrypoint section specifies a shell command that is executed when the container starts: a loop that runs certbot renew and then sleeps for 12 hours, so certificates approaching expiry are renewed automatically.

Now let’s see how each service is configured.

Nginx

Below you’ll find an Nginx configuration file that sets it up as a load balancer and reverse proxy for the thomasvilhena.com domain:

### Nginx Load Balancer

upstream webapi {
	server 10.0.0.10;
	server 10.0.0.11;
	server 10.0.0.12 down;
}

server {
	listen 80;
	server_name localhost thomasvilhena.com;
	server_tokens off;
		
	location ^~ /.well-known/acme-challenge/ {
		default_type "text/plain";
		alias /var/www/certbot/.well-known/acme-challenge/;
	}
		
	location = /.well-known/acme-challenge/ {
		return 404;
	}
		
	location / {
		return 301 https://thomasvilhena.com$request_uri;
	}
}


server {
	listen 443 ssl http2;
	server_name localhost thomasvilhena.com;
	server_tokens off;

	ssl_certificate /etc/letsencrypt/live/thomasvilhena.com/fullchain.pem;
	ssl_certificate_key /etc/letsencrypt/live/thomasvilhena.com/privkey.pem;
	include /etc/letsencrypt/options-ssl-nginx.conf;
	ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;

	location / {
		proxy_pass http://webapi;
		proxy_set_header Host $http_host;
		proxy_set_header X-Real-IP $remote_addr;
		proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
		proxy_set_header X-Forwarded-Proto $scheme;
		proxy_set_header X-NginX-Proxy true;
		proxy_redirect off;
	}
}

The “upstream” block defines the group of servers to be load balanced, with three sample servers listed (10.0.0.10, 10.0.0.11, and 10.0.0.12). One server is marked as “down”, which means it won’t receive requests.

The first “server” block listens on port 80 and redirects all requests to the HTTPS version of the site. It also serves the ACME challenge files over HTTP, which Let’s Encrypt requires for issuing and renewing the SSL certificate.

The second “server” block listens on port 443 for HTTPS traffic and proxies requests to the defined “upstream” group of servers. The “location /” block specifies that all URLs will be proxied. The various “proxy_set_header” directives forward the original host, client IP, and protocol so the upstream servers can handle requests correctly.

Certbot

Certbot requires two configuration files:

./volumes/certbot/conf/options-ssl-nginx.conf contains recommended security settings for SSL/TLS configurations in Nginx. Here’s sample content:

ssl_session_cache shared:le_nginx_SSL:10m;
ssl_session_timeout 1440m;
ssl_session_tickets off;

ssl_protocols TLSv1.2 TLSv1.3;
ssl_prefer_server_ciphers off;

ssl_ciphers "ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384";

./volumes/certbot/conf/ssl-dhparams.pem contains Diffie-Hellman parameters used for SSL/TLS connections. It can be generated by running the following command (the output path below is the path inside the container; on the host, write to the directory mapped to /etc/letsencrypt):

openssl dhparam -out /etc/letsencrypt/ssl-dhparams.pem 2048

Here’s sample content:

-----BEGIN DH PARAMETERS-----
MIIBCAKCAQEA3r1mOXp1FZPW+8kRJGBOBGGg/R87EBfBrrQ2BdyLj3r3OvXX1e+E
8ZdKahgB/z/dw0a+PmuIjqAZpXeEQK/OJdKP5x5G5I5bE11t0fbj2hLWTiJyKjYl
/2n2QvNslPjZ8TpKyEBl1gMDzN6jux1yVm8U9oMcT34T38uVfjKZoBCmV7g4OD4M
QlN2I7dxHqLShrYXfxlNfyMDZpwBpNzNwCTcetNtW+ZHtPMyoCkPLi15UBXeL1I8
v5x5m5DilKzJmOy8MPvKOkB2QIFdYlOFL6/d8fuVZKj+iFBNemO7Blp6WjKsl7Hg
T89Sg7Rln2j8uVfMNc3eM4d0SEzJ6uRGswIBAg==
-----END DH PARAMETERS-----
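
Note that certbot renew only renews certificates that already exist, so the very first certificate has to be issued manually, for instance by temporarily commenting out the HTTPS server block so Nginx can start and serve the ACME challenge on port 80. Here’s a sketch of the issuance command (the e-mail address is a placeholder):

docker-compose run --rm --entrypoint "certbot certonly --webroot -w /var/www/certbot -d thomasvilhena.com --email admin@example.com --agree-tos --no-eff-email" certbot

Once the certificate is issued, re-enable the HTTPS server block and reload Nginx.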

That’s it! Now you just need to run docker-compose up and your reverse proxy should be up and running ✅

Apr 10, 2023 - The burden of complexity

Complexity is present in any project. Some have more, some have less, but it’s always there. The manner in which a team handles complexity can pave the way for a project’s success or lead towards its technical demise.

In the context of software, complexity arises from a variety of factors, such as complicated requirements, technical dependencies, large codebases, integration challenges, architectural decisions, team dynamics, among others.

When talking to non-technical folks, especially those not acquainted with the concepts of software complexity and technical debt, it can be helpful to present the topic from a more managerial perspective.

So I propose the following qualitative diagram, which regards complexity as an inherent property of a software project and, simultaneously, a responsibility that a software development team must constantly monitor and manage in order to keep delivering value in the long run:

[Diagram: complexity burden and team capacity over time, with improvement and degradation zones]

From the diagram:

  • The Complexity Burden curve represents the theoretical amount of effort necessarily spent servicing complexity, as opposed to doing productive work. This is an inevitable aspect of software development and can manifest in various forms, including spending time understanding and working with complex code, encountering more intricate bugs and errors, updating dependencies, struggling to onboard new team members due to excessively elaborate designs, among others.

  • The Team’s Capacity line is the maximum amount of effort the team is able to provide, which varies over time and can be influenced by factors such as changes in the product development process, team size, and efforts to eliminate toil [1]. Additionally, reductions in the complexity burden of a project can unlock productivity, influencing the team’s capacity as well.

  • The Complexity Threshold represents the point where the team’s capacity becomes equal to the complexity burden. In this theoretical situation, the team is only allocating capacity towards servicing complexity. Value delivery is compromised.

With these definitions in place, let’s review the two colored zones depicted in the diagram.

The Improvement Zone

Projects are typically in the improvement zone, which means that the team has enough capacity to handle the complexity burden and still perform productive work. The lower the complexity burden, the more efficient and productive the team will be in delivering results. The team can choose to innovate, develop new features, optimize performance, and improve UX. It’s worth noting that doing so may result in added complexity. This is acceptable as long as there is sufficient capacity to deal with the added complexity in the next cycles of development and the team remains continuously committed to addressing technical debt.

The Degradation Zone

A project enters the degradation zone when the team's capacity is insufficient for adequately servicing complexity, adding pressure on an already strangled project. The team will constantly be putting out fires, new features will take longer to ship, bugs will be more likely to be introduced, developers may suggest rewriting the application, availability may be impaired, and customers may not be satisfied. The viable ways out of this situation are to significantly reduce complexity or to increase capacity. Other efforts will be mostly fruitless.

Closing thoughts

The concept of complexity burden can be a valuable tool for enriching discussions around promoting long-term value delivery and preventing a project from becoming bogged down by complexity, leaving little room for new feature development. It’s important to make decisions with a clear understanding of the complexity burden and how it may be affected.

It’s worth pointing out that if the productive capacity of a team is narrow, meaning the proportion of the team’s capacity already allocated to the complexity burden is too high, the team will find itself in a situation where continuing to innovate may be too risky. The wise decision then is to prioritize paying off technical debt and investing in tasks that alleviate the complexity burden.

Even though they are related, it’s crucial to distinguish between the complexity burden and technical debt. The former materializes as the amount of (mostly) non-productive work a team is encumbered by, while the latter is a liability that arises from design or implementation choices that prioritize short-term gains over long-term sustainability [2]. A project can become highly complex even with low technical debt.

Finally, a project is a dynamic endeavor, and a team may find itself momentarily in the “degradation” zone in one cycle and in the “improvement” zone in the next. What matters most is to be aware of the technical context and plan next steps preemptively, aiming to keep the complexity burden at a healthy level.


References

[1] Google - Site Reliability Engineering. Chapter 5 - Eliminating Toil

[2] Wikipedia - Technical debt

Apr 15, 2022 - An inherent source of correlation in the crypto market

If you follow the crypto market, you may already be familiar with the strong direct correlation between cryptocurrency prices, native or tokenized. When BTC goes down, pretty much everything goes down with it, and when BTC is up, everything is most likely to go up too. This correlation isn’t exclusive to BTC: impacts on the price of many other cryptocurrencies (such as ETH) also reverberate across a wide range of crypto assets, with different degrees of strength among them.

[Chart: correlated price movements across crypto assets]

Have you ever asked yourself why that is? One straightforward answer is that large events impacting the price of BTC (or another crypto) will make crypto investors want to reassess their exposure not only to BTC but to other crypto assets in a similar manner. This answer is intuitive, but mostly behavioral and hard to quantify. However, there are other, structural sources of correlation between cryptocurrencies that are often overlooked, and in this post I analyze one of them: decentralized exchange (DEX) pairs.

DEX pairs

A DEX pair, popularized by Uniswap, can be viewed as a component on the blockchain that provides liquidity to the crypto market and allows wallets to trade one asset for another in a decentralized way. For instance, the BTC/ETH pair allows traders to swap between these two currencies in either direction. Likewise, the BTC/USDC pair allows traders to exchange bitcoin for stablecoins, and vice versa.

And how do DEX pairs build up correlation between cryptocurrencies? To answer this we need to dive a bit into how DEX pairs work:

[Diagram: how a DEX pair works]

First, in order to provide liquidity, a DEX pair needs to have a reasonable supply of both of its tradable assets. It then implements a mathematical formula for calculating the exchange rate between these two assets that honors the supply/demand rule, i.e., the scarcer one of the assets becomes in the DEX pair’s supply, the more valuable it becomes in relation to the other asset.

Second, the sensitivity of the exchange rate in a DEX pair depends on the total value locked (TVL) in its supply. Each trade performed against the DEX pair changes the ratio of its reserves, thus changing the effective exchange rate for succeeding trades. The higher the TVL, the less sensitive the exchange rate will be to the trade size.
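
To make this concrete, here’s a minimal Python sketch of a constant-product pair (the x·y = k formula popularized by Uniswap V2), ignoring trading fees; the swap function and the reserve values below are illustrative, not taken from the actual simulation:

# Constant-product AMM pair: x * y = k is preserved by every trade (fees ignored).
def swap(x_reserve, y_reserve, dx):
    """Sell dx units of asset X into the pair and return the amount of Y received."""
    k = x_reserve * y_reserve
    return y_reserve - k / (x_reserve + dx)

# The same US$1M buy moves the price far less in a deeper pool.
for usdc_reserve in (150_000_000.0, 1_500_000_000.0):
    btc_reserve = usdc_reserve / 45_000.0                   # pool balanced at US$45k per BTC
    btc_out = swap(usdc_reserve, btc_reserve, 1_000_000.0)
    new_rate = (usdc_reserve + 1_000_000.0) / (btc_reserve - btc_out)
    print(round(new_rate, 2))                               # ~45602 vs ~45060: deeper pool, smaller price impact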

Implications on correlation

Now we can start exploring the implications on correlation. You see, a DEX pair is basically a bag that locks the supplies of two cryptocurrencies together, creating a strong relationship between them. Even so, if you have only one or two DEX pairs to play with, you may not achieve much with respect to the correlation of the assets’ prices against the dollar. But if we define a closed system with at least three DEX pairs, like the one shown below, interesting things start to happen:

[Diagram: closed system with BTC/USDC, ETH/USDC and BTC/ETH pairs]

In this system we have:

  • One BTC/USDC pair defining a price of US$ 45k per BTC
  • One ETH/USDC pair defining a price of US$ 3.2k per ETH
  • One BTC/ETH pair creating a relationship between these two assets
  • Pricing consistency between all pairs
  • US$ 300M TVL in each pair

Remember that each DEX pair defines an independent exchange rate between two assets. So if we buy a lot of BTC in the BTC/USDC pair with stablecoins, for instance a US$ 1M trade, we’ll generate upward pressure on the price of BTC as defined by that pair:

[Diagram: US$ 1M trade against the BTC/USDC pair]

Trade details:

  • 1M USDC input
  • 22.075 BTC output
  • Effective Ex. Rate of 45300
  • Resulting BTC/USDC pair Ex. Rate of 45602
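
These figures follow directly from the constant-product formula. A quick sanity check in Python, assuming the pair starts with US$ 150M of USDC and the equivalent amount of BTC at US$ 45k (half of the US$ 300M TVL on each side) and ignoring fees:

usdc_reserve = 150_000_000.0               # half of the pair's US$300M TVL
btc_reserve = usdc_reserve / 45_000.0      # ~3,333.33 BTC at the initial price
dx = 1_000_000.0                           # the US$1M buy

btc_out = btc_reserve * dx / (usdc_reserve + dx)             # constant-product output
print(round(btc_out, 3))                                     # 22.075 BTC
print(round(dx / btc_out))                                   # effective rate: 45300
print(round((usdc_reserve + dx) / (btc_reserve - btc_out)))  # resulting pair rate: 45602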

This new price will be unbalanced in regards to the other pairs, triggering an arbitrage opportunity since a trader holding USDC could now buy ETH in the ETH/USDC pair, exchange ETH for BTC in the BTC/ETH pair and finally sell BTC for stablecoins in the BTC/USDC pair making a profit.

[Diagram: arbitrage route USDC → ETH → BTC → USDC across the three pairs]
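
Continuing the same sketch (constant product, no fees, each pair starting with US$ 150M worth of each asset), a single US$ 10k round trip through the three pairs right after the US$ 1M buy already returns more USDC than it spends:

def swap(x_reserve, y_reserve, dx):
    """Sell dx units of asset X into an x*y=k pair and return the amount of Y received."""
    return y_reserve * dx / (x_reserve + dx)

usdc0 = 150_000_000.0          # initial USDC reserve of each USDC pair
btc0 = usdc0 / 45_000.0        # initial BTC reserve (~3,333.33 BTC)
eth0 = usdc0 / 3_200.0         # initial ETH reserve (46,875 ETH)

# BTC/USDC reserves after absorbing the US$1M buy; the other pairs are untouched.
btc_bought = swap(usdc0, btc0, 1_000_000.0)
btc_usdc = [btc0 - btc_bought, usdc0 + 1_000_000.0]   # [BTC, USDC]
eth_usdc = [usdc0, eth0]                              # [USDC, ETH]
btc_eth = [eth0, btc0]                                # [ETH, BTC]

# Route US$10k through the three pairs: USDC -> ETH -> BTC -> USDC.
usdc_in = 10_000.0
eth_out = swap(eth_usdc[0], eth_usdc[1], usdc_in)
btc_out = swap(btc_eth[0], btc_eth[1], eth_out)
usdc_out = swap(btc_usdc[0], btc_usdc[1], btc_out)
print(round(usdc_out - usdc_in, 2))   # positive: the round trip is profitable (roughly US$ 130 here)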

Now let’s consider that a trader took advantage of this arbitrage opportunity to the fullest, and analyze the resulting exchange rates when the system reaches equilibrium:

[Diagram: exchange rates after the system reaches equilibrium]

So, compared to the initial state of the system, the US$ 1M trade to buy BTC had the effect of:

  • Raising the price of BTC by 0.89% (from US$ 45,000.00 to US$ 45,400.00)
  • Raising the price of ETH by 0.44% (from US$ 3,200.00 to US$ 3,214.20)
  • Inflating the TVL in the system by 0.44% (from US$ 900M to US$ 904M)

As you can see, the initial rise in the BTC price opened up an arbitrage opportunity that, once exploited to exhaustion, had the effect of raising the price of ETH as well. To put it simply, this closed system created an inherent correlation between the ETH and BTC prices.

Conclusion

In this qualitative analysis we’ve seen how a system of DEX pairs builds up correlation between crypto assets as a result of arbitrage being exploited between these pairs. Even though the analysis was based on a simulated US$ 1M trade to buy BTC, similar and consistent results hold for selling BTC, as well as for buying or selling ETH, within this closed system.

As of the time of this writing, Uniswap on Ethereum mainnet alone holds US$ 4.77b of TVL in hundreds of DEX pairs, creating an entangled web of relationships between crypto assets and contributing to the correlation among them.


Notes

  • The simulation whose results are presented in this post was based on Uniswap’s V2 protocol implementation. Similar results should hold for the more complex and recent V3 implementation, which adopts the concept of virtual supplies.

  • The complete source code for running this simulation is provided on GitHub. The routine used for generating the results presented in this post can be found in this code file.