If you invest in cryptocurrencies such as Bitcoin and Ethereum and pay tax in Germany, some day you might find yourself having to calculate how much you owe the Finanzamt. That day came for me this year and here’s how I worked it out with Python.

My crypto backstory: a couple of years ago, I was working close to the Fintech industry as an event organiser for the Fintech Week in Germany and decided to get my hands dirty with a bit of cryptocurrency. I hit on a time where my money suddenly grew and I asked myself: if I had this amount of money in cash, would I invest it in cryptocurrency today? The answer was no, so I made a withdrawal that – in retrospect – was a good decision.

The downside: I had to pay tax on it, because I was selling crypto that I had acquired within a year.

My understanding – and it’s only my understanding so you should talk to a tax advisor instead of relying on what you read here – is that you have to tax any income due to sales of financial assets you acquired within the past year.

Challenge 1: Multiple purchase prices at different times

The problem is: I bought bits and pieces of different cryptocurrencies over a period of time, some within the past year, some before, all at different prices. How do I know the purchase price for the crypto I sold?

The answer is: the purchase price varies a lot. Add to that the fact that, when you sell cryptocurrency, Coinbase (the exchange I was using) doesn’t sell it all in one chunk: it sells whatever it can at different prices. This became a brain-sizzler and impossible to calculate manually.

Challenge 2: EUR exchange rate for any given time

As I understand it, even when you exchange cryptocurrency, e.g. Etherum for Filecoin, you have to calculate the EUR value as if you had sold ETH for EUR and then bought FIL with the EUR you had gained. Coinbase only includes the prices in the asset you exchange for (not always EUR) so I had to query their API to get an approximation of the exchange rate at the time.

This is quite complex because you can’t just query a specific timestamp and get the exchange rate, or pull in data for the whole year. You have to query candles (i.e highest and lowest price within a time frame), and decide on a granularity (i.e. the timeframe for each candle). I chose a granularity of 86400 seconds, i.e. one day, and took a mean of the opening, closing, highest and lowest prices for that day. There’s a limit to the number of candles you can query, so you also have to split the query into several requests and then merge them afterwards.

If you want to see the details, it’s in the get_xrs.py file.

Challenge 3: Considering only the relevant year

I wanted to calculate my tax for 2021 – so any sales that were made in 2021. But you still have to parse all your data so you know how much stock you have at the time of a sale. This was quite easy to implement but nevertheless difficult to understand in the beginning.

If I ever go through this madness again, I can simply specify in the year argument when instantiating BalSheet which year I’m interested in.

Challenge 4: Parsing a Coinbase statement

Parsing the Coinbase statement is no mean feat. It’s easy when you know how – just a question of importing a CSV – but understanding the meaning of the various headings is challenging. It took me a while to understand the different between order ID and trade ID.

I tried Python’s built-in namedtuple, which worked well – it works just like a tuple but you can access elements with a dot notation, e.g. candle.high, candle.low etc. Because they’re tuples, though, you can’t change them so you have to overwrite any changes with the _replace method.

In future, I can just specify with the filename parameter where my Coinbase statement is (although I think they’ll probably change the format as Coinbase Pro is being discontinued).

The heart of the application: the balance sheet

‘Application’ sounds a little fancy for a glorified spreadsheet, but this is what I’m going to call it because it has multiple moving parts. Through several attempts, I learned to keep this as simple as possible and avoid the temptation to over-generalise. So I decided for the FILO method (first in, last out) and hard-coded that.

The balance sheet doesn’t keep track of anything that’s not necessary for the calculation: so it’s really just the taxable income that is tracked.

At the core of what it’s doing is this: each purchase, which has a different price, gets logged. But when it comes to selling, we need to work through our stock for the currency we’re selling, and pick just enough of it to fulfil the sale. If that’s only part of an item (i.e. a purchase), or several items and part of a last item, we need to update the stock at each purchase price every time we remove from them.

The way I think of it is that every time you purchase, you add a layer of rock. Every time you sell, you drill down a certain distance and need to keep track of which layers you drill through and how deep the hole is. Or in terms of heaps of sand in a builders yard: each one purchased at a different price and time. If you sell 3 tonnes of sand, you need to take sand from each pile in turn until you have 3 tonnes. If the first pile is 0.5 tonnes, the second 1.5 and the third 2 tonnes, you need all of the first two piles and 1 tonne of the last pile. The last pile, at this point, is 1 tonne. The next sale will start from the last pile.

This logic is contained with the objects BalSheet and Item. Each pile of sand, or layer of rock, is an Item. The BalSheet object deals with purchasing items and selling them in the right order, and keeping track of how much income needs taxing.

You can see how it works in the bal_sheet.py.

Test test test

Because this is very complex in real-world conditions, I started off by writing this logic, then using good old-fashioned assert statements (and eventually PyTest) to check it was working with some examples I worked out myself. Because I know it works on relatively simple examples, I can assume it works for more complex cases and the values it returns are plausible.

The tests are quite simple, but the good thing is that once they’re set up, you can be sure you haven’t broken things that worked earlier. Tests are in the /tests folder and if you install pytest, they should all pass.

Bringing it all together

What sounded like a relatively simple problem to start with ended up having several moving parts that need to interact with each other. In particular, parsing the statement and getting exchange rate information for every transaction was tricky: this required a function to find the day that a particular timestamp falls within. Once it was all set up, though, it was a case of running through the transactions one by one, adding them to the balance sheet as either a buy or sell (depending on whether it was a +ve or -ve amount), and then totting it up at the end.

For the sake of simplicity, I ignored fees — I could have saved myself a little tax, but a little exploration of my statement in Pandas shows that there wasn’t much to be saved here so the tax authorities can have a small bonus to save me the time of working it out down to the cent.

On GitHub

If you want to dive deeper into the code — maybe the core balance sheet part could be useful – you can find the repository on GitHub.