Retirement Simulator #9: Track averages and final outcomes

Published Dec. 31, 2021, 9:36 p.m.

Welcome to the 9th tutorial in this series!  Last time we entered the multiverse and plotted 100 out of 1000 simulated histories of investing in US markets.  Those plots are cool and would probably tell us something if we kept track of our results.  I have chosen to keep track of the results using a "list of lists" method. 

You may recall that last time we defined our "money" and "dates" lists inside of our for loop and just reset the list to empty at the start of the loop.  We would actually like to keep those data so we will remove the definition of money and dates inside the loop.

Before the loop begins but under the definition of the number of simulations (nsims) we want to define a list of lists:

money = [[] for x in range(nsims)]
dates = [[] for x in range(nsims)]

I'm going to define this again but I am going to add some space between the outside brackets and everything else:

money = [    [] for x in range(nsims)    ]
dates = [    [] for x in range(nsims)    ]

In words this would be the equivalent of saying:  "money" equals a list of empty lists copied 'nsims' number of times.

This means that the preamble to our for loop and a little following should look like this:

nsims = 1000
money= [[] for x in range(nsims)]
dates = [[] for x in range(nsims)]
for sim in range(nsims):
    todays_date = date.today()
    retirement_date = date(2030, 5, 30)
    retirement = amount_already_in_retirement

Now, anywhere in our code that we had "money" or "dates" we need to replace it with "money[sim]" or "dates[sim]".

### money --> money[sim]
### dates --> dates[sim]

This syntax is necessary now because we are accessing a particular list within the "list of lists".  The for loop should look like this:

nsims = 1000
money= [[] for x in range(nsims)]
dates = [[] for x in range(nsims)]
for sim in range(nsims):
    todays_date = date.today()
    retirement_date = date(2030, 5, 30)
    retirement = amount_already_in_retirement
    while todays_date < retirement_date:
        i+=1        
        retirement += add_to_retirement
        retirement *= (1+paycheck_interest_rate)
        todays_date += pay_frequency
        this_year = todays_date.year
        if this_year > year_check:
            # annual_interst_rate  = random.gauss(mean, sigma)
            annual_interst_rate  = random.choice(x_dist)
            paycheck_interest_rate = get_paycheck_interest_rate(annual_interst_rate, paychecks_per_year)
        year_check = todays_date.year
        money[sim].append(retirement)
        dates[sim].append(todays_date)
    if sim%10==0:
        axs[0,1].plot(dates[sim], money[sim])
        plt.pause(0.001)
    print(f"I have {retirement:.2f} as of  {todays_date}.")

So then, what happens at the end of this loop is that you have filled up all the empty lists that you created before the loop started.  Now we can do some fancy python stuff to compute the average of all the money lists and also get a snapshot of the distribution of the final amounts (which is the info we are most intersted in).

We can start by defining two new variables:

finalamounts = []
average_amount = np.zeros(shape = len(dates[0]))

We define an empty list for our "finalamounts" and a numpy array full of zeros using np.zeros.  I used the "length" (len) of the list 'dates[0]' but I could have used any of the money or dates lists.  If you don't understand why I did this, don't worry.  I will explain.

We all (probably) know what an average is.  And if I tasked you will creating an average from the list "money" lists I am sure you would succeed.  What I show in the video is a method that I found / derived from using np arrays (which I don't use often).

We should be able to agree that an average of some numbers is the sum of the numbers divided by the number of numbers.

######  Here is an aside on np arrays the the "+=" architecture.  Skip if you don't care... ######

Well, python gets a lot of milage out of their "+=" syntax, but it works differently for numbers than it does for lists and differently still for numpy arrays.  In the latter:

"""
############################################
for numpy arrays a and b :
if a = [1    and b = [2    then a + b = [3 
        1,            3,                 4,
        1,            4,                 5,
        1,            5,                 6,
        10]           6]                 16]
#############################################
"""

In a terminal that would look like this:

>>> import numpy as np
>>> a = np.array([1,1,1,1,10])
>>> b = np.array([2,3,4,5,6])
>>> a+b
array([ 3,  4,  5,  6, 16])

For lists, the + opperator works differently:

"""
for lists a and b:
if a = [1,1,1,1,10]  and b = [2,3,4,5,6]  
then a+b = [1,1,1,1,10,2,3,4,5,6]
#############################################
"""

In a terminal window:

>>> a = [1,1,1,1,10]
>>> b = [2,3,4,5,6]
>>> a+b
[1, 1, 1, 1, 10, 2, 3, 4, 5, 6]

Furthermore, np.arrays handle scalar multiplication more 'natively' than lists.

>>> a = np.array([1,1,1,1,10])
>>> a/10
array([0.1, 0.1, 0.1, 0.1, 1. ])
>>>

whereas if you try that with a list:

>>> a = [1,1,1,1,10]
>>> a/10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for /: 'list' and 'int'
>>>

I think I will make a one-off tutorial about this kind of thing later.  For now, just remember that we have chosen to convert our lists to np.arrays so we can take advantage of the architecture.

#####  BACK TO OUR REGULARLY SCHEDULED PROGRAM...  #######

So the following:

finalamounts = []
average_amount = np.zeros(shape = len(dates[0]))

gives us an empty list and a np.array full of zerors (the number of zeros is the same as the length of our dates[0] list == nsims.)

So we would like to fill our 'finalamounts' list with the final amount from each simulation, AND, we would like our average_amounts np.array to end up having the average amount at all points in time from all simulations.

We can accopmplish this in a for loop:

for sim in range(nsims):
    finalamounts.append(money[sim][-1])
    money[sim] = np.array(money[sim])
    average_amount += money[sim]/nsims

In the box above we loop throgh a range based on our nsims variable, we append to our 'finalamounts' list the money from a particular simulation (sim) and use the [-1] call to get the last value of that list (the final amount... Duh.)  Then comes the tricky bit...  we convert our 'money[sim]' variable from a list into a np.array so that it is the same type as our 'average_amounts' np.array.  Now we can simply add the newly converted np.array to our average_amount container and simultaneously divide by the nsims variable (thus computing the average at the last step of the loop).  If you don't believe this, do the math yourself.  You can divide by the 'nsims' variable once at the end, or at each step.  It makes no differnce.

Once this loop has completed, we now have a list of final amounts (good for a new histogram) and a np.array with the computed average amount (good for plotting against our plot of retirement simulation trajectories).

To plot the average_amount on our current subplot (which was axs[0,1]) :

axs[0,1].plot(dates[0], average_amount, linewidth = 3, linestyle = "--", color = "k",label=f"Retire{todays_date.year}")
axs[0,1].legend()

Here we add a label and also invoke a 'legend' that will display the label info for any plots that are labeled.  The linestyle and linewidth also make sure that our plot stands out.

To plot a histogram of our final amounts we should use a new subplot: axs[1,0]:

axs[1,0].hist(finalamounts, histtype="step", label = f"Retire {todays_date.year}")
axs[1,0].legend()

plt.show()

the plt.show() must be at the end for all plots to be displayed.   Hopefully this worked for you.  Your average line should look right.  Your histogram should look like a "Poisson" distribution (look it up!).   In our next and final retirement simulator tutorial we will take advantage of all of our present architecture and show how easy it is to make some meaningful analysis once everything is in place like we built it.  Python makes the next bit very easy.  Thanks for watching, see you in the next one.

skip_nextRetirement Simulator #10: Test and Compare Different Retirement dates (And make it pretty!)
  • Retirement Simulator Introduction and Overview

  • Retirement Simulator #1: March through time with datetime and put money under the mattress!

  • Retirement Simulator #2: Apply an interest rate and verify it!

  • Retirement Simulator #3: Use a Gaussian distribution of rates!

  • Retirement Simulator #3+: Verify our Gaussian distribution!

  • Retirement Simulator #4: Make a US market rates distribution

  • Retirement Simulator #5: Fit the Market Data with scipy curve_fit

  • Retirement Simulator #6: Sampling from our custom market distribution

  • Retirement Simulator #7: Organize multiple plots with subplots! and learn about plt.pause()

  • Retirement Simulator #8: Multiverse Investing: Simulate 1000 random investors

  • Retirement Simulator #9: Track averages and final outcomes
    (currently viewing)
  • Retirement Simulator #10: Test and Compare Different Retirement dates (And make it pretty!)