Published Dec. 31, 2021, 9:26 p.m.
Welcome to the 4th tutorial! The bones are built! Now it's time to add muscle and organs (we already had an appendix?). To do that, we want to use real-world market data from which to get a "distribution" of rates.
When I googled it I found these links:
https://www.macrotrends.net/2526/sp-500-historical-annual-returns
https://www.macrotrends.net/1319/dow-jones-100-year-historical-chart
You can use any data you like. From these websites, I copied the data and then pasted it into a text file. (I formatted it to get rid of everything except the year and the rate). The data for the first file looks like the following:
2021 ,20.15
2020 ,16.26
2019 ,28.88
2018 ,-6.24
2017 ,19.42
2016 ,9.54
2015 ,-0.73
2014 ,11.39
2013 ,29.60
2012 ,13.41
2011 ,0.00
2010 ,12.78
2009 ,23.45
2008 ,-38.49
2007 ,3.53
2006 ,13.62
2005 ,3.00
2004 ,8.99
2003 ,26.38
2002 ,-23.37
2001 ,-13.04
2000 ,-10.14
1999 ,19.53
1998 ,26.67
1997 ,31.01
1996 ,20.26
1995 ,34.11
1994 ,-1.54
1993 ,7.06
1992 ,4.46
1991 ,26.31
1990 ,-6.56
1989 ,27.25
1988 ,12.40
1987 ,2.03
1986 ,14.62
1985 ,26.33
1984 ,1.40
1983 ,17.27
1982 ,14.76
1981 ,-9.73
1980 ,25.77
1979 ,12.31
1978 ,1.06
1977 ,-11.50
1976 ,19.15
1975 ,31.55
1974 ,-29.72
1973 ,-17.37
1972 ,15.63
1971 ,10.79
1970 ,0.10
1969 ,-11.36
1968 ,7.66
1967 ,20.09
1966 ,-13.09
1965 ,9.06
1964 ,12.97
1963 ,18.89
1962 ,-11.81
1961 ,23.13
1960 ,-2.97
1959 ,8.48
1958 ,38.06
1957 ,-14.31
1956 ,2.62
1955 ,26.40
1954 ,45.02
1953 ,-6.62
1952 ,11.78
1951 ,16.46
1950 ,21.78
1949 ,10.26
1948 ,-0.65
1947 ,0.00
1946 ,-11.87
1945 ,30.72
1944 ,13.80
1943 ,19.45
1942 ,12.43
1941 ,-17.86
1940 ,-15.29
1939 ,-5.45
1938 ,25.21
1937 ,-38.59
1936 ,27.92
1935 ,41.37
1934 ,-5.94
1933 ,46.59
1932 ,-15.15
1931 ,-47.07
1930 ,-28.48
1929 ,-11.91
1928 ,37.88
The second file is thus:
2021 , 14.13
2020 , 7.25
2019 , 22.34
2018 , -5.63
2017 , 25.08
2016 , 13.42
2015 , -2.23
2014 , 7.52
2013 , 26.50
2012 , 7.26
2011 , 5.53
2010 , 11.02
2009 , 18.82
2008 , -33.84
2007 , 6.43
2006 , 16.29
2005 , -0.61
2004 , 3.15
2003 , 25.32
2002 , -16.76
2001 , -7.10
2000 , -6.17
1999 , 25.22
1998 , 16.10
1997 , 22.64
1996 , 26.01
1995 , 33.45
1994 , 2.14
1993 , 13.72
1992 , 4.17
1991 , 20.32
1990 , -4.34
1989 , 26.96
1988 , 11.85
1987 , 2.26
1986 , 22.58
1985 , 27.66
1984 , -3.74
1983 , 20.27
1982 , 19.60
1981 , -9.23
1980 , 14.93
1979 , 4.19
1978 , -3.15
1977 , -17.27
1976 , 17.86
1975 , 38.32
1974 , -27.57
1973 , -16.58
1972 , 14.58
1971 , 6.11
1970 , 4.82
1969 , -15.19
1968 , 4.27
1967 , 15.20
1966 , -18.94
1965 , 10.88
1964 , 14.57
1963 , 17.00
1962 , -10.81
1961 , 18.71
1960 , -9.34
1959 , 16.40
1958 , 33.96
1957 , -12.77
1956 , 2.27
1955 , 20.77
1954 , 43.96
1953 , -3.77
1952 , 8.42
1951 , 14.37
1950 , 17.63
1949 , 12.88
1948 , -2.13
1947 , 2.23
1946 , -8.14
1945 , 26.65
1944 , 12.09
1943 , 13.81
1942 , 7.61
1941 , -15.38
1940 , -12.72
1939 , -2.92
1938 , 28.06
1937 , -32.82
1936 , 24.82
1935 , 38.53
1934 , 4.14
1933 , 66.69
1932 , -23.07
1931 , -52.67
1930 , -33.77
1929 , -17.17
1928 , 49.48
1927 , 27.67
1926 , 4.05
1925 , 25.37
1924 , 26.16
1923 , -2.70
1922 , 21.50
1921 , 12.30
1920 , -32.90
1919 , 30.45
1918 , 10.51
1917 , -21.71
1916 , -4.19
1915 , 81.49
Whoa. 1915! What a year to invest in the market. Anyway, you can copy one or both of these into their respective files. We will read this data into our code. Truth be told, the year is not a relevant piece of data for our purposes. If you want to challenge yourself, you can erase the year data and figure out how to read in the rate data yourself. However, if you follow the video, I will show you how to read them both in.
Save the files as text files in your work folder. I chose the second one (Dow market returns) and saved it as "dow_market_returns.txt". Then I create a variable in my code with that file name:
filename = "dow_market_returns.txt"
Then I do my three-line trick for reading in a text file:
f = open(filename, 'r')
lines = f.readlines()
f.close()
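As an aside, the same read can be written with a "with" block, which closes the file automatically even if an error occurs partway through. This is only a minimal sketch, not what the video uses; the two sample lines and the temporary folder are stand-ins so that it runs on its own.

```python
import os
import tempfile

# Stand-in for the real data file: two sample lines written to a temporary folder.
folder = tempfile.mkdtemp()
filename = os.path.join(folder, "dow_market_returns.txt")
with open(filename, "w") as f:
    f.write("2021 , 14.13\n2020 , 7.25\n")

# The "with" block closes the file for us when the block ends.
with open(filename, "r") as f:
    lines = f.readlines()

print(lines)
```

Either way you end up with the same list of strings, one per line of the file.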
Now we have the list "lines". To get the data we need to loop through the lines and "cast some types". That would look as follows:
years = []
rates = []
for line in lines:
    year, rate = line.split(',')
    years.append(int(year))
    rates.append(float(rate)/100)
We established two empty lists (years and rates). Then we loop through the lines, splitting each line into the year and the rate. We then append those to our lists while casting each one into the 'type' that we want (int or float). The file stores each rate as a percentage, so we divide by 100 to get the fraction that the simulation actually uses (14.13 becomes 0.1413).
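One thing to watch for: if the text file ends with a blank line, line.split(',') returns a single item and the unpacking into year and rate fails. A slightly more defensive version of the loop might look like the sketch below. The sample "lines" list is made up here to stand in for the file contents, and skipping blank lines is my assumption about how you would want to handle them.

```python
# Stand-in for the output of f.readlines(); note the blank line at the end.
lines = ["2021 , 14.13\n", "2020 , 7.25\n", "2018 , -5.63\n", "\n"]

years = []
rates = []
for line in lines:
    line = line.strip()
    if not line:
        continue  # skip blank lines instead of crashing on them
    year, rate = line.split(',')
    years.append(int(year))          # int() tolerates the surrounding spaces
    rates.append(float(rate) / 100)  # percent -> fraction

print(years)
print(rates)
```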
Now, as a first step toward using real data, we will use another function from the random module: choice. The random.choice function selects a single value (at random) from a list. We happen to have a list of realistic market rates, so we can simply replace the random.gauss function with this new one like so:
year_check = todays_date.year
while todays_date < retirement_date:
    retirement += add_to_retirement
    retirement *= (1+paycheck_interest_rate)
    todays_date += pay_frequency
    this_year = todays_date.year
    if this_year > year_check:
        # annual_interst_rate = random.gauss(mean, sigma)
        annual_interst_rate = random.choice(rates)
        paycheck_interest_rate = get_paycheck_interest_rate(annual_interst_rate, paychecks_per_year)
        print(this_year, annual_interst_rate)
        year_check = todays_date.year
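To see random.choice on its own, outside the simulation loop, here is a small sketch. The short "rates" list is just the first five Dow values from above, and the seed is only there so the run is repeatable; neither is part of the simulator itself.

```python
import random

# First five Dow rates from the data above, already divided by 100.
rates = [0.1413, 0.0725, 0.2234, -0.0563, 0.2508]

random.seed(5)  # optional: makes the draws repeatable from run to run

# Draw one rate per simulated year; every draw is an actual historical value.
for year in range(2022, 2027):
    annual_rate = random.choice(rates)
    print(year, annual_rate)
```

Each call is independent, so the same historical rate can come up more than once in a simulated career.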
Now we are using real data, so perhaps our simulation is a bit more realistic. However, if this rubs you the wrong way, then you are not alone. We don't REALLY want to use a list of OLD rates. We want the list of old rates to inform a model from which we can determine NEW rates. And that is what we will begin to do in the next video.
But before we finish, we can take our test from the 3+ video and change it slightly so that we now plot our list of rates. The list of rates is much smaller but should be enough to show something like a Gaussian shape. With so few data points, the histogram is a bit underwhelming compared to a traditional Gaussian shape. Also, it may be that a Gaussian only approximates the actual shape. Whatever the case may be, a Gaussian is as good a guess as any, so next time we are going to fit this data with a Gaussian function and use the 'optimized parameters' from our fit to make our own sample distribution. See you there!