One of the most often repeated sayings in the investing world (at least, for the portion that believes in index investing) is that time in the market is more important than timing the market; that is, market timing—the practice of making investing decisions based on your prediction of how the market will perform in the future—is generally a difficult endeavor, and you are better off just investing regularly and remaining invested.

Sometimes investors (especially new ones) are hesitant to invest in the stock market because it is close to its all-time high. However, this is a common occurrence. To see just how common it is, I generated this plot.


The first few data points are shown in this table.

Within X% of the up-to-then all-time high    Percentage of market trading days
0%                                           7.32%
1%                                           19.84%
2%                                           29.05%
3%                                           36.05%
4%                                           42.28%
5%                                           47.26%
6%                                           51.40%
  • The S&P 500 closed at a record high on 7.32% of trading days!
  • It spent almost 20% of its trading days within 1% of its all-time high.
  • It spent close to 50% of its trading days within 5% of its all-time high.

In fact, it's not a stretch to say that the natural state of the S&P 500 is to stay within single-digit percentages of its all-time high.
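To make "within X%" concrete: one way to read the table is that each day's distance below the running high is rounded up to the next whole percent. Here is a minimal sketch of that reading; the helper name `percent_bin` and the prices are made up for illustration only.

```python
from math import ceil

def percent_bin(closing, running_high):
    """Smallest whole percent X such that closing is within X% of running_high."""
    return ceil(100 * (1 - closing / running_high))

# Hypothetical prices, chosen only to illustrate the binning.
print(percent_bin(100.0, 100.0))  # 0: the close equals the running high
print(percent_bin(99.5, 100.0))   # 1: 0.5% below the high rounds up to the 1% bin
print(percent_bin(97.6, 100.0))   # 3: 2.4% below the high rounds up to the 3% bin
```

Rounding up rather than to the nearest percent means a close 2.4% below the high is never counted as being "within 2%".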


I pulled the data from Yahoo Finance, using all data available at the time of writing (1950/01/03 to 2019/06/27). I then used the following Python script.

I honestly found it a little hard to believe that the S&P 500 traded so often within single digit percentages of its all time high. If you find any errors in my script, please let me know.

#!/usr/bin/env python
import csv
from math import ceil
from matplotlib import pyplot as plt
import numpy as np

with open('s&p500-yahoo.csv', 'r') as csvFile:
	reader = csv.reader(csvFile)
	# Create iterator and skip first row (header)
	iterreader = iter(reader)
	next(iterreader)

	all_time_high = 0 # running all time high (named so it doesn't shadow the built-in max)
	pdf = [0]*100 # technically should be 101, but the stock market has never fallen to zero, so who cares
	num_market_days = 0

	# Loop over all rows in file
	for row in iterreader:
		num_market_days += 1

		# Column 5 is 'Adj Close', assuming Yahoo Finance's usual export order
		# (Date, Open, High, Low, Close, Adj Close, Volume)
		closing = float(row[5])

		# If we've hit a new all time high, update the high
		if closing > all_time_high:
			all_time_high = closing

		# By using ceil, only an exact match will fall into the 0% bin
		# Then, anything within 0-1% will fall into the 1% bin, anything within 1-2% will fall into the 2% bin, etc.
		# If I were to use round, I'd be saying that something within 2.4% of the max is actually within 2% of the max, which is wrong
		within_percent = ceil(100*(1-closing/all_time_high))
		pdf[within_percent] += 1

	cdf = [0]*100 # again, technically should be 101; see previous note on pdf

	# first iteration will use a negative index, but that will still pull cdf[99], which is 0, so it's fine
	for idx, val in enumerate(pdf):		
		cdf[idx] = cdf[idx-1] + val/num_market_days*100


# Plot in xkcd style
plt.xkcd()
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
ax.set_ylim([0, 100])
ax.set_xlim([0, 60])

cdf = np.array(cdf)
# Draw the cumulative percentage; the array index is the "within X%" bin
ax.plot(cdf)
plt.xlabel('Within this percent of its up to then max')
plt.ylabel('Percentage of market trading days')
plt.title('Percentage of time the S&P 500 \nhas spent within X% of its max')
plt.annotate('The S&P 500 has spent 47.3% of market \ntrading days within 5% of its all time high', xy=(5.3, 47.3), xytext=(10, 43), arrowprops=dict(facecolor='black'))

plt.savefig('S&P500CDF.png', bbox_inches='tight')
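
As a cross-check on the loop above, the same cumulative percentages can be computed in a few vectorized NumPy steps. This is only a sketch under my own assumptions: the function name `percent_within_high` and the sample prices are hypothetical, and it expects a plain sequence of closing prices rather than the CSV file.

```python
import numpy as np

def percent_within_high(closes, max_percent=100):
    """Return an array whose entry X is the percentage of days that closed
    within X% of the highest close seen up to and including that day."""
    closes = np.asarray(closes, dtype=float)
    # Running all-time high for every day
    running_high = np.maximum.accumulate(closes)
    # Distance below the running high, rounded up to the next whole percent
    bins = np.ceil(100 * (1 - closes / running_high)).astype(int)
    # Count the days in each bin, then accumulate into percentages
    counts = np.bincount(bins, minlength=max_percent + 1)
    return 100 * np.cumsum(counts) / len(closes)

# Made-up closing prices, only to show the shape of the result.
sample = [100.0, 102.0, 101.5, 99.0, 103.0, 95.0]
cdf = percent_within_high(sample)
print(cdf[0], cdf[5], cdf[10])
```

Feeding it the closing prices from the CSV should reproduce the table's values if the loop version is correct.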