The Coordinated Max-Median Rule for Portfolio Selection

Module by: Ricardo Affinito

Summary: This module summarizes part of the work performed by a Computational Finance PFUG under Rice University's VIGRE Summer Research Program. VIGRE (Vertically Integrated Grants for Research and Education) is funded by the NSF (National Science Foundation). The empirical research was geared towards assessing the performance of an "improved" n-at-a-time stock selection rule for portfolio construction. The "Coordinated Max-Median" algorithm developed is described in detail, along with its computational challenges, and various evaluations performed with real-world data (the S&P 500 Index) are included.

Motivation

The Max-Median Rule for Portfolio Selection

Previous research suggests that there exist strategies that, when implemented over a long period of time, might provide higher returns than overall market performance (see, e.g., [1]). One of these strategies, the “Max-Median Rule”, was investigated by Thompson and Baggett (see [2]) and served as the general motivation for this research. By selecting a handful of stocks according to some robust criterion (e.g., the median) and rebalancing consistently without straying from the strategy, virtually any investor could manage his or her portfolio quite reasonably. Over the long haul, this strategy would provide decent returns compared to a benchmark index (e.g., the S&P 500 Index). It is worth noting that in strategies such as these, time is a major consideration (and one which investors can control, e.g., when investing retirement funds such as a 401(k)); these methods do not constitute day-trading strategies and should be adhered to consistently over a given period.

Several salient points of this motivating investment strategy are:

1. It is accessible to any individual investor.
2. Over the extensive time period for which it was examined (37 years, 1970 through 2006), it outperformed the S&P 500 Index by about 50%.
3. It was slightly more volatile on a yearly basis, an effect that can, to a reasonable extent, be used to an investor's advantage in longer-term investment strategies.

These points clearly serve as motivation for further investigation and potential improvement, in particular once we recognize that the existing strategy, albeit well-performing, is inherently a “one-at-a-time” strategy and therefore does not capture any correlation-related dynamics through its selection criterion.

Lastly, we were also motivated to investigate (at least initially) equally-weighted portfolios. An interesting finding (see, e.g., Wojciechowski, Baggett, and Thompson [3]) is that “for the 33 years from 1970 through 2002, not simply a flukish few, but a staggering 65 percent of the portfolios selected randomly from the 1,000 largest market cap stocks lie above the Capital Market Line (CML).” It has also been shown (see [2]) that any individual who invested equally in the S&P 500 constituents over the period 1970 through 2006 would have made, on yearly average, 13.7%, as opposed to 8.9% under a competing market-cap weighted strategy. Both of these empirical realities make, at least preliminarily, a case against long-term market-cap weighted strategies.

The Coordinated Max-Median Rule

Introduction

We now consider a strategy which allows us to implicitly capture the joint performance of securities as part of our selection criterion. Our goal is to pick, from the universe of investible stocks, a meaningful handful across which to equally allocate a given investment quantity on a yearly basis. As a first step, we consider the S&P 500 constituents to be our universe of stocks, from which we select a critical few (a number we initially, and somewhat arbitrarily, set to 20; this quantity seemed both appealing and reasonable in terms of being financially manageable and computationally feasible). It is also worth noting that we regard limiting an investor to selections from the S&P 500 (or any other well-known index) as a reasonable and soundly restricted starting point. Furthermore, stocks listed in the S&P 500 are representative of various market sectors (inherently diversified) as well as of various reasonable company sizes (in terms of market capitalization). Additionally, the filtering criteria inherent in a reasonably sized index (in terms of the number of constituents) provide a good baseline, both as a benchmark to outperform and as a sensible constraint on the universe of all potentially considered stocks.

Preliminary Setup

Our first step is to select a subset of stocks from a given index in which to allocate a given investment at any given point in time. In general, we consider a subset of $n$ stocks from a given index $I$ with $K$ constituents. Our evaluations considered $I = {}$S&P 500 (for which $K = 500$), assembling baskets of $n = 20$ each time. Based on this setup, there is a total of $\binom{500}{20} \approx 2.667 \times 10^{35}$ unique baskets of randomly selected securities that we could potentially consider. Clearly, if we require evaluating some optimal objective function over all possible combinations, the problem becomes computationally infeasible.
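
As a quick sanity check on this count, R (the language used for the implementation described later) can reproduce the binomial coefficient directly:

    # Number of unique 20-stock baskets that can be drawn from 500 constituents:
    choose(500, 20)   # approximately 2.667e+35 -- far too many to enumerate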

Instead, we proceed by selecting stocks according to some plausible robust criterion that can be applied to any randomly assembled basket (the most appealing choice, both in terms of interpretation and prior results, being the median of the portfolio's daily returns). We also note, quite emphatically, that to evaluate a meaningful number of portfolios as well as to assess the procedure's repeatability, we clearly need to parallelize this effort.

Algorithm

Consider the following algorithm (a minimal R sketch of Steps 1-7 follows the list):

1. Step 1. Pick $n$ stocks (e.g. $n = 20$) from the S&P 500 Index at random.
2. Step 2. Form Portfolio $j$ (start with $j = 1$) at time $t = 0$, i.e. $P_j(t = 0)$, by equal-weight investment in these $n$ stocks.
3. Step 3. On a day-to-day basis (and for $T$ trading days in any given year) compute the daily returns for Portfolio $j$:
$$r_j(t) := \frac{P_j(t) - P_j(t-1)}{P_j(t-1)}, \qquad t = 1, 2, \ldots, T \qquad (1)$$
4. Step 4. Sort these over the year's trading days.
5. Step 5. Calculate the median daily return for Portfolio $j$: let $\tilde{P}_j := \operatorname{median}\big(r_j(1), \ldots, r_j(T)\big)$.
6. Step 6. Repeat Steps 1-5 above for $j = 1, 2, \ldots, J$ (e.g. $J = 10{,}000$) randomly selected portfolios.
7. Step 7. Pick the portfolio with the highest median, i.e. $P_{j^*}$ such that $j^* = \operatorname{argmax}_{j \in \{1, 2, \ldots, J\}} \tilde{P}_j$.
8. Step 8. Invest equally in $P_{j^*}$.
9. Step 9. Hold for one year, then liquidate.
10. Step 10. Repeat Steps 1-9, yearly, over the time frame of interest.
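
The following is a minimal sequential sketch of Steps 1-7, not the authors' actual code; the T x K matrix `returns` (one year of daily stock returns, one column per index constituent) is a hypothetical data layout assumed for illustration:

    # Sketch of Steps 1-7; 'returns' is an assumed T x K matrix of daily returns.
    select_max_median <- function(returns, n = 20, J = 10000) {
      K <- ncol(returns)
      best <- list(median = -Inf, stocks = NULL)
      for (j in seq_len(J)) {
        basket <- sample.int(K, n)                         # Step 1: n stocks at random
        growth <- apply(1 + returns[, basket], 2, cumprod) # value path of $1 per stock
        value  <- c(1, rowMeans(growth))                   # Step 2: equal-weight P_j(t)
        r_j    <- diff(value) / head(value, -1)            # Step 3: daily returns, Eq. (1)
        med    <- median(r_j)                              # Steps 4-5: median daily return
        if (med > best$median) best <- list(median = med, stocks = basket)  # Steps 6-7
      }
      best
    }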

We pick a subset of n = 20 yearly “investible” stocks according to the criterion described above at the end of any given year (using the most recent one year of data). We then allocate our investment quantity to this portfolio on the first trading day of the subsequent year, holding it for one year while concurrently collecting data to repeat the procedure at year's end. We keep repeating this procedure over the period for which we want to evaluate the strategy; a sketch of this yearly loop, building on the function above, is given below.
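
    # Hypothetical backtest loop: 'returns_by_year' is an assumed list of T x K
    # daily-return matrices, one per year; year y's data select the basket that
    # is then held through year y + 1.
    backtest <- function(returns_by_year, capital = 1e5, n = 20, J = 10000) {
      for (y in seq_len(length(returns_by_year) - 1)) {
        pick   <- select_max_median(returns_by_year[[y]], n, J)
        next_r <- returns_by_year[[y + 1]][, pick$stocks]
        # Equal investment, buy and hold: portfolio growth is the mean of the
        # individual stocks' annual growth factors.
        capital <- capital * mean(apply(1 + next_r, 2, prod))
      }
      capital
    }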

It is also interesting to note that under the previous motivating rule (i.e. the Max-Median Rule), we would always get an exact answer, in a finite and rather short amount of time, regarding which stocks had the highest medians (whereas the coordinated selection problem is essentially NP-complete). This implied determinism, in the sense that any subsequent run would produce the same results, amounts to a variance of zero. Our modified algorithm, however, is inherently stochastic, as we cannot evaluate all possible combinations of portfolios. As a direct consequence of randomly selecting a reasonable number of portfolios for evaluation, we expect to observe some natural variation in the procedure's repeatability (each run will be essentially unique). It is possible (and rather interesting) to exploit this natural variation to assess the overall repeatability of this modified procedure.

Data Summary and Description

Data were obtained from the University of Pennsylvania Wharton/WRDS Repository [4]. The following data were utilized for our evaluations:

1. S&P 500 December Constituents' GVKEYs, 1965 to 2006 (Compustat).
2. S&P 500 Daily Data [including: Returns with Dividends, Share Price, Shares Outstanding, Adjustment Factors, PERMNOs (CRSP)].
3. Mapping Table from GVKEYs to PERMNOs.

Data were also obtained from Yahoo! Finance:

1. Company Tickers for S&P 500 December 2007 Constituents.
2. Index Returns for SPX (S&P 500 Market-Cap Weighted).
3. Index Returns for SPX.EW (S&P 500 Equally Weighted, available from mid-2003 to present).

For our evaluations, yearly returns were calculated from the first trading day to the last trading day of each year, with dividends included. The size of the data files analyzed was approximately 900 MB.

Parallel Processing Environment and Software

It is worthwhile mentioning some general details of the overall parallelized implementation of this procedure. It was successfully implemented in R, widely and freely available from the Comprehensive R Archive Network (CRAN). Several packages available for R make a parallelized implementation of the algorithm very straightforward. In particular, we made use of snow (see, e.g., [5] and [6]) and snowfall (see [7]), both running over Open MPI. Some of the reasons for choosing this implementation were:

1. Provides a powerful programming interface to a computational cluster (such as those available at Rice University, e.g. SUG@R and ADA).
2. Freely available from the Comprehensive R Archive Network (CRAN).
3. Easily distributes computations of existing functions (after pertinent modifications) to various computation nodes.
4. Excellent for embarrassingly-parallel implementations and computations.

In essence, this approach was very appealing in terms of performance, development time, and cost (essentially free). Although faster and more efficient implementations are possible (e.g. C/C++ or Fortran with Open MPI), the aforementioned implementation was sufficient for our purposes.

The code was initially developed for sequential execution (in SAS) and then converted to R with similar performance. It was subsequently converted from sequential to parallel execution to exploit the benefits of a parallel R implementation. The steps for this conversion are standard, essentially:

1. Identify the loops that carry independent computations. We have two main loops: first, simulating J portfolios can be regarded as J independent operations, which we can execute concurrently; second, running the algorithm over N years can be regarded as N independent computational efforts.
2. Vectorize the loops (workhorse functions).
3. Gather results (combine/gather functions).
4. Distribute execution (call the functions with snow); a snowfall-based sketch follows this list.
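
As a concrete illustration of these steps, here is a hedged snowfall sketch; the cluster type, CPU count, and split of J across workers are illustrative assumptions, not the authors' exact configuration:

    library(snowfall)

    # Distribute the J portfolio evaluations across workers (illustrative settings).
    sfInit(parallel = TRUE, cpus = 8, type = "MPI")  # "SOCK" works on a single machine
    sfExport("returns", "select_max_median")         # ship data and workhorse function
    sfClusterSetupRNG()                              # independent RNG streams per worker

    # Each worker searches a share of the portfolios; keep the overall maximum.
    pieces <- sfLapply(1:8, function(i) select_max_median(returns, n = 20, J = 3125))
    best   <- pieces[[which.max(sapply(pieces, function(p) p$median))]]

    sfStop()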

Several 64-processor jobs were submitted to Rice's Cray XD1 research cluster (ADA), each occupying 16 nodes with 4 processors apiece. The jobs would take less than 20 hours to complete for about 64 simulated tracks of 40 years each.

Results

Preliminary Results

Several simulations were run, with above-average performance observed (the number of portfolios inspected per simulation was in the range of J = 25,000 to J = 50,000). Figure 1 shows a simulation of J = 50,000 portfolios per year over a period of 43 years. Several interesting features can be noted in this figure. Any investor who made an initial investment of $100,000 in 1970 and selected the same stocks chosen by our algorithm would have seen the portfolio compound to a total of $3.7M by the end of 2008, better than both an equal-investment strategy in the S&P 500 Index (about $2.7M) and a market-cap weighted investment strategy in the S&P 500 Index (slightly below $1M). Of course, the computational power we used was not available in the early seventies, but moving forward it will be, and to an extent this is what matters. It is also clearly seen that the Coordinated Max-Median Rule is inherently more volatile than the S&P 500. Next, Figure 2 describes 1 of 3 pilot runs evaluated with various measures, suggesting the superiority of the max-median (as opposed to, for instance, the mean) as well as the benefit of using the “max” rather than the “min”. This seems plausible, at least heuristically, as the median is the most robust of these measures (the mean the least robust), and in some sense the previous year's “best” performing companies are more likely to perform well next year than the previous year's “worst” performing companies (in terms of returns).

Recent Evaluations and Results

Recent evaluations have mostly focused on the following:

1. The repeatability of the procedure, in terms of the variability associated with its possible tracks for each realization.
2. The additional gain (if any) in returns as a function of the number of portfolios evaluated (J) in any given year.
3. Any associations between current-year portfolio medians and subsequent-year (same portfolio) performance, and, if they exist, how weak or strong they are.
4. A stable and plausible stopping rule, and how beneficial it might be to run the random search until this condition is met.

Several experiments were set up to determine whether it would be worthwhile to inspect more randomly sought portfolios on a yearly basis as part of the overall procedure. A job simulating a total of 104 tracks (each consisting of J = 25,000 portfolios per year over a 43-year period, 1965 through 2008) was submitted to ADA and took approximately three days to complete. Several important observations can be made from the outcomes of these simulations (shown in Figures 3 and 4, below). We note that we can exploit the independence of the evaluated portfolios to get 52 tracks of J = 50,000 portfolios each by combining pairs of J = 25,000 tracks and selecting the maximum of each pair (simply the maximum of a longer execution); a sketch of this pairing is given below. Essentially this tells us what would have happened (in terms of the strategy's performance) had we run it for twice as long. Analogously, tracks for J = 100,000 and J = 200,000 portfolios were constructed. Finally, some overall discussion of the results is given after the figures.
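
    # Merge two independent J = 25,000 tracks into the track a single J = 50,000
    # search would have produced: take, year by year, whichever pick had the
    # higher median. The per-year list layout of 'track_a'/'track_b' (best median
    # plus its basket) is a hypothetical convention, not the authors' data format.
    combine_tracks <- function(track_a, track_b) {
      mapply(function(a, b) if (a$median >= b$median) a else b,
             track_a, track_b, SIMPLIFY = FALSE)
    }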

The total portfolio value was evaluated at the end of both 2006 and 2008 and contrasted with both market performance (blue track) and the performance of a single track of a whopping J = 2,600,000 portfolios considered yearly (green track). As expected, the variability of the procedure compounds as a function of time, and by chance we might under-perform the market. More often than not, however, the procedure out-performed the market, and by a quite reasonable amount. The proportion of simulated portfolio tracks that ended above an equally-weighted alternative at the end of 2006 was over 80% for the cases where J = 25,000 and J = 50,000, and over 90% for the cases where we assessed more randomly sought portfolios (i.e. J = 100,000 and J = 200,000). Also, there is weak evidence suggesting that, although running as many as J = 2,600,000 portfolios might at times outperform the market, this approach is generally not consistently higher on average than considering tracks consisting of fewer yearly-evaluated portfolios.

Another rather interesting observation is made through the scatter-grams produced (see Figure 5, below), assessing the correlation between current-year portfolio median and (same portfolio) next-year performance, contrasted with the performance of the S&P 500 Index. The number of portfolios evaluated for this purpose was 2,000,000, and those highlighted as producing the maximum of the medians represent less than 0.1% of them (i.e. fewer than 2,000). The main purpose of this effort was to assess any associations involving the current-year median as a forward-looking measure of portfolio performance (as we intend to pick the maximum, and by chance we can pick portfolios whose performance is similar to those above the 0.999 quantile). As expected, the associations are weak, though not extremely weak (correlations are 0.209 for the first case and -0.182 for the second); they are nonetheless noticeable and depend highly on the year evaluated.

More often than not, we observed a positive correlation for the years inspected (the strongest correlations are those shown in the figures below). It turns out that for certain years (those with a negative correlation), we ought to have utilized the “min-median” as the selection criterion. However, this cannot be known ex ante, and the best we can do is utilize a measure that, more often than not, produces above-average results. Here again, we can appreciate how these conflicting effects would average out favorably with time, reiterating that a strategy such as this one, if considered, should be evaluated over the long haul.

Lastly, several evaluations were performed comparing the various max-medians of the simulated portfolios, as a function of the number of portfolios run (i.e. J), with the single-stock max-median (see Figure 6, below), which could, at least heuristically, serve as an upper bound. This turned out (empirically) to be somewhat unstable, as there is no guarantee that any threshold set as a percentage of the bound could be attained in reasonable computing time, mainly because after a considerable number of simulations (namely J = 500,000 and up to J = 2,000,000) the percentage of the single-stock max-median attained depended considerably on the year inspected, making a generalization impossible. The most recent evaluations stopped after 5 ticks past J = 10,000 simulations, which seems stable; based on the aforementioned results, however, it does not appear to provide any incremental benefit when contrasted with, for instance, a hard-coded constant-J stopping rule.

Future Directions

Several items are open at this point that might be worthwhile investigating in future research. Amongst them are the following (to mention a few):

1. The identification and investigation of any exogenous variables contributing to any observable associations between current-year portfolio medians and next-year portfolio performances. This is of particular interest as it would provide us with the possibility of meaningfully modifying the simple-criterion to make more informed decisions based on empirical evidence.
2. Considering data from several previous years when making the decision for a given year (rather than only the immediately preceding year), as well as investigating robust interpolations (e.g. median- or quantile-regression methods).
3. Assessing the reproducibility of the procedure (or, in general, its performance) in other markets (international) and/or other indexes (S&P 100, Russell 1000, NASDAQ, etc.).
4. Investigating a more meaningful rule regarding when to stop the random-search, and how it relates to overall procedure performance.

Conclusions

In this module, we have presented the details of a modified version of the existing Max-Median Rule, allowing for the joint selection of securities within this long-term investment strategy. This modified rule, the Coordinated Max-Median Rule, bases the median selection criterion on joint portfolio performance rather than on single-stock individual performances. These modifications came at the cost of increased combinatorial complexity: owing to the impossibility of evaluating all potentially-investible portfolios, a parallelized computational approach had to be adopted to assess a satisfactory number of portfolios for potential investment on a yearly basis.

The algorithm's implementation was discussed and several conclusions were drawn, the most significant being that our modified algorithm, much more often than not, seems to out-perform the market (in terms of the S&P 500 Index) when a disciplined investor adheres to it for a reasonable amount of time. The data suggest that one contributing factor to this on-average higher performance is the correlation between current-year portfolio medians and next-year portfolio performance, a correlation that is weak and not always positive. We noted that, more often than not, these correlations tend to be positive, an effect that seemingly averages out in a positive direction over the long haul. We also evaluated the performance of the described procedure on 43 years of real-world S&P 500 data, and discussed several potential future improvements, such as further work on a more robust stopping rule and assessment of the procedure's reproducibility with other indexes and/or markets.

Acknowledgements

Special thanks are given to Drs. James Thompson and Scott Baggett, as well as to Drs. Linda Driskill and Tracy Volz, for their overall help and coaching throughout this summer research project. Particular thanks are given to the NSF and the VIGRE program for making this research a reality.

Bibliography

1. O'Shaughnessy, James P. (2003). What Works on Wall Street: A Guide to the Best-Performing Investment Strategies of All Time (Third Edition).
2. Thompson, James R. and Baggett, L. Scott (2005). Everyman's Max-Median Rule for Portfolio Selection.
3. Thompson, James R., Baggett, L. Scott, Wojciechowski, William C., and Williams, Edward E. (2006). Nobels for Nonsense. Journal of Post Keynesian Economics, Fall, pp. 3-18.
4. Wharton Research Data Services. URL: http://wrds.wharton.upenn.edu/
5. Rossini, A., Tierney, L., and Li, N. (2003). Simple Parallel Statistical Computing in R. UW Biostatistics Working Paper Series, Paper 193, University of Washington.
6. Tierney, L., Rossini, A., Li, N., and Sevcikova, H. (2004). The snow Package: Simple Network of Workstations. Version 0.2-1.
7. Knaus, Jochen (2008). Developing Parallel Programs Using snowfall.
