By Vishnu Ravi — Physician & Software Developer
Around this time of year, final-year medical students are preparing their rank lists for the residency match which, for those who aren’t familiar, will determine where they complete their residency training and what field of medicine they will practice. They’ll submit their preferences on February 24th, and find out where they matched on March 18th, 23 days later. But how long does it take for a computer to figure out the match? Certainly not 23 days — but days? hours? minutes? seconds? After several friends asked me this question, I thought it would be fun to design and run a simulation experiment to find out.
The problem of matching residency programs with applicants is basically a generalization of the stable marriage problem in computer science, which is the problem of trying to find stable pairings between two sets of people (i.e., men and women who want to get married). A set of pairings is considered stable if there is no pair that prefers each other to their assigned matches. In other words, we don’t want any two people to have the incentive to elope. Each person creates a preference list, ranking the members of the other set from most preferred to least preferred, and these lists are used to calculate stable matches.
Of course our problem is a little bit different because most residency programs have more than one position available, so instead of a one-to-one match, it’s a many-to-one match. However, if you think of each program as actually being n programs where n is the number of positions that program has to offer and all n positions have identical preferences, then it is essentially the same problem. In terms of the stable marriage problem, applicants can be thought of as “wives” and each hospital as a set of identical “husbands”.
There is also usually a surplus of applicants, so a stable match is not found for every applicant. In addition, there are other “match variations” such as couples entering the match together rather than as individuals, applicants linking advanced and preliminary programs in order to match simultaneously to both, and residency programs requesting an odd or even number of matches. While these variations make our problem a bit more complex, it’s still the same old stable marriage problem at the core.
In 1962, David Gale and Lloyd Shapley published an algorithm to solve the stable marriage problem. The algorithm works by a number of “rounds”. The process begins with each member of one group—say, the women— “proposing” to the member of the other group—in this case, the man—that she likes best. Each man then reviews the proposals he received, tentatively holds on to the best proposal and rejects the rest. In the next round, the rejected women propose to their next best choice, and the men again retain the best proposal and reject the rest. This process continues until there are no more women left to propose, at which point the men accept the best proposals among those they’ve held onto. Assuming an equal number of men and women in the problem, everyone will have been paired up. Gale and Shapley proved mathematically that the matches created by this method would always be stable. This method is called “deferred acceptance”.
In our example, it might seem that the men are at an advantage, but actually, it’s the group that does the proposing that is preferred by the algorithm. While the algorithm always finds matches that are stable, there is a tradeoff between whether the men always get their optimal woman, or the women always get their optimal man. This is why the algorithm takes either a male-optimal or female-optimal form. We can also show mathematically that the male-optimal form is also the female-pessimal form, and vice versa. In practical application, the difference between the two is found to be small, though it does exist.
Alvin Roth and others did further work on this problem, leading to Shapley and Roth receiving a Nobel Prize in 2012. Among the things they worked on was the issue of couples in the match wanting to be placed in the same location. Their solution involved running the match first with only single applicants involved, then adding in the couples one by one, using a similar “proposal” mechanism which works down their list until they find a pair of hospitals willing to accept them. The couples might displace applicants in the process, and these displaced applicants are brought back into the match again in another round. There are a number of sophisticated techniques used to combat the instabilities that may result.
While the math showed that having couples in the match could theoretically prevent a stable matching from being found, the algorithms virtually never have this problem. The reasons are not completely understood, but probably have to do with the relatively small number of couples in the match, the fact that the number of distinct programs ranked by a couple is small relative to the number of possible programs, and many programs having unfilled positions at the end of the initial match.
The famous Gale-Shapley “deferred acceptance” method for stable matching is the basis for the National Resident Matching Program match algorithm, officially known as the Roth-Peranson algorithm, which finds stable pairings between graduating medical students and residency positions. Interestingly, the NRMP had independently discovered the basic idea behind this method, and was using it even before Gale and Shapley formally proved and published it. The NRMP algorithm used today is more complex as it must deal with several match variations including couples, but it is still largely based on Gale-Shapley. As we discussed before, there are two forms of the algorithm — hospital-optimal and student-optimal. When a student-optimal form is used, no student can improve his or her match by submitting a rank list that is different from his or her true preferences. In my experiment, I used the student-optimal solution, as the NRMP currently uses this version. Until 1997 though, the NRMP used a hospital-optimal solution. This became the subject of much controversy until they decided to make a change to the alternate form, although very few matches were actually affected by the switch.
Since the NRMP hasn’t made their algorithm’s code public, I used my own version of the Gale-Shapley algorithm adapted from an implementation by Rob W. Irving. In order to run my simulation, I generated a dataset closely mirroring the 2014 Main Resident Match. I created 34,270 applicants, each having a rank list with an average of 10 programs. I also generated 4,735 residency programs having a total number of 29,671 positions, and ranking an average of 60 applicants each. At this point, I had a database representing the entire US residency match, and I was ready to run the algorithm.
I set up my testing environment on an Amazon EC2 cloud server with the following specs: Intel Xeon E5–2670 v2 processor with a clock speed of 2.5 Ghz, 2 vCPUs, 15 GiB of RAM and a 32 GB SSD running Ubuntu Server 14.04 LTS. This is no supercomputer — actually not much more powerful than the Macbook Pro I’m using to write this article, and at a cost of $4.20 per day to use, it isn’t a particularly expensive system, either. After running the algorithm 100 times, I found that it took an average of 17 seconds to process all 34,270 applicants, including loading the dataset, calculating the stable matches, and saving the results to a database.
So there you have it. The fate of every medical student’s life may very well be decided in less than a half a minute!
You can watch my simulation run below! Note that printing everything out to the screen does slow things down a little bit.
If you’re interested in trying this simulation yourself, I’ve made the source code available here.
You can check out my other projects at vishnu.io. I’m not associated with the NRMP or any other matching organization. The experiment described here was just done for personal curiosity and is not meant to be scientific or official in any way.
Update: Thank you so much for the amazing response to this article! I’m including some of the most commonly asked questions below:
- What determines the time it takes for an algorithm to run?
The time it takes for an algorithm to run — in this case, the “wall-clock time” — depends on many factors including the number of steps or calculations the algorithm must perform to solve the problem, the speed at which the CPU can perform those steps, the speed at which data can be transferred to and from memory, the amount of data that must be transferred, and delays caused by other processes running concurrently.
2. How can a couple cause instability in the match?
Here’s an example — Let’s say we have a couple A and B who have been assigned to Hospitals 1 and 2 respectively. Now a single applicant C proposes to Hospital 1, and Hospital 1 prefers this applicant to A. C will then displace A from Hospital 1, and since A and B are coupled, B will also be displaced from Hospital 2. One hospital ends up winning and the other loses. If we force 1 to take A so 2 can have B, 1 will lose C and be unhappy. On the other hand, if we force 2 to give up B so 1 can have C, 2 will be unhappy.
3. Why does the NRMP take 23 days to return the results?
The NRMP has answered this question here.
- Gale, D and L. S. Shapley. “College Admissions and the Stability of Marriage,” American Mathematics Monthly, 69:9–14, 1962.
- Gusfield, D and R.W. Irving. “The Stable Marriage Problem: Structure and Algorithms.” MIT Press, Boston, Mass., 1989.
- Roth, Alvin E. “The Economics of Matching: Stability and Incentives,” Mathematics of Operations Research, 7:617–628, 1982.
- Roth, Alvin E. “The National Residency Matching Program as a Labor Market,” Journal of the American Medical Association, April 3, 1996, Vol. 275, No. 13, pages 1054–1056.