Morse, Little, Rikers and Beyond
Disclaimer: All information in this post is publicly available, either in the Independent Rikers Commission (a.k.a., Lippman Commission) report or in the report that I authored while working as Assistant Commissioner of Operations Research, that was made public through a Freedom of Information Act (FOIA) request made by the Nonprofit Vital City. Moreover, the contents of this post are only my opinions and do not represent the views of the city or any other entity I am or was affiliated with.
“Operations research is a scientific method of providing executive departments with a quantitative basis for decisions regarding operations under their control.”
Out of all the definitions of Operations Research, this is the one I like most. It really comprises the quantitative spirit of the discipline while emphasizing that ultimately, the goal is problem solving. The quote is attributed to Philip M. Morse (the father or OR in the U.S.) in a John D. Little’s article commemorating the 50th anniversary of the Operations Research Journal.
Some Background
In this post, we talk about connecting one of Little’s key discoveries (his eponymous law) and a very relevant set of problems that are in great need of OR solutions. Out of these problems, Rikers Island has been infamously at the center of innumerable controversies and plagued with problems since its construction. The aim of this post is not to comment on the social and justice aspects of the NYC correctional system, but rather talk about a common set of challenges that stochastic systems (such as Rikers Island) face because of their scale, and some quantitative strategies to guide urban planning for either replace existing facilities or build new ones.
As a motivation and context to this problem, Rikers is set to be closed by 2027 and replaced by different facilities in each borough. When this plan was set into law in 2019, the population of Rikers was in a steady decline, and it was then decided that 3,300 beds was enough capacity for the new facilities. Long-story short, a combination of factors including the pandemic, a different set of bail-laws and patterns of crime have set the population in the opposite direction. As part of my work I was tasked with creating projections of the population in the future in a data-driven way back in 2022-23. This post is a commentary on that analysis, which I believe has value beyond correctional facilities, and can be applied to many large scale stochastic systems (such as airports, traffic roads and so on).
The Forecast Challenge and the Connection with Little’s Law
Rikers (or any correctional facility) is a stochastic system where there is a series of arrivals that stay a random amount of time (in this case, pretrial cases) before exiting the system. Little’s law is the swiss knife of a scientist studying stochastic systems. The simplicity and beauty of the formula \(L=\lambda w\), expressing that the average number in the system \(L\) is equal to their arrival rate \(\lambda\) times the average waiting time \(w\) provides simple answer to in general a difficult question.
While an incredibly useful relation, it doesn’t apply then the arrival rate is time-varying and not converging to an average value in the long-term. As part of the task of building a prediction model for the future population in the system, I went back to the proof of Little’s law to understand the mechanics on how to modify it to the time-varying case and making it transient. Here, transience refers to the ability to have forecasts per month into the future, this is important as the population and arrivals exhibit heavy seasonality.
This study led to this modified version of Little’s law for time-varying arrival rates:
Theorem: Average Population in System for Time-varying Arrival Rates (a transient Little’s Law)
Consider a continuous time stochastic system with time epochs \(t=0,1,2,\dots\) with arrivals given by arbitrary exponential rates \(\lambda_1,\lambda_2,\dots\), such that the arrival rate between $(t-1,t)$ is $\lambda_t$, each arrival spend an independent exponential time with mean $w$ in the system. The average number in the system \(L_t\) at time \(t\) is given by:
\[L_t = (1-e^{-1/w})\lambda_t w +e^{-1/w}L_{t-1}\]
The theorem states that the average number in the system is a combination of what’s implied by Little’s law \(\lambda_t w\) plus the number in the system the previous period that haven’t completed their time in the system. Remarkably this retrieves the original Little’s law in the case where $\lambda_1=\lambda_2=…$ and also in the case where \(w\rightarrow 0\) (meaning that the service is fast enough for the number in system to converge to the quantity implied by Little’s law before the rate changes). Lastly, in the case where the arrival rates converge, again Little’s law is retrieved.
Predicting the Population at Rikers
The previous theorem is useful in two ways: on one hand it provides a way to predict the future population by modeling the arrival rates (say, with a time-series model); on the other, and more importantly, it also provides a model that can quantify how changes in policy (recall that the average time in the system \(w\) is determined by the time the case takes to process) change the average number in population. This is specially useful to quantify how speeding up the court system can do to reduce or change the population at Rikers, the kind of analysis that while hinted intuitively in the Lippman report, is lacking a quantitative basis.
Using this model, we forecasted the arrival rates as a time series model \(\hat\lambda_t = f({\lambda_{t-j}}_{j=1}^k)\) and used the result in our Theorem. This exercise produced the following results:
{
"type": "line",
"data": {
"labels": [
"2023-02",
"2023-03",
"2023-04",
"2023-05",
"2023-06",
"2023-07",
"2023-08",
"2023-09",
"2023-10",
"2023-11",
"2023-12",
"2024-01",
"2024-02",
"2024-03",
"2024-04",
"2024-05",
"2024-06",
"2024-07",
"2024-08",
"2024-09",
"2024-10",
"2024-11",
"2024-12",
"2025-01",
"2025-02",
"2025-03",
"2025-04",
"2025-05",
"2025-06"
],
"datasets": [
{
"label": "Predicted Population",
"fill": false,
"lineTension": 0.1,
"backgroundColor": "rgba(75,192,192,0.4)",
"borderColor": "rgba(75,192,192,1)",
"borderCapStyle": "butt",
"borderDash": [],
"borderDashOffset": 0,
"borderJoinStyle": "miter",
"pointBorderColor": "rgba(75,192,192,1)",
"pointBackgroundColor": "#fff",
"pointBorderWidth": 1,
"pointHoverRadius": 5,
"pointHoverBackgroundColor": "rgba(75,192,192,1)",
"pointHoverBorderColor": "rgba(220,220,220,1)",
"pointHoverBorderWidth": 2,
"pointRadius": 1,
"pointHitRadius": 10,
"data": [
5829.94,
5866.9,
5708.24,
5844.03,
5888.77,
6071.46,
6218.77,
6432.59,
6613.27,
6531.51,
6378.37,
6480.73,
6500.96,
6601.87,
6494.85,
6672.93,
6752.78,
6965.04,
7137.57,
7373.15,
7572.79,
7507.74,
7369.42,
7485.02,
7517.15,
7628.8,
7531.52,
7718.45,
7806.38
],
"spanGaps": false
},
{
"label": "Actual Population",
"fill": false,
"lineTension": 0.1,
"backgroundColor": "rgba(255,99,132,0.4)",
"borderColor": "rgba(255,99,132,1)",
"borderCapStyle": "butt",
"borderDash": [],
"borderDashOffset": 0,
"borderJoinStyle": "miter",
"pointBorderColor": "rgba(255,99,132,1)",
"pointBackgroundColor": "#fff",
"pointBorderWidth": 1,
"pointHoverRadius": 5,
"pointHoverBackgroundColor": "rgba(255,99,132,1)",
"pointHoverBorderColor": "rgba(220,220,220,1)",
"pointHoverBorderWidth": 2,
"pointRadius": 1,
"pointHitRadius": 10,
"data": [
5921,
5911,
5943,
6018,
6044,
6152,
6181,
6179,
6171,
6148,
6072,
6109,
6151,
6285,
6324,
6315,
6376,
6386,
6440,
6512,
6573,
6626,
6605,
6603,
6760,
6982,
7264,
7517,
7611
],
"spanGaps": false
},
{
"label": "Confidence Interval",
"fill": "+1",
"backgroundColor": "rgba(128, 128, 128, 0.2)",
"borderColor": "transparent",
"pointBorderColor": "transparent",
"pointBackgroundColor": "transparent",
"data": [
5500.52,
5367.83,
5081.43,
5116.54,
5081.53,
5199.87,
5295.09,
5466.24,
5611.88,
5501.66,
5325.06,
5407.49,
5411.21,
5498.15,
5379.76,
5547.81,
5619.36,
5824.38,
5990.81,
6221.11,
6416.29,
6347.85,
6206.81,
6319.72,
6349.69,
6459.37,
6360.74,
6546.08,
6632.77
]
},
{
"label": "C.I. ",
"fill": false,
"lineTension": 0.1,
"backgroundColor": "rgba(255,206,86,0.4)",
"borderColor": "transparent",
"pointBorderColor": "transparent",
"pointBackgroundColor": "transparent",
"data": [
6159.36,
6365.96,
6335.05,
6571.53,
6696.01,
6943.04,
7142.45,
7398.95,
7614.65,
7561.36,
7431.67,
7553.96,
7590.72,
7705.59,
7609.95,
7798.05,
7886.21,
8105.7,
8284.32,
8525.19,
8729.3,
8667.62,
8532.02,
8650.31,
8684.62,
8798.24,
8702.31,
8890.83,
8979.98
],
"spanGaps": false
}
]
},
"options": {
"aspectRatio": 1,
"scales": {
"y": {
"min": 4827,
"max": 9429
}
}
}
}
Given that it was a forecast done in January 2023, the results are in fact very good (holding well into 2025).
From Theory to Practice
The results of this analysis helped the city move into what I think is the reasonable direction of increasing the original capacity of 3,300 beds to 4,500 for the new jail facilities in the boroughs (the so-called Borough-based Jail project, estimated to cost north of $16 Billion).
While a positive result, I think the next administration still has to do serious thinking about the future of these facilities to satisfy the demand of a city as large as NYC.
From the point of view of people in our field, the message is even more striking. We need to make a greater effort in positioning our field more outwardly and less inwardly, as society clearly needs the solutions we can provide, but they don’t even know we exist. Going back to Little’s article commemorating Morse, he dubbed himself a “salesman”. Perhaps we can learn about salesmanship from our colleagues in the CS community by popularizing a probabilistic model to predict the next word in a text, and taking the world by storm. To finish the post on a positive note I’ll quote Little’s words from the same article as a form of credo on what our discipline should be:
“Throughout, one feels the strong push of the scientists to be permitted to observe field operations themselves, an almost unthinkable idea at the start. At the same time, they continuously sought to gain access to senior decision-makers by demonstrating outstanding results.”
Technical Appendix
This section deals with the technical aspects of the quantitative analysis presented, namely the proof of the Theorem and also the construction of the confidence intervals for forecasting.
Theorem’s Proof
The proof of the theorem can be divided in two parts: First, establishing that the average number in system at time $t=1$ for a $M/M/\infty$ queueing system that starts empty is $\lambda_1 w (1-e^{-1/w})$ and second, that for a system that starts with $L_0$ population in the system at time 0, the number of jobs remaining (out of $L_0$) at $t=1$ is $L_0e^{-1/w}$.
We start with the second result which is the most direct: The expected number at $t=1$ out of $L_0$ is equal to $E[\sum_{i=1}^{L_0}1{X_i\ge t=1}]=\sum_{i=1}^{L_0}P(X_i\ge 1)=L_0e^{-1/w}$, which a direct result of independence and the exponential distribution.
The first part is a little bit more elaborated: Imagine $N(1)$ jobs that entered the system between $(0,1]$, the probability that a job $i$ is in the system at time $t$ is given by $P(X_i>1-s)=e^{-(1-s)/w}$ (from the fact that the exponential has mean $w$), now between $0$ and $1$ at each infinitesimal step there is an arrival with probability $\lambda_1 dt$ (recall $N(t)$ is a Poisson Process), then, in expected value the number in system at time $t=1$ is given by:
\[\int_{0}^1e^{-(1-s)/w}\lambda_1 ds = \lambda_1 \int_0^1e^{-(1-s)/w}ds=\lambda_1 \int_0^1e^{-s/w}ds=\lambda_1 w (1-e^{-1/w})\]The rest of the proof follows by induction, as the same logic can be applied for any time $t$ by taking $L_{t-1}$ count in system. Moreover, the proof can be easily extended to non-exponential i.i.d. service times by replacing the exponential CDF with the corresponding distribution.
Confidence Intervals
The distribution of $L_t$ can be approximated by a normal random variable. The first term of the expression is a Poisson random variable with mean $\lambda w(1-e^{-\frac{1}{w}})$, then its normal approximation has the same mean and variance. The second term of the expression is a binomial random variable with success probability $q=e^{-\frac{1}{w}}$ over $L_{t-1}$ trials, its normal approximation has mean $L_{t-1}q$ with variance $L_{t-1}q(1-q)$. Adding the two approximations yields a normal random variable with mean $\lambda w(1-q)+L_{t-1}q$ and variance $\lambda w(1-q)+L_{t-1}q(1-q)=(\lambda w+L_{t-1}q)(1-q)$.
The above analysis holds when the arrival rate $\lambda$ is given. When the arrival rate also has variability (due to estimation error), the above variance expression is conditional on a fixed arrival rate $\lambda$. That is,
\[\text{var}(L_t|\lambda)=(\lambda w+L_{t-1}q)(1-q)\]Using the Law of total variance we get that the variance of $L_t$ when $\lambda$ is estimated with data (with std estimated error $\hat\sigma_t$) is:
\[\text{var}(L_t)=\text{E}(\text{var}(L_t|\lambda_t))+\text{var}(\text{E}(L_t|\lambda_t))\] \[=(\hat\lambda_t w+\hat L_{t-1}q)(1-q)+[w(1-q)]^2\text{var}(\lambda_t)+q^2\text{var}(L_{t-1})\] \[=(\hat\lambda_t w+\hat L_{t-1}q)(1-q)+(w(1-q)\hat\sigma_t)^2+q^2\text{var}(L_{t-1})\]Then, a prediction for the population using an econometric model for the arrival rate has expected value $\hat L_{t+j}=\hat\lambda_{t+j} w(1-e^{-\frac{1}{w}})+\hat L_{t+j-1}e^{-\frac{1}{w}}$ and variance $\text{var}(L_{t+j})=(\hat\lambda_{t+j} w+\hat L_{t+j-1}e^{-\frac{1}{w}})(1-e^{-\frac{1}{w}})+(w(1-e^{-\frac{1}{w}})\hat\sigma_{t+j})^2+e^{-\frac{2}{w}}\text{var}(L_{t+j-1})$ with confidence intervals given by $\hat L_{t+j}\pm c(\alpha)\sqrt{\text{var}(L_{t+j})}$ where $c(\alpha)$ is the critical value at a confidence level $\alpha$ of a standard normal random variable. For $\alpha=95\%$, the value of $c(95\%)=1.96$.